feat: integrate fork-choice compliance spec test suite by GrapeBaBa · Pull Request #9314 · ChainSafe/lodestar

GrapeBaBa · 2026-05-01T06:26:03Z

Summary

- Wire the consensus-specs Fork Choice Compliance suite (ethereum/consensus-specs#3831) into the existing `forkChoiceTest` runner so it runs alongside the `fork_choice` and `sync` runners under `test:spec:*` once fixtures are downloaded.
- Add `getViableHeads()` to fork-choice and verify the compliance suite's `viable_for_head_roots_and_weights` invariant against it (the new check that motivates the suite).
- Add `scripts/download-compliance-fc-tests.sh` for fetching the consensus-specs Compliance Tests workflow artifact (the fixtures are NOT bundled with the standard release tarball).

Marked draft because there are real fork-choice gaps (notably consensus-specs#4807) that are NOT addressed here — see Known follow-ups below.

Why a separate download path

release.yml on consensus-specs only invokes make test ... reftests=true, which builds the standard reftests bundle. Compliance tests are produced by make comptests and published only via the daily comptests.yml workflow as a transient GH Actions artifact (<config>.tar.gz). They are not in the releases/download/<tag>/<preset>.tar.gz URL space the existing download-spec-tests script reads.

The new script:

- supports `--tarball <path>`, `--run-id <id>`, or auto-fetch of the latest successful run
- handles both raw tar.gz (current workflow output, `archive: false`) and zip-wrapped responses from `actions/artifacts/<id>/zip`
- extracts into `packages/beacon-node/spec-tests/tests/...`, the same root the release tarball uses; paths don't collide because the compliance generator only emits `fork_choice_compliance/` subtrees
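
The gzip-versus-zip handling can be illustrated with a small sketch. This is not the shell script from this PR: the helper name `sniffArchiveKind` is hypothetical, and the only facts it relies on are the standard magic numbers (gzip streams start with `0x1f 0x8b`, zip archives start with `PK`, i.e. `0x50 0x4b`).

```typescript
// Hypothetical helper: classify a downloaded artifact by its first two bytes.
type ArchiveKind = "gzip" | "zip" | "unknown";

function sniffArchiveKind(bytes: Uint8Array): ArchiveKind {
  if (bytes.length < 2) return "unknown";
  // gzip magic number: 0x1f 0x8b
  if (bytes[0] === 0x1f && bytes[1] === 0x8b) return "gzip";
  // zip local file header begins with "PK": 0x50 0x4b
  if (bytes[0] === 0x50 && bytes[1] === 0x4b) return "zip";
  return "unknown";
}
```

A caller would unwrap the zip layer first when the result is `"zip"` and pass the payload straight to tar when it is `"gzip"`.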

Cache lives at `$LODESTAR_COMPLIANCE_FC_CACHE` (default `/var/tmp/lodestar_compliance_fc_cache`).

Test-runner accommodations

- `bls_setting: 2`: every compliance fixture uses placeholder signatures. Pass `validSignatures: testcase.meta?.bls_setting !== BigInt(1)` so verification short-circuits. Standard `fork_choice` fixtures use `bls_setting: 1`, so behavior there is unchanged.
- `BLOCK_ERROR_ALREADY_KNOWN`: compliance fixtures intentionally re-import the same block (`dup_shift` mutations in their `meta.yaml`). The spec treats `on_block(store, known_block)` as a no-op success. Production block import correctly rejects with ALREADY_KNOWN; the runner now treats that rejection as success only when the step is `valid: true`. The production block import path is untouched.
- Cross-epoch attestation shuffling: `on_attestation` decodes `aggregation_bits` using the state at the attestation's target checkpoint, not the head state. The runner now resolves the right shuffling via ShufflingCache + regen (mirroring the production validation path) instead of `headState.epochCtx.getIndexedAttestation`, which only worked when the attestation's epoch happened to be in the head's epoch cache (±1 epoch).
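
As a rough illustration of the ALREADY_KNOWN accommodation, the sketch below shows the decision in isolation. It is synchronous for brevity and every name except the error code string is hypothetical; the real runner goes through the chain's block import, which this does not reproduce.

```typescript
// Only the error code string comes from the PR text; the rest is a stand-in.
const BLOCK_ERROR_ALREADY_KNOWN = "BLOCK_ERROR_ALREADY_KNOWN";

/**
 * Runs a block import and reports whether the step counts as a successful import.
 * A duplicate-block rejection is tolerated only when the fixture marks the step valid.
 */
function importSucceeded(importBlock: () => void, stepValid: boolean): boolean {
  try {
    importBlock();
    return true;
  } catch (e) {
    const code = typeof e === "object" && e !== null ? (e as {code?: unknown}).code : undefined;
    // Spec on_block with an already-known block is a no-op success.
    return code === BLOCK_ERROR_ALREADY_KNOWN && stepValid;
  }
}
```

Any other error code, or a duplicate on a step marked invalid, still counts as a failed import.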

Two new check fields supported:

- `viable_for_head_roots_and_weights` (consensus-specs#3831): compared against `getViableHeads()`. Both sides are sorted by root since the spec doesn't fix the order.
- `head_payload_status` (gloas): mapped between our internal enum ordering (PENDING=0, EMPTY=1, FULL=2) and the spec ordering (EMPTY=0, FULL=1, PENDING=2).
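
The `head_payload_status` remapping amounts to a fixed lookup. A minimal sketch follows; the two orderings are the ones stated above, while the enum and function names are hypothetical and not taken from the Lodestar codebase.

```typescript
// Orderings as stated in the PR description; names are illustrative only.
enum InternalPayloadStatus {
  PENDING = 0,
  EMPTY = 1,
  FULL = 2,
}

enum SpecPayloadStatus {
  EMPTY = 0,
  FULL = 1,
  PENDING = 2,
}

// Map by name rather than by numeric value, since the numeric orderings differ.
const INTERNAL_TO_SPEC: Record<InternalPayloadStatus, SpecPayloadStatus> = {
  [InternalPayloadStatus.PENDING]: SpecPayloadStatus.PENDING,
  [InternalPayloadStatus.EMPTY]: SpecPayloadStatus.EMPTY,
  [InternalPayloadStatus.FULL]: SpecPayloadStatus.FULL,
};

function toSpecPayloadStatus(status: InternalPayloadStatus): SpecPayloadStatus {
  return INTERNAL_TO_SPEC[status];
}
```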

`getViableHeads()` design

A node is a filter_block_tree leaf iff (a) it is viable AND (b) no descendant is viable. maybeUpdateBestChildAndDescendant only sets bestChild to a child that nodeLeadsToViableHead, so bestChild === undefined ⟺ no viable descendant. Combined with nodeIsViableForHead for (a), this matches the spec exactly. Same approach as Teku's ForkChoiceStrategy.getChainHeads.
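
The leaf argument can be checked on a toy tree. The sketch below uses a deliberately simplified, hypothetical node shape (the real proto-array node has many more fields) and assumes, as described above, that `bestChild` is only ever set to a child that leads to a viable head. It compares the `bestChild`-based filter with a definition-based one.

```typescript
// Simplified, hypothetical node shape for illustration.
interface SketchNode {
  root: string;
  parent?: number; // index of the parent node; parents precede children in the array
  viable: boolean; // stand-in for nodeIsViableForHead(node, currentSlot)
  bestChild?: number; // assumed to be set only when some child leads to a viable head
}

// Approach described in the PR: a viable node without a bestChild is a filtered-tree leaf.
function viableLeaves(nodes: SketchNode[]): string[] {
  return nodes.filter((n) => n.bestChild === undefined && n.viable).map((n) => n.root);
}

// Definition-based cross-check: viable AND no viable descendant.
function viableLeavesByDefinition(nodes: SketchNode[]): string[] {
  const hasViableDescendant = nodes.map(() => false);
  // Visit children before parents so viability propagates up the tree.
  for (let i = nodes.length - 1; i >= 0; i--) {
    const node = nodes[i];
    if (node === undefined || node.parent === undefined) continue;
    if (node.viable || hasViableDescendant[i]) hasViableDescendant[node.parent] = true;
  }
  return nodes.filter((n, i) => n.viable && !hasViableDescendant[i]).map((n) => n.root);
}

// Example tree: 0xa is the parent of 0xb (viable leaf) and 0xc (non-viable leaf).
const exampleNodes: SketchNode[] = [
  {root: "0xa", viable: true, bestChild: 1},
  {root: "0xb", parent: 0, viable: true},
  {root: "0xc", parent: 0, viable: false},
];
```

On this tree both functions return only `0xb`: `0xa` is viable but has a viable descendant, and `0xc` is a leaf but not viable.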

Weights are returned in Gwei (compliance fixtures use Gwei). Internal storage is in EFFECTIVE_BALANCE_INCREMENT units, multiplied at read time. Note the proposer-boost score contribution is also stored in increments and may round down on minimal preset (102 vs 102.4 ETH); attestation weight is exact, the rounding only surfaces while boost is active. Mainnet boost values are integer ETH so this is a non-issue there.
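
The 102 vs 102.4 ETH figure can be reproduced with a short arithmetic sketch. The total balance (64 validators at 32 ETH) is an assumption chosen because it yields the quoted 102.4 ETH; `SLOTS_PER_EPOCH = 8` and `PROPOSER_SCORE_BOOST = 40` are the minimal-preset and spec values. The exact order of operations inside Lodestar is also an assumption here.

```typescript
const GWEI_PER_INCREMENT = BigInt(1_000_000_000); // EFFECTIVE_BALANCE_INCREMENT in Gwei
const SLOTS_PER_EPOCH = BigInt(8); // minimal preset
const PROPOSER_SCORE_BOOST = BigInt(40); // percent

// Assumed for illustration: 64 validators with 32 ETH each (2048 ETH total).
const totalActiveBalanceGwei = BigInt(64) * BigInt(32) * GWEI_PER_INCREMENT;

// Spec-style computation carried out in Gwei.
const committeeWeightGwei = totalActiveBalanceGwei / SLOTS_PER_EPOCH;
const boostGwei = (committeeWeightGwei * PROPOSER_SCORE_BOOST) / BigInt(100);

// Same computation carried out in whole increments, converted to Gwei on read.
const totalIncrements = totalActiveBalanceGwei / GWEI_PER_INCREMENT;
const boostIncrements = ((totalIncrements / SLOTS_PER_EPOCH) * PROPOSER_SCORE_BOOST) / BigInt(100);
const boostFromIncrementsGwei = boostIncrements * GWEI_PER_INCREMENT;

// Integer division drops the fractional 0.4 ETH in the increment-based path.
const lostGwei = boostGwei - boostFromIncrementsGwei;
```

Under these assumptions `boostGwei` is 102.4 ETH expressed in Gwei, `boostFromIncrementsGwei` is 102 ETH, and `lostGwei` is 0.4 ETH.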

Current pass rate (fulu, small config, 1472 cases)

Tests: 1219 failed | 253 passed (17.2%)

Per-handler:

| Handler | Pass | Fail | Pass% |
| --- | --- | --- | --- |
| block_cover_test | 180 | 12 | 94% |
| shuffling_test | 22 | 234 | 9% |
| block_tree_test | 38 | 474 | 7% |
| block_weight_test | 8 | 248 | 3% |
| attester_slashing_test | 4 | 124 | 3% |
| invalid_message_test | 1 | 127 | 1% |

Failure breakdown:

| Count | Class |
| --- | --- |
| ~973 | `Invalid proposer boost root` (see follow-up #1) |
| ~14 | `Invalid viable heads` (proposer-boost score rounding on minimal preset) |
| ~1 | `FORKCHOICE_ERROR_INVALID_ATTESTATION` |

Known follow-ups (not in this PR)

- consensus-specs#4807 (`update_proposer_boost_root` proposer-index check): the dominant class of failures. The spec now requires `block.proposer_index == get_beacon_proposer_index(head_state)` before granting boost; this applies to all forks, not just gloas. PR #9233 (feat: implement should_apply_proposer_boost for gloas) implements the gloas-equivocation half of consensus-specs#4807 (`should_apply_proposer_boost`) but explicitly leaves pre-gloas boost behavior unconditional. Closing the boost-root gap is a separate, higher-risk change that should be reviewed independently.
- `viable_for_head_roots_and_weights` weight tolerance: lossy increment rounding of the proposer-boost score on the minimal preset causes false-positive weight mismatches. The check could be relaxed to set-equality on roots only, or given a ±1 increment tolerance.
- Gloas SSZ deserialization: running gloas compliance against the same artifact triggers SSZ container shape mismatches. The artifact is built from consensus-specs master; our `@lodestar/types` and `spec-tests-version.json` (v1.7.0-alpha.5) lag behind. This belongs in a separate types-bump PR.
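
For the weight-tolerance follow-up, the ±1 increment option could look roughly like the sketch below. This is not part of the PR; the function name is hypothetical and the tolerance is expressed in Gwei on the assumption that both sides are compared as `bigint` Gwei values.

```typescript
// One EFFECTIVE_BALANCE_INCREMENT expressed in Gwei.
const ONE_INCREMENT_GWEI = BigInt(1_000_000_000);

// Hypothetical comparison: accept weights that differ by at most one increment.
function weightsMatchWithinOneIncrement(actualGwei: bigint, expectedGwei: bigint): boolean {
  const diff = actualGwei > expectedGwei ? actualGwei - expectedGwei : expectedGwei - actualGwei;
  return diff <= ONE_INCREMENT_GWEI;
}
```

With this check a 102 ETH internal weight would match a 102.4 ETH fixture weight, while a mismatch of a full validator balance would still fail. The trade-off is that genuine off-by-one-increment bugs would also pass, which is one reason to review the relaxation separately.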

Relation to PR #9290

#9290 takes a different infra path for the same goal: extract the runner into forkChoiceRunner.ts, register a separate compliance_fork_choice test file, extract fixtures into a sibling spec-tests-compliance/<config>/ directory. It explicitly skips the viable_for_head_roots_and_weights check, which is the main differentiator from this PR.

Both approaches surface the same underlying fork-choice gaps; merging strategy is up to maintainers — happy to rebase on top of #9290 if its infra is preferred.

CI

Intentionally not wired. The runner cleanly skips the compliance suite when fixtures are absent, so existing test:spec:* jobs are unaffected. Devs and reviewers run on demand:

pnpm download-compliance-fc-tests              # auto-fetch small via gh
pnpm download-compliance-fc-tests --tarball ~/Downloads/small.tar.gz
pnpm vitest run --project spec-minimal -t fork_choice_compliance test/spec/presets/fork_choice.test.ts

Test plan

- `pnpm check-types` clean (beacon-node, fork-choice, state-transition)
- `pnpm lint` clean
- Existing `fork_choice` and `sync` runners unchanged in behavior: pure additive runner registration + factory annotations
- fulu compliance against the latest comptests `small.tar.gz` artifact: 253/1472 pass, no infra-side failures, all failures attributable to the spec gaps in Known follow-ups

AI Assistance Disclosure

Used Claude Code.

gemini-code-assist

Code Review

This pull request introduces support for the fork-choice compliance test suite by adding a fixture download script and updating the test runner to handle cross-epoch attestations and duplicate block imports. It also implements a getViableHeads method in the fork-choice logic to allow verification of the internal filtered block tree state. Feedback was provided to use bigint instead of number for weight calculations in getViableHeads to prevent potential precision loss on Mainnet.

gemini-code-assist · 2026-05-01T06:27:37Z

+  getViableHeads(currentSlot: Slot): {root: RootHex; weight: number}[] {
+    const result: {root: RootHex; weight: number}[] = [];
+    for (const node of this.nodes) {
+      if (node.bestChild === undefined && this.nodeIsViableForHead(node, currentSlot)) {
+        result.push({root: node.blockRoot, weight: node.weight * EFFECTIVE_BALANCE_INCREMENT});
+      }
+    }
+    return result;
+  }


The current implementation of getViableHeads uses number for weights, which leads to precision loss on Mainnet.

In Lodestar, node.weight is stored in EFFECTIVE_BALANCE_INCREMENT units (1 increment = 1 ETH = 10^9 Gwei). On Mainnet, the total active balance is currently ~33M ETH. Multiplying 33,000,000 * 1,000,000,000 results in 3.3 * 10^16, which exceeds Number.MAX_SAFE_INTEGER (9,007,199,254,740,991 ≈ 9 * 10^15).

To maintain accuracy for production use cases (even if compliance tests currently use smaller weights), this should return bigint.
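
The magnitude argument can be checked directly. The sketch below uses only the figures quoted in this comment (33M ETH, 10^9 Gwei per ETH); it shows that at this size a `number` can no longer distinguish adjacent Gwei values, while `bigint` can.

```typescript
const GWEI_PER_ETH = 1_000_000_000;
const totalEth = 33_000_000; // figure quoted in the review comment

const asNumber = totalEth * GWEI_PER_ETH; // 3.3e16, above Number.MAX_SAFE_INTEGER
const asBigInt = BigInt(totalEth) * BigInt(GWEI_PER_ETH);

// The product is no longer a "safe" integer for IEEE-754 doubles.
const numberIsSafe = Number.isSafeInteger(asNumber);

// Adding a single Gwei is absorbed by rounding in the number representation...
const numberAbsorbsOneGwei = asNumber + 1 === asNumber;

// ...but is preserved exactly with bigint.
const bigintKeepsOneGwei = asBigInt + BigInt(1) !== asBigInt;
```

Note that weights derived from whole increments may still happen to be exactly representable; the risk is with arbitrary Gwei values and sums at this magnitude.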

Suggested change

-  getViableHeads(currentSlot: Slot): {root: RootHex; weight: number}[] {
-    const result: {root: RootHex; weight: number}[] = [];
-    for (const node of this.nodes) {
-      if (node.bestChild === undefined && this.nodeIsViableForHead(node, currentSlot)) {
-        result.push({root: node.blockRoot, weight: node.weight * EFFECTIVE_BALANCE_INCREMENT});
-      }
-    }
-    return result;
-  }
+  getViableHeads(currentSlot: Slot): {root: RootHex; weight: bigint}[] {
+    const result: {root: RootHex; weight: bigint}[] = [];
+    for (const node of this.nodes) {
+      if (node.bestChild === undefined && this.nodeIsViableForHead(node, currentSlot)) {
+        result.push({root: node.blockRoot, weight: BigInt(node.weight) * BigInt(EFFECTIVE_BALANCE_INCREMENT)});
+      }
+    }
+    return result;
+  }

gemini-code-assist · 2026-05-01T06:27:37Z

+  getViableHeads(): {root: RootHex; weight: number}[] {
+    return this.protoArray.getViableHeads(this.fcStore.currentSlot);
+  }


Update the return type to bigint to match the fix in ProtoArray and avoid precision loss.

Suggested change

-  getViableHeads(): {root: RootHex; weight: number}[] {
-    return this.protoArray.getViableHeads(this.fcStore.currentSlot);
-  }
+  getViableHeads(): {root: RootHex; weight: bigint}[] {
+    return this.protoArray.getViableHeads(this.fcStore.currentSlot);
+  }

gemini-code-assist · 2026-05-01T06:27:37Z

+   * Retrieves all viable-for-head leaves of the filtered_block_tree along with their weights.
+   * Used by the fork-choice compliance test suite (`viable_for_head_roots_and_weights`).
+   */
+  getViableHeads(): {root: RootHex; weight: number}[];


Update the interface to return bigint for weights.

Suggested change

-  getViableHeads(): {root: RootHex; weight: number}[];
+  getViableHeads(): {root: RootHex; weight: bigint}[];

gemini-code-assist · 2026-05-01T06:27:37Z

+                const expected = step.checks.viable_for_head_roots_and_weights
+                  .map((entry) => ({root: entry.root, weight: bnToNum(entry.weight)}))
+                  .sort((a, b) => a.root.localeCompare(b.root));
+                const actual = (chain.forkChoice as ForkChoice)
+                  .getViableHeads()
+                  .map(({root, weight}) => ({root, weight}))
+                  .sort((a, b) => a.root.localeCompare(b.root));
+                expect(actual).toEqualWithMessage(expected, `Invalid viable heads at step ${i}`);


Since getViableHeads should return bigint to avoid precision loss, the test comparison should also operate on bigint values directly instead of converting them to number via bnToNum.

Suggested change

-                const expected = step.checks.viable_for_head_roots_and_weights
-                  .map((entry) => ({root: entry.root, weight: bnToNum(entry.weight)}))
-                  .sort((a, b) => a.root.localeCompare(b.root));
-                const actual = (chain.forkChoice as ForkChoice)
-                  .getViableHeads()
-                  .map(({root, weight}) => ({root, weight}))
-                  .sort((a, b) => a.root.localeCompare(b.root));
-                expect(actual).toEqualWithMessage(expected, `Invalid viable heads at step ${i}`);
+                const expected = step.checks.viable_for_head_roots_and_weights
+                  .map((entry) => ({root: entry.root, weight: entry.weight}))
+                  .sort((a, b) => a.root.localeCompare(b.root));
+                const actual = (chain.forkChoice as ForkChoice)
+                  .getViableHeads()
+                  .sort((a, b) => a.root.localeCompare(b.root));
+                expect(actual).toEqualWithMessage(expected, `Invalid viable heads at step ${i}`);
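
The order-independent comparison on `bigint` weights can be sketched standalone as follows. The `ViableHead` type and helper names are hypothetical; unlike the snippet in this review, the sketch copies each array before sorting so the caller's data is not mutated.

```typescript
// Hypothetical shape matching {root, weight} with bigint weights.
interface ViableHead {
  root: string;
  weight: bigint;
}

// Sort a copy by root so the comparison does not depend on emission order.
function sortByRoot(heads: ViableHead[]): ViableHead[] {
  return [...heads].sort((a, b) => a.root.localeCompare(b.root));
}

function sameViableHeads(actual: ViableHead[], expected: ViableHead[]): boolean {
  const a = sortByRoot(actual);
  const e = sortByRoot(expected);
  return a.length === e.length && a.every((h, i) => h.root === e[i]?.root && h.weight === e[i]?.weight);
}
```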

The fork-choice compliance suite (consensus-specs#3831) introduces a `viable_for_head_roots_and_weights` check that asserts the leaves and weights of the internal `filter_block_tree(store, justified.root)`. Head equivalence alone is too weak to catch filtered-tree regressions.

Add `IForkChoice.getViableHeads()` that returns the viable-for-head leaves of the filtered tree paired with their weights:

- A node is a filter_block_tree leaf iff it is viable AND has no viable descendants. `maybeUpdateBestChildAndDescendant` only sets `bestChild` to a child that `nodeLeadsToViableHead`, so `bestChild === undefined` iff no viable descendant. Combined with `nodeIsViableForHead`, this matches the spec exactly. Same approach as Teku's `ForkChoiceStrategy.getChainHeads`.
- Weight is converted from internal increment units to Gwei on read (compliance fixtures expect Gwei). Note the proposer-boost score contribution is also stored in increments and may round down on minimal preset (102 vs 102.4 ETH); attestation weight is exact, the rounding only surfaces while boost is active. Mainnet boost values are integer ETH so this is a non-issue there.

Wire the consensus-specs Fork Choice Compliance suite (ethereum/consensus-specs#3831) into the existing `forkChoiceTest` runner. The on-disk layout matches the standard spec-test layout (`tests/<preset>/<fork>/fork_choice_compliance/<handler>/<suite>/<case>/`), so it slots in alongside the `fork_choice` and `sync` runners.

Three test-only accommodations the compliance fixtures require:

1. `bls_setting: 2`: every compliance fixture uses placeholder signatures. Pass `validSignatures: testcase.meta?.bls_setting !== BigInt(1)` to `chain.processBlock` so verification short-circuits. Standard `fork_choice` fixtures use `bls_setting: 1`, so behavior there is unchanged.
2. `BLOCK_ERROR_ALREADY_KNOWN`: compliance fixtures intentionally re-import the same block (`dup_shift` mutations in their `meta.yaml`). The spec treats `on_block(store, known_block)` as a no-op success. Production block import correctly rejects with ALREADY_KNOWN; this runner treats that case as success only when the step is `valid: true`.
3. Cross-epoch attestation shuffling: `on_attestation` decodes `aggregation_bits` using the state at the attestation's target checkpoint, not the head state. The runner now resolves the right shuffling via ShufflingCache + regen (mirroring the production validation path) instead of `headState.epochCtx.getIndexedAttestation`, which only worked when the attestation's epoch happened to be in the head's epoch cache (±1 epoch) and broke on cross-epoch fork attestations surfaced by the compliance suite.

Adds support for two compliance-only check fields:

- `viable_for_head_roots_and_weights` (consensus-specs#3831): compared via `getViableHeads()`. Both sides are sorted by root before comparison since the spec doesn't fix the order.
- `head_payload_status` (gloas): mapped between our internal enum ordering (PENDING=0, EMPTY=1, FULL=2) and the spec ordering (EMPTY=0, FULL=1, PENDING=2).

Pass rate against the latest comptests workflow `small.tar.gz` artifact:

fulu/fork_choice_compliance: 253/1472 cases pass (17.2%)

Top remaining failures:

- ~80% `Invalid proposer boost root`: consensus-specs#4807 introduced a `block.proposer_index == get_beacon_proposer_index(head_state)` guard in `update_proposer_boost_root` that we do not yet implement; it affects all forks (not just gloas equivocation handling). Tracked for follow-up alongside ChainSafe#9233.
- ~1% `Invalid viable heads`: proposer-boost rounding on minimal preset (see `getViableHeads()` weight note).

Add `scripts/download-compliance-fc-tests.sh` plus an npm script wrapper (`pnpm download-compliance-fc-tests`) for fetching consensus-specs Fork Choice Compliance test fixtures.

These fixtures are NOT in the standard `download-spec-tests` bundle: the consensus-specs `release.yml` workflow does not invoke `make comptests`. Instead they are produced by the scheduled "Compliance Tests" workflow (`comptests.yml`) and published only as a transient GitHub Actions artifact (`<config>.tar.gz`).

The script mirrors Prysm's `hack/compliance-fc-report.sh` resolution order and handles three modes:

    --tarball <path>   Use a local .tar.gz file
    --run-id <id>      Download artifact from a specific GH Actions run
    (default)          Auto-fetch the latest successful run via `gh`

The artifact is extracted into `packages/beacon-node/spec-tests/` alongside the release-tarball output. Top-level paths in the artifact are `tests/<preset>/<fork>/fork_choice_compliance/...`, which do not collide with anything emitted by the standard release tarball, so extraction is purely additive and no path remapping is needed in the runner.

The GH artifact API may return either a raw `.tar.gz` (when `upload-artifact` used `archive: false`, the current workflow behavior) or a zip-wrapped one. The script sniffs the first two magic bytes and unwraps the zip case if needed.

Default config is `small` (~89 MB compressed, ~950 MB extracted, ~2900 cases). Cache lives at `$LODESTAR_COMPLIANCE_FC_CACHE` (default `/var/tmp/lodestar_compliance_fc_cache`).

CI is intentionally not wired in this PR: the runner cleanly skips the compliance suite when the fixtures are absent, so existing `test:spec:*` jobs are unaffected. Devs and reviewers run on demand:

    pnpm download-compliance-fc-tests
    pnpm test:spec:minimal -t fork_choice_compliance

gemini-code-assist Bot reviewed May 1, 2026


GrapeBaBa force-pushed the gr/feat-fork-choice-compliance branch 2 times, most recently from 348726d to ae8be0c (May 1, 2026 07:21)

GrapeBaBa added 3 commits May 1, 2026 15:43

GrapeBaBa force-pushed the gr/feat-fork-choice-compliance branch from ae8be0c to a31929d (May 1, 2026 07:43)

GrapeBaBa wants to merge 3 commits into ChainSafe:unstable from GrapeBaBa:gr/feat-fork-choice-compliance
