fix: builder payment quorum integer overflow at mainnet-scale stake#9350
fix: builder payment quorum integer overflow at mainnet-scale stake#9350parithosh wants to merge 1 commit into
Conversation
`getBuilderPaymentQuorumThreshold` computed `totalActiveBalanceIncrements * EFFECTIVE_BALANCE_INCREMENT` as a JS `number`. The intermediate gwei product crosses `Number.MAX_SAFE_INTEGER` (2^53 - 1) once total active stake passes ~9M ETH, silently losing precision. Other clients compute the spec-exact uint64 result, so a Gloas-enabled mainnet would see Lodestar diverge on the post-state root at the first epoch transition that promotes builder payments through the quorum check, forking Lodestar nodes off the network. Two sites overflow: - `getBuilderPaymentQuorumThreshold` (gloas.ts) - the threshold itself. - `BuilderPendingPayment.weight` accumulator in `processAttestationsAltair.ts` - per-slot gwei weight is also in the 10^16 range at mainnet-scale stake. Fix: - Switch `BuilderPendingPayment.weight` SSZ field from `UintNum64` to `UintBn64`. On-wire encoding is identical (uint64 LE); local TS type becomes `bigint`, matching the spec's domain. - Use bigint arithmetic in `getBuilderPaymentQuorumThreshold` and in the per-slot weight accumulator. The threshold function now returns `bigint`; the comparison in `processBuilderPendingPayments` flows through unchanged. Scope: Gloas-only. `BuilderPendingPayment`, the threshold function, and the gloas branch in `processAttestationsAltair` are all unreachable pre-Gloas. Bug is dormant on `glamsterdam-devnet-3` (~50k ETH stake, two orders of magnitude below the precision boundary) and on any network without Gloas activated, so this can land before any divergence risk materialises. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
This pull request migrates builder weight and quorum threshold calculations from numbers to bigints to prevent precision loss when the total active stake exceeds approximately 9 million ETH. Key changes include updating the builderWeightMap in attestation processing, modifying getBuilderPaymentQuorumThreshold to use bigint arithmetic, and updating the SSZ type for BuilderPendingPayment. New unit tests were added to verify the quorum threshold calculations across various stake levels. I have no feedback to provide.
lodekeeper
left a comment
There was a problem hiding this comment.
LGTM. Tightly scoped fix for the f64 precision boundary at mainnet-scale stake.
Spec match: arithmetic order in getBuilderPaymentQuorumThreshold mirrors the spec ((total_active_balance // SLOTS_PER_EPOCH * NUMERATOR) // DENOMINATOR); bigint floor-division agrees with Python //.
SSZ retype is the right tool. Pushing bigint into the type system means weight reads/writes stay correct by construction across all consumers. SSZ wire format is bit-identical (UintNum64 and UintBn64 both encode as uint64 LE), so state/block/gossip roots are unchanged and devnet-3 nodes at ~50k ETH stake produce identical state roots with and without this patch.
Consumer audit (all covered):
getBuilderPaymentQuorumThresholdhas one call site (processBuilderPendingPayments.ts:10) —payment.weight >= quorumcomparison flows through unchanged with both sidesbigint.payment.weightproduction reads/writes:processAttestationsAltair.ts:153,162andprocessBuilderPendingPayments.ts:14— all handled.- Default construction via
BuilderPendingPayment.defaultViewDU()inprocessProposerSlashing.ts:50andprocessParentExecutionPayload.ts:109continues to work (SSZ default forUintBn64is0n).
Per-attestation paymentWeightToAdd stays as number safely (bounded by committee_size × max-increment, well under 2^53) and is converted to bigint before the multiply by EFFECTIVE_BALANCE_INCREMENT — correct.
Non-blocking suggestion (test coverage gap): the new gloas.test.ts covers the threshold path well, but the processAttestationsAltair accumulator path has no direct unit test at mainnet-scale stake. Harder to set up (needs mock state + attestations) so reasonable to leave for a follow-up; spec tests will exercise it indirectly.
| builderWeightMap.get(builderPendingPaymentIndex) ?? | ||
| (state as CachedBeaconStateGloas).builderPendingPayments.get(builderPendingPaymentIndex).weight; | ||
| const updatedWeight = existingWeight + paymentWeightToAdd * EFFECTIVE_BALANCE_INCREMENT; | ||
| const updatedWeight = existingWeight + BigInt(paymentWeightToAdd) * BigInt(EFFECTIVE_BALANCE_INCREMENT); |
There was a problem hiding this comment.
🟢 Minor: BigInt(EFFECTIVE_BALANCE_INCREMENT) is re-evaluated every iteration of the attestation loop. Consider hoisting to a module-level constant for consistency with how other bigint constants are handled, e.g.:
| const updatedWeight = existingWeight + BigInt(paymentWeightToAdd) * BigInt(EFFECTIVE_BALANCE_INCREMENT); | |
| const updatedWeight = existingWeight + BigInt(paymentWeightToAdd) * EFFECTIVE_BALANCE_INCREMENT_BN; |
with const EFFECTIVE_BALANCE_INCREMENT_BN = BigInt(EFFECTIVE_BALANCE_INCREMENT); at module scope. Probably negligible at runtime (V8 caches small bigint constants), but cheap and idiomatic.
|
@lodekeeper The float would loose precision in the calculation if EFFECTIVE_BALANCE_INCREMENT (1_000_000_000) has a slightly different value, like a prime factor (999_999_937), or effectively anything that is not dividable by 2 multiple times, so the binary representation does not end with 9 zero bits as it does. claude review: |
|
@pk910 you're right — thanks for the correction. The math: Numerically confirmed (BigInt vs f64): So the "overflows at ~9M ETH" framing rests on the naive That said, I still think the PR is worth keeping as defence-in-depth:
@parithosh — worth amending the PR description / commit message to drop the "9M ETH overflow" claim and reframe as something like "remove dependence on f64 trailing-zero precision invariant" or "make builder payment quorum consensus-safe by construction", otherwise the misleading framing will live on in the merged history. |
Summary
getBuilderPaymentQuorumThreshold(inpackages/state-transition/src/util/gloas.ts) computestotalActiveBalanceIncrements * EFFECTIVE_BALANCE_INCREMENTas a JSnumber. The intermediate gwei product crossesNumber.MAX_SAFE_INTEGER(2^53 - 1 ≈ 9.007 × 10^15) once total active stake passes ~9 M ETH, silently losing precision. Other CL clients (Prysm, Lighthouse, Teku, Nimbus, Grandine) compute the spec-exact uint64 result, so a Gloas-enabled mainnet would see Lodestar diverge on the post-state root at the first epoch transition that promotes any builder payment near the quorum boundary, forking Lodestar nodes off the network.The same overflow class also affects the per-slot
BuilderPendingPayment.weightaccumulator inprocessAttestationsAltair.ts: cumulative attesting weight against a single slot can exceed 2^53 gwei at mainnet stake.Why now
Bug is dormant on
glamsterdam-devnet-3(~50 k ETH stake — two orders of magnitude below the 2^53 boundary, so f64 and uint64 agree to the bit) and on every other current network because Gloas isn't activated anywhere yet. It would activate on day one of Gloas mainnet with no attacker required — the network's stake itself is the trigger. Landing the fix before any Gloas activation costs nothing; landing it after costs a fork.Arithmetic
glamsterdam-devnet-3(~50 k ETH)Changes
packages/types/src/gloas/sszTypes.tsBuilderPendingPayment.weight:UintNum64→UintBn64. On-wire encoding identical (uint64 LE); local TS type becomesbigint, matching the spec's domain.packages/state-transition/src/util/gloas.tsgetBuilderPaymentQuorumThresholdrewritten to use bigint intermediates; return typenumber→bigint.packages/state-transition/src/block/processAttestationsAltair.tsMap<number, number>→Map<number, bigint>; arithmetic in bigint.packages/state-transition/src/block/processExecutionPayloadBid.tsweight: 0literal →0n.packages/state-transition/test/unit/util/gloas.test.ts(new)The comparison
payment.weight >= quoruminprocessBuilderPendingPayments.ts:14needed no code change — both sides are nowbigint, comparison flows through.Scope
Fully Gloas-only:
BuilderPendingPaymentis a Gloas-only SSZ container; not embedded pre-Gloas.processBuilderPendingPaymentsis registered only in the Gloas branch (epoch/index.ts:166-168).processAttestationsAltair.ts:146is gated behindif (fork >= ForkSeq.gloas).getBuilderPaymentQuorumThresholdandprocessExecutionPayloadBidare Gloas-only.UintBn64andUintNum64produce bit-identical SSZ bytes — state roots / block roots / gossip serialisation unchanged.Test plan
pnpm --filter @lodestar/types build— cleanpnpm --filter @lodestar/state-transition exec vitest run --project unit— 244/244 passed, includes newgloas.test.tspnpm linton@lodestar/typesand@lodestar/state-transition— cleanpnpm download-spec-tests && pnpm test:spec) — not run locally; CI will coverglamsterdam-devnet-3once merged — expected behaviour unchanged at this stakepnpm check-typesrepo-wide — not run; note thatpackages/state-transition/src/util/validator.tshas 4 pre-existing build errors onglamsterdam-devnet-3(andunstable) introduced by87cbe69c66(EIP-8061 churn), unrelated to this PRAI disclosure
Fix drafted with assistance from Claude (Opus 4.7, 1 M context). Bug originally surfaced from an audit-style code-read of the Gloas state-transition path; arithmetic, fix design, and PR text reviewed and authored interactively with parithosh.
🤖 Generated with Claude Code