Phase 7.3 PR3a-ii: happy-path interactive signing runner (multi-node, fake bus)#4076
Merged
mswilkison merged 12 commits intoJun 17, 2026
Conversation
The runner's narrow engine boundary (interactiveSigningEngine: InteractiveSessionOpen / InteractiveRound1 / NewSigningPackage / InteractiveRound2 / InteractiveAggregate) defined at the consumer under frost_native (no cgo) per the design consult - the cgo buildTaggedTBTCSignerEngine satisfies it, asserted later in the wiring layer. Plus a programmable fake engine for deterministic runner tests: per-member commitments/shares (distinct by default) and a programmable aggregate result/error for the happy and (later) the sad path. No real crypto; the engine's own suite covers that. The happy-path runner orchestration + multi-node fake-bus harness land in the next commit on this branch before the PR opens. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…node harness interactiveSigningRunner drives one node's happy-path participation over the RunnerBus: open -> collector.BeginAttempt -> round1 + broadcast commitments -> collect all -> (elected coordinator) NewSigningPackage + sign + broadcast -> RecordSigningPackage + round2 + sign ShareSubmission + broadcast + record own -> collect a share from every signer -> InteractiveAggregate -> MarkSucceeded + emit the BIP-340 signature. Design (consult + the #4075 slice-aliasing lesson): - Subscribes to the bus in the CONSTRUCTOR (before any Run broadcasts) so a node never misses a peer's first message; the harness constructs all nodes before starting them. - Every engine/collector input derives from the immutable ActiveRoastAttempt binding, never from peer messages; the node records its OWN share explicitly (no bus self-echo); envelope/payload bytes are copied at the boundary. - Elected coordinator from the binding; exactly that node builds+signs the package, everyone authenticates+retains it via the collector (Q1 boundary). - Package built from the whole responsive included set, so the aggregate needs a share from each (silent-member subsetting is the retry path's concern). Fake-engine + fake-bus harness test: N=3 run concurrently to a successful aggregate; each emits the signature and transitions its attempt to Succeeded. Passes under -race; frost_native suite + gofmt + repo-wide vet + default build green. (Fake-ignored placeholders - FROST member-identifier encoding, attempt-context fingerprint/id - are finalized at cgo wiring; the runner drives rounds with the attempt id the engine returns.) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Both bots found real adversarial gaps in the happy-path runner's collection loops (the runner reads a shared mesh where any node can broadcast anything). Folded all: - collectShares (Gemini P1 framing/DoS + Codex P2 retention): now filters by msg.Attempt + includedSet, binds the authenticated transport sender to the claimed submitter (sub.SubmitterID() == msg.Sender), and authenticates + retains each peer share via RecordShareSubmission - counting ONLY accepted shares (divergent/conflicting retained for blame, not counted; keyed by msg.Sender). Previously it accepted any sender and keyed by the envelope's SubmitterID, so a node could fill an honest member's slot with garbage, drop their real share, and get them falsely blamed. - obtainSigningPackage (Gemini P1 DoS): non-coordinators now loop and accept ONLY the elected coordinator's package for this attempt; a garbage package from any node previously aborted the honest run before the real one arrived. - Root-divergence refusal (Codex P1): after RecordSigningPackage retains the package, refuse (sign nothing) when its taproot root diverges from the bound session root - signing Round2 against it would sign for the wrong tweak; the retained package is blame evidence. - Attempt isolation (Codex P2): all three collection points now ignore messages whose msg.Attempt != the bound context hash (the bus may carry concurrent or stale attempts). Tests: adversarial e2e (a garbage package from a non-elected sender + a share from a non-included member, pre-injected, are ignored and the honest run still succeeds) + a constructor-rejection table (my self-review Low). Happy path + -race + frost_native suite + gofmt + repo vet + default build green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Fold Codex re-review findings on the happy-path runner. P1 (message-vs-digest binding): the constructor accepted a free-standing `message` parameter and only rejected empty input. A caller whose message diverged from the digest the attempt is bound to (ActiveRoastAttempt's MessageDigest, which the package/share envelopes commit to via the attempt hash) could have the runner open the session and FROST-sign those bytes, marking an attempt for digest A succeeded with a signature over digest B. Remove the parameter entirely and derive the signed message from the binding's MessageDigest, so the runner can never sign a message inconsistent with the attempt it is bound to. P2 (abort native attempts on early exit): once InteractiveSessionOpen succeeds the engine holds the attempt's secret nonces/session state. If Run returned early (ctx cancel, or an error before round 2 consumed the nonces) that material stayed resident. Add InteractiveSessionAbort to the engine boundary and defer a best-effort abort, suppressed only on a clean success. Tests: drop the removed "empty message" construction case; add an early-exit test (ctx expires while collecting commitments) asserting exactly one native abort fires. frost_native suite green under -race; gofmt + repo frost vet clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Sharpen the attempt-context / FROST-identifier placeholder docs (Codex re-review). These fields are inert in #4076 - only the fake engine is wired, it ignores them, and no production path constructs the real cgo engine yet - but they are NOT engine-valid, and the comments now say so precisely so the deferral is auditable rather than vague: - IncludedParticipantsFingerprint / AttemptID: strict-mode validate_attempt_context recomputes both from canonical inputs (roast_included_participants_fingerprint_hex / roast_attempt_id_hex, domain-separated) and rejects a mismatch before round 1. Matching them byte-for-byte is a cross-impl derivation (the seed-divergence class); derive-in-Go vs expose-from-engine is the open fork for the real-engine attempt-context wiring increment. - memberFrostIdentifier: the engine's canonical encoding (participant_identifier_to_frost_identifier + frost_identifier_to_go_string) is the u16 as a 32-byte big-endian secp256k1 scalar, hex; the decimal placeholder would not match the member's key share. No behavior change - comments only. gofmt + frost_native vet clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Prune round-2 collector state on attempt conclusion (Codex re-review). Round2Collector's contract is that callers MUST PruneAttempt once an attempt concludes; Run began the attempt but never pruned it, so a collector reused across signing attempts retained every concluded attempt's package/share envelopes indefinitely (and stale state could interfere with a re-run of the same context). Fold PruneAttempt into the conclusion defer, run unconditionally (the attempt concludes for this runner on success OR early exit - stricter than the success-only path flagged, and matching the contract). It is a no-op when the attempt was never begun; the abort stays early-exit-only. When the blame path lands it extracts evidence into the transition bundle within Run, before this defer fires, so the prune does not race the evidence. Test: TestInteractiveSigningRunner_PrunesCollectorStateAfterSuccess - after a successful run, a re-begin of the same context hash under a DIFFERENT binding succeeds (a surviving record would return ErrRound2AttemptBindingConflict). frost_native suite green under -race; gofmt + vet clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Stop discarding equivocation evidence before the collector records it
(Codex re-review). Both paths fed the collector only the FIRST envelope per
source, so a body-different duplicate - exactly the equivocation the collector
is built to retain and emit - was dropped even though the bus deliberately
preserves body-different duplicates (it dedups only exact ones). Retention is
bounded (the collector keeps the first per source and only emits on the rest).
- Coordinator package: obtainSigningPackage returns on the first package, so
a second body-different coordinator-signed package was never recorded. Add
recordBufferedCoordinatorPackages, called from Run AFTER the authoritative
package is recorded (so duplicates record as conflicting, not
authoritative) and non-coordinators only. Drains buffered duplicates;
continuous monitoring across a real transport stays the blame path's
concern.
- Member share: collectShares short-circuited on the already-collected sender
BEFORE RecordShareSubmission, so a member's body-different second signed
share went unrecorded. Record every well-formed share first, then apply the
already-collected guard only to the aggregation count.
Tests (drive the paths deterministically via manual streams, asserting the
emitted evidence through a registered observer):
- RetainsCoordinatorPackageEquivocation -> EquivocationKindSigningPackageConflict
- RetainsMemberShareEquivocation -> EquivocationKindShareConflict, first share
still counted.
frost_native signing + roast suites green under -race; gofmt + vet clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Drain queued share duplicates after collection completes (Codex re-review). The prior fix retained a sender's body-different share only while collectShares was still collecting. But the loop exits the instant len(into) == len(included), so a duplicate queued BEHIND the share that fills the final slot was never read, never recorded, and then lost when Run aggregated and pruned - the same gap the coordinator-package path already closes with a post-return drain. Extract recordShareMessage (validate -> retain -> count first-accepted) and call it from both the collection loop and a non-blocking, buffer-bounded drain that runs once `into` is full, so queued member-equivocation evidence reaches the collector before aggregation/prune. Test: RetainsQueuedShareEquivocationAfterCollection - member 2's first share fills the final slot and its body-different duplicate is queued right behind it; the conflict is still emitted (EquivocationKindShareConflict, member 2) and the first share is still the counted one. frost_native signing suite green under -race; gofmt + vet clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Bound the post-completion drains to avoid livelock (Codex re-review).
The share-drain and recordBufferedCoordinatorPackages both used a
receive-until-empty loop (for { select { case <-stream; default: return } }).
That only terminates when the channel is observed empty, so a peer that keeps
the stream non-empty - e.g. flooding body-different share submissions - starves
the default branch, and the loop neither finishes nor observes ctx. Run then
hangs instead of aggregating even though enough valid shares were collected: a
liveness DoS.
Bound both drains to the queue length at entry (for i, n := 0, len(stream); ...).
They still retain every duplicate already buffered behind the slot-filling
envelope, but a flood arriving DURING the drain is not processed (late arrivals
are the blame path's concern), so the drain always terminates promptly.
Test: DrainDoesNotLivelockUnderShareFlood - a goroutine floods the share stream
continuously while collectShares runs with a pre-filled `into` (only the drain
executes); it must return within the deadline (the unbounded loop would hang).
frost_native signing suite green under -race; gofmt + vet clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Verify the SIGNED attempt hash on received packages and shares (Codex
re-review). The runner filtered intake on msg.Attempt - the UNSIGNED outer bus
field - but the collector keys RecordSigningPackage/RecordShareSubmission by the
payload's own signed AttemptContextHash. With one Round2Collector hosting more
than one live attempt (the long-lived/reused-collector model the prune contract
assumes), a payload signed for attempt B and rewrapped in an attempt-A bus
message slipped through: the package drove A's InteractiveRound2 (signing the
wrong package), and the share was recorded under B (accepted, nil) then counted
toward A's aggregate.
Guard all three intake points with bytes.Equal(payload.AttemptContextHash,
contextHash[:]):
- obtainSigningPackage: unmarshal and verify before returning; keep waiting on
mismatch (so a cross-attempt package neither drives the flow nor DoSes it).
- recordBufferedCoordinatorPackages: skip a mismatched package in the drain.
- recordShareMessage: skip before record/count.
Tests (each makes a second attempt B live in the same collector so the payload
would otherwise be accepted - both verified to FAIL without the guard):
- RejectsCrossAttemptShare: a B-signed share wrapped as A is not counted.
- RejectsCrossAttemptPackageInDrain: a B-signed package wrapped as A is not
recorded (no B equivocation event emitted from A's drain).
frost_native signing + roast suites green under -race; gofmt + vet clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Prune the collector only on success, not on failure (Codex re-review). The conclusion defer pruned on every non-success exit. But failure paths retain signed evidence the blame/retry path must read: the root-divergence return fires AFTER RecordSigningPackage, and (once it lands) an aggregate share-verification failure leaves the recorded package/shares. Pruning there made CoordinatorPackageProofs / ClassifyCandidateCulprits report ErrRound2UnknownAttempt, losing the proof just retained. (This over-corrected the earlier prune fold, which only needed prune-on-success.) Split the defer by outcome: SUCCESS prunes the collector state (no blame/retry needs it); FAILURE / early exit aborts the engine session to drop resident secret nonces but leaves the collector intact for the caller to snapshot, propagate, and then prune. Test: PreservesEvidenceOnAggregateFailure - force every node's aggregate to fail after package/shares are recorded, then assert each collector still holds the attempt (a conflicting re-begin is rejected). Verified to FAIL under the old always-prune defer. Success-prune and early-exit-abort tests still pass. frost_native signing + roast suites green under -race; gofmt + vet clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Document the runner's transport assumptions (holistic self-review). A pass over the assembled lifecycle (success-prune vs failure-preserve defer, the two bounded equivocation drains, the cross-attempt signed-hash guards, and evidence retention) found the logic mutually consistent, but surfaced two reliance points the in-process bus happens to satisfy and the real pkg/net transport must honor: authenticated RunnerMessage.Sender (every sender-based filter depends on it) and non-blocking/backpressure delivery (the runner does not fully drain every stream, so a blocking transport would let a flooding peer stall an honest broadcaster). Also notes that unsigned round-1 commitments degrade to a round-2 retry rather than a breach under authenticated senders. Doc only - no behavior change. gofmt + vet clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
d26eb29
into
feat/frost-schnorr-migration-scaffold
16 checks passed
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
The happy-path interactive signing runner — the orchestrator that finally runs the interactive FROST flow end-to-end. It drives one node's participation over the
RunnerBus(#4075) and, on success, callsMarkSucceeded(#4074) and emits the BIP-340 signature.frost_native(no cgo), exercised with a programmable fake engine + multi-node fake-bus harness — orchestration correctness without real crypto, per the consult.Two commits: (1) the narrow
interactiveSigningEngineboundary + programmable fake engine; (2) the runner orchestration + harness + happy test.Flow (
interactiveSigningRunner.Run)InteractiveSessionOpen → collector.BeginAttempt → round1 + broadcast commitments → collect all → (elected coordinator) NewSigningPackage + sign + broadcast → RecordSigningPackage + round2 + sign ShareSubmission + broadcast + record own → collect a share from every signer → InteractiveAggregate → MarkSucceeded + return signature.Design (consult + the #4075 slice-aliasing lesson)
Runbroadcasts) so a node never misses a peer's first message; the harness constructs all nodes before starting them.ActiveRoastAttemptbinding, never from peer messages; the node records its own share explicitly (no bus self-echo); envelope/payload bytes are copied at the boundary.Tests
N=3 nodes run concurrently to a successful aggregate; each emits the signature and transitions its attempt to
Succeeded. Passes under-race. A fix surfaced while writing it:NoOpSignersignsnil, which the envelopes'Marshalrejects — so the test uses a fixed non-empty signer + the accept-anything verifier (exercises the envelope/signing plumbing, not real crypto).Validation
frost_nativesuite +-race+gofmt+ repo-widefrost_nativevet + default build green.client-*CI builds default tags, notfrost_native; validated locally.Deferred (fake-ignored, finalized at cgo wiring)
The FROST member-identifier encoding and the attempt-context fingerprint/id are placeholders here (the fake ignores them, and the runner drives rounds with the attempt id the engine returns); their engine-valid derivation is settled when the real cgo engine is wired.
Next
PR3b — the sad/blame path: fault-inject a bad share →
InteractiveAggregatereturns culprits →ClassifyCandidateCulprits(engine-backed verifier) →RecordEvidence→AggregateBundle→NextAttemptexcludes the culprit and the next attempt re-elects/retries.🤖 Generated with Claude Code