sync: SVS v3 large-group revision (mhash, PARTIAL, announce+pull)#190
sync: SVS v3 large-group revision (mhash, PARTIAL, announce+pull)#190Taranum01 wants to merge 9 commits into
Conversation
Implements the approved large-group SVS revision with backward-compatible defaults (SyncVectorThreshold=0). Adds mhash and PARTIAL/announce+pull paths, 32=sv publish/pull recovery, tests, and a large-sync example. Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Stop blind pulls to unpublished 32=sv prefixes on inline FULL mhash mismatch (spec §5.6: merge inline first). Clone SvMap before encoding outside the mutex to avoid concurrent map mutation panics. Co-authored-by: Cursor <cursoragent@cursor.com>
DV prefix-table SVS uses default SyncVectorThreshold=0. Recovery was still forcing announce-only sync on mhash mismatch, breaking step-2 prefix propagation while router reachability looked fine. Gate large-group recovery behind threshold > 0 for true legacy behavior. Co-authored-by: Cursor <cursoragent@cursor.com>
DV prefix-table sync uses default threshold 0. Emit StateVector-only SvsData (no mhash/VectorType) so step-2 SVS matches pre-revision behavior on the wire while large-group features stay behind threshold > 0. Co-authored-by: Cursor <cursoragent@cursor.com>
Adds topo.min.conf (6 nodes) and make e2e-local for reliable local validation on Mac Docker/Colima. Full sprint topo remains make e2e for CI. Co-authored-by: Cursor <cursoragent@cursor.com>
Publish the revision spec alongside the std/sync implementation in PR named-data#190. Co-authored-by: Cursor <cursoragent@cursor.com>
| @@ -0,0 +1,16 @@ | |||
| # Small topology for local e2e (6 nodes, minimal link delays). | |||
There was a problem hiding this comment.
Can we not introduce new topologies in e2e tests?
There was a problem hiding this comment.
Removed in 79b79ee — topo.min.conf, make e2e-local, and the docker script were local Mac dev helpers only. CI continues to use topo.sprint.conf.
| @@ -0,0 +1,21 @@ | |||
| #!/usr/bin/env bash | |||
There was a problem hiding this comment.
Can you explain why we need this extra things?
There was a problem hiding this comment.
Removed in 79b79ee for the same reason — not needed in the upstream repo.
| @@ -0,0 +1,82 @@ | |||
| package face | |||
There was a problem hiding this comment.
I am completely lost, we are revising the SVS protocol. Why invent the multicast face here?
There was a problem hiding this comment.
Fair point. Removed std/engine/face/multicast_face.go in 79b79ee. The in-process mesh tests now use a test-only multicast hub in svs_mesh_test.go (same coverage, no new production face type).
| } | ||
| svsData = buildAnnounceSvsData(stateSnap, ref) | ||
| } else { | ||
| svsData = buildSvsDataForSend( |
There was a problem hiding this comment.
Should use arg struct to avoid long parameter list.
There was a problem hiding this comment.
Done in 79b79ee — buildSvsDataForSend now takes a svsSendInput struct.
| mtimeSnap, | ||
| ) | ||
| } | ||
| logSyncSend(s, reason, svsData) |
There was a problem hiding this comment.
Is this a leftover from your debugging?
There was a problem hiding this comment.
Yes — removed in 79b79ee. Deleted svs_debug.go and the logSyncSend / logSyncRecv calls on the send/recv path.
| } | ||
| return exceedsSyncThreshold(threshold, inlineFullSize(state)) | ||
| } | ||
|
|
There was a problem hiding this comment.
I worry those helper functions are not helping with readability and mainainability.
There was a problem hiding this comment.
Addressed in 79b79ee — merged svs_announce.go into svs_pull.go (announce+pull recovery) and svs_encode.go (inline FULL/PARTIAL build) to reduce file sprawl.
| @@ -0,0 +1,89 @@ | |||
| package sync | |||
There was a problem hiding this comment.
I don't see it is necessary to have this file -- if this is only for debugging.
| @@ -0,0 +1,157 @@ | |||
| package sync | |||
There was a problem hiding this comment.
Can you describe what this test is about?
There was a problem hiding this comment.
Documented in 79b79ee — renamed to svs_mesh_test.go with a file-level comment. It is an in-process multi-node SVS harness (real engine + SvSync per node, test-only multicast hub) that exercises small-group sync, PARTIAL publication, and announce+pull recovery without mininet/NFD.
Remove local-only e2e topology and docker runner. Drop svs_debug and production multicast_face; keep mesh integration tests with a test-only hub in svs_mesh_test.go. Merge announce helpers into svs_pull, use svsSendInput for buildSvsDataForSend, and remove debug send logging. Co-authored-by: Cursor <cursoragent@cursor.com>
|
Thanks for the review, @tianyuan129 — addressed in 79b79ee:
Also updated the PR description:
|
Remove dead large-sync debug flag and tuName helper, clarify threshold-0 comments, trim pull-path debug logging, rename announce tests to svs_pull_test, and test pullRefFromSyncDataWire with a signed Sync Data wire. Co-authored-by: Cursor <cursoragent@cursor.com>
|
Pushed 60474b3 — minor style polish only (no logic changes): aligned `large-sync` with other SVS examples, clarified threshold-0 comments, trimmed pull-path debug logs, renamed `svs_announce_test.go` → `svs_pull_test.go`, and fixed `TestPullRefFromSyncDataWire` to use a real Sync Data wire. |
Summary
Implements the SVS v3 large-group revision (approved by Tianyuan Yu): inline FULL/PARTIAL sync, membership hash (
mhash), and announce+pull recovery via published full vectors at32=sv/<version>.size ≤ thresholdsize > thresholdmhashmismatchBackward compatible:
SyncVectorThresholddefaults to0(legacy mode: StateVector-only wire, nomhash/VectorType, no announce+pull recovery — matches pre-revision DV prefix-table SVS). Large-group modes activate whenthreshold > 0.Spec:
docs/svs-v3-revision.mdMain changes
MemberSetHash,VectorType,SvsDataRefonSvsData(std/ndn/svs/v3)ComputeMhash, PARTIAL encode, publish at32=sv, announce-only Sync, pull + merge (std/sync)std/examples/svs/large-sync(--threshold)std/sync/*_test.go, seesvs_mesh_test.go)Open item
Mixed-version interoperability (revised vs plain SVS v3 peers in one group) — spec §8; not addressed in this PR.
Test plan
go test ./std/sync/...make e2e, sprint topology)large-sync --threshold=80on/ndn/svs/{alice,bob,carol}with multicast on/ndn/svs