377 changes: 377 additions & 0 deletions docs/svs-v3-revision.md
@@ -0,0 +1,377 @@
# State Vector Sync (SVS) v3: Revision Specification

This document **revises** SVS v3 for large synchronization groups. It is **not** a new protocol. Existing SVS v3 semantics apply whenever the complete State Vector fits within the configured size threshold.

---

## 1. Basic Protocol Design

### 1.1 Small groups

For most deployments, the complete State Vector fits in one Sync packet. Nodes exchange **full** State Vectors using existing SVS v3 logic (suppression, steady state, merge, `OnUpdate`).

### 1.2 Large groups

**`SyncVectorThreshold`** is a configurable, application-level limit on the encoded State Vector size. Depending on whether the FULL encoding fits within it, nodes choose among three dissemination modes:

| Mode | When | On the wire |
|------|------|-------------|
| **Inline FULL** | Encoded FULL fits in threshold | `mhash` + `VectorType=FULL` + complete `StateVector` in Sync Data |
| **Inline PARTIAL** | **New publication** and FULL exceeds threshold | `mhash` + `VectorType=PARTIAL` + subset `StateVector` in Sync Data |
| **Announce + pull** | **Periodic sync** (large group), or **`mhash` mismatch** | Produce full vector Data at `32=sv/<version>`; Sync Data carries `mhash` + reference Name only |

**MemberSetHash (`mhash`)** is always carried inside `SvsData`. It is a **membership hash**, not a full-vector hash.

**Full state recovery** (fetch complete State Vector from a remote sync member) uses **announce + pull** when:

1. **`mhash` differs** from the local membership hash, or
2. **Periodic sync** runs while the local FULL encoding exceeds `SyncVectorThreshold`, or
3. An inline **`VectorType = FULL`** State Vector is outdated per Section 6.2.

Link-level **fragmentation** (NDNLPv2) is not a concern for implementers. Publishers use **existing ndnd object segmentation** APIs when retrievable full-vector Data is large.

---

## 2. Format and Naming

### 2.1 Sync Interest

**Sync Interest Name:**

```
/<sync-prefix>/v=3
```

Implementations MAY append additional name components after `v=3`. The Interest **nonce** is carried in Interest packet fields, **not** as a name component.

- Signed **Sync Data** is carried in `ApplicationParameters`.
- Interest Lifetime SHOULD be 1 second.
- Sync Interests are **not** acknowledged (unchanged).

### 2.2 Sync Data (in ApplicationParameters)

**Sync Data Name** (signing identity for the Sync message):

```
/<group>/<node>/<boot time>/<version>
```

- **`version`:** microsecond timestamp (default). A hash suffix component is deferred.

**Sync Data Content:** encoded `SvsData` (Section 3) — either **inline** or **announce-only** form.

### 2.3 Application publication Data

```
/<group>/<node>/<boot time>/seq=<n>
```

Application-level naming may vary. **Sync vector Data MUST NOT share the application publication namespace.** The `32=sv` keyword (Section 2.4) separates sync state from application Data.

### 2.4 Published full State Vector Data

Retrievable full State Vector objects use a dedicated sync namespace:

**Name:**

```
/<group>/<node>/<boot time>/32=sv/<version>
```

**Content:** signed `SvsData` in **inline FULL** form: `mhash` + `VectorType = FULL` + complete `StateVector`.

**Announce + pull procedure** (periodic sync, `mhash` recovery, join when FULL exceeds threshold):

1. **Produce** the complete full-vector Data at `/<group>/<node>/<boot>/32=sv/<version>` (use ndnd segmentation when large).
2. **Send** a Sync Interest whose AppParam Sync Data contains **announce-only** `SvsData`: `mhash` + `SvsDataRef` pointing at that published name (Section 3.1).
3. Receivers **pull** the referenced Data, validate, and merge.

Do **not** send an inline PARTIAL vector alongside an announce-only Sync message.
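
The naming in steps 1 and 2 can be sketched as follows. This is a minimal illustration, not ndnd code: names are handled as URI strings, `<boot time>` and `<version>` are rendered as plain decimal components, and the helpers `full_vector_name`, `is_sync_vector_name`, and `make_announce` are hypothetical. The dict returned by `make_announce` only stands in for the announce-only `SvsData` of Section 3.1.2.

```python
import time

SV_KEYWORD = "32=sv"  # keyword component from Section 2.4


def now_version() -> int:
    """Microsecond timestamp, the default <version> per Section 2.2."""
    return time.time_ns() // 1_000


def full_vector_name(group: str, node: str, boot: int, version: int) -> str:
    """Build /<group>/<node>/<boot time>/32=sv/<version> (URI form is an assumption)."""
    parts = [group.strip("/"), node.strip("/"), str(boot), SV_KEYWORD, str(version)]
    return "/" + "/".join(parts)


def is_sync_vector_name(name: str) -> bool:
    """True when the second-to-last component is the 32=sv keyword."""
    parts = name.strip("/").split("/")
    return len(parts) >= 2 and parts[-2] == SV_KEYWORD


def make_announce(mhash: bytes, ref_name: str) -> dict:
    """Stand-in for announce-only SvsData: mhash plus a reference Name, no vector."""
    if len(mhash) != 32:
        raise ValueError("mhash must be a 32-byte SHA-256 digest")
    if not is_sync_vector_name(ref_name):
        raise ValueError("SvsDataRef must point into the 32=sv namespace")
    return {"mhash": mhash, "ref": ref_name}
```

Rejecting a reference outside the `32=sv` namespace keeps application publication names such as `/<group>/<node>/<boot time>/seq=<n>` from being announced as sync state.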

---

## 3. Packet Specification

### 3.1 `SvsData`

`SvsData` has two forms. **`mhash` is always present.** It is not a separate protocol message.

#### 3.1.1 Inline form (FULL or PARTIAL)

Used when the State Vector (or a publication-time PARTIAL subset) is carried inline in Sync Data, or in published full-vector Data at `32=sv/<version>`.

```
SvsData = SVS-DATA-TYPE TLV-LENGTH
            MemberSetHash
            VectorType
            StateVector
```

| Field | TLV type | Value |
|-------|----------|-------|
| `MemberSetHash` | `0xCB` | 32-byte SHA-256 digest (`mhash`) |
| `VectorType` | `0xCD` | `0` = FULL, `1` = PARTIAL |
| `StateVector` | `0xC9` | See Section 3.2 |

#### 3.1.2 Announce-only form

Used when Sync Data only advertises a retrievable full-vector Data name (periodic sync, `mhash` recovery). **No `VectorType` or `StateVector`.**

```
SvsData = SVS-DATA-TYPE TLV-LENGTH
            MemberSetHash
            SvsDataRef
```

| Field | TLV type | Value |
|-------|----------|-------|
| `MemberSetHash` | `0xCB` | 32-byte SHA-256 digest (`mhash`) |
| `SvsDataRef` | `0x07` (Name) | Name of published full-vector Data: `/<group>/<node>/<boot>/32=sv/<version>` |

> **Note:** The inline layout extends ndnd v3 `SvsData` with `MemberSetHash` and `VectorType` before `StateVector`, matching the Python strawman (`mhash` at `0xCB`, vector at `0xC9`/`0xCA`).

### 3.2 `StateVector`

```
StateVector = STATE-VECTOR-TYPE TLV-LENGTH
                *StateVectorEntry

StateVectorEntry = STATE-VECTOR-ENTRY-TYPE TLV-LENGTH
                     Name
                     *SeqNoEntry

SeqNoEntry = SEQ-NO-ENTRY-TYPE TLV-LENGTH
               BootstrapTime
               SeqNo
```

| TLV | Type (decimal) | Type (hex) |
|-----|----------------|------------|
| `STATE-VECTOR-TYPE` | 201 | `0xC9` |
| `STATE-VECTOR-ENTRY-TYPE` | 202 | `0xCA` |
| `SEQ-NO-ENTRY-TYPE` | 210 | `0xD2` |
| `BOOTSTRAP-TIME-TYPE` | 212 | `0xD4` |
| `SEQ-NO-TYPE` | 214 | `0xD6` |

**Rules (unchanged from SVS v3):**

- Sequence numbers are 1-indexed.
- Bootstrap time is seconds since Unix epoch.
- If an entry is absent, its sequence number is treated as 0 for comparison.
- If any received `BootstrapTime` is more than 86400s in the future, the entire `StateVector` SHOULD be ignored.
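
The last rule can be checked before any merge. A minimal sketch, assuming bootstrap times have already been decoded to integer seconds; the function name and the `now` parameter (added so the check is testable) are hypothetical:

```python
import time

MAX_FUTURE_SKEW_S = 86400  # one day, from the rule above


def should_ignore_state_vector(boot_times, now=None) -> bool:
    """True if any BootstrapTime lies more than 86400 s in the future."""
    current = time.time() if now is None else now
    return any(bt > current + MAX_FUTURE_SKEW_S for bt in boot_times)
```

A single offending entry causes the whole `StateVector` to be skipped, not only that entry.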

### 3.3 `MemberSetHash` (`mhash`)

**`mhash` is a membership hash.** It is **not** a hash of the full State Vector and **not** a hash of sequence numbers.

**Membership** is the set of participants, each identified by:

```
(Producer Name, Bootstrap Time)
```

**Computation:**

```
members = { (Name, BootstrapTime) | node knows this member in the sync group }
sort by NDN canonical order of Name, then by BootstrapTime ascending
mhash = SHA-256( concatenation of canonical TLV bytes of each (Name, BootstrapTime) pair )
```

Recompute `mhash` whenever membership changes (member added, removed, or new bootstrap time for a name).
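
The computation above might look like the following sketch. Several details are assumptions, since this document does not fix them: names are tuples of generic name components (TLV type `0x08`) inside a Name TLV (`0x07`), each `BootstrapTime` is encoded as a `0xD4` TLV holding a non-negative integer, and the pair is hashed as the Name TLV followed by the BootstrapTime TLV. All helper names are hypothetical.

```python
import hashlib
from typing import Iterable, Tuple

Name = Tuple[bytes, ...]  # one bytes value per generic name component


def encode_var_number(n: int) -> bytes:
    """NDN variable-length number, used for TLV-TYPE and TLV-LENGTH."""
    if n < 253:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + n.to_bytes(2, "big")
    if n <= 0xFFFFFFFF:
        return b"\xfe" + n.to_bytes(4, "big")
    return b"\xff" + n.to_bytes(8, "big")


def tlv(typ: int, value: bytes) -> bytes:
    return encode_var_number(typ) + encode_var_number(len(value)) + value


def encode_nonneg_int(n: int) -> bytes:
    """NDN non-negative integer: shortest of 1, 2, 4, or 8 bytes."""
    for size in (1, 2, 4, 8):
        if n < 1 << (8 * size):
            return n.to_bytes(size, "big")
    raise ValueError("integer too large")


def encode_name(name: Name) -> bytes:
    return tlv(0x07, b"".join(tlv(0x08, c) for c in name))


def canonical_key(name: Name):
    """Canonical order for same-type components: shorter first, then byte-wise."""
    return tuple((len(c), c) for c in name)


def compute_mhash(members: Iterable[Tuple[Name, int]]) -> bytes:
    """SHA-256 over sorted (Name, BootstrapTime) pairs; sequence numbers are excluded."""
    ordered = sorted(set(members), key=lambda m: (canonical_key(m[0]), m[1]))
    h = hashlib.sha256()
    for name, boot in ordered:
        h.update(encode_name(name))
        h.update(tlv(0xD4, encode_nonneg_int(boot)))
    return h.digest()
```

Because the input is sorted and deduplicated first, two nodes that know the same members produce the same digest regardless of the order in which they learned them.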

> **Note:** The Python strawman hashes sorted producer **names only**. This revision includes **Bootstrap Time** in each membership tuple, consistent with SVS v3 identity.

**Membership data and State Vector data are separate concepts.** Membership is carried implicitly in the full State Vector. `mhash` summarizes membership for quick comparison.

### 3.4 `VectorType` (inline form only)

| Value | Name | Meaning |
|-------|------|---------|
| `0` | **FULL** | `StateVector` contains the complete advertised state (Section 4.1 ordering). |
| `1` | **PARTIAL** | `StateVector` contains a subset (Section 4.2). Used for **new publication** only when FULL exceeds threshold. |

`mhash` is present in both inline and announce-only `SvsData` messages.

---

## 4. State Vector Encoding

### 4.1 FULL State Vector

- Include all known members and their latest sequence numbers per bootstrap.
- Entries ordered in **NDN canonical order** of `Name` (unchanged SVS v3 rule).
- Set `VectorType = FULL`.

### 4.2 PARTIAL State Vector

Used **only on new publication** when `encoded_size(inline FULL SvsData) > SyncVectorThreshold`.

- Set `VectorType = PARTIAL`.
- **Entry `[0]`** MUST be the **sender's** own `StateVectorEntry`.
- **Entries `[1…n]`** MUST be in **NDN canonical order** among included peers.

An **implementation** MAY use the following selection priority:

| Priority | Include |
|----------|---------|
| 1 | Sender (always) |
| 2 | Repair targets |
| 3 | Propagation targets |
| 4 | Random inactive producers |
| 5 | Others by recency |

Stop adding entries when estimated inline `SvsData` size approaches `SyncVectorThreshold`.
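
One way to combine the ordering rules with the size budget is a greedy fill, sketched below. It assumes the caller supplies peers already ranked by the selection priority and a per-entry size estimate; `select_partial`, `entry_size`, and `overhead` are hypothetical names, and names are tuples of byte-string components of the same type.

```python
from typing import Callable, List, Sequence, Tuple

Name = Tuple[bytes, ...]


def canonical_key(name: Name):
    """Canonical order for same-type components: shorter first, then byte-wise."""
    return tuple((len(c), c) for c in name)


def select_partial(
    sender: Name,
    candidates: Sequence[Name],
    entry_size: Callable[[Name], int],
    budget: int,
    overhead: int = 0,
) -> List[Name]:
    """Return the names for a PARTIAL vector: sender first, peers canonically ordered.

    `candidates` is assumed to be sorted by selection priority already.
    `overhead` covers the fixed fields (mhash, VectorType, TLV headers).
    """
    used = overhead + entry_size(sender)
    chosen: List[Name] = []
    for peer in candidates:
        if peer == sender or peer in chosen:
            continue
        cost = entry_size(peer)
        if used + cost > budget:
            break  # stop once the next entry would exceed the budget
        chosen.append(peer)
        used += cost
    chosen.sort(key=canonical_key)  # entries [1..n] in canonical order
    return [sender] + chosen
```

Selection follows priority, but the emitted order does not: the chosen peers are re-sorted so that entries `[1…n]` satisfy the canonical-order requirement.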

### 4.3 `SyncVectorThreshold`

- Configurable implementation parameter (application packet size budget).
- Default value is implementation-defined.
- When `encoded_size(FULL) ≤ SyncVectorThreshold`, nodes use **inline FULL** only (existing SVS v3 behavior).

---

## 5. State Sync

Sections 5.1–5.4 are unchanged in spirit from [SVS v3 Section 4](https://named-data.github.io/StateVectorSync/Specification.html). This revision adds Sections 5.5–5.9.

### 5.1 Sync Interest timer

- `PeriodicTimeout` default 30s (±10% jitter).
- `SuppressionPeriod` default 200ms.
- `SuppressionTimeout` exponential decay (unchanged formula).
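
As a small illustration of the first bullet, the jittered periodic timeout could be drawn as below. The uniform distribution and the function name are assumptions; the suppression formula is not reproduced here.

```python
import random

PERIODIC_TIMEOUT_S = 30.0  # default from this section
JITTER_FRACTION = 0.10     # +/- 10 %


def next_periodic_timeout(base_s: float = PERIODIC_TIMEOUT_S,
                          jitter: float = JITTER_FRACTION) -> float:
    """Return base_s plus a uniformly drawn offset within +/- jitter * base_s."""
    spread = base_s * jitter
    return base_s + random.uniform(-spread, spread)
```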

### 5.2 Send Sync Interest on new publication

When the node generates a new publication, it immediately emits a Sync Interest and resets the timer to `PeriodicTimeout`.

| Condition | Action |
|-----------|--------|
| `encoded_size(inline FULL) ≤ SyncVectorThreshold` | Send **inline FULL** (`mhash` + `VectorType=FULL` + `StateVector`) |
| `encoded_size(inline FULL) > SyncVectorThreshold` | Send **inline PARTIAL** (`mhash` + `VectorType=PARTIAL` + subset `StateVector`) |

### 5.3 Sync Ack policy

Do not acknowledge Sync Interests.

### 5.4 Steady state and suppression (unchanged for inline FULL)

For incoming Sync Data with inline `VectorType = FULL`, apply existing SVS v3 steady-state and suppression rules.

### 5.5 PARTIAL State Vector processing

When `VectorType = PARTIAL`:

1. Parse `mhash` and `StateVector`.
2. **Do not** treat names **omitted** from the partial `StateVector` as producer removal, outdated sender (by omission alone), or sequence rollback.
3. For each **present** entry, merge newer sequence numbers into local state (unchanged merge rule).
4. If `mhash` differs from local `mhash`, perform **announce + pull** recovery (Section 5.6).

This is the primary **receive-side change** in ndnd (`svs.go`).
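
The four steps can be sketched as follows. This is an illustration under stated assumptions, not the ndnd implementation: local state is a dict keyed by `(name, bootstrap_time)`, the PARTIAL vector is already parsed into the same shape, and `process_partial` is a hypothetical name.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, int]  # (producer name, bootstrap time)


def process_partial(
    local: Dict[Key, int],
    local_mhash: bytes,
    received_mhash: bytes,
    entries: Dict[Key, int],
) -> Tuple[List[Key], bool]:
    """Merge a PARTIAL vector into `local`.

    Returns the keys whose sequence number advanced and whether full state
    recovery (announce + pull) is needed because the mhash values differ.
    """
    updated: List[Key] = []
    for key, seq in entries.items():
        # Absent local entries compare as sequence number 0.
        if seq > local.get(key, 0):
            local[key] = seq
            updated.append(key)
    # Keys missing from `entries` are left untouched: omission is not removal.
    needs_recovery = received_mhash != local_mhash
    return updated, needs_recovery
```

Only keys present in the message are ever written, so omitted producers keep their local sequence numbers.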

### 5.6 Full state recovery (announce + pull)

**Triggers:**

| # | Condition | Action |
|---|-----------|--------|
| 1 | `mhash` in received `SvsData` ≠ locally computed `mhash` | Announce + pull |
| 2 | Inline `VectorType = FULL` is outdated per Section 6.2 | Merge inline if complete; otherwise announce + pull |
| 3 | Periodic sync while local FULL exceeds `SyncVectorThreshold` | Sender: announce + pull (Section 5.8) |

**There is no separate membership-only retrieval.** Recovery always fetches the **complete State Vector** from the referenced `32=sv/<version>` Data.

**Procedure (sender on `mhash` mismatch or periodic large-group sync):**

1. Produce full-vector Data at `/<group>/<sender>/<boot>/32=sv/<version>` with inline FULL `SvsData`.
2. Send Sync Interest with announce-only `SvsData` (`mhash` + `SvsDataRef`).

**Procedure (receiver):**

1. Identify sender from Sync Data signature and/or PARTIAL entry `[0]`.
2. If Sync Data is **inline FULL** and complete: merge directly.
3. If Sync Data is **announce-only**: read `SvsDataRef`; express Interest for that name; validate; merge; update local `mhash`.
4. Continue application data fetch via SvsALO (`OnUpdate`) as today.

Use **ndnd segmentation** when fetched Data content is large.
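
The receiver procedure amounts to a dispatch on the `SvsData` form. In the sketch below, parsed messages are plain dicts and `fetch` is a hypothetical callback that expresses the Interest, reassembles segments, validates the signature, and returns the parsed full-vector `SvsData` or `None`. None of these names come from ndnd.

```python
from typing import Callable, Dict, List, Optional, Tuple

Key = Tuple[str, int]  # (producer name, bootstrap time)


def merge(local: Dict[Key, int], received: Dict[Key, int]) -> List[Key]:
    """Keep the maximum SeqNo per key; return the keys that advanced."""
    updated = []
    for key, seq in received.items():
        if seq > local.get(key, 0):
            local[key] = seq
            updated.append(key)
    return updated


def handle_sync_data(
    local: Dict[Key, int],
    svs_data: dict,
    fetch: Callable[[str], Optional[dict]],
) -> List[Key]:
    """Merge inline vectors directly; pull the referenced Data for announce-only."""
    if "ref" in svs_data:  # announce-only form: mhash + SvsDataRef
        fetched = fetch(svs_data["ref"])
        # Published full-vector Data must be inline FULL (Section 2.4).
        if fetched is None or fetched.get("vector_type") != "FULL":
            return []
        return merge(local, fetched["state_vector"])
    return merge(local, svs_data["state_vector"])  # inline form
```

Recomputing the local `mhash` after a merge that adds members, and handing the returned keys to `OnUpdate`, are left to the caller in this sketch.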

### 5.7 New node join

1. Joining node **N** multicasts a Sync Interest whose State Vector contains **only its own entry**, `(Name=N, SeqNo=0)`, together with its current `mhash`.
2. Existing members receive the announcement.
3. **Suppression** limits duplicate responses; typically one member **A** provides recovery state.
4. If FULL fits inline: **A** responds with inline `VectorType = FULL`.
5. If FULL exceeds `SyncVectorThreshold`: **A** uses **announce + pull** (produce at `32=sv/<version>`, then announce-only Sync Data).
6. Normal synchronization proceeds through SvsALO.

### 5.8 Periodic sync in large groups

| Local FULL size | Periodic Sync behavior |
|-----------------|------------------------|
| `≤ SyncVectorThreshold` | **Inline FULL** (existing SVS v3) |
| `> SyncVectorThreshold` | **Always announce + pull** (produce full-vector Data, then announce-only Sync Data) |

Periodic sync does **not** send inline PARTIAL vectors.

### 5.9 Summary of sync triggers

| Event | `size ≤ threshold` | `size > threshold` |
|-------|--------------------|--------------------|
| **New publication** | Inline FULL | Inline PARTIAL |
| **Periodic sync** | Inline FULL | Announce + pull |
| **`mhash` mismatch** | Announce + pull (if recovery needed) | Announce + pull |
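
The table reduces to a small decision function. The sketch below uses hypothetical enum and function names, and treats an `mhash` mismatch as always requiring recovery:

```python
from enum import Enum


class Event(Enum):
    NEW_PUBLICATION = "new publication"
    PERIODIC_SYNC = "periodic sync"
    MHASH_MISMATCH = "mhash mismatch"


class Mode(Enum):
    INLINE_FULL = "inline FULL"
    INLINE_PARTIAL = "inline PARTIAL"
    ANNOUNCE_PULL = "announce + pull"


def choose_mode(event: Event, full_size: int, threshold: int) -> Mode:
    """Pick the dissemination mode for an event, given the encoded FULL size."""
    if event is Event.MHASH_MISMATCH:
        return Mode.ANNOUNCE_PULL  # same answer on both sides of the threshold
    if full_size <= threshold:
        return Mode.INLINE_FULL    # existing SVS v3 behavior
    if event is Event.NEW_PUBLICATION:
        return Mode.INLINE_PARTIAL
    return Mode.ANNOUNCE_PULL      # periodic sync in a large group
```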

---

## 6. Comparing and Merging State Vectors

### 6.1 Merge rule

For each matching `(Name, BootstrapTime)`, retain the maximum `SeqNo`.

### 6.2 Outdated vector (inline FULL only)

State Vector `A` is outdated relative to `B` if:

- `A` is missing a name present in `B`, or
- `A` has a strictly smaller `SeqNo` for any entry.

For `VectorType = PARTIAL`, the missing-name rule **does not** apply to names omitted from the partial message.
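
A sketch of this comparison, with the PARTIAL exemption as a flag. As a simplification, vectors are dicts keyed by `(name, bootstrap_time)`, so a missing bootstrap entry counts as a missing name; the function name is hypothetical.

```python
from typing import Dict, Tuple

Key = Tuple[str, int]  # (producer name, bootstrap time)


def is_outdated(a: Dict[Key, int], b: Dict[Key, int], a_is_partial: bool = False) -> bool:
    """True if vector `a` is outdated relative to vector `b`."""
    for key, seq_b in b.items():
        if key not in a:
            if a_is_partial:
                continue  # omission from a PARTIAL message proves nothing
            return True   # FULL vector is missing an entry that b has
        if a[key] < seq_b:
            return True   # strictly smaller SeqNo
    return False
```

With `a_is_partial=True`, only entries actually present in `a` can make it outdated.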

---

## 7. Examples

### 7.1 Small group

Three nodes `A`, `B`, and `C` form a group whose full State Vector fits within the threshold. `A` publishes and sends a Sync Interest carrying the inline FULL vector `[A:11, B:15, C:25]`. Peers merge it.

### 7.2 Large group: new publication

The group's FULL encoding exceeds `SyncVectorThreshold`. Producer `P` publishes:

- `P` sends inline PARTIAL `SvsData { mhash, VectorType=PARTIAL, StateVector=[P:…, A:…, …] }`.
- Receiver merges present entries only.
- If `mhash` differs, `P` (or receiver per policy) triggers announce + pull (Section 5.6).

### 7.3 Large group: announce + pull

- `A` produces full vector at `/group/A/boot/32=sv/<version>`.
- `A` sends announce-only Sync Data `{ mhash, SvsDataRef=/group/A/boot/32=sv/<version> }`.
- Peers pull and merge.

### 7.4 New node join

- `N` sends self-only vector `[N:0]` with `mhash`.
- `A` responds with inline FULL or announce + pull.
- `N` merges and synchronizes via SvsALO.

---

## 8. Open Items

1. **Mixed-version interoperability** — how revised nodes coexist with plain SVS v3 peers in the same sync group, if at all.

---

