Add inline recovery records to the zstd stream: Reed-Solomon parity carried in skippable frames, so a .zst file can self-heal on-disk corruption (bit-rot, bad sectors) without an external .par2 sidecar. This is the RAR -rr / par2 idea native to zstd.
Standard zstd decoders (incl. C libzstd) skip these frames per RFC 8878 §3.1.2, so the feature is strictly drop-in: a .zst with recovery records decodes byte-identically on any compliant decoder, and ours additionally repairs corruption before decode.
Gated behind a Cargo feature (default off); the default-build cdylib stays strict drop-in for donor libzstd v1.5.7.
Why in structured-zstd (not at the consumer's layer)
For a standalone, unencrypted .zst, the zstd stream is the final on-disk byte layout, so parity over those bytes is exactly "ECC over the final on-disk bytes" at the only layer present. (For layered storage with encryption above zstd, e.g. lsm-tree, recovery must live at that outer layer over the ciphertext: lsm-tree's Page ECC, not this. This feature does not change or duplicate that; the two target different products and do not stack.)
No current internal consumer drives this; it is a product-surface bet ("zstd + recovery") for the broader CLI/library audience and the project's drop-in-plus positioning. It composes cleanly with the introspection and partial-decode primitives already shipped (#173 FrameEmitInfo, #174 block-precise errors, #175 block-subset partial decode).
Design
Carrier: skippable frame, magic 0x184D2A5F
Allocated from the high end of the skippable range (structured-zstd native), per the updated pointwise allocation policy (see "Registry change" below). The recovery skippable-frame payload is versioned from day one (it becomes a wire contract once shipped).
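As a rough illustration of the carrier, the sketch below wraps a payload in a skippable frame using the layout from RFC 8878 (4-byte little-endian magic, 4-byte little-endian payload size, then the payload) and shows how a reader can step over a frame it does not understand. The constant and function names are hypothetical and not part of the crate's API.

```rust
use std::convert::{TryFrom, TryInto};

/// Magic proposed in this issue for the recovery frame (top of the skippable range).
const RECOVERY_MAGIC: u32 = 0x184D_2A5F;

/// True for the sixteen skippable magics 0x184D2A50..=0x184D2A5F.
fn is_skippable(magic: u32) -> bool {
    magic & 0xFFFF_FFF0 == 0x184D_2A50
}

/// Wrap `payload` in a skippable frame: magic (LE u32), size (LE u32), payload.
/// Returns None for a non-skippable magic or a payload too large for the size field.
fn wrap_skippable(magic: u32, payload: &[u8]) -> Option<Vec<u8>> {
    if !is_skippable(magic) {
        return None;
    }
    let size = u32::try_from(payload.len()).ok()?;
    let mut out = Vec::with_capacity(8 + payload.len());
    out.extend_from_slice(&magic.to_le_bytes());
    out.extend_from_slice(&size.to_le_bytes());
    out.extend_from_slice(payload);
    Some(out)
}

/// If `buf` starts with a complete skippable frame, return
/// (magic, payload, total frame length); a plain decoder only needs the length.
fn read_skippable(buf: &[u8]) -> Option<(u32, &[u8], usize)> {
    let magic = u32::from_le_bytes(buf.get(0..4)?.try_into().ok()?);
    if !is_skippable(magic) {
        return None;
    }
    let size = u32::from_le_bytes(buf.get(4..8)?.try_into().ok()?) as usize;
    let end = 8usize.checked_add(size)?;
    let payload = buf.get(8..end)?;
    Some((magic, payload, end))
}
```

A decoder that does not know the recovery scheme would advance by the returned frame length and carry on with the next frame, which is the behaviour the drop-in claim relies on.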
Payload header (sketch):
scheme_id / version
shard_size (sector-aligned, e.g. 4 KiB, to match the physical failure unit: bad sector / bit-rot is sector-sized)
RS parameters (k data shards, m parity shards); redundancy = m / k
coverage byte-range over the preceding data frames
a strong per-shard checksum (XXH3/XXH64) so the reader knows which shards are damaged (RS erasure decoding needs erasure positions)
the parity bytes
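To make the header sketch more concrete, here is one possible fixed-size encoding of the scalar fields. Every field name, width, and ordering below is an assumption made for illustration, as is the 512-byte alignment check; the per-shard checksums and parity bytes are assumed to follow this header in the payload and are not modelled here.

```rust
use std::convert::TryInto;

/// Hypothetical fixed-size header at the start of a recovery-frame payload.
/// Field names and widths are illustrative assumptions, not a wire format.
#[derive(Debug, Clone, PartialEq, Eq)]
struct RecoveryHeader {
    scheme_id: u8,
    version: u8,
    shard_size: u32,
    data_shards: u16,    // k
    parity_shards: u16,  // m
    coverage_start: u64, // first covered byte of the preceding data frames
    coverage_len: u64,   // number of covered bytes
}

const HEADER_LEN: usize = 26; // 1 + 1 + 4 + 2 + 2 + 8 + 8
const SECTOR: u32 = 512; // assumed minimum alignment unit

impl RecoveryHeader {
    /// Redundancy as defined in the sketch above: m / k.
    fn redundancy(&self) -> f64 {
        f64::from(self.parity_shards) / f64::from(self.data_shards)
    }

    fn is_valid(&self) -> bool {
        self.shard_size != 0
            && self.shard_size % SECTOR == 0
            && self.data_shards != 0
            && self.parity_shards != 0
    }

    /// Serialize all fields little-endian, in declaration order.
    fn to_bytes(&self) -> [u8; HEADER_LEN] {
        let mut out = [0u8; HEADER_LEN];
        out[0] = self.scheme_id;
        out[1] = self.version;
        out[2..6].copy_from_slice(&self.shard_size.to_le_bytes());
        out[6..8].copy_from_slice(&self.data_shards.to_le_bytes());
        out[8..10].copy_from_slice(&self.parity_shards.to_le_bytes());
        out[10..18].copy_from_slice(&self.coverage_start.to_le_bytes());
        out[18..26].copy_from_slice(&self.coverage_len.to_le_bytes());
        out
    }

    /// Parse and validate; None on short input or invalid parameters.
    fn from_bytes(buf: &[u8]) -> Option<Self> {
        let b: &[u8; HEADER_LEN] = buf.get(..HEADER_LEN)?.try_into().ok()?;
        let header = RecoveryHeader {
            scheme_id: b[0],
            version: b[1],
            shard_size: u32::from_le_bytes(b[2..6].try_into().ok()?),
            data_shards: u16::from_le_bytes(b[6..8].try_into().ok()?),
            parity_shards: u16::from_le_bytes(b[8..10].try_into().ok()?),
            coverage_start: u64::from_le_bytes(b[10..18].try_into().ok()?),
            coverage_len: u64::from_le_bytes(b[18..26].try_into().ok()?),
        };
        if header.is_valid() { Some(header) } else { None }
    }
}
```

Putting scheme_id and version first means a reader can reject an unknown scheme before interpreting any other field, which matters once the payload becomes a wire contract.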
Parity scheme
Concatenate the on-disk bytes of the data frames, split into sector-aligned shards.
Group into RS(k, m) stripes; parity shards stored in recovery frame(s).
Placement: trailing (par2-style, simplest) or interleaved every N data frames (RAR recovery-volume-style, better for streaming/locality). Start with trailing; interleaving is a follow-up.
Encode
After producing the data frames, compute parity over the concatenated bytes at the configured redundancy and emit the recovery skippable frame(s).
Configurable redundancy percent.
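The sharding and striping arithmetic described above can be sketched as a small planning function. Rounding m up from the requested percentage and zero-padding the final shard are assumptions of this sketch, not decisions taken in the issue; the names are hypothetical.

```rust
/// Result of laying out RS(k, m) stripes over a byte range.
#[derive(Debug, PartialEq, Eq)]
struct StripePlan {
    data_shards: u64,       // shards needed to cover the data (last one padded)
    stripes: u64,           // groups of up to k data shards
    parity_per_stripe: u64, // m
    parity_bytes: u64,      // total parity carried in recovery frame(s)
}

/// Integer division rounding up; `b` must be non-zero.
fn ceil_div(a: u64, b: u64) -> u64 {
    a / b + u64::from(a % b != 0)
}

/// Plan stripes for `total_len` covered bytes.
/// `redundancy_pct` is the requested m / k ratio in percent; m is rounded up
/// so the effective redundancy is never below the request (an assumption).
fn plan_stripes(total_len: u64, shard_size: u64, k: u64, redundancy_pct: u64) -> Option<StripePlan> {
    if total_len == 0 || shard_size == 0 || k == 0 || redundancy_pct == 0 {
        return None;
    }
    let data_shards = ceil_div(total_len, shard_size);
    let stripes = ceil_div(data_shards, k);
    let m = ceil_div(k.checked_mul(redundancy_pct)?, 100);
    let parity_bytes = stripes.checked_mul(m)?.checked_mul(shard_size)?;
    Some(StripePlan { data_shards, stripes, parity_per_stripe: m, parity_bytes })
}
```

For example, 1 MiB of frame bytes with 4 KiB shards, k = 16 and a 10% request gives 256 data shards in 16 stripes with m = 2, so 128 KiB of parity (an effective 12.5% because m is rounded up).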
RS implementation
Reuse a no_std + alloc RS crate (reed-solomon-simd / reed-solomon-erasure) with a scalar fallback; do not hand-roll the field arithmetic.
API (feature-gated)
Encode: a wrapper / FrameCompressor option with_recovery(redundancy_pct) that post-processes the frame stream to append parity frames.
Decode: a RecoveringDecoder wrapper that verifies + repairs, then delegates to FrameDecoder / StreamingDecoder.
CLI: --recovery=N% on encode → inline recovery; our -d auto-repairs; C zstd -d ignores the recovery frames and decodes (when uncorrupted).
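For the CLI surface, parsing the flag could look roughly like the following. The accepted range of 1 to 100 percent is an assumption of this sketch; the issue only states that the flag takes the form --recovery=N%.

```rust
/// Parse `--recovery=N%` into a redundancy percentage.
/// Returns None for any other argument, a missing `%`, or a value outside 1..=100.
fn parse_recovery_flag(arg: &str) -> Option<u8> {
    let value = arg.strip_prefix("--recovery=")?.strip_suffix('%')?;
    let pct: u8 = value.parse().ok()?;
    if (1..=100).contains(&pct) {
        Some(pct)
    } else {
        None
    }
}
```

The parsed percentage would then feed whatever the encode wrapper accepts as its redundancy argument.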
Registry change (pointwise, dual-end allocation)
Update docs/SKIPPABLE_MAGIC_ALLOCATIONS.md to drop the "consumer owns a contiguous range" model in favour of pointwise allocation of concrete frame types, growing from both ends:
lsm-tree allocates from the low end upward: 2A50 MetadataFrame, 2A51 BodyFrame (only what it uses).
structured-zstd allocates from the high end downward: 2A5F RecoveryFrame.
2A53..2A5E stay in the unallocated pool. No "future headroom" reservation.
This frees the magic for RecoveryFrame and removes lsm-tree's blanket 16-variant reservation (it actively uses 2). lsm-tree needs a heads-up that its reservation narrows from 2A50..2A5F to the two variants it uses; nothing breaks (the released/unused variants were never emitted), but the registry is a wire contract, so the change is coordinated.
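Read as code, the pointwise policy amounts to a lookup from the low nibble of the magic to a concrete frame type, with everything else left in the pool. The enum and function names below are hypothetical; only the three allocations come from the list above.

```rust
/// Frame types under the pointwise, dual-end allocation described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SkippableKind {
    /// 0x184D2A50, allocated by lsm-tree from the low end.
    LsmMetadata,
    /// 0x184D2A51, allocated by lsm-tree from the low end.
    LsmBody,
    /// 0x184D2A5F, allocated by structured-zstd from the high end.
    Recovery,
    /// Any other variant: still in the unallocated pool.
    Unallocated(u8),
}

/// Classify a magic number; None if it is not in the skippable range at all.
fn classify_magic(magic: u32) -> Option<SkippableKind> {
    if magic & 0xFFFF_FFF0 != 0x184D_2A50 {
        return None;
    }
    Some(match (magic & 0xF) as u8 {
        0x0 => SkippableKind::LsmMetadata,
        0x1 => SkippableKind::LsmBody,
        0xF => SkippableKind::Recovery,
        variant => SkippableKind::Unallocated(variant),
    })
}
```

A reader following this table still skips unallocated variants like any other skippable frame; the registry only decides who may emit them.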
Acceptance criteria
Encode emits recovery skippable frame(s) at a configurable redundancy; a standard zstd decoder (C libzstd) decodes the uncorrupted output byte-identically (drop-in proof).
Fault injection: corrupt up to m shards per stripe → reader repairs and decodes to the original bytes; assert repair fired (metric/flag).
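The fault-injection criterion can be prototyped without an RS dependency. The sketch below deliberately swaps in simpler pieces: a single XOR parity shard per stripe stands in for Reed-Solomon (so it models m = 1 only), and FNV-1a stands in for the XXH3/XXH64 per-shard checksum. It is meant to show the verify, locate, rebuild flow and the "repair fired" signal, not the proposed implementation; all names are hypothetical.

```rust
/// FNV-1a 64-bit, a stand-in for the XXH3/XXH64 per-shard checksum.
fn checksum(data: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for &b in data {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// Recovery data for one stripe: a checksum per data shard plus one XOR
/// parity shard (a stand-in for m RS parity shards).
struct StripeRecovery {
    sums: Vec<u64>,
    parity: Vec<u8>,
}

#[derive(Debug, PartialEq, Eq)]
enum RepairError {
    /// More shards are damaged than the parity can rebuild (here: more than one).
    Unrecoverable { damaged: usize },
}

/// Build recovery data for a stripe of equally sized shards.
fn protect(shards: &[Vec<u8>]) -> StripeRecovery {
    let len = shards.first().map_or(0, |s| s.len());
    let mut parity = vec![0u8; len];
    for shard in shards {
        for (p, b) in parity.iter_mut().zip(shard) {
            *p ^= *b;
        }
    }
    StripeRecovery { sums: shards.iter().map(|s| checksum(s)).collect(), parity }
}

/// Verify every shard and rebuild a single damaged one in place.
/// Ok(true) means a repair fired, Ok(false) means the stripe was clean.
/// Simplification: the parity shard itself is assumed to be intact.
fn repair(shards: &mut [Vec<u8>], rec: &StripeRecovery) -> Result<bool, RepairError> {
    // The checksums give the erasure positions.
    let damaged: Vec<usize> = shards
        .iter()
        .zip(&rec.sums)
        .enumerate()
        .filter(|(_, (shard, sum))| checksum(shard) != **sum)
        .map(|(i, _)| i)
        .collect();
    match damaged.as_slice() {
        [] => Ok(false),
        [bad] => {
            // XOR of the parity and all healthy shards yields the missing shard.
            let mut rebuilt = rec.parity.clone();
            for (i, shard) in shards.iter().enumerate() {
                if i != *bad {
                    for (r, b) in rebuilt.iter_mut().zip(shard) {
                        *r ^= *b;
                    }
                }
            }
            shards[*bad] = rebuilt;
            Ok(true)
        }
        many => Err(RepairError::Unrecoverable { damaged: many.len() }),
    }
}
```

A test in this style flips a bit in one shard, checks that repair returns Ok(true) and that the shards equal the original, then damages two shards and checks for the typed error rather than silently wrong data.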
Summary
Add inline recovery records to the zstd stream: Reed-Solomon parity carried in skippable frames, so a .zst file can self-heal on-disk corruption (bit-rot, bad sectors) without an external .par2 sidecar. This is the RAR -rr / par2 idea native to zstd.
Standard zstd decoders (incl. C libzstd) skip these frames per RFC 8878 §3.1.2, so the feature is strictly drop-in: a .zst with recovery records decodes byte-identically on any compliant decoder, and ours additionally repairs corruption before decode.
Gated behind a Cargo feature (default off); the default-build cdylib stays strict drop-in for donor libzstd v1.5.7.
Why in structured-zstd (not at the consumer's layer)
For a standalone, unencrypted .zst, the zstd stream is the final on-disk byte layout, so parity over those bytes is exactly "ECC over the final on-disk bytes" at the only layer present. (For layered storage with encryption above zstd, e.g. lsm-tree, recovery must live at that outer layer over the ciphertext: lsm-tree's Page ECC, not this. This feature does not change or duplicate that; the two target different products and do not stack.)
No current internal consumer drives this; it is a product-surface bet ("zstd + recovery") for the broader CLI/library audience and the project's drop-in-plus positioning. It composes cleanly with the introspection and partial-decode primitives already shipped (#173 FrameEmitInfo, #174 block-precise errors, #175 block-subset partial decode).
Design
Carrier: skippable frame, magic 0x184D2A5F
Allocated from the high end of the skippable range (structured-zstd native), per the updated pointwise allocation policy (see "Registry change" below). The recovery skippable-frame payload is versioned from day one (it becomes a wire contract once shipped).
Payload header (sketch):
scheme_id / version
shard_size (sector-aligned, e.g. 4 KiB, to match the physical failure unit: bad sector / bit-rot is sector-sized)
RS parameters (k data shards, m parity shards); redundancy = m / k
coverage byte-range over the preceding data frames
a strong per-shard checksum (XXH3/XXH64) so the reader knows which shards are damaged (RS erasure decoding needs erasure positions)
the parity bytes
Parity scheme
Concatenate the on-disk bytes of the data frames, split into sector-aligned shards.
Group into RS(k, m) stripes; parity shards stored in recovery frame(s).
Placement: trailing (par2-style, simplest) or interleaved every N data frames (RAR recovery-volume-style, better for streaming/locality). Start with trailing; interleaving is a follow-up.
[…] FrameEmitInfo.blocks[*] ranges (#173) so repair and partial decode (#175) are block-precise.
Repair on read
[…] decode_blocks_partial (#175) to return the still-good prefix.
Encode
After producing the data frames, compute parity over the concatenated bytes at the configured redundancy and emit the recovery skippable frame(s).
Configurable redundancy percent.
RS implementation
Reuse a no_std + alloc RS crate (reed-solomon-simd / reed-solomon-erasure) with a scalar fallback; do not hand-roll the field arithmetic.
API (feature-gated)
Encode: a wrapper / FrameCompressor option with_recovery(redundancy_pct) that post-processes the frame stream to append parity frames.
Decode: a RecoveringDecoder wrapper that verifies + repairs, then delegates to FrameDecoder / StreamingDecoder.
CLI: --recovery=N% on encode → inline recovery; our -d auto-repairs; C zstd -d ignores the recovery frames and decodes (when uncorrupted).
Registry change (pointwise, dual-end allocation)
Update docs/SKIPPABLE_MAGIC_ALLOCATIONS.md to drop the "consumer owns a contiguous range" model in favour of pointwise allocation of concrete frame types, growing from both ends:
lsm-tree allocates from the low end upward: 2A50 MetadataFrame, 2A51 BodyFrame (only what it uses).
structured-zstd allocates from the high end downward: 2A5F RecoveryFrame.
2A52 (the retired EccFrame, per lsm-tree #254) is released back to the pool.
2A53..2A5E stay in the unallocated pool. No "future headroom" reservation.
This frees the magic for RecoveryFrame and removes lsm-tree's blanket 16-variant reservation (it actively uses 2). lsm-tree needs a heads-up that its reservation narrows from 2A50..2A5F to the two variants it uses; nothing breaks (the released/unused variants were never emitted), but the registry is a wire contract, so the change is coordinated.
Acceptance criteria
Encode emits recovery skippable frame(s) at a configurable redundancy; a standard zstd decoder (C libzstd) decodes the uncorrupted output byte-identically (drop-in proof).
Fault injection: corrupt up to m shards per stripe → reader repairs and decodes to the original bytes; assert repair fired (metric/flag).
[…] (> m shards) → typed error, never silent wrong data; optionally returns the good prefix via #175.
[…] cdylib build byte-identical and strict drop-in for donor v1.5.7.
docs/SKIPPABLE_MAGIC_ALLOCATIONS.md updated to the pointwise dual-end policy with 2A5F RecoveryFrame allocated and 2A52 released.
[…] --recovery=N% end-to-end (encode adds records, decode auto-repairs).
Out of scope
Related