Draft: Explore warm relink validation reuse#264

tolgaergin wants to merge 2 commits into main from codex/warm-relink-validation-opt


@tolgaergin

Contributor

Summary

  • Reuse already-valid v2 link entries during warm relink prevalidation when link/object snapshots agree with the requested graph key and source SRI.
  • Skip full reusable-object metadata hashing and skip dispatching link_v2_one for packages that can be materialized from the validated link entry.
  • Keep conservative fallback: malformed/missing/mismatched snapshots or sidecars return None and use the existing object validation + link path.
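The reuse rule in the summary can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `LinkSnapshot`, `ObjectSnapshot`, `ReadyLink`, and `reusable_link_entry` are hypothetical names, and the real lpm-store types are not shown in this thread. The sketch checks snapshot agreement only and does not validate object contents, which is the gap the follow-up comment in this thread describes.

```rust
// Hypothetical types for illustration; the real lpm-store structures may differ.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LinkSnapshot {
    pub graph_key: String,
    pub source_sri: String,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ObjectSnapshot {
    pub source_sri: String,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ReadyLink {
    pub graph_key: String,
}

/// Returns `Some` only when both snapshots are present and agree with the
/// requested graph key and source SRI. Any missing or mismatched input yields
/// `None`, which the caller treats as "use the existing object validation +
/// link path".
pub fn reusable_link_entry(
    requested_graph_key: &str,
    requested_sri: &str,
    link: Option<&LinkSnapshot>,
    object: Option<&ObjectSnapshot>,
) -> Option<ReadyLink> {
    // A missing snapshot falls back immediately.
    let link = link?;
    let object = object?;

    if link.graph_key != requested_graph_key {
        return None;
    }
    if link.source_sri != requested_sri || object.source_sri != requested_sri {
        return None;
    }

    Some(ReadyLink {
        graph_key: link.graph_key.clone(),
    })
}
```

Returning `Option` keeps the fast path strictly opt-in: every case the function does not positively recognize degrades to the slower path.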

Measurements

Release binary: cargo build --release --locked -p lpm-cli --bin lpm-rs
Fixture matrix: cold, warm no-op, warm relink; 3 repeats; isolated project and LPM_HOME; LPM_TIMING_DETAIL=trace.

Baseline artifacts:

  • /tmp/lpm-install-perf-2026-06-19T18-47-26-032Z-14453/results
  • /tmp/lpm-install-perf-large-2026-06-19T18-48-19-694Z-16051/results

After artifacts:

  • /var/folders/p2/32lgcl857ds0wkcnkg0qk51h0000gn/T/lpm-install-perf-final-2026-06-19T19-44-32-829Z-17347/results
  • parsed comparison: /tmp/lpm-install-perf-final-comparison.json

| Fixture / phase | Baseline total | After total | Fetch | Link | Link await | v2 validation checked | metadata hash ms |
| --- | --- | --- | --- | --- | --- | --- | --- |
| small-33 warm relink | 160 ms | 105 ms | 80 -> 86 ms | 65 -> 3 ms | 62 -> 0 ms | 172 -> 0 | 112 -> 0 |
| large-112 warm relink | 237 ms | 166 ms | 127 -> 120 ms | 73 -> 9 ms | 65 -> 0 ms | 417 -> 0 | 308 -> 0 |
| small-33 warm no-op | 1 ms | 1 ms | 0 -> 0 ms | 0 -> 0 ms | 0 -> 0 ms | 0 -> 0 | 0 -> 0 |
| large-112 warm no-op | 3 ms | 3 ms | 0 -> 0 ms | 0 -> 0 ms | 0 -> 0 ms | 0 -> 0 | 0 -> 0 |

Cold installs stayed within network/cache noise and still use the existing object validation path. Earlier rejected trials: v2 cache-check concurrency 16->8, 16->32, and Unix mtime-nanos metadata hashing; each regressed or failed to produce a stable win.

Tests

  • cargo test -p lpm-store reusable_link_entry_from_snapshots --locked
  • cargo test -p lpm-cli prevalidate_v2_reusable_objects --locked
  • cargo test -p lpm-workflows --test install install_json_timing_detail --locked
  • cargo fmt --check
  • cargo build --workspace --locked
  • cargo clippy --workspace --all-targets --locked -- -D warnings
  • cargo nextest run --locked --workspace --exclude lpm-workflows --exclude lpm-cli --no-fail-fast --status-level slow --final-status-level fail
  • cargo test --locked -p lpm-cli --bin lpm-rs -- --test-threads=1
  • cargo nextest run --locked -p lpm-cli -E 'not binary_id(/bin\\/lpm-rs/)' --no-fail-fast --status-level slow --final-status-level fail

@tolgaergin tolgaergin marked this pull request as draft June 19, 2026 20:19
@tolgaergin

Contributor Author

Update after reproducing the macOS failure and rerunning the warm-relink benchmark:

  • The CI failure is real: the original fast path can reuse a link entry before the v2 object validation/repair path runs. I reproduced it locally with cargo nextest run --locked -p lpm-workflows --test install -E 'test(install_v2_cache_hit_repairs_tampered_object_before_linking)' --no-fail-fast --status-level slow --final-status-level fail.
  • I tested a local safety-correct patch that validates the reusable object before returning a ready link. Correctness passes, but the measured optimization no longer wins.
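The ordering constraint behind the safety-correct patch can be sketched as below. This is an assumption-laden illustration: `Prevalidation` and `prevalidate` are hypothetical names, and the closure stands in for whatever object validation the real code performs.

```rust
/// Hypothetical outcome of warm-relink prevalidation for one package.
#[derive(Debug, PartialEq, Eq)]
pub enum Prevalidation {
    /// Snapshots agree and the object passed validation: the link can be reused.
    ReadyLink,
    /// Anything else: run the existing validation/repair + link path.
    Fallback,
}

/// `validate_object` is a stand-in for the real reusable-object validation.
/// It runs before a ready link is returned, so a tampered object is never
/// reused through the fast path.
pub fn prevalidate<F>(snapshots_agree: bool, validate_object: F) -> Prevalidation
where
    F: FnOnce() -> bool,
{
    if !snapshots_agree {
        // Mismatched snapshots skip validation here and fall back.
        return Prevalidation::Fallback;
    }
    if validate_object() {
        Prevalidation::ReadyLink
    } else {
        Prevalidation::Fallback
    }
}
```

Because validation now sits on the fast path, its cost is paid during prevalidation rather than avoided.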

Median warm-relink totals, 3 runs:

| fixture | baseline | safety-correct ready-link reuse |
| --- | --- | --- |
| small-33 | 160 ms | 181 ms |
| large-112 | 237 ms | 269 ms |

The safe version removes link await (~62-65 ms -> 0 ms) but moves the same work into fetch prevalidation (small 74 -> 162 ms, large 100 -> 219 ms) and increases metadata-hash time (small 112 -> ~145 ms, large 308 -> ~389 ms). A quick concurrency-cap experiment also failed to recover the loss.
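For scale, the reported totals work out to roughly +13.1% (small-33, 160 -> 181 ms) and +13.5% (large-112, 237 -> 269 ms) against baseline, whereas the original unsafe fast path measured about -34.4% (160 -> 105 ms) and -30.0% (237 -> 166 ms). A small helper for this arithmetic, with `percent_change` as a hypothetical name:

```rust
/// Relative change from `baseline_ms` to `after_ms`, in percent.
/// Positive values are regressions, negative values are improvements.
pub fn percent_change(baseline_ms: f64, after_ms: f64) -> f64 {
    (after_ms - baseline_ms) / baseline_ms * 100.0
}
```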

Conclusion: this PR should stay in draft and should not be merged as-is. The next attackable target is not “ready-link reuse after validation”; it is either reducing (or better attributing) the v2 reusable metadata-hash validation cost, or adding narrower link-entry timing so we can optimize the actual package-link validation work without bypassing tamper repair.
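One possible shape for narrower per-phase timing is sketched below, using only the standard library. `PhaseTimings` and the phase labels are hypothetical and are not tied to the existing LPM_TIMING_DETAIL output.

```rust
use std::collections::BTreeMap;
use std::time::{Duration, Instant};

/// Accumulates wall-clock time per named phase (hypothetical helper).
#[derive(Default)]
pub struct PhaseTimings {
    totals: BTreeMap<&'static str, Duration>,
}

impl PhaseTimings {
    /// Runs `work`, adds its elapsed time to `phase`, and returns its result.
    pub fn time<T>(&mut self, phase: &'static str, work: impl FnOnce() -> T) -> T {
        let start = Instant::now();
        let out = work();
        self.record(phase, start.elapsed());
        out
    }

    /// Adds an already-measured duration to `phase`.
    pub fn record(&mut self, phase: &'static str, elapsed: Duration) {
        *self.totals.entry(phase).or_default() += elapsed;
    }

    /// Total time recorded for `phase`, or zero if nothing was recorded.
    pub fn total(&self, phase: &str) -> Duration {
        self.totals.get(phase).copied().unwrap_or_default()
    }
}
```

Wrapping metadata hashing and link-entry validation in separate `time` calls would let the two costs be reported independently instead of as one fetch or link total.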

Corrected local comparison artifact: /var/folders/p2/32lgcl857ds0wkcnkg0qk51h0000gn/T/lpm-install-perf-safety-final-2026-06-19T20-12-07-535Z/corrected-comparison.json.

@tolgaergin tolgaergin changed the title from "Optimize warm relink validation reuse" to "Draft: Explore warm relink validation reuse" Jun 19, 2026