Migrate the GVL core to Rust (Phases 0–5): numba-free, byte-identical, cargo-standalone by d-laub · Pull Request #262 · mcvickerlab/GenVarLoader

d-laub · 2026-06-27T22:08:11Z

Migrate the GVL core to Rust (Phases 0–5)

This is the big integration merge of the rust-migration branch. It ports GenVarLoader's core read/write data structures and algorithms from Python/numba to a self-contained Rust crate wrapped by a thin PyO3 (abi3) binding, and deletes numba from GVL's own code entirely. Python keeps only the ergonomic surface — Dataset indexing sugar, torch integration, validation/error messages — and dispatches into Rust for everything else.

Scope: 50 commits, 197 files, +29,790 / −2,366. Docs-and-tests-heavy: most of the line count is the differential-parity test suite and the migration roadmap.

Source of truth: docs/roadmaps/rust-migration.md — phase-by-phase tasks, measurements, and the byte-identical parity contract. Every claim below is recorded there with its checkpoint.

Why

Eliminate the ~35 numba kernels scattered across the read/write paths, which collapses the bug surface and removes the ~3 GB llvmlite JIT cost that GVL itself contributed.
Provide a self-contained, cargo test-able Rust crate that is usable from Rust directly, with a type system that shrinks the code and testing surface.
Faster gvl.write()/update() and parity-or-better Dataset.__getitem__, with headroom for batch parallelism.

The migration contract (strangler fig + byte-identical parity)

Every kernel followed the same loop: implement in Rust on the native ragged layout → expose through src/ffi/ behind a Python-side backend switch → differential-test byte-identical against the numba/Python impl on property-generated inputs → flip the default to Rust and delete the numba impl in the same bundled PR. main stayed shippable at every step; numba removal was continuous, not a big-bang. The ragged layout is consumed from seqpro-core (a pyo3-free rlib, crates.io 0.1.0), not reimplemented in GVL.
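The run-both-assert-byte-identical step of that loop can be sketched as follows. This is a minimal illustration rather than GVL's actual harness: `rle_reference`, `rle_candidate`, `to_bytes`, and `assert_byte_identical` are hypothetical names, a toy run-length encoder stands in for a real kernel, and the stdlib `random` module stands in for the hypothesis generators.

```python
import random
import struct


def rle_reference(values):
    """Oracle side: straightforward run-length encoding as (value, run_length)."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]


def rle_candidate(values):
    """Port under test: index-based scan that must match the oracle exactly."""
    runs = []
    i = 0
    while i < len(values):
        j = i
        while j < len(values) and values[j] == values[i]:
            j += 1
        runs.append((values[i], j - i))
        i = j
    return runs


def to_bytes(runs):
    """Serialize to a fixed little-endian layout so the comparison is on bytes."""
    return b"".join(struct.pack("<ii", v, n) for v, n in runs)


def assert_byte_identical(reference, candidate, n_cases=200, seed=0):
    """Feed the same generated inputs to both sides and compare raw bytes."""
    rng = random.Random(seed)
    for _ in range(n_cases):
        values = [rng.randint(0, 3) for _ in range(rng.randint(0, 50))]
        ref = to_bytes(reference(values))
        cand = to_bytes(candidate(values))
        assert ref == cand, f"parity broken for input {values!r}"
    return n_cases


checked = assert_byte_identical(rle_reference, rle_candidate)
```

Comparing serialized bytes rather than Python values is what makes the check strict: a port that returns equal values in a different dtype or layout would still fail.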

What landed, by phase

| Phase | Scope | Status | PR(s) |
|---|---|---|---|
| 0: Foundation & harness | `src/ffi/` seam + first live kernel (`intervals_to_tracks`); reusable run-both-assert-byte-identical differential harness + hypothesis generators; `cargo test` wired into pixi; abi3 wheel confirmed; Carter baselines captured | ✅ | #241 |
| 1: Ragged primitives | Extracted pyo3-free `seqpro-core` rlib owning the `Ragged` layout; ported the last ragged numba ops (`to_padded`, `reverse_complement`) to Rust; GVL consumes it as a crates.io dep; dropped `awkward` from the foundation | ✅ | seqpro [ML4GLand/SeqPro#60], gvl #240 |
| 2: Genotype assembly + variant gather | `get_diffs_sparse`, `choose_exonic_variants`, the flat-variant gather/fill kernels; dtype-preserving dispatch (int32/float32 hot cores + arbitrary-dtype fallback, after a naive port silently corrupted float32 dosage / int16 FORMAT fields) | ✅ | landed on `rust-migration` |
| 3: Reconstruction + track realignment | The numba bulk + the big read-path win: reference assembly, haplotype reconstruct (singular+batch), insertion-fill, track realignment, RLE; five fused `__getitem__` kernels (plain/annotated/spliced haps, annotated-spliced, fused tracks) each crossing the FFI boundary once; format 2.0 zero-copy SoA storage + scale-guard | ✅ | #245, opt rounds #248 #249 #250 #252 |
| 4: Write / update pipeline | `gvl.write()`/`update()` fully Rust-backed and numba-free: single-pass streaming bigWig writer (SoA `starts/ends/values.npy`), COITrees table/annot overlap engine; dead legacy write paths deleted | ✅ | #253 |
| 5: Consolidation + thin-binding cleanup | Deleted all remaining core numba kernels (count = 0); added rayon batch parallelism gated byte-identical to the serial golden; thin-shim audit (verdict: shim already thin); cargo-standalone + seqpro-core-released verification; perf re-baseline | ✅ | #259 (W4 A/B), #260 (W5), #261 (W6) |
| 6: Absorb genoray | Variant IO (VCF/PGEN + sparse genotypes) into the Rust stack | ⬜ future / out of scope | n/a |
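The dtype-preserving dispatch from Phase 2 can be illustrated with a small sketch. It is not the GVL code: the stdlib `array` module stands in for numpy arrays, and `gather_i32`, `gather_f32`, `gather_generic`, `gather_rows`, and `gather_rows_naive` are hypothetical names that only mirror the idea of typed hot cores plus a fallback that keeps the input dtype.

```python
from array import array


def gather_i32(src, idxs):
    """Hot core specialised for signed ints (typecode 'i')."""
    return array("i", (src[i] for i in idxs))


def gather_f32(src, idxs):
    """Hot core specialised for 32-bit floats (typecode 'f')."""
    return array("f", (src[i] for i in idxs))


def gather_generic(src, idxs):
    """Fallback: reuse the input's typecode so no value is re-cast."""
    return array(src.typecode, (src[i] for i in idxs))


# Dispatch table keyed by dtype; anything else takes the fallback.
HOT_CORES = {"i": gather_i32, "f": gather_f32}


def gather_rows(src, idxs):
    """Pick the kernel from the input dtype instead of assuming int32."""
    return HOT_CORES.get(src.typecode, gather_generic)(src, idxs)


def gather_rows_naive(src, idxs):
    """The failure mode: every input is forced through the int core."""
    return array("i", (int(src[i]) for i in idxs))


dosage = array("f", [0.5, 1.5, 2.0])   # float32-like dosage values
custom = array("h", [7, -3, 9])        # int16-like custom FORMAT field

kept = gather_rows(dosage, [2, 0])           # stays float: 2.0, 0.5
broken = gather_rows_naive(dosage, [2, 0])   # truncated to ints: 2, 0
kept_i16 = gather_rows(custom, [1, 1])       # typecode 'h' is preserved
```

The naive version raises no error; it simply returns different numbers, which is why the table calls the corruption silent and why the parity gate has to compare bytes.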

Performance

Final single-thread rust-vs-numba __getitem__ A/B (Phase 5 W4, the last apples-to-apples comparison before numba was deleted; Carter, chr22_geuv, NUMBA_NUM_THREADS=1, full tables in docs/roadmaps/phase-5-w4-final-ab.md). Ratios above 1× favor rust, and rust is parity-or-better on every mode:

| Mode | rust ÷ numba |
|---|---|
| tracks-only | 1.07× |
| haplotypes / tracks-seqs | 1.66× |
| annotated | 1.43× |
| variants | 1.38× |
| variant-windows | 4.58× |

The annotated path went from the close-out laggard (0.65×) to a clear rust win after zero-copy interval marshalling + uninit output buffers; tracks-only went from a 0.63× regression to 1.07× after replacing per-interval ndarray slicing with raw-slice writes (#248); variant-windows collapsed an entire Python assembly pass into Rust (#250).

Write path (Phase 4, Carter, chr22_geuv): gvl.write() 1.934 s / 3.520 GB peak RSS; gvl.update() 0.081 s. The bigWig write slice is ~1.88× faster with ~28% less total allocation vs. the legacy path.

Rayon batch parallelism (Phase 5 W5): every read kernel has a parallel gate (should_parallelize, threshold = GVL_NUM_THREADS × 1 MiB) that dispatches into_par_iter(), never a raw *mut across threads, and is gated byte-identical to the serial golden (tests/parity/test_rayon_equivalence.py). The W6 re-baseline corpus stayed below the threshold so rayon ran serial there (a documented finding, not a regression); production-scale batches (SEQLEN≥131072 or BATCH≥256) cross it.
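The parallel gate reduces to a byte-count comparison. The sketch below assumes an inclusive comparison and takes the thread count as an explicit argument. How GVL reads GVL_NUM_THREADS and whether the boundary is inclusive are not stated here, so both are assumptions, as is the helper `parallel_threshold`.

```python
MIB = 1 << 20  # 1 MiB in bytes


def parallel_threshold(num_threads: int) -> int:
    """Threshold from the description: GVL_NUM_THREADS x 1 MiB."""
    return num_threads * MIB


def should_parallelize(total_out_bytes: int, num_threads: int) -> bool:
    # Assumption: ">=" rather than ">"; the PR text only gives the threshold.
    return total_out_bytes >= parallel_threshold(num_threads)


# Hypothetical 8-thread setting, one output byte per base:
small = should_parallelize(32 * 16_384, 8)     # 512 KiB of output -> serial
large = should_parallelize(256 * 131_072, 8)   # 32 MiB of output -> parallel
```

Because the threshold scales with the thread count, a batch that parallelizes on a small machine can stay serial on a large one, which is consistent with the W6 corpus running serial.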

On-disk format & API changes (reviewer note)

Format 2.0 storage: track intervals are now stored struct-of-arrays (starts/ends/values.npy sharing offsets.npy) so the memmaps cross the Python→Rust boundary zero-copy. Opening is gated; existing datasets migrate in place via gvl.migrate. This also closes a rust-only OOM-at-scale defect where the AoS layout forced a full per-sample-scale np.ascontiguousarray copy every batch (locked by tests/integration/test_scale_guard.py).
numba backend & dispatch removed: the GVL_BACKEND env var and python/genvarloader/_dispatch.py are gone — Python calls Rust directly. No runtime backend switch remains.
Unsupported track types now raise TypeError (the dead custom-IntervalTrack write path was removed).
Deps: seqpro ≥ 0.20 (Rust-backed Ragged) and the seqpro-core 0.1.0 crates.io crate; awkward dropped from hot paths.
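For reviewers unfamiliar with the struct-of-arrays layout, here is a minimal sketch of how three flat arrays can share one offsets array. The stdlib `array` module stands in for the `.npy` memmaps, and the element types and the helper names `to_soa` and `region_intervals` are assumptions for illustration, not the format 2.0 specification.

```python
from array import array

# Array-of-structs view: one (start, end, value) record per interval,
# grouped per region.
aos = [
    [(0, 5, 1.0), (5, 9, 2.5)],  # region 0
    [],                          # region 1 has no intervals
    [(3, 4, 0.5)],               # region 2
]


def to_soa(regions):
    """Flatten into starts/ends/values plus one shared offsets array."""
    starts, ends, values = array("i"), array("i"), array("f")
    offsets = array("q", [0])
    for intervals in regions:
        for start, end, value in intervals:
            starts.append(start)
            ends.append(end)
            values.append(value)
        offsets.append(len(starts))  # running end of this region's slice
    return starts, ends, values, offsets


def region_intervals(starts, ends, values, offsets, r):
    """Region r owns the half-open slice [offsets[r], offsets[r + 1])."""
    lo, hi = offsets[r], offsets[r + 1]
    return list(zip(starts[lo:hi], ends[lo:hi], values[lo:hi]))


starts, ends, values, offsets = to_soa(aos)
```

Each field is a single contiguous buffer of one dtype, so it can be handed across an FFI boundary as a plain slice. An array of mixed-type records generally needs a copy to get a contiguous view of one field.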

Verification gate (2026-06-27, HEAD of branch)

pytest whole tree: 973 passed / 44 skipped / 5 xfailed / 0 failed (parity + dataset + unit subset 692/35/2, matching the W5 baseline exactly).
cargo test: 114 passed — and the crate is standalone-testable (cargo test --release from a clean shell, no pixi/PYO3_PYTHON needed).
Lint/types: ruff check + ruff format clean; pyrefly clean; clippy reports 0 errors (warnings only).
Packaging: abi3 wheel builds clean.

Known caveats (not blockers)

No more numba A/B. numba was deleted in W5, so the W4 table above is the last direct comparison; all later perf signals are rust serial-vs-rayon (same session) + deterministic counts.
Peak RSS (~3.5 GB) is dominated by seqpro's transitive numba JIT (~3.2 GB), not GVL. seqpro's own numba removal is upstream (ML4GLand/SeqPro) and out of scope here.
Two numba-bug sub-domains are excluded from the parity oracle (the #242-family start>=clen clip and a reconstruct trailing-under-write case): numba is the buggy side and is not a valid oracle there; rust is correct in both. Documented at the Phase 3 gate.
Perf is on a shared HPC node (Carter); absolute wall-clock drifts ≥2× across sessions, so the durable signals are byte-identical parity + same-session ratios, not cross-session absolute numbers.
Phase 6 (absorb genoray) is deliberately deferred — the crate still depends on Python genoray for variant IO.

Merge note

Per project policy this should land no-squash to preserve the per-phase commit history. After merge, backfill the Phase 5 _PR: link in docs/roadmaps/rust-migration.md (currently —), consistent with the prior-phase convention.

🤖 Generated with Claude Code

…er design Scope: port get_diffs_sparse + choose_exonic_variants (genotypes) and the 7 flat-variant gather/fill kernels; delete dead filter_af; gate = parity + no regression. Fixes the Phase 2/3 double-count of the reconstruction kernels. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Pure-ndarray core in src/genotypes/, PyO3 in src/ffi/, dispatched via _dispatch (default rust). Offsets normalized to (2,n) int64. numba retained as parity reference. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…perseded by inline numpy) AF filtering happens in numpy in _haps.py/_flat_variants.py; the numba filter_af had zero production callers. Its dedicated unit test and two stale comment references are removed with it. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…elds) Task 5's gather_rows hardcoded int32, silently truncating float32 dosage and arbitrary custom FORMAT field values. Dispatch by dtype: i32/f32 rust cores + dtype-preserving numba fallback for other dtypes. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…st (dtype-preserving) i32/f32 rust cores + dtype-preserving numba fallback for other dtypes (custom FORMAT fields, e.g. int16) — no down-cast. Parity-gated. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…ical) Flips GVL_BACKEND numba<->rust through the real variants getitem path; spy asserts the rust gather_rows_i32 kernel is invoked (non-vacuous); compares every RaggedVariants field byte-identically. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

… lint/docstring cleanup test_flat_variants_type imported the pre-rename _gather_v_idxs_ss; point it at _gather_v_idxs_ss_numba. Also drop an unused strategy var, fix two stale docstring xrefs to the renamed numba gather helpers, and ruff-format. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…ration branch Phase 2 genotype assembly + variant gather kernels ported (parity byte-identical, full tree green). filter_af deleted as dead. Records the dtype-preserving design (custom FORMAT fields), the measured ~7% rust-vs-numba read-path gap, and the cProfile finding that it is Python dispatch glue (np.ascontiguousarray = 62%), not rust compute. Per owner decision: drop per-phase throughput gate, accumulate the roadmap on the persistent `rust-migration` branch, restore the perf gate via a single-big-__getitem__-kernel optimization pass before one final merge. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…eal errors) Final-review finding: `except (KeyError, Exception)` could mask a real AF read-path regression as a skip. Catch only KeyError (AF key genuinely absent); let anything else propagate. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
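The difference between the two `except` clauses can be shown with a small sketch. `read_af` and `BrokenStore` are hypothetical stand-ins for the test code in question, which is not reproduced here.

```python
def read_af(fields, key="AF"):
    """Return the AF column, or None when the key is genuinely absent."""
    try:
        return fields[key]
    except KeyError:
        # Only a missing key counts as "skip". Any other exception, such as
        # a failure inside the read path, propagates to the caller.
        return None


class BrokenStore(dict):
    """Simulates a real read-path regression rather than a missing key."""

    def __getitem__(self, key):
        raise RuntimeError("corrupt read")


present = read_af({"AF": [0.1, 0.9]})  # the column is returned
absent = read_af({})                   # None: treated as a skip

try:
    read_af(BrokenStore())
    regression_surfaced = False
except RuntimeError:
    regression_surfaced = True         # the real error is not masked
```

With `except (KeyError, Exception)` the last call would also have returned None, so a broken read path would look the same as a dataset without AF.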

1:1 parity twins for the 8 read-path numba kernel groups, plus begin read-path consolidation by fusing the haplotypes and tracks __getitem__ paths. Parity is the hard gate; throughput is recorded only (supersedes the stale throughput-gate line in the roadmap). Sequencing reference -> haps -> tracks -> fuse. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

… plan 15 tasks across 4 sub-units (reference, haplotype reconstruction, track realignment+RLE, fused-path consolidation). Each kernel follows the Phase 2 port recipe: ndarray core + cargo tests -> ffi -> dispatch -> byte-identical hypothesis parity. Parity hard-gated; throughput recorded only. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…arning (review fixes) I1: capture spy count after rust read, assert it is unchanged after numba read — proves the spy is wired only to the rust kernel, mirroring the guard in test_variants_dataset_parity.py. M1: remove with_tracks(False) call on a no-tracks fixture; the call was a no-op that only emitted a spurious "Dataset has no tracks" warning. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ity, default rust) Implements Task 5 of Phase 3: adds a Rust batch driver for reconstruct_haplotypes_from_sparse (plural), wires it into the dispatch registry with default=rust, and verifies byte-identical parity against the numba backend via Hypothesis property tests. Also fixes the parity strategy to constrain variant positions to [0, min_contig_len) — mirrors the production invariant that VCF variants are always within-contig — preventing false panics in the Rust kernel on out-of-range random inputs that the parallel numba kernel silently swallows via thread-local SystemError. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ip numba annotated flake When a deletion's ref_end advances ref_idx past the contig boundary, `ref_.len() - ref_idx` is negative. Mirror numba: compute out_end_idx = (out_idx + writable_ref).max(0) so the right-pad range matches exactly. Annotated parity test uses assume(False) to discard inputs where numba's parallel batch driver hits its pre-existing SystemError (negative slice index inside prange); the non-annotated test exercises full byte-identity. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…tch serial-only impl - Expand all three unsafe from_raw_parts_mut SAFETY comments in the batch loop to explicitly state the disjointness invariant: out_offsets required by calling contract to be monotonically non-decreasing → each [out_s..out_e] is a strictly non-overlapping address range; serial loop prevents aliasing UB. - Rename batch_two_queries_two_haplotypes → batch_correctness_two_queries and update doc comment to accurately describe a correctness check (not a serial-vs-parallel comparison); note GIL as reason rayon is omitted. - Add batch_correctness_with_snp test that applies a single SNP (C→T) to exercise the variant-application code path alongside reference-copy. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…mError; correct rayon-deferral comment Fix A: factor a _assert_non_annotated_parity helper that wraps the numba call in try/except SystemError → assume(False), mirroring the guard already present in _assert_annotated_parity. Eliminates latent CI flakiness for the ~0.2% of hypothesis inputs that trigger numba parallel=True crash in the non-annotated path (2000-example high-budget run: 0 uncaught errors). Fix B: replace the incorrect "GIL makes rayon useless" comment in src/reconstruct/mod.rs batch_correctness_two_queries with an accurate note: serial-only is a phase gate decision (throughput recorded not gated), and the loop is rayon-parallelizable later via the same disjoint-chunk split used in src/reference/mod.rs get_reference. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…t calls, delete _dispatch Replace the 22 dispatched call sites across 6 files with direct rust callable references, remove all 20 register() blocks, delete _dispatch.py, delete dead test infra (_harness.py, test_harness_tuple.py, test_dispatch.py), and rewrite make_kernel_spy to monkeypatch the module-level rust symbol instead of mutating the dispatch registry. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…END in bench conftest (W5 B1) - generate_goldens: guard _dispatch import with try/except ImportError (_dispatch=None); _have_numba returns False when _dispatch is None; remove register-triggering side-effect imports (_flat_variants, _genotypes, _intervals, _reference, _tracks); fix E731 lambda-assignment in gen_inplace_kernels - benchmarks/conftest.py: remove dead GVL_BACKEND env manipulation from captured_haplotypes; fix stale _dispatch_get()/_REGISTRY comment in captured_realign_tracks; drop now-unused import os - _tracks.py: remove triple blank line (ruff format) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

… GVL_BACKEND/_active_backend Remove all ~20 backend-conditional forks across _query.py, _haps.py, _reconstruct.py, _reference.py, and _tracks.py. Keep the Rust arm inline and delete the numba composed path at each site. RC accounting preserved byte-identically: _query.py and _reference.py numba post-passes deleted (Rust folds RC in-kernel); _tracks.py keeps its post-pass (unconditional now — tracks RC is Python-side on Rust). All 686 tests pass. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…that import) Track-only path spies via _tracks_mod; the haps+tracks fused path is covered by test_fused_tracks_parity. The defensive _recon_mod spy broke after B2 deleted the now-unused intervals_to_tracks import from _reconstruct. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Removed all @nb.njit / @nb.vectorize decorators and `import numba as nb` from python/genvarloader/. Twelve modules touched. Zero numba decorators remain in genvarloader source. Key changes: - _threads.py: cap_numba_threads() → cap_threads(); seeds RAYON_NUM_THREADS for rayon global pool init; keeps optional numba.get_num_threads() cap for backward test compat during migration. - _flat_variants.py: replaced 5 numba dispatch fallbacks with dtype-preserving numpy equivalents (_gather_rows_numpy, _compact_keep_numpy, _fill_empty_scalar_numpy, _fill_empty_seq_numpy, _fill_empty_fixed_numpy) — fixes issue #231 (custom FORMAT fields, e.g. int16/int64 dtypes). - _genotypes.py/_tracks.py/_reference.py/_utils.py: deleted njit functions; restored pure Python oracles for parity/unit test compat (no decorators). - _intervals.py: deleted 4 njit functions + restored dispatch wrappers. - _flat_flanks.py/_sitesonly.py: removed decorators; bodies unchanged. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…r pure-OS detection _threads.py: revert sub-agent's conditional numba import; use exact replacement from brief (OS-only, no numba ceiling). _reconstruct.py: drop stale _shift_and_realign_tracks_sparse_rust_wrapper import (ruff F401). tests/unit/test_threads.py: update to new no-numba semantics (env unclamped; threshold via monkeypatched cpu count). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…ard stays own-code B4 removed the conda numba pin, so pixi satisfied seqpro's transitive numba via a broken PyPI llvmlite (libllvmlite.so won't load) -> import genvarloader failed at collection. genvarloader's own code is numba-free; the pin only keeps seqpro working. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…n batch parallelism Add `parallel: bool` to the core batch kernel and all 5 FFI entries (reconstruct_haplotypes_from_sparse, reconstruct_haplotypes_fused, reconstruct_haplotypes_spliced_fused, reconstruct_annotated_haplotypes_fused, reconstruct_annotated_haplotypes_spliced_fused). The parallel branch carves disjoint per-k &mut [_] slices via split_at_mut chains over all active buffers (out u8 always; annot_v_idxs/annot_ref_pos i32 when Some) and dispatches via into_par_iter(), mirroring the proven get_reference idiom. Python callers (reconstruct_haplotypes_from_sparse in _genotypes.py, the 4 fused entries in _haps.py) compute should_parallelize(total_out_bytes) and pass it through. New test tests/parity/test_rayon_equivalence.py asserts serial == parallel == frozen golden for all 200 hypothesis cases. Gate: 64 parity tests pass, cargo test 17/17, ruff clean, clippy 0 errors (16 pre-existing warns). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…t comment Address C1 task-review Important findings: - I-1: add debug_assert!(s >= cursor && e >= s) to the parallel chunk-carve loop documenting/enforcing the out_offsets monotonicity contract (zero-cost in release; the same bounds drive the annotation carves). - I-2: correct the stale comment in test_rayon_equivalence.py — RUST_KERNELS now stores the C1 shim (parallel=False default) that forwards to the FFI, not the bare FFI function. Gate: 688 passed / 35 skipped / 2 xfailed; cargo reconstruct 17/17; ruff + clippy clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
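The monotonicity contract behind the chunk carve can be sketched in Python, with writable `memoryview` slices standing in for the `&mut [_]` slices produced by Rust's `split_at_mut`. `carve_disjoint` is a hypothetical helper, and it raises `ValueError` where the Rust code uses `debug_assert!`.

```python
def carve_disjoint(buf, out_offsets):
    """Split `buf` into per-item writable views using a moving cursor."""
    view = memoryview(buf)
    chunks = []
    cursor = 0
    for s, e in zip(out_offsets[:-1], out_offsets[1:]):
        # Contract: each range starts at or after the previous end and is
        # not reversed, so no two views can overlap.
        if not (s >= cursor and e >= s):
            raise ValueError("out_offsets must be monotonically non-decreasing")
        chunks.append(view[s:e])
        cursor = e
    return chunks


buf = bytearray(6)
chunks = carve_disjoint(buf, [0, 2, 2, 6])  # chunk lengths 2, 0, 4

# Each work item writes only through its own view. Because the views are
# disjoint, the writes could run in any order, or concurrently.
for fill, chunk in zip(b"ABC", chunks):
    chunk[:] = bytes([fill]) * len(chunk)

try:
    carve_disjoint(bytearray(4), [0, 4, 2])
    rejected = False
except ValueError:
    rejected = True  # non-monotonic offsets are refused up front
```

Checking the offsets before handing out views is what lets the parallel path avoid raw pointers. Once the carve succeeds, disjointness holds by construction.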

Add parallel=bool to get_diffs_sparse (par_chunks_mut over flat output, one cell per work item) and intervals_to_tracks (split_at_mut cursor idiom, same as C1/C2). Thread parallel through all FFI entry points and Python callers (_genotypes.py, _intervals.py); add parallel=False shims for both kernels in _golden.py so existing replay callers are unaffected. Update genvarloader.pyi stub for intervals_to_tracks. Extend test_rayon_equivalence.py with serial==parallel==golden cases for both kernels. All 68 parity tests pass; 110 cargo tests pass. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…away micro-benchmarks C4 — Stage-C boundary for the W5 consolidation PR. - Roadmap: rewrite the W5 notes entry to cover all three stages (golden snapshot, numba deletion, rayon batch parallelism) and the per-kernel rayon rollout (C1 reconstruct, C2 tracks, C3 diffs/intervals). Phase 5 stays 🚧 (W6/PR6 is measure-and-merge). Correct the seqpro-numba note to "to be filed". - tests/benchmarks/test_micro.py: skip the 3 micro-benchmarks whose Python-level capture points were fused away in W3/W5 (reconstruct_haplotypes_from_sparse, intervals_to_tracks, shift_and_realign_tracks_sparse) — redesign onto the fused rust entries is deferred to W6. Fix the now-stale shift import to the rust wrapper. test_get_diffs_sparse + e2e benchmarks still run. This unbreaks whole-tree `pytest tests` / `pixi run test` (broken since B2/B3). Stage-C gate (controller-verified, fresh maturin --release): whole `pytest tests` = 973 passed / 44 skipped / 5 xfailed; cargo test --release 114; ruff + format + pyrefly + clippy clean; serial==parallel==golden across all kernels. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…kout Final-review caveat: post-W5 (numba deleted) re-running either golden generator would silently freeze rust == rust with no oracle cross-check, defeating the parity contract. Strengthen both generator docstrings from a passive note into an explicit DANGER warning. Docstring-only; no logic change. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…-of-scope - W5 entry PR #TODO → #260. - Correct the seqpro caveat: removing numba from seqpro (ML4GLand/SeqPro) is out of scope (user decision 2026-06-27); W5's numba removal is gvl-only by design, so the transitive numba dep + its JIT-RSS floor remain intentionally. W6 perf re-baseline measures gvl-attributable deltas, not the seqpro JIT floor. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

d-laub and others added 30 commits June 24, 2026 00:22

docs(plan): Phase 2 rust migration implementation plan

cf94947

Task-by-task plan: port get_diffs_sparse + choose_exonic_variants + 7 flat gather/fill kernels to Rust, delete dead filter_af, parity + no-regression gate. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

test(parity): tuple-aware kernel parity helper for Phase 2 kernels

c3e48b6

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

perf(genotypes): port choose_exonic_variants numba->rust (parity-gated)

e31a1dc

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

perf(variants): port _gather_v_idxs(+_ss) numba->rust as gather_rows …

a95f4f8

…(parity) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

perf(variants): port _gather_alleles numba->rust (parity-gated)

04f9537

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

perf(variants): port _compact_keep numba->rust (i32/f32, dtype-preser…

d8f62a8

…ving) i32/f32 rust cores + dtype-preserving numba fallback for other dtypes (custom FORMAT fields, e.g. int16) — no down-cast. Parity-gated. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

perf(variants): port _fill_empty_seq numba->rust (u8/i32, dtype-prese…

1f18908

…rving) Two-level dummy-fill for allele bytes (uint8) AND token windows (int32). u8/i32 rust cores + dtype-preserving numba fallback. Parity-gated. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

perf(reference): port padded_slice numba->rust core (cargo-tested)

fb88357

perf(reference): port get_reference numba->rust (parity, default rust)

d0026cb

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

fix(reference): revert padded_slice leniency — mirror numba's loud fa…

378b0f6

…ilure for start>=clen (parity twin) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

test(parity): reference-mode + spliced dataset backstop (spy-guarded)

cbd9a84

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

perf(reconstruct): port reconstruct_haplotype_from_sparse core (cargo…

055ca44

…-tested) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

test(reconstruct): cover allele_start_idx==v_len, skip-variant, and k…

0bc0a44

…eep-mask branches

test(parity): haplotypes + annotated-haps dataset backstop (spy-guarded)

7bade06

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

d-laub and others added 28 commits June 26, 2026 21:42

docs(plan): W5 B1 — delete dead _harness.py + test_harness_tuple.py w…

29a2a4e

…ith _dispatch Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

docs: correct W5 roadmap count (686/35/2) + seqpro-numba caveat; rela…

06c0963

…x B4 guard to own-code Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

feat: delete numba backend — rust-only read path (Phase 5 W5)

98f3ee5

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

docs: W5 resume handoff (Stage C / C1 landed)

099f9c7

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

feat(rayon): parallelize shift_and_realign_tracks_sparse and tracks_t…

edf0141

…o_intervals (Task C2) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Merge pull request #260 from mcvickerlab/phase-5-w5

b7d2c00

Phase 5 W5: consolidation — golden snapshot + delete numba + rayon

docs(spec): Phase 5 rust-migration wrap-up design (W6 + audit + stand…

3933f1e

…alone/seqpro verifications) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

docs(plan): Phase 5 rust-migration wrap-up implementation plan

3c4cf29

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

docs(roadmap): Phase 5 W6 thin-shim audit — classify remaining PyO3 s…

0932374

…urface glue Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

docs(roadmap): verify crate is cargo-testable standalone (Phase 5)

ac052f7

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

docs(roadmap): seqpro-core is already a released crates.io dep (corre…

0968a0f

…ct stale Phase 1 note) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

docs(roadmap): Phase 5 W6 perf re-baseline — rayon serial-vs-multithr…

6611540

…ead speedup + RSS Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

docs(roadmap): clarify W6 perf byte-math batch composition; soften bo…

e47d128

…rderline threshold claim Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

docs(roadmap): finalize Phase 5 W6 — set status marker + gate results

60ccd12

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Merge pull request #261 from mcvickerlab/phase-5-w6-wrapup

182393b

Phase 5 W6 wrap-up: thin-shim audit + cargo-standalone + seqpro verification + perf re-baseline

d-laub merged commit 068c934 into main Jun 27, 2026
8 checks passed

d-laub deleted the rust-migration branch June 27, 2026 23:03

d-laub merged 213 commits into main from rust-migration
