Add leanvec_primary_only build option to C++ runtime #323
Adds a new `leanvec_primary_only` parameter (default false) to the
LeanVec build entry points in the C++ runtime API:
- VamanaIndexLeanVec::build (both overloads)
- DynamicVamanaIndexLeanVec::build (DynamicIndexParams overloads)
The flag is stored in {Vamana,DynamicVamana}IndexLeanVecImpl and
forwarded through init_impl -> build_impl -> StorageFactory<LeanVec>::init
-> LeanDataset::reduce, where it skips secondary-dataset allocation
and disables reranking. ABI back-compat overloads are unchanged.
Compiles standalone with SVS_RUNTIME_HAVE_LVQ_LEANVEC=OFF (stub returns
NOT_IMPLEMENTED). When enabled, requires the matching LeanDataset::reduce
overload that accepts the primary_only argument (added in the private
repository alongside the LeanDataset save/load support).
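The forwarding chain described above can be pictured with a small, self-contained sketch. Everything in it (`LeanDatasetSketch`, `StorageFactorySketch`, `IndexImplSketch`, and the truncation used in place of real dimensionality reduction) is a hypothetical stand-in, not the SVS implementation; it only illustrates a flag stored at construction and passed down to the point where the secondary tier would be allocated.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

namespace sketch {

// Toy stand-in for a LeanVec dataset (hypothetical): the primary tier is
// always present, the secondary tier exists only when primary_only is false.
struct LeanDatasetSketch {
    std::vector<float> primary;
    std::optional<std::vector<float>> secondary;

    static LeanDatasetSketch
    reduce(const std::vector<float>& data, std::size_t leanvec_dims, bool primary_only) {
        LeanDatasetSketch out;
        // Truncation stands in for the real dimensionality reduction.
        const std::size_t n = std::min(leanvec_dims, data.size());
        out.primary.assign(data.begin(), data.begin() + static_cast<std::ptrdiff_t>(n));
        if (!primary_only) {
            out.secondary = data; // allocated only for the two-tier layout
        }
        return out;
    }
};

// Hypothetical factory layer: forwards the flag unchanged.
struct StorageFactorySketch {
    static LeanDatasetSketch
    init(const std::vector<float>& data, std::size_t leanvec_dims, bool primary_only) {
        return LeanDatasetSketch::reduce(data, leanvec_dims, primary_only);
    }
};

// Hypothetical impl class: stores the flag and forwards it on build.
class IndexImplSketch {
  public:
    IndexImplSketch(std::size_t leanvec_dims, bool primary_only)
        : leanvec_dims_{leanvec_dims}
        , primary_only_{primary_only} {}

    void build(const std::vector<float>& data) { init_impl(data); }

    // Reranking needs the secondary tier, so it is off in primary-only mode.
    bool reranking_enabled() const { return storage_.secondary.has_value(); }
    std::size_t primary_size() const { return storage_.primary.size(); }

  private:
    void init_impl(const std::vector<float>& data) { build_impl(data); }
    void build_impl(const std::vector<float>& data) {
        storage_ = StorageFactorySketch::init(data, leanvec_dims_, primary_only_);
    }

    std::size_t leanvec_dims_;
    bool primary_only_;
    LeanDatasetSketch storage_;
};

} // namespace sketch
```

Constructing `sketch::IndexImplSketch` with `primary_only = true` and calling `build` leaves the secondary tier empty, so `reranking_enabled()` reports `false`.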
Is there a corresponding FAISS update?
Yes, I will publish it as well once we have this PR merged.
I'd recommend drafting a FAISS update and pointing the FAISS install in this PR to your FAISS branch; otherwise our CI will fail.
Makes sense, I will add the corresponding FAISS update |
ethanglaser left a comment
These changes to test-cpp-runtime-bindings.sh should not be needed - please restore based on suggestions and let's see if CI runs smoothly
# Install libsvs-runtime from the public release tarball (includes the
# leanvec_primary_only API). Once a conda-forge package with this API is
# published, this can revert to:
# conda install -y /runtime_conda/libsvs-runtime-*.conda
SVS_RUNTIME_URL="${SVS_RUNTIME_URL:-https://github.com/intel/ScalableVectorSearch/releases/download/nightly/svs_runtime-0.3.0-linux-x86_64-leanvec-primary-only-glibc228.tar.gz}"
SVS_RUNTIME_PREFIX="${SVS_RUNTIME_PREFIX:-$HOME/svs_runtime}"
mkdir -p "$SVS_RUNTIME_PREFIX"
curl -fsSL "$SVS_RUNTIME_URL" | tar -xz -C "$SVS_RUNTIME_PREFIX"
export CMAKE_PREFIX_PATH="$SVS_RUNTIME_PREFIX${CMAKE_PREFIX_PATH:+:$CMAKE_PREFIX_PATH}"
export LD_LIBRARY_PATH="$SVS_RUNTIME_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
-# Install libsvs-runtime from the public release tarball (includes the
-# leanvec_primary_only API). Once a conda-forge package with this API is
-# published, this can revert to:
-# conda install -y /runtime_conda/libsvs-runtime-*.conda
-SVS_RUNTIME_URL="${SVS_RUNTIME_URL:-https://github.com/intel/ScalableVectorSearch/releases/download/nightly/svs_runtime-0.3.0-linux-x86_64-leanvec-primary-only-glibc228.tar.gz}"
-SVS_RUNTIME_PREFIX="${SVS_RUNTIME_PREFIX:-$HOME/svs_runtime}"
-mkdir -p "$SVS_RUNTIME_PREFIX"
-curl -fsSL "$SVS_RUNTIME_URL" | tar -xz -C "$SVS_RUNTIME_PREFIX"
-export CMAKE_PREFIX_PATH="$SVS_RUNTIME_PREFIX${CMAKE_PREFIX_PATH:+:$CMAKE_PREFIX_PATH}"
-export LD_LIBRARY_PATH="$SVS_RUNTIME_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
+# Install libsvs-runtime from local conda package
+conda install -y /runtime_conda/libsvs-runtime-*.conda
 conda activate svsenv
 conda config --set solver libmamba
-conda install -y -c conda-forge cmake=3.30.4 make=4.2 swig=4.0 "numpy>=2.0,<3.0" scipy=1.16 pytest=7.4 gflags=2.2 setuptools
+conda install -y -c conda-forge cmake=3.30.4 make=4.2 swig=4.0 "numpy>=2.0,<3.0" scipy=1.16 pytest=7.4 gflags=2.2 setuptools curl
-conda install -y -c conda-forge cmake=3.30.4 make=4.2 swig=4.0 "numpy>=2.0,<3.0" scipy=1.16 pytest=7.4 gflags=2.2 setuptools curl
+conda install -y -c conda-forge cmake=3.30.4 make=4.2 swig=4.0 "numpy>=2.0,<3.0" scipy=1.16 pytest=7.4 gflags=2.2 setuptools
rfsaliev left a comment
This API change does not seem consistent with the existing API style.
I suggest adding new values to the StorageKind enum rather than adding extra arguments to the index-building routines.
 const VamanaIndex::SearchParams& default_search_params,
-const VamanaIndex::DynamicIndexParams& dynamic_index_params
+const VamanaIndex::DynamicIndexParams& dynamic_index_params,
+bool leanvec_primary_only = false
Why didn't you just extend the StorageKind enum with values such as LeanVec4 and LeanVec8?
using allocator_type = rebind_extracted_allocator_t<std::byte, Alloc>;
using type = LeanDatasetType<8, 8, allocator_type>;
};
If the `StorageKind` enum had `LeanVec4` and `LeanVec8` values, and `svs::leanvec::LeanDataset` supported `void` for the `T2` parameter, the only changes needed in the C++ runtime would be the following new lines:
// LeanVec Primary-Only Storage support
template <
    size_t I1,
    typename Alloc,
    size_t LeanVecDims = svs::Dynamic,
    size_t Extent = svs::Dynamic>
using LeanPrimaryOnly = svs::leanvec::LeanDataset<
    svs::leanvec::UsingLVQ<I1>,
    void,
    LeanVecDims,
    Extent,
    Alloc>;

template <typename Alloc> struct StorageType<StorageKind::LeanVec4, Alloc> {
    using allocator_type = rebind_extracted_allocator_t<std::byte, Alloc>;
    using type = LeanPrimaryOnly<4, allocator_type>;
};

template <typename Alloc> struct StorageType<StorageKind::LeanVec8, Alloc> {
    using allocator_type = rebind_extracted_allocator_t<std::byte, Alloc>;
    using type = LeanPrimaryOnly<8, allocator_type>;
};

And no other changes are needed.
@rfsaliev — thanks for the review. Since the design choice here (bool leanvec_primary_only argument vs. extending StorageKind) is a direct consequence of how LeanDataset exposes primary-only mode in the core repo, I've posted a consolidated response on the internal counterpart. Reposting the relevant trade-offs here so the public-side discussion stands on its own.
A small framing point first, since it shapes what "homogenization" can mean across SVS storage variants:
- `LVQDataset`: two separate classes, one-level (primary only) and two-level (primary + residual). The split is at the class level; `LVQDataset<I1, 0, ...>` reaches the one-level class via the `I2=0` discriminator.
- `LeanDataset`: a single class today. Primary is a dimensionality-reduced view (with or without compression), and the secondary is a separate type that can be a different family from the primary (e.g. LVQ-primary + uncompressed-secondary). The richer parameterization (`T1`, `T2`, `LeanVecDims`, `Extent`, `Alloc`) is why everything lives in one template.
So full homogenization isn't possible — LeanDataset legitimately needs a multi-typed parameterization that LVQDataset doesn't. But the shape of "primary-only" can be made to match: a partial specialization on T2 = void in LeanDataset is the analog of LVQ's separate one-level class, sharing the same conceptual structure (primary-only is a distinct type, not a runtime mode) while preserving LeanVec's multi-parameter template. On the public-runtime side, that translates exactly to your suggestion: new StorageKind values for primary-only LeanVec variants instead of a bool argument.
The proposal, as I read it across the two PRs:
- In `LeanDataset` (core, private repo), extend the `T2` constraint with a `LeanCompatibleOrVoid` concept and add a partial specialization `LeanDataset<T1, void, LeanVecDims, Extent, Alloc>` with `secondary_data_type = void` and no `secondary_` member, no `get_secondary`/`view_secondary_dataset`/`adapt_secondary`, etc.
- Search extensions in `extensions/{flat,ivf,vamana}/leanvec.h` branch with `if constexpr (std::is_same_v<typename Data::secondary_data_type, void>)` instead of a runtime check.
- The runtime flag (`primary_only_`, `set_primary_only`, `is_primary_only`, and the `if (primary_only_) ...` guards in `reduce`/`save`/`load`/`resize`/`compact`) goes away.
- On the public-runtime side (this PR), the trailing `bool leanvec_primary_only` argument on `VamanaIndexLeanVec::build`/`DynamicVamanaIndexLeanVec::build` disappears, replaced by new `StorageKind` values (e.g. `LeanVecLVQ4PrimaryOnly`, `LeanVecLVQ8PrimaryOnly`) whose `StorageType` specializations resolve to `LeanDataset<UsingLVQ<I>, void, ...>`.
Below is a balanced look at the trade-offs so the team can decide if the rework is worth doing now vs. leaving as a follow-up.
Current approach (runtime bool leanvec_primary_only argument) — pros/cons
Pros
- Minimal surface area in this PR. ~26 lines of plumbing across 6 runtime-binding files: `VamanaIndexLeanVec::build` (2 overloads), `DynamicVamanaIndexLeanVec::build` (2 overloads), `StorageFactory<LeanVec>::init`, and the impl ctors.
- Source- and ABI-compatible. New `bool` argument defaults to `false`; existing C++ callers and any downstream FAISS layer compile unchanged.
- No `StorageKind` enum changes, no dispatch-macro changes, no new `StorageType` specializations.
- Doesn't depend on any other refactor; ships in isolation.
Cons / limitations
- Runtime flag for a property that is structurally type-level. `LeanDataset<T1, T2, ...>` with `primary_only_=true` is, type-wise, identical to the dual-tier dataset; they're the same instantiation with different state.
- API asymmetry with every other LeanVec / LVQ variant in the runtime layer. All other choices (`LVQ4x0`, `LVQ8x0`, `LVQ4x4`, `LVQ4x8`, `LeanVec4x4`, `LeanVec4x8`, `LeanVec8x8`, etc.) are expressed by picking a `StorageKind`; this one alone uses a side-channel `bool`. New readers have to learn two patterns for one conceptual axis.
- Adds the same trailing argument to four public build overloads. If we later add another LeanVec build option (e.g. F16-primary-only, F32-primary-only), the same shape compounds: every new option either becomes another bool, or we restructure the API anyway.
- Type-safety hole inherited from the core runtime flag: `get_secondary()`, `view_secondary_dataset()`, `adapt_secondary()` etc. all compile fine on a primary-only instance and only fail at runtime. The `StorageKind`-based approach makes those calls a compile error on the void specialization.
- Compile-time elision is lost. Secondary-related code is instantiated and emitted for the `LeanDataset<T1, T2, ...>` storage type even when the runtime flag says it won't execute; hot-loop predicates aren't always hoistable.
- The FAISS-side `IndexSVSVamanaLeanVec::create_impl()` has to forward an extra parameter through `DynamicVamanaIndexLeanVec::build(...)`. With `StorageKind`-based dispatch, it just picks a different enum value and the runtime API stays uniform.
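The type-safety point can be illustrated with a toy class that carries primary-only mode as runtime state. `RuntimeFlagDataset` is hypothetical, and throwing `std::logic_error` is an assumption about how such a failure might surface; the thread only says these calls "fail at runtime", not how.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical dataset where primary-only mode is a runtime flag. The type is
// the same in both modes, so secondary accessors always compile.
class RuntimeFlagDataset {
  public:
    RuntimeFlagDataset(
        std::vector<float> primary, std::vector<float> secondary, bool primary_only
    )
        : primary_(std::move(primary))
        , secondary_(primary_only ? std::vector<float>{} : std::move(secondary))
        , primary_only_(primary_only) {}

    bool is_primary_only() const { return primary_only_; }
    float get_primary(std::size_t i) const { return primary_.at(i); }

    // Compiles for every instance; misuse is only detected when executed.
    float get_secondary(std::size_t i) const {
        if (primary_only_) {
            throw std::logic_error("secondary dataset was not allocated");
        }
        return secondary_.at(i);
    }

  private:
    std::vector<float> primary_;
    std::vector<float> secondary_;
    bool primary_only_;
};

// Helper that reports whether touching the secondary tier fails at runtime.
inline bool secondary_access_throws(const RuntimeFlagDataset& data) {
    try {
        (void)data.get_secondary(0);
    } catch (const std::logic_error&) {
        return true;
    }
    return false;
}

} // namespace sketch
```

With a `void` specialization, the equivalent mistake would be rejected by the compiler instead of being caught by a guard like the one in `get_secondary`.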
What the proposed approach gains
API consistency in this PR
- `bool leanvec_primary_only` disappears from `VamanaIndexLeanVec::build` and `DynamicVamanaIndexLeanVec::build` (4 overloads total). Build entry points stay shaped exactly like every other LeanVec / LVQ variant: pick a `StorageKind`, that's it.
- The dispatch story is uniform end-to-end: `StorageKind` → `StorageType<...>::type` → concrete `LeanDataset<...>` instantiation, with no side-channel state.
- Adding additional primary-only variants later (F16 primary, F32 primary, etc.) is a `StorageKind` enum value + a `StorageType` specialization: bounded, mechanical, no cross-cutting changes.
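As a rough illustration of that dispatch chain, the sketch below maps an enum value to a type through a trait and then hands the type to a callable. The enum values reuse names mentioned in this thread, but `Lvq`, `LeanData`, `dispatch`, and `builds_secondary` are hypothetical simplifications; the real runtime's dispatch macros and allocator parameters are not shown.

```cpp
#include <cassert>
#include <stdexcept>
#include <type_traits>
#include <utility>

namespace sketch {

// Hypothetical subset of storage kinds: one two-tier, one primary-only.
enum class StorageKind { LeanVec4x4, LeanVecLVQ4PrimaryOnly };

template <int Bits> struct Lvq {
    static constexpr int bits = Bits;
};

// Stand-in for the dataset type; T2 = void means "no secondary tier".
template <typename T1, typename T2> struct LeanData {
    static constexpr bool has_secondary = !std::is_void_v<T2>;
};

// StorageKind -> concrete type, one specialization per enum value.
template <StorageKind K> struct StorageType;
template <> struct StorageType<StorageKind::LeanVec4x4> {
    using type = LeanData<Lvq<4>, Lvq<4>>;
};
template <> struct StorageType<StorageKind::LeanVecLVQ4PrimaryOnly> {
    using type = LeanData<Lvq<4>, void>;
};
template <StorageKind K> using storage_t = typename StorageType<K>::type;

// Runtime enum -> compile-time type: the callable receives a value whose
// type is the selected storage. All branches must return the same type.
template <typename F> auto dispatch(StorageKind kind, F&& f) {
    switch (kind) {
        case StorageKind::LeanVec4x4:
            return f(storage_t<StorageKind::LeanVec4x4>{});
        case StorageKind::LeanVecLVQ4PrimaryOnly:
            return f(storage_t<StorageKind::LeanVecLVQ4PrimaryOnly>{});
    }
    throw std::invalid_argument("unsupported StorageKind");
}

// Example consumer: no bool is threaded through, the kind alone decides.
inline bool builds_secondary(StorageKind kind) {
    return dispatch(kind, [](auto storage) {
        return decltype(storage)::has_secondary;
    });
}

} // namespace sketch
```

Adding another primary-only variant in this sketch means one new enum value, one `StorageType` specialization, and one `case` in `dispatch`.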
Inherited from the core refactor (private repo)
- `secondary_data_type = void` makes "this dataset has no secondary" a compile-time fact. Calling `adapt_secondary` on a primary-only instance becomes a compile error.
- Compiler can eliminate unreachable secondary code paths; binary footprint for primary-only-heavy builds shrinks; no runtime branches in inner search loops.
FAISS plumbing
- `IndexSVSVamanaLeanVec` keeps its current public ctor (`bool leanvec_primary_only = false` for FAISS-API stability), but internally `create_impl()` just selects a different `StorageKind` value instead of plumbing the bool downstream. The downstream runtime API stays uniform with all other LeanVec variants.
Cost / risk
- Code churn in this PR: a few new enum values, matching `StorageType` specializations, a `LeanPrimaryOnlyType` alias, dispatch-macro additions, and removal of the bool argument from the four public build overloads + their impl ctors. Net delta is comparable to the current PR's; the work shifts from "plumb a bool" to "add new enum values + dispatch entries".
- Depends on the core `LeanDataset` refactor in the private repo landing first (or both PRs moving together). On-disk schema there is unchanged: same v0.0.1 TOML structure, just consumed by a different C++ type. See the consolidated discussion in the internal repository for the backward-compat detail.
- Vamana graph format is unaffected; it's independent of `LeanDataset` internals.
- Python bindings: `primary_only` is not exposed there yet, so out of scope.
Open question
The work is well-bounded but non-trivial, and the wins are mostly architectural (API consistency with every other StorageKind value, type safety, alignment with the LVQ one-level/two-level split, future composability) rather than measurable runtime-perf gains today. Happy to do the rework across both PRs if the team agrees it's the right shape long-term — wanted to lay out both sides clearly so we can decide together whether to take it on in this PR pair or as a follow-up.
Summary
Adds a new `leanvec_primary_only` parameter (default `false`) to the LeanVec
build entry points in the C++ runtime API. When enabled, the LeanVec
secondary (full-precision) dataset is not allocated, roughly halving the
LeanVec memory footprint for workloads that don't need re-ranking.
API changes
A trailing `bool leanvec_primary_only = false` argument is added to:
- `VamanaIndexLeanVec::build` (both overloads)
- `DynamicVamanaIndexLeanVec::build` (the `DynamicIndexParams` overloads)

Existing call sites are source- and ABI-compatible: the parameter is
defaulted, and the legacy back-compat overloads are unchanged.
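A minimal sketch of why a trailing defaulted argument keeps existing call sites compiling is shown below. The free functions and the `SearchParams`, `DynamicIndexParams`, and `BuiltIndex` structs are simplified, hypothetical stand-ins for the real build overloads and their parameter types.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

// Hypothetical parameter bundles; fields are placeholders.
struct SearchParams {
    std::size_t window_size = 10;
};
struct DynamicIndexParams {
    std::size_t blocksize = 1024;
};

struct BuiltIndex {
    bool has_secondary;
};

// Legacy overload: signature left untouched, always builds both tiers.
inline BuiltIndex build(const SearchParams&) { return BuiltIndex{true}; }

// Extended overload: the new trailing argument defaults to false, so calls
// written before the change still compile and keep the two-tier behavior.
inline BuiltIndex build(
    const SearchParams&, const DynamicIndexParams&, bool leanvec_primary_only = false
) {
    return BuiltIndex{!leanvec_primary_only};
}

} // namespace sketch
```

Here `build(SearchParams{}, DynamicIndexParams{})` behaves as before, and only callers that pass `true` opt in to the primary-only layout. A default argument is substituted at the call site, so it helps at the source level only; binaries compiled earlier still reference the old signatures, which is where unchanged legacy overloads matter.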
Plumbing
The flag is stored in `{Vamana,DynamicVamana}IndexLeanVecImpl` and forwarded
through `init_impl` → `build_impl` → `StorageFactory<LeanVec>::init` →
`LeanDataset::reduce`, where it skips secondary-dataset allocation and
disables re-ranking at search time.