Add leanvec_primary_only build option to C++ runtime#323

Open
ibhati wants to merge 11 commits into main from ib/leanvec-primary-only

Conversation

@ibhati
Member

@ibhati ibhati commented Apr 29, 2026

Summary

Adds a new leanvec_primary_only parameter (default false) to the LeanVec
build entry points in the C++ runtime API. When enabled, the LeanVec
secondary (full-precision) dataset is not allocated, roughly halving the
LeanVec memory footprint for workloads that don't need re-ranking.

API changes

A trailing bool leanvec_primary_only = false argument is added to:

  • VamanaIndexLeanVec::build (both overloads)
  • DynamicVamanaIndexLeanVec::build (the DynamicIndexParams overloads)

Existing call sites are source- and ABI-compatible: the parameter is
defaulted, and the legacy back-compat overloads are unchanged.
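
To illustrate the source-compatibility point only, here is a minimal sketch of a trailing defaulted flag. `ToyBuildConfig` and `toy_build` are hypothetical stand-ins, not the SVS runtime API; the real `build` overloads take other arguments.

```cpp
#include <cstddef>

// Hypothetical record of the options a build call ends up with.
struct ToyBuildConfig {
    std::size_t dims;
    std::size_t leanvec_dims;
    bool primary_only;
};

// Hypothetical build entry point: the new flag is the last parameter and is
// defaulted, so call sites written before the change still compile as-is.
constexpr ToyBuildConfig toy_build(
    std::size_t dims, std::size_t leanvec_dims, bool leanvec_primary_only = false
) {
    // A real build would construct an index; this only records the options.
    return ToyBuildConfig{dims, leanvec_dims, leanvec_primary_only};
}

// Pre-existing call shape: picks up the default (secondary dataset kept).
static_assert(!toy_build(128, 64).primary_only);
// New call shape: opts in to primary-only storage.
static_assert(toy_build(128, 64, true).primary_only);
```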

Plumbing

The flag is stored in {Vamana,DynamicVamana}IndexLeanVecImpl and forwarded
through init_impl → build_impl → StorageFactory<LeanVec>::init →
LeanDataset::reduce, where it skips secondary-dataset allocation and
disables re-ranking at search time.
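
The forwarding chain can be pictured with a toy model. Everything below (`ToyLeanDataset`, `ToyStorageFactory`, `ToyIndexImpl`) is a hypothetical stand-in, and truncating each vector stands in for the real dimensionality reduction; it only shows where the flag travels and what it suppresses.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Toy dataset: a reduced primary plus an optional full-precision secondary.
struct ToyLeanDataset {
    std::vector<float> primary;
    std::optional<std::vector<float>> secondary;

    // Re-ranking needs the secondary, so it is off whenever that is absent.
    bool can_rerank() const { return secondary.has_value(); }

    static ToyLeanDataset reduce(
        const std::vector<float>& data,
        std::size_t dims,
        std::size_t leanvec_dims,
        bool primary_only
    ) {
        assert(dims > 0 && leanvec_dims <= dims && data.size() % dims == 0);
        ToyLeanDataset out;
        const std::size_t n = data.size() / dims;
        out.primary.reserve(n * leanvec_dims);
        // Assumption: keeping the first leanvec_dims values stands in for
        // the real projection.
        for (std::size_t i = 0; i < n; ++i) {
            for (std::size_t j = 0; j < leanvec_dims; ++j) {
                out.primary.push_back(data[i * dims + j]);
            }
        }
        if (!primary_only) {
            out.secondary = data; // allocated only when re-ranking is wanted
        }
        return out;
    }
};

// Toy factory layer: forwards the flag unchanged.
struct ToyStorageFactory {
    static ToyLeanDataset init(
        const std::vector<float>& data,
        std::size_t dims,
        std::size_t leanvec_dims,
        bool primary_only
    ) {
        return ToyLeanDataset::reduce(data, dims, leanvec_dims, primary_only);
    }
};

// Toy impl layer: stores the flag at construction, uses it at build time.
class ToyIndexImpl {
  public:
    ToyIndexImpl(std::size_t dims, std::size_t leanvec_dims, bool primary_only)
        : dims_{dims}, leanvec_dims_{leanvec_dims}, primary_only_{primary_only} {}

    void build(const std::vector<float>& data) {
        data_ = ToyStorageFactory::init(data, dims_, leanvec_dims_, primary_only_);
    }

    const ToyLeanDataset& data() const { return data_; }

  private:
    std::size_t dims_;
    std::size_t leanvec_dims_;
    bool primary_only_;
    ToyLeanDataset data_;
};
```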

ibhati added 2 commits April 29, 2026 09:49
Adds a new `leanvec_primary_only` parameter (default false) to the
LeanVec build entry points in the C++ runtime API:

- VamanaIndexLeanVec::build (both overloads)
- DynamicVamanaIndexLeanVec::build (DynamicIndexParams overloads)

The flag is stored in {Vamana,DynamicVamana}IndexLeanVecImpl and
forwarded through init_impl -> build_impl -> StorageFactory<LeanVec>::init
-> LeanDataset::reduce, where it skips secondary-dataset allocation
and disables reranking. ABI back-compat overloads are unchanged.

Compiles standalone with SVS_RUNTIME_HAVE_LVQ_LEANVEC=OFF (stub returns
NOT_IMPLEMENTED). When enabled, requires the matching LeanDataset::reduce
overload that accepts the primary_only argument (added in the private
repository alongside the LeanDataset save/load support).
@ibhati ibhati marked this pull request as ready for review May 4, 2026 21:16
@ethanglaser
Member

Is there a corresponding FAISS update?

@ibhati
Member Author

ibhati commented May 5, 2026

> Is there a corresponding FAISS update?

Yes, I will publish it as well once we have this PR merged

@ethanglaser
Member

ethanglaser commented May 5, 2026

I'd recommend drafting a FAISS update and pointing the FAISS install in this PR to your FAISS branch, otherwise our CI will be failing

@ibhati
Member Author

ibhati commented May 6, 2026

> I'd recommend drafting a FAISS update and pointing the FAISS install in this PR to your FAISS branch, otherwise our CI will be failing

Makes sense, I will add the corresponding FAISS update

@ibhati ibhati requested a review from rfsaliev May 8, 2026 18:24
Member

@ethanglaser ethanglaser left a comment

These changes to test-cpp-runtime-bindings.sh should not be needed - please restore based on suggestions and let's see if CI runs smoothly

Comment on lines +34 to +43
+# Install libsvs-runtime from the public release tarball (includes the
+# leanvec_primary_only API). Once a conda-forge package with this API is
+# published, this can revert to:
+# conda install -y /runtime_conda/libsvs-runtime-*.conda
+SVS_RUNTIME_URL="${SVS_RUNTIME_URL:-https://github.com/intel/ScalableVectorSearch/releases/download/nightly/svs_runtime-0.3.0-linux-x86_64-leanvec-primary-only-glibc228.tar.gz}"
+SVS_RUNTIME_PREFIX="${SVS_RUNTIME_PREFIX:-$HOME/svs_runtime}"
+mkdir -p "$SVS_RUNTIME_PREFIX"
+curl -fsSL "$SVS_RUNTIME_URL" | tar -xz -C "$SVS_RUNTIME_PREFIX"
+export CMAKE_PREFIX_PATH="$SVS_RUNTIME_PREFIX${CMAKE_PREFIX_PATH:+:$CMAKE_PREFIX_PATH}"
+export LD_LIBRARY_PATH="$SVS_RUNTIME_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

Suggested change
-# Install libsvs-runtime from the public release tarball (includes the
-# leanvec_primary_only API). Once a conda-forge package with this API is
-# published, this can revert to:
-# conda install -y /runtime_conda/libsvs-runtime-*.conda
-SVS_RUNTIME_URL="${SVS_RUNTIME_URL:-https://github.com/intel/ScalableVectorSearch/releases/download/nightly/svs_runtime-0.3.0-linux-x86_64-leanvec-primary-only-glibc228.tar.gz}"
-SVS_RUNTIME_PREFIX="${SVS_RUNTIME_PREFIX:-$HOME/svs_runtime}"
-mkdir -p "$SVS_RUNTIME_PREFIX"
-curl -fsSL "$SVS_RUNTIME_URL" | tar -xz -C "$SVS_RUNTIME_PREFIX"
-export CMAKE_PREFIX_PATH="$SVS_RUNTIME_PREFIX${CMAKE_PREFIX_PATH:+:$CMAKE_PREFIX_PATH}"
-export LD_LIBRARY_PATH="$SVS_RUNTIME_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
+# Install libsvs-runtime from local conda package
+conda install -y /runtime_conda/libsvs-runtime-*.conda

 conda activate svsenv
 conda config --set solver libmamba
-conda install -y -c conda-forge cmake=3.30.4 make=4.2 swig=4.0 "numpy>=2.0,<3.0" scipy=1.16 pytest=7.4 gflags=2.2 setuptools
+conda install -y -c conda-forge cmake=3.30.4 make=4.2 swig=4.0 "numpy>=2.0,<3.0" scipy=1.16 pytest=7.4 gflags=2.2 setuptools curl

Suggested change
-conda install -y -c conda-forge cmake=3.30.4 make=4.2 swig=4.0 "numpy>=2.0,<3.0" scipy=1.16 pytest=7.4 gflags=2.2 setuptools curl
+conda install -y -c conda-forge cmake=3.30.4 make=4.2 swig=4.0 "numpy>=2.0,<3.0" scipy=1.16 pytest=7.4 gflags=2.2 setuptools

Member

@rfsaliev rfsaliev left a comment

This API change does not seem to fit the existing API style.
I suggest adding new values to the StorageKind enum rather than adding extra arguments to the index-building routines.

 const VamanaIndex::SearchParams& default_search_params,
-const VamanaIndex::DynamicIndexParams& dynamic_index_params
+const VamanaIndex::DynamicIndexParams& dynamic_index_params,
+bool leanvec_primary_only = false

Why not just extend the StorageKind enum with values such as LeanVec4 and LeanVec8?

using allocator_type = rebind_extracted_allocator_t<std::byte, Alloc>;
using type = LeanDatasetType<8, 8, allocator_type>;
};

Member

@rfsaliev rfsaliev May 20, 2026

If

  • the StorageKind enum had LeanVec4 and LeanVec8 values, and
  • svs::leanvec::LeanDataset supported void for the T2 parameter,

then the only changes needed in the C++ runtime would be the following new lines:
// LeanVec Primary-Only Storage support
template <
    size_t I1,
    typename Alloc,
    size_t LeanVecDims = svs::Dynamic,
    size_t Extent = svs::Dynamic>
using LeanPrimaryOnly = svs::leanvec::LeanDataset<
    svs::leanvec::UsingLVQ<I1>,
    void,
    LeanVecDims,
    Extent,
    Alloc>;

template <typename Alloc> struct StorageType<StorageKind::LeanVec4, Alloc> {
    using allocator_type = rebind_extracted_allocator_t<std::byte, Alloc>;
    using type = LeanPrimaryOnly<4, allocator_type>;
};
template <typename Alloc> struct StorageType<StorageKind::LeanVec8, Alloc> {
    using allocator_type = rebind_extracted_allocator_t<std::byte, Alloc>;
    using type = LeanPrimaryOnly<8, allocator_type>;
};

No other changes would be needed.

Member Author

@ibhati ibhati May 20, 2026

@rfsaliev — thanks for the review. Since the design choice here (bool leanvec_primary_only argument vs. extending StorageKind) is a direct consequence of how LeanDataset exposes primary-only mode in the core repo, I've posted a consolidated response on the internal counterpart. Reposting the relevant trade-offs here so the public-side discussion stands on its own.

A small framing point first, since it shapes what "homogenization" can mean across SVS storage variants:

  • LVQDataset: two separate classes — one-level (primary only) and two-level (primary + residual). The split is at the class level; LVQDataset<I1, 0, ...> reaches the one-level class via the I2=0 discriminator.
  • LeanDataset: a single class today. Primary is a dimensionality-reduced view (with or without compression), and the secondary is a separate type that can be a different family from the primary (e.g. LVQ-primary + uncompressed-secondary). The richer parameterization (T1, T2, LeanVecDims, Extent, Alloc) is why everything lives in one template.
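
As a rough picture of the LVQ side, a residual width of 0 can select a distinct one-level class. `ToyLVQ` below is hypothetical, and using a partial specialization as the selection mechanism is an assumption made for illustration, not a description of how LVQDataset is actually written.

```cpp
#include <cstddef>

// Primary template: two-level dataset (primary plus residual).
template <std::size_t Primary, std::size_t Residual> struct ToyLVQ {
    static constexpr bool has_residual = true;
};

// Partial specialization chosen when Residual == 0: one-level dataset.
// It is a separate class body, so it simply has no residual storage.
template <std::size_t Primary> struct ToyLVQ<Primary, 0> {
    static constexpr bool has_residual = false;
};

static_assert(ToyLVQ<4, 8>::has_residual);
static_assert(!ToyLVQ<8, 0>::has_residual);
```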

So full homogenization isn't possible — LeanDataset legitimately needs a multi-typed parameterization that LVQDataset doesn't. But the shape of "primary-only" can be made to match: a partial specialization on T2 = void in LeanDataset is the analog of LVQ's separate one-level class, sharing the same conceptual structure (primary-only is a distinct type, not a runtime mode) while preserving LeanVec's multi-parameter template. On the public-runtime side, that translates exactly to your suggestion: new StorageKind values for primary-only LeanVec variants instead of a bool argument.

The proposal, as I read it across the two PRs:

  1. In LeanDataset (core, private repo), extend the T2 constraint with a LeanCompatibleOrVoid concept and add a partial specialization LeanDataset<T1, void, LeanVecDims, Extent, Alloc> with secondary_data_type = void and no secondary_ member, no get_secondary / view_secondary_dataset / adapt_secondary, etc.
  2. Search extensions in extensions/{flat,ivf,vamana}/leanvec.h branch with if constexpr (std::is_same_v<typename Data::secondary_data_type, void>) instead of a runtime check.
  3. The runtime flag (primary_only_, set_primary_only, is_primary_only, and the if (primary_only_) ... guards in reduce/save/load/resize/compact) goes away.
  4. On the public-runtime side (this PR), the trailing bool leanvec_primary_only argument on VamanaIndexLeanVec::build / DynamicVamanaIndexLeanVec::build disappears, replaced by new StorageKind values (e.g. LeanVecLVQ4PrimaryOnly, LeanVecLVQ8PrimaryOnly) whose StorageType specializations resolve to LeanDataset<UsingLVQ<I>, void, ...>.
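
Items 1 and 2 can be sketched in a few lines of C++17. `ToyLean`, `has_get_secondary`, and `final_score` are hypothetical names, and the toy ignores LeanVecDims, Extent, and Alloc; it only shows a void specialization without secondary accessors and a search helper that branches with if constexpr.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Primary template: dual-tier dataset with a secondary of element type T2.
template <typename T1, typename T2> class ToyLean {
  public:
    using secondary_data_type = T2;
    ToyLean(std::vector<T1> primary, std::vector<T2> secondary)
        : primary_{std::move(primary)}, secondary_{std::move(secondary)} {}
    const std::vector<T1>& get_primary() const { return primary_; }
    const std::vector<T2>& get_secondary() const { return secondary_; }

  private:
    std::vector<T1> primary_;
    std::vector<T2> secondary_;
};

// Partial specialization: no secondary_ member and no get_secondary().
template <typename T1> class ToyLean<T1, void> {
  public:
    using secondary_data_type = void;
    explicit ToyLean(std::vector<T1> primary) : primary_{std::move(primary)} {}
    const std::vector<T1>& get_primary() const { return primary_; }

  private:
    std::vector<T1> primary_;
};

// Detection idiom: true only if D has a callable get_secondary().
template <typename D, typename = void> struct has_get_secondary : std::false_type {};
template <typename D>
struct has_get_secondary<D, std::void_t<decltype(std::declval<const D&>().get_secondary())>>
    : std::true_type {};

// Returns the value a search would finally score element i with.
// The branch is resolved at compile time; the discarded arm is never
// instantiated for the other dataset type.
template <typename Data> double final_score(const Data& data, std::size_t i) {
    assert(i < data.get_primary().size());
    if constexpr (std::is_same_v<typename Data::secondary_data_type, void>) {
        return static_cast<double>(data.get_primary()[i]); // no re-ranking
    } else {
        return static_cast<double>(data.get_secondary()[i]); // re-rank
    }
}

static_assert(has_get_secondary<ToyLean<int, float>>::value);
static_assert(!has_get_secondary<ToyLean<int, void>>::value);
```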

Below is a balanced look at the trade-offs so the team can decide if the rework is worth doing now vs. leaving as a follow-up.

Current approach (runtime bool leanvec_primary_only argument) — pros/cons

Pros

  • Minimal surface area in this PR. ~26 lines of plumbing across 6 runtime-binding files: VamanaIndexLeanVec::build (2 overloads), DynamicVamanaIndexLeanVec::build (2 overloads), StorageFactory<LeanVec>::init, and the impl ctors.
  • Source- and ABI-compatible. New bool argument defaults to false; existing C++ callers and any downstream FAISS layer compile unchanged.
  • No StorageKind enum changes, no dispatch-macro changes, no new StorageType specializations.
  • Doesn't depend on any other refactor; ships in isolation.

Cons / limitations

  • Runtime flag for a property that is structurally type-level. LeanDataset<T1, T2, ...> with primary_only_=true is, type-wise, identical to the dual-tier dataset — they're the same instantiation with different state.
  • API asymmetry with every other LeanVec / LVQ variant in the runtime layer. All other choices (LVQ4x0, LVQ8x0, LVQ4x4, LVQ4x8, LeanVec4x4, LeanVec4x8, LeanVec8x8, etc.) are expressed by picking a StorageKind; this one alone uses a side-channel bool. New readers have to learn two patterns for one conceptual axis.
  • Adds the same trailing argument to four public build overloads. If we later add another LeanVec build option (e.g. F16-primary-only, F32-primary-only), the same shape compounds: every new option either becomes another bool, or we restructure the API anyway.
  • Type-safety hole inherited from the core runtime flag: get_secondary(), view_secondary_dataset(), adapt_secondary() etc. all compile fine on a primary-only instance and only fail at runtime. The StorageKind-based approach makes those calls a compile error on the void specialization.
  • Compile-time elision is lost. Secondary-related code is instantiated and emitted for the LeanDataset<T1, T2, ...> storage type even when the runtime flag says it won't execute; hot-loop predicates aren't always hoistable.
  • The FAISS-side IndexSVSVamanaLeanVec::create_impl() has to forward an extra parameter through DynamicVamanaIndexLeanVec::build(...). With StorageKind-based dispatch, it just picks a different enum value and the runtime API stays uniform.
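
The type-safety point can be shown with a toy flagged dataset. `ToyFlaggedDataset` is hypothetical, and throwing std::logic_error is an assumed failure mode; the discussion only says such calls fail at runtime.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>
#include <vector>

// One type for both modes: whether a secondary exists is runtime state.
class ToyFlaggedDataset {
  public:
    ToyFlaggedDataset(std::vector<float> primary, std::vector<float> secondary, bool primary_only)
        : primary_{std::move(primary)}
        , secondary_{primary_only ? std::vector<float>{} : std::move(secondary)}
        , primary_only_{primary_only} {}

    bool is_primary_only() const { return primary_only_; }

    // Always compiles, whatever the flag says; misuse surfaces only when run.
    const std::vector<float>& get_secondary() const {
        if (primary_only_) {
            throw std::logic_error("no secondary dataset in primary-only mode");
        }
        return secondary_;
    }

  private:
    std::vector<float> primary_;
    std::vector<float> secondary_;
    bool primary_only_;
};

// Helper: reports whether touching the secondary fails at runtime.
inline bool secondary_access_throws(const ToyFlaggedDataset& dataset) {
    try {
        (void)dataset.get_secondary();
        return false;
    } catch (const std::logic_error&) {
        return true;
    }
}

// Usage: both objects have the same static type; only behaviour differs.
inline void demo_flagged_dataset() {
    ToyFlaggedDataset dual({1.0f}, {1.0f}, false);
    ToyFlaggedDataset lean({1.0f}, {1.0f}, true);
    assert(!secondary_access_throws(dual));
    assert(secondary_access_throws(lean));
}
```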

What the proposed approach gains

API consistency in this PR

  • bool leanvec_primary_only disappears from VamanaIndexLeanVec::build and DynamicVamanaIndexLeanVec::build (4 overloads total). Build entry points stay shaped exactly like every other LeanVec / LVQ variant: pick a StorageKind, that's it.
  • The dispatch story is uniform end-to-end: StorageKind → StorageType<...>::type → concrete LeanDataset<...> instantiation, with no side-channel state.
  • Adding additional primary-only variants later (F16 primary, F32 primary, etc.) is a StorageKind enum value + a StorageType specialization — bounded, mechanical, no cross-cutting changes.
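
A toy version of that enum-to-type dispatch, to make "an enum value plus a specialization" concrete. All names here (`ToyStorageKind`, `ToyStorageType`, `dispatch`, `HasSecondary`) are hypothetical, and a plain switch stands in for the runtime's dispatch macros.

```cpp
#include <stdexcept>
#include <type_traits>

// Toy dataset type: Secondary = void means primary-only.
template <int PrimaryBits, typename Secondary> struct ToyLean {
    static constexpr int primary_bits = PrimaryBits;
    static constexpr bool has_secondary = !std::is_same_v<Secondary, void>;
};
struct ToyLVQ8 {}; // placeholder secondary type

enum class ToyStorageKind { LeanVec4x8, LeanVec4PrimaryOnly, LeanVec8PrimaryOnly };

// One specialization per enum value maps the runtime tag to a concrete type.
template <ToyStorageKind K> struct ToyStorageType;
template <> struct ToyStorageType<ToyStorageKind::LeanVec4x8> {
    using type = ToyLean<4, ToyLVQ8>;
};
template <> struct ToyStorageType<ToyStorageKind::LeanVec4PrimaryOnly> {
    using type = ToyLean<4, void>;
};
template <> struct ToyStorageType<ToyStorageKind::LeanVec8PrimaryOnly> {
    using type = ToyLean<8, void>;
};
template <ToyStorageKind K> using storage_t = typename ToyStorageType<K>::type;

// Carries a type as a value so a functor can receive it.
template <typename T> struct Tag {
    using type = T;
};

// Runtime enum -> compile-time type. A new variant costs one case here.
template <typename F> constexpr auto dispatch(ToyStorageKind kind, F f) {
    switch (kind) {
        case ToyStorageKind::LeanVec4x8:
            return f(Tag<storage_t<ToyStorageKind::LeanVec4x8>>{});
        case ToyStorageKind::LeanVec4PrimaryOnly:
            return f(Tag<storage_t<ToyStorageKind::LeanVec4PrimaryOnly>>{});
        case ToyStorageKind::LeanVec8PrimaryOnly:
            return f(Tag<storage_t<ToyStorageKind::LeanVec8PrimaryOnly>>{});
    }
    throw std::invalid_argument("unknown ToyStorageKind");
}

// Example visitor: asks the selected type whether it has a secondary.
struct HasSecondary {
    template <typename T> constexpr bool operator()(Tag<T>) const {
        return T::has_secondary;
    }
};

static_assert(dispatch(ToyStorageKind::LeanVec4x8, HasSecondary{}));
static_assert(!dispatch(ToyStorageKind::LeanVec8PrimaryOnly, HasSecondary{}));
```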

Inherited from the core refactor (private repo)

  • secondary_data_type = void makes "this dataset has no secondary" a compile-time fact. Calling adapt_secondary on a primary-only instance becomes a compile error.
  • Compiler can eliminate unreachable secondary code paths; binary footprint for primary-only-heavy builds shrinks; no runtime branches in inner search loops.

FAISS plumbing

  • IndexSVSVamanaLeanVec keeps its current public ctor (bool leanvec_primary_only = false for FAISS-API stability), but internally create_impl() just selects a different StorageKind value instead of plumbing the bool downstream. The downstream runtime API stays uniform with all other LeanVec variants.
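
A sketch of what that selection could look like. `ToyStorageKind` and `select_storage_kind` are hypothetical, the enum reuses value names mentioned in this thread, and mapping by primary bit width is an assumption rather than the actual FAISS code.

```cpp
// Hypothetical subset of storage kinds, named after values used in this thread.
enum class ToyStorageKind {
    LeanVec4x4,
    LeanVec4x8,
    LeanVec8x8,
    LeanVecLVQ4PrimaryOnly,
    LeanVecLVQ8PrimaryOnly
};

// Folds the FAISS-level bool into the storage kind, so nothing below this
// point needs a separate primary-only argument.
constexpr ToyStorageKind select_storage_kind(
    ToyStorageKind requested, bool leanvec_primary_only
) {
    if (!leanvec_primary_only) {
        return requested;
    }
    switch (requested) {
        case ToyStorageKind::LeanVec4x4:
        case ToyStorageKind::LeanVec4x8:
            return ToyStorageKind::LeanVecLVQ4PrimaryOnly; // 4-bit primary
        case ToyStorageKind::LeanVec8x8:
            return ToyStorageKind::LeanVecLVQ8PrimaryOnly; // 8-bit primary
        default:
            return requested; // already a primary-only kind
    }
}

static_assert(
    select_storage_kind(ToyStorageKind::LeanVec4x8, false) == ToyStorageKind::LeanVec4x8
);
static_assert(
    select_storage_kind(ToyStorageKind::LeanVec4x8, true) ==
    ToyStorageKind::LeanVecLVQ4PrimaryOnly
);
```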

Cost / risk

  • Code churn in this PR: a few new enum values, matching StorageType specializations, a LeanPrimaryOnlyType alias, dispatch-macro additions, and removal of the bool argument from the four public build overloads + their impl ctors. Net delta is comparable to the current PR's; the work shifts from "plumb a bool" to "add new enum values + dispatch entries".
  • Depends on the core LeanDataset refactor in the private repo landing first (or both PRs moving together). The on-disk schema there is unchanged: same v0.0.1 TOML structure, just consumed by a different C++ type. See the consolidated discussion in the internal repository for the backward-compat detail.
  • Vamana graph format is unaffected — it's independent of LeanDataset internals.
  • Python bindings: primary_only is not exposed there yet, so out of scope.

Open question

The work is well-bounded but non-trivial, and the wins are mostly architectural (API consistency with every other StorageKind value, type safety, alignment with the LVQ one-level/two-level split, future composability) rather than measurable runtime-perf gains today. Happy to do the rework across both PRs if the team agrees it's the right shape long-term — wanted to lay out both sides clearly so we can decide together whether to take it on in this PR pair or as a follow-up.
