perf: merge half-open range queries on the same BTree index #7477

Open
xloya wants to merge 1 commit into lance-format:main from xloya:upstream-pr/btree-range-merge

xloya (Contributor) commented Jun 25, 2026

Problem

A filter like fqdn = x AND log_time >= A AND log_time <= B AND channel = y is compiled into two half-open range queries on the same BTree index (log_time >= A with an unbounded upper bound, and log_time <= B with an unbounded lower bound). Each half-open range matches nearly all BTree pages, so the entire index is loaded even though only the pages inside [A, B] are needed.

Fix

Add a ScalarIndexExpr::optimize() pass that:

  1. flattens the AND tree to collect leaf queries,
  2. merges Range queries on the same index into a single closed range,
  3. rebuilds the AND tree.

It is called from ScalarIndexExec::new() before execution.
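
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Expr`, `Query`, `leaf`, `and`, `tighter_lower`, `tighter_upper`, and `flatten_and` are hypothetical simplified stand-ins for the real `ScalarIndexExpr` machinery, values are plain `i64`, and recheck propagation is ignored.

```rust
use std::ops::Bound;

// Hypothetical, simplified stand-ins for the real query/expression types.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Query {
    Range(Bound<i64>, Bound<i64>),
    Equals(i64),
}

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Leaf { index: String, query: Query },
    And(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
    Not(Box<Expr>),
}

fn leaf(index: &str, query: Query) -> Expr {
    Expr::Leaf { index: index.to_string(), query }
}

fn and(l: Expr, r: Expr) -> Expr {
    Expr::And(Box::new(l), Box::new(r))
}

/// Of two lower bounds, keep the one that admits fewer values.
fn tighter_lower(a: Bound<i64>, b: Bound<i64>) -> Bound<i64> {
    use Bound::*;
    match (a, b) {
        (Unbounded, x) | (x, Unbounded) => x,
        (Included(x), Included(y)) => Included(x.max(y)),
        (Excluded(x), Excluded(y)) => Excluded(x.max(y)),
        // `> e` is at least as strict as `>= i` whenever e >= i.
        (Included(i), Excluded(e)) | (Excluded(e), Included(i)) => {
            if e >= i { Excluded(e) } else { Included(i) }
        }
    }
}

/// Of two upper bounds, keep the one that admits fewer values.
fn tighter_upper(a: Bound<i64>, b: Bound<i64>) -> Bound<i64> {
    use Bound::*;
    match (a, b) {
        (Unbounded, x) | (x, Unbounded) => x,
        (Included(x), Included(y)) => Included(x.min(y)),
        (Excluded(x), Excluded(y)) => Excluded(x.min(y)),
        // `< e` is at least as strict as `<= i` whenever e <= i.
        (Included(i), Excluded(e)) | (Excluded(e), Included(i)) => {
            if e <= i { Excluded(e) } else { Included(i) }
        }
    }
}

/// Step 1: collect the operands of a (possibly nested) AND tree.
fn flatten_and(expr: Expr, out: &mut Vec<Expr>) {
    match expr {
        Expr::And(l, r) => {
            flatten_and(*l, out);
            flatten_and(*r, out);
        }
        other => out.push(other),
    }
}

fn optimize(expr: Expr) -> Expr {
    match expr {
        Expr::And(..) => {
            let mut leaves = Vec::new();
            flatten_and(expr, &mut leaves);
            // Step 2: intersect Range leaves that target the same index.
            let mut merged: Vec<Expr> = Vec::new();
            for item in leaves.into_iter().map(optimize) {
                if let Expr::Leaf { index, query: Query::Range(lo, hi) } = &item {
                    let existing = merged.iter_mut().find_map(|m| match m {
                        Expr::Leaf { index: i, query: Query::Range(l, h) }
                            if i.as_str() == index.as_str() => Some((l, h)),
                        _ => None,
                    });
                    if let Some((l, h)) = existing {
                        *l = tighter_lower(*l, *lo);
                        *h = tighter_upper(*h, *hi);
                        continue;
                    }
                }
                merged.push(item);
            }
            // Step 3: rebuild a left-deep AND tree from what remains.
            merged
                .into_iter()
                .reduce(and)
                .expect("an AND node always has at least one operand")
        }
        // OR and NOT are preserved; only their children are optimized.
        Expr::Or(l, r) => Expr::Or(Box::new(optimize(*l)), Box::new(optimize(*r))),
        Expr::Not(e) => Expr::Not(Box::new(optimize(*e))),
        other => other,
    }
}
```

In this sketch, a tree equivalent to `fqdn = x AND log_time >= A AND log_time <= B AND channel = y` collapses to three operands, with the two `log_time` leaves replaced by one `Range(Included(A), Included(B))`. Ranges on different indices, and non-range queries, are left untouched.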

Benchmark

Method: build a real BTree index over 100,000 sorted unique i32 values, split into ~100 pages (batch_size = 1000). Query a narrow 50-value window in the middle of the range. Measure both (a) the number of BTree pages loaded (via LocalMetricsCollector.parts_loaded) and (b) query wall-clock time, for:

  • Before — the two half-open range searches the AND tree issues separately: value >= a (unbounded upper) and value <= b (unbounded lower);
  • After — the single closed range [a, b] produced by optimize().

Pages loaded:

| Execution | BTree pages loaded |
| --- | --- |
| Before (two half-open ranges) | 101 (≈ whole index) |
| After (merged closed range) | 1 |

101x fewer index pages loaded for this query shape.
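
One plausible accounting for the 101 vs 1 figures is sketched below. It is a back-of-the-envelope model, not the benchmark code: `pages_touched` is a hypothetical helper, values are assumed to be `0..100_000` in 1,000-value pages, and the query window is assumed to be `[50_000, 50_049]`, which falls entirely inside one page. A page counts as loaded when its min/max span overlaps the query range.

```rust
use std::ops::Bound;

/// Count pages whose [page_min, page_max] span overlaps the query range.
/// Pages hold `page_size` consecutive values starting at 0 (an assumption).
fn pages_touched(
    num_values: i32,
    page_size: i32,
    lower: Bound<i32>,
    upper: Bound<i32>,
) -> usize {
    (0..num_values)
        .step_by(page_size as usize)
        .filter(|&page_min| {
            let page_max = (page_min + page_size - 1).min(num_values - 1);
            // The page can contain a value satisfying the lower bound.
            let above_lower = match lower {
                Bound::Unbounded => true,
                Bound::Included(lo) => page_max >= lo,
                Bound::Excluded(lo) => page_max > lo,
            };
            // The page can contain a value satisfying the upper bound.
            let below_upper = match upper {
                Bound::Unbounded => true,
                Bound::Included(hi) => page_min <= hi,
                Bound::Excluded(hi) => page_min < hi,
            };
            above_lower && below_upper
        })
        .count()
}
```

Under these assumptions, `value >= 50_000` overlaps 50 pages and `value <= 50_049` overlaps 51, so the two separate searches load 101 pages in total (the page containing the window is counted by both), while the merged range overlaps exactly 1.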

Query latency:

| Scenario | Before (two half-open) | After (merged) | Speedup |
| --- | --- | --- | --- |
| In-memory (CPU/decode bound) | 18.6 ms/q | 0.28 ms/q | ~67x |
| Per-page latency (2 ms/GET) | 76.7 ms/q | 10.8 ms/q | ~7x |

In memory, the ~67x speedup reflects the decode/scan cost of touching the whole index versus a single page. On remote storage, the wall-clock gain depends on whether the workload is latency-bound (page fetches are parallelized, so the gain is smaller, as in the ~7x row above) or bandwidth-bound (the gain scales with the ~100x reduction in bytes read).

Test

test_optimize_* suite in expression.rs covers the merge logic (nested AND trees, exclusive bounds, no-merge cases for different indices / non-range queries, OR/NOT preservation, recheck propagation).

…index load

When a filter like 'fqdn = x AND log_time >= A AND log_time <= B AND channel = y'
is evaluated, the expression compiler splits the log_time range into two separate
half-open range queries (>= A with Unbounded upper, <= B with Unbounded lower).
Each half-open range matches nearly all BTree pages, causing the entire index to
be loaded (~433 MB, ~100s) even though only a few pages (~5 MB) are actually needed.

This fix adds a ScalarIndexExpr::optimize() pass that:
1. Flattens the AND expression tree to collect all leaf queries
2. Identifies Range queries on the same index
3. Merges them into a single closed-range query with tighter bounds
4. Rebuilds the AND tree with the merged result

The optimize() pass is called in ScalarIndexExec::new() before execution.

Expected improvement: queries combining BTree range filters with other index
filters (Bitmap) should see ~20x reduction in index I/O on first query.

github-actions bot added the A-index (Vector index, linalg, tokenizer) and performance labels Jun 25, 2026

codecov Bot commented Jun 25, 2026

Codecov Report

❌ Patch coverage is 93.75951% with 41 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| rust/lance-index/src/scalar/expression.rs | 93.75% | 38 Missing and 3 partials ⚠️ |