Eliminate redundant cardinality() pass in MaxScoreBulkScorer #15971

Open
iprithv wants to merge 2 commits into apache:main from iprithv:optimize-maxscore-eliminate-cardinality

iprithv commented Apr 21, 2026

Description

In MaxScoreBulkScorer.scoreInnerWindowMultipleEssentialClauses(), the cardinality() call was used solely to pre-size the docAndScoreAccBuffer before extracting matches from the bitset via forEach(). This resulted in two full passes over the bitset's 64 longs (for INNER_WINDOW_SIZE=4096): one for counting, one for extraction.

This change replaces growNoCopy(windowMatches.cardinality(0, innerWindowSize)) with growNoCopy(INNER_WINDOW_SIZE), eliminating the counting pass entirely. The buffer is reused across inner windows, so the one-time over-allocation (~48KB for int[] + double[]) is negligible.
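The difference between the two strategies can be sketched in plain Java. This is a simplified stand-in and not the Lucene code: the class `WindowExtractionSketch`, its method names, and the raw `long[]` bitset are all hypothetical, and the "before" variant allocates a fresh array where the real code grows a reusable buffer.

```java
import java.util.Arrays;

/** Hypothetical illustration of the pattern described above; not Lucene's implementation. */
final class WindowExtractionSketch {
  static final int INNER_WINDOW_SIZE = 4096;                // window size quoted in the description
  static final int WORDS = INNER_WINDOW_SIZE / Long.SIZE;   // 64 longs per window

  /** Before: one pass to count set bits (to size the buffer), then a second pass to extract. */
  static int[] countThenExtract(long[] words) {
    int count = 0;
    for (long w : words) {
      count += Long.bitCount(w);                            // pass 1: counting only
    }
    int[] docs = new int[count];
    extractInto(words, docs);                               // pass 2: extraction
    return docs;
  }

  /** After: the caller supplies a buffer of INNER_WINDOW_SIZE, so a single pass suffices. */
  static int extractInto(long[] words, int[] out) {
    int size = 0;
    for (int i = 0; i < words.length; i++) {
      long w = words[i];
      while (w != 0) {
        out[size++] = (i << 6) + Long.numberOfTrailingZeros(w);
        w &= w - 1;                                         // clear the lowest set bit
      }
    }
    return size;                                            // number of matches written
  }

  public static void main(String[] args) {
    long[] words = new long[WORDS];
    for (int doc : new int[] {3, 64, 4095}) {
      words[doc >>> 6] |= 1L << doc;                        // long shifts use the low 6 bits of doc
    }
    int[] exact = countThenExtract(words);
    int[] reusable = new int[INNER_WINDOW_SIZE];            // allocated once, reused per window
    int size = extractInto(words, reusable);
    System.out.println(Arrays.equals(exact, Arrays.copyOf(reusable, size)));
  }
}
```

Both variants yield the same doc IDs. The second one trades a fixed-size buffer for skipping the counting loop.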

Benchmark Results

JMH benchmark

Benchmark                          (matchCount)   Mode  Cnt  Score  Units   Change
oldCardinalityForEach (before)               50  thrpt    3  6.809  ops/us
newForEachNoCardinality (after)              50  thrpt    3  7.686  ops/us  +12.9%

oldCardinalityForEach (before)              128  thrpt    3  3.044  ops/us
newForEachNoCardinality (after)             128  thrpt    3  3.170  ops/us  +4.1%

oldCardinalityForEach (before)              500  thrpt    3  0.466  ops/us
newForEachNoCardinality (after)             500  thrpt    3  0.502  ops/us  +7.7%

oldCardinalityForEach (before)             1000  thrpt    3  0.242  ops/us
newForEachNoCardinality (after)            1000  thrpt    3  0.234  ops/us  ~same (-3.3%)

Throughput improves by roughly 4-13% across typical match densities (50-500 docs per window), which is the common range for multi-term BooleanQuery workloads. At 1000 matches per window the two variants are roughly on par.
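For reference, the relative changes can be recomputed from the reported scores. The snippet below is only a convenience sketch; the class name `RelativeThroughput` is made up here, and the numbers are copied from the table above.

```java
import java.util.Locale;

/** Hypothetical helper that recomputes relative throughput change from the reported scores. */
final class RelativeThroughput {
  /** Percent change of "after" relative to "before"; positive means higher throughput. */
  static double percentChange(double before, double after) {
    return (after - before) / before * 100.0;
  }

  public static void main(String[] args) {
    int[] matchCounts = {50, 128, 500, 1000};
    double[] before = {6.809, 3.044, 0.466, 0.242};   // oldCardinalityForEach, ops/us
    double[] after = {7.686, 3.170, 0.502, 0.234};    // newForEachNoCardinality, ops/us
    for (int i = 0; i < matchCounts.length; i++) {
      System.out.println(String.format(Locale.ROOT, "%4d matches: %+.1f%%",
          matchCounts[i], percentChange(before[i], after[i])));
    }
  }
}
```

With a Cnt of 3 per row, small differences such as the one at 1000 matches may well be within measurement noise.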

Context

This method is on the hot path for multi-clause BooleanQuery scoring:
IndexSearcher.search() → MaxScoreBulkScorer.score() → scoreInnerWindowMultipleEssentialClauses()

It is invoked for every 4096-doc inner window when a query has 2+ essential clauses. The cardinality() call was iterating all 64 longs of the bitset purely to determine a buffer size, work that can be avoided by pre-allocating to the maximum possible size.
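The sizes quoted in this description follow from simple arithmetic, shown here as a small sketch. The class `WindowSizing` is hypothetical, and the byte count covers only the element storage of one int and one double per slot, ignoring array headers.

```java
/** Hypothetical arithmetic check for the window and buffer sizes mentioned above. */
final class WindowSizing {
  static final int INNER_WINDOW_SIZE = 4096;   // docs per inner window, as stated in the description

  /** Number of 64-bit words needed to hold one bit per doc in the window. */
  static int wordsPerWindow() {
    return INNER_WINDOW_SIZE / Long.SIZE;
  }

  /** Bytes of element storage for a max-sized int[] plus double[] pair. */
  static int maxBufferBytes() {
    return INNER_WINDOW_SIZE * (Integer.BYTES + Double.BYTES);
  }

  public static void main(String[] args) {
    System.out.println(wordsPerWindow());          // 64
    System.out.println(maxBufferBytes() / 1024);   // 48 (KiB)
  }
}
```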

An intoArray()-based approach was also evaluated but proved slower for sparse windows (10-128 matches) because it scans empty words. The forEach() approach with pre-allocation was the best strategy across all densities tested.

github-actions bot added this to the 11.0.0 milestone Apr 21, 2026
Signed-off-by: prithvi <prithvisivasankar@gmail.com>
iprithv force-pushed the optimize-maxscore-eliminate-cardinality branch from 2aa8442 to 6c4804e, Apr 21, 2026 16:56