When a caller skips all remaining values on a page (to_skip >= values_left),
last_value will not be read again after the skip completes. The current
skip() implementation always decodes values through get_batch to maintain
last_value accuracy, which requires a heap-allocated scratch buffer and
unnecessary decode work.

Proposed fix: Detect the terminal case upfront. When terminal, use
BitReader::skip(n, bit_width) to advance the bit reader without decoding
individual values, and return early without updating last_value. This avoids
the scratch-buffer allocation entirely for the common "skip rest of page" case.
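
The terminal-skip check can be sketched roughly as follows. This is a simplified illustration, not the actual patch: `ToyBitReader` and `ToyDeltaDecoder` are hypothetical stand-ins, the decoder assumes a single fixed bit width with no min-delta or block structure, and the `decoded` counter exists only to show that the terminal path does no per-value decode work.

```rust
/// Hypothetical stand-in for a bit reader over a byte slice (LSB-first packing).
struct ToyBitReader<'a> {
    data: &'a [u8],
    bit_pos: usize,
}

impl<'a> ToyBitReader<'a> {
    fn new(data: &'a [u8]) -> Self {
        Self { data, bit_pos: 0 }
    }

    /// Advance past up to `n` values of `bit_width` bits without decoding them.
    /// Returns the number of values actually skipped.
    fn skip(&mut self, n: usize, bit_width: usize) -> usize {
        if bit_width == 0 {
            return n;
        }
        let remaining_bits = self.data.len() * 8 - self.bit_pos;
        let skipped = n.min(remaining_bits / bit_width);
        self.bit_pos += skipped * bit_width;
        skipped
    }

    /// Decode a single `bit_width`-bit value, or None if the input is exhausted.
    fn get(&mut self, bit_width: usize) -> Option<u64> {
        if self.bit_pos + bit_width > self.data.len() * 8 {
            return None;
        }
        let mut value = 0u64;
        for i in 0..bit_width {
            let pos = self.bit_pos + i;
            let bit = (self.data[pos / 8] >> (pos % 8)) & 1;
            value |= (bit as u64) << i;
        }
        self.bit_pos += bit_width;
        Some(value)
    }
}

/// Hypothetical, heavily simplified delta decoder: every delta has the same width.
struct ToyDeltaDecoder<'a> {
    reader: ToyBitReader<'a>,
    bit_width: usize,
    values_left: usize,
    last_value: i64,
    decoded: usize, // illustration only: counts values that were actually decoded
}

impl<'a> ToyDeltaDecoder<'a> {
    fn new(data: &'a [u8], bit_width: usize, values_left: usize, first: i64) -> Self {
        Self {
            reader: ToyBitReader::new(data),
            bit_width,
            values_left,
            last_value: first,
            decoded: 0,
        }
    }

    fn skip(&mut self, to_skip: usize) -> usize {
        // Terminal case: nothing will be read after this skip, so last_value
        // does not need to stay accurate and no value has to be decoded.
        if to_skip >= self.values_left {
            let skipped = self.reader.skip(self.values_left, self.bit_width);
            self.values_left -= skipped;
            return skipped;
        }

        // Non-terminal case: decode each delta so last_value stays correct
        // for the values that will be read afterwards.
        let mut skipped = 0;
        while skipped < to_skip {
            match self.reader.get(self.bit_width) {
                Some(delta) => {
                    self.last_value = self.last_value.wrapping_add(delta as i64);
                    self.decoded += 1;
                    skipped += 1;
                }
                None => break,
            }
        }
        self.values_left -= skipped;
        skipped
    }
}
```

In this sketch, a partial skip still walks the deltas one by one, while a skip that covers the rest of the page only moves the bit position forward and leaves `last_value` untouched.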

Measured improvement (arrow_reader bench, vs upstream HEAD):
- mixed stepped skip: -3.9%