Skip to content

[VL][Delta] Add native Delta bitmap aggregation support#12214

Merged
zhztheplayer merged 4 commits into
apache:mainfrom
malinjawi:split/delta-dv-native-bitmap-pr
Jun 17, 2026
Merged

[VL][Delta] Add native Delta bitmap aggregation support#12214
zhztheplayer merged 4 commits into
apache:mainfrom
malinjawi:split/delta-dv-native-bitmap-pr

Conversation

@malinjawi

@malinjawi malinjawi commented Jun 1, 2026

Copy link
Copy Markdown
Contributor

What changes are proposed in this pull request?

Deleting rows from a Delta table with deletion vectors means building (and merging) a roaring bitmap of deleted row indices. Today that happens on the JVM. This adds the native Velox building block for it, so a later PR can run DELETE's bitmap construction natively.

It's just the primitive — nothing is wired into DELETE yet:

  • RoaringBitmapArray handles Delta's Portable-format DV payloads, with bounded deserialization (CRoaring portable sizing before readSafe)
  • a native bitmapaggregator aggregate for row-index aggregation, registered and wired through Gluten's expression/substrait planning
  • native tests for serialize/deserialize + aggregate behavior, and a delta_bitmap_benchmark

Explicitly out of scope (follow-up PRs): DELETE command routing, DML row-index scan planning, and enabling this aggregate as the default DELETE path.

How was this patch tested?

Rebased onto current main (after the DV scan handoff merged in #12269) and re-checked the JVM wiring with JDK 17 (mvn compile -pl gluten-substrait -am, spotless:check, git diff --check). The native build/tests run in CI — that's the authoritative lane here, since local C++ validation is limited by the Velox build tree.

Native evidence from the standalone run (C++ is unchanged by the rebase):

  • RoaringBitmapArrayTest passes; round-trips a payload through both native and a Delta 3.3.2 JVM helper (values 1, 7, 1 << 33; cardinality 3)
  • delta_bitmap_benchmark: 1M build+serialize ~8–10 ms (~105–132M rows/s), merge of 64 partials ~1.1 ms, deserialize+probe ~487 µs

Was this patch authored or co-authored using generative AI tooling?

Generated-by: IBM BOB

@github-actions github-actions Bot added CORE works for Gluten Core VELOX labels Jun 1, 2026
@github-actions

github-actions Bot commented Jun 1, 2026

Copy link
Copy Markdown

Run Gluten Clickhouse CI on x86

2 similar comments
@github-actions

github-actions Bot commented Jun 1, 2026

Copy link
Copy Markdown

Run Gluten Clickhouse CI on x86

@github-actions

github-actions Bot commented Jun 1, 2026

Copy link
Copy Markdown

Run Gluten Clickhouse CI on x86

@malinjawi malinjawi marked this pull request as ready for review June 1, 2026 18:39
Add the native bitmap primitive needed by later Delta deletion-vector
DELETE/MoR work, without changing DELETE routing or enabling native
bitmap construction in the command path yet.

- extend RoaringBitmapArray for Delta Portable-format DV payloads, with
  bounded deserialization (CRoaring portable deserialize sizing before
  readSafe)
- add the native bitmapaggregator aggregate for Delta row-index
  aggregation and register it
- wire the aggregate name through Gluten expression / substrait planning
  (ExpressionNames.BITMAP_AGGREGATOR, AggregateFunctionsBuilder)
- add native tests for bitmap serialization/deserialization and aggregate
  behavior, plus a delta_bitmap_benchmark for construction, partial-merge,
  and deserialize/probe cases

Primitive-only: no DELETE command routing, no DML row-index scan planning
changes, and the aggregate is not enabled as the default DELETE path.
@malinjawi malinjawi force-pushed the split/delta-dv-native-bitmap-pr branch from 2122ea6 to 1daeebf Compare June 15, 2026 12:39
@github-actions

Copy link
Copy Markdown

Run Gluten Clickhouse CI on x86

@github-actions

Copy link
Copy Markdown

Run Gluten Clickhouse CI on x86

@zhztheplayer zhztheplayer left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall. Thanks.

Comment on lines +70 to +78
private def isPortableDeltaBitmapAggregator(aggregateFunc: AggregateFunction): Boolean = {
aggregateFunc.prettyName == ExpressionNames.BITMAP_AGGREGATOR &&
Try {
aggregateFunc.getClass
.getMethod("serializationFormatString")
.invoke(aggregateFunc)
.asInstanceOf[String]
}.toOption.contains("Portable")
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you explain a bit more about the method? Why reflection is needed, etc.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, dug into this and the reflection wasn't actually buying us anything, so I dropped it it's just a prettyName match now (cb2f287).

The Portable check was me being paranoid. Delta only ever builds this aggregator in one spot (DMLWithDeletionVectorsHelper) and always with the Portable format, so it never actually filtered anything. And if a weird non-Portable payload ever did show up, the native deserialize already throws on the magic number so the real check is on the C++ side anyway. Bonus: gluten-substrait no longer needs to know anything about delta-spark.


namespace gluten {

void registerDeltaBitmapAggregator(

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a new source folder operators/functions/delta, for all Delta functions?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, moved them into operators/functions/delta/ and fixed up the include + test + CMake (cb2f287). Future Delta functions can land there too.

@github-actions

Copy link
Copy Markdown

Run Gluten Clickhouse CI on x86

…eflection

- move DeltaBitmapAggregator.{cc,h} into cpp/velox/operators/functions/delta/
  so Delta native functions live under their own source folder; update the
  include in RegistrationAllFunctions.cc, the test, and CMakeLists
- drop the reflective serialization-format check in the bitmap-aggregator match
  and match on prettyName only. Delta's only BitmapAggregator construction site
  (DMLWithDeletionVectorsHelper) always uses the Portable format, and a
  non-Portable payload is rejected natively by RoaringBitmapArray::deserialize
  (magic-number check), so the reflection was redundant
@malinjawi malinjawi force-pushed the split/delta-dv-native-bitmap-pr branch from fc48276 to cb2f287 Compare June 16, 2026 15:54
@github-actions

Copy link
Copy Markdown

Run Gluten Clickhouse CI on x86

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds native Velox support for Delta’s deletion-vector bitmap construction by introducing a bounded portable-format roaring bitmap wrapper and a corresponding bitmapaggregator aggregate, plus native tests and a microbenchmark. This enables future work to move DELETE/MoR DV bitmap construction off the JVM and into native execution.

Changes:

  • Add a delta::RoaringBitmapArray implementation matching Delta JVM’s portable DV payload shape, with bounded deserialization and helpers for merge/cardinality/last/optimized serialization.
  • Add and register a native Velox aggregate bitmapaggregator (bigint → row(bigint, bigint, varbinary)) and wire Spark/Substrait naming to it.
  • Add native tests for serialization/deserialization and aggregate behavior, plus delta_bitmap_benchmark.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
shims/common/src/main/scala/org/apache/gluten/expression/ExpressionNames.scala Adds the bitmapaggregator expression name constant.
gluten-substrait/src/main/scala/org/apache/gluten/expression/AggregateFunctionsBuilder.scala Maps Delta’s BitmapAggregator (by prettyName) to the Substrait function name.
cpp/velox/operators/functions/RegistrationAllFunctions.cc Registers the new Delta bitmap aggregate with the Velox function registry.
cpp/velox/operators/functions/delta/DeltaBitmapAggregator.h Declares the registerDeltaBitmapAggregator entry point.
cpp/velox/operators/functions/delta/DeltaBitmapAggregator.cc Implements and registers the bitmapaggregator Velox aggregate.
cpp/velox/compute/delta/RoaringBitmapArray.h Redesigns the DV bitmap wrapper to match Delta’s JVM high-key → 32-bit roaring layout and adds new APIs.
cpp/velox/compute/delta/RoaringBitmapArray.cpp Implements bounded portable-format (de)serialization, merge, cardinality, last, and optimized serialization.
cpp/velox/compute/delta/tests/RoaringBitmapArrayTest.cpp Expands coverage for header compatibility, sparse-gap payloads, bounds enforcement, and trailing-byte checks.
cpp/velox/compute/delta/tests/DeltaBitmapAggregatorTest.cpp Adds end-to-end aggregate tests (null/dup handling, partial+final merge, bounds checks).
cpp/velox/compute/delta/tests/CMakeLists.txt Adds the new aggregate test to the delta test executable.
cpp/velox/CMakeLists.txt Adds the new aggregate implementation file to the Velox build sources.
cpp/velox/benchmarks/DeltaBitmapBenchmark.cc Adds a benchmark for build/serialize, deserialize/probe, and merging partial payloads.
cpp/velox/benchmarks/CMakeLists.txt Registers the new delta_bitmap_benchmark target.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +112 to +120
exec::registerAggregateFunction(
prefix + "bitmapaggregator",
std::move(signatures),
[](core::AggregationNode::Step step,
const std::vector<TypePtr>& argTypes,
const TypePtr& resultType,
const core::QueryConfig& /* config */) -> std::unique_ptr<exec::Aggregate> {
VELOX_CHECK_EQ(argTypes.size(), 1, "bitmapaggregator takes one argument");
VELOX_CHECK_EQ(argTypes[0]->kind(), exec::isRawInput(step) ? TypeKind::BIGINT : TypeKind::VARBINARY);

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in 3026539

The native bitmapaggregator is registered with Velox but was missing from
SubstraitToVeloxPlanValidator's supportedAggFuncs allow-list, so any plan
containing it would validate-fail and fall back to Spark instead of running
natively. Add it next to bitmap_construct_agg.
@github-actions

Copy link
Copy Markdown

Run Gluten Clickhouse CI on x86

@malinjawi malinjawi requested a review from zhztheplayer June 17, 2026 12:18

@zhztheplayer zhztheplayer left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@zhztheplayer zhztheplayer merged commit c421195 into apache:main Jun 17, 2026
63 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CORE works for Gluten Core VELOX

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants