Add citus.allow_unsafe_insert_select_pushdown for shard-local batched INSERT ... SELECT#8625
Draft
onurctirtir wants to merge 16 commits into
Draft
Add citus.allow_unsafe_insert_select_pushdown for shard-local batched INSERT ... SELECT#8625onurctirtir wants to merge 16 commits into
onurctirtir wants to merge 16 commits into
Conversation
…tching pushdown Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #8625 +/- ##
==========================================
- Coverage 88.73% 88.71% -0.03%
==========================================
Files 288 288
Lines 64384 64446 +62
Branches 8108 8126 +18
==========================================
+ Hits 57133 57175 +42
- Misses 4909 4918 +9
- Partials 2342 2353 +11 🚀 New features to boost your workflow:
|
Fixes check-style ci/check_gucs_are_alphabetically_sorted.sh. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Wrap the two EXPLAIN statements in public.explain_filter(..., true) so the PG18-only "Window:" line is stripped and the plan footer row counts match across supported Postgres versions. Regenerate the expected output, which also re-syncs blank lines that had drifted from the .sql. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
7e67c16 to
f6f01e7
Compare
Validate the shard-local INSERT..SELECT batching pushdown plan directly on PG18 (no public.explain_filter), so the PG18-only WindowAgg "Window:" line is exercised in pg18.sql. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Cover each relaxed branch (volatile fn, derived insert dist col, GROUP BY on non-dist col, aggregates and HAVING without GROUP BY, window not on dist col, DISTINCT without dist col), a few combinations, and the same constructs nested in subqueries. Add negative tests showing the GUC does not apply to reference-table targets and never relaxes LIMIT/OFFSET. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The unsafe insert-select pushdown GUC no longer relaxes the volatile function ban: a batched-embeddings style UDF is immutable, so the GUC only needs to relax grouping / window / aggregate / DISTINCT on non-distribution columns and partition-column matching. Volatile functions stay blocked from shard pushdown and fall back to the coordinator plan as before. - insert_select_planner.c: always defer on volatile functions - shared_library_init.c: drop "volatile functions" from the GUC description - tests: make batch_transform IMMUTABLE PARALLEL SAFE, replace the volatile EXPLAIN test with the batched-embeddings benchmark query shape (row_number()/batch_size bucketing + array_agg + batch UDF + unnest zip-back) as the SELECT of an INSERT .. SELECT, with an id<->val correctness check, and a negative test showing a volatile function is still not pushed down even with the GUC enabled. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PG18 emits 'Insert on res_... citus_table_alias' for the shard-local INSERT..SELECT pushdown plan; the committed pg18.out was missing the alias, causing the pg18 test (and all Test flakyness shards) to fail on CI. Match CI output. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
DESCRIPTION: Adds
citus.allow_unsafe_insert_select_pushdownto allow maybe unsafe shard-local batchedINSERT .. SELECT ..Some workloads want to call an expensive, batch-oriented UDF from a colocated
INSERT … SELECT— e.g.process_array(text[]), which returns somethingper element, in order. The natural way to use it is to batch rows with
array_agg(+ a window function to form batches), call the UDF once per batch, then
unnestthe result back to one row per input and insert it.
On a distributed table this SELECT looks like it "requires a merge" (e.g., GROUP
BY / window on a non-distribution column), so Citus today falls back to
pull-to-coordinator: it pulls every shard's rows to the coordinator, batches
there, and calls the UDF there. That defeats the purpose — the batching and the
expensive call should run on the shards.
What this adds
With
citus.allow_unsafe_insert_select_pushdown(defaultoff), for acolocated
INSERT … SELECT, it relaxes the merge-step checks that wouldotherwise cause pull-to-coordinator, so the whole SELECT — grouping, window,
and the batch UDF — is pushed down and runs shard-local, one task per shard.
Concretely, when the GUC is on:
no longer force a merge step — for the top-level SELECT and the subqueries.
partition column as a plain
Var. Instead it may be a provably shard-localbatch pass-through of the distribution column, i.e.
unnest(array_agg(dist_key))(InsertPartitionColumnMatchesSelectis skippedonly when this exact pattern is detected). Because those values are read
straight from this shard's rows and — since source and target are colocated —
hash right back into this shard's range, routing stays correct. The NOT NULL
filter Citus normally injects on that column is skipped too, since there is no
plain
Varto attach it to.However, the following are still enforced:
unnest(array_agg(dist_key))pass-through(or a plain
Var). Any other derived distribution value — e.g.dist_key + 1,length(text_col), orunnest(f(array_agg(dist_key)))— isstill rejected even with the GUC on, because it could produce values that hash
to a different shard and silently misroute rows.
enforced. This is what keeps every batch shard-local, and is the reason the
relaxation is sound rather than arbitrary.
expected to be immutable, so this ban is not relaxed).
Within those guardrails the user still takes responsibility for making each batch
order-preserving (e.g.
array_agg(... ORDER BY ...)) so generated rows line up.EXPLAIN — before / after
The batch shape: bucket rows into fixed-size batches with
row_number()/batch_size,array_aggeach batch (id and text in the same order), call the batch UDF onceper batch, then
unnestback to one row per id.Default (
off) — batching happens after pulling every shard's rows to thecoordinator, so the window scan runs a distributed subplan and the grouping +
UDF run in a single coordinator task over the intermediate results:
With
citus.allow_unsafe_insert_select_pushdown = on— the window, grouping, thebatch UDF, and the INSERT all run shard-local, one task per shard, with no
distributed subplan / intermediate results: