Dispatch and discriminate REPACK/CLUSTER (T_RepackStmt) on PG19 (#8613)#8624
Open
ihalatci wants to merge 1 commit into
Open
Dispatch and discriminate REPACK/CLUSTER (T_RepackStmt) on PG19 (#8613)#8624ihalatci wants to merge 1 commit into
ihalatci wants to merge 1 commit into
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## pg19-support #8624 +/- ##
===============================================
Coverage ? 88.95%
===============================================
Files ? 288
Lines ? 64392
Branches ? 8096
===============================================
Hits ? 57278
Misses ? 4780
Partials ? 2334 🚀 New features to boost your workflow:
|
32b630f to
d98bdb9
Compare
PG19 replaced ClusterStmt with the unified RepackStmt, which backs both the legacy CLUSTER command and the new REPACK command under a single node tag (T_RepackStmt). The foundation shim in pg_version_compat.h aliases the old type/tag names so existing Citus code keeps compiling; this change makes the command handling correct rather than merely compiling. Citus now discriminates CLUSTER vs REPACK via RepackStmt->command, exposed through version-portable helpers ClusterStmtIsRepack() and ClusterStmtCommandName() (constant CLUSTER/false fallbacks pre-PG19). PreprocessClusterStmt propagates REPACK exactly like CLUSTER -- the original command text is shipped to every shard placement -- and only varies the user-facing WARNING/ERROR wording by command. The worker name-relay in relay_event_utility.c appends the shardId to both the target relation name and the optional USING INDEX name for either command. VACUUM FULL is unaffected: core keeps dispatching it through T_VacuumStmt, so it never reaches this CLUSTER/REPACK path. Adds a PG19-only regress test (pg19) that distributes a table and proves REPACK ... USING INDEX, bare REPACK, and CLUSTER ... USING each rewrite every shard placement (relfilenode change), preserve row data, and that VERBOSE is rejected with command-aware wording. PG17/18 run the early-quit variant (pg19_0.out) and stay regression-neutral. Validated under full -Werror on PG17.10, PG18.4, and PG19beta1 -- all green. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
d98bdb9 to
c3f090a
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Part of #8597 (PostgreSQL 19 support).
Addresses #8613 — dispatch and discriminate the PG19 unified REPACK/CLUSTER command (
T_RepackStmt).What & why
PG19 removed
ClusterStmt/T_ClusterStmtand replaced them with the unifiedRepackStmt, which backs three commands distinguished byRepackStmt.command:CLUSTER, the newREPACK, andVACUUM FULL. (VACUUM FULL still dispatches throughT_VacuumStmt, so it never reaches this path.) The foundation PR (#8601) added a compile-only shim aliasingT_ClusterStmt→T_RepackStmt; that makes the names compile but is not correct behaviour, because CLUSTER and REPACK now collide on a single node tag and must be told apart.This PR makes Citus dispatch and propagate the command correctly on PG19, while staying byte-for-byte regression-neutral on PG17/PG18.
Approach (Option A)
Citus propagates REPACK exactly like CLUSTER: the command is shipped to every shard placement through the existing CLUSTER code path. The worker name-relay (
RelayEventExtendNames→AppendShardIdToName) mutates the parse tree in place, appending the shardId to both the target relation name and the index name (whenUSING INDEXis given), thenProcessUtilityParseTreeexecutes the mutated node — there is no deparse-to-string step, so the existing relabel localizes REPACK's names with no new deparser work. CLUSTER vs REPACK are discriminated byRepackStmt.command, and the user-facing WARNING/ERROR wording is command-aware.Files
src/include/pg_version_compat.h— RepackStmt shim + version-portable helpersClusterStmtIsRepack()/ClusterStmtCommandName()(compiled out on< PG19).src/backend/distributed/commands/cluster.c—PreprocessClusterStmtis command-aware (correct REPACK vs CLUSTER wording in the no-relation WARNING, the partitioned WARNING, and the VERBOSE ERROR).src/backend/distributed/commands/distribute_object_ops.c—GetDistributeObjectOpsroutes both CLUSTER and REPACK (sharedT_ClusterStmt/T_RepackStmttag).src/backend/distributed/relay/relay_event_utility.c— name-relay reads the RepackStmt layout (relation + index name).src/test/regress/sql/pg19.sql,expected/pg19.out,expected/pg19_0.out— new PG19-only acceptance test (+ the< PG19early-quit variant).src/test/regress/multi_1_create_citus_schedule— registers thepg19test.Acceptance test (PG19-only,
pg19.sql)The test distributes
repack_test(4 shards, replication factor 1) and demonstrates (not merely asserts) that distributed REPACK works:REPACK <t> USING INDEX <idx>, bareREPACK <t>, andCLUSTER <t> USING <idx>each rewrite every shard placement — the relfilenode changes on all shards (verified viarun_command_on_shards).placements_unchanged— shard set + placement nodes identical before/after (symmetricEXCEPT).shard_count_unchanged— shard count identical before/after.shard_row_mapping_unchanged— per-shardcount(*)identical before/after (no row crossed a shard boundary).distribution_unchanged—(partmethod, partkey, colocationid)intact.a = 42 → (42, 2)anda = 100 → (100, 0); cross-shard aggregatecount = 100, sum = 5050.VERBOSE→ REPACK/CLUSTER-worded ERROR; partitioned distributed table →not propagating REPACK command for partitioned table to worker nodesWARNING.The
< PG19early-quit path keeps PG17/PG18 on the unchangedpg19_0.out, so this test is invisible to older majors.Validation gate (WSL, pgenv, full
-Werror, no demotion)-Werror)\q→ unchangedpg19_0.out)\q→ unchangedpg19_0.out)t)CLUSTER's existing tests/expected (
multi_index_statements,pg15) are unchanged on all versions.