Skip to content

syncer(dm): add MariaDB ddl rewriter#12711

Open
joechenrh wants to merge 17 commits into
pingcap:masterfrom
joechenrh:dm-mariadb-ast-rules
Open

syncer(dm): add MariaDB ddl rewriter#12711
joechenrh wants to merge 17 commits into
pingcap:masterfrom
joechenrh:dm-mariadb-ast-rules

Conversation

@joechenrh

@joechenrh joechenrh commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

What problem does this PR solve?

Issue Number: None

MariaDB DDL can contain syntax that TiDB's parser accepts but TiDB later rejects during DDL validation or execution. One example is a non-time column with a time-function default:

CREATE TABLE t(c VARCHAR(100) DEFAULT current_timestamp());

Without a compatibility layer, these statements can pass DM parsing and then fail later when applied downstream.

What is changed and how it works?

This PR adds a focused AST-level DDL rewriter for the DM syncer path. The rewriter is a best-effort compatibility layer for known parser-accepted MariaDB DDL patterns, not a replacement for TiDB's parser or DDL validator. SQL that TiDB's parser rejects never reaches this layer, and DDL failures that can be handled by downstream session settings, such as SQL mode, continue through the normal DM flow.

  • Add dm/pkg/ddl/rewriter with a RewriteStmt entry point and explicit rule options.
  • Make rewrite rules best-effort: rules mutate the AST when they can confirm a supported rewrite and otherwise leave the statement unchanged. The rewriter only reports whether the AST changed; it does not create a new DDL failure path.
  • Enable MariaDB compatibility rules only when the upstream flavor is MariaDB.
  • Move the existing collation_compatible=strict collation adjustment into the same rewriter package as an independent strict-collation rule, preserving its existing best-effort behavior.
  • Apply MariaDB compatibility rules after TiDB parser successfully returns an AST and before SplitDDL restores SQL for downstream execution and schema tracking.
  • Apply strict collation rules when building per-DDL info, before routing restores SQL from the cached AST.

The MariaDB compatibility rules are selected from both MariaDB's own mysql-test cases and the mariadb2tidb rule set, then narrowed to AST-local transformations that fit DM's row-based sync assumptions. Summarized rule coverage and non-goals:

  • remove unsupported time-function defaults from column types that TiDB rejects, for example VARCHAR DEFAULT current_timestamp() or DECIMAL DEFAULT CURRENT_TIMESTAMP(6);
  • remove literal non-NULL defaults from TEXT/BLOB/JSON columns while keeping DEFAULT NULL and TiDB-supported expression defaults such as DEFAULT(json_object(...));
  • add prefix lengths only for plain secondary indexes on TEXT/BLOB columns or oversized CHAR/VARCHAR columns;
  • rewrite generated-column JSON_VALUE(json_doc, path) expressions to JSON_UNQUOTE(JSON_EXTRACT(json_doc, path)) as a best-effort TiDB-compatible expression, since TiDB does not provide a JSON_VALUE builtin.

Rules that would silently change semantics, such as removing generated columns, dropping JSON_VALID checks, removing zero-time defaults that can be handled by adjusted downstream SQL mode, rewriting unique/primary index prefixes, or changing column definitions only to satisfy index limits, are intentionally not enabled.

Check List

Tests

  • Unit test

Executed:

go test ./dm/pkg/ddl/rewriter
go test ./dm/syncer -run 'TestParseOneStmtWithMariaDBASTRewriter|TestAdjustCollation'
go test ./dm/syncer -run TestSuite -check.f 'TestAdjustDatabaseCollation'
go test ./dm/pkg/ddl/rewriter ./dm/syncer -run 'TestRewrite|TestParseOneStmtWithMariaDBASTRewriter|TestAdjustCollation'

Questions

Will it cause performance regression or break compatibility?

No expected performance regression. The rewriter only runs on parsed DDL statements. MariaDB compatibility rules are enabled only for MariaDB upstreams, and strict collation rules are enabled only when collation_compatible=strict.

The compatibility rules are best-effort and intentionally conservative to avoid broad semantic changes. Unsupported parser failures, failures that can be handled by downstream session settings, and unsupported TiDB validation failures outside the verified rule scope continue to be handled by the existing DM flow.

Do you need to update user documentation, design documentation or monitoring documentation?

No user-facing documentation is included in this draft. The change is intentionally scoped to an internal AST rewrite layer and syncer integration. Rule coverage and non-goals are documented in https://pingcap.feishu.cn/wiki/JtgQwqbZTiU2qlkcKjWcbiRrnqc.

Release note

dm: add an AST-level MariaDB DDL compatibility rewriter for parser-accepted statements that TiDB rejects during downstream DDL execution.

@ti-chi-bot

ti-chi-bot Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@ti-chi-bot ti-chi-bot Bot added do-not-merge/needs-linked-issue do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Jun 11, 2026
@ti-chi-bot

ti-chi-bot Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign lance6716 for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot Bot added area/dm Issues or PRs related to DM. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Jun 11, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new mariadbcompat package to handle AST-based compatibility rewriting for MariaDB DDL statements, integrating it into the DM syncer to support MariaDB upstreams. The review feedback highlights several critical improvements, including fixing a potential nil pointer dereference in indexPrefixRule for expression-based indexes, unwrapping parentheses in isJSONGenerated to prevent parsing failures, replacing platform-dependent filepath.Match with strings.HasPrefix, and removing redundant code in collationRule and rewriteDatabaseOptions. Additionally, an optimization was suggested to write directly to strings.Builder in restoreStatements to avoid unnecessary allocations.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread dm/pkg/ddl/rewriter/mariadb_rules.go
Comment thread dm/pkg/ddl/rewriter/rewriter.go Outdated
Comment thread dm/pkg/ddl/rewriter/rules.go Outdated
Comment thread dm/pkg/ddl/rewriter/rules.go Outdated
Comment thread dm/pkg/ddl/rewriter/rules.go Outdated
Comment thread dm/pkg/ddl/rewriter/rules.go Outdated
@joechenrh joechenrh force-pushed the dm-mariadb-ast-rules branch 3 times, most recently from 29d21c7 to fefbd4b Compare June 15, 2026 07:15
@joechenrh joechenrh force-pushed the dm-mariadb-ast-rules branch from fefbd4b to af1fbd9 Compare June 16, 2026 05:17
@joechenrh joechenrh changed the title syncer(dm): add MariaDB AST DDL rewriter syncer(dm): add MariaDB ddl rewriter Jun 16, 2026
joechenrh and others added 2 commits June 17, 2026 17:21
Signed-off-by: Ruihao Chen <joechenrh@gmail.com>
@joechenrh joechenrh marked this pull request as ready for review June 17, 2026 09:25
@ti-chi-bot ti-chi-bot Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 17, 2026
@ti-chi-bot

ti-chi-bot Bot commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

[FORMAT CHECKER NOTIFICATION]

Notice: To remove the do-not-merge/needs-linked-issue label, please provide the linked issue number on one line in the PR body, for example: Issue Number: close #123 or Issue Number: ref #456.

📖 For more info, you can check the "Contribute Code" section in the development guide.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/dm Issues or PRs related to DM. do-not-merge/needs-linked-issue release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant