BigQuery: update column descriptions on existing columns #3881

Open
vriken wants to merge 3 commits into dlt-hub:devel from vriken:fix/3879-bigquery-update-column-descriptions

vriken commented Apr 20, 2026

Description

dlt applies column description hints to BigQuery only when creating a table (CREATE TABLE) or adding new columns (ALTER TABLE ADD COLUMN). If descriptions are added to the schema after the table already exists, they are never propagated to the destination on subsequent pipeline runs.

This adds a new overridable hook _alter_existing_column_hints_sql in SqlJobClientBase (returns [] by default), called from _build_schema_update_sql for existing tables. The BigQuery implementation:

  • Fetches current column descriptions via the get_table() API
  • Only emits ALTER COLUMN SET OPTIONS when descriptions actually differ
  • Handles description removal via SET OPTIONS(description=NULL)
  • Uses complete columns only (no include_incomplete)
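The diffing steps above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `description_alter_statements` and `escape_bq_string` are hypothetical names, and `current` stands in for the column descriptions that the real client would read from the `get_table()` result.

```python
from typing import Dict, List, Optional


def escape_bq_string(value: str) -> str:
    """Escape a value for a single-quoted BigQuery string literal (hypothetical helper)."""
    return value.replace("\\", "\\\\").replace("'", "\\'").replace("\n", "\\n")


def description_alter_statements(
    table_name: str,
    desired: Dict[str, Optional[str]],
    current: Dict[str, Optional[str]],
) -> List[str]:
    """Return ALTER COLUMN statements only for columns whose description differs.

    `desired` maps column name -> description from the dlt schema (assumed shape).
    `current` maps column name -> description currently stored in BigQuery.
    """
    statements: List[str] = []
    for column, wanted in desired.items():
        if column not in current:
            # New columns receive their description through ADD COLUMN instead.
            continue
        # Treat empty strings and None as "no description" on both sides.
        wanted_norm = wanted or None
        existing_norm = current[column] or None
        if wanted_norm == existing_norm:
            continue  # unchanged: emit nothing
        if wanted_norm is None:
            option = "description=NULL"  # description removed from the schema
        else:
            option = f"description='{escape_bq_string(wanted_norm)}'"
        statements.append(
            f"ALTER TABLE {table_name} ALTER COLUMN `{column}` SET OPTIONS({option})"
        )
    return statements
```

For example, with `desired = {"id": "Primary key", "name": None}` and `current = {"id": None, "name": "Old"}`, the sketch emits one statement that sets the description on `id` and one that clears it on `name`.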

This is metadata-only; no data is modified.

Note: Snowflake and Databricks have the same gap, since both apply column descriptions/comments only on CREATE/ADD COLUMN (via COMMENT syntax). The _alter_existing_column_hints_sql hook is designed so that they can override it as well, but this PR only implements and tests BigQuery, since that is the only destination I can verify against.
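A rough sketch of the hook pattern, for anyone considering an override in another destination. The class names, method signatures, and placeholder SQL comments below are illustrative assumptions and do not mirror the real `SqlJobClientBase` API; only the shape (a no-op default, called for existing tables, overridden per destination) is taken from the description above.

```python
from typing import Dict, List, Sequence


class BaseClientSketch:
    """Hypothetical stand-in for the base SQL job client."""

    def _alter_existing_column_hints_sql(
        self, table_name: str, columns: Sequence[Dict[str, str]]
    ) -> List[str]:
        # Default: destinations that do not override the hook emit nothing.
        return []

    def build_update_sql(
        self, table_name: str, columns: Sequence[Dict[str, str]], table_exists: bool
    ) -> List[str]:
        if not table_exists:
            # Placeholder for the CREATE TABLE path, which already applies hints.
            return [f"-- create {table_name}"]
        # Existing table: give the destination a chance to update column hints.
        return self._alter_existing_column_hints_sql(table_name, columns)


class DescribingClientSketch(BaseClientSketch):
    """Hypothetical destination that overrides the hook."""

    def _alter_existing_column_hints_sql(
        self, table_name: str, columns: Sequence[Dict[str, str]]
    ) -> List[str]:
        # A real override would emit destination-specific DDL here.
        return [
            f"-- update description of {table_name}.{col['name']}"
            for col in columns
            if col.get("description")
        ]
```

Because the default returns an empty list, destinations that do not override the hook behave exactly as before.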

Related Issues

Additional Context

Files changed:

  • dlt/destinations/job_client_impl.py: base class hook + call site in _build_schema_update_sql
  • dlt/destinations/impl/bigquery/bigquery.py: BigQuery diff-based implementation
  • tests/load/bigquery/test_bigquery_table_builder.py: 6 unit tests (changed, unchanged, removal, new columns, escaping)

vriken and others added 3 commits April 20, 2026 17:01
dlt currently only applies column `description` hints when creating a
table (CREATE TABLE) or adding new columns (ALTER TABLE ADD COLUMN).
If descriptions are added to the schema after the table exists, they
are never propagated to BigQuery.

This adds a new overridable hook `_alter_existing_column_hints_sql` in
SqlJobClientBase (returns [] by default) and implements it in the
BigQuery client to emit ALTER TABLE ... ALTER COLUMN ... SET OPTIONS
statements for columns whose descriptions have changed.

This is a metadata-only change; no data is modified.

Fixes dlt-hub#3879

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Address review feedback:
- Fetch current column descriptions from BigQuery via get_table() API
  and only emit ALTER COLUMN SET OPTIONS when they actually differ
- Handle description removal: emit SET OPTIONS(description=NULL) when
  a description is removed from the schema but still exists in BQ
- Use get_table_columns without include_incomplete (complete columns only)
- Add tests: diff skips unchanged, removal emits NULL, special char escaping
- Use instance method replacement instead of unittest.mock

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Development

Successfully merging this pull request may close these issues.

feat(bigquery): column descriptions not applied to existing columns on schema update
