Skip to content

[fix](timestamptz) Preserve correct DST fold branch to go cross the transition point#63034

Merged
zclllyybb merged 1 commit into
apache:masterfrom
zclllyybb:ttz_dst
May 8, 2026
Merged

[fix](timestamptz) Preserve correct DST fold branch to go cross the transition point#63034
zclllyybb merged 1 commit into
apache:masterfrom
zclllyybb:ttz_dst

Conversation

@zclllyybb
Copy link
Copy Markdown
Contributor

Problem Summary: TIMESTAMPTZ values in a DST fold hour could be converted through an ambiguous local time, making explicit post-fold instants use the earlier branch during FE literal handling and BE floor/ceil/date_trunc execution.

The old code would treat DST repeated hours like 2024-11-03 01:xx as local civil time without offset when parsing TIMESTAMPTZ with explicit offset in FE, and when BE converts local time to UTC using date_trunc/floor/ceil, it would lose the -04/-05 offset branch information and default to the earlier branch. After the fix, FE directly constructs TimestampTzLiteral from timezone-aware strings, and BE uses the original UTC instant's offset to select pre/post branches when converting repeated hours back to UTC, thus preserving the correct fold branch.

when time_zone = 'America/New_York':

SELECT
CAST(CAST('2024-11-03 01:05:00 -05:00' AS TIMESTAMPTZ(6)) AS VARCHAR(64)) AS ts,
CAST(date_trunc(CAST('2024-11-03 01:05:30 -05:00' AS TIMESTAMPTZ(6)), 'hour') AS VARCHAR(64)) AS hour_trunc;

before:

ts                                      hour_trunc
2024-11-03 01:05:00.000000-04:00       2024-11-03 01:00:00.000000-04:00

now:

ts                                      hour_trunc
2024-11-03 01:05:00.000000-05:00       2024-11-03 01:00:00.000000-05:00

@hello-stephen
Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@zclllyybb zclllyybb changed the title [fix](timestamptz) Preserve correct DST fold branch to [fix](timestamptz) Preserve correct DST fold branch to go cross the transition point May 6, 2026
@zclllyybb zclllyybb marked this pull request as ready for review May 6, 2026 14:12
Copilot AI review requested due to automatic review settings May 6, 2026 14:12
@zclllyybb
Copy link
Copy Markdown
Contributor Author

/review

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@zclllyybb
Copy link
Copy Markdown
Contributor Author

run buildall

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated no new comments.

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

{
"event": "REQUEST_CHANGES",
"commit_id": "81d6e31144bddb0e013415df1e72ec92dfe873fb",
"body": "I found one blocking FE compatibility regression in the new TIMESTAMPTZ literal path. The BE fold-branch handling is focused and the new regression test covers the main DST fold case, but the FE change bypasses existing timezone normalization/range validation for explicit casts.\n\nCritical checkpoint conclusions:\n- Goal and proof: The PR addresses DST fold branch preservation and adds a regression test for explicit -04/-05 values plus date_trunc/floor/ceil behavior. Coverage misses legacy timezone abbreviation and offset-range behavior on the changed FE cast path.\n- Scope/focus: The BE changes are small and localized. The FE shortcut is small, but it bypasses existing parsing semantics instead of reusing the established normalization path.\n- Concurrency/lifecycle: No new concurrency, locking, or special lifecycle concerns found.\n- Config/compatibility: No new config or storage/protocol format changes. There is an incompatible FE behavior change for explicit string-to-TIMESTAMPTZ casts.\n- Parallel paths: Explicit casts and implicit character-literal coercion were checked; the blocking regression is on the explicit StringLikeLiteral cast path that previously normalized CST and validated offset range.\n- Tests/results: New regression output is deterministic and appears generated, but tests do not cover the compatibility cases affected by the parser bypass.\n- Observability/transactions/memory/performance: No new observability, transaction, memory-tracking, or meaningful performance issues found for this change.\n\nUser focus: No additional user-provided review focus was supplied.",
"comments": [
{
"path": "fe/fe-core/src/main/java/org/apache/doris/nereids/trees/expressions/literal/StringLikeLiteral.java",
"line": 152,
"side": "RIGHT",
"body": "This new direct path bypasses the existing castToDateTime normalization and validation for timezone-bearing strings. Two concrete regressions on explicit casts: CAST('2024-01-01 00:00:00 CST' AS TIMESTAMPTZ) used to normalize CST to +08:00 in castToDateTime, but TimestampTzLiteral's Java formatter rejects CST; and CAST('2024-01-01 00:00:00 +15:00' AS TIMESTAMPTZ) now bypasses the existing hourOffset > 14 guard. Please keep the fold-preserving direct construction for valid explicit offsets, but run the same timezone normalization/range validation before constructing the TimestampTzLiteral."
}
]
}

@zclllyybb
Copy link
Copy Markdown
Contributor Author

/review

@zclllyybb
Copy link
Copy Markdown
Contributor Author

run buildall

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

I reviewed the full PR diff and the existing review context. No distinct actionable correctness issues found.

Critical checkpoint conclusions:

  • Goal and tests: The PR fixes TIMESTAMPTZ DST fold branch preservation for explicit-offset literals and BE date_trunc/floor/ceil paths. The added regression test covers explicit pre/post-fold values, equality, rendering, truncation, and persisted UTC results.
  • Scope and clarity: The change is focused on TIMESTAMPTZ conversion/literal coercion and keeps the implementation small.
  • Concurrency/lifecycle: No new shared mutable state, threads, locks, or lifecycle-sensitive static initialization introduced.
  • Config/compatibility: No new configs, storage format changes, or FE-BE protocol changes identified.
  • Parallel paths: I checked FE string literal coercion/signature search and BE date_trunc/floor/ceil TIMESTAMPTZ paths; the relevant changed paths are covered.
  • Error handling/memory: No ignored Status, new large allocations, or memory ownership concerns identified.
  • Testing: Regression and FE unit coverage are present. I did not rerun the PR tests in this review runner.
  • Observability: No new critical operational path requiring additional logging/metrics.
  • User focus: No additional user-provided review focus was specified.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated no new comments.

### What problem does this PR solve?

Issue Number: DORIS-25353, DORIS-25467

Related PR: None

Problem Summary: TIMESTAMPTZ values in a DST fold hour could be converted through an ambiguous local time, making explicit post-fold instants use the earlier branch during FE literal handling and BE floor/ceil/date_trunc execution.

### Release note

Fix TIMESTAMPTZ handling for explicit offsets and sub-hour truncation in DST fold hours.

### Check List (For Author)

- Test: Regression test / Unit Test / Build / Static Analysis
    - `./build.sh --fe --be -j 100`
    - `./run-regression-test.sh --run -d datatype_p0/timestamptz -s test_timestamptz_dst_fold -forceGenOut`
    - `./run-regression-test.sh --run -d datatype_p0/timestamptz -s test_timestamptz_dst_fold`
    - `mvn -f fe/pom.xml -pl fe-core -Dtest=org.apache.doris.nereids.util.TypeCoercionUtilsTest#testCharacterLiteralTypeCoercion -DfailIfNoTests=false -Dmaven.build.cache.enabled=false test`
    - `build-support/run-clang-tidy.sh --files be/src/core/value/timestamptz_value.h be/src/core/value/timestamptz_value.cpp be/src/exprs/function/function_datetime_floor_ceil.cpp be/src/exprs/function/function_other_types_to_date.cpp`
- Behavior changed: Yes. TIMESTAMPTZ DST fold-hour values keep the correct post-fold branch.
- Does this need documentation: No
@zclllyybb
Copy link
Copy Markdown
Contributor Author

run buildall

@zclllyybb zclllyybb requested a review from Copilot May 8, 2026 07:24
@zclllyybb
Copy link
Copy Markdown
Contributor Author

/review

@github-actions github-actions Bot added the approved Indicates a PR has been approved by one committer. label May 8, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

PR approved by at least one committer and no changes requested.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated no new comments.

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review completed. I did not find blocking issues in this PR.

Critical checkpoint conclusions:

  • Goal/test evidence: The change targets TIMESTAMPTZ DST fold branch preservation for explicit offsets and sub-hour truncation. The new regression test covers pre/post fold rendering, equality against an explicit post-fold literal, date_trunc(second/minute/hour), minute_floor/ceil, hour_floor, and persisted UTC results; the FE coercion unit test covers wildcard TIMESTAMPTZ literal typing.
  • Scope/focus: The implementation is focused on FE literal/coercion handling and BE local-to-UTC conversion for ambiguous repeated civil times.
  • Concurrency/lifecycle: No new shared mutable state, threading, locks, or non-trivial lifecycle/static initialization hazards were introduced.
  • Configuration/compatibility: No new config, storage format, thrift/proto, or mixed-version protocol changes are introduced.
  • Parallel paths: Both FE paths visible in this PR (StringLikeLiteral casting and TypeCoercionUtils character literal coercion) now preserve explicit offset literals; BE date_trunc and floor/ceil sub-hour paths use the preferred offset when the result civil time is repeated.
  • Data correctness: The BE conversion keeps the original offset only when the target local civil time is actually REPEATED, otherwise it falls back to existing cctz conversion behavior. This preserves the intended UTC instant for the covered DST-fold cases without changing non-ambiguous conversions.
  • Error handling/memory/observability: No ignored Status, new allocations requiring Doris memory tracking, or new observability requirements were found.
  • Test result review: Added/updated regression outputs are consistent with the tested fold behavior. I did not run tests locally in this review runner.
  • User focus: No additional user-provided review focus was present.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 8, 2026

PR approved by anyone and no changes requested.

@zclllyybb zclllyybb merged commit a9ca36f into apache:master May 8, 2026
35 of 37 checks passed
@zclllyybb zclllyybb deleted the ttz_dst branch May 8, 2026 12:47
github-actions Bot pushed a commit that referenced this pull request May 8, 2026
…ransition point (#63034)

Problem Summary: TIMESTAMPTZ values in a DST fold hour could be
converted through an ambiguous local time, making explicit post-fold
instants use the earlier branch during FE literal handling and BE
floor/ceil/date_trunc execution.

The old code would treat DST repeated hours like `2024-11-03 01:xx` as
local civil time without offset when parsing TIMESTAMPTZ with explicit
offset in FE, and when BE converts local time to UTC using
date_trunc/floor/ceil, it would lose the `-04/-05` offset branch
information and default to the earlier branch. After the fix, FE
directly constructs TimestampTzLiteral from timezone-aware strings, and
BE uses the original UTC instant's offset to select pre/post branches
when converting repeated hours back to UTC, thus preserving the correct
fold branch.

when `time_zone = 'America/New_York'`:
```sql
SELECT
CAST(CAST('2024-11-03 01:05:00 -05:00' AS TIMESTAMPTZ(6)) AS VARCHAR(64)) AS ts,
CAST(date_trunc(CAST('2024-11-03 01:05:30 -05:00' AS TIMESTAMPTZ(6)), 'hour') AS VARCHAR(64)) AS hour_trunc;
```

before:
```
ts                                      hour_trunc
2024-11-03 01:05:00.000000-04:00       2024-11-03 01:00:00.000000-04:00
```

now:
```
ts                                      hour_trunc
2024-11-03 01:05:00.000000-05:00       2024-11-03 01:00:00.000000-05:00
```
github-actions Bot pushed a commit that referenced this pull request May 8, 2026
…ransition point (#63034)

Problem Summary: TIMESTAMPTZ values in a DST fold hour could be
converted through an ambiguous local time, making explicit post-fold
instants use the earlier branch during FE literal handling and BE
floor/ceil/date_trunc execution.

The old code would treat DST repeated hours like `2024-11-03 01:xx` as
local civil time without offset when parsing TIMESTAMPTZ with explicit
offset in FE, and when BE converts local time to UTC using
date_trunc/floor/ceil, it would lose the `-04/-05` offset branch
information and default to the earlier branch. After the fix, FE
directly constructs TimestampTzLiteral from timezone-aware strings, and
BE uses the original UTC instant's offset to select pre/post branches
when converting repeated hours back to UTC, thus preserving the correct
fold branch.

when `time_zone = 'America/New_York'`:
```sql
SELECT
CAST(CAST('2024-11-03 01:05:00 -05:00' AS TIMESTAMPTZ(6)) AS VARCHAR(64)) AS ts,
CAST(date_trunc(CAST('2024-11-03 01:05:30 -05:00' AS TIMESTAMPTZ(6)), 'hour') AS VARCHAR(64)) AS hour_trunc;
```

before:
```
ts                                      hour_trunc
2024-11-03 01:05:00.000000-04:00       2024-11-03 01:00:00.000000-04:00
```

now:
```
ts                                      hour_trunc
2024-11-03 01:05:00.000000-05:00       2024-11-03 01:00:00.000000-05:00
```
yiguolei pushed a commit that referenced this pull request May 11, 2026
… cross the transition point #63034 (#63090)

Cherry-picked from #63034

---------

Co-authored-by: zclllyybb <zhaochangle@selectdb.com>
yiguolei pushed a commit that referenced this pull request May 11, 2026
… cross the transition point #63034 (#63091)

Cherry-picked from #63034

---------

Co-authored-by: zclllyybb <zhaochangle@selectdb.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. dev/4.0.6-merged dev/4.1.1-merged p0_w reviewed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants