[fix](subquery) Preserve outer-scope columns in correlated subqueries containing CTE #63040
starocean999 wants to merge 2 commits into
/review
I found one correctness issue that should be fixed before merging.
Critical checkpoint conclusions:
- Goal/test: the PR fixes correlated subqueries containing CTEs and adds FE plus regression coverage for EXISTS, IN, scalar, and an unsupported correlated-CTE-producer case, but it misses nondeterministic or multi-consumer CTE semantics under subqueries.
- Scope/focus: the implementation is small and focused, but the global must-inline marking is too broad for existing CTE materialization semantics.
- Concurrency/lifecycle/config/compatibility: no new concurrency, lifecycle, configuration, persistence, or FE-BE protocol compatibility concerns found.
- Parallel paths: existing CTEInline logic has explicit non-inline handling for nondeterministic or heavily referenced CTEs; this new subquery path bypasses it.
- Tests: added tests cover the main happy path, but not the nondeterministic/materialized CTE case that regresses here. Regression output ordering is deterministic via ORDER BY.
- Observability/performance: no additional observability need identified; the concern here is semantics, not performance.
User focus: no additional user-provided review focus was specified.
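To make the semantic concern concrete, here is a minimal sketch of why force-inlining a nondeterministic, multi-consumer CTE can change results. It is a toy model in Python, not Doris code: `cte_producer`, `run_materialized`, and `run_inlined` are hypothetical names, and the random draw stands in for any nondeterministic CTE body.

```python
import random


def cte_producer(rng: random.Random) -> float:
    """Stand-in for a nondeterministic CTE body (hypothetical, not Doris code)."""
    return rng.random()


def run_materialized(rng: random.Random, consumers: int = 2) -> list:
    # Materialized CTE: the producer runs once and every consumer
    # reads the same stored value.
    value = cte_producer(rng)
    return [value] * consumers


def run_inlined(rng: random.Random, consumers: int = 2) -> list:
    # Inlined CTE: the producer body is copied into each consumer,
    # so it is evaluated once per reference.
    return [cte_producer(rng) for _ in range(consumers)]


materialized = run_materialized(random.Random(0))
inlined = run_inlined(random.Random(0))

# Consumers of the materialized CTE agree; consumers of the inlined one do not.
print(materialized[0] == materialized[1])
print(inlined[0] == inlined[1])
```

Under these assumptions, a query that joins such a CTE to itself would see matching rows when the CTE is materialized and mismatched rows when it is inlined, which is the kind of case the added tests do not cover.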
/review
Code review completed. I did not find additional distinct blocking issues beyond the already-open inline thread about forcing CTE inline inside subqueries, so I am not duplicating that concern in a new comment.
Critical checkpoint conclusions:
- Goal and tests: the PR targets correlated subqueries containing inner CTEs; added FE unit coverage and a query regression case exercise EXISTS, IN, scalar, and rejected correlated CTE-producer cases.
- Scope/focus: the implementation is small and localized to Nereids context propagation, subquery analysis, subquery-unnesting rewrite placement, and CTE inline safety.
- Concurrency/lifecycle/config/compatibility: no new concurrency, lifecycle, configuration, storage-format, or FE-BE protocol compatibility concerns identified.
- Parallel paths: the relevant subquery analysis and CTE inline paths were checked; the existing force-inline semantics discussion remains the main open area.
- Tests/results: regression output is present and ordered; no additional user-provided review focus was supplied.
- Observability/transactions/data writes: not applicable for this planner-only change.
- Performance: no new distinct issue found beyond the existing discussion about bypassing normal CTE materialization behavior.
User focus: review_focus.txt says no additional user-provided review focus; no extra focus-specific issue found.
run buildall
Problem: Correlated subqueries that include an inner WITH (CTE) could lose references to outer-scope columns during analysis, causing incorrect planning or query rejection.
Fix: Update Nereids CTE/correlated-subquery analysis to preserve outer-scope symbols across CTE boundaries, and add validation that rejects unsafe outer-column references inside CTE producers.
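The intended behavior can be sketched with a toy name-resolution model. This is an illustration under stated assumptions, not the Nereids implementation: `Scope`, `AnalysisError`, the `is_cte_producer` flag, and the table and column names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


class AnalysisError(Exception):
    """Raised when a column is unknown or referenced from an unsafe place."""


@dataclass
class Scope:
    """Hypothetical name-resolution scope; not a Nereids class."""
    columns: frozenset
    outer: Optional["Scope"] = None
    is_cte_producer: bool = False

    def resolve(self, name: str) -> str:
        # Columns visible in this scope always resolve locally.
        if name in self.columns:
            return "local"
        # Otherwise walk the chain of enclosing scopes.
        scope = self.outer
        while scope is not None:
            if name in scope.columns:
                if self.is_cte_producer:
                    # A CTE producer must not depend on outer-query columns.
                    raise AnalysisError(
                        f"outer column {name!r} referenced inside a CTE producer")
                return "outer"
            scope = scope.outer
        raise AnalysisError(f"unknown column {name!r}")


# Outer query: SELECT ... FROM t1 WHERE EXISTS (WITH c AS (...) SELECT ... FROM c ...)
outer_query = Scope(columns=frozenset({"t1.id", "t1.v"}))
# The CTE consumer keeps its link to the outer scope, so t1.id still resolves.
consumer = Scope(columns=frozenset({"c.k"}), outer=outer_query)
# The CTE producer sees the outer scope only so that the reference can be rejected.
producer = Scope(columns=frozenset({"t2.k"}), outer=outer_query,
                 is_cte_producer=True)

print(consumer.resolve("t1.id"))
try:
    producer.resolve("t1.id")
except AnalysisError as err:
    print("rejected:", err)
```

In this model, the bug described under "Problem" corresponds to the consumer scope being built without its `outer` link, so `t1.id` would fail to resolve. The validation corresponds to the `is_cte_producer` check.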