-
Notifications
You must be signed in to change notification settings - Fork 4.8k
HIVE-29559: materializeCTE to query Analyzer from DDLSemanticAnalyzerFactory #6423
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from 8 commits
ef345f5
0db2292
bff6d1b
d52ae1c
5050671
28f835f
3000ca6
b3ae50a
f957fc8
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,46 @@ | ||
| -- Test CTE materialization with both CBO enabled and disabled | ||
| -- Verifies DDLSemanticAnalyzerFactory is used for CTE materialization | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How can this be verified from q test?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree that this is a bit too bold a statement - the test does confirm that the NPE is fixed, but does not really validate classes used |
||
| -- Also ensures that an NPE is no longer triggered with CBO off (HIVE-28724 regression) | ||
|
|
||
| -- Test with CBO enabled (default) | ||
| explain | ||
| WITH cte AS ( | ||
| SELECT COUNT(*) as cnt FROM (SELECT 1 as id) t | ||
| ) | ||
| SELECT * FROM cte | ||
| UNION ALL | ||
| SELECT * FROM cte | ||
| UNION ALL | ||
| SELECT * FROM cte; | ||
|
|
||
| -- Test with CBO disabled | ||
| set hive.cbo.enable=false; | ||
|
|
||
| explain | ||
| WITH cte AS ( | ||
| SELECT COUNT(*) as cnt FROM (SELECT 1 as id) t | ||
| ) | ||
| SELECT * FROM cte | ||
| UNION ALL | ||
| SELECT * FROM cte | ||
| UNION ALL | ||
| SELECT * FROM cte; | ||
|
|
||
| -- Test the recompile-without-CBO auto-trigger path. | ||
|
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I've left very verbose comments in here to better explain the scenario. I will gladly remove or shorten this comment section once reviewed
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Based on the scope of this patch, this path does not need to be covered because the previous test case already covers the non-CBO path. TBH, I’m not sure it is necessary to add this https://issues.apache.org/jira/browse/HIVE-27830
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
@kasakrisz, while I can agree that testing the non-CBO configured path becomes redundant once full CBO deprecation gets addressed, I believe that having at least one new query that guards against similar regressions could be beneficial. The last query of the test file generates the following when executed against master: cte_materialize.q.test.log.txt
The very first test in the file is not broken in the current master. I thought it would be useful to leave it in to confirm that my changes introduce no additional regressions for the CBO path, but you helped me to see that it is indeed redundant now |
||
| -- With hive.cbo.fallback.strategy=ALWAYS, a CBO crash on `= ALL (subquery)` | ||
| -- causes ReCompileWithoutCBOPlugin to set hive.cbo.enable=false in conf and | ||
| -- recompile. The recompile constructs a plain SemanticAnalyzer via the factory, | ||
| -- and must materialize the CTE through the fixed SemanticAnalyzer.materializeCTE | ||
| -- path. Without the fix this path NPEs at SemanticAnalyzer.materializeCTE. | ||
| set hive.cbo.enable=true; | ||
| set hive.cbo.fallback.strategy=ALWAYS; | ||
|
|
||
| explain | ||
| WITH cte AS ( | ||
| SELECT MAX(s) AS m FROM (SELECT 'a' AS s) t | ||
| ) | ||
| SELECT s FROM (SELECT 'a' AS s) u WHERE s = ALL(SELECT m FROM cte) | ||
| UNION ALL | ||
| SELECT s FROM (SELECT 'a' AS s) u WHERE s = ALL(SELECT m FROM cte) | ||
| UNION ALL | ||
| SELECT s FROM (SELECT 'a' AS s) u WHERE s = ALL(SELECT m FROM cte); | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can also lower the threshold (
hive.optimize.cte.materialize.threshold) to1in the scope of this test case and use a simpler query