Stabilize CalciteStreamstatsCommandIT on the analytics-engine route#5582
Merged
ahkcs merged 1 commit intoJun 24, 2026
Merged
Conversation
Gate the streamstats ITs that diverge on the analytics-engine route (parquet-backed composite store, DataFusion backend) so the route runs green, while keeping every test active on the v2/Calcite path. Mechanism follows the established capability-gating pattern (opensearch-project#5560): an in-test @RequiresCapability(...) annotation plus a matching integTestRemote excludeTestsMatching entry; both are no-ops on the v2 route. Triaged single-shard and multi-shard (num_shards=3) analytics runs against the v2 baseline. Three groups: - DOC_MUTATION (4 tests): testStreamstatsGlobalWithNull, testStreamstatsGlobalWithNullBucket, testStreamstatsResetWithNull, testStreamstatsResetWithNullBucket seed state via PUT+DELETE. Doc-level DELETE is unsupported on the parquet store, and same-_id PUT is append-only, so the leaked doc inflated the row counts of every sibling test that reads the shared index. Gated with the same DOC_MUTATION capability the three existing mutation tests already carry. - CHAINED_STREAMSTATS_BY (4 tests): chaining two streamstats where an upstream stage partitions `by` a group emits a ROW_NUMBER() sequence column from each stage; the Substrait converter names both physical columns identically, so the stacked schema has a duplicate/ambiguous field name (500) or, for chained window streamstats, non-deterministic values. The Calcite logical plan is correct; the alias is lost in Substrait conversion. Fails single- and multi-shard. - STREAMSTATS_SORT_NOT_HONORED (1 test): testStreamstatsAndSort. The window is computed over the backend scan order, ignoring a preceding `| sort` (the OVER clause carries no explicit ORDER BY), so the per-row aggregates diverge from the v2/Calcite path. Pass rate on the single-shard analytics route, CalciteStreamstatsCommandIT: | metric | before | after | |-----------|--------|-------| | tests run | 47 | 42 | | failures | 12 | 0 | | skipped | 3 | 7 | (The before-failures count is inflated by the DOC_MUTATION leak described above; the four leaking tests plus their downstream row-count victims all clear once gated.) v2 route (:integTest): 47 run, 0 failed, 0 skipped — gates are no-ops off the analytics route. Twelve further tests fail only on the multi-shard route (they pass single-shard) due to cross-shard fragment-order non-determinism in the streamstats window gather; that is an engine-side gap and is left unchanged here rather than gated, to avoid skipping passing tests on the single-shard route. Signed-off-by: Kai Huang <ahkcs@amazon.com>
dai-chen
approved these changes
Jun 24, 2026
| * for chained {@code window} streamstats, non-deterministic window values. A single {@code | ||
| * streamstats by} (or a chain where only the final stage has {@code by}) works. | ||
| */ | ||
| CHAINED_STREAMSTATS_BY( |
Collaborator
There was a problem hiding this comment.
As discussed, let's think how to maintain real backend capability here later. Or maybe there is better mechanism to make our IT work with various engines with different capabilities. Thanks.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Gate the
CalciteStreamstatsCommandITtests that diverge on the analytics-engine route (parquet-backed composite store, DataFusion backend) so the route runs green, while keeping every test active on the v2/Calcite path. Follows the established capability-gating pattern (#5560): an in-test@RequiresCapability(...)annotation plus a matchingintegTestRemoteexcludeTestsMatchingentry — both no-ops on the v2 route.Triaged single-shard and multi-shard (
num_shards=3) analytics runs against the v2 baseline. Failures fall into three groups:DOC_MUTATION(4 tests) —testStreamstatsGlobalWithNull,testStreamstatsGlobalWithNullBucket,testStreamstatsResetWithNull,testStreamstatsResetWithNullBucketseed state viaPUT+DELETE. Doc-levelDELETEis unsupported on the parquet store, and same-_idPUTis append-only, so the leaked doc inflated the row counts of every sibling test reading the shared index. Gated with the sameDOC_MUTATIONcapability the three existing mutation tests already carry.CHAINED_STREAMSTATS_BY(4 tests) —testMultipleStreamstats,testMultipleStreamstatsWithWindow,testMultipleStreamstatsWithNull1,testMultipleStreamstatsWithEval. Chaining twostreamstatswhere an upstream stage partitionsbya group emits aROW_NUMBER()sequence column from each stage; the Substrait converter names both physical columns identically ("row_number() ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW"), so the stacked schema has a duplicate/ambiguous field name (500) or, for chainedwindowstreamstats, non-deterministic values. The Calcite logical plan is correct (the__stream_seq__alias is distinct) — the alias is lost in Substrait conversion. A singlestreamstats by, or a chain where only the final stage hasby, works. Fails single- and multi-shard.STREAMSTATS_SORT_NOT_HONORED(1 test) —testStreamstatsAndSort. The window is computed over the backend scan order, ignoring a preceding| sort(theOVERclause carries no explicitORDER BY, since streamstats orders by encounter order by design), so the per-row aggregates diverge from the v2/Calcite path.Pass rate — single-shard analytics route,
CalciteStreamstatsCommandITThe before-failures count is inflated by the
DOC_MUTATIONleak above; the four leaking tests plus their downstream row-count victims all clear once gated.v2 route (
:integTest): 47 run, 0 failed, 0 skipped — the gates are no-ops off the analytics route.Out of scope
Twelve further tests fail only on the multi-shard route (they pass single-shard) due to cross-shard fragment-order non-determinism in the streamstats window gather (the coordinator gather has no shard-ordinal tiebreaker, so a preceding
| sortcan't recover deterministic order). That is an engine-side gap; these are left unchanged rather than gated, to avoid skipping passing tests on the single-shard route.Check List
--signoff.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check here.