[Fix](pyudf) clear stale UDAF state cache on drop#63062

Merged
HappenLee merged 3 commits into apache:master from linrrzqqq:pyudf-clear-udaf-state on May 12, 2026

Conversation

@linrrzqqq
Collaborator

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Fix Python UDAF stale cache reuse after dropping and recreating an inline UDAF with the same name/signature.

The Python server previously keyed UDAF state managers by function name and argument types, so a recreated inline UDAF could reuse the old loaded Python class. This fix includes the FE function id in the Python UDAF metadata/cache key and clears UDAF state manager cache during DROP FUNCTION cleanup.
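
The keying problem described above can be illustrated with a minimal sketch. Everything here is hypothetical and only models the description: `NameKeyedCache`, `IdKeyedCache`, `load_class`, and the ids 1001 and 1002 are illustration names, not the actual Doris Python server code.

```python
from typing import Dict, Tuple

# Two inline definitions with the same symbol but different logic.
V1_SOURCE = "class RecreateUDAF:\n    factor = 10\n"
V2_SOURCE = "class RecreateUDAF:\n    factor = 100\n"


def load_class(source: str, symbol: str) -> type:
    """Execute inline source text and return the class named by `symbol`."""
    namespace: dict = {}
    exec(source, namespace)
    return namespace[symbol]


class NameKeyedCache:
    """Assumed old behavior: the key is (function name, argument types)."""

    def __init__(self) -> None:
        self._classes: Dict[Tuple[str, Tuple[str, ...]], type] = {}

    def get(self, name, arg_types, source, symbol) -> type:
        key = (name, tuple(arg_types))
        if key not in self._classes:
            self._classes[key] = load_class(source, symbol)
        return self._classes[key]


class IdKeyedCache:
    """Assumed new behavior: the function id is part of the key."""

    def __init__(self) -> None:
        self._classes: Dict[Tuple[int, str, Tuple[str, ...]], type] = {}

    def get(self, function_id, name, arg_types, source, symbol) -> type:
        key = (function_id, name, tuple(arg_types))
        if key not in self._classes:
            self._classes[key] = load_class(source, symbol)
        return self._classes[key]


old = NameKeyedCache()
old.get("py_udaf_bug_repro", ["INT"], V1_SOURCE, "RecreateUDAF")
# The recreated function hits the old entry, so the V1 class is reused.
stale_factor = old.get("py_udaf_bug_repro", ["INT"], V2_SOURCE, "RecreateUDAF").factor

new = IdKeyedCache()
new.get(1001, "py_udaf_bug_repro", ["INT"], V1_SOURCE, "RecreateUDAF")
# A recreated function gets a new id, so the V2 source is loaded.
fresh_factor = new.get(1002, "py_udaf_bug_repro", ["INT"], V2_SOURCE, "RecreateUDAF").factor

print(stale_factor, fresh_factor)  # 10 100
```

Keying by id alone would leave old entries in memory after a drop, which is presumably why the PR also clears the state manager cache during DROP FUNCTION cleanup.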

set enable_sql_cache = 0;
DROP FUNCTION IF EXISTS py_udaf_bug_repro(INT);
drop database if exists db001;
create database db001;
use db001;

-- 0. Prepare test data
DROP TABLE IF EXISTS t_udaf_cache_bug_test;
CREATE TABLE t_udaf_cache_bug_test (
    id INT,
    val INT
) DUPLICATE KEY(id)
DISTRIBUTED BY HASH(id) BUCKETS 1
PROPERTIES("replication_num"="1");
INSERT INTO t_udaf_cache_bug_test VALUES (1, 10), (2, 20), (3, 30);
-- At this point, the sum of val over the whole table is 60.

-- 1. Create V1 version of UDAF (Logic: Accumulate and multiply by 10)
DROP FUNCTION IF EXISTS py_udaf_bug_repro(INT);
select sleep(10);
CREATE AGGREGATE FUNCTION py_udaf_bug_repro(INT)
RETURNS BIGINT
PROPERTIES (
    "type"="PYTHON_UDF",
    "symbol"="RecreateUDAF",
    "runtime_version"="3.12.11", 
    "always_nullable"="true"
)
AS $$
class RecreateUDAF:
    def __init__(self):
        self.total = 0
    @property
    def aggregate_state(self):
        return self.total
    def accumulate(self, val):
        if val is not None:
            self.total += val
    def merge(self, other):
        self.total += other
    def finish(self):
        return self.total * 10  # V1: multiply by 10
$$;

-- 2. Verify V1 Logic
SELECT py_udaf_bug_repro(val) FROM t_udaf_cache_bug_test;
-- Expected Return: 600 (60 * 10)
-- Actual Return: 600 (Correct)

-- 3. Drop the old function and create a V2 version of the UDAF with the same name (logic: accumulate and multiply by 100)
DROP FUNCTION IF EXISTS py_udaf_bug_repro(INT);
select sleep(10);
select sleep(10);
CREATE AGGREGATE FUNCTION py_udaf_bug_repro(INT)
RETURNS BIGINT
PROPERTIES (
    "type"="PYTHON_UDF",
    "symbol"="RecreateUDAF",
    "runtime_version"="3.12.11",
    "always_nullable"="true"
)
AS $$
class RecreateUDAF:
    def __init__(self):
        self.total = 0
    @property
    def aggregate_state(self):
        return self.total
    def accumulate(self, val):
        if val is not None:
            self.total += val
    def merge(self, other):
        self.total += other
    def finish(self):
        return self.total * 100  # V2: Logic modified to multiply by 100
$$;

-- 4. Verify V2 Logic
SELECT py_udaf_bug_repro(val) FROM t_udaf_cache_bug_test;
-- Expected Return: 6000 (60 * 100)
-- Actual Return: 600  ([Bug occurs] Still outputs the old cached 600)
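
The expected values in the script can be checked outside Doris with plain Python. This is a sketch under assumptions: the driver functions `run_single` and `run_two_phase` are hypothetical and only guess at how the engine calls `accumulate`, `aggregate_state`, `merge`, and `finish`; the class bodies are copied from the script, renamed to keep both versions side by side.

```python
class RecreateUDAFV1:
    def __init__(self):
        self.total = 0

    @property
    def aggregate_state(self):
        return self.total

    def accumulate(self, val):
        if val is not None:
            self.total += val

    def merge(self, other):
        self.total += other

    def finish(self):
        return self.total * 10  # V1: multiply by 10


class RecreateUDAFV2(RecreateUDAFV1):
    def finish(self):
        return self.total * 100  # V2: multiply by 100


def run_single(cls, values):
    """Hypothetical single-phase driver: accumulate every row, then finish."""
    agg = cls()
    for value in values:
        agg.accumulate(value)
    return agg.finish()


def run_two_phase(cls, partitions):
    """Hypothetical two-phase driver: partial states are merged before finish."""
    final = cls()
    for rows in partitions:
        partial = cls()
        for value in rows:
            partial.accumulate(value)
        final.merge(partial.aggregate_state)
    return final.finish()


values = [10, 20, 30]  # the val column of t_udaf_cache_bug_test
v1_result = run_single(RecreateUDAFV1, values)
v2_result = run_single(RecreateUDAFV2, values)
v2_merged = run_two_phase(RecreateUDAFV2, [[10], [20, 30]])
print(v1_result, v2_result, v2_merged)  # 600 6000 6000
```

A result of 600 after step 3 therefore means the V1 `finish` is still being executed, which matches the stale class reuse described in the problem summary.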

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@linrrzqqq
Collaborator Author

run buildall

@linrrzqqq force-pushed the pyudf-clear-udaf-state branch from ee34297 to 1edf152 on May 8, 2026 02:15
@linrrzqqq
Collaborator Author

/review

Contributor

@github-actions github-actions Bot left a comment

I found two blocking issues in the new Python UDAF drop cleanup path.

Critical checkpoint conclusions:

  • Goal/test: The PR addresses stale inline UDAF class reuse after DROP/CREATE with the same signature, and adds a targeted regression test. The function-id cache key covers the recreate case, but the cleanup path introduces failure and lifecycle risks.
  • Scope: The change is mostly focused, but the cleanup action should preserve existing best-effort cache cleanup semantics and in-flight query lifecycles.
  • Concurrency: The new DROP cleanup can run concurrently with active Python Flight UDAF exchanges. Current code can clear states still needed by those exchanges.
  • Lifecycle: Function-id keys avoid reuse by new functions, but removing/clearing old managers needs reference-safe lifecycle handling.
  • Compatibility/config: No new config or storage/protocol incompatibility identified for the BE/Python process pair built from this PR.
  • Parallel paths: Module cache cleanup remains best-effort; the new UDAF cleanup should follow the same non-crashing error handling model.
  • Tests: Regression coverage exercises stale reuse, but does not cover cleanup failure or concurrent DROP while an old UDAF query is still running.
  • Observability/performance/data correctness: No additional observability requirement found; the main correctness risks are in error propagation and active-state invalidation.
  • User focus: No additional user-provided review focus was specified.

Comment thread be/src/udf/python/python_server.cpp Outdated
Comment thread be/src/udf/python/python_server.py
@linrrzqqq
Collaborator Author

/review

Contributor

@github-actions github-actions Bot left a comment

Automated review completed for PR head 7bdde75837b919220cc8758638b8bdaf61dd14b7.

No new blocking issues found. I also reviewed the existing inline threads and did not duplicate them; the current head appears to address the previously raised UDAF cleanup exception and in-flight state invalidation concerns by using WARN_IF_ERROR, returning the manager under lock, and detaching managers without clearing active states.

Critical checkpoint conclusions:

  • Goal/test: The PR targets stale Python UDAF state/class reuse after DROP/CREATE. The function-id keying plus drop cleanup path matches that goal, with added BE unit tests and regression coverage for inline drop/recreate.
  • Scope: The change is small and focused on Python UDF/UDAF metadata, manager cleanup, and related tests.
  • Concurrency/lifecycle: Reviewed the async DROP cleanup versus active Flight exchanges. The current implementation avoids clearing in-flight state and prevents lookup/pop races for manager retrieval. No additional concurrency blocker found.
  • Configuration: No new configuration items.
  • Compatibility/storage: No storage-format or persisted metadata change. The new id field is carried in the internal BE-to-Python descriptor and all reviewed production builders populate it.
  • Parallel paths: UDF, UDAF, and UDTF metadata serialization now include the function id; only UDAF manager keys use it, which is appropriate for this bug.
  • Error handling: Cleanup failures are best-effort warnings instead of uncaught task-worker exceptions. Status handling in the reviewed changed C++ paths is acceptable.
  • Tests: Added BE unit coverage for no-process, failed-action, action broadcast, and JSON id serialization, plus regression coverage for inline UDAF recreation. I did not run tests locally in this review runner.
  • Observability: Existing INFO/WARNING logs around cleanup/broadcast are sufficient for this change.
  • Transaction/persistence/data writes: Not applicable.
  • Performance: Cleanup scans only the per-process UDAF manager registry under a lock; no obvious performance blocker found.
  • User focus: No additional user-provided review focus was specified.
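
The best-effort error handling mentioned in this review (warnings instead of uncaught task-worker exceptions) can be sketched in Python. This is an illustration only: the real path is C++ and uses `WARN_IF_ERROR`; `FakeProcess`, `broadcast_clear`, and `on_drop_function` are hypothetical names.

```python
import logging
from typing import List

logger = logging.getLogger("udaf_cleanup_sketch")


class FakeProcess:
    """Stand-in for one Python server process (hypothetical)."""

    def __init__(self, name: str, fail: bool = False) -> None:
        self.name = name
        self.fail = fail
        self.cleared: List[int] = []

    def clear_udaf_cache(self, function_id: int) -> None:
        if self.fail:
            raise ConnectionError(f"{self.name} is unreachable")
        self.cleared.append(function_id)


def broadcast_clear(processes: List[FakeProcess], function_id: int) -> List[str]:
    """Send the cleanup action to every process and collect errors as strings."""
    errors: List[str] = []
    for process in processes:
        try:
            process.clear_udaf_cache(function_id)
        except Exception as exc:  # convert the failure into a status-like value
            errors.append(f"{process.name}: {exc}")
    return errors


def on_drop_function(processes: List[FakeProcess], function_id: int) -> bool:
    """Best-effort cleanup: log a warning per failure, never raise to the caller."""
    errors = broadcast_clear(processes, function_id)
    for error in errors:
        logger.warning("UDAF cache cleanup failed for id %s: %s", function_id, error)
    return not errors


healthy = FakeProcess("py-server-0")
broken = FakeProcess("py-server-1", fail=True)
all_ok = on_drop_function([healthy, broken], 1001)
```

Here `all_ok` is `False`, the healthy process has still been cleaned, and the DROP itself is not failed by the broken one.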

@linrrzqqq force-pushed the pyudf-clear-udaf-state branch from 7bdde75 to 7bff96e on May 8, 2026 09:11
@linrrzqqq force-pushed the pyudf-clear-udaf-state branch from 7bff96e to e321868 on May 8, 2026 16:05
@linrrzqqq
Collaborator Author

run buildall

@hello-stephen
Contributor

TPC-H: Total hot run time: 29888 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit e32186890f425e43b7bdd68a0e28a0509cdeff0f, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17680	4018	3996	3996
q2	q3	10718	881	617	617
q4	4657	462	353	353
q5	7448	1318	1149	1149
q6	195	171	139	139
q7	902	937	750	750
q8	9342	1394	1298	1298
q9	6050	5367	5366	5366
q10	6291	2083	1813	1813
q11	475	274	257	257
q12	697	415	300	300
q13	18212	3352	2721	2721
q14	302	283	262	262
q15	q16	913	879	807	807
q17	968	1049	782	782
q18	6433	5710	5627	5627
q19	1376	1296	1136	1136
q20	530	411	260	260
q21	4645	2349	1925	1925
q22	457	384	330	330
Total cold run time: 98291 ms
Total hot run time: 29888 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4874	4773	4739	4739
q2	q3	4671	4781	4234	4234
q4	2148	2180	1391	1391
q5	5003	4970	5259	4970
q6	205	172	138	138
q7	2076	1831	1648	1648
q8	3316	3078	3073	3073
q9	8416	8599	8561	8561
q10	4536	4516	4254	4254
q11	625	420	419	419
q12	680	754	551	551
q13	3458	3542	2933	2933
q14	312	319	287	287
q15	q16	760	779	766	766
q17	1460	1384	1364	1364
q18	8036	7219	7161	7161
q19	1152	1156	1204	1156
q20	2289	2254	1959	1959
q21	6232	5705	4900	4900
q22	526	484	399	399
Total cold run time: 60775 ms
Total hot run time: 54903 ms
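
The report does not label its columns. Reading the Round 1 rows of this comment, the numbers are consistent with one cold run, two hot runs, and the faster of the two hot runs, with the totals summing the first and last columns. The sketch below checks that reading against the Round 1 data; the column interpretation is an inference from the figures, not something the report states.

```python
ROUND_1 = """\
q1 17680 4018 3996 3996
q2 q3 10718 881 617 617
q4 4657 462 353 353
q5 7448 1318 1149 1149
q6 195 171 139 139
q7 902 937 750 750
q8 9342 1394 1298 1298
q9 6050 5367 5366 5366
q10 6291 2083 1813 1813
q11 475 274 257 257
q12 697 415 300 300
q13 18212 3352 2721 2721
q14 302 283 262 262
q15 q16 913 879 807 807
q17 968 1049 782 782
q18 6433 5710 5627 5627
q19 1376 1296 1136 1136
q20 530 411 260 260
q21 4645 2349 1925 1925
q22 457 384 330 330
"""


def parse_rows(text):
    """Keep the four timings of each line; query labels are skipped."""
    rows = []
    for line in text.splitlines():
        numbers = [int(token) for token in line.split() if token.isdigit()]
        if len(numbers) == 4:
            rows.append(numbers)
    return rows


rows = parse_rows(ROUND_1)
cold_total = sum(row[0] for row in rows)
hot_total = sum(row[3] for row in rows)
# Assumed meaning of the last column: the faster of the two hot runs.
last_is_best_hot = all(row[3] == min(row[1], row[2]) for row in rows)

print(cold_total, hot_total, last_is_best_hot)  # 98291 29888 True
```

Both sums match the "Total cold run time" and "Total hot run time" lines printed for Round 1.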

@hello-stephen
Contributor

TPC-DS: Total hot run time: 169707 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit e32186890f425e43b7bdd68a0e28a0509cdeff0f, data reload: false

query5	4316	660	513	513
query6	336	229	208	208
query7	4240	564	300	300
query8	320	232	217	217
query9	8863	4007	4009	4007
query10	468	346	306	306
query11	5781	2365	2188	2188
query12	186	133	130	130
query13	1342	596	436	436
query14	6414	5366	5064	5064
query14_1	4400	4379	4369	4369
query15	217	208	186	186
query16	1004	456	457	456
query17	1159	771	642	642
query18	2747	499	366	366
query19	233	216	178	178
query20	143	138	132	132
query21	219	142	120	120
query22	13599	13525	13440	13440
query23	17129	16358	16025	16025
query23_1	16201	16125	16206	16125
query24	7453	1788	1355	1355
query24_1	1340	1361	1344	1344
query25	599	520	475	475
query26	1300	317	181	181
query27	2690	597	343	343
query28	4365	1958	1986	1958
query29	1029	650	539	539
query30	300	242	198	198
query31	1138	1065	938	938
query32	89	80	75	75
query33	555	352	307	307
query34	1190	1127	651	651
query35	774	810	682	682
query36	1378	1360	1138	1138
query37	152	106	85	85
query38	3235	3160	3086	3086
query39	931	934	886	886
query39_1	870	857	917	857
query40	234	159	142	142
query41	69	62	60	60
query42	111	115	108	108
query43	330	327	282	282
query44	
query45	216	203	202	202
query46	1126	1202	747	747
query47	2500	2521	2217	2217
query48	402	430	306	306
query49	645	532	424	424
query50	735	295	221	221
query51	4375	4302	4262	4262
query52	107	106	97	97
query53	255	301	211	211
query54	308	291	275	275
query55	96	91	88	88
query56	331	317	305	305
query57	1411	1447	1343	1343
query58	298	276	271	271
query59	1581	1660	1433	1433
query60	344	351	328	328
query61	165	150	152	150
query62	672	632	567	567
query63	241	201	207	201
query64	2358	827	679	679
query65	
query66	1689	520	384	384
query67	30052	30032	29223	29223
query68	
query69	460	337	303	303
query70	1031	1011	970	970
query71	299	273	268	268
query72	2912	2744	2417	2417
query73	880	756	403	403
query74	5081	4900	4751	4751
query75	2772	2669	2354	2354
query76	2333	1134	739	739
query77	420	430	348	348
query78	12988	12955	12484	12484
query79	1481	996	752	752
query80	1385	579	513	513
query81	525	281	243	243
query82	1261	157	118	118
query83	329	282	247	247
query84	274	144	114	114
query85	907	510	446	446
query86	450	351	325	325
query87	3430	3389	3217	3217
query88	3601	2677	2663	2663
query89	452	396	341	341
query90	1944	176	182	176
query91	178	165	135	135
query92	78	82	76	76
query93	980	937	551	551
query94	699	297	315	297
query95	656	380	337	337
query96	1054	783	332	332
query97	2705	2699	2555	2555
query98	236	231	248	231
query99	1155	1148	974	974
Total cold run time: 255203 ms
Total hot run time: 169707 ms

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 70.59% (12/17) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.79% (27780/37648)
Line Coverage 57.67% (300875/521723)
Region Coverage 54.95% (251028/456811)
Branch Coverage 56.44% (108485/192210)

@linrrzqqq
Collaborator Author

run buildall

@hello-stephen
Contributor

TPC-H: Total hot run time: 29628 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 52b51329ac73de8bc1cff1cac56eaf3a6f247668, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17613	3920	3856	3856
q2	q3	10714	883	608	608
q4	4661	467	349	349
q5	7448	1326	1144	1144
q6	184	174	149	149
q7	917	966	750	750
q8	9325	1376	1245	1245
q9	5599	5411	5380	5380
q10	6243	2081	1825	1825
q11	472	260	252	252
q12	636	422	301	301
q13	18117	3275	2728	2728
q14	288	283	263	263
q15	q16	868	857	793	793
q17	965	1036	751	751
q18	6416	5603	5620	5603
q19	1174	1173	1045	1045
q20	503	398	283	283
q21	4854	2434	1968	1968
q22	484	399	335	335
Total cold run time: 97481 ms
Total hot run time: 29628 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4812	4663	4874	4663
q2	q3	4638	4798	4245	4245
q4	2187	2222	1438	1438
q5	4970	5000	5262	5000
q6	201	176	138	138
q7	2154	1827	1621	1621
q8	3349	3150	3106	3106
q9	8459	8485	8480	8480
q10	4479	4510	4247	4247
q11	620	418	401	401
q12	694	743	512	512
q13	3233	3633	2944	2944
q14	297	311	283	283
q15	q16	864	793	693	693
q17	1327	1293	1251	1251
q18	8048	7098	7047	7047
q19	1163	1181	1136	1136
q20	2251	2214	1932	1932
q21	6107	5360	4837	4837
q22	551	510	414	414
Total cold run time: 60404 ms
Total hot run time: 54388 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 170228 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 52b51329ac73de8bc1cff1cac56eaf3a6f247668, data reload: false

query5	4348	644	513	513
query6	325	221	202	202
query7	4236	563	298	298
query8	339	224	228	224
query9	8828	4025	3971	3971
query10	445	352	300	300
query11	5831	2427	2208	2208
query12	182	136	129	129
query13	1266	585	427	427
query14	5930	5331	5050	5050
query14_1	4319	4329	4351	4329
query15	211	202	179	179
query16	976	449	436	436
query17	902	748	627	627
query18	2427	482	347	347
query19	213	197	168	168
query20	142	135	135	135
query21	213	137	114	114
query22	13628	13530	13278	13278
query23	17095	16269	16719	16269
query23_1	16410	16445	16406	16406
query24	7546	1851	1414	1414
query24_1	1404	1352	1368	1352
query25	603	529	453	453
query26	1322	313	169	169
query27	2750	615	333	333
query28	4530	1953	1948	1948
query29	1022	650	535	535
query30	304	241	202	202
query31	1103	1057	981	981
query32	87	69	69	69
query33	544	331	286	286
query34	1197	1078	625	625
query35	752	781	679	679
query36	1316	1304	1145	1145
query37	148	101	84	84
query38	3176	3124	3041	3041
query39	922	917	888	888
query39_1	879	878	886	878
query40	233	150	132	132
query41	63	59	58	58
query42	113	105	106	105
query43	319	314	288	288
query44	
query45	213	206	195	195
query46	1100	1238	720	720
query47	2321	2336	2167	2167
query48	397	404	285	285
query49	636	531	420	420
query50	702	282	214	214
query51	4273	4301	4222	4222
query52	105	105	95	95
query53	256	280	203	203
query54	309	277	249	249
query55	91	88	83	83
query56	302	300	303	300
query57	1418	1425	1354	1354
query58	299	277	271	271
query59	1534	1611	1382	1382
query60	356	344	330	330
query61	156	156	159	156
query62	671	626	555	555
query63	244	201	205	201
query64	2463	817	683	683
query65	
query66	1748	518	380	380
query67	30073	29994	29817	29817
query68	
query69	469	357	300	300
query70	1029	1013	897	897
query71	309	274	278	274
query72	2971	2697	2405	2405
query73	827	765	423	423
query74	5051	4929	4722	4722
query75	2781	2662	2339	2339
query76	2257	1121	747	747
query77	407	435	353	353
query78	12955	12940	12316	12316
query79	1463	1024	737	737
query80	695	586	480	480
query81	458	291	236	236
query82	1332	163	138	138
query83	351	279	256	256
query84	310	138	110	110
query85	855	529	434	434
query86	401	334	331	331
query87	3434	3341	3210	3210
query88	3520	2682	2648	2648
query89	439	386	345	345
query90	1904	188	180	180
query91	197	190	159	159
query92	82	103	74	74
query93	958	945	557	557
query94	519	335	285	285
query95	668	372	339	339
query96	1035	768	352	352
query97	2768	2708	2607	2607
query98	240	231	236	231
query99	1149	1136	976	976
Total cold run time: 253067 ms
Total hot run time: 170228 ms

@HappenLee
Contributor

/review

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 88.24% (15/17) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.87% (27811/37651)
Line Coverage 57.76% (301379/521734)
Region Coverage 55.02% (251323/456810)
Branch Coverage 56.55% (108692/192208)

@linrrzqqq
Collaborator Author

/review

Contributor

@github-actions github-actions Bot left a comment

I found one additional blocking issue: the newly added BE unit test calls a private PythonServerManager method directly, so the test target will not compile. Existing review threads already cover the async UDAF manager cleanup lifecycle and cleanup-error handling concerns; I did not duplicate those.

Critical checkpoint conclusions:

  • Goal/test: The PR targets stale Python UDAF cache reuse after DROP/CREATE and adds regression/unit coverage, but the new unit test has a compile-time access violation.
  • Scope: The functional change is focused on Python UDAF metadata/cache keys and drop cleanup; no unrelated broad behavior was necessary for this fix.
  • Concurrency/lifecycle: The relevant async DROP cleanup and in-flight UDAF lifecycle risks are already tracked in existing review context; the current distinct finding is test compilation.
  • Error handling: The C++ cleanup path now uses warning-style handling rather than throwing from the task callback; no new duplicate comment was added.
  • Compatibility/protocol: The new id metadata is produced by BE and consumed by the local Python server; no additional compatibility issue found in this review.
  • Tests: Regression coverage was added for drop/recreate, but BE unit coverage must compile before it can validate anything.
  • User focus: No additional user-provided review focus was supplied.

Comment thread be/test/udf/python/python_server_test.cpp
@linrrzqqq force-pushed the pyudf-clear-udaf-state branch from 52b5132 to d3d0fe9 on May 11, 2026 03:10
@linrrzqqq
Collaborator Author

/review

Contributor

@github-actions github-actions Bot left a comment

No additional blocking issues found in the current diff.

Critical checkpoint conclusions:

  • Goal/test coverage: The change fixes stale Python UDAF class reuse after DROP/CREATE with the same name/signature by carrying the FE function id into Python metadata and the UDAF manager key. The added regression case covers inline UDAF recreate behavior, and the added unit tests cover metadata serialization and broadcast error/no-process paths.
  • Scope: The modification is focused on Python UDF/UDAF metadata, Python server cache keys, and DROP cleanup dispatch.
  • Concurrency/lifecycle: The current cleanup detaches managers under udaf_managers_lock and avoids clearing manager.states, so in-flight exchanges holding an existing manager are not invalidated by cleanup. The C++ cleanup path is best-effort and no longer throws out of the task callback.
  • Compatibility/protocol: Adding id to the BE-generated Python metadata is consumed by the Python server launched from the same BE package; no separate persistent format or FE-BE thrift schema change is introduced.
  • Parallel paths: UDF, UDAF, and UDTF metadata serialization now includes the existing TFunction.id; the DROP cleanup is applied when FE sends a function id.
  • Error handling: Broadcast failures are converted to Status; DROP UDAF cleanup logs via WARN_IF_ERROR, matching best-effort cache cleanup behavior.
  • Memory/observability/performance: No new tracked BE allocations or hot-path scans of concern were found. Cleanup uses existing logs and Python GC only after managers are detached.
  • Transaction/persistence/data visibility: Not applicable; this PR does not alter storage, transaction, visible-version, or persistence behavior.
  • User focus: No additional user-provided review focus was supplied.

Previously raised inline review threads were treated as known context and not duplicated.
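
The lifecycle point in this review, detaching managers under a lock without clearing `manager.states`, can be sketched as follows. Only `udaf_managers_lock` and `states` are names taken from the review text; `UDAFStateManager`, `ManagerRegistry`, and the key layout are assumptions made for illustration.

```python
import threading
from typing import Dict, List, Tuple

Key = Tuple[int, str, Tuple[str, ...]]  # assumed: (function id, name, arg types)


class UDAFStateManager:
    """Holds per-exchange aggregate states (hypothetical shape)."""

    def __init__(self) -> None:
        self.states: Dict[int, int] = {}


class ManagerRegistry:
    def __init__(self) -> None:
        self.udaf_managers_lock = threading.Lock()
        self._managers: Dict[Key, UDAFStateManager] = {}

    def get_or_create(self, key: Key) -> UDAFStateManager:
        # Lookup and creation happen under the lock and the manager is
        # returned from inside it, so a concurrent pop cannot race with us.
        with self.udaf_managers_lock:
            manager = self._managers.get(key)
            if manager is None:
                manager = self._managers[key] = UDAFStateManager()
            return manager

    def detach_by_function_id(self, function_id: int) -> List[UDAFStateManager]:
        # Remove the registry entries only; states are left untouched so an
        # in-flight exchange that already holds the manager keeps working.
        with self.udaf_managers_lock:
            doomed = [key for key in self._managers if key[0] == function_id]
            return [self._managers.pop(key) for key in doomed]


registry = ManagerRegistry()
key: Key = (1001, "py_udaf_bug_repro", ("INT",))

in_flight = registry.get_or_create(key)
in_flight.states[1] = 60  # a running query has partial state

detached = registry.detach_by_function_id(1001)
fresh = registry.get_or_create(key)  # a later lookup builds a new manager
```

After the detach, `in_flight.states` still holds its partial state, and the old manager becomes garbage only once the exchange that references it finishes.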

@linrrzqqq
Collaborator Author

run buildall

@hello-stephen
Contributor

TPC-H: Total hot run time: 29600 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit d3d0fe9f19ce3567b901aabb6283b695d8628cd8, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17682	3843	3802	3802
q2	q3	10715	880	603	603
q4	4656	454	352	352
q5	7458	1339	1135	1135
q6	187	170	138	138
q7	948	960	748	748
q8	9610	1415	1278	1278
q9	6137	5419	5355	5355
q10	6299	2082	1797	1797
q11	477	268	252	252
q12	693	418	306	306
q13	18194	3310	2781	2781
q14	293	284	262	262
q15	q16	889	863	793	793
q17	1185	1056	801	801
q18	6526	5725	5651	5651
q19	1595	1211	1109	1109
q20	512	409	258	258
q21	4543	2275	1878	1878
q22	413	347	301	301
Total cold run time: 99012 ms
Total hot run time: 29600 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	4228	4139	4234	4139
q2	q3	4634	4761	4243	4243
q4	2089	2158	1381	1381
q5	4974	5005	5263	5005
q6	186	162	130	130
q7	2032	2090	1679	1679
q8	3441	3199	3151	3151
q9	8471	8497	8379	8379
q10	4512	4472	4244	4244
q11	624	421	378	378
q12	693	739	542	542
q13	3215	3645	2948	2948
q14	301	306	285	285
q15	q16	778	813	674	674
q17	1345	1299	1350	1299
q18	7986	7163	6981	6981
q19	1143	1188	1164	1164
q20	2225	2254	1956	1956
q21	6089	5417	4913	4913
q22	539	501	428	428
Total cold run time: 59505 ms
Total hot run time: 53919 ms

@hello-stephen
Contributor

TPC-DS: Total hot run time: 171047 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit d3d0fe9f19ce3567b901aabb6283b695d8628cd8, data reload: false

query5	4403	653	515	515
query6	334	214	200	200
query7	4227	570	317	317
query8	332	221	214	214
query9	8831	4030	4056	4030
query10	479	352	313	313
query11	6036	2353	2235	2235
query12	201	131	131	131
query13	1319	615	445	445
query14	6909	5336	5091	5091
query14_1	4402	4407	4369	4369
query15	244	203	189	189
query16	1011	460	442	442
query17	1385	779	636	636
query18	2735	498	367	367
query19	331	229	166	166
query20	137	137	135	135
query21	217	144	114	114
query22	13568	13989	14489	13989
query23	17503	16549	16127	16127
query23_1	16345	16267	16295	16267
query24	7991	1759	1344	1344
query24_1	1370	1338	1417	1338
query25	534	476	419	419
query26	1279	316	166	166
query27	2649	579	335	335
query28	4315	1957	1939	1939
query29	974	639	505	505
query30	296	233	194	194
query31	1105	1053	929	929
query32	86	70	67	67
query33	526	338	283	283
query34	1120	1140	643	643
query35	787	776	652	652
query36	1310	1374	1182	1182
query37	146	100	87	87
query38	3164	3103	3060	3060
query39	930	915	880	880
query39_1	867	874	868	868
query40	226	155	135	135
query41	62	64	58	58
query42	107	108	106	106
query43	326	324	278	278
query44	
query45	203	193	188	188
query46	1110	1160	698	698
query47	2285	2282	2157	2157
query48	408	420	283	283
query49	625	523	423	423
query50	720	281	216	216
query51	4274	4237	4153	4153
query52	103	116	93	93
query53	251	283	205	205
query54	310	268	245	245
query55	94	87	83	83
query56	296	340	305	305
query57	1417	1396	1304	1304
query58	285	280	261	261
query59	1504	1654	1370	1370
query60	354	320	325	320
query61	158	151	156	151
query62	665	615	553	553
query63	245	202	205	202
query64	2349	848	674	674
query65	
query66	1689	504	389	389
query67	29904	29931	29816	29816
query68	
query69	449	341	310	310
query70	1022	992	967	967
query71	309	277	261	261
query72	2899	2702	2438	2438
query73	870	792	406	406
query74	5057	4922	4761	4761
query75	2792	2654	2324	2324
query76	2285	1145	761	761
query77	409	435	340	340
query78	13036	12991	12297	12297
query79	1504	962	725	725
query80	1375	565	473	473
query81	518	281	243	243
query82	977	156	119	119
query83	319	285	250	250
query84	270	141	110	110
query85	921	539	448	448
query86	451	329	322	322
query87	3402	3343	3195	3195
query88	3570	2657	2640	2640
query89	438	371	338	338
query90	1927	184	181	181
query91	182	169	139	139
query92	77	81	69	69
query93	1100	937	556	556
query94	717	345	299	299
query95	670	377	353	353
query96	1065	786	359	359
query97	2722	2702	2601	2601
query98	239	226	225	225
query99	1104	1127	996	996
Total cold run time: 256003 ms
Total hot run time: 171047 ms

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 90.48% (19/21) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 53.58% (20659/38559)
Line Coverage 37.20% (194986/524203)
Region Coverage 33.52% (152015/453460)
Branch Coverage 34.56% (66304/191835)

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 88.24% (15/17) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.81% (27870/37761)
Line Coverage 57.68% (301573/522826)
Region Coverage 54.84% (251109/457878)
Branch Coverage 56.39% (108577/192563)

Contributor

@HappenLee HappenLee left a comment

LGTM

@HappenLee HappenLee merged commit 54d7885 into apache:master May 12, 2026
31 of 32 checks passed
@linrrzqqq linrrzqqq deleted the pyudf-clear-udaf-state branch May 12, 2026 06:50