Cleanup the legacy engines code path by madsbk · Pull Request #22488 · rapidsai/cudf

madsbk · 2026-05-13T13:19:15Z

Test helpers (`cudf_polars/testing/asserts.py`)

assert_collect_raises

Tests

test_collect_assert_raises
assert_collect_raises imports (in test_asserts.py,
test_stringfunction.py, test_casting.py,
test_datetime_basic.py)

Benchmark CLI (`benchmarks/utils.py`)

Removed the --executor flag and folded its cpu and
in-memory choices into --frontend, which now accepts:
cpu, in-memory, spmd, ray, dask, and duckdb.
ExecutorType alias
RunConfig.executor
Executor serialization / summary output
assert_never(run_config.executor) branch
Unused assert_never import

Benchmark shim

benchmarks/utils_new_frontends.py
(contents merged into benchmarks/utils.py)

…ask", "duckdb"

coderabbitai · 2026-05-13T14:20:53Z

Note

Reviews paused

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

This PR eliminates global default executor configuration from the cudf_polars testing infrastructure by making engine specification explicit throughout. It requires all assertion functions to accept explicit GPUEngine instances, removes the --executor CLI option and helper functions that provided defaults, updates the entire test suite to pass explicit engines, consolidates benchmark utilities from a separate module into the main utils module, and adds comprehensive CLI and execution infrastructure for benchmark runners.

Changes

Testing and Benchmark Engine Parameterization

Layer / File(s)	Summary
Assertion function API refactoring `python/cudf_polars/cudf_polars/testing/asserts.py`	`assert_gpu_result_equal` and `assert_sink_result_equal` now require an explicit `engine: GPUEngine` keyword-only parameter. Removed `DEFAULT_EXECUTOR`, `get_default_engine()`, and `assert_collect_raises()` to eliminate optional engine fallbacks.
Pytest configuration and test fixture cleanup `python/cudf_polars/tests/conftest.py`	Removed the `pytest_addoption()` hook that registered `--executor` CLI flag and the wiring in `pytest_configure()` that set `DEFAULT_EXECUTOR` from it. Configuration now only handles marker registration and global `ResourceWarning` suppression.
Test suite updates for explicit engines `python/cudf_polars/tests/expressions/test_casting.py`, `python/cudf_polars/tests/expressions/test_datetime_basic.py`, `python/cudf_polars/tests/expressions/test_stringfunction.py`, `python/cudf_polars/tests/test_config.py`, `python/cudf_polars/tests/test_executors.py`, `python/cudf_polars/tests/test_sink.py`, `python/cudf_polars/tests/test_tracing.py`, `python/cudf_polars/tests/testing/test_asserts.py`	All test modules replaced `assert_collect_raises()` calls and `executor="..."` string arguments with explicit `pytest.raises(...)` blocks or instantiated `pl.GPUEngine(executor="in-memory", raise_on_fail=True)` objects. Updated subprocess environment to use explicit `CUDF_POLARS__EXECUTOR` setting instead of relying on test defaults.
Benchmark utilities consolidation and CLI infrastructure `python/cudf_polars/cudf_polars/experimental/benchmarks/utils.py` (previously `utils_new_frontends.py` removed)	Merged benchmarking datamodels (`QueryResult`, `RunConfig`, validation records), hardware/version tracking (`HardwareInfo`, `PackageVersions`), frontend execution runners (CPU, in-memory, SPMD, Ray, Dask, DuckDB), NVTX-annotated timing, structured tracing collection, and comprehensive CLI argument parsing (`build_parser`, `parse_args`, `run_polars`) into a single, self-contained module. Includes dataset loading, scale-factor inference, validation golden-file discovery, and result summarization.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

The review involves understanding the systematic removal of a global executor pattern across multiple test files and assertion utilities, verifying that all test branches correctly specify engines, and validating the large consolidation of benchmark utilities that introduces comprehensive CLI infrastructure and multi-frontend execution orchestration.

Possibly related PRs

rapidsai/cudf#22410: Related refactoring of default engine/executor handling in asserts.py with similar removal of fallback engine derivation patterns.
rapidsai/cudf#22381: Extends test fixtures in conftest.py with Ray/Dask-aware streaming engine markers and fixtures that work in tandem with the explicit engine requirement pattern introduced here.

Suggested reviewers

TomAugspurger
mroeschke
bdice
Matt711

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 59.09% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'Cleanup the legacy engines code path' accurately describes the main objective of removing deprecated engine-related code and making engine arguments mandatory in assertion functions.
Linked Issues check	✅ Passed	The PR successfully implements all coding requirements from issue `#22346`: makes engine argument mandatory for assert_gpu_result_equal and assert_sink_result_equal, removes DEFAULT_EXECUTOR and get_default_engine, eliminates assert_collect_raises, and removes the --executor CLI option to prevent implicit defaults.
Out of Scope Changes check	✅ Passed	All changes are directly aligned with the linked issue `#22346` objectives. The refactoring of benchmarks/utils.py consolidating frontend selection and removal of utils_new_frontends.py are supporting changes to achieve explicit engine handling throughout the codebase.
Description check	✅ Passed	The PR description clearly outlines the removal of legacy engine code paths, test helpers, and CLI flag consolidation, matching the changeset which removes deprecated test utilities and refactors benchmark/testing infrastructure.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Warning

Review ran into problems

🔥 Problems

Stopped waiting for pipeline failures after 30000ms. One of your pipelines takes longer than our 30000ms fetch window to run, so review may not consider pipeline-failure results for inline comments if any failures occurred after the fetch window. Increase the timeout if you want to wait longer or run a @coderabbit review after the pipeline has finished.

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

…ode_path

madsbk · 2026-05-14T13:15:22Z

@@ -1,19 +1,33 @@
-# SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES.


This diff is hard to read on GitHub. To better isolate the changes in
utils.py, use something like:

git diff \ main:python/cudf_polars/cudf_polars/experimental/benchmarks/utils_new_frontends.py \ HEAD:python/cudf_polars/cudf_polars/experimental/benchmarks/utils.py

wence- · 2026-05-14T13:31:22Z

--executor CLI option

pytest_addoption

DEFAULT_EXECUTOR = config.getoption("--executor")

This stuff has already gone, no?

wence- · 2026-05-14T13:33:12Z

Closes #22346.

This is closed by refusing to do it, AFAICT, rather than shoveling an engine argument into assert_ir_translation_raises.

wence-

Please update the PR description for what is now actually going on.

TBH, I would prefer just changing the benchmark rather than the benchmark and the tests in this PR

wence- · 2026-05-14T13:30:44Z

    return final_polars_collect_kwargs, final_cudf_collect_kwargs


-def assert_collect_raises(


Inlining it instead of calling assert_collect_raises is simpler IMO.

madsbk · 2026-05-14T13:55:22Z

Closes #22346.

This is closed by refusing to do it, AFAICT, rather than shoveling an engine argument into assert_ir_translation_raises.

Removed the closing, you are right that this PR only addresses the first part of #22346.
I am not sure assert_ir_translation_raises is really an issue, though.

copy-pr-bot · 2026-05-14T14:12:40Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

madsbk · 2026-05-14T14:15:50Z

TBH, I would prefer just changing the benchmark rather than the benchmark and the tests in this PR

Fair point, let's do one at a time: #22504

madsbk · 2026-05-16T09:08:50Z

close in favor of #22535

madsbk added 6 commits May 13, 2026 14:11

move to utils.py

b92a9e9

assert(): make engine mandatory

4c95b90

assert: remove the executor argument

e5b1aba

remove remove assert_collect_raises()

0a10efd

Removed the pytest_addoption function and its --executor

79bd93c

benchmark: now --frontend takes "cpu", "in-memory", "spmd", "ray", "d…

e43dff3

…ask", "duckdb"

madsbk self-assigned this May 13, 2026

madsbk added improvement Improvement / enhancement to an existing function breaking Breaking change labels May 13, 2026

github-actions Bot added Python Affects Python cuDF API. cudf-polars Issues specific to cudf-polars labels May 13, 2026

github-project-automation Bot added this to cuDF Python May 13, 2026

GPUtester moved this to In Progress in cuDF Python May 13, 2026

madsbk marked this pull request as ready for review May 13, 2026 14:15

madsbk requested a review from a team as a code owner May 13, 2026 14:15

madsbk requested a review from TomAugspurger May 13, 2026 14:15

This comment was marked as off-topic.

Sign in to view

rapidsai deleted a comment from copy-pr-bot Bot May 13, 2026

Merge branch 'main' into cleanup_legacy_code_path

129d910

rapidsai deleted a comment from coderabbitai Bot May 13, 2026

Merge branch 'main' of github.com:rapidsai/cudf into cleanup_legacy_c…

076c7db

…ode_path

madsbk commented May 14, 2026

View reviewed changes

wence- requested changes May 14, 2026

View reviewed changes

madsbk marked this pull request as draft May 14, 2026 14:12

madsbk closed this May 16, 2026

github-project-automation Bot moved this from In Progress to Done in cuDF Python May 16, 2026

		@@ -1,19 +1,33 @@
		# SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES.

		return final_polars_collect_kwargs, final_cudf_collect_kwargs


		def assert_collect_raises(

Conversation

madsbk commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Test helpers (cudf_polars/testing/asserts.py)

Tests

Benchmark CLI (benchmarks/utils.py)

Benchmark shim

Uh oh!

coderabbitai Bot commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Estimated code review effort

Possibly related PRs

Suggested reviewers

❌ Failed checks (1 warning)

Review ran into problems

Uh oh!

This comment was marked as off-topic.

Uh oh!

madsbk May 14, 2026

Choose a reason for hiding this comment

Uh oh!

wence- commented May 14, 2026

Uh oh!

wence- commented May 14, 2026

Uh oh!

wence- left a comment

Choose a reason for hiding this comment

Uh oh!

wence- May 14, 2026

Choose a reason for hiding this comment

Uh oh!

madsbk May 14, 2026

Choose a reason for hiding this comment

Uh oh!

madsbk commented May 14, 2026

Uh oh!

copy-pr-bot Bot commented May 14, 2026

Uh oh!

madsbk commented May 14, 2026

Uh oh!

madsbk commented May 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

madsbk commented May 13, 2026 •

edited

Loading

Test helpers (`cudf_polars/testing/asserts.py`)

Benchmark CLI (`benchmarks/utils.py`)

coderabbitai Bot commented May 13, 2026 •

edited

Loading