Remove cudf-polars[rapidsmpf] pip extra & numpy as a [test] dependency; add [dask] pip extra#22480
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughRemoved the ChangesOptional dependency & CI cleanup
Tests: NumPy → Polars/Arrow
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@python/cudf_polars/tests/experimental/test_allgather.py`:
- Around line 28-34: The test currently only asserts shape/type after converting
pl.Series -> plc.Column.from_arrow -> plc.Table and calling allgather; add a
value-level assertion that the gathered table's column values equal the expected
Polars result (e.g., the concatenated pl.Series values) by extracting the column
from the result of plc.allgather and comparing its values to the CPU-side
pl.concat(...) result; locate the conversion usage of plc.Column.from_arrow and
the allgather call in test_allgather.py and assert element-wise equality (or use
to_list()/to_numpy() on both sides) to ensure GPU results match Polars CPU
results.
In `@python/cudf_polars/tests/experimental/test_spilling.py`:
- Around line 36-38: The current table builder reseeds the RNG on every call
(rng = random.Random(42)) producing identical pl.Series payloads; remove the
per-call reseed and instead use a persistent RNG (e.g., a module-level
random.Random instance or pass an rng parameter) so pl_data =
pl.Series([rng.random() ...], dtype=pl.Float32()) yields varied values across
calls; update both occurrences (the one creating pl_data and the similar
occurrence around line 90) and ensure plc.Column.from_arrow/plc.Table creation
uses the new non-reseeded rng.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 12d4f352-6d66-4d38-ba75-43fa9a4d7b64
📒 Files selected for processing (4)
dependencies.yamlpython/cudf_polars/pyproject.tomlpython/cudf_polars/tests/experimental/test_allgather.pypython/cudf_polars/tests/experimental/test_spilling.py
💤 Files with no reviewable changes (2)
- dependencies.yaml
- python/cudf_polars/pyproject.toml
| - depends_on_rapidsmpf | ||
| - depends_on_pylibcudf | ||
| - depends_on_cuda_python | ||
| # TODO: Eventually remove this alias in favor of dask |
There was a problem hiding this comment.
I recommend picking a timeline and including that in the comment. Remove in 26.08? 26.10?
…idsmpf_optional_dep
…idsmpf_optional_dep
|
/merge |
Description
cudf-polars[rapidsmpf]pip extranumpyas a testing dependency as it was only used fornp.randomandnp.full"nvidia-ml-py>=12"from theexperimentalextra as it's already a required dependency of cudf_polarscudf-polars[dask]as an alias forcudf-polars[experimental]Checklist