GeoT optimization 2/4: Datapipes producer/consumer refactor + stream overlap #1742

Draft
coreyjadams wants to merge 6 commits into main from geoT-opt-datapipe-stream-overlap

@coreyjadams

Collaborator

PhysicsNeMo Pull Request

This PR is stacked 🥞 on #1741. It requires the torch SDF implementation first.

This PR aggressively rebuilds the datapipe's I/O path to rely more heavily on prefetching. I also updated the docs. It is still a work in progress, but getting there.

Description

Checklist

Dependencies

Review Process

All PRs are reviewed by the PhysicsNeMo team before merging.

Depending on which files are changed, GitHub may automatically assign a maintainer for review.

We are also testing AI-based code review tools (e.g., Greptile), which may add automated comments with a confidence score.
This score reflects the AI’s assessment of merge readiness and is not a qualitative judgment of your work, nor is
it an indication that the PR will be accepted / rejected.

AI-generated feedback should be reviewed critically for usefulness.
You are not required to respond to every AI comment, but they are intended to help both authors and reviewers.
Please react to Greptile comments with 👍 or 👎 to provide feedback on their accuracy.

@copy-pr-bot

copy-pr-bot Bot commented Jun 22, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@copy-pr-bot

copy-pr-bot Bot commented Jun 25, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Base automatically changed from geoT-opt-warp-free-SDF to main June 26, 2026 15:44
Refactor the datapipe prefetch path into a thread-safe host producer
(_load_host) plus a main-thread consumer (_consume) with a FIFO
submit/consume primitive (io_pump.IOPump), so all device/Warp kernels
launch on the consuming thread. Build deferred-sync stream overlap on top:
_consume records the preprocessing CUDA event into _events_pending and the
DataLoader does one-batch lookahead, inserting compute_stream.wait_event
just before each yield so batch N+1 preprocessing overlaps batch N compute.
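
The producer/consumer split described above can be illustrated with a minimal, pure-Python sketch. This is not the `io_pump.IOPump` implementation from this PR: the class name `FifoPump`, its methods, and the toy `load_host` / `consume` callables are hypothetical and chosen only to show the shape of the idea. Host loading runs on worker threads, while results are consumed in submission order on the calling thread, which is where any device kernels would be launched.

```python
import threading
from collections import deque
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Deque


class FifoPump:
    """Hypothetical FIFO submit/consume primitive (illustration only)."""

    def __init__(
        self,
        load_host: Callable[[int], Any],
        consume: Callable[[Any], Any],
        max_workers: int = 2,
    ) -> None:
        self._load_host = load_host  # must be thread-safe: runs on worker threads
        self._consume = consume      # runs only on the thread calling consume()
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._pending: Deque[Future] = deque()

    def submit(self, index: int) -> None:
        """Start loading one sample on a worker thread."""
        self._pending.append(self._pool.submit(self._load_host, index))

    def consume(self) -> Any:
        """Block on the oldest submitted load, then finish it on this thread."""
        future = self._pending.popleft()  # FIFO: oldest submission first
        return self._consume(future.result())

    def close(self) -> None:
        self._pool.shutdown(wait=True)


# Toy stand-ins for the host producer and the main-thread consumer.
main_thread = threading.get_ident()
consumer_threads = []


def load_host(index: int) -> dict:
    return {"index": index, "data": [index] * 3}


def consume(host: dict) -> int:
    consumer_threads.append(threading.get_ident())
    return host["index"] * 10


pump = FifoPump(load_host, consume)
for i in range(4):
    pump.submit(i)
results = [pump.consume() for _ in range(4)]
pump.close()
```

Because `consume()` always pops the oldest future, results come back in submission order even when worker threads finish out of order, and the consumer callable never runs off the calling thread.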

- New io_pump.py (FIFO pump); producer/consumer protocols; _rng
  fork_generator; core.function_spec.warp_stream_from_torch; refactored
  readers (base/numpy/zarr/tensorstore_zarr) and datapipes __init__.
- MeshDataset parallel disk read + pin (serialize_load_consume=False);
  DomainMeshReader drop_interior_cells / drop_in_file_boundaries; volume
  configs enable both.
- radius_search pinned non_blocking H2D; recipe train loop pinned async
  loss D2H. Opt-in timing + torch.profiler labels; streaming + reader
  tests, docs, and the iterable-dataset tutorial.

No Warp keepalive machinery: with the Warp-free SDF (parent branch) the
datapipe no longer launches Warp kernels in _consume.
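
The one-batch lookahead with a deferred sync described in the commit message can be sketched as a plain generator. The names `lookahead_batches`, `start_preprocess`, and `wait_ready` are hypothetical and do not come from this PR. In a CUDA setting, one would expect `start_preprocess` to enqueue work on a side stream and return a recorded `torch.cuda.Event`, and `wait_ready` to call `compute_stream.wait_event(event)`; here both are plain callables so the ordering can be checked without a GPU.

```python
from typing import Any, Callable, Iterable, Iterator, List, Tuple


def lookahead_batches(
    indices: Iterable[int],
    start_preprocess: Callable[[int], Tuple[Any, Any]],
    wait_ready: Callable[[Any], None],
) -> Iterator[Any]:
    """Yield batches, starting batch N+1 before batch N is handed out.

    start_preprocess(i) returns (batch, event) without blocking.
    wait_ready(event) makes the consumer wait for that batch's preprocessing.
    """
    it = iter(indices)
    try:
        first = next(it)
    except StopIteration:
        return
    pending = start_preprocess(first)
    for nxt in it:
        upcoming = start_preprocess(nxt)  # kick off N+1 first
        batch, event = pending
        wait_ready(event)                 # deferred sync, just before the yield
        yield batch
        pending = upcoming
    batch, event = pending
    wait_ready(event)
    yield batch


# Record the order of operations with toy callables.
log: List[str] = []


def start_preprocess(i: int) -> Tuple[str, int]:
    log.append(f"start {i}")
    return f"batch {i}", i  # the "event" here is just the index


def wait_ready(event: int) -> None:
    log.append(f"wait {event}")


batches = []
for b in lookahead_batches(range(3), start_preprocess, wait_ready):
    log.append(f"compute {b}")
    batches.append(b)
```

In the recorded log, `start 1` appears before `compute batch 0`, which is the property that lets preprocessing of the next batch overlap with compute on the current one.
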
@coreyjadams coreyjadams force-pushed the geoT-opt-datapipe-stream-overlap branch from 76ddb69 to 65cd32b Compare June 27, 2026 14:39