GeoT optimization 3/4: Add fused batched Muon optimizer #1743
coreyjadams wants to merge 6 commits into
Add physicsnemo.optim.Muon, a fused/batched drop-in replacement for torch.optim.Muon that groups 2-D parameters by (shape, dtype, device) and runs batched Newton-Schulz via torch.bmm/baddbmm with torch._foreach_* momentum/weight-decay updates. Matches torch.optim.Muon hyperparameters, momentum_buffer state, and LR-adjustment modes. Export it from physicsnemo.optim and switch the unified external aero recipe's build_muon_optimizer to use it via CombinedOptimizer.
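As a rough illustration of the grouping and batched Newton-Schulz idea described above, here is a minimal sketch. It is not the code from this PR: the helper names `bucket_by_signature` and `batched_newton_schulz` are hypothetical, the quintic coefficients and step count are the values commonly used in public Muon implementations (assumed here, not taken from this PR), and the sketch runs in float32 rather than any reduced precision.

```python
from collections import defaultdict

import torch


def bucket_by_signature(tensors):
    """Group 2-D tensors by (shape, dtype, device) so each group can be stacked."""
    buckets = defaultdict(list)
    for t in tensors:
        buckets[(tuple(t.shape), t.dtype, t.device)].append(t)
    return buckets


def batched_newton_schulz(G, steps=5, coeffs=(3.4445, -4.7750, 2.0315), eps=1e-7):
    """Approximately orthogonalize every matrix in a (batch, rows, cols) stack.

    Assumed coefficients: the quintic iteration X <- a*X + (b*A + c*A@A) @ X
    with A = X @ X^T, as used in common Muon implementations.
    """
    a, b, c = coeffs
    # Work on the "wide" orientation so that A = X @ X^T is the smaller Gram matrix.
    transposed = G.size(-2) > G.size(-1)
    X = G.mT if transposed else G
    # Normalize each matrix by its own Frobenius norm.
    X = X / torch.linalg.matrix_norm(X, keepdim=True).clamp_min(eps)
    for _ in range(steps):
        A = torch.bmm(X, X.mT)                       # A = X @ X^T, per matrix
        B = torch.baddbmm(A, A, A, beta=b, alpha=c)  # B = b*A + c*(A @ A)
        X = torch.baddbmm(X, B, X, beta=a)           # X = a*X + B @ X
    return X.mT if transposed else X


def orthogonalize_all(grads):
    """Run one batched Newton-Schulz call per bucket instead of one per tensor."""
    results = {}
    for group in bucket_by_signature(grads).values():
        out = batched_newton_schulz(torch.stack(group))
        for g, o in zip(group, out):
            results[id(g)] = o
    return [results[id(g)] for g in grads]
```

With this layout, the number of matmul launches scales with the number of distinct (shape, dtype, device) buckets rather than with the number of parameters, which is the property the description relies on.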
Greptile Summary
This PR introduces
Important Files Changed
peterdsharpe left a comment
Great job with this!
The bucketing logic ("grouping") is a particularly nice touch that I think will really benefit kernel-launch-bound training. TBH, this might be worth upstreaming to PyTorch's Muon implementation too (unless they do it first).
/ok to test fdc0b5a
PhysicsNeMo Pull Request
Cursor made this implementation, and I want to clean it up into a tighter integration with torch before we merge. The key point is that the overhead of looping over params is actually pretty significant for models like GeoT, so this is a first draft of that fusion.
We won't merge it in this state, but I wanted a branch as a placeholder for putting all the pieces together.
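To make the per-parameter loop overhead concrete, here is a minimal sketch contrasting a Python loop with `torch._foreach_*` calls for the elementwise momentum and weight-decay parts of an update. It is a simplified SGD-style update written for illustration only: it omits the Newton-Schulz step and any LR adjustment, the function names are hypothetical, and it is not the update rule from this PR.

```python
import torch


def loop_update(params, grads, bufs, lr, momentum, weight_decay):
    """Per-tensor loop: several small kernel launches for every parameter."""
    for p, g, buf in zip(params, grads, bufs):
        buf.mul_(momentum).add_(g)         # momentum buffer update
        p.mul_(1 - lr * weight_decay)      # decoupled weight decay
        p.add_(buf, alpha=-lr)             # parameter step


def foreach_update(params, grads, bufs, lr, momentum, weight_decay):
    """Same arithmetic, expressed as a handful of calls over whole tensor lists."""
    torch._foreach_mul_(bufs, momentum)
    torch._foreach_add_(bufs, grads)
    torch._foreach_mul_(params, 1 - lr * weight_decay)
    torch._foreach_add_(params, bufs, alpha=-lr)
```

Both functions modify `params` and `bufs` in place and should agree up to floating-point rounding; the foreach version issues a fixed number of calls regardless of how many tensors are in the lists.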
Review Process
All PRs are reviewed by the PhysicsNeMo team before merging.
Depending on which files are changed, GitHub may automatically assign a maintainer for review.
We are also testing AI-based code review tools (e.g., Greptile), which may add automated comments with a confidence score.
This score reflects the AI's assessment of merge readiness and is not a qualitative judgment of your work, nor is it an indication that the PR will be accepted or rejected.
AI-generated feedback should be reviewed critically for usefulness.
You are not required to respond to every AI comment, but they are intended to help both authors and reviewers.
Please react to Greptile comments with 👍 or 👎 to provide feedback on their accuracy.