
Fix FastDivmod divisor SSA transport for kernel regions (#3243) #3246

Open
zhils wants to merge 1 commit into NVIDIA:main from zhils:fix/cutedsl-fastdivmod-fdd-ss-3243


zhils commented May 19, 2026

Fixes #3243

Expose both the encoded divisor and the scalar divisor IR through FastDivmodDivisor.__extract_mlir_values__, and rebuild .divisor from region-local SSA values in __new_from_mlir_values__. Update the persistent tile schedulers to slice the FastDivmod tail by the length of the list returned by __extract_mlir_values__. Add a CuTeDSL compile regression test aligned with the issue repro.
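The transport described above can be illustrated with a minimal, self-contained Python sketch. It does not import CUTLASS. `FakeSSA`, `DivisorSketch`, and `enter_region` are hypothetical stand-ins invented for illustration, and only the two protocol method names come from this PR. The point is that when every value is returned by `__extract_mlir_values__`, the rebuilt object inside the region holds only region-local values.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FakeSSA:
    """Stand-in for an MLIR SSA value; `region` records where it is defined."""
    name: str
    region: str


class DivisorSketch:
    """Hypothetical model of a FastDivmod-style divisor (not the CUTLASS class)."""

    def __init__(self, encoded: FakeSSA, divisor: FakeSSA):
        self.encoded = encoded
        self.divisor = divisor

    def __extract_mlir_values__(self) -> List[FakeSSA]:
        # Expose BOTH values so that both are carried across the region boundary.
        return [self.encoded, self.divisor]

    def __new_from_mlir_values__(self, values: List[FakeSSA]) -> "DivisorSketch":
        # Rebuild every field from the values handed in by the region.
        encoded, divisor = values
        return DivisorSketch(encoded, divisor)


def enter_region(obj: DivisorSketch, region: str) -> DivisorSketch:
    """Mimic passing an object into an isolated region: each extracted value
    is replaced by a fresh, region-local block argument."""
    outer = obj.__extract_mlir_values__()
    inner = [FakeSSA(f"%arg{i}", region) for i in range(len(outer))]
    return obj.__new_from_mlir_values__(inner)


host = DivisorSketch(FakeSSA("%enc", "host"), FakeSSA("%div", "host"))
dev = enter_region(host, "kernel")
```

In this model, a field that is left out of `__extract_mlir_values__` would keep its host-region value after reconstruction, which is the kind of stale reference the linked issue describes.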

Test plan

  • pytest test/examples/CuTeDSL/test_fast_divmod_divisor_param_transport.py
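The scheduler change mentioned in the description (slicing the FastDivmod tail by extracted length) can be sketched in the same spirit. This is an assumption-laden illustration rather than the scheduler code: `StubDivisor` and `split_flat_values` are hypothetical names, and the real parameter layout may differ. It shows why deriving each slice width from `len(obj.__extract_mlir_values__())` stays correct when an object starts exposing more values.

```python
from typing import List, Sequence, Tuple


class StubDivisor:
    """Hypothetical object that exposes `n` values through the extract protocol."""

    def __init__(self, n: int):
        self._n = n

    def __extract_mlir_values__(self) -> List[int]:
        return list(range(self._n))


def split_flat_values(
    values: Sequence[int], head_len: int, tail_objs: Sequence[StubDivisor]
) -> Tuple[List[int], List[List[int]]]:
    """Split a flat value list into a fixed-size head plus one slice per tail
    object, sized by what each object actually extracts (no hard-coded widths)."""
    head = list(values[:head_len])
    offset = head_len
    slices = []
    for obj in tail_objs:
        width = len(obj.__extract_mlir_values__())
        slices.append(list(values[offset:offset + width]))
        offset += width
    if offset != len(values):
        raise ValueError("flat value list does not match the expected layout")
    return head, slices


# Three plain fields followed by two divisors that each expose two values.
head, tail = split_flat_values(list(range(7)), 3, [StubDivisor(2), StubDivisor(2)])
```

A hard-coded width of one value per divisor would misalign every slice after the first divisor in this example, whereas the length-based split adapts automatically.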

Expose encoded divisor plus scalar divisor IR through FastDivmodDivisor::__extract_mlir_values__ and rebuild .divisor from region-local SSA in __new_from_mlir_values__. Update persistent tile schedulers to slice FDD tail by extract_mlir_values length. Add CuTeDSL compile regression derived from upstream repro.

Co-authored-by: Cursor <cursoragent@cursor.com>
Contributor

questa-wang left a comment


LGTM. Thanks for the fix.

@questa-wang
Contributor

Hi @zhils, thanks a lot for putting up this fix! We've been tracking the same root cause internally and have landed a broader patch that touches additional consumers beyond what's visible in the public tree, plus a small fix to avoid re-emitting IR on the device side during reconstruction.
That fix is on track for the next public release (4.6). We'll keep this PR open as a reference until then, and close it out once 4.6 ships. Thanks again for the contribution!



Development

Successfully merging this pull request may close these issues.

[BUG] cutlass 4.5.0/4.5.1 cute.FastDivmodDivisor.divisor can reference SSA value from outside isolated region

2 participants