Fix firestorm gemmsup #933
Merged
Details:
- Add "repeat" flag to test experiment callback so that gemmsup kernel
  tests can repeat over all microkernel shapes. This is a very hacky
  solution but something more proper would require a much more
  substantial change for a very narrow use-case.
- Add new interface type so that we can force all mixtures of row- and
  column-storage for gemmsup. This is necessary to test all kernels.
- Some arcane trickery with the storage type (`stor3_t stor_id`), the
  storage (row/column) of parameters, and transposition in order to a)
  actually test all kernels, and b) ensure that kernels receive operands
  which meet the expected constraints. For example, the RCC kernel is
  "non-primary" for a row-major kernel, and so needs to be adjusted so
  that storage is actually like CRR.
- It seems highly likely that a mixture of row- and column-preferential
  gemmsup kernels will *NEVER* work. There probably shouldn't be
  separate preferences for each kernel.

Details:
- One of the armv8a gemmsup kernels was performing arithmetic on `void*`.
- Only firestorm is affected as of now, but potentially other arm64
  architectures if gemmsup were to be enabled.
- Manifested as UB (typically incorrect numerical result).
- Thanks to Oliver Grisel (@ogrisel) for reporting and sending MRE.

Details:
- Use `make T=1 ...` to echo testsuite output as it is run.
- Output is still written to `output.testsuite` regardless.
- Have CI tests use this option to prevent timeouts due to inactivity.
devinamatthews added a commit that referenced this pull request Jun 25, 2026
Details:
- Add comprehensive testing for gemmsup kernels.
  - Add "repeat" flag to test experiment callback so that gemmsup kernel
    tests can repeat over all microkernel shapes. This is a very hacky
    solution but something more proper would require a much more
    substantial change for a very narrow use-case.
  - Add new interface type so that we can force all mixtures of row- and
    column-storage for gemmsup. This is necessary to test all kernels.
  - Some arcane trickery with the storage type (`stor3_t stor_id`), the
    storage (row/column) of parameters, and transposition in order to a)
    actually test all kernels, and b) ensure that kernels receive operands
    which meet the expected constraints. For example, the RCC kernel is
    "non-primary" for a row-major kernel, and so needs to be adjusted so
    that storage is actually like CRR.
  - It seems highly likely that a mixture of row- and column-preferential
    gemmsup kernels will *NEVER* work. There probably shouldn't be
    separate preferences for each kernel.
- Test gemm ukrs for all microtile sizes.
- Add Makefile option to echo testsuite output.
  - Use `make T=1 ...` to echo testsuite output as it is run.
  - Output is still written to `output.testsuite` regardless.
  - Have CI tests use this option to prevent timeouts due to inactivity.
- Initialize all reference sup blocksizes.
- Add flags (gcc+llvm) to make `void*` arithmetic an error.
- Fix bug with firestorm (Apple M-series) gemmsup.
  - One of the armv8a gemmsup kernels was performing arithmetic on `void*`.
  - Only firestorm is affected as of now, but potentially other arm64
    architectures if gemmsup were to be enabled.
  - Manifested as UB (typically incorrect numerical result).
  - Thanks to Oliver Grisel (@ogrisel) for reporting and sending MRE.
(cherry picked from commit 5e22e1c)
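
The RCC-to-CRR adjustment mentioned in the commit message can be read as transposing the whole operation: computing C = A*B is equivalent to computing C^T = B^T * A^T, and a row-stored matrix occupies the same memory as its column-stored transpose. The sketch below is one interpretation of that example, not BLIS code. It assumes the three letters give the storage of C, A, and B in that order, and `flip_storage` and `transpose_storage` are hypothetical helper names.

```c
#include <assert.h>
#include <string.h>

/* Flip one storage letter: 'R' (row) <-> 'C' (column). */
static char flip_storage(char s)
{
    assert(s == 'R' || s == 'C');
    return (s == 'R') ? 'C' : 'R';
}

/* Hypothetical helper (not part of BLIS). Given the storage letters of
   (C, A, B), assumed to be in that order, write the storage combination
   seen after transposing the whole operation, C^T = B^T * A^T.
   Transposition flips every letter and swaps the roles of A and B. */
static void transpose_storage(const char *in, char out[4])
{
    assert(strlen(in) == 3);
    out[0] = flip_storage(in[0]); /* C^T                   */
    out[1] = flip_storage(in[2]); /* new left operand: B^T  */
    out[2] = flip_storage(in[1]); /* new right operand: A^T */
    out[3] = '\0';
}
```

Under these assumptions, `transpose_storage("RCC", out)` yields `"CRR"`, which matches the example given in the commit message.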
devinamatthews added a commit that referenced this pull request Jun 25, 2026
Details:
- Fix bug with firestorm (Apple M-series) gemmsup.
  - One of the armv8a gemmsup kernels was performing arithmetic on `void*`.
  - Only firestorm is affected as of now, but potentially other arm64
    architectures if gemmsup were to be enabled.
  - Manifested as UB (typically incorrect numerical result).
  - Thanks to Oliver Grisel (@ogrisel) for reporting and sending MRE.
(cherry picked from commit 5e22e1c)
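
To illustrate why arithmetic on `void*` can silently give wrong results: ISO C does not define it, but GCC and Clang accept it as an extension and step in bytes, as if the pointer were `char*`. An offset meant in elements then lands too early in the buffer. The sketch below uses hypothetical helpers (not BLIS code) to compare the two behaviors; the actual kernel bug may have differed in detail. In GCC and Clang, `-Werror=pointer-arith` turns such arithmetic into a compile error, though whether that is the exact flag added in this PR is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of BLIS): byte distance covered when a
   pointer is advanced by n elements of type double. */
static ptrdiff_t advance_as_double(void *base, size_t n)
{
    assert(base != NULL);
    double *p = (double *)base + n;   /* steps n * sizeof(double) bytes */
    return (char *)p - (char *)base;
}

/* Byte distance covered when the same offset is applied bytewise, which
   is how GCC and Clang treat `base + n` when base is a void*. The
   explicit char* cast keeps this sketch valid ISO C. */
static ptrdiff_t advance_as_void(void *base, size_t n)
{
    assert(base != NULL);
    char *p = (char *)base + n;       /* steps only n bytes */
    return p - (char *)base;
}
```

For a buffer of `double`, advancing by 4 covers `4 * sizeof(double)` bytes with the first helper but only 4 bytes with the second, so a kernel relying on the latter would address the wrong data.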
devinamatthews added a commit that referenced this pull request Jun 25, 2026
Details:
- Add comprehensive testing for gemmsup kernels.
  - Add "repeat" flag to test experiment callback so that gemmsup kernel
    tests can repeat over all microkernel shapes. This is a very hacky
    solution but something more proper would require a much more
    substantial change for a very narrow use-case.
  - Add new interface type so that we can force all mixtures of row- and
    column-storage for gemmsup. This is necessary to test all kernels.
  - Some arcane trickery with the storage type (`stor3_t stor_id`), the
    storage (row/column) of parameters, and transposition in order to a)
    actually test all kernels, and b) ensure that kernels receive operands
    which meet the expected constraints. For example, the RCC kernel is
    "non-primary" for a row-major kernel, and so needs to be adjusted so
    that storage is actually like CRR.
  - It seems highly likely that a mixture of row- and column-preferential
    gemmsup kernels will *NEVER* work. There probably shouldn't be
    separate preferences for each kernel.
- Test gemm ukrs for all microtile sizes.
- Add Makefile option to echo testsuite output.
  - Use `make T=1 ...` to echo testsuite output as it is run.
  - Output is still written to `output.testsuite` regardless.
  - Have CI tests use this option to prevent timeouts due to inactivity.
- Initialize all reference sup blocksizes.
- Add flags (gcc+llvm) to make `void*` arithmetic an error.
- Fix bug with firestorm (Apple M-series) gemmsup.
  - One of the armv8a gemmsup kernels was performing arithmetic on `void*`.
  - Only firestorm is affected as of now, but potentially other arm64
    architectures if gemmsup were to be enabled.
  - Manifested as UB (typically incorrect numerical result).
  - Thanks to Oliver Grisel (@ogrisel) for reporting and sending MRE.
(cherry picked from commit 5e22e1c)
(cherry picked from commit 6a161f2)
Fixes #930. The first couple of commits only establish a failing CI baseline; the actual fix will follow shortly.