Skip to content

Fix firestorm gemmsup#933

Merged
devinamatthews merged 11 commits into
masterfrom
fix-firestorm-gemmsup
Jun 25, 2026
Merged

Fix firestorm gemmsup#933
devinamatthews merged 11 commits into
masterfrom
fix-firestorm-gemmsup

Conversation

@devinamatthews

Copy link
Copy Markdown
Member

Fixes #930. The first couple commits only establish a failing CI baseline---actual fix to come shortly.

Details:
- Add "repeat" flag to test experiment callback so that gemmsup kernel
  tests can repeat over all microkernel shapes. This is a very hacky
  solution but something more proper would require a much more
  substantial change for a very narrow use-case.
- Add new interface type so that we can force all mixtures of row- and
  column-storage for gemmsup. This is necessary to test all kernels.
- Some arcane trickery with the storage type (`stor3_t stor_id`), the
  storage (row/column) of parameters, and transposition in order to a)
  actually test all kernels, and b) ensure that kernels receive operands
  which meet the expected constraints. For example, the RCC kernel is
  "non-primary" for a row-major kernel, and so need to be adjusted so
  that storage is actually like CRR.
- It seems highly likely that a mixture of row- and column-preferential
  gemmsup kernels will *NEVER* work. There probably shouldn't be
  separate preferences for each kernel.
Details:
- One of the armv8a gemmsup kernels was performing arithmetic on `void*`.
- Only firestorm is affected as of now, but potentially other arm64
  architectures if gemmsup were to be enabled.
- Manifested as UB (typically incorrect numerical result).
- Thx to Oliver Grisel (@ogrisel) for reporting and sending MRE.
Details:
- Use `make T=1 ...` to echo testsuite output as it is run.
- Output is still written to `output.testsuite` regardless.
- Have CI tests use this option to prevent timeouts due to inactivity.
@devinamatthews devinamatthews merged commit 5e22e1c into master Jun 25, 2026
15 checks passed
@devinamatthews devinamatthews deleted the fix-firestorm-gemmsup branch June 25, 2026 20:01
devinamatthews added a commit that referenced this pull request Jun 25, 2026
Details:
- Add comprehensive testing for gemmsup kernels.
- Add "repeat" flag to test experiment callback so that gemmsup kernel
  tests can repeat over all microkernel shapes. This is a very hacky
  solution but something more proper would require a much more
  substantial change for a very narrow use-case.
- Add new interface type so that we can force all mixtures of row- and
  column-storage for gemmsup. This is necessary to test all kernels.
- Some arcane trickery with the storage type (`stor3_t stor_id`), the
  storage (row/column) of parameters, and transposition in order to a)
  actually test all kernels, and b) ensure that kernels receive operands
  which meet the expected constraints. For example, the RCC kernel is
  "non-primary" for a row-major kernel, and so need to be adjusted so
  that storage is actually like CRR.
- It seems highly likely that a mixture of row- and column-preferential
  gemmsup kernels will *NEVER* work. There probably shouldn't be
  separate preferences for each kernel.
- Test gemm ukrs for all microtile sizes.
- Add Makefile option to echo testsuite output.
  - Use `make T=1 ...` to echo testsuite output as it is run.
  - Output is still written to `output.testsuite` regardless.
  - Have CI tests use this option to prevent timeouts due to inactivity.
- Initialize all reference sup blocksizes.
- Add flags (gcc+llvm) to make `void*` arith. an error
- Fix bug with firestorm (Apple M-series) gemmsup.
  - One of the armv8a gemmsup kernels was performing arithmetic on `void*`.
  - Only firestorm is affected as of now, but potentially other arm64
    architectures if gemmsup were to be enabled.
  - Manifested as UB (typically incorrect numerical result).
- Thx to Oliver Grisel (@ogrisel) for reporting and sending MRE.

(cherry picked from commit 5e22e1c)
devinamatthews added a commit that referenced this pull request Jun 25, 2026
Details:
- Fix bug with firestorm (Apple M-series) gemmsup.
  - One of the armv8a gemmsup kernels was performing arithmetic on `void*`.
  - Only firestorm is affected as of now, but potentially other arm64
    architectures if gemmsup were to be enabled.
  - Manifested as UB (typically incorrect numerical result).
- Thx to Oliver Grisel (@ogrisel) for reporting and sending MRE.

(cherry picked from commit 5e22e1c)
devinamatthews added a commit that referenced this pull request Jun 25, 2026
Details:
- Add comprehensive testing for gemmsup kernels.
- Add "repeat" flag to test experiment callback so that gemmsup kernel
  tests can repeat over all microkernel shapes. This is a very hacky
  solution but something more proper would require a much more
  substantial change for a very narrow use-case.
- Add new interface type so that we can force all mixtures of row- and
  column-storage for gemmsup. This is necessary to test all kernels.
- Some arcane trickery with the storage type (`stor3_t stor_id`), the
  storage (row/column) of parameters, and transposition in order to a)
  actually test all kernels, and b) ensure that kernels receive operands
  which meet the expected constraints. For example, the RCC kernel is
  "non-primary" for a row-major kernel, and so need to be adjusted so
  that storage is actually like CRR.
- It seems highly likely that a mixture of row- and column-preferential
  gemmsup kernels will *NEVER* work. There probably shouldn't be
  separate preferences for each kernel.
- Test gemm ukrs for all microtile sizes.
- Add Makefile option to echo testsuite output.
  - Use `make T=1 ...` to echo testsuite output as it is run.
  - Output is still written to `output.testsuite` regardless.
  - Have CI tests use this option to prevent timeouts due to inactivity.
- Initialize all reference sup blocksizes.
- Add flags (gcc+llvm) to make `void*` arith. an error
- Fix bug with firestorm (Apple M-series) gemmsup.
  - One of the armv8a gemmsup kernels was performing arithmetic on `void*`.
  - Only firestorm is affected as of now, but potentially other arm64
    architectures if gemmsup were to be enabled.
  - Manifested as UB (typically incorrect numerical result).
- Thx to Oliver Grisel (@ogrisel) for reporting and sending MRE.

(cherry picked from commit 5e22e1c)
(cherry picked from commit 6a161f2)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Corrupted GEMM results on macOS arm64 (M1 / firestorm)

1 participant