cranelift(aarch64): lower bare ctz/clz boolean tests via tst/cmp+Cond #13336

Draft
ggreif wants to merge 2 commits into bytecodealliance:main from ggreif:gabor/ctz-clz-brif-lowering-aarch64

@ggreif ggreif commented May 11, 2026

aarch64 analogue of #13334; egraph counterpart in #13332.

Stacked on #13334. The first commit in this PR is from #13334 (x64); the aarch64-specific change is the second (HEAD) commit. Mark as ready / merge after #13334 lands.

Same shape as the x64 follow-up: specialise `is_nonzero (ctz X)` / `is_nonzero (clz X)` (and their `ireduce`-wrapped variants) in `cranelift/codegen/src/isa/aarch64/inst.isle`, so the wasm-natural `brif (ireduce.i32 (ctz.i64 X))` shape lowers to a single bit-test instead of `rbit; clz; cmp; b.cond`.
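To make the shape of the specialisation concrete, here is a minimal Rust sketch of the four cases (`ctz`/`clz`, bare or `ireduce`-wrapped). It is a toy model only: `Expr`, `Test`, and `lower_is_nonzero` are hypothetical names invented for illustration and are not Cranelift's IR or the actual ISLE rules in this PR.

```rust
/// Hypothetical toy expression type; not Cranelift's real IR.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Expr {
    Var(&'static str),
    Ctz(Box<Expr>),
    Clz(Box<Expr>),
    Ireduce(Box<Expr>),
}

/// Hypothetical description of how a "value is nonzero" test gets lowered.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Test {
    /// True when the operand's least significant bit is clear.
    LsbClear(Expr),
    /// True when the operand's sign bit is clear.
    SignBitClear(Expr),
    /// Fallback: compute the value and compare it with zero.
    Generic(Expr),
}

/// Pick a lowering for "is `e` nonzero?", covering ctz/clz, bare or
/// wrapped in a single ireduce. Everything else takes the generic path.
fn lower_is_nonzero(e: &Expr) -> Test {
    match e {
        Expr::Ctz(x) => Test::LsbClear((**x).clone()),
        Expr::Clz(x) => Test::SignBitClear((**x).clone()),
        Expr::Ireduce(inner) => match &**inner {
            Expr::Ctz(x) => Test::LsbClear((**x).clone()),
            Expr::Clz(x) => Test::SignBitClear((**x).clone()),
            _ => Test::Generic(e.clone()),
        },
        _ => Test::Generic(e.clone()),
    }
}
```

In this model, `Ireduce(Ctz(Var("x")))` maps to `LsbClear(Var("x"))`, while a plain `Var("x")` falls through to `Generic`.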

aarch64-specific instructions used:

  • ctz: tst Xn, #1 (logical AND with immediate, flags only) + Cond.Eq — branches when LSB is clear.
  • clz: cmp Xn, #0 + Cond.Pl — branches when sign bit (N flag) is clear, i.e. X is signed-non-negative.
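The two equivalences above can be checked with a small Rust model of the N and Z flags. This is a sketch under stated assumptions: `Flags`, `tst`, `cmp_zero`, `cond_eq`, and `cond_pl` are hypothetical helpers that model only the flag behaviour described in the list, for 64-bit operands, and are not part of Cranelift.

```rust
/// Hypothetical model of the two condition flags used here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Flags {
    n: bool, // negative: top bit of the result
    z: bool, // zero: result is zero
}

/// Models `tst x, #imm`: flags from `x & imm`, result discarded.
fn tst(x: u64, imm: u64) -> Flags {
    let r = x & imm;
    Flags { n: (r >> 63) == 1, z: r == 0 }
}

/// Models `cmp x, #0`: N and Z from `x - 0`, which is just `x`.
fn cmp_zero(x: u64) -> Flags {
    Flags { n: (x >> 63) == 1, z: x == 0 }
}

/// `Cond.Eq` holds when Z is set.
fn cond_eq(f: Flags) -> bool {
    f.z
}

/// `Cond.Pl` holds when N is clear.
fn cond_pl(f: Flags) -> bool {
    !f.n
}

/// Reference semantics: count the zeros, then test the count.
fn ctz_is_nonzero(x: u64) -> bool {
    x.trailing_zeros() != 0
}
fn clz_is_nonzero(x: u64) -> bool {
    x.leading_zeros() != 0
}

/// Specialised form: test a single bit of the input instead.
fn ctz_is_nonzero_fast(x: u64) -> bool {
    cond_eq(tst(x, 1))
}
fn clz_is_nonzero_fast(x: u64) -> bool {
    cond_pl(cmp_zero(x))
}
```

Note the zero input: both counts equal the bit width (64), so both tests are true, and the flag model agrees because zero has a clear LSB and a clear sign bit.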

Test deltas (tests/disas/aarch64-ctz-clz-bool-condition.wat, newly added):

| consumer | before | after |
|---|---|---|
| `if_ctz_bare_i32` | 4 insns (`rbit` + `clz` + ...) | 2 (`tst w4, #1; b.eq`) |
| `if_ctz_bare_i64` | 4 insns | 2 (`tst x4, #1; b.eq`) |
| `if_clz_bare_i32` | 4 insns (`clz` + ...) | 2 (`cmp w4, #0; b.pl`) |

Negative test ((ctz X) == 4) correctly untouched. Same motivation as #13334 — closes the gap for non-Rust wasm frontends like Motoko's moc.

riscv64 and s390x to follow.

ggreif and others added 2 commits May 11, 2026 17:57
Follow-up to bytecodealliance#13332. That PR added egraph rules collapsing
`(eq (ctz X) 0)` / `(ne (ctz X) 0)` / clz analogues to direct
LSB / sign-bit tests — but only when the comparison is mediated by an
explicit `icmp`. The wasm front-end translates `wasm if (ctz X)` to
`brif (ireduce.i32 (ctz.i64 X))` directly (no `icmp`), so the egraph
rules don't fire on the wasm-natural shape.

This commit closes the gap by specialising `is_nonzero` in the x64
backend — the helper that all `brif`/`select`/`trapif` lowerings
funnel through. Four rules: `ctz`/`clz` × bare/`ireduce`-wrapped.

The `ireduce` variant catches the wasm front-end's `i32.wrap_i64`
over a 64-bit `ctz`/`clz` — a no-op on values in [0, bitwidth].
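
A quick Rust sketch of why the truncation is harmless: a 64-bit `ctz`/`clz` result never exceeds 64, so keeping only the low 32 bits changes neither the value nor whether it is zero. The function names below are hypothetical and only model the arithmetic, not the wasm front-end.

```rust
/// Models `ireduce.i32 (ctz.i64 x)`: count trailing zeros on 64 bits,
/// hold the count as a 64-bit value, then truncate it to 32 bits.
fn ireduce_ctz64(x: u64) -> u32 {
    let wide: u64 = u64::from(x.trailing_zeros()); // count is in 0..=64
    wide as u32 // truncation keeps the low 32 bits
}

/// Same model for `ireduce.i32 (clz.i64 x)`.
fn ireduce_clz64(x: u64) -> u32 {
    let wide: u64 = u64::from(x.leading_zeros()); // count is in 0..=64
    wide as u32
}

/// True when truncation preserved both counts exactly, and therefore
/// also preserved whether each count is zero.
fn truncation_is_lossless(x: u64) -> bool {
    u64::from(ireduce_ctz64(x)) == u64::from(x.trailing_zeros())
        && u64::from(ireduce_clz64(x)) == u64::from(x.leading_zeros())
}
```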

Test deltas (tests/disas/ctz-clz-bool-condition.wat):

  if_ctz_bare_i32:   5 insns -> 2 (testl $1, %edx; je)
  if_ctz_bare_i64:   5 insns -> 2 (testq $1, %rdx; je)
  if_clz_bare_i32:   7 insns -> 2 (testl %edx, %edx; jns)

The icmp-mediated cases (collapsed by bytecodealliance#13332's egraph rules) are
unchanged. The numeric-comparison negative test stays untouched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
aarch64 analogue of the x64 follow-up. Specialises `is_nonzero (ctz X)`
and `is_nonzero (clz X)` (plus their `ireduce`-wrapped variants) so the
wasm-natural `brif (ireduce.i32 (ctz.i64 X))` shape lowers to a single
bit-test instead of `rbit; clz; cmp; b.cond`.

  ctz: `tst Xn, #1` + `Cond.Eq` — branches when LSB is clear.
  clz: `cmp Xn, #0` + `Cond.Pl` — branches when sign bit is clear.

Test deltas (tests/disas/aarch64-ctz-clz-bool-condition.wat):

  if_ctz_bare_i32:   `tst w4, #1; b.eq`
  if_ctz_bare_i64:   `tst x4, #1; b.eq`
  if_clz_bare_i32:   `cmp w4, #0; b.pl`

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added the cranelift (Issues related to the Cranelift code generator), cranelift:area:aarch64 (Issues related to AArch64 backend), and cranelift:area:x64 (Issues related to x64 codegen) labels May 11, 2026