riscv_fpu: RMM (roundTiesToAway) for the scalar OP-FP ops #242

Draft
SolAstrius wants to merge 2 commits into LekKit:staging from pufit:fix/rmm-roundtiestoaway

@SolAstrius

Summary

Fix RMM (round to nearest, ties to max magnitude — IEEE 754 roundTiesToAway) for the scalar arithmetic ops, addressing #204. Standalone on staging; no dependency on the other in-flight FPU work.

The host FPU has no RMM mode, so under frm == RMM the host is left in round-to-nearest-even. RNE and roundTiesToAway differ only on an exact halfway tie. The previous riscv_prepare_rmm rounded toward ±inf unconditionally, which is correct on ties but wrong for any inexact result whose nearest neighbor lies toward zero. This replaces it with: compute in RNE, recover the exact rounding error via the library's error-free transforms (TwoSum / TwoProduct), and step one ULP outward only on a true half-ULP tie.
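For illustration, the compute-in-RNE-then-fix-up idea can be sketched for a double-precision add roughly as follows. This is a minimal sketch, not the code in this PR: the names two_sum, step_away and fadd_d_ties_away are hypothetical, it assumes the host evaluates double arithmetic in plain IEEE 754 binary64 under round-to-nearest-even (no x87 excess precision, no -ffast-math), and it leaves out exception-flag handling entirely.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Knuth's TwoSum (hypothetical helper name): returns s = fl(a + b) and
 * stores err such that a + b == s + err exactly, provided the host is in
 * round-to-nearest and s does not overflow. */
static double two_sum(double a, double b, double *err)
{
    double s  = a + b;
    double bb = s - a;
    *err = (a - (s - bb)) + (b - bb);
    return s;
}

/* Next double with larger magnitude. IEEE 754 uses a sign-magnitude
 * encoding, so adding 1 to the bit pattern of a finite value moves it
 * away from zero (DBL_MAX becomes infinity). */
static double step_away(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits += 1;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Assumed shape of a ties-away add: RNE result, corrected only on a tie. */
static double fadd_d_ties_away(double a, double b)
{
    double err;
    double s = two_sum(a, b, &err);

    /* Exact, overflowed or NaN results need no correction. */
    if (!isfinite(s) || err == 0.0)
        return s;

    /* If the exact sum lies on the toward-zero side of s, RNE has already
     * picked the larger-magnitude neighbor. */
    if ((err > 0.0) != (s > 0.0))
        return s;

    /* A true tie: the residual is exactly half the gap to the neighbor
     * away from zero. Both sides carry the sign of s here. */
    double next = step_away(s);
    if (err == 0.5 * (next - s))
        return next;

    return s;
}
```

Under these assumptions, fadd_d_ties_away(1.0, 0x1p-53) returns 1.0 + 0x1p-52 where plain RNE returns 1.0, while fadd_d_ties_away(1.0, 0x1p-54) stays at 1.0 because the exact sum is not a halfway case.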

What's covered

  • fadd / fsub / fmul / fdiv, both .s and .d.
  • fmul near the underflow boundary, where Dekker's product error itself underflows — handled with an exact widened (f32) / scaled-residual (f64) tie test.
  • subnormal fdiv quotients, the one division case that can land on an exact tie.
  • fsqrt needs no fixup (a square root is never an exact halfway case).
  • Flag-isolated throughout: the error-free transforms do raw host arithmetic whose intermediate steps can raise spurious NV/OF, so each wrapper snapshots and restores the exception flags; the genuine flags come from the base op.
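The widened tie test mentioned for single-precision fmul can be sketched as below. This is an illustration under stated assumptions rather than the PR's implementation: fmul_s_ties_away and step_away_f are hypothetical names, the host is assumed to evaluate float and double in IEEE 754 binary32/binary64 under round-to-nearest-even, and the snapshot/restore of exception flags described above is omitted. The point it relies on is that the product of two 24-bit significands fits in a double's 53 bits, so the double product is exact even where the float result is subnormal.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Next float with larger magnitude: sign-magnitude encoding, so +1 on the
 * bit pattern of a finite value steps away from zero. */
static float step_away_f(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits += 1;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Assumed shape of a ties-away single-precision multiply. */
static float fmul_s_ties_away(float a, float b)
{
    float r = a * b;            /* base op, rounded by the host in RNE */

    if (isfinite(r)) {
        /* Exact product: 24 x 24 significand bits fit in 53, and the
         * exponent stays well inside double's normal range. */
        double exact = (double)a * (double)b;

        /* Midpoint between r and its neighbor away from zero. Two
         * adjacent floats sum exactly in double; if the neighbor is
         * infinity, the midpoint is infinite and never matches. */
        float  next = step_away_f(r);
        double mid  = 0.5 * ((double)r + (double)next);

        if (exact == mid)
            r = next;           /* true tie: take the larger magnitude */
    }
    return r;
}
```

With this sketch, fmul_s_ties_away(0x1p-75f, 0x1p-75f) returns 0x1p-149f: the exact product 2^-150 sits halfway between zero and the smallest subnormal, so RNE gives zero and ties-away gives the subnormal. No residual has to be formed in float, which is what makes the near-underflow case tractable here.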

Scope / draft notes

Deliberately minimal and independent, so it can be reviewed on its own (per review feedback that RMM is too complex to evaluate bundled with the rest):

  • Dynamic RMM only (frm == RMM, rm == DYN). The static ,rmm instruction suffix needs host-rounding-mode forcing and lands with the rounding-mode rework.
  • OP-FP only — no FMA. The FMA-family ties-away fixup depends on the FMA rounding path; under RMM the FMA ops still round ties-to-even here, to be completed with the FMA work.
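As a rough illustration of the "dynamic RMM only" scope, the condition could be written as the predicate below. The function and constant names are hypothetical and not taken from riscv_fpu.c; the numeric values are the rounding-mode encodings from the RISC-V F extension (RMM = 0b100, DYN = 0b111).

```c
#include <stdbool.h>
#include <stdint.h>

/* Rounding-mode encodings from the RISC-V F extension. */
#define RM_RMM 0x4u   /* round to nearest, ties to max magnitude */
#define RM_DYN 0x7u   /* instruction rm field: use the mode in frm */

/* Hypothetical predicate: the ties-away fixup applies only when the
 * instruction selects the dynamic mode and frm currently holds RMM.
 * A static rm == RMM encoding is deliberately not covered. */
static bool rmm_fixup_applies(uint32_t insn_rm, uint32_t frm)
{
    return insn_rm == RM_DYN && frm == RM_RMM;
}
```

Under this reading, an instruction encoded with a static rmm suffix (insn_rm == 0x4) returns false and keeps whatever behavior the existing rounding-mode path gives it.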

Draft until the companion FPU PRs settle (it lightly overlaps the fadd/fsub/fmul/fdiv read sites with the canonicalization/mal-box PR, so ordering may want a trivial rebase).

Touches src/cpu/riscv_fpu.c only.

Signed-off-by: Sol Astrius Phoenix <sol@astrius.ink>
(cherry picked from commit 685255b)
…low boundary

Signed-off-by: Sol Astrius Phoenix <sol@astrius.ink>
(cherry picked from commit 81d257c)