Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
182 changes: 182 additions & 0 deletions doc/architectural/x86-long-double.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,182 @@
\page x86-long-double Modelling x86 80-bit extended `long double`

\author Michael Tautschnig

# Summary

On x86 hardware, the `long double` C type is the x87 80-bit extended
precision floating-point format, even though it is stored in 12 bytes
(i386) or 16 bytes (x86_64) of memory. Most other modern targets use
either an IEEE 754 `binary64` (e.g. AArch64 macOS, Windows MSVC) or an
IEEE 754 `binary128` (e.g. AArch64 Linux, ppc64le) layout for `long
double`. CBMC must model the byte layout of `long double` faithfully
because programs do read `long double` bytes via union-based bit
twiddling -- most prominently the macOS SDK's `<math.h>` inline helpers
for `signbit`, `__inline_isnan`, etc.

This document describes how CBMC encodes the format and where the code
paths live.

# Hardware reference

The 80-bit extended format is documented in:

- Intel® 64 and IA-32 Architectures Software Developer's Manual, vol. 1,
§4.2.2 ("Floating-Point Data Types").
- The IA-32 ABI for Linux ("System V Application Binary Interface ‒
i386 Architecture Processor Supplement"), §3.1.2.

The relevant facts are:

- The value occupies 80 bits. Storage padding extends it to 96 bits on
i386 (`alignof == 4`) and 128 bits on x86_64 (`alignof == 16`).
- Unlike IEEE binary formats, the leading "integer bit" of the
significand (the J-bit) is **explicit**: it is stored in the
encoding, rather than being implicit-1 as in IEEE.
- Exponent width is 15 bits with bias 16383.

The on-disk byte layout, low to high (LSB first):

```
bits 0 .. 62 : explicit fraction (62 bits, no implicit leading 1)
bit 63 : explicit integer bit (J-bit)
bits 64 .. 78 : biased 15-bit exponent
bit 79 : sign bit
bits 80 .. 127 : storage padding (zero on Linux/macOS)
```

For example, on real macOS-15 x86_64 hardware (Xcode 16.4, Apple clang
17.0.0) the 16-byte little-endian dump of `1.0L` is:

```
00 00 00 00 00 00 00 80 ff 3f 00 00 00 00 00 00
```

with bytes 0-7 holding the mantissa `0x8000000000000000` (only the
explicit integer bit set), bytes 8-9 holding the biased exponent
`0x3FFF = 16383`, and bytes 10-15 being storage padding. Negation flips
bit 79 of the value, giving `... ff bf ...`.

# Modelling

`ieee_float_spect` (in `src/util/ieee_float.h`) records the format:

- `f` -- fraction width (63 for x86 extended).
- `e` -- exponent width (15 for x86 extended).
- `x86_extended` -- when true, the encoding has an explicit integer bit
at position `f`, so the value width is `f + e + 2` (sign + exponent +
J-bit + fraction) rather than `f + e + 1`.
- `storage_width_bits` -- when non-zero, the storage container has more
bits than the value; the high bits are padding zeros.

Three factory methods are provided for the x86 case:

- `ieee_float_spect::x86_80()` -- the bare 80-bit value, used internally
for solver-internal arithmetic.
- `ieee_float_spect::x86_96()` -- 80-bit value in 96-bit storage (i386
`long double`).
- `ieee_float_spect::x86_128()` -- 80-bit value in 128-bit storage
(x86_64 `long double` on Linux, macOS, FreeBSD).

The selection happens in `long_double_type()` (in
`src/util/c_types.cpp`) based on `config.ansi_c.long_double_width` and
`config.ansi_c.arch`.

# Encoding paths

Floating-point values flow through three layers, and each must agree on
the byte layout:

1. **Constants** -- `ieee_float_valuet::pack` / `unpack` in
`src/util/ieee_float.cpp` produce the `mp_integer` bit pattern that
the type checker hands off as a constant initialiser.
2. **Expression-level solver** -- `float_bvt` in
`src/solvers/floatbv/float_bv.cpp` lowers floating-point expressions
to bit-vector expressions before flattening.
3. **SAT-level solver** -- `float_utilst` in
`src/solvers/floatbv/float_utils.cpp` produces literals/clauses for
the SAT back-end directly.

For the x86 extended layout each layer:

- Places the sign bit at `value_width() - 1` (i.e. bit 79), **not** at
the top of the storage container.
- Places the biased exponent at bits `[f + 1, f + 1 + e)`.
- Places the explicit integer bit at bit `f` (== 63).
- Reads the integer bit out of the encoding rather than synthesising a
hidden bit during `unpack`.
- Zero-pads the high storage bits during `pack`.

The `boolbvt::convert_rest` handler for `ID_sign` (in
`src/solvers/flattening/boolbv.cpp`) similarly reads the sign from
`value_width() - 1` for `floatbv` operands.

# SMT2 back-ends

The SMT-LIB FloatingPoint theory only covers IEEE 754 binary formats,
which all use an *implicit* leading integer bit. The x87 extended
format stores that bit *explicitly*, and the value lives in a 96- or
128-bit container whose high bits are padding. Encoding an
x86-extended value as `(_ FloatingPoint 15 64)` would discard the
explicit J-bit and the storage padding, so it is not a faithful
encoding.

`smt2_convt::use_FPA_for_type(typet)` (in
`src/solvers/smt2/smt2_conv.cpp`) returns `false` for x86-extended
`floatbv` types regardless of the global `use_FPA_theory` flag, and
`true` otherwise (delegating to `use_FPA_theory`). All dispatch sites
in `smt2_conv.cpp` consult this helper, which means:

- The SMT-LIB sort emitted for an x86-extended type is always
`(_ BitVec storage_width)` (128 on x86_64, 96 on i386).
- All FP operations on x86-extended operands lower through the
existing `convert_floatbv` path, which emits the `float_bv.*` helper
functions defined elsewhere in the same module.
- Other floating-point types (binary32, binary64, binary128) continue
to use the SMT-LIB FloatingPoint theory on solvers that support it
(currently Z3, CVC5, Bitwuzla, and the cprover-smt2 solver).

The new SMT2 back-end (`--incremental-smt2-solver`, in
`src/solvers/smt2_incremental/`) does not yet implement the
byte-extract / concatenation lowerings the byte-layout regression
tests rely on; those tests are therefore tagged `no-new-smt`.

# Why this matters

Before this change, CBMC modelled `long double` on x86_64 as IEEE
`binary128` (`quadruple_precision`). The two formats agree on the
*value* of representable numbers but disagree on every individual *bit
position*: in `binary128` the sign sits at bit 127, the exponent at
bits 112-126, and there is no explicit integer bit.

The macOS SDK (and similar code in BSD and embedded systems) reaches
into `long double` via a union to extract the sign or classify the
value. With the binary128 model the `__sexp` field of the SDK's
classification union (bytes 8-9) is always zero, so
`signbit(-1.0L)` evaluates to `0` and dependent control flow never
sees a negative value -- a soundness bug in any analysis that relies
on these classifications.

# Testing

The fix is exercised from several angles:

- `regression/cbmc/long-double-x86-bytes` and
`regression/cbmc/long-double-i386-bytes` check the byte layout
against hardware-captured reference bytes.
- `regression/cbmc/long-double-signbit-x86` replicates the macOS SDK
signbit pattern.
- `regression/cbmc/long-double-roundtrip-x86` exercises
`double <-> long double` conversion, including the symbolic case.
- `regression/cbmc/long-double-pseudo-encodings` checks that
`__CPROVER_isnanld` / `__CPROVER_isinfld` / `__CPROVER_isnormalld`
reject the x86 pseudo-encodings (pseudo-NaN, pseudo-Infinity,
pseudo-denormal-with-non-zero-exponent), in agreement with modern
Intel/AMD hardware.
- `regression/cbmc-library/{expl,exp2l,logl,log2l,log10l,__builtin_powil}`
cover the long-double-only Schraudolph-style fast-math models in
`src/ansi-c/library/math.c`, which were updated in lockstep so that
the bit manipulations target the x86-extended layout when active.
- `unit/util/ieee_float.cpp` covers the `ieee_float_spect` factories
and `ieee_float_valuet::pack` / `unpack` against the same hardware
references.
67 changes: 67 additions & 0 deletions regression/cbmc/long-double-arithmetic-x86/main.i
Original file line number Diff line number Diff line change
@@ -0,0 +1,67 @@
// Pure long-double FP arithmetic. No byte-level access to the
// underlying storage, so no bv<->FP boundary; this exercises the FP
// semantics on long double across all four CBMC back-ends.
//
// On x86_64 with x86 80-bit extended `long double`, the arithmetic
// semantics match IEEE binary79 for valid encodings (and our SAT
// backend encodes the layout precisely), so the assertions below hold
// under SAT, --z3, --cprover-smt2 and --incremental-smt2-solver.
//
// This complements the byte-layout tests, which exercise the FP<->bv
// boundary and are restricted to back-ends that fully support it.
//
// The test is hermetic (no system headers) and provided as `main.i`
// so CBMC parses it without invoking the host's preprocessor. It can
// therefore run on any host -- including FreeBSD/OpenBSD CI runners
// that don't ship with gcc and arm64 Linux runners that don't have
// x86_64 cross-compilation tooling -- as long as the test.desc fixes
// `--arch x86_64 --os linux` so CBMC models x86 80-bit extended
// regardless of the host's native long-double layout.

int main(void)
{
// 1. Exact addition / multiplication / division on representable
// values.
long double a = 1.0L;
long double b = 2.0L;
long double c = a + b;
__CPROVER_assert(c == 3.0L, "1.0L + 2.0L == 3.0L");

long double d = c * 2.0L;
__CPROVER_assert(d == 6.0L, "3.0L * 2.0L == 6.0L");

long double e = d / 3.0L;
__CPROVER_assert(e == 2.0L, "6.0L / 3.0L == 2.0L");

// 2. Sign preservation under negation.
long double f = 3.5L;
__CPROVER_assert(-f < 0.0L, "negation gives < 0");
__CPROVER_assert(f > 0.0L, "positive is > 0");
__CPROVER_assert(-(-f) == f, "double negation");

// 3. Long double has 64 bits of significand precision (well above
// the 53 bits of binary64), so 1.0L + 2^-60 must not equal
// 1.0L. This would round to 1.0 if the model were treating long
// double as `double`.
long double tiny = 1.0L;
for(int i = 0; i < 60; ++i)
tiny /= 2.0L;
long double almost_one = 1.0L + tiny;
__CPROVER_assert(almost_one != 1.0L, "long double has > 60 bits precision");

// 4. Reciprocal of a wide-range value: 1/x for very large x is
// positive and finite. At 2^-100 the result is comfortably
// *normal* in extended precision (whose subnormal threshold is
// around 2^-16382), so this exercises the wide exponent range
// rather than subnormal handling. TODO: add a separate test
// for genuine subnormal arithmetic once the pack/unpack of
// canonical x86 denormals is exercised end-to-end.
long double large = 1.0L;
for(int i = 0; i < 100; ++i)
large *= 2.0L;
long double recip = 1.0L / large;
__CPROVER_assert(recip > 0.0L, "1/2^100 > 0");
__CPROVER_assert(recip < 1.0L, "1/2^100 < 1");

return 0;
}
22 changes: 22 additions & 0 deletions regression/cbmc/long-double-arithmetic-x86/test.desc
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
CORE
main.i
--arch x86_64 --os linux --floatbv
^EXIT=0$
^SIGNAL=0$
^VERIFICATION SUCCESSFUL$
--
^warning: ignoring
--
Pure FP arithmetic on x86 80-bit extended `long double` -- no byte-level
access, no FP<->bv boundary. Runs under all four back-ends (SAT,
--z3, --cprover-smt2, --incremental-smt2-solver), demonstrating that
long-double FP arithmetic is precise across the board. Byte-layout
testing is in the long-double-{x86-bytes,i386-bytes,signbit-x86,...}
tests, which are restricted to back-ends that support the FP<->bv
boundary.

The source is provided as a pre-processed `main.i` so CBMC parses it
without invoking the host preprocessor; combined with `--arch x86_64
--os linux`, this lets the test run on hosts whose native long-double
layout differs from x86 80-bit extended (e.g. ARM Linux, macOS
arm64) and on hosts that don't ship gcc (FreeBSD, OpenBSD).
63 changes: 63 additions & 0 deletions regression/cbmc/long-double-i386-bytes/main.i
Original file line number Diff line number Diff line change
@@ -0,0 +1,63 @@
// Same byte-layout invariants as long-double-x86-bytes but for i386
// (32-bit x86), where `long double` is the 80-bit x87 extended-precision
// format stored in 12 bytes. The 80-bit value layout is identical to
// x86_64; only the storage container size differs.
//
// The test is hermetic (no system headers) and provided as `main.i`
// so CBMC parses it without invoking the host's preprocessor. It can
// therefore run on any host (in particular hosts without i386
// cross-compilation tooling installed) provided the test.desc fixes
// `--i386-linux` so CBMC models the 12-byte layout regardless.

typedef unsigned char u8;
typedef unsigned short u16;
typedef unsigned long long u64;

typedef union
{
long double ld;
u8 bytes[12];
struct
{
u64 mantissa;
u16 sign_exp;
u16 pad;
} parts;
} ld_layout_t;

int main(void)
{
// 1. Storage size is 12 bytes on i386.
__CPROVER_assert(sizeof(long double) == 12, "12-byte storage");

// 2. The 80-bit value of 1.0L is mantissa = 0x8000000000000000,
// sign_exp = 0x3FFF (sign 0, biased exponent 0x3FFF = 16383).
ld_layout_t one;
one.ld = 1.0L;
__CPROVER_assert(
one.parts.mantissa == 0x8000000000000000ULL, "mantissa for 1.0L");
__CPROVER_assert(one.parts.sign_exp == 0x3FFF, "sign_exp for 1.0L");

// 3. Negation flips the sign bit at the top of sign_exp.
ld_layout_t neg_one;
neg_one.ld = -1.0L;
__CPROVER_assert(
neg_one.parts.mantissa == 0x8000000000000000ULL, "mantissa for -1.0L");
__CPROVER_assert(neg_one.parts.sign_exp == 0xBFFF, "sign_exp for -1.0L");

// 4. The byte view matches the 12-byte little-endian sequence.
__CPROVER_assert(one.bytes[0] == 0x00, "byte 0");
__CPROVER_assert(one.bytes[1] == 0x00, "byte 1");
__CPROVER_assert(one.bytes[2] == 0x00, "byte 2");
__CPROVER_assert(one.bytes[3] == 0x00, "byte 3");
__CPROVER_assert(one.bytes[4] == 0x00, "byte 4");
__CPROVER_assert(one.bytes[5] == 0x00, "byte 5");
__CPROVER_assert(one.bytes[6] == 0x00, "byte 6");
__CPROVER_assert(one.bytes[7] == 0x80, "byte 7");
__CPROVER_assert(one.bytes[8] == 0xFF, "byte 8");
__CPROVER_assert(one.bytes[9] == 0x3F, "byte 9");
__CPROVER_assert(one.bytes[10] == 0x00, "byte 10");
__CPROVER_assert(one.bytes[11] == 0x00, "byte 11");

return 0;
}
17 changes: 17 additions & 0 deletions regression/cbmc/long-double-i386-bytes/test.desc
Original file line number Diff line number Diff line change
@@ -0,0 +1,17 @@
CORE
main.i
--i386-linux --floatbv
^EXIT=0$
^SIGNAL=0$
^VERIFICATION SUCCESSFUL$
--
^warning: ignoring
--
Verifies the byte layout of x86 80-bit extended `long double` on i386
(12-byte storage container). The 80-bit value layout is identical to
x86_64; only the storage container size differs.

The source is provided as a pre-processed `main.i` so CBMC parses it
without invoking the host preprocessor; combined with `--i386-linux`,
this lets the test run on any host (in particular hosts without i386
multilib system headers and ARM hosts whose gcc lacks `-m32`).
31 changes: 31 additions & 0 deletions regression/cbmc/long-double-math-x86/main.i
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
// Host-independent coverage for the x86 80-bit extended `long double`
// fast-math library models (expl / logl), which encode their
// Schraudolph-style approximation directly in the x86-extended bit
// layout (see src/ansi-c/library/math.c and the
// __cprover_x86ld_{to,from}_schraudolph helpers).
//
// The corresponding regression/cbmc-library/{expl,logl,...} tests run
// under the *host* architecture, so they only exercise the x86-extended
// path when CI happens to run on x86_64. This test pins `--arch x86_64`
// (via test.desc) and is provided pre-processed (`main.i`, no system
// headers), so the x86-extended models are verified regardless of the
// runner's native long-double layout.
//
// The asserted ranges mirror the cbmc-library expl / logl tests; they
// are wide because the models are deliberate approximations.

long double expl(long double);
long double logl(long double);

int main(void)
{
long double e = expl(1.0l);
__CPROVER_assert(e > 2.713l && e < 2.886l, "expl(1.0L) approximates e");

// ln(e) ~ 1, with e written as a long double literal so the test needs
// no <math.h>.
long double one = logl(2.718281828459045235360287471352662498l);
__CPROVER_assert(one > 0.942l && one < 1.002l, "logl(e) approximates 1");

return 0;
}
Loading
Loading