
smt2: add missing separator in bvfromfloat equality assertion #9030

Closed
tautschnig wants to merge 1 commit into diffblue:develop from tautschnig:bvfromfloat-space
@tautschnig

Collaborator

The bvfromfloat mechanism in smt2_convt::find_symbols emits, for a bit-wise typecast from an FPA-encoded float to a bit-vector, an assertion of the form

(assert (= ((_ to_fp e f) <bv>)<float>))

with no whitespace between the closing parenthesis of the to_fp application and the second operand of the equality. Most SMT-LIB tokenizers treat ) as a token boundary, but CPROVER's own smt2 tokenizer does not, so it reads )|float| as a single (invalid) token and rejects the assertion.

This path is only reached when reading the raw bytes of an FPA-encoded float (e.g. a union-based bit access under --cprover-smt2 or an FPA solver), which is itself subject to other backend limitations, so the bug was latent. Emit the missing space.
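The shape of the emitted assertion, with the separator in place, can be sketched in isolation. The helper below is hypothetical and not part of CBMC: its name, its parameters, and the assumption that both operands arrive as already-rendered strings are for illustration only. It mirrors the stream-based emission style, writing a space after the closing parenthesis of the `to_fp` application.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper, not CBMC code: builds the equality assertion from
// already-rendered operands. exponent_bits and significand_bits play the
// roles of e and f in the assertion shown above.
std::string bvfromfloat_assertion(
  unsigned exponent_bits,
  unsigned significand_bits,
  const std::string &bv_id,
  const std::string &float_expr)
{
  std::ostringstream out;
  out << "(assert (= ";
  // First operand: the to_fp application on the bit-vector symbol,
  // followed by a space that separates it from the second operand.
  out << "((_ to_fp " << exponent_bits << " " << significand_bits << ") "
      << bv_id << ") ";
  // Second operand: the float expression, then close "=" and "assert".
  out << float_expr << "))";
  return out.str();
}
```

Under these assumptions, `bvfromfloat_assertion(8, 24, "|bv|", "|float|")` produces `(assert (= ((_ to_fp 8 24) |bv|) |float|))`.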

  • Each commit message has a non-empty body, explaining why the change was made.
  • n/a Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • n/a The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • n/a My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • n/a White-space or formatting changes outside the feature-related changed lines are in commits of their own.

The `bvfromfloat` mechanism in smt2_convt::find_symbols emits, for a
bit-wise `typecast` from an FPA-encoded float to a bit-vector, an
assertion of the form

  (assert (= ((_ to_fp e f) <bv>)<float>))

with no whitespace between the closing parenthesis of the `to_fp`
application and the second operand of the equality.  Most SMT-LIB
tokenizers treat `)` as a token boundary, but CPROVER's own smt2
tokenizer does not, so it reads `)|float|` as a single (invalid)
token and rejects the assertion.

This path is only reached when reading the raw bytes of an
FPA-encoded float (e.g. a union-based bit access under --cprover-smt2
or an FPA solver), which is itself subject to other backend
limitations, so the bug was latent.  Emit the missing space.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
@tautschnig tautschnig self-assigned this Jun 8, 2026
Copilot AI review requested due to automatic review settings June 8, 2026 10:33

Copilot AI left a comment


Pull request overview

This PR adjusts SMT-LIB emission in smt2_convt::find_symbols to ensure a missing separator (whitespace) is included in the bvfromfloat equality assertion, preventing the generated SMT2 from being malformed for some consumers.

Changes:

  • Add a space after the first operand of an equality assertion emitted for bit-wise floatbv -> bv typecasts under FPA theory.
  • Keep the overall assertion structure the same while ensuring token separation between operands.


Comment on lines 5727 to 5730
      out << "(assert (= ";
      out << "((_ to_fp " << floatbv_type.get_e() << " "
-         << floatbv_type.get_f() + 1 << ") " << id << ')';
+         << floatbv_type.get_f() + 1 << ") " << id << ") ";
      convert_expr(tc.op());
@kroening

kroening commented Jun 8, 2026

Collaborator

Should ) always be a token boundary?

@tautschnig

Collaborator Author

> Should ) always be a token boundary?

SMT-LIB says: "The lexical tokens of the language are the parenthesis characters ( and ), the elements of the syntactic categories ⟨numeral⟩, ⟨decimal⟩, ⟨hexadecimal⟩, ⟨binary⟩, ⟨string⟩, ⟨symbol⟩, ⟨keyword⟩, as well as a number of reserved words, all defined below together with a few auxiliary syntactic categories." So ) should indeed be a token boundary.
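To make the token-boundary rule concrete, here is a minimal sketch of a splitter in which each parenthesis is a token of its own. It is illustrative only and is not CPROVER's smt2 tokenizer: the function name `split_tokens` is hypothetical, and string literals, comments, and keywords are deliberately left out.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Illustrative only, not CPROVER's tokenizer. '(' and ')' are
// single-character tokens, |...| is read as one quoted symbol, and any
// other token ends at whitespace or at a parenthesis.
std::vector<std::string> split_tokens(const std::string &src)
{
  std::vector<std::string> tokens;
  std::size_t i = 0;
  while(i < src.size())
  {
    const char c = src[i];
    if(std::isspace(static_cast<unsigned char>(c)))
    {
      ++i; // whitespace only separates tokens
    }
    else if(c == '(' || c == ')')
    {
      tokens.push_back(std::string(1, c)); // a parenthesis is its own token
      ++i;
    }
    else if(c == '|')
    {
      // quoted symbol: everything up to and including the closing bar
      std::size_t end = src.find('|', i + 1);
      if(end == std::string::npos)
        end = src.size() - 1; // unterminated: take the rest (sketch only)
      tokens.push_back(src.substr(i, end - i + 1));
      i = end + 1;
    }
    else
    {
      // simple symbol or numeral: stop at whitespace or a parenthesis
      std::size_t end = i;
      while(end < src.size() && src[end] != '(' && src[end] != ')' &&
            !std::isspace(static_cast<unsigned char>(src[end])))
        ++end;
      tokens.push_back(src.substr(i, end - i));
      i = end;
    }
  }
  return tokens;
}
```

With this rule, `(= ((_ to_fp 8 24) |bv|)|float|)` and `(= ((_ to_fp 8 24) |bv|) |float|)` split into the same 13 tokens, so the space between `)` and `|float|` does not change the token stream.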

@codecov

codecov Bot commented Jun 8, 2026


Codecov Report

❌ Patch coverage is 0% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 80.60%. Comparing base (23d2707) to head (e8930c2).

Files with missing lines          Patch %   Lines
src/solvers/smt2/smt2_conv.cpp    0.00%     1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff            @@
##           develop    #9030   +/-   ##
========================================
  Coverage    80.60%   80.60%           
========================================
  Files         1711     1711           
  Lines       189454   189454           
  Branches        73       73           
========================================
  Hits        152700   152700           
  Misses       36754    36754           


@tautschnig

Collaborator Author

Closing as this is actually just a cosmetic fix. The real bug was fixed in #9032, and smt2_solver already tokenizes this input correctly (contrary to what this commit message claimed).

@tautschnig tautschnig closed this Jun 9, 2026
3 participants