fix(report): treat threshold as satisfied when efficacy/coverage equals it#280

Open
matuszeg wants to merge 2 commits into
go-gremlins:mainfrom
matuszeg:fix/efficacy-threshold-boundary

@matuszeg

Proposed changes

The efficacy and mutant-coverage threshold checks in internal/report/report.go
used <=, which makes a configured threshold of N% unsatisfiable when the
actual value is exactly N%.

The clearest manifestation: --threshold-efficacy 100 can never be met, even
when every reached mutant is killed. The failure check evaluates 100 <= 100,
which is true, so Gremlins exits with code 10 and reports
"below efficacy-threshold". The same bug existed for --threshold-mcover.

The docs describe these flags as "the percent of KILLED mutants" / "how many
mutants are covered by tests", and users naturally write 100 to mean "all
reached mutants must be killed". Switching both comparisons to < makes the
flags behave that way: a configured threshold of N is satisfied when the actual
value is greater than or equal to N.
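The boundary behavior can be illustrated with a minimal sketch. The function names belowThresholdOld and belowThreshold are hypothetical and do not appear in internal/report/report.go; they only isolate the comparison, and any special handling of a zero threshold is left out.

```go
package main

import "fmt"

// belowThresholdOld mirrors the pre-fix comparison (hypothetical helper):
// an actual value equal to the threshold counts as a failure.
func belowThresholdOld(actual, threshold float64) bool {
	return actual <= threshold
}

// belowThreshold mirrors the post-fix comparison (hypothetical helper):
// only an actual value strictly below the threshold counts as a failure.
func belowThreshold(actual, threshold float64) bool {
	return actual < threshold
}

func main() {
	// All reached mutants killed, threshold configured as 100.
	fmt.Println("old check fails at 100/100:", belowThresholdOld(100, 100))
	fmt.Println("new check fails at 100/100:", belowThreshold(100, 100))
}
```

With the old comparison the 100/100 case is reported as a failure; with the new one it is not, while values strictly below the threshold still fail.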

Why the existing tests didn't catch this

The !tc.expectError branch of TestAssessment returned without ever
asserting err == nil. So the upstream test named "efficacy >= efficacy-threshold"
(value 50, actual efficacy 50%) was already returning an ExitError under the
old <= semantics — the assertion just never looked.

This PR fixes that branch (if err != nil { t.Fatalf(...) }) so the boundary
cases actually verify the no-error path, then adds explicit <, ==, >, and
== 0 cases for both efficacy and mutant-coverage, against both float64 and
int configuration values (the assess code reads both via the
Get[float64] → Get[int] fallback).
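The reason both value types need coverage can be sketched as follows. This is not the Gremlins configuration API: get and threshold are hypothetical stand-ins over a plain map, and treating a zero float as "not set" is an assumption made for illustration.

```go
package main

import "fmt"

// get is a hypothetical typed getter: it returns the zero value of T when the
// key is missing or holds a value of a different type.
func get[T any](cfg map[string]any, key string) T {
	v, _ := cfg[key].(T)
	return v
}

// threshold reads a value as float64 first and falls back to int, so both
// `50` and `50.0` in the configuration produce the same threshold.
// Assumption: a zero float64 is treated as "not set as a float".
func threshold(cfg map[string]any, key string) float64 {
	if f := get[float64](cfg, key); f != 0 {
		return f
	}
	return float64(get[int](cfg, key))
}

func main() {
	cfg := map[string]any{"efficacy": 50, "mcover": 80.5}
	fmt.Println(threshold(cfg, "efficacy"), threshold(cfg, "mcover"))
}
```

Because the two branches are independent code paths, a boundary case exercised only with a float64 value says nothing about the int path, and vice versa.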

Changes

  • internal/report/report.go: <= → < for both efficacy and mutant-coverage
    threshold checks.
  • internal/report/report_test.go: assert err == nil on the no-error branch;
    add == / > / == 0 boundary cases for float64 and int config values
    on both thresholds; rename rows so duplicate names get distinct identifiers.
  • docs/docs/usage/commands/unleash/index.md: clarify that the threshold is
    satisfied when actual ≥ configured (e.g. --threshold-efficacy 100 is met
    when every reached mutant is killed).

Types of changes

  • Bugfix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation Update (if none of the other choices apply)

Checklist

  • I have read the CONTRIBUTING doc
  • Lint and unit tests pass locally with my changes (make all)
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • Any dependent changes have been merged and published in downstream modules

Further comments

This is technically a behavior change for anyone who today relies on
--threshold-efficacy N failing at exactly N%, but I'd argue that behavior
isn't what the flag's name or documentation imply, and the 100 case is
clearly a bug (the flag is unreachable). Happy to gate this behind a different
flag name or a deprecation cycle if maintainers prefer — let me know.

…ls it

The efficacy and mutant-coverage threshold checks used `<=`, which made a
configured threshold of N unsatisfiable when the actual value was exactly N.
Most notably, `--threshold-efficacy 100` could never be met even when every
reached mutant was killed (100 <= 100 was true, so gremlins exited 10).

Switch both comparisons to `<` so a configured threshold of N is satisfied
by an actual value of N. The pre-existing "no error" branch of the
assessment test never asserted err == nil, which is why this regression
went unnoticed; that branch is fixed and boundary cases (== and >) are now
covered explicitly for both float64 and int configuration values.

Docs updated to spell out the >= semantics.
@pull-request-size pull-request-size Bot added the s/M Size: Denotes a Pull Request that changes 30-99 lines label Apr 29, 2026
Reflow the prose added in the previous commit to fit the docs' 80-column
line-length limit.