Acceptance tests: Write evals to file and use experiment names #565

Merged: hanna-paasivirta merged 1 commit into release/1.6.0 from acceptance-tests-save-outputs on Jun 30, 2026
@hanna-paasivirta

Contributor

Short Description

Acceptance tests now save judge results to a file, and you can label a run with -E to keep results from different runs side by side.

Implementation Details

  • This adds a third output file for acceptance test runs. It saves the judge verdicts to a .judges.txt file in tmp/, next to the YAML and response text.
  • You can also pass --experiment=<name> or -E <name> to tag the output files so different runs don't overwrite each other.
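
A minimal sketch of how an experiment tag could be threaded into the output filenames. Only the -E / --experiment flag, the tmp/ directory, and the .judges.txt suffix come from the description above. The function names, the example test name, the .yaml and .txt extensions, and the exact filename layout are assumptions for illustration and may not match what the PR implements.

```python
import argparse
from pathlib import Path
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI parser exposing the experiment flag."""
    parser = argparse.ArgumentParser(description="Run acceptance tests (sketch)")
    # -E <name> and --experiment=<name> are the two spellings from the PR text.
    parser.add_argument(
        "-E",
        "--experiment",
        default=None,
        help="Label used to tag output files for this run",
    )
    return parser


def output_paths(
    test_name: str,
    experiment: Optional[str] = None,
    out_dir: Path = Path("tmp"),
) -> dict:
    """Return the three output paths for one test run.

    Assumed layout: the experiment name, when given, is appended to the
    file stem so that runs with different names do not overwrite each other.
    """
    stem = f"{test_name}.{experiment}" if experiment else test_name
    return {
        "yaml": out_dir / f"{stem}.yaml",            # assumed extension
        "response": out_dir / f"{stem}.txt",         # assumed extension
        "judges": out_dir / f"{stem}.judges.txt",    # suffix from the PR text
    }


if __name__ == "__main__":
    args = build_parser().parse_args()
    # "example_test" is a placeholder test name, not one from the repository.
    for kind, path in output_paths("example_test", args.experiment).items():
        print(f"{kind}: {path}")
```

Under this assumed layout, two runs tagged with different experiment names write to distinct files in tmp/, while an untagged run keeps the plain filenames.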

AI Usage

Please disclose whether you've used AI in this work (it's cool, we just want to
know!):

  • Yes, I have used AI
  • No, I have not used AI

You can read more details in our
Responsible AI Policy

@josephjclark

Collaborator

Nice!

@hanna-paasivirta hanna-paasivirta changed the base branch from main to release/1.6.0 June 30, 2026 10:18
@hanna-paasivirta hanna-paasivirta merged commit cc336c9 into release/1.6.0 Jun 30, 2026
2 checks passed
@hanna-paasivirta hanna-paasivirta mentioned this pull request Jun 30, 2026
2 tasks
@josephjclark josephjclark mentioned this pull request Jun 30, 2026
2 tasks