Add Auto-FL report skill #4845

Draft
holgerroth wants to merge 26 commits into NVIDIA:main from holgerroth:codex/autofl-report-skill

holgerroth (Collaborator) commented Jun 30, 2026

Summary

Adds a productized nvflare-autofl-report companion skill for generating reproducible final artifacts after an Auto-FL campaign has stopped, reached a cap, hit a hard blocker, or been manually interrupted.

This is a follow-up to #4780 and is intentionally stacked on that branch. The new work is contained in commit 08105a368; once #4780 lands, this PR's diff will collapse to the reporting feature only.

User Experience

After stopping a campaign, the user can ask their coding agent:

Use the NVFlare Auto-FL Report skill.
Generate the final report for the stopped campaign in ./job.

The skill verifies stopped state and deterministically produces:

  • autofl_final_report.md
  • autofl_report_summary.json
  • a refreshed progress.png

The report includes baseline/best results, candidate lineage and inherited code changes, manifests and hashes, exact commands, runtime/failures, literature checkpoints with measured follow-on outcomes, and comparability warnings.
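
The stopped-state check mentioned above could look roughly like the following. This is a minimal sketch, not the skill's actual code: only the `final_response_allowed` flag is named in this PR (see Design), while `CampaignState`, `execution_running`, and `may_finalize` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CampaignState:
    # Hypothetical snapshot of campaign state. Only `final_response_allowed`
    # is named in the PR; `execution_running` is an assumed field holding the
    # result of an independent check that no job is still executing.
    final_response_allowed: bool
    execution_running: bool


def may_finalize(state: CampaignState, user_confirmed_interruption: bool = False) -> bool:
    """Return True if a final report may be generated for this state."""
    if state.final_response_allowed:
        return True
    # Abrupt interruption: require an independent check that nothing is
    # still executing AND an explicit confirmation from the user.
    return (not state.execution_running) and user_confirmed_interruption
```

Under these assumptions, an interrupted campaign is finalized only when both conditions hold; user confirmation alone is not enough while execution is still detected.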

Design

  • Refuses to finalize a state with final_response_allowed=false unless execution has been independently checked and the user then explicitly confirms an abrupt interruption.
  • Does not mutate job source, results.tsv, candidate manifests, or campaign state.
  • Works without Git and does not auto-commit.
  • Reuses the product Auto-FL progress plotter instead of copying research plotting logic.
  • Distinguishes the imported autofl.yaml budget from executed baseline/best commands.
  • Warns when training compute changed or a test-like metric guided repeated candidate selection.
  • Associates each literature checkpoint with subsequent measured candidates until the next checkpoint and preserves campaign-recorded source identifiers without claiming independent citation verification.
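
The checkpoint association in the last item can be illustrated with a small grouping pass over the ledger in execution order. This is a sketch under stated assumptions: the `(kind, identifier)` row shape, the `"checkpoint"`/`"candidate"` labels, and the function name are hypothetical and do not reflect the real ledger schema.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical ledger rows in execution order: (kind, identifier).
# "checkpoint" marks a literature checkpoint, "candidate" a measured candidate.
Row = Tuple[str, str]


def group_candidates_by_checkpoint(rows: List[Row]) -> Dict[str, List[str]]:
    """Attach each candidate to the most recent preceding checkpoint."""
    groups: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for kind, ident in rows:
        if kind == "checkpoint":
            # A new checkpoint closes the previous group and opens a new one.
            current = ident
            groups[current] = []
        elif kind == "candidate" and current is not None:
            groups[current].append(ident)
        # Candidates measured before the first checkpoint are left ungrouped.
    return groups


rows = [
    ("candidate", "c0"),
    ("checkpoint", "lit-1"),
    ("candidate", "c1"),
    ("candidate", "c2"),
    ("checkpoint", "lit-2"),
    ("candidate", "c3"),
]
print(group_candidates_by_checkpoint(rows))
# {'lit-1': ['c1', 'c2'], 'lit-2': ['c3']}
```

The identifiers are carried through unchanged, which matches the stated intent of preserving campaign-recorded source identifiers without verifying them.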

No files under research/auto-fl-research and no H100-specific assets or instructions are changed.

Validation

  • 72 passed across report, Auto-FL runner/guard/plotter, importer, skill admission, and release-bundle tests.
  • Black, isort, flake8, and git diff --check pass.
  • Full docs HTML build completes; remaining warnings are pre-existing elsewhere in the docs tree.
  • Forward-tested against a copied 149-row, 8-client CIFAR-10 campaign ledger: the report reproduced the 0.6870 baseline, 0.8218 best score, 34-candidate retained lineage, 10 literature checkpoints, and the executed local-epoch comparability warning without touching the live campaign.
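
One way to check a "without touching the live campaign" claim like the one above is to hash the campaign directory before and after a report run and compare the snapshots. The sketch below uses only the Python standard library; `snapshot_hashes` and `changed_files` are hypothetical helpers, not part of this PR.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def snapshot_hashes(root: Path) -> Dict[str, str]:
    """Map each file path (relative to `root`) to its SHA-256 hex digest."""
    return {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """Return paths that were added, removed, or modified between snapshots."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )
```

Taking a snapshot before and after report generation, any path returned by `changed_files` other than the expected outputs (for example autofl_final_report.md, autofl_report_summary.json, and progress.png, assuming they are written into the same directory) would indicate an unintended mutation of inputs such as results.tsv.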

Dependency

holgerroth and others added 24 commits June 8, 2026 16:00
Signed-off-by: Holger Roth <hroth@nvidia.com>
holgerroth force-pushed the codex/autofl-report-skill branch from 48ca4f7 to 08105a3 June 30, 2026 18:07
codecov-commenter commented Jun 30, 2026

Codecov Report

❌ Patch coverage is 87.20682% with 60 lines in your changes missing coverage. Please review.
✅ Project coverage is 56.68%. Comparing base (7df6a4c) to head (23f6a6e).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| nvflare/app_common/autofl/job_importer.py | 87.04% | 60 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4845      +/-   ##
==========================================
+ Coverage   56.49%   56.68%   +0.18%     
==========================================
  Files         969      971       +2     
  Lines       92210    92679     +469     
==========================================
+ Hits        52096    52535     +439     
- Misses      40114    40144      +30     

| Flag | Coverage Δ |
|---|---|
| unit-tests | 56.68% <87.20%> (+0.18%) ⬆️ |

holgerroth changed the title from "Add Auto-FL stopped campaign report skill" to "Add Auto-FL report skill" Jul 1, 2026