feat: Stop with checkpoint save #850

Open
entmike wants to merge 1 commit into ostris:main from entmike:feature/stop-with-checkpoint

@entmike entmike commented May 29, 2026

Summary

Adds a 'Save checkpoint before stopping' option to the job stop confirmation dialog. When enabled, the training loop saves a checkpoint and then terminates gracefully, instead of being interrupted with SIGINT.

Changes

UI - Stop with Checkpoint Save

  • ConfirmModal: Added support for an optional checkbox in confirmation dialogs
  • JobActionBar: The stop dialog now shows a 'Save checkpoint before stopping' checkbox. When checked, it calls the new stop-with-checkpoint function; otherwise it calls the regular stop function
  • jobs.ts: Added the stop-with-checkpoint function, which is imported in JobActionBar
  • New API route: sets both the save-checkpoint flag and the stop flag in the DB. The training loop picks them up at the end of the current step:
    1. sees the save-checkpoint flag, saves the checkpoint, clears the flag
    2. sees the stop flag, marks the job as stopped, raises an exception to break out of the training loop
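The two-step handling above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `JobFlags`, `StopRequested`, `end_of_step_check`, and `train` are hypothetical names, and an in-memory object stands in for the job row in the DB.

```python
from dataclasses import dataclass


class StopRequested(Exception):
    """Hypothetical exception used to break out of the training loop."""


@dataclass
class JobFlags:
    # Hypothetical in-memory stand-in for the flags stored on the job row.
    save_requested: bool = False
    stop_requested: bool = False
    status: str = "running"


def end_of_step_check(flags, save_checkpoint):
    """Run once at the end of every training step."""
    # Step 1: honour a pending save request, then clear it so it fires once.
    if flags.save_requested:
        save_checkpoint()
        flags.save_requested = False
    # Step 2: honour a pending stop request only after the save is done.
    if flags.stop_requested:
        flags.status = "stopped"
        raise StopRequested("stop requested by user")


def train(flags, total_steps, save_checkpoint, on_step=None):
    """Toy loop: returns how many steps completed before stopping."""
    completed = 0
    try:
        for step in range(total_steps):
            # ... one real training step would run here ...
            completed += 1
            if on_step is not None:
                on_step(step, flags)  # hook that lets a caller set flags
            end_of_step_check(flags, save_checkpoint)
    except StopRequested:
        pass  # clean exit: the checkpoint was already written
    return completed
```

Checking the save flag before the stop flag is what guarantees the checkpoint exists by the time the loop exits; reversing the order would stop without saving.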

Training Loop - Graceful Stop Logic

  • DiffusionTrainer.py & UITrainer.py: Stop watcher threads now check the checkpoint-save flag before sending SIGINT. If the flag is set, the loop completes the current step, saves the checkpoint, then stops cleanly, with no forced termination.
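As a rough sketch of the watcher behaviour described here (not the PR's actual implementation): the function name `watch_for_stop`, the flag keys `"stop"` and `"save_before_stop"`, and the injected `read_flags` / `send_interrupt` callables are all assumptions made for illustration.

```python
import threading


def watch_for_stop(read_flags, send_interrupt, poll_seconds=0.01, done=None):
    """Poll job flags and force-interrupt only when no graceful save is pending.

    read_flags:     callable returning a dict of the current job flags
    send_interrupt: callable that delivers SIGINT to the training process
    done:           optional threading.Event that ends the watcher early
    """
    if done is None:
        done = threading.Event()
    while not done.is_set():
        flags = read_flags()
        if flags.get("stop"):
            if flags.get("save_before_stop"):
                # Leave the process alone: the training loop will save the
                # checkpoint and stop itself at the end of the current step.
                return "graceful"
            send_interrupt()
            return "interrupt"
        done.wait(poll_seconds)  # sleep, but wake immediately if done is set
    return "idle"
```

In a real process, `send_interrupt` could be something like `lambda: os.kill(os.getpid(), signal.SIGINT)`, and the function would typically run in a daemon thread started alongside the trainer. Injecting the callables keeps the decision logic testable without actually signalling anything.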

Testing

  • ✅ Tested locally — stop with checkpoint save works as expected

entmike changed the title from "feat: Stop with checkpoint save, graceful stop logic, and HiDream-01 NaN fix" to "feat: Stop with checkpoint save" May 29, 2026