Add Gaussian perturbation checkpoint state #922

Open: NickGeneva wants to merge 8 commits into main from codex/checkpoint-gaussian

NickGeneva commented Jun 15, 2026

Collaborator

Earth2Studio Pull Request

Description

Second PR in the checkpointing stack.

Adds checkpoint opt-in state to the Gaussian perturbation so restartable workflows can preserve its internal torch.Generator state without relying on global Torch or NumPy RNG state. The generator remains internal to the perturbation; checkpointing is controlled through the bound checkpoint state proxy and the selected state_policy.
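
For readers unfamiliar with the mechanism, here is a minimal sketch of saving and restoring a private generator's state. It assumes only the public torch.Generator API (get_state, set_state); the variable names are illustrative and it does not reproduce the checkpoint state proxy or state_policy handling from this stack.

```python
import torch

# Private generator: its draws neither touch nor depend on the global Torch RNG.
gen = torch.Generator(device="cpu")
gen.manual_seed(0)

# Snapshot the generator state before drawing (a uint8 tensor on CPU).
snapshot = gen.get_state().clone()
first = torch.randn(4, generator=gen)

# Disturb the global RNG; the private generator is unaffected by this.
torch.manual_seed(1234)
_ = torch.randn(4)

# Restore the snapshot and draw again: the first sample is reproduced exactly.
gen.set_state(snapshot)
replayed = torch.randn(4, generator=gen)
```

Because the state is an ordinary tensor, it can be stored alongside other checkpoint data and handed back to a fresh generator on restart.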

Stack

  1. #912: checkpoint utilities and built-in workflow support
  2. This PR: Gaussian perturbation checkpoint state
  3. #923: FCN3 checkpoint state
  4. #924: developer skill for adding model checkpoint support

Validation

  • uv run ruff check earth2studio/perturbation/gaussian.py test/perturbation/test_gaussian.py
  • uv run pytest test/perturbation/test_gaussian.py::test_gaussian_checkpoint_state_round_trip -q
  • git diff --check

The full test/perturbation/test_gaussian.py suite did not pass cleanly in this local environment because torch_harmonics, which the optional CorrelatedSphericalGaussian tests require, is not installed.

Checklist

Dependencies

Depends on #912.

greptile-apps Bot commented Jun 15, 2026

Contributor

Greptile Summary

This PR opts the Gaussian perturbation into the checkpoint framework introduced in #912 by adding a _GaussianCheckpointState dataclass that holds a torch.Tensor snapshot of an internal torch.Generator, and wiring save/restore logic through bind_checkpoint_state.

  • _get_generator lazily creates a per-device generator and restores its state from the checkpoint when one is loaded, replacing the previous global-RNG dependency.
  • _save_generator_state records the pre-call state (level 1, for exact replay) or the post-call state (level 2, for resuming from the next step), gated by checkpoint_enabled and checkpoint_level.
  • A new test_gaussian_checkpoint test exercises both level-1 and level-2 round-trips with disk-backed checkpoint sessions.
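
The difference between the two levels can be sketched roughly as follows. This is a hypothetical stand-in, not the code under review: ToyGaussian, its level argument, and the saved_state attribute are assumptions made for illustration, and the real bind_checkpoint_state and _GaussianCheckpointState interfaces are not reproduced here.

```python
from typing import Dict, Optional

import torch


class ToyGaussian:
    """Hypothetical stand-in; not the earth2studio Gaussian perturbation."""

    def __init__(self, noise_amplitude: float = 0.05, level: int = 1) -> None:
        self.noise_amplitude = noise_amplitude
        self.level = level  # 1: save pre-call state, 2: save post-call state
        # Stands in for whatever storage the checkpoint framework provides.
        self.saved_state: Optional[torch.Tensor] = None
        self._generators: Dict[torch.device, torch.Generator] = {}

    def _get_generator(self, device: torch.device) -> torch.Generator:
        # Lazily create one generator per device; restore saved state if present.
        if device not in self._generators:
            gen = torch.Generator(device=device)
            gen.manual_seed(0)
            if self.saved_state is not None:
                gen.set_state(self.saved_state)
            self._generators[device] = gen
        return self._generators[device]

    def __call__(self, x: torch.Tensor) -> torch.Tensor:
        gen = self._get_generator(x.device)
        if self.level == 1:
            # Pre-call state: restoring it replays this same step exactly.
            self.saved_state = gen.get_state().clone()
        noise = torch.randn(x.shape, generator=gen, device=x.device)
        if self.level == 2:
            # Post-call state: restoring it resumes from the next step.
            self.saved_state = gen.get_state().clone()
        return x + self.noise_amplitude * noise
```

In this sketch, a fresh instance given a level-1 state reproduces the step that was in flight when the state was saved, while a fresh instance given a level-2 state produces the output of the following step.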

Confidence Score: 5/5

The change is well-scoped and safe to merge; the generator save/restore paths are exercised by the new test and the core perturbation logic is unchanged.

All changes are additive: a new internal generator, a new checkpoint state dataclass, and two new private methods. The existing __call__ output is equivalent to the previous global-RNG call when checkpointing is disabled. No data paths, serialization formats, or public APIs are broken.

No files require special attention; both changed files are straightforward.

Important Files Changed

| Filename | Overview |
| --- | --- |
| earth2studio/perturbation/gaussian.py | Adds _GaussianCheckpointState dataclass and wires an internal torch.Generator to the checkpoint framework; pre/post state capture logic and device-aware generator selection look correct. |
| test/perturbation/test_gaussian.py | New test_gaussian_checkpoint exercises level-1 and level-2 round-trips correctly; name in the PR description does not match the actual test function name. |

Reviews (2): Last reviewed commit: "Rename Gaussian checkpoint test"

Comment thread earth2studio/perturbation/gaussian.py
Comment thread earth2studio/perturbation/gaussian.py Outdated
NickGeneva force-pushed the codex/checkpoint-gaussian branch 7 times, most recently from 53c611c to 99c112a (June 15, 2026 22:29)
NickGeneva force-pushed the codex/checkpoint-gaussian branch from 99c112a to 8eaa079 (June 26, 2026 01:16)
NickGeneva changed the base branch from codex/checkpoint-catalog to main (June 26, 2026 01:16)
NickGeneva force-pushed the codex/checkpoint-gaussian branch from 019a23f to 4362a36 (June 30, 2026 16:31)
@NickGeneva

Collaborator Author

@greptile-apps

NickGeneva requested a review from pzharrington (June 30, 2026 17:43)

pzharrington left a comment

Collaborator

LGTM! Love PRs like this 😅

@NickGeneva

Collaborator Author

/blossom-ci
