NVFLARE agent skills: PyTorch/Lightning conversion, orient, diagnose — with lint, eval, packaging, and security hardening #4837
Open: chesterxgchen wants to merge 192 commits into NVIDIA:main from chesterxgchen:milestone8-agent-skills
147ed94
Refactor shared skill guidance
chesterxgchen 6013255
Add Lightning routing to agent inspect
chesterxgchen 0301207
Fix Lightning patch detection in agent inspect
chesterxgchen 5d7a11c
Detect from-import lightning module alias in agent inspect
chesterxgchen 5bbd00a
Add nvflare-convert-lightning skill and enable Lightning routing
chesterxgchen 71a92a5
Fix Lightning routing misclassification and logger import path
chesterxgchen 2ecb40c
Remove benchmark notes from skill artifacts
chesterxgchen 8bd3131
Promote Lightning on active use regardless of torch import count
chesterxgchen 45b416f
Gate Lightning promotion on entry-point location, not directory-wide …
chesterxgchen 1b2666e
Fix mixed PyTorch Lightning routing
chesterxgchen 502902a
Make PyTorch/Lightning reorder preserve unrelated framework order
chesterxgchen ea8b427
Simplify PyTorch/Lightning routing to the trigger-contract rule
chesterxgchen f881d5b
Scope Lightning-over-PyTorch preference to the PyTorch family
chesterxgchen 345da71
Add Milestone 8 conversion checkpoint
chesterxgchen 38a627b
Revert "Add Milestone 8 conversion checkpoint"
chesterxgchen 7a87e01
Clarify NVFLARE skill source-of-truth guidance
chesterxgchen 9c5b9aa
Clarify Lightning local loss policy guidance
chesterxgchen 946d3de
Deduplicate Lightning loss policy guidance
chesterxgchen 3fa792c
Clarify runtime output guidance ownership
chesterxgchen f40dbbf
Ignore local tmp artifacts
chesterxgchen 53d1666
Document inspect framework routing order
chesterxgchen 07daf77
Remove unused Lightning receive assignment
chesterxgchen fc07818
Clarify Lightning wrapper routing boundary
chesterxgchen d9e5f68
Clarify Lightning validation evidence
chesterxgchen 3a87093
Add inspector skill recommendation coverage
chesterxgchen 6e6191b
Decouple skill lint from design docs
chesterxgchen f3e0516
Handle Lightning patched trainer conversion state
chesterxgchen c365f59
Add Lightning patch alias inspect regression
chesterxgchen 3cb29f3
Harden Lightning patch inspection
chesterxgchen 80656ef
Decouple skill lint from design docs
chesterxgchen 47cc01c
Handle Lightning patch submodule imports
chesterxgchen 17db96f
Refine agent Lightning routing
chesterxgchen d2dddee
Raise on empty Lightning eval batches
chesterxgchen 6ab2b1b
Prefer active Lightning evidence over torch imports
chesterxgchen dac32b2
Fix mixed Lightning PyTorch inspection routing
chesterxgchen b4c1063
Guard Lightning promotion for PyTorch entrypoints
chesterxgchen 4c8a7fa
Preserve PyTorch rank ahead of incidental Lightning
chesterxgchen 4736c51
Cover non-PyTorch Lightning routing
chesterxgchen e403259
Cover Lightning inspector helper paths
chesterxgchen 2adc9bc
Track imported submodules for Lightning routing
chesterxgchen 5316eac
Validate milestone 8 checkpoint evidence
chesterxgchen 7bc4881
Restore agent skill catalog lint sources
chesterxgchen 5ee4848
Use skill categories for trigger overlap lint
chesterxgchen 00c2d94
Re-decouple skill lint engine from design docs
chesterxgchen 0da8a56
Remove stale skill category fixture metadata
chesterxgchen af670b2
Clarify deferred doc crosslink lint
chesterxgchen 82469ce
Clarify PyTorch Lightning inspection routing
chesterxgchen bb42e77
Fix PyTorch routing with incidental Lightning imports
chesterxgchen bde8bc5
Fix Lightning routing with PyTorch entry imports
chesterxgchen 4d2b7a4
Fix modular Lightning inspector routing
chesterxgchen f79f64a
Fix Lightning inspector promotion weighting
chesterxgchen 7f4d339
Fix Lightning inspector routing with PyTorch entry points
chesterxgchen af180e2
Fix inspector import context resolution
chesterxgchen cc7e4a2
Add relative Lightning import inspector regression
chesterxgchen 370cf07
Add milestone 8 lint checkpoint regression test
chesterxgchen e5d5ede
Add package submodule Lightning import coverage
chesterxgchen c5781e1
Require category for public agent skills
chesterxgchen e3ec67d
Require terminal completion before skill validation success
chesterxgchen 626afb7
Document unsupported skill category frontmatter
chesterxgchen 0f4010b
Remove category from skill frontmatter
chesterxgchen 3ad5947
Add inspector coverage for Lightning routing helpers
chesterxgchen 1fd43d6
Clarify agent skill lint contracts
chesterxgchen 4913d3c
Fix Lightning routing evidence scoring
chesterxgchen 36de66d
Fix PyTorch-Lightning routing evidence
chesterxgchen f815be5
Avoid false Lightning reachability from import prefixes
chesterxgchen fca43f7
Keep Lightning routing within entry context
chesterxgchen 003cb12
Guard Lightning routing fallback by entry context
chesterxgchen bb73e4f
Avoid local resolution for dotted external imports
chesterxgchen 66c4792
Trim Lightning detection prose to the inspect override boundary
chesterxgchen f58a9f2
Align skill category frontmatter validation
chesterxgchen 9759811
Surface skill category in manifest and skills list
chesterxgchen 7adbb74
Tighten milestone 8 checkpoint validation
chesterxgchen d3cdc00
Clarify skill category lint metadata invariant
chesterxgchen 50b8724
Fix Lightning fallback routing over PyTorch imports
chesterxgchen 99024ba
Preserve PyTorch import evidence in Lightning fallback
chesterxgchen 14ba8ef
Stop context-prefixing package-prefix import candidates
chesterxgchen 26cc27e
Require full import path to resolve locally before following package …
chesterxgchen 157c9f9
Pin Lightning shadowing guard to entry context in tests
chesterxgchen 1532b28
Resolve nested local dotted imports via context prefix
chesterxgchen 5bbbaa5
Derive package-prefix candidates from resolved modules
chesterxgchen 3377613
Guard raw top-level package prefix is not followed for nested imports
chesterxgchen 27584ab
Simplify Lightning promotion routing helper
chesterxgchen 4eff40c
Untrack agent skill evaluation design doc
chesterxgchen 97d3098
Remove milestone 8 checkpoint utility
chesterxgchen c9f6a97
Untrack agent implementation plan design doc
chesterxgchen 3b579d6
Clarify skill source-of-truth boundaries
chesterxgchen 1451a9d
Tighten source-discovered-strategy override evals
chesterxgchen cbd58f6
Prevent source-discovered conversion overrides
chesterxgchen 87ea1ae
Centralize source override skill guidance
chesterxgchen 3afbbb4
Guard Lightning eval fixtures against empty batches
chesterxgchen 96a83c7
Align conversion skills with operating model (steps 1-6)
chesterxgchen 07ce2d1
Add JSON output contract tests and packaged conversion templates
chesterxgchen 5f6d999
Fix cross-review findings in conversion skills and boundary lint
chesterxgchen 49b7d1c
Add high-level overview to skill architecture doc
chesterxgchen 40b53c2
Modularize inspector framework detection into per-framework detectors
chesterxgchen 69609d0
Document the three responsibility layers in skill architecture doc
chesterxgchen 1738711
Relocate eval suites out of shipped skills into dev_tools/agent/skill…
chesterxgchen e607dfd
Fix conversion-skill routing and template review findings
chesterxgchen b07ba14
Fail closed on stray eval dirs inside shipped skills
chesterxgchen 3dc986c
Harden promotion, aggregator weights, and eval-mode in review fixes
chesterxgchen 1fd9ab6
Fix deferred review findings: cross-family ties, mixed-workspace nami…
chesterxgchen e63d02f
Don't count in-Lightning torch usage as standalone PyTorch base evidence
chesterxgchen ef21abd
Improve agent skill routing and validation guidance
chesterxgchen 1b87dbf
Fix reachability collision, entry-context routing, and nested-eval li…
chesterxgchen 38565b1
Harden conversion aggregator step weights
chesterxgchen 5c26c97
Guard agent skill eval exclusions
chesterxgchen ba4278b
Align eval-loader error messages to the evals.json filename
chesterxgchen 2c3f3ee
Fix Lightning routing for embedded PyTorch evidence
chesterxgchen 86ca934
Ignore local-only agent skill design docs
chesterxgchen 57910ee
Add oversized step count aggregation regression
chesterxgchen 8cba1b0
Prune excluded runtime lint directories
chesterxgchen 5dea87d
Align lint-independence invariant with the eval-root input
chesterxgchen 1fd4d15
Fix Lightning fallback PyTorch scoring
chesterxgchen 5fe8fdd
Clarify skill lint eval inputs
chesterxgchen 8b75820
Fix three residual framework-routing edge cases
chesterxgchen c9290c3
Prefer non-utility framework fallback in inspector
chesterxgchen bcf4e34
Address PR #4837 review comments on Lightning skill/evals
chesterxgchen 0ae56fc
Fix Claude2 code-review findings (8) in the agent skill inspector/lints
chesterxgchen c4c8bc7
Align skill frontmatter with the agentskills.io spec
chesterxgchen 0e934ab
Structure shared skill content as a spec-compliant internal skill
chesterxgchen 8a113f7
Scope agent doctor to conversion-only readiness checks
chesterxgchen ddde656
Close two Claude2 review residuals: parity test + reachability memoiz…
chesterxgchen 80b41e9
De-duplicate and clarify conversion skill guidance
chesterxgchen 1d45c6a
Fix recipe selection to match real catalog fields and keep HE explicit
chesterxgchen 4906842
Clarify skill metadata frontmatter docs
chesterxgchen 151146d
Fix packaged skill asset references
chesterxgchen 5716b10
Fix agent doctor JSON readiness guidance
chesterxgchen aacb539
Make setup.py-build packaging tests hermetic (fix intermittent flake)
chesterxgchen 1149b2a
Clarify PyTorch recipe privacy selection
chesterxgchen 601740f
Clarify import-vs-inspect wording and soften cyclic recipe example
chesterxgchen 4925408
Harden setup.py-build flake fix: xdist loadgroup + isolated bdist-dir
chesterxgchen 9c0b488
Harden received-model metric-ownership guidance (eval report)
chesterxgchen 712f6fb
Add settled conversion rules: device placement, pretrained-path, plai…
chesterxgchen a63e689
Encode Lightning DDP -> external-process executor rule
chesterxgchen 645b57f
Clarify PyTorch distributed launch recipe guidance
chesterxgchen 21f99b2
Update Lightning DDP launch guidance
chesterxgchen 4632602
Remove unfounded shared-token-ID rule from data-derived-arg eval
chesterxgchen a9b223a
Add generated-code quality rules: setup-outside-loop and data-location
chesterxgchen 2d5c39c
Wire conversion-quality behaviors into pytorch/lightning eval suites
chesterxgchen ffb08cd
Fix PyTorch eval template to build model once before the round loop
chesterxgchen 8c585c2
Scope conversion-quality assertions to match basic fixtures
chesterxgchen 0086950
Disambiguate no-hardcoded-absolute-data-path for graders
chesterxgchen 3878dc5
Note why a shared vocabulary mapping matters, not just vocab_size
chesterxgchen 5fbeacf
Fix PyTorch conversion template setup placement
chesterxgchen 9391d69
Refine conversion data path evals
chesterxgchen 9235865
Document external-data eval fixtures in SOURCE notes
chesterxgchen 90372e6
Initialize FLARE before PyTorch setup hook
chesterxgchen 18131e8
Address review: privacy scope wording, CLI syntax, DDP + validation c…
chesterxgchen 26de3f7
Consolidate external data fixture source notes
chesterxgchen eadbed2
Clarify Lightning DDP metadata broadcast guidance
chesterxgchen af40aba
Share PyTorch-family recipe selection between PyTorch and Lightning s…
chesterxgchen cf3f3c0
Remove dead code left by earlier review-fix rounds
chesterxgchen ae50cc8
Fix unresolvable per-skill reference in shared recipe-selection doc
chesterxgchen 42f70f7
Reconcile HE reporting with in-scope recipe-level privacy; list share…
chesterxgchen cb3df73
Fail closed on HE recipes in the SimEnv conversion path
chesterxgchen a9ae059
Simplify the Lightning promotion weighted fallback
chesterxgchen 415bb30
Parametrize three near-clone inspector test families
chesterxgchen 2329b43
Consolidate lint engine registries, walkers, and per-skill I/O
chesterxgchen 155fe82
Carry the HE SimEnv exception into canonical-path and validation docs
chesterxgchen 488ba85
Make SkillRecord validation lazy to preserve bounded reads on scoped …
chesterxgchen 1ec2b8f
Mark homomorphic encryption unsupported by the conversion skills
chesterxgchen 26fa7bc
Harden skill trust boundaries and supply-chain/privacy rules (securit…
chesterxgchen 6450d28
Exclude bytecode from packaged skills (security review, packaging)
chesterxgchen 25573ff
Add injection evals for supply-chain, trust-escalation, and poisoned …
chesterxgchen 91e0cca
Strengthen recipe-list drift contract and clarify privacy_compatible
chesterxgchen 820a7aa
Make injection typosquat a true substitution (review nit)
chesterxgchen d27d6b6
Redact requirement URLs in approval prompts and unify venv guidance
chesterxgchen 6d3263f
Enforce the agent skill lint before push (runtest -s + pre-push hook)
chesterxgchen 17aa625
Align injection eval with the requirement-line redaction contract
chesterxgchen 10f6575
Make unattended dependency install mandatory, not a reportable blocker
chesterxgchen dd9450a
Conform skills to the company (NVCARPS) guideline, keeping NVFLARE ch…
chesterxgchen 44025a6
Ignore the local agent-skill checks report design doc
chesterxgchen bcb74c1
Align privacy scope: DP and privacy filters unsupported everywhere HE is
chesterxgchen 0e0847a
Add device selection eval check
chesterxgchen 39236ef
Harden agent skills and benchmark RCA reporting
chesterxgchen 3b0153c
Revert "Harden agent skills and benchmark RCA reporting"
chesterxgchen c9dc219
Read SKILL.md as a bounded regular file in the frontmatter validator
chesterxgchen 929a69a
Re-land execution-sandbox, runtime-path, supply-chain, and network-lo…
chesterxgchen 9f6d107
Re-land PyTorch eval coverage: device, checkpoint, state-mismatch, pr…
chesterxgchen 7b1610e
Re-land Lightning eval coverage + network-logger gating, excluding th…
chesterxgchen b159cd9
Re-land diagnose/orient eval coverage from the reverted commit
chesterxgchen 0527b07
Harden agent skill install/package integrity (item 11)
chesterxgchen 59edaf7
Fix umask-002 build failure: bundle root world-writable check only
chesterxgchen 201063d
Add empty-batch guard to external-data-lightning fixture
chesterxgchen 7968f5c
Add empty-batch guard to gpu-device-lightning fixture
chesterxgchen 6d25181
Drop the non-essential 'nvflare agent doctor' command
chesterxgchen 429bb50
Strengthen conversion skill validation ordering
chesterxgchen c8fb50a
Avoid approval waits in unattended skill runs
chesterxgchen 403fb8d
Rescope skills to the enforceable security boundary; stop skill-owned…
chesterxgchen 4e4d0a8
Tighten skill boundary: progressive disclosure, direct recipe show, n…
chesterxgchen f91b5f2
Clarify skill layout collision handling
chesterxgchen f05de35
Generalize source model layout guidance
chesterxgchen
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,20 @@ | ||
| # Git hooks | ||
|
|
||
| Repo-managed git hooks. Enable them once per clone: | ||
|
|
||
| ```bash | ||
| git config core.hooksPath .githooks | ||
| ``` | ||
|
|
||
| ## `pre-push` | ||
|
|
||
| Runs the deterministic agent-skill lint | ||
| (`python -m dev_tools.agent.skills.checks --skills-root skills`) and blocks the | ||
| push if it finds anything, so the agent skills checked into GitHub stay clean. | ||
| It covers `skills/` and the eval suites under `dev_tools/agent/skill_evals/`. | ||
|
|
||
| The same lint also runs in `./runtest.sh -s` and in the pre-merge CI unit tests | ||
| (`tests/unit_test/tool/agent_skill_checks/seed_skills_test.py`), so this hook is | ||
| a fast local pre-push gate rather than the only enforcement. | ||
|
|
||
| Emergency bypass: `git push --no-verify`. |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,28 @@ | ||
| #!/usr/bin/env bash | ||
| # NVFLARE pre-push hook: block a push when the agent skill lint finds anything, | ||
| # so the skills checked into GitHub stay clean. | ||
| # | ||
| # Enable once per clone: | ||
| # git config core.hooksPath .githooks | ||
| # | ||
| # The lint is fast and dependency-light; it covers skills/ and the eval suites | ||
| # under dev_tools/agent/skill_evals/. The same check runs in `./runtest.sh -s` | ||
| # and in CI. Bypass in an emergency with `git push --no-verify`. | ||
| set -euo pipefail | ||
|
|
||
| repo_root="$(git rev-parse --show-toplevel)" | ||
|
|
||
| # Nothing to check if this repo has no skills root. | ||
| if [ ! -d "$repo_root/skills" ]; then | ||
| exit 0 | ||
| fi | ||
|
|
||
| echo "pre-push: running agent skill lint (python3 -m dev_tools.agent.skills.checks)..." | ||
| if ! python3 -m dev_tools.agent.skills.checks --skills-root "$repo_root/skills"; then | ||
| echo "" | ||
| echo "pre-push: agent skill lint failed. Fix the findings above (or run" | ||
| echo " ./runtest.sh -s) before pushing. Emergency bypass:" | ||
| echo " git push --no-verify" | ||
| exit 1 | ||
| fi | ||
| echo "pre-push: agent skill lint clean." |
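
The hook's control flow can also be sketched in Python, which makes the skip, pass, and block paths easy to unit test. This is a minimal sketch and not part of the PR: `pre_push_gate`, `run_lint`, and the injectable `runner` parameter are hypothetical names, and the only details taken from the hook are the `skills` directory check and the `dev_tools.agent.skills.checks --skills-root` invocation.

```python
import subprocess
import sys
from pathlib import Path
from typing import Callable, Sequence


def run_lint(cmd: Sequence[str], cwd: Path) -> int:
    """Run the lint command from the repo root and return its exit code."""
    return subprocess.run(list(cmd), cwd=cwd, check=False).returncode


def pre_push_gate(
    repo_root: Path,
    runner: Callable[[Sequence[str], Path], int] = run_lint,
) -> int:
    """Return 0 to allow the push and 1 to block it (hypothetical helper)."""
    skills_root = repo_root / "skills"
    # Mirror the hook: nothing to check if the repo has no skills root.
    if not skills_root.is_dir():
        return 0
    cmd = [
        sys.executable,
        "-m",
        "dev_tools.agent.skills.checks",
        "--skills-root",
        str(skills_root),
    ]
    # Any non-zero lint exit code blocks the push.
    return 0 if runner(cmd, repo_root) == 0 else 1
```

Passing a fake `runner` exercises all three paths without invoking the real lint; with the default runner, the function would need the repository root (for example, the output of `git rev-parse --show-toplevel`) as `repo_root`.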
500 changes: 500 additions & 0 deletions
dev_tools/agent/skill_evals/nvflare-convert-lightning/evals.json
Large diffs are not rendered by default.
36 changes: 36 additions & 0 deletions
dev_tools/agent/skill_evals/nvflare-convert-lightning/files/SOURCE.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,36 @@ | ||
| # Fixture Source Notes | ||
|
|
||
| The `hello-lightning` fixtures are minimized, unconverted PyTorch Lightning | ||
| training code modeled on the NVFLARE repository example: | ||
|
|
||
| - Source example: `examples/hello-world/hello-lightning` | ||
|
|
||
| The fixture intentionally omits real datasets, data download, FLARE integration, | ||
| and full job execution details so trigger and behavior evals stay deterministic. | ||
| `train.py` and `model.py` represent plain Lightning code before any FLARE | ||
| conversion; the agent under evaluation is expected to add the | ||
| `flare.patch(trainer)` Client API integration and a `job.py`. | ||
|
|
||
| The `gpu-device-lightning` fixture is synthetic, derived from | ||
| `hello-lightning` with an explicit `torch.cuda.is_available()` choice between | ||
| Lightning's `gpu` and `cpu` accelerators. It makes device-intent preservation | ||
| applicable without requiring a GPU on the evaluation host. | ||
|
|
||
| The `vocab-lightning` fixture adds a `LitTextCNN` model whose `__init__` has a | ||
| required, data-derived argument (`vocab_size`, no default). The conversion must | ||
| pin one shared vocabulary size for the server recipe model config and every | ||
| client model construction path. Passing a live `LightningModule` instance with | ||
| required args can serialize without those args and fail server-side | ||
| reconstruction in the model persistor. | ||
|
|
||
| The `external-data-lightning` fixtures are synthetic, derived from the | ||
| `hello-lightning` fixture but loading train/val CSVs from an external data | ||
| directory (`--data-dir`, default `/data/nvflare/lightning-tabular`) instead of | ||
| building synthetic in-memory tensors. The path is intentionally external to the | ||
| repository and run workspace so configurable data-path behavior is asserted only | ||
| when the source provides an external dataset location. | ||
|
|
||
| The `hello-lightning` fixture's `LitNet` includes `validation_step` with | ||
| `self.log("val_loss", ...)` and the training entry point builds a validation | ||
| dataloader, so evaluation-focused evals can assert Lightning-native evaluation | ||
| (`trainer.validate` before `trainer.fit`) without a separate fixture. |
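
The `vocab-lightning` note above can be illustrated with plain Python. This is a sketch under stated assumptions, not NVFLARE's persistor and not the fixture's `LitTextCNN`: `TextCNNStandIn`, `rebuild`, and the value `5000` are hypothetical stand-ins. It only shows why a required, data-derived constructor argument has to be recorded once and reused for both the server model config and every client construction path.

```python
class TextCNNStandIn:
    """Stand-in for a model whose constructor needs a data-derived argument."""

    def __init__(self, vocab_size):  # required, no default, as in the fixture
        self.vocab_size = vocab_size


def rebuild(model_cls, args):
    """Rebuild a model from a class plus recorded constructor args,
    the way a config-driven component might (hypothetical helper)."""
    return model_cls(**args)


# One shared value, pinned before the job is built (hypothetical number).
SHARED_VOCAB_SIZE = 5000
model_config = {"vocab_size": SHARED_VOCAB_SIZE}

# The server-side config and every client path reuse the same pinned args.
server_model = rebuild(TextCNNStandIn, model_config)
client_models = [rebuild(TextCNNStandIn, model_config) for _ in range(2)]

# If the recorded args are lost, reconstruction fails with a TypeError.
try:
    rebuild(TextCNNStandIn, {})
    lost_args_error = None
except TypeError as err:
    lost_args_error = err
```

Because both sides read `model_config`, a change to the vocabulary size cannot reach the clients without also reaching the server, which is the mismatch the fixture is designed to catch.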
48 changes: 48 additions & 0 deletions
dev_tools/agent/skill_evals/nvflare-convert-lightning/files/external-data-lightning/model.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,48 @@ | ||
| # Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved. | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| import pytorch_lightning as pl | ||
| import torch | ||
| import torch.nn as nn | ||
| import torch.nn.functional as F | ||
|
|
||
|
|
||
| class LitNet(pl.LightningModule): | ||
| def __init__(self, input_size=4, num_classes=2, lr=0.01): | ||
| super().__init__() | ||
| self.save_hyperparameters() | ||
| self.fc1 = nn.Linear(input_size, 8) | ||
| self.fc2 = nn.Linear(8, num_classes) | ||
|
|
||
| def forward(self, x): | ||
| x = F.relu(self.fc1(x)) | ||
| return self.fc2(x) | ||
|
|
||
| def training_step(self, batch, batch_idx): | ||
| features, labels = batch | ||
| if labels.numel() == 0: | ||
| raise ValueError("empty training batch; check per-site data partitioning") | ||
| loss = F.cross_entropy(self(features), labels) | ||
| self.log("train_loss", loss) | ||
| return loss | ||
|
|
||
| def validation_step(self, batch, batch_idx): | ||
| features, labels = batch | ||
| if labels.numel() == 0: | ||
| raise ValueError("empty validation batch; check per-site data partitioning") | ||
| loss = F.cross_entropy(self(features), labels) | ||
| self.log("val_loss", loss) | ||
|
|
||
| def configure_optimizers(self): | ||
| return torch.optim.SGD(self.parameters(), lr=self.hparams.lr) | ||
70 changes: 70 additions & 0 deletions
dev_tools/agent/skill_evals/nvflare-convert-lightning/files/external-data-lightning/train.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,70 @@ | ||
| # Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved. | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| import argparse | ||
| import csv | ||
| from pathlib import Path | ||
|
|
||
| import pytorch_lightning as pl | ||
| import torch | ||
| from model import LitNet | ||
| from torch.utils.data import DataLoader, TensorDataset | ||
|
|
||
| DEFAULT_DATA_DIR = "/data/nvflare/lightning-tabular" | ||
|
|
||
|
|
||
| def load_csv(data_path): | ||
| features = [] | ||
| labels = [] | ||
| with Path(data_path).open(newline="", encoding="utf-8") as csv_file: | ||
| reader = csv.DictReader(csv_file) | ||
| for row in reader: | ||
| features.append([float(row[f"feature_{index}"]) for index in range(4)]) | ||
| labels.append(int(row["label"])) | ||
| if not features: | ||
| raise ValueError(f"no rows loaded from {data_path}") | ||
| return TensorDataset(torch.tensor(features, dtype=torch.float32), torch.tensor(labels, dtype=torch.long)) | ||
|
|
||
|
|
||
| class TabularDataModule(pl.LightningDataModule): | ||
| def __init__(self, data_dir=DEFAULT_DATA_DIR, batch_size=4): | ||
| super().__init__() | ||
| self.data_dir = Path(data_dir) | ||
| self.batch_size = batch_size | ||
|
|
||
| def setup(self, stage=None): | ||
| self.train_dataset = load_csv(self.data_dir / "train.csv") | ||
| self.val_dataset = load_csv(self.data_dir / "val.csv") | ||
|
|
||
| def train_dataloader(self): | ||
| return DataLoader(self.train_dataset, batch_size=self.batch_size, shuffle=True) | ||
|
|
||
| def val_dataloader(self): | ||
| return DataLoader(self.val_dataset, batch_size=self.batch_size) | ||
|
|
||
|
|
||
| def main(): | ||
| parser = argparse.ArgumentParser() | ||
| parser.add_argument("--data-dir", default=DEFAULT_DATA_DIR) | ||
| parser.add_argument("--batch-size", type=int, default=4) | ||
| args = parser.parse_args() | ||
|
|
||
| model = LitNet() | ||
| datamodule = TabularDataModule(data_dir=args.data_dir, batch_size=args.batch_size) | ||
| trainer = pl.Trainer(max_epochs=1, accelerator="cpu", devices=1, logger=False) | ||
| trainer.fit(model, datamodule=datamodule) | ||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
| main() |
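
To try this fixture locally, the data directory needs `train.csv` and `val.csv` with the columns `load_csv` reads: `feature_0` through `feature_3` and `label`. The sketch below writes and re-reads such files using only the standard library. `write_split`, `read_split`, and the sample values are hypothetical; `read_split` mirrors the fixture's parsing but skips the tensor conversion so it runs without PyTorch.

```python
import csv
import tempfile
from pathlib import Path

FEATURE_COLUMNS = [f"feature_{index}" for index in range(4)]


def write_split(path, rows):
    """Write (features, label) rows using the column names load_csv reads."""
    with Path(path).open("w", newline="", encoding="utf-8") as csv_file:
        writer = csv.DictWriter(csv_file, fieldnames=FEATURE_COLUMNS + ["label"])
        writer.writeheader()
        for features, label in rows:
            record = dict(zip(FEATURE_COLUMNS, features))
            record["label"] = label
            writer.writerow(record)


def read_split(path):
    """Parse a split like the fixture does, minus the tensor conversion."""
    features, labels = [], []
    with Path(path).open(newline="", encoding="utf-8") as csv_file:
        for row in csv.DictReader(csv_file):
            features.append([float(row[name]) for name in FEATURE_COLUMNS])
            labels.append(int(row["label"]))
    if not features:
        raise ValueError(f"no rows loaded from {path}")
    return features, labels


# Made-up sample rows, written to a throwaway directory for the demo.
with tempfile.TemporaryDirectory() as data_dir:
    rows = [([0.1, 0.2, 0.3, 0.4], 0), ([1.0, 1.5, 2.0, 2.5], 1)]
    for split in ("train.csv", "val.csv"):
        write_split(Path(data_dir) / split, rows)
    features, labels = read_split(Path(data_dir) / "train.csv")
```

For a real run, the files would go into a persistent directory that is then passed to the script through `--data-dir` rather than relying on the default path.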
36 changes: 36 additions & 0 deletions
dev_tools/agent/skill_evals/nvflare-convert-lightning/files/gpu-device-lightning/model.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,36 @@ | ||
| # Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved. | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| import pytorch_lightning as pl | ||
| import torch | ||
| import torch.nn as nn | ||
| import torch.nn.functional as F | ||
|
|
||
|
|
||
| class LitNet(pl.LightningModule): | ||
| def __init__(self): | ||
| super().__init__() | ||
| self.layer = nn.Linear(4, 2) | ||
|
|
||
| def forward(self, features): | ||
| return self.layer(features) | ||
|
|
||
| def training_step(self, batch, batch_idx): | ||
| features, labels = batch | ||
| if labels.numel() == 0: | ||
| raise ValueError("empty training batch; check per-site data partitioning") | ||
| return F.cross_entropy(self(features), labels) | ||
|
|
||
| def configure_optimizers(self): | ||
| return torch.optim.SGD(self.parameters(), lr=0.01) | ||
29 changes: 29 additions & 0 deletions
dev_tools/agent/skill_evals/nvflare-convert-lightning/files/gpu-device-lightning/train.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,29 @@ | ||
| # Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved. | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| import pytorch_lightning as pl | ||
| import torch | ||
| from model import LitNet | ||
| from torch.utils.data import DataLoader, TensorDataset | ||
|
|
||
|
|
||
| def main(): | ||
| accelerator = "gpu" if torch.cuda.is_available() else "cpu" | ||
| dataset = TensorDataset(torch.randn(8, 4), torch.randint(0, 2, (8,))) | ||
| trainer = pl.Trainer(max_epochs=1, accelerator=accelerator, devices=1, logger=False) | ||
| trainer.fit(LitNet(), DataLoader(dataset, batch_size=4)) | ||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
| main() |
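
One way to check that a conversion preserves this device choice on a host without a GPU is to isolate the decision in a pure function and inject the availability flag. This is a minimal sketch, not code from the PR: `pick_accelerator` and `trainer_kwargs` are hypothetical helpers, and the keyword values are copied from the `pl.Trainer(...)` call in the fixture.

```python
def pick_accelerator(cuda_available: bool) -> str:
    """Same choice as the fixture, with the CUDA probe passed in as a flag."""
    return "gpu" if cuda_available else "cpu"


def trainer_kwargs(cuda_available: bool) -> dict:
    """Keyword arguments the fixture passes to pl.Trainer (hypothetical helper)."""
    return {
        "max_epochs": 1,
        "accelerator": pick_accelerator(cuda_available),
        "devices": 1,
        "logger": False,
    }
```

In the training script the flag would come from `torch.cuda.is_available()`, and a test can pass `True` or `False` directly to cover both branches.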
48 changes: 48 additions & 0 deletions
dev_tools/agent/skill_evals/nvflare-convert-lightning/files/hello-lightning/model.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,48 @@ | ||
| # Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved. | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| import pytorch_lightning as pl | ||
| import torch | ||
| import torch.nn as nn | ||
| import torch.nn.functional as F | ||
|
|
||
|
|
||
| class LitNet(pl.LightningModule): | ||
| def __init__(self, input_size=4, num_classes=2, lr=0.01): | ||
| super().__init__() | ||
| self.save_hyperparameters() | ||
| self.fc1 = nn.Linear(input_size, 8) | ||
| self.fc2 = nn.Linear(8, num_classes) | ||
|
|
||
| def forward(self, x): | ||
| x = F.relu(self.fc1(x)) | ||
| return self.fc2(x) | ||
|
|
||
| def training_step(self, batch, batch_idx): | ||
| features, labels = batch | ||
| if labels.numel() == 0: | ||
| raise ValueError("empty training batch; check per-site data partitioning") | ||
| loss = F.cross_entropy(self(features), labels) | ||
| self.log("train_loss", loss) | ||
| return loss | ||
|
|
||
| def validation_step(self, batch, batch_idx): | ||
| features, labels = batch | ||
| if labels.numel() == 0: | ||
| raise ValueError("empty validation batch; check per-site data partitioning") | ||
| loss = F.cross_entropy(self(features), labels) | ||
| self.log("val_loss", loss) | ||
|
|
||
| def configure_optimizers(self): | ||
| return torch.optim.SGD(self.parameters(), lr=self.hparams.lr) | ||
36 changes: 36 additions & 0 deletions
dev_tools/agent/skill_evals/nvflare-convert-lightning/files/hello-lightning/train.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,36 @@ | ||
| # Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved. | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| import pytorch_lightning as pl | ||
| import torch | ||
| from model import LitNet | ||
| from torch.utils.data import DataLoader, TensorDataset | ||
|
|
||
|
|
||
| def make_loader(): | ||
| features = torch.randn(8, 4) | ||
| labels = torch.randint(0, 2, (8,)) | ||
| return DataLoader(TensorDataset(features, labels), batch_size=4) | ||
|
|
||
|
|
||
| def main(): | ||
| model = LitNet() | ||
| train_loader = make_loader() | ||
| val_loader = make_loader() | ||
| trainer = pl.Trainer(max_epochs=1, accelerator="cpu", devices=1, logger=False) | ||
| trainer.fit(model, train_loader, val_loader) | ||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
| main() |