diff --git a/.claude/skills/scverse-template-update/SKILL.md b/.claude/skills/scverse-template-update/SKILL.md new file mode 100644 index 00000000..98a7a6c9 --- /dev/null +++ b/.claude/skills/scverse-template-update/SKILL.md @@ -0,0 +1,186 @@ +--- +name: scverse-template-update +description: Update a downstream repository to a new release of the cookiecutter-scverse template, intelligently preserving deviations the project made on purpose while still modernizing CI, build system, and tool versions. Use when asked to sync/update an scverse ecosystem package to a new template tag — an agent-driven replacement for the programmatic cruft-based sync that avoids tedious merge conflicts. +--- + +# scverse-template-update + +Sync a repository built from **cookiecutter-scverse** up to a newer template release. + +On a new template release, the scverse bot opens a tracking issue in each downstream +repo (`scripts/src/scverse_template_scripts/template_issues.py`) and a maintainer +assigns an agent to it — that's where you come in. This replaces the old PR-sending +bot, which did a blind 3-way merge: it re-rendered the template and forced every +change onto the repo, so any file the project changed on purpose came back as a +merge conflict the maintainer had to resolve by hand. **Your advantage is +judgement.** You can read the project's +git history to tell an *intentional* customization apart from a file that is merely +*stale*, propagate the template's modernizations cleanly, and only escalate the +genuinely ambiguous cases to a human. + +## Inputs + +- **Target repo** — a local path, or a GitHub `owner/repo` / URL to clone first. + If you were **assigned to a GitHub issue** asking for this update, the target is + the current checkout (`--repo .`). +- **New template tag** — e.g. `v0.6.0`. A git tag in the cookiecutter-scverse repo. + When triggered from an issue, the tag is stated in the issue body/title. 
+- **Template source** (optional) — defaults to the `template` URL recorded in the + repo's `.cruft.json`. Override with a local checkout when validating an + unreleased tag (e.g. point at the current working tree of this template repo). + +If any of these is missing or ambiguous, ask before proceeding. + +### Getting this skill's helper script + +Downstream repos don't ship the skill — it lives in the cookiecutter-scverse repo. +If `.claude/skills/scverse-template-update/sync_helper.py` is **not** already present +in your checkout, download it (pinned to the tag you're updating to) before step 2: + +```bash +mkdir -p .tmpl-sync +curl -fsSL -o .tmpl-sync/sync_helper.py \ + "https://raw.githubusercontent.com/scverse/cookiecutter-scverse//.claude/skills/scverse-template-update/sync_helper.py" +``` + +and run that copy in place of the path shown in step 2. (If you are reading these +very instructions from a URL rather than a local `SKILL.md`, fetch the helper the +same way.) + +## How the sync works conceptually + +The repo's `.cruft.json` records (a) the template URL, (b) the `commit` it was last +synced to, and (c) the exact cookiecutter answers (`context.cookiecutter`). With +those answers we render the template **twice** — at the old commit and at the new +tag — using the project's own values. Then for every file we know three things: +its old-template version (`T_old`), its new-template version (`T_new`), and the +repo's current version (`R`). 
That cross-tabulation drives every decision: + +| Situation | Meaning | Default action | +|---|---|---| +| `T_old == T_new` | template didn't touch this file | leave `R` untouched | +| file only in `T_new` | template added a file | add it (reconcile if `R` already differs) | +| file only in `T_old` | template removed a file | remove from repo (confirm it isn't repurposed) | +| `T_old != T_new`, `R == T_old` | template changed it, repo never customized | **take `T_new` verbatim** (clean modernization) | +| `T_old != T_new`, `R != T_old` | template changed it **and** repo diverged | **judgement call** — 3-way merge (below) | +| file only in repo | project content (`src/`, `tests/`, …) | never touch | + +`sync_helper.py` computes all of this for you. You spend your effort only on the +"judgement call" row. + +## Procedure + +### 1. Prepare + +- Ensure the target repo is checked out and its working tree is **clean** + (`git -C status`). Refuse to run on a dirty tree. +- Never work on the default branch. Create `git switch -c template-update-`. +- Confirm `.cruft.json` exists. If it doesn't, this repo isn't template-linked — + stop and tell the user (the legacy `cruft update` path doesn't apply either). + +### 2. Run the helper + +```bash +python .claude/skills/scverse-template-update/sync_helper.py \ + --repo --tag [--template ] --workdir +``` + +It renders `render_old/` and `render_new/`, and writes `manifest.json`. Read the +manifest. It also surfaces the skip lists: +- `_exclude_on_template_update` (from `.cruft.json`) and +- `[tool.cruft] skip` in the repo's `pyproject.toml` +These mark **user-owned** files (`src/**`, `tests/**`, `README.md`, `CHANGELOG.md`, +`docs/api.md`, `docs/index.md`, the example notebook, references, package +`__init__.py`/`basic.py`). Do not overwrite their *content*; only make a minimal +edit if a template change strictly requires it, and call it out. + +### 3. 
Apply the clean changes (no judgement needed) + +- **`template_modified_repo_clean`**: copy `render_new/` over the repo file. +- **`template_added`**: add `render_new/`. If the repo already has a + differing version, treat it like a diverged file (step 4). +- **`template_removed`**: remove from the repo, unless the file was clearly + repurposed for project use (check `git log`) — when unsure, leave it and note it. +- **`template_unchanged`** / **`repo_only_files`**: leave alone. + +### 4. Reconcile the diverged files (the point of this skill) + +For each entry in `template_modified_repo_diverged`, the manifest includes the +`template_diff` — the `T_old → T_new` change, i.e. *what the new template wants*. +For each one: + +1. **Understand the divergence.** Why does `R` differ from `T_old`? + ```bash + git -C log --oneline --follow -- + git -C log -p --follow -- # read the actual commits + ``` + - If the only commits touching it are past **template-sync / cruft** commits, or + the content simply matches an *older* template, the divergence is **stale** → + take `T_new` verbatim. + - If a deliberate, project-specific commit changed it (extra CI job, different + Python version matrix, added dependency, custom RTD/codecov config, tweaked + ruff rules, …), the divergence is **intentional** → preserve it. + +2. **3-way merge for intentional divergence.** Start from the repo's file `R` and + apply *only* the hunks of `template_diff` that are orthogonal to the + customization. Concretely: take the template's modernization (bumped action + versions, new build-backend settings, renamed keys, added steps) **and** keep + the project's intentional content. The goal is "repo's intent + template's + freshness," never a `<<<<<<<` conflict marker left in a file. + +3. **When genuinely unclear**, prefer preserving the project's version and record + the file in a "needs human review" list for the PR body rather than guessing. + +### 5. 
Modernization mandate + +Regardless of divergence, the repo **must** end up modern on infrastructure unless +a clear intentional customization says otherwise. Pay special attention to: + +- `.github/workflows/**` — runner images, action versions (`actions/checkout`, + `setup-uv`, …), permissions blocks, job structure. +- `.pre-commit-config.yaml` — hook repos and `rev:` pins. +- `pyproject.toml` — `[build-system]`, hatch envs, `requires-python`, classifiers, + ruff/tool config, dependency-group layout. +- `.readthedocs.yaml`, `docs/conf.py` machinery, `.codecov.yaml`, `.editorconfig`. + +If you preserved an *old* tool version because the repo pinned it, double-check the +pin was intentional (a comment or a commit explaining why) — otherwise modernize it. + +### 6. Finalize + +1. **Update `.cruft.json`**: set `"commit"` to the new template commit + (`new_commit` in the manifest) and bump any recorded checkout/tag. Keep + `context.cookiecutter` unchanged unless the template added new variables — if so, + add them with the template's defaults and note it. +2. **Verify** in the repo (best effort; report what passes/fails, don't hide + failures): + ```bash + cd + git add -A + pre-commit run --all-files # or: prek run --all-files + uv run python -c "import " + hatch run docs:build # if docs deps install cleanly + ``` + Fix mechanical fallout you introduced (formatting, an import the template moved). + Do not paper over a real failure caused by an intentional-divergence decision — + surface it. +3. **Commit** on the `template-update-` branch with a message like + `Update template to `. +4. **Open a PR** (only if the user asked, or you have a clear remote + auth). If you + were assigned to a template-update **issue**, the PR should close it (add + `Closes #` to the body). 
The PR body should make the maintainer's + review easy: + - what was modernized (CI, build, tool bumps), + - which deviations were **preserved** and why (cite the commits/reasons), + - anything **left for human review** and why, + - a reminder that pre-commit.ci / readthedocs / codecov should be enabled. + +## Guardrails + +- Read before you overwrite. A file matching `_exclude_on_template_update` or + `[tool.cruft] skip` is the project's, not the template's. +- Prefer a smaller, correct diff over a sweeping one. The maintainer should be able + to understand every change you made and why. +- Never leave conflict markers, `.rej`, or `.orig` files behind. +- When you couldn't decide, say so explicitly — an honest "needs review" beats a + confident wrong merge. diff --git a/.claude/skills/scverse-template-update/sync_helper.py b/.claude/skills/scverse-template-update/sync_helper.py new file mode 100644 index 00000000..f36f2b07 --- /dev/null +++ b/.claude/skills/scverse-template-update/sync_helper.py @@ -0,0 +1,227 @@ +#!/usr/bin/env python3 +"""Deterministic groundwork for an agent-driven cookiecutter-scverse template update. + +This does the *mechanical* part of a template sync and stops short of any decision +that benefits from judgement. It renders the template twice using the downstream +repo's own cookiecutter answers: + + * ``render_old`` — the template at the commit the repo was last synced to + (the ``commit`` field in ``.cruft.json``) + * ``render_new`` — the template at the requested new tag + +Comparing ``render_old`` vs ``render_new`` tells us exactly what the *template* +changed between the two versions (the modernization we want to propagate). +Comparing each render against the repo's current file tells us whether the repo +ever diverged. Cross-tabulating the three produces a per-file classification that +the agent then acts on (see SKILL.md). + +It writes nothing into the target repo. 
Output: the two render trees plus a +``manifest.json`` (and a human-readable summary on stdout). + +Usage: + python sync_helper.py --repo PATH --tag TAG [--template PATH_OR_URL] \ + [--workdir DIR] [--old-ref REF] + +Requires ``cruft`` to be importable/runnable (``python -m cruft``); install it on the +fly with e.g. ``uvx --from cruft python sync_helper.py ...`` if it isn't present. +""" + +from __future__ import annotations + +import argparse +import json +import shutil +import subprocess +import sys +import tempfile +from pathlib import Path + +# cruft create fails if these template-config vars differ from the template default, +# so we must not feed them back in as extra context (see cruft#166). +IGNORE_COOKIECUTTER_VARS = ["_copy_without_render"] + +# How much of each old->new template diff to embed in the manifest before truncating. +MAX_DIFF_CHARS = 8000 + + +def sh(cmd: list[str], **kw) -> subprocess.CompletedProcess: + return subprocess.run(cmd, check=True, text=True, capture_output=True, **kw) + + +def read_cruft_json(repo: Path) -> dict: + cruft_file = repo / ".cruft.json" + if not cruft_file.is_file(): + sys.exit(f"error: {cruft_file} not found — repo is not linked to the template, nothing to sync.") + return json.loads(cruft_file.read_text()) + + +def clone_or_link_template(template: str, workdir: Path) -> Path: + """Return a local path to the template git repo (clone it if a URL was given).""" + if Path(template).expanduser().is_dir(): + return Path(template).expanduser().resolve() + dest = workdir / "template" + print(f"Cloning template {template} -> {dest}") + sh(["git", "clone", "--filter=blob:none", template, str(dest)]) + return dest + + +def resolve_commit(template_dir: Path, ref: str) -> str: + return sh(["git", "-C", str(template_dir), "rev-list", "-n1", ref]).stdout.strip() + + +def render(template_dir: Path, ref: str, context: dict, out: Path, project_name: str) -> Path: + """Render the template at `ref` with `context` into `out`; return the 
project dir.""" + out.mkdir(parents=True, exist_ok=True) + with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f: + json.dump({k: v for k, v in context.items() if k not in IGNORE_COOKIECUTTER_VARS}, f) + ctx_file = f.name + cmd = [ + sys.executable, "-m", "cruft", "create", str(template_dir), + f"--checkout={ref}", "--no-input", f"--extra-context-file={ctx_file}", + "--output-dir", str(out), + ] + print("Running:", " ".join(cmd)) + # cruft runs the template's post-gen hook (git commit, pre-commit install); that is + # expected to succeed exactly as it does in the production sync. Surface output on failure. + proc = subprocess.run(cmd, text=True, capture_output=True) + if proc.returncode != 0: + sys.exit(f"cruft create at {ref} failed:\n{proc.stdout}\n{proc.stderr}") + project = out / project_name + if not project.is_dir(): + # fall back to the single generated directory if project_name was overridden + subdirs = [p for p in out.iterdir() if p.is_dir()] + if len(subdirs) != 1: + sys.exit(f"could not locate rendered project under {out}: {subdirs}") + project = subdirs[0] + return project + + +def list_files(root: Path) -> set[str]: + return { + str(p.relative_to(root)) + for p in root.rglob("*") + if p.is_file() and ".git/" not in f"{p.relative_to(root)}/" + } + + +def same(a: Path, b: Path) -> bool: + try: + return a.read_bytes() == b.read_bytes() + except OSError: + return False + + +def unified_diff(old: Path, new: Path, rel: str) -> str: + """Best-effort old->new template diff for a single file (the template's *intent*).""" + try: + proc = subprocess.run( + ["git", "diff", "--no-index", "--no-color", str(old), str(new)], + text=True, capture_output=True, + ) + out = proc.stdout + except OSError: + out = "" + if len(out) > MAX_DIFF_CHARS: + out = out[:MAX_DIFF_CHARS] + "\n... 
[diff truncated] ...\n" + return out + + +def main() -> None: + ap = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter) + ap.add_argument("--repo", required=True, type=Path, help="path to the downstream repo checkout") + ap.add_argument("--tag", required=True, help="new template release tag to update to") + ap.add_argument("--template", help="template path or URL (default: 'template' field in .cruft.json)") + ap.add_argument("--old-ref", help="override the old template ref (default: 'commit' field in .cruft.json)") + ap.add_argument("--workdir", type=Path, help="scratch dir for renders (default: a temp dir)") + args = ap.parse_args() + + repo = args.repo.expanduser().resolve() + workdir = (args.workdir or Path(tempfile.mkdtemp(prefix="tmpl-sync-"))).resolve() + workdir.mkdir(parents=True, exist_ok=True) + + cruft = read_cruft_json(repo) + context = cruft["context"]["cookiecutter"] + project_name = context["project_name"] + template_src = args.template or cruft["template"] + old_ref = args.old_ref or cruft["commit"] + + template_dir = clone_or_link_template(template_src, workdir) + old_commit = resolve_commit(template_dir, old_ref) + new_commit = resolve_commit(template_dir, args.tag) + + render_old = render(template_dir, old_commit, context, workdir / "render_old", project_name) + render_new = render(template_dir, args.tag, context, workdir / "render_new", project_name) + + old_files = list_files(render_old) + new_files = list_files(render_new) + repo_files = list_files(repo) + + cats: dict[str, list] = { + "template_added": [], # new file the template introduces + "template_removed": [], # file the template dropped + "template_modified_repo_clean": [], # template changed it AND repo still matches old -> safe to take new + "template_modified_repo_diverged": [],# template changed it AND repo diverged -> needs judgement (3-way) + "template_unchanged": [], # template identical old==new -> informational, leave repo alone 
+ } + + for rel in sorted(old_files | new_files): + in_old, in_new = rel in old_files, rel in new_files + o, n, r = render_old / rel, render_new / rel, repo / rel + if in_new and not in_old: + entry = {"path": rel, "in_repo": rel in repo_files} + if rel in repo_files and not same(n, r): + entry["note"] = "repo already has a differing version" + cats["template_added"].append(entry) + elif in_old and not in_new: + cats["template_removed"].append({"path": rel, "in_repo": rel in repo_files}) + elif same(o, n): + cats["template_unchanged"].append(rel) + elif rel not in repo_files: + # template changed it but the repo deleted it — treat as a divergence to judge + cats["template_modified_repo_diverged"].append( + {"path": rel, "repo_state": "deleted", "template_diff": unified_diff(o, n, rel)} + ) + elif same(o, r): + cats["template_modified_repo_clean"].append(rel) + else: + cats["template_modified_repo_diverged"].append( + {"path": rel, "repo_state": "modified", "template_diff": unified_diff(o, n, rel)} + ) + + repo_only = sorted(repo_files - old_files - new_files) + + manifest = { + "project_name": project_name, + "repo": str(repo), + "template": str(template_dir), + "old_ref": old_ref, + "old_commit": old_commit, + "new_tag": args.tag, + "new_commit": new_commit, + "render_old": str(render_old), + "render_new": str(render_new), + "cruft_skip": cruft.get("context", {}).get("cookiecutter", {}).get("_exclude_on_template_update", []), + "tool_cruft_skip_hint": "also read [tool.cruft] skip in the repo's pyproject.toml", + "categories": cats, + "repo_only_files": repo_only, + } + (workdir / "manifest.json").write_text(json.dumps(manifest, indent=2)) + + # human-readable summary + print("\n" + "=" * 72) + print(f"Template update plan: {project_name} {old_commit[:8]} -> {args.tag} ({new_commit[:8]})") + print("=" * 72) + print(f"render_old : {render_old}") + print(f"render_new : {render_new}") + print(f"manifest : {workdir / 'manifest.json'}\n") + print(f" template_added : 
{len(cats['template_added'])}") + print(f" template_removed : {len(cats['template_removed'])}") + print(f" template_modified_repo_clean : {len(cats['template_modified_repo_clean'])} (safe to take new)") + print(f" template_modified_repo_diverged : {len(cats['template_modified_repo_diverged'])} (NEEDS JUDGEMENT)") + print(f" template_unchanged : {len(cats['template_unchanged'])} (leave repo as-is)") + print(f" repo_only_files : {len(repo_only)} (project content, do not touch)") + print("\nNext: follow SKILL.md to reconcile the 'diverged' files and apply the rest.") + + +if __name__ == "__main__": + main() diff --git a/.github/workflows/cruft-prs.yml b/.github/workflows/template-issues.yml similarity index 53% rename from .github/workflows/cruft-prs.yml rename to .github/workflows/template-issues.yml index 0c2d994b..08172a6f 100644 --- a/.github/workflows/cruft-prs.yml +++ b/.github/workflows/template-issues.yml @@ -1,35 +1,31 @@ -name: make cruft PRs for all projects using us +name: notify projects of a new template release on: release: types: [released] # unlike 'published' this does not trigger on pre-releases workflow_dispatch: inputs: release: - description: "Tag of the release PRs should me made for" + description: "Tag of the release to notify projects about" type: string required: true + dry_run: + description: "Log intended actions without creating/editing issues" + type: boolean + default: false jobs: - cruft-prs: + template-issues: runs-on: ubuntu-latest steps: - uses: actions/checkout@v6 - - name: Set git identity - run: | - git config --global user.name "scverse-bot" - git config --global user.email "108668866+scverse-bot@users.noreply.github.com" - name: Install the latest version of uv uses: astral-sh/setup-uv@v7 with: cache-dependency-glob: scripts/pyproject.toml - - name: Update template repo registry - run: uvx --from ./scripts send-cruft-prs ${{ env.RELEASE }} --all_repos --log-dir log + - name: Open template-update issues + run: uvx --from 
./scripts send-template-issues "$RELEASE" --all-repos ${DRY_RUN:+--dry-run} env: RELEASE: ${{ github.event_name == 'release' && github.event.release.tag_name || github.event.inputs.release }} + DRY_RUN: ${{ github.event.inputs.dry_run == 'true' && '1' || '' }} GITHUB_TOKEN: ${{ secrets.BOT_GH_TOKEN }} FORCE_COLOR: "1" COLUMNS: "150" - - uses: actions/upload-artifact@v4 - if: always() - with: - name: cruft-logs - path: log/ diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml index 5e17e1ba..b16a8750 100644 --- a/.github/workflows/test.yml +++ b/.github/workflows/test.yml @@ -75,9 +75,6 @@ jobs: - name: set git default branch run: git config --global init.defaultBranch main - name: Run tests - env: - SCVERSE_BOT_READONLY_GITHUB_TOKEN: ${{ secrets.SCVERSE_BOT_READONLY_GITHUB_TOKEN }} - # PYTHONTRACEMALLOC: '20' # uncomment when debugging unclosed resources working-directory: ./scripts run: uvx hatch test --color=yes diff --git a/CLAUDE.md b/CLAUDE.md new file mode 100644 index 00000000..e307b681 --- /dev/null +++ b/CLAUDE.md @@ -0,0 +1,64 @@ +# CLAUDE.md + +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. + +## What this repository is + +This is **not** a Python package; it is a [cookiecutter](https://github.com/cookiecutter/cookiecutter) template used (via [cruft](https://github.com/cruft/cruft)) to scaffold new [scverse](https://scverse.org) ecosystem packages. The generated projects are best-practice Python libraries that depend on `anndata`/`mudata`. + +Two distinct concerns live side by side: + +1. **The template itself**: everything under `{{cookiecutter.project_name}}/`, controlled by `cookiecutter.json` and the `hooks/` scripts. Files here are Jinja2 templates rendered at generation time. +2. **`scripts/`**: a real, installable Python package (`scverse-template-scripts`) containing automation that runs in CI, most importantly the notifier that opens template-update issues in downstream repos. 
+ +## Key architectural points + +### Template rendering (`{{cookiecutter.project_name}}/`) +- Files are rendered with Jinja2. `cookiecutter.json` defines the variables (`project_name`, `package_name`, `license`, `ide_integration`, `issue_categorization`, plus `_`-prefixed config). +- `_jinja2_env_vars` sets `trim_blocks`/`lstrip_blocks`, so `{% %}` block lines don't leave blank lines. +- `_copy_without_render` lists files copied verbatim (e.g. downstream GitHub workflows that themselves contain `{{ }}` syntax). Editing those requires no Jinja escaping. +- `_exclude_on_template_update` lists files cruft won't overwrite when updating an existing project (user-owned files like `src/**`, `tests/**`, `README.md`). +- **Conditional files/dirs**: a path like `{{".vscode" if cookiecutter.ide_integration else "DELETE-ME"}}/` renders to a directory literally named `DELETE-ME` when disabled; `hooks/post_gen_project.py` then deletes every `DELETE-ME` directory after generation. +- `hooks/post_gen_project.py` also makes the initial git commit (so cruft has a clean template-only baseline) and installs pre-commit. `hooks/pre_gen_project.py` runs validation before rendering. + +### Template-update notifier (`scripts/src/scverse_template_scripts/template_issues.py`) +- Entry point `send-template-issues` (`scripts.send-template-issues` in `scripts/pyproject.toml`). Triggered by `.github/workflows/template-issues.yml` on a GitHub **release** (not pre-release). +- Reads the list of downstream repos from `template-repos.yml` in `scverse/ecosystem-packages`, then opens (or refreshes) one tracking **issue** per repo announcing the new tag. The issue links the `scverse-template-update` agent skill (by URL, pinned to the tag); a maintainer assigns a coding agent to perform the update. Repos opt out via `skip: true` there or by deleting `.cruft.json`. 
+- Idempotent via a hidden `` marker: an open bot issue already at the tag is skipped, one at an older tag is edited in place, closed issues are left alone. +- This **replaced** the previous PR-based bot (`cruft_prs.py`), which forked/cloned/re-rendered each repo and pushed a template-update branch — that produced tedious merge conflicts on intentionally-customized files. The agent-driven flow reconciles semantically instead. The actual update logic lives in the skill at `.claude/skills/scverse-template-update/` (`SKILL.md` + `sync_helper.py`), not in `scripts/`. + +## Commands + +All Python tooling uses `uv`/`hatch`. There is no top-level Python package to install. + +### Working on `scripts/` +```bash +cd scripts +uvx hatch test # run the test suite +uvx hatch test -- -k test_name # run a single test +``` +Note: `test_build.py` runs cookiecutter against the template and asserts on rendered output and that no `DELETE-ME` dirs remain (its hook makes a git commit, hence the `git config` steps in CI); `test_issues.py` unit-tests the issue body/marker helpers. Neither needs network or a token. + +### Testing the template end-to-end (what CI does) +```bash +# Render the template from the current working tree +cruft create . --no-input --extra-context='{"package_name":"package_alt"}' +cd project-name +git add . +pre-commit run --all-files # lint the generated project +uv run python -c "import package_alt" +hatch run docs:build # build generated docs (needs pandoc installed) +``` +CI matrixes over Python 3.11 & 3.14 and over a `package_name` that matches the default derivation vs. one that doesn't (`package_alt`) — always consider both cases when touching name-dependent template logic. + +### Linting +```bash +pre-commit run --all-files +``` +`.pre-commit-config.yaml` here covers the template repo itself: `biome-format` (JSON/JS via `biome.jsonc`), `ruff` + `pyproject-fmt` scoped to `scripts/`, and `codespell`. 
The generated project ships its own separate pre-commit config. + +## Conventions + +- Ruff in `scripts/` enforces `from __future__ import annotations` (required-import) and a broad lint set; types used only for annotations go under `if TYPE_CHECKING:`. +- When editing template files, remember output is the *rendered* result; test a representative `cruft create` rather than eyeballing Jinja. +- Releasing a new template version = creating a GitHub release with a `vX.X.X` tag; this automatically opens template-update issues in downstream packages, so treat releases as user-facing changes. diff --git a/cookiecutter.json b/cookiecutter.json index e6d37188..2403dad0 100644 --- a/cookiecutter.json +++ b/cookiecutter.json @@ -15,6 +15,10 @@ "Unlicense" ], "ide_integration": true, + "issue_categorization": [ + "labels", + "issue types" + ], "_copy_without_render": [ ".github/workflows/build.yaml", ".github/workflows/test.yaml", @@ -45,6 +49,7 @@ "author_email": "The e-mail address your package’s users can contact you under", "github_user": "The GitHub username or org the project is to be published under", "github_repo": "If the repo name should differ from the project name, edit it now", - "ide_integration": "Whether to generate IDE configuration files" + "ide_integration": "Whether to generate IDE configuration files", + "issue_categorization": "How to categorize issues in the issue templates. 
Choose “issue types” only if your GitHub organization has issue types enabled (https://docs.github.com/en/issues/tracking-your-work-with-issues/configuring-issues/managing-issue-types-in-an-organization); otherwise choose “labels”" } } diff --git a/scripts/pyproject.toml b/scripts/pyproject.toml index d4dba31a..0522c49f 100644 --- a/scripts/pyproject.toml +++ b/scripts/pyproject.toml @@ -19,11 +19,7 @@ classifiers = [ ] dynamic = [ "version" ] dependencies = [ - "cruft", "cyclopts>=3.10", - "furl", - "gitpython", - "pre-commit", # is ran by cruft "pygithub>=2", "pyyaml", "rich", @@ -32,11 +28,14 @@ urls.Documentation = "https://github.com/scverse/cookiecutter-scverse#readme" urls.Issues = "https://github.com/scverse/cookiecutter-scverse/issues" urls.Source = "https://github.com/scverse/cookiecutter-scverse" scripts.make-rich-output = "scverse_template_scripts.make_rich_output:main" -scripts.send-cruft-prs = "scverse_template_scripts.cruft_prs:cli" +scripts.send-template-issues = "scverse_template_scripts.template_issues:cli" [tool.hatch] version.source = "vcs" version.fallback-version = "0.0" +# `cookiecutter` is only needed to render the template in tests (test_build.py); +# it used to be pulled in transitively via `cruft`, which the package no longer depends on. 
+envs.hatch-test.extra-dependencies = [ "cookiecutter" ] [tool.ruff] line-length = 120 diff --git a/scripts/src/scverse_template_scripts/backoff.py b/scripts/src/scverse_template_scripts/backoff.py deleted file mode 100644 index 303d2556..00000000 --- a/scripts/src/scverse_template_scripts/backoff.py +++ /dev/null @@ -1,28 +0,0 @@ -from __future__ import annotations - -import random -import time -from typing import TYPE_CHECKING - -from ._log import log - -if TYPE_CHECKING: - from collections.abc import Callable - - -def retry_with_backoff[T]( - fn: Callable[[], T], - retries: int = 5, - backoff_in_seconds: int | float = 1, - exc_cls: type = Exception, -) -> T: - exc = None - for x in range(retries): - try: - return fn() - except exc_cls as _exc: - exc = _exc - sleep = backoff_in_seconds * 2**x + random.uniform(0, 1) - log.info(f"Action failed. Retrying in {sleep}s.") - time.sleep(sleep) - raise exc diff --git a/scripts/src/scverse_template_scripts/cruft_prs.py b/scripts/src/scverse_template_scripts/cruft_prs.py deleted file mode 100644 index 5950e421..00000000 --- a/scripts/src/scverse_template_scripts/cruft_prs.py +++ /dev/null @@ -1,594 +0,0 @@ -"""Script to send cruft update PRs. - -Uses `template-repos.yml` from `scverse/ecosystem-packages`. 
-""" - -from __future__ import annotations - -import json -import math -import os -import sys -from collections.abc import Iterable -from dataclasses import KW_ONLY, InitVar, dataclass, field -from glob import glob -from pathlib import Path -from subprocess import run -from tempfile import TemporaryDirectory -from typing import TYPE_CHECKING, ClassVar, TypedDict, cast - -from cyclopts import App -from furl import furl -from git.exc import GitCommandError -from git.repo import Repo -from git.util import Actor -from github import Auth, Github, UnknownObjectException -from yaml import safe_load - -from ._log import log, setup_logging -from .backoff import retry_with_backoff - -if TYPE_CHECKING: - from collections.abc import Generator, Sequence - from typing import IO, Literal, LiteralString, NotRequired - - from github.ContentFile import ContentFile - from github.GitRelease import GitRelease as GHRelease - from github.NamedUser import NamedUser - from github.PullRequest import PullRequest - from github.Repository import Repository as GHRepo - - -PR_BODY_TEMPLATE = """\ -`cookiecutter-scverse` released [{release.tag_name}]({release.html_url}). - -## Changes - -{release.body} - -## Additional remarks -* **unsubscribe**: If you don’t want to receive these PRs in the future, - add `skip: true` to [`template-repos.yml`][] using a PR or, - if you never want to sync from the template again, delete the `.cruft.json` file in the root of your repository. -* If there are **merge conflicts**, you need to resolve them manually. -* The scverse template works best when the [pre-commit.ci][], [readthedocs][] and [codecov][] services are enabled. - Make sure to activate those apps if you haven't already. 
- -[`template-repos.yml`]: https://github.com/scverse/ecosystem-packages/blob/main/template-repos.yml -[pre-commit.ci]: {template_usage}#pre-commit-ci -[readthedocs]: {template_usage}#documentation-on-readthedocs -[codecov]: {template_usage}#coverage-tests-with-codecov -""" - -# GitHub says that up to 5 minutes of waiting for a fork are OK, -# So we error our once we wait longer, i.e. when 2ⁿ = 5 min × 60 sec/min -N_RETRIES_WAIT_FOR_FORK = math.ceil(math.log(5 * 60) / math.log(2)) # = ⌈~8.22⌉ = 9 -# Due to exponential backoff, we’ll maximally wait 2⁹ sec, or 8.5 min - -# Ignore the following variables when re-initializing the template from a cookiecutter.json file -IGNORE_COOKIECUTTER_VARS = [ - # ignored because `cruft create` fails if it contains any different value than the default, see also https://github.com/cruft/cruft/issues/166 - "_copy_without_render", -] - - -@dataclass -class GitHubConnection: - """API connection to a GitHub user (e.g. scverse-bot)""" - - _login: InitVar[str] - token: str | None = field(repr=False, default=None) - _: KW_ONLY - email: str | None = field(default=None) - - gh: Github = field(init=False) - user: NamedUser = field(init=False) - sig: Actor = field(init=False) - - def __post_init__(self, _login: str) -> None: - self.gh = Github(auth=Auth.Token(self.token) if self.token else None) - self.user = cast("NamedUser", self.gh.get_user(_login)) - if self.email is None: - self.email = self.user.email - self.sig = Actor(self.login, self.email) - - @property - def login(self) -> str: - return self.user.login - - def auth(self, url_str: str) -> str: - url = furl(url_str) - if self.token: - url.username = self.token - return str(url) - - -@dataclass -class TemplateUpdatePR: - """A template update pull request to a repository using the cookiecutter-scverse template""" - - con: GitHubConnection - release: GHRelease - repo_id: str # something like scverse-scirpy - - title_prefix: ClassVar[LiteralString] = "Update template to " - # -v2 to 
distinguish from branch names generated from earlier version of template sync that was using cruft - # (before v0.5.0 release of cookiecutter-scverse) - branch_prefix: ClassVar[LiteralString] = "template-update-v2-" - - @property - def title(self) -> str: - return f"{self.title_prefix}{self.release.tag_name}" - - @property - def template_branch(self) -> str: - """Branch name in the forked repo that tracks template updates (stay the same across versions)""" - # as of v0.5.0 (new template sync), the branch name does not contain the release-tag anymore - return f"{self.branch_prefix}{self.repo_id}" - - @property - def pr_branch(self) -> str: - """Name of the branch that is used to create the pull-request. A new branch is created for each version.""" - return f"{self.template_branch}-{self.release.tag_name}" - - @property - def namespaced_head(self) -> str: - """Branch used to crate the pull request, including repo namespace""" - return f"{self.con.login}:{self.pr_branch}" - - @property - def body(self) -> str: - return PR_BODY_TEMPLATE.format( - release=self.release, - template_usage="https://cookiecutter-scverse-instance.readthedocs.io/en/latest/template_usage.html", - ) - - def matches_prefix(self, pr: PullRequest) -> bool: - """Check if `pr` is either a current or previous template update PR by matching the branch name""" - # Don’t compare title prefix, people might rename PRs - return pr.head.ref.startswith(self.branch_prefix) and pr.user.id == self.con.user.id - - def matches_current_version(self, pr: PullRequest) -> bool: - """Check if `pr` is a template update PR for the current version""" - return pr.head.ref == self.pr_branch and pr.user.id == self.con.user.id - - -class RepoInfo(TypedDict): - """Info about a repository using the cookiecutter-scverse template""" - - url: str - skip: NotRequired[bool] - - -def get_template_release(gh: Github, tag_name: str) -> GHRelease: - """ - Get a release by tag from the cookiecutter-scverse repo - - `gh` represents the 
github API, authenticated against scverse-bot. - """ - template_repo = gh.get_repo("scverse/cookiecutter-scverse") - return template_repo.get_release(tag_name) - - -def _parse_repos(f: IO[str] | str | bytes) -> list[RepoInfo]: - repos = cast("list[RepoInfo]", safe_load(f)) - log.info(f"Found {len(repos)} known repos") - return repos - - -def get_repo_urls(gh: Github) -> Generator[str]: - """ - Get a list of all repos using the cookiecutter-scverse template (based on a YML file in scverse/ecosystem-packages). - - `gh` represents the github API, authenticated against scverse-bot. - """ - repo = gh.get_repo("scverse/ecosystem-packages") - file = cast("ContentFile", repo.get_contents("template-repos.yml")) - for repo in _parse_repos(file.decoded_content): - if not repo.get("skip"): - yield repo["url"] - - -def get_fork(con: GitHubConnection, repo: GHRepo) -> GHRepo: - """ - Fork target repo into the scverse-bot namespace and wait until the fork has been created. - - If the fork already exists, it is reused. - - Parameters - ---------- - con - Github API connection, authenticated against scverse-bot - repo - Reference to the *original* github repo that uses the template (i.e. not the fork) - """ - log.info(f"Creating fork for {repo.url}") - fork = repo.create_fork() - return retry_with_backoff( - lambda: con.gh.get_repo(fork.id), - retries=N_RETRIES_WAIT_FOR_FORK, - exc_cls=UnknownObjectException, - ) - - -def _clone_and_prepare_repo( - con: GitHubConnection, clone_dir: Path, template_branch_name: str, *, forked_repo: GHRepo, original_repo: GHRepo -) -> Repo: - """ - Clone the forked repo and set up branches and remotes. - - This function - * clones the forked repo - * adds the original repo as a remote named "upstream" - * checks out a branch called `{template_branch_name}`. If it does not exist yet, - it is created off the initial commit of the default branch of the original repo. 
- - Parameters - ---------- - con - GitHub connection - clone_dir - directory into which to clone the repo - forked_repo - reference to the forked repo (to be cloned) - original_repo - reference to the original repo (to be set as upstream) - template_branch_name - branch to contain the repo template (to be added to fork) - """ - # Clone the repo with blob filtering for better performance - log.info(f"Cloning {forked_repo.clone_url} into {clone_dir}") - clone = retry_with_backoff( - lambda: Repo.clone_from(con.auth(forked_repo.clone_url), clone_dir, filter="blob:none"), - retries=N_RETRIES_WAIT_FOR_FORK, - exc_cls=GitCommandError, - ) - - # Add original repo as remote - upstream = clone.create_remote(name="upstream", url=original_repo.clone_url) - upstream.fetch() - - # Get the default branch - default_branch = original_repo.default_branch - - # Check if the branch already exists in the forked repo - remote_refs = [ref.name for ref in clone.remote("origin").refs] - full_branch_name = f"origin/{template_branch_name}" - - # create and/or checkout template-update branch - if full_branch_name not in remote_refs: - log.info(f"Branch {template_branch_name} does not exists yet, creating it from initial commit") - # Get the initial commit on the default branch - initial_commit = next(clone.iter_commits(default_branch, reverse=True)) - - # Create and checkout a new branch from the initial commit - branch = clone.create_head(template_branch_name, initial_commit.hexsha) - branch.checkout() - else: - log.info(f"Branch {template_branch_name} already exists, checking it out") - branch = clone.create_head(template_branch_name, full_branch_name) - branch.checkout() - - return clone - - -class CruftConfig(TypedDict): - context: dict[Literal["cookiecutter"], dict[str, str]] - - -def _get_cruft_config_from_upstream(repo: Repo, default_branch: str) -> CruftConfig: - """Get cruft config from the default branch in the upstream repo""" - log.info(f"Getting .cruft.json from the 
{default_branch} branch in {repo.remote('upstream').url}") - try: - # Try to get .cruft.json from the latest commit in upstream's default branch - cruft_content = repo.git.show(f"upstream/{default_branch}:.cruft.json") - cruft_config = cast("CruftConfig", json.loads(cruft_content)) - log.info(f"Successfully read .cruft.json from upstream/{default_branch}") - except GitCommandError: - msg = "No .cruft.json found in repository" - raise FileNotFoundError(msg) from None - - return cruft_config - - -def _apply_update( - clone: Repo, - *, - template_tag_name: str | None = None, - cruft_log_file: Path, - cookiecutter_config: dict, - template_url: str = "https://github.com/scverse/cookiecutter-scverse", -) -> None: - """ - Apply the changes from the template to the original repo - - Instantiate the specified version of the cookiecutter template with the config used by the original repo. - Then remove everything from the original repo and copy over all template files. - - The outcome is a branch in the original repo that contains the updated template that can be merged - into the default branch by the user. 
- """ - clone_dir = Path(clone.working_dir) - with TemporaryDirectory() as td: - template_dir = Path(td) - # Initialize a new repo off the current template version, using the configuration from .cruft.json - cookiecutter_config_file = template_dir / "cookiecutter.json" - with cookiecutter_config_file.open("w") as f: - # need to put the cookiecutter-related info from .cruft.json into separate file - json.dump({k: v for k, v in cookiecutter_config.items() if k not in IGNORE_COOKIECUTTER_VARS}, f) - - # run in a subprocess, otherwise not possible to capture output of post-run hooks - with cruft_log_file.open("w") as log_f: - cmd = [ - sys.executable, - "-m", - "cruft", - "create", - template_url, - *([f"--checkout={template_tag_name}"] if template_tag_name is not None else []), - "--no-input", - f"--extra-context-file={cookiecutter_config_file}", - ] - log.info("Running " + " ".join(cmd)) - run(cmd, stdout=log_f, stderr=log_f, check=True, cwd=template_dir) - template_dir_project_name = template_dir / cookiecutter_config["project_name"] - - # Remove everything from the original repo (except the `.git` directoroy) - cmd = ["/usr/bin/find", ".", "-not", "-path", "./.git*", "-delete"] - log.info("Running " + " ".join(cmd) + f" in {clone_dir}") - run(cmd, check=True, cwd=clone_dir) - - # move over the contents from the new directory into the emptied git repo - cmd = [ - "/usr/bin/rsync", - "-Pva", - "--exclude", - ".git", - f"{template_dir_project_name.absolute()}/", - f"{clone_dir.absolute()}/", - ] - log.info("Running " + " ".join(repr(a) if " " in a else a for a in cmd)) - run(cmd, check=True, capture_output=True) - - -def _commit_update(clone: Repo, *, exclude_files: Sequence = (), commit_msg: str, commit_author: str) -> bool: - """ - Check if changes were made, and if yes, commit them. - - Glob patterns in `exclude_files` will not be staged for the commit. - - Returns a `bool` indicating whether changes have been made and committed. 
- """ - # Stage and commit (no_verify to avoid running pre-commit) - log.info("Changes detected. Staging and committing changes.") - # Check if something has changed at all - if not clone.is_dirty() and not clone.untracked_files: - log.info("Nothing has changed, aborting") - return False - - clone.git.add(A=True) - # unstage the files that we want to exclude from the template update - log.info(f"Excluding files from patterns: {exclude_files}") - for glob_pattern in exclude_files: - # need to check if pattern matches anything, because - if len(glob(glob_pattern, root_dir=clone.working_dir)): - clone.git.restore(glob_pattern, staged=True) - - # Check if there are any staged changes for commit - if not clone.git.diff_index("HEAD", cached=True, name_only=True): - log.info("Nothing has changed after excluding files, aborting") - return False - - clone.git.commit(m=commit_msg, no_verify=True, author=commit_author, no_gpg_sign=True) - return True - - -def template_update( # noqa: PLR0913, (= too many function arguments) - con: GitHubConnection, - *, - forked_repo: GHRepo, - original_repo: GHRepo, - template_branch_name: str, - versioned_branch_name: str, - tag_name: str, - cruft_log_file: Path, - dry_run: bool, -) -> bool: - """ - Create or update a template branch in the forked repo. - - Replacement for `cruft update` that implements all the template update logic from scratch. - Using this function, conflicts will show up as actual merge conflicts on Github, rather than creating `.rej` files. 
- - Here's a rough description of the approach: - 1) fork the repo to update into the scverse-bot namespace - 2) If no `template-update` branch exists in the fork, create one from the initial commit of the repo - 3) check out the `template-update` branch - 3) Remove everything from the template-branch - 4) Use `cruft create` to instantiate the template into a separate directory - 5) sync the changes from the separate directory into the `template-branch` - 6) commit - 7) check out commit into a version-specific branch used for making the pull request. See #396 for why this is - necessary. - - --> From this commit, we can make a pull-request to the original repo including the latest template-changes. - - Parameters - ---------- - con - A connection to the github API, authenticated against scverse-bot - forked_repo - The repo forked in scverse-bot namespace - template_branch_name - branch name to use for the template in the forked repo - versioned_branch_name - version-specific branch name (will be created off the template branch) - original_repo - The original (upstream) repo - tag_name - tag name of cookiecutter template to use - cruft_log_file - Filename to write cruft logs to - dry_run - If True, do not push changes - - """ - with ( - TemporaryDirectory() as clone_dir_str, - _clone_and_prepare_repo( - con, - (clone_dir := Path(clone_dir_str)), - template_branch_name, - forked_repo=forked_repo, - original_repo=original_repo, - ) as clone, - ): - default_branch: str = original_repo.default_branch - - cruft_config = _get_cruft_config_from_upstream(clone, default_branch) - cookiecutter_config = cruft_config["context"]["cookiecutter"] - _apply_update( - clone, - template_tag_name=tag_name, - cruft_log_file=cruft_log_file, - cookiecutter_config=cookiecutter_config, - ) - - # Load .cruft.json file of the current version of the template (includes `_exclude_on_template_update` key) - with (clone_dir / ".cruft.json").open() as f: - tmp_config = json.load(f) - exclude_files 
= tmp_config["context"]["cookiecutter"].get("_exclude_on_template_update", []) - - if ( - updated := _commit_update( - clone, - exclude_files=exclude_files, - commit_msg=f"Automated template update to {tag_name}", - commit_author=f"{con.sig.name} <{con.sig.email}>", - ) - ) and not dry_run: - clone.git.switch(versioned_branch_name, template_branch_name, C=True) - clone.git.push("origin", template_branch_name) - clone.git.push("origin", versioned_branch_name) - - return updated - - -def make_pr(con: GitHubConnection, release: GHRelease, repo_url: str, *, log_dir: Path, dry_run: bool = False) -> None: - """ - Make a pull request with the template update to the original repo - - Parameters - ---------- - con - A connection to the github API, authenticated against scverse-bot - release - A github release object, pointing to the release of cookiecutter-scverse to be used - repo_url - git URL of the repo to update - log_dir - Path in which cruft logs will be stored - dry_run - If True, skip making the actual pull request but perform all other actions up to this point - - """ - repo_id = repo_url.replace("https://github.com/", "").replace("/", "-") - log.info(f"Working on template update for {repo_id}") - - pr = TemplateUpdatePR(con, release, repo_id) - # create fork, populate branch, do PR from it - original_repo = con.gh.get_repo(repo_url.removeprefix("https://github.com/")) - - forked_repo = get_fork(con, original_repo) - - updated = template_update( - con, - forked_repo=forked_repo, - original_repo=original_repo, - template_branch_name=pr.template_branch, - versioned_branch_name=pr.pr_branch, - tag_name=release.tag_name, - cruft_log_file=log_dir / f"{pr.template_branch}.log", - dry_run=dry_run, - ) - if dry_run: - log.info("Skipping PR because in dry-run mode") - return - if updated: - if old_pr := next((p for p in original_repo.get_pulls("open") if pr.matches_current_version(p)), None): - log.info(f"PR already exists: #{old_pr.number} with branch name 
`{old_pr.head.ref}`. Skipping PR creation.") - return - - if old_pr := next((p for p in original_repo.get_pulls("open") if pr.matches_prefix(p)), None): - log.info(f"Closing old PR #{old_pr.number} with branch name `{old_pr.head.ref}`.") - old_pr.edit(state="closed") - - log.info(f"Creating PR of {pr.namespaced_head} against {original_repo.default_branch}") - new_pr = original_repo.create_pull( - title=pr.title, - body=pr.body, - base=original_repo.default_branch, - head=pr.namespaced_head, - maintainer_can_modify=True, - ) - log.info(f"Created PR #{new_pr.number} with branch name `{new_pr.head.ref}`.") - - -cli = App() - - -@cli.default -def main( - tag_name: str, - repo_urls: Iterable[str] | None = None, - *, - all_repos: bool = False, - log_dir: Path = Path("cruft_logs"), - dry_run: bool = False, -) -> None: - """ - Make PRs to GitHub repos. - - Parameters - ---------- - tag_name - Identifier of the release of cookiecutter-scverse - repo_urls - One or more repo URLs to make PRs to (e.g. for testing purposes). - Must be full GitHub URLs, e.g. https://github.com/scverse/scirpy. - all - With this flag, get the list of all repos that use the template from https://github.com/scverse/ecosystem-packages/blob/main/template-repos.yml. - log_dir - Directory to which cruft logs are written - dry_run - Skip making actual pull requests. All other actions up to this point are performed - (forking the repo, updating the template branch etc.). - """ - setup_logging() - log_dir.mkdir(exist_ok=True, parents=True) - - token = os.environ["GITHUB_TOKEN"] - con = GitHubConnection("scverse-bot", token, email="108668866+scverse-bot@users.noreply.github.com") - - if all_repos: - repo_urls = get_repo_urls(con.gh) - - if repo_urls is None: - msg = "Need to either specify `--all` or one or more repo URLs." 
- raise ValueError(msg) - - release = get_template_release(con.gh, tag_name) - failed = 0 - for repo_url in repo_urls: - try: - make_pr(con, release, repo_url, log_dir=log_dir, dry_run=dry_run) - except Exception as e: - failed += 1 - log.error(f"Error while updating {repo_url}") - log.exception(e) - - sys.exit(failed > 0) - - -if __name__ == "__main__": - cli() diff --git a/scripts/src/scverse_template_scripts/template_issues.py b/scripts/src/scverse_template_scripts/template_issues.py new file mode 100644 index 00000000..d5bdc99f --- /dev/null +++ b/scripts/src/scverse_template_scripts/template_issues.py @@ -0,0 +1,250 @@ +"""Notify downstream repositories that a new template version is available. + +Replaces the old PR-sending bot (``cruft_prs.py``). Instead of forking every repo +and pushing a (frequently conflict-ridden) template-update branch, this opens a +single tracking **issue** per repo pointing at the ``scverse-template-update`` agent +skill. A maintainer then assigns the coding agent of their choice to the issue, and +the agent performs the update — preserving intentional deviations rather than +producing mechanical merge conflicts. + +The list of repos comes from ``template-repos.yml`` in ``scverse/ecosystem-packages``. 
+""" + +from __future__ import annotations + +import os +import sys +from collections.abc import Iterable +from typing import TYPE_CHECKING, TypedDict, cast + +from cyclopts import App +from github import Auth, Github +from yaml import safe_load + +from ._log import log, setup_logging + +if TYPE_CHECKING: + from collections.abc import Generator + from typing import NotRequired + + from github.ContentFile import ContentFile + from github.GitRelease import GitRelease as GHRelease + from github.Issue import Issue + from github.NamedUser import NamedUser + from github.Repository import Repository + +# Hidden marker embedded in every issue body so we can find (and update) the issue we +# previously opened, without relying on the title (which maintainers may rename). +MARKER_PREFIX = "" + + +def parse_issue_tag(body: str | None) -> str | None: + """Extract the template tag a bot issue was last written for, if any.""" + if not body or MARKER_PREFIX not in body: + return None + rest = body.split(MARKER_PREFIX, 1)[1] + return rest.split("-->", 1)[0].strip() or None + + +def issue_title(tag: str) -> str: + return f"Template update available: {tag}" + + +ISSUE_BODY_TEMPLATE = """\ +A new release of the [`cookiecutter-scverse`]({template_url}) template — \ +[**{tag}**]({release_url}) — is available, and your repository is built from it. +Updating keeps your CI, build system, and developer tooling in line with current +scverse best practices. + +## What's new in {tag} + +{release_body} + +## How to apply this update + +This update is meant to be carried out by an **AI coding agent** of your choice +(e.g. the Claude GitHub app, or any agent you run against a local checkout). 
The +template ships a skill that does the heavy lifting: it re-renders the template at +your repo's recorded version and at {tag}, works out which files you changed *on +purpose* versus which are merely *stale*, and applies the modernization while +keeping your intentional customizations — no blind merge conflicts. + +1. **Assign your coding agent to this issue** (or open a clean checkout locally). +2. **Point it at the skill** and ask it to update this repository to {tag}: + - Instructions: [`SKILL.md`]({skill_md_url}) + - Helper script: [`sync_helper.py`]({helper_url}) + + A one-line prompt that works for most agents: + > Update this repository to `cookiecutter-scverse` {tag} by following the + > instructions at {skill_md_url} (use `--repo .` and `--tag {tag}`). Open a PR + > that closes this issue. +3. **Review the PR** the agent opens. It will summarize what it modernized, which + of your deviations it preserved (and why), and anything it left for your review. + +### Prefer to do it by hand? + +You can still update manually with cruft: ensure a clean working tree, then run +`cruft update`. See the template-usage docs for details. + +## Remarks + +* **Unsubscribe**: add `skip: true` to [`template-repos.yml`][repos] via PR, or — to + stop syncing from the template entirely — delete `.cruft.json` from your repo root. +* The template works best with [pre-commit.ci][], [readthedocs][] and [codecov][] + enabled; make sure those apps are activated. 
+ +[repos]: https://github.com/scverse/ecosystem-packages/blob/main/template-repos.yml +[pre-commit.ci]: https://pre-commit.ci/ +[readthedocs]: https://readthedocs.org/ +[codecov]: https://about.codecov.io/ + +{marker} +""" + + +def skill_urls(template_slug: str, tag: str) -> tuple[str, str]: + """(human-readable SKILL.md URL, raw sync_helper.py URL) pinned to the release tag.""" + base = ".claude/skills/scverse-template-update" + skill_md = f"https://github.com/{template_slug}/blob/{tag}/{base}/SKILL.md" + helper = f"https://raw.githubusercontent.com/{template_slug}/{tag}/{base}/sync_helper.py" + return skill_md, helper + + +def render_issue_body(*, tag: str, release: GHRelease, template_url: str, template_slug: str) -> str: + skill_md_url, helper_url = skill_urls(template_slug, tag) + return ISSUE_BODY_TEMPLATE.format( + tag=tag, + release_url=release.html_url, + release_body=(release.body or "").strip() or "_See the release notes linked above._", + template_url=template_url, + skill_md_url=skill_md_url, + helper_url=helper_url, + marker=issue_marker(tag), + ) + + +class RepoInfo(TypedDict): + """Info about a repository using the cookiecutter-scverse template.""" + + url: str + skip: NotRequired[bool] + + +def _parse_repos(f: str | bytes) -> list[RepoInfo]: + repos = cast("list[RepoInfo]", safe_load(f)) + log.info(f"Found {len(repos)} known repos") + return repos + + +def get_repo_urls(gh: Github) -> Generator[str]: + """Yield the URLs of all (non-skipped) repos that use the template.""" + repo = gh.get_repo("scverse/ecosystem-packages") + file = cast("ContentFile", repo.get_contents("template-repos.yml")) + for entry in _parse_repos(file.decoded_content): + if not entry.get("skip"): + yield entry["url"] + + +def get_template_release(gh: Github, template_slug: str, tag_name: str) -> GHRelease: + return gh.get_repo(template_slug).get_release(tag_name) + + +def is_bot_template_issue(issue: Issue, bot_user_id: int) -> bool: + """An open issue we previously opened 
(matched by hidden marker + author).""" + return issue.pull_request is None and MARKER_PREFIX in (issue.body or "") and issue.user.id == bot_user_id + + +def notify_repo(repo: Repository, bot: NamedUser, *, tag: str, body: str, dry_run: bool) -> None: + """Create or update the single tracking issue in `repo`. + + Idempotent: if an open bot issue already targets `tag`, do nothing; if one targets + an older tag, edit it in place; otherwise create a new one. Closed issues are left + untouched (the maintainer already dealt with them). + """ + title = issue_title(tag) + + existing = [i for i in repo.get_issues(state="open", creator=bot.login) if is_bot_template_issue(i, bot.id)] + + if up_to_date := [i for i in existing if parse_issue_tag(i.body) == tag]: + log.info(f"{repo.full_name}: issue #{up_to_date[0].number} already targets {tag}, skipping") + return + + if existing: + issue = existing[0] + if len(existing) > 1: + log.warning(f"{repo.full_name}: {len(existing)} open template issues; updating #{issue.number} only") + if dry_run: + log.info(f"{repo.full_name}: would update issue #{issue.number} -> {tag}") + return + issue.edit(title=title, body=body) + log.info(f"{repo.full_name}: updated issue #{issue.number} -> {tag}") + return + + if dry_run: + log.info(f"{repo.full_name}: would create issue '{title}'") + return + new_issue = repo.create_issue(title=title, body=body) + log.info(f"{repo.full_name}: created issue #{new_issue.number}") + + +cli = App() + + +@cli.default +def main( + tag_name: str, + repo_urls: Iterable[str] | None = None, + *, + all_repos: bool = False, + dry_run: bool = False, + template_url: str = "https://github.com/scverse/cookiecutter-scverse", +) -> None: + """Open/refresh a template-update notification issue in downstream repos. + + Parameters + ---------- + tag_name + Release tag of cookiecutter-scverse to notify about. + repo_urls + One or more full repo URLs to notify (e.g. for testing). 
+ all_repos + Notify every repo listed in scverse/ecosystem-packages/template-repos.yml. + dry_run + Log intended actions without creating or editing any issues. + template_url + URL of the template repo (used for the release/skill links). + """ + setup_logging() + template_slug = template_url.removeprefix("https://github.com/").rstrip("/") + + gh = Github(auth=Auth.Token(os.environ["GITHUB_TOKEN"])) + bot = cast("NamedUser", gh.get_user()) # the authenticated bot account + + if all_repos: + repo_urls = list(get_repo_urls(gh)) + if not repo_urls: + msg = "Need to either pass `--all-repos` or one or more repo URLs." + raise ValueError(msg) + + release = get_template_release(gh, template_slug, tag_name) + body = render_issue_body(tag=tag_name, release=release, template_url=template_url, template_slug=template_slug) + + failed = 0 + for repo_url in repo_urls: + try: + repo = gh.get_repo(repo_url.removeprefix("https://github.com/").rstrip("/")) + notify_repo(repo, bot, tag=tag_name, body=body, dry_run=dry_run) + except Exception as e: # one bad repo shouldn't abort the rest + failed += 1 + log.error(f"Failed to notify {repo_url}") + log.exception(e) + + sys.exit(failed > 0) + + +if __name__ == "__main__": + cli() diff --git a/scripts/tests/test_cruft.py b/scripts/tests/test_cruft.py deleted file mode 100644 index e76f84ec..00000000 --- a/scripts/tests/test_cruft.py +++ /dev/null @@ -1,145 +0,0 @@ -from __future__ import annotations - -import os -from pathlib import Path -from typing import TYPE_CHECKING - -import pytest -from git.repo.base import Repo -from github.Repository import Repository - -from scverse_template_scripts.cruft_prs import ( - GitHubConnection, - _apply_update, - _clone_and_prepare_repo, - _commit_update, - _get_cruft_config_from_upstream, - get_repo_urls, - get_template_release, -) - -if TYPE_CHECKING: - from collections.abc import Generator - - from git import Repo - from github.Repository import Repository - - -@pytest.fixture -def bot_con() -> 
GitHubConnection: - """Connect to the scverse-bot github account. Make sure to use only a readonly-token to not destroy anything.""" - token = os.environ["SCVERSE_BOT_READONLY_GITHUB_TOKEN"] - return GitHubConnection("scverse-bot", token, email="108668866+scverse-bot@users.noreply.github.com") - - -@pytest.fixture -def instance_orig(bot_con: GitHubConnection) -> Repository: - return bot_con.gh.get_repo("scverse/cookiecutter-scverse-instance") - - -@pytest.fixture -def instance_fork(bot_con: GitHubConnection, instance_orig: Repository) -> Repository: - del instance_orig # included for the side effect - return bot_con.gh.get_repo("scverse-bot/cookiecutter-scverse-instance") - - -@pytest.fixture -def clone( - tmp_path: Path, bot_con: GitHubConnection, instance_orig: Repository, instance_fork: Repository -) -> Generator[Repo]: - with _clone_and_prepare_repo( - bot_con, - tmp_path / "clone", - "test-template-update-branch", - forked_repo=instance_fork, - original_repo=instance_orig, - ) as repo: - yield repo - - -@pytest.fixture -def current_repo_path() -> Path: - """Get the currently checked out commit hash of this repository""" - repo_path = Path(__file__).resolve() - while True: - git_dir = repo_path / ".git" - if git_dir.exists(): - break - if repo_path == repo_path.parent: - msg = "Could not find .git directory" - raise ValueError(msg) - repo_path = repo_path.parent - - return repo_path - - -@pytest.mark.parametrize("tag_name", ["v0.4.0", "v0.2.17"]) -def test_get_template_release(bot_con: GitHubConnection, tag_name: str) -> None: - """Test if reference to release can be obtained""" - release = get_template_release(bot_con.gh, tag_name) - assert release.tag_name == tag_name - - -def test_get_repo_urls(bot_con: GitHubConnection) -> None: - """Test if list of repos using template can be obtained from scverse/ecosystem-packages""" - repo_urls = get_repo_urls(bot_con.gh) - assert any("scverse/scirpy" in url for url in repo_urls) - - -def 
test_clone_and_prepare_repo(clone: Repo) -> None: - """Test that example repo can be cloned an all branches setup correctly""" - assert (Path(clone.working_dir) / "pyproject.toml").exists() - assert clone.active_branch.name == "test-template-update-branch" - assert clone.remote("upstream").url.endswith("github.com/scverse/cookiecutter-scverse-instance.git") - assert clone.remote().url.endswith("github.com/scverse-bot/cookiecutter-scverse-instance.git") - - -def test_get_cruft_config_from_upstream(clone: Repo) -> None: - config = _get_cruft_config_from_upstream(clone, "main") - assert config["context"]["cookiecutter"]["project_name"] == "cookiecutter-scverse-instance" - - -def test_apply_update(clone: Repo, current_repo_path: Path, tmp_path: Path) -> None: - """Test that a template update can be applied to a cloned repo without crashing""" - log_file = tmp_path / "cruft_log.txt" - _apply_update( - clone, - template_tag_name=None, - cruft_log_file=log_file, - cookiecutter_config={"project_name": "cookiecutter-scverse-instance"}, - template_url=str(current_repo_path), - ) - - -@pytest.mark.parametrize( - ("exclude_files", "expected_untracked"), - [ - ([], []), - (["doesntexist.txt"], []), - (["dir1/A.txt", "dir1/doesntexist.txt"], ["dir1/A.txt"]), - (["dir2/**.txt"], ["dir2/foo/A.txt", "dir2/foo/B.txt", "dir2/bar/C.txt", "dir2/D.txt"]), - (["dir2/*"], ["dir2/foo/A.txt", "dir2/foo/B.txt", "dir2/bar/C.txt", "dir2/D.txt"]), - ], -) -def test_commit_update(clone: Repo, exclude_files: list[str], expected_untracked: list[str]) -> None: - repo_dir = Path(clone.working_dir) - (repo_dir / "dir1").mkdir() - (repo_dir / "dir2").mkdir() - (repo_dir / "dir2/foo").mkdir() - (repo_dir / "dir2/bar").mkdir() - (repo_dir / "dir1/A.txt").touch() - (repo_dir / "dir2/foo/A.txt").touch() - (repo_dir / "dir2/foo/B.txt").touch() - (repo_dir / "dir2/bar/C.txt").touch() - (repo_dir / "dir2/D.txt").touch() - - status = _commit_update(clone, exclude_files=exclude_files, commit_msg="foo", 
commit_author="scverse-bot") - - # some files have changed and commit has been made - assert status is True - - assert sorted(clone.untracked_files) == sorted(expected_untracked) - - -def test_commit_update_no_files(clone: Repo) -> None: - assert _commit_update(clone, commit_msg="foo", commit_author="scverse-bot") is False diff --git a/scripts/tests/test_issues.py b/scripts/tests/test_issues.py new file mode 100644 index 00000000..db48e4fd --- /dev/null +++ b/scripts/tests/test_issues.py @@ -0,0 +1,32 @@ +from __future__ import annotations + +from scverse_template_scripts.template_issues import ( + issue_marker, + issue_title, + parse_issue_tag, + skill_urls, +) + + +def test_marker_roundtrip() -> None: + tag = "v0.6.0" + body = f"some text\n\n{issue_marker(tag)}\n" + assert parse_issue_tag(body) == tag + + +def test_parse_issue_tag_absent() -> None: + assert parse_issue_tag(None) is None + assert parse_issue_tag("a regular issue with no marker") is None + + +def test_title_contains_tag() -> None: + assert "v1.2.3" in issue_title("v1.2.3") + + +def test_skill_urls_pin_to_tag() -> None: + skill_md, helper = skill_urls("scverse/cookiecutter-scverse", "v0.6.0") + assert skill_md == ( + "https://github.com/scverse/cookiecutter-scverse/blob/v0.6.0/.claude/skills/scverse-template-update/SKILL.md" + ) + assert helper.startswith("https://raw.githubusercontent.com/scverse/cookiecutter-scverse/v0.6.0/") + assert helper.endswith("/sync_helper.py") diff --git a/{{cookiecutter.project_name}}/.github/ISSUE_TEMPLATE/bug_report.yml b/{{cookiecutter.project_name}}/.github/ISSUE_TEMPLATE/bug_report.yml index 6104b9e6..aee13650 100644 --- a/{{cookiecutter.project_name}}/.github/ISSUE_TEMPLATE/bug_report.yml +++ b/{{cookiecutter.project_name}}/.github/ISSUE_TEMPLATE/bug_report.yml @@ -1,6 +1,10 @@ name: Bug report description: Report something that is broken or incorrect +{% if cookiecutter.issue_categorization == "issue types" %} type: Bug +{% else %} +labels: bug +{% endif %} body: 
- type: markdown attributes: diff --git a/{{cookiecutter.project_name}}/.github/ISSUE_TEMPLATE/feature_request.yml b/{{cookiecutter.project_name}}/.github/ISSUE_TEMPLATE/feature_request.yml index c0f52a9d..3ee69f5d 100644 --- a/{{cookiecutter.project_name}}/.github/ISSUE_TEMPLATE/feature_request.yml +++ b/{{cookiecutter.project_name}}/.github/ISSUE_TEMPLATE/feature_request.yml @@ -1,6 +1,10 @@ name: Feature request description: Propose a new feature for {{ cookiecutter.project_name }} +{% if cookiecutter.issue_categorization == "issue types" %} type: Enhancement +{% else %} +labels: enhancement +{% endif %} body: - type: textarea id: description diff --git a/{{cookiecutter.project_name}}/docs/template_usage.md b/{{cookiecutter.project_name}}/docs/template_usage.md index c49c8a5e..71fcfddc 100644 --- a/{{cookiecutter.project_name}}/docs/template_usage.md +++ b/{{cookiecutter.project_name}}/docs/template_usage.md @@ -369,18 +369,24 @@ Automated template sync is enabled by default for public repositories on GitHub. Our [scverse-bot][] automatically crawls GitHub for repositories that are based on this template, and adds them to the [list of template repositories][]. Whenever a new release of the template is made, -a pull request is opened in every repository listed there. -This helps keeping the repository up-to-date with the latest coding standards. - -It may happen that a template sync results in a merge conflict. -In that case, you need to resolve the merge conflicts manually, -either using the GitHub UI, or in your favorite editor. +the bot opens (or refreshes) a single tracking **issue** in every repository listed there, +letting you know that a newer template version is available. + +To apply the update, **assign an AI coding agent of your choice to that issue** +(for example the Claude GitHub app, or any agent you run against a local checkout). 
+The issue links to a dedicated agent _skill_ that performs the update: it figures out +which of your files were customized on purpose versus which are merely out of date, +applies the modernization (CI, build system, tooling) while preserving your +customizations, and opens a pull request for you to review. +Because the agent reconciles changes semantically rather than doing a blind merge, +you should rarely need to untangle a mechanical merge conflict by hand. :::{tip} The following hints may be useful to work with the template sync: -- If you want to ignore certain files from the template update, - you can add them to the `[tool.cruft]` section in the `pyproject.toml` file in the root of your repository. +- If you want certain files to be left untouched by updates, + add them to the `[tool.cruft]` section in the `pyproject.toml` file in the root of your repository. + The agent honors this list (and the `_exclude_on_template_update` entries) as user-owned files. - To disable the sync entirely, remove your package from the [list of template repositories][] via pull request, or simply remove the file `.cruft.json` from the root of your repository.