6 changes: 3 additions & 3 deletions PRODUCT.md
@@ -67,7 +67,7 @@ The April 2026 Claude Code source analysis confirmed that Anthropic's internal t
| Anthropic Concept | AgentOps Equivalent | Status |
|---|---|---|
| **Learning Loop** — memory extraction, dream cycle consolidation, future session context | Knowledge Flywheel — `/retro` → `/forge` → `/harvest` → `ao lookup` / `ao context assemble`, tiered promotion (learning → pattern → rule), plus bounded Dream via `/dream` | Live with bounds. On-demand capture/promotion works, and Dream provides an operator-started compounding lane. GitHub nightly is the public proof harness for the contracts, not the user's private runtime. |
| **Skillify** — AI watches patterns, packages them as reusable skills, compound growth | Skills system — 78 skills, `/heal-skill` audit, `/converter` cross-runtime export, SKILL-TIERS classification | Prototype built. `ao flywheel close-loop` now drafts review-only skills from repeated patterns; promotion polish is the remaining gap. |
| **Skillify** — AI watches patterns, packages them as reusable skills, compound growth | Skills system — 79 skills, `/heal-skill` audit, `/converter` cross-runtime export, SKILL-TIERS classification | Prototype built. `ao flywheel close-loop` now drafts review-only skills from repeated patterns; promotion polish is the remaining gap. |
| **Verification Agent** — adversarial AI auditing AI, VERDICT system for human review | Council architecture — `/council`, `/pre-mortem`, `/vibe`, `/post-mortem` with multi-model consensus, prediction tracking. Stage 4 behavioral validation adds holdout scenarios + satisfaction scoring in STEP 1.8. | Live on demand. STEP 1.8 fires automatically inside `/validation` when that skill is invoked. |
| **Managed Agents Dreaming** (May 2026) — scheduled session review, pattern extraction, memory curation between sessions | `/dream` + `.github/workflows/nightly.yml` proof jobs + substrate-driven scheduling when needed | Live with operator setup. The bounded private Dream lane runs harvest → forge → close-loop → defrag when the operator or substrate starts it. AgentOps itself no longer ships the daemon executor. |
| **Managed Agents Outcomes** (May 2026) — rubric-driven separate-context grader with iterate-until-pass | Live at three scopes: project — `GOALS.md` (rubric) + `ao goals measure` (each gate runs as separate subprocess; `cli/internal/goals/measure.go:132-164`) + `/evolve` (can iterate a worst-failing gate under operator limits; `skills/evolve/SKILL.md:379-388`); plan — `/pre-mortem` council judges as separate-context graders; code — `/vibe` council judges. An internal council review (2026-05-06) found these capabilities present across rubric authoring, separate-context grading, iterate-until-pass, and pinpoint-what-changed; this is an internal finding, not an audited external-parity claim. | Live at the capability layer. Empirical workbench A/B (2026-05-06): Δ=+0.0000 across 12 cases at v1 difficulty (both legs 12/12) — task difficulty floor exhausted; v2 substrate (realistic agent tasks where the hook layer differentiates) is roadmap. Counter-stat artifact: `evals/workbench/results/2026-05-06-yjzp9-counterstat.json`. |
@@ -176,7 +176,7 @@ The same model used in the README: bookkeeping records the work, the context com
- `ao lookup` — decay-ranked retrieval for on-demand knowledge
- `ao context assemble` — phase-scoped context packets
- `ao compile` — rebuild the knowledge wiki (mine, grow, defrag, lint)
- 78 skills — reusable context packages across Claude Code, Codex, and OpenCode
- 79 skills — reusable context packages across Claude Code, Codex, and OpenCode
- `bash <(curl -fsSL .../install.sh)` — 30 seconds, zero config

#### Layer 3: Validation Gates
@@ -261,7 +261,7 @@ As of 2026-05-10:

- GitHub repo: 341 stars, 33 forks, 2 open issues, last pushed 2026-05-10T03:24:01Z
- Public surface: GitHub Pages mkdocs site live at boshu2.github.io/agentops/; doctrine site live at 12factoragentops.com
- Distribution/runtime reach: 78 shared skills, 78 checked-in Codex artifacts, and 32 Codex overrides. `/validate` and `/curate` are additive in this train; legacy validation and mining skills remain until their shim/retirement gates are resolved.
- Distribution/runtime reach: 79 shared skills, 79 checked-in Codex artifacts, and 32 Codex overrides. `/validate` and `/curate` are additive in this train; legacy validation and mining skills remain until their shim/retirement gates are resolved.

**Measured operational proof:**

1 change: 1 addition & 0 deletions cli/embedded/skills/using-agentops/SKILL.md
@@ -145,6 +145,7 @@ These are the skills every user needs first. Everything else is available when y
| `/crank` | Autonomous epic loop (uses swarm for each wave) |
| `/swarm` | Fresh-context parallel execution (Ralph pattern) |
| `/evolve` | Goal-driven fitness-scored improvement loop |
| `/burndown` | Bounded epic-completion loop — drive a finite target to all-merged, then stop |
| `/operating-loop-workflow` | Install + run the operating-loop multi-agent Workflow (seven-move loop) |
| `/autodev` | PROGRAM.md autonomous development contract setup and validation |
| `/dream` | Interactive Dream operator surface for setup, bedtime runs, and morning reports |
2 changes: 1 addition & 1 deletion docs/ARCHITECTURE.md
@@ -351,7 +351,7 @@ All hooks can be disabled: `AGENTOPS_HOOKS_DISABLED=1` (kill switch) or per-hook
.
├── .claude-plugin/
│ └── plugin.json # Plugin manifest
├── skills/ # 78 skills (68 user-facing, 10 internal)
├── skills/ # 79 skills (69 user-facing, 10 internal)
│ ├── rpi/ # orchestration — Full RPI lifecycle orchestrator
│ ├── council/ # orchestration — Multi-model validation (core primitive)
│ ├── crank/ # orchestration — Autonomous epic execution
2 changes: 1 addition & 1 deletion docs/SKILLS.md
@@ -1,6 +1,6 @@
# Skills Reference

Complete reference for all 78 AgentOps skills (68 user-facing + 10 internal).
Complete reference for all 79 AgentOps skills (69 user-facing + 10 internal).

Skills are the primitive layer of AgentOps. Higher-level entry points like
`/implement`, `/validation`, `/rpi`, and `/evolve` compose those primitives
8 changes: 8 additions & 0 deletions docs/contracts/context-map.md
@@ -11,6 +11,7 @@ and [CDLC](https://github.com/boshu2/agentops/blob/main/docs/cdlc.md) for the ar

- `brainstorm` — Separate goals from implementation.
- `bug-hunt` — Investigate bugs and root causes.
- `burndown` — Drive a finite epic set to all-merged, then stop.
- `complexity` — Find focused refactor hotspots.
- `council` — Run multi-judge consensus.
- `crank` — Execute epics through waves.
@@ -114,6 +115,7 @@ graph LR
beads -- "supplier-to" --> ratchet
brainstorm -- "shared-kernel" --> standards
bug-hunt -- "shared-kernel" --> standards
burndown -- "shared-kernel" --> standards
complexity -- "shared-kernel" --> standards
council -- "shared-kernel" --> standards
crank -- "shared-kernel" --> standards
@@ -186,6 +188,12 @@ graph LR
| `brainstorm` | produces | verdict.json |
| `bug-hunt` | consumes | beads |
| `bug-hunt` | consumes | standards |
| `burndown` | consumes | beads |
| `burndown` | consumes | implement |
| `burndown` | consumes | post-mortem |
| `burndown` | consumes | rpi |
| `burndown` | produces | .agents/burndown/*.json |
| `burndown` | produces | git-changes |
| `codex-team` | produces | .agents/swarm/results/*.json |
| `compile` | produces | .agents/compiled/lint-report.md |
| `complexity` | consumes | doc |
5 changes: 5 additions & 0 deletions docs/contracts/skill-dispositions.yaml
@@ -67,6 +67,11 @@ dispositions:
hexagonal_role: domain
disposition: update
rationale: "Core judgment gate; strengthen scenario and verdict self-test"
- skill: burndown
domain: "BC3 Loop"
hexagonal_role: domain
disposition: keep
rationale: "Bounded epic-completion loop; terminating counterpart to evolve (finite target → all-merged → STOP)"
- skill: crank
domain: "BC3 Loop"
hexagonal_role: domain
2 changes: 1 addition & 1 deletion docs/reference/agentops-domain-evolution-bdd.md
@@ -19,7 +19,7 @@ Feature: Domain-governed AgentOps 3.0 evolution
And external-corpus-derived observations are used only through the clean-room policy

Scenario: Audit every skill before changing shipped behavior
Given the checked-in skill catalog contains 78 skills
Given the checked-in skill catalog contains 79 skills
When the evolution bootstrap audits the catalog
Then every skill is assigned exactly one primary bounded context
And each skill has a preliminary keep, update, refactor, merge-review, or cut-review disposition
7 changes: 4 additions & 3 deletions docs/reference/agentops-skill-domain-map.md
@@ -1,7 +1,7 @@
# AgentOps Skill Domain Map

This map is the control surface for the next evolution loop. It classifies all
78 checked-in AgentOps skills before any broad rewrite, using current
79 checked-in AgentOps skills before any broad rewrite, using current
`origin/main` product direction, GOALS Directive 12, the DDD/hexagonal ADR, and
the `soc-y5vh` Loop epic.

@@ -18,9 +18,9 @@ around small provable changes.
<!-- BEGIN:audit-summary -->
| Signal | Result |
|---|---:|
| Skills audited | 78 |
| Skills audited | 79 |
| Domains classified | 5 of 5 (BC1-BC5) |
| Dispositions assigned | 78 / 78 |
| Dispositions assigned | 79 / 79 |
<!-- END:audit-summary -->

Observed gap: the catalog has strong operational kernels but weak productized
@@ -65,6 +65,7 @@ Disposition meanings:
| `bootstrap` | BC4 Factory | driving-adapter | update | First-run factory entrypoint; needs current 3.0/domain packet shape. |
| `brainstorm` | BC3 Loop | domain | update | Intent-shaping skill; should emit BDD-ready language. |
| `bug-hunt` | BC2 Validation | domain | update | Validation generator; needs acceptance examples and result contract. |
| `burndown` | BC3 Loop | domain | keep | Bounded epic-completion loop; terminating counterpart to evolve (finite target → all-merged → STOP). |
| `codex-team` | BC5 Runtime | supporting | update | Runtime coordination adapter; align with worktree/wave rules. |
| `compile` | BC1 Corpus | supporting | refactor | Corpus compiler is core; align read/write flows to Corpus ports. |
| `complexity` | BC2 Validation | domain | update | Generator for refactor work; add self-test and threshold evidence. |
42 changes: 37 additions & 5 deletions registry.json
@@ -1,14 +1,14 @@
{
"schema_version": 2,
"generated_at": "2026-05-29T17:02:40Z",
"generated_at": "2026-05-30T00:53:31Z",
"summary": {
"skills": 78,
"skills": 79,
"hooks": 0,
"knowledge_stores": 5,
"job_types": 0,
"eval_files": 56,
"cli_commands": 68,
"capabilities": 160
"capabilities": 161
},
"surfaces": {
"skills": [
@@ -132,6 +132,14 @@
"has_references": true,
"reference_count": 8
},
{
"name": "burndown",
"tier": "execution",
"path": "skills/burndown/",
"has_skill_md": true,
"has_references": false,
"reference_count": 0
},
{
"name": "complexity",
"tier": "execution",
@@ -1373,8 +1381,8 @@
]
},
"capability_summary": {
"total": 160,
"skills": 78,
"total": 161,
"skills": 79,
"cli_commands": 68,
"gates": 11,
"reference_impls": 3
@@ -1580,6 +1588,30 @@
"path": "skills/bug-hunt/",
"references": 8
},
{
"sku": "skill:burndown",
"name": "burndown",
"type": "skill",
"bounded_context": "BC3",
"hex_role": "domain",
"tier": "execution",
"purpose": "Drive a finite epic set to all-merged, then stop.",
"status": "active",
"disposition": "keep",
"consumes": [
"beads",
"rpi",
"implement",
"post-mortem"
],
"produces": [
".agents/burndown/*.json",
"git-changes"
],
"drives_commands": [],
"path": "skills/burndown/",
"references": 0
},
{
"sku": "skill:codex-team",
"name": "codex-team",
6 changes: 6 additions & 0 deletions skills-codex-overrides/catalog.json
@@ -670,6 +670,12 @@
"treatment": "parity_only",
"wave": "catalog-parity",
"reason": "Scaffolds a Claude Workflow JS script from the operating-loop.js template; the authoring guidance is tool-specific reference material with no Codex-specific runtime divergence."
},
{
"name": "burndown",
"treatment": "parity_only",
"wave": "catalog-parity",
"reason": "TODO: confirm parity_only or flip to bespoke (+scaffold override dir) for burndown"
}
]
}
14 changes: 13 additions & 1 deletion skills-codex/.agentops-manifest.json
@@ -2,7 +2,7 @@
"generator": "manual-maintained",
"source_root": "skills",
"layout": "modular",
"codex_override_catalog_hash": "aef940bdfe26c2955145f7338854b193e9ef2b3baf7448e4fadebfbb26bb8fa1",
"codex_override_catalog_hash": "57959cbf3093ab0881e4769fee318b6f1aa3fbef3155dba02a8a2642f281ea36",
"codex_override_catalog": {
"version": 1,
"description": "Machine-readable Codex treatment map for the full skill catalog.",
@@ -648,6 +648,12 @@
"treatment": "parity_only",
"wave": "catalog-parity",
"reason": "Ship operating-loop Workflow to plugin users via installer skill; Codex twin redirects to the $rpi chain since Codex lacks the Workflow tool"
},
{
"name": "burndown",
"treatment": "parity_only",
"wave": "catalog-parity",
"reason": "Bounded epic-completion loop; codex form is canonical-derived (no bespoke shell idioms)."
}
]
},
@@ -688,6 +694,12 @@
"source_hash": "2f574ca482c52bb099550117a73619966ffc09a868088ca9f84c92caebf7d139",
"generated_hash": "f2492483bba6e13b0e0004a2ac431afa98708729d8ae5224e6df08b556bdd932"
},
{
"name": "burndown",
"source_skill": "skills/burndown",
"source_hash": "917f7277558d79c72c428fa6692d86d34d808a120d061341f5a393da063d4068",
"generated_hash": "ef6c65f7053775a2f7cf6c3111adca2ca1d630eece87d43bf620cc89a51fd748"
},
{
"name": "codex-team",
"source_skill": "skills/codex-team",
7 changes: 7 additions & 0 deletions skills-codex/burndown/.agentops-generated.json
@@ -0,0 +1,7 @@
{
"generator": "manual-maintained",
"source_skill": "skills/burndown",
"layout": "modular",
"source_hash": "917f7277558d79c72c428fa6692d86d34d808a120d061341f5a393da063d4068",
"generated_hash": "ef6c65f7053775a2f7cf6c3111adca2ca1d630eece87d43bf620cc89a51fd748"
}
88 changes: 88 additions & 0 deletions skills-codex/burndown/SKILL.md
@@ -0,0 +1,88 @@
---
name: burndown
description: 'Drive a finite epic set to all-merged, then stop.'
---
# $burndown - Bounded Epic-Completion Loop (Codex Native)

> **Quick Ref:** Drive a finite target (epic / set of epics / explicit bead list) to every in-scope bead merged on `main`, one bead per cycle, then STOP. The terminating counterpart to `$evolve` (open-ended). Output: merged PRs + a completion report.

**You must execute this workflow. Do not just describe it.**

## When to use which

| You want… | Command |
|---|---|
| Drive *this specific* epic/set to completion, then stop | **`$burndown`** |
| Always-on, whole-repo improvement (no finish line) | `$evolve` |
| One-shot **parallel** fan-out of an epic into worker waves | `$crank` |
| Define/validate the PROGRAM.md contract | `$autodev` |
| One bead, full lifecycle, once | `$rpi` |

`$burndown` is **serial and resumable** (one PR in flight); `$crank` is
**parallel and one-shot**.

## Invocation

```bash
$burndown <epic-id> # drive the epic to merged, then stop
$burndown <epic-id> --max-cycles=20 # cap bead-cycles
$burndown <epic-a> <epic-b> # finite set of epics
$burndown --beads ag-x.1,ag-x.2 # explicit bead list
$burndown <epic-id> --hold-merges # open PRs, leave green for operator
```

The **target set** = transitive in-scope (non-closed) beads of the arguments,
fixed at start. It is the loop's definition of "done."
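
The target-set definition above can be sketched as a small graph walk. This is only an illustration: the `children` and `status` dictionaries are hypothetical in-memory stand-ins for whatever `bd` reports, the function name is invented here, and counting the epic itself as a member of the set is an assumption rather than something the skill states.

```python
from typing import Dict, FrozenSet, Iterable, List


def resolve_target_set(
    roots: Iterable[str],
    children: Dict[str, List[str]],
    status: Dict[str, str],
) -> FrozenSet[str]:
    """Return every non-closed bead reachable from the given roots.

    `children` maps a bead id to its direct child ids and `status` maps a
    bead id to its state; both are hypothetical shapes for illustration.
    """
    seen = set()
    stack = list(roots)
    while stack:
        bead = stack.pop()
        if bead in seen:
            continue  # already visited; also guards against cycles
        seen.add(bead)
        # Walk through closed beads too, since they may have open children.
        stack.extend(children.get(bead, []))
    # A frozenset mirrors "fixed at start": later cycles cannot grow the set.
    return frozenset(b for b in seen if status.get(b) != "closed")
```

With an epic `ag-x` whose child `ag-x.1` is closed and whose child `ag-x.2` has one open child of its own, the call returns the epic, `ag-x.2`, and that grandchild, and leaves out `ag-x.1`.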

## Per-cycle algorithm (idempotent)

Each firing does ONE of: merge an outstanding PR, finish, or advance one bead.

1. **Reconcile in-flight.** `git worktree list` + `git status`; if mid-edit → SKIP.
If a target PR is OPEN, check CI (`gh pr checks`):
- all required green → update branch, `gh pr merge <N> --squash --admin`,
record cited provenance, close the bead only if its acceptance is fully met,
remove the worktree, STOP this firing.
- pending → SKIP. red → fix-and-repush or revert; never merge red.
- `--hold-merges` → leave green PR for operator; STOP.
Never pick new work while a target PR is outstanding.
2. **Completion check.** Re-resolve the target set. All in-scope beads merged →
write completion report and STOP (DONE). `--max-cycles=N` reached → STOP with
handoff.
3. **Select ONE in-scope ready bead.** `bd ready` filtered to the target set, in
frontier order. NO open-ended ladder. No ready in-scope bead but target
incomplete → surface the blocker and STOP. Claim: `bd update <id> --status in_progress`.
4. **Drive to a PR.** Fresh worktree off `origin/main`. Codex runtime: drive the
bead with an `ao rpi` cycle or a direct TDD slice; for a wave-able bead fan
workers via NTM (`spawn_agent` / `wait_agent` / `close_agent`). First failing
test before the change. Gates: `cd cli && make test && go vet ./...`;
`env -u AGENTOPS_RPI_RUNTIME scripts/pre-push-gate.sh --fast`. Open a PR citing
the bead (trailers `Closes-scenario` / `Bounded-context` / `Evidence`).
Landing happens on a later cycle via step 1 — do not block on CI here.
5. **Log** to `.agents/burndown/cycle-history.jsonl` and loop.
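
The decision order in steps 1 to 3 can be sketched as a pure function. This is a minimal sketch under stated assumptions, not the skill's implementation: `CycleState`, its fields, and the returned action strings are all hypothetical names, and the real loop gathers this state from `git`, `gh`, and `bd` rather than receiving it as a value.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CycleState:
    """Hypothetical snapshot of what one firing observes."""
    mid_edit: bool              # uncommitted work found in a worktree
    open_pr_ci: Optional[str]   # None if no target PR is open, else "green", "pending", or "red"
    hold_merges: bool           # --hold-merges was passed
    unmerged: List[str]         # in-scope beads not yet merged
    ready: List[str]            # in-scope ready beads, frontier order
    cycles_run: int
    max_cycles: Optional[int]   # None means no cap


def decide(state: CycleState) -> str:
    """Pick exactly one action for this firing."""
    # Step 1: reconcile in-flight work before anything else.
    if state.mid_edit:
        return "skip"
    if state.open_pr_ci is not None:
        if state.open_pr_ci == "pending":
            return "skip"
        if state.open_pr_ci == "red":
            return "fix-or-revert"      # never merge red
        if state.hold_merges:
            return "stop:hold-merges"   # leave the green PR for the operator
        return "merge"
    # Step 2: completion check.
    if not state.unmerged:
        return "stop:done"
    if state.max_cycles is not None and state.cycles_run >= state.max_cycles:
        return "stop:max-cycles"
    # Step 3: select one ready bead, or surface the blocker.
    if not state.ready:
        return "stop:blocked"
    return f"advance:{state.ready[0]}"
```

Because an open target PR is handled before bead selection, firing the function again on unchanged state yields the same action, which is the idempotency property the loop relies on.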

## Finite stop conditions

STOPS (never dormant) on first of: target merged · `--max-cycles=N` · operator
STOP marker / `--hold-merges` · genuine blocker (remaining beads blocked on a
dependency or operator decision — surface, don't spin).

## Self-perpetuation

Fire the same `$burndown <target>` on a cadence (NTM pipeline tick, cron, or
ssh-to-bushido loop). Idempotency (step 1) makes repeated firings safe — no
stacked PRs, no double-work. Match cadence to CI latency.

## Hard constraints

No bead, no PR; one vertical slice per PR; first failing test first;
worktree-mandatory; green CI is the only merge gate; dual-runtime triad for new
skills; capture via the promotion ratchet; record deferred follow-ups in bd.

## Backend Rules

- `bd` is the tracker; `bd ready --json` for in-scope selection; `bd worktree
create` per bead.
- `gh` drives PR + merge; `--admin` only overrides the up-to-date (BEHIND)
requirement when all required checks are green.
- Never send holdout `target`/`ground_truth`/PII to any cloud surface.
7 changes: 7 additions & 0 deletions skills-codex/burndown/prompt.md
@@ -0,0 +1,7 @@
# burndown

Drive a finite epic set (epic / set of epics / explicit bead list) to every in-scope bead merged on `main`, one bead per cycle, then STOP. The terminating counterpart to `$evolve`. Triggers: "burndown", "burn down", "finish the epic", "drive to all-merged", "complete the epic", "close out the epic".

## Instructions

Load and follow the skill instructions from the sibling `SKILL.md` file for this skill.
3 changes: 2 additions & 1 deletion skills/SKILL-TIERS.md
@@ -221,7 +221,7 @@ These are how skills chain in practice:

## Current Skill Tiers

### User-Facing Skills (68)
### User-Facing Skills (69)

**Judgment:**

@@ -250,6 +250,7 @@ These are how skills chain in practice:
| **swarm** | execution | Parallelize any skill — fresh context per agent |
| **rpi** | meta | Thin wrapper: /discovery → /crank → /validation with complexity classification and loop |
| **evolve** | execution | Autonomous fitness-scored improvement loop |
| **burndown** | execution | Bounded epic-completion loop — drive a finite target to all-merged, then stop |
| **operating-loop-workflow** | execution | Install + run the operating-loop multi-agent Workflow (seven-move loop) for plugin users |
| **autodev** | execution | PROGRAM.md autonomous development contract setup and validation |
| **bug-hunt** | execution | Investigate bugs with git archaeology |