Skip to content

🤖 feat(goals): long-horizon auto-continuation with budgets#3235

Open
ThomasK33 wants to merge 1 commit intomainfrom
goals-8h36
Open

🤖 feat(goals): long-horizon auto-continuation with budgets#3235
ThomasK33 wants to merge 1 commit intomainfrom
goals-8h36

Conversation

@ThomasK33
Copy link
Copy Markdown
Member

Summary

Adds a workspace-scoped, persistent Goal primitive that the agent auto-continues against an explicit dollar budget after each turn ends. Inspired by OpenAI Codex's /goal but adapted to Mux's primitives: synthetic visible user messages instead of a developer role, dollar budgets instead of tokens (we already track normalized cost via sessionUsageService), a shared IdleDispatcher consolidating heartbeats and goal continuations, and conditionally-injected model tools.

Gated end-to-end by EXPERIMENT_IDS.GOALS (off by default, user-overridable). Off the experiment, the only externally observable change is the new ESM module graph; runtime behavior is identical to main.

Background

Long-horizon agent runs on Mux today require the user to nudge the agent ("continue", "what next") at every stream-end. Codex solved this with /goal, but its model is token-budget + developer-role. Mux supports many providers with wildly different token economics, has no developer role, and already has an idle scheduler shared by heartbeats — so the design adapts the spec rather than copying it. The accompanying RFC (separate goals-planning branch — not part of this PR to keep history clean) covers the full architecture, state diagram, accounting boundaries, and edge cases.

Implementation

Backend

  • WorkspaceGoalService (new) owns goal lifecycle. Persists each workspace's goal at ~/.mux/sessions/<id>/goal.json via writeFileAtomic + per-workspace file locks. Optimistic-concurrency on setGoal via an optional expectedGoalId returns a typed Result<GoalRecordV1, GoalSetError> so the frontend can retry without throwing.
  • IdleDispatcher (new, also extracted from HeartbeatService) serializes per-workspace dispatches with priority-ordered consumers — goals = 100, heartbeats = 50 — so an active goal naturally suppresses heartbeats. Global concurrency cap of 1.
  • State machine: ActivePaused (manual user message or /goal pause), ActiveBudgetLimited (cost ≥ budget), BudgetLimited → exactly one wrap-up continuation → done, *Complete (model complete_goal or /goal complete). Setting an identical objective is idempotent (preserves goalId + accounting).
  • Auto-continuation: at every agentSession stream-end, if the goal is Active and the workspace is idle, an IdleDispatcher candidate fires a synthetic user message wrapping the objective in <untrusted_objective>. 60s cooldown, gated by requireUserAcknowledgmentSinceMs if a partial assistant message was recovered on startup.
  • Sub-agent attribution: child task costs are attributed up to the parent's active goal on task completion. Children do not inherit goals (avoids recursive budget exhaustion).
  • Fork inheritance: a snapshot of objective/budget/turn-cap is copied to the child workspace, but accounting resets and the goal defaults to Paused.
  • Tools: get_goal() and complete_goal(summary) are conditionally registered when GOALS is on, the workspace has an active/budget-limited goal, and the agent has editing capabilities. create_goal was deliberately omitted — goals must be user-initiated.

Frontend

  • Left-sidebar Target-icon pill on workspaces with a goal (live status, e.g. Target $0.04 / $1.00).
  • Conditional right-sidebar Goal tab with objective, completion summary, cost/budget/remaining/turns/elapsed, recent continuations, and Pause/Resume/Mark complete/Clear.
  • Settings → Experiments has a Goals toggle; settings now also has a dedicated Goals panel.
  • Custom transcript cards for get_goal and complete_goal (replacing the generic ▶ tool_name row), styled to match the GoalTab.
  • Slash commands: /goal "<obj>", /goal pause, /goal resume, /goal complete, /goal clear, /goal budget. All have command-palette equivalents.
  • Live UI: WorkspaceGoalService.pushSnapshot invokes a callback wired in coreServices to workspaceService.emit("activity"), so the pill + tab refresh without a reload.

Telemetry

AnalyticsService.recordGoalLifecycleEvent emits a closed set of events: goal_created/replaced/cleared/paused/resumed/completed/budget_limited/budget_changed/continuation_fired/wrapup_fired/crash_gate_set. All carry buckets, never raw text.

Validation

  • make static-check green (typecheck + ESLint + Prettier + docs link checking)
  • 358 goal-specific tests across WorkspaceGoalService, IdleDispatcher, agentSession.goalAutoPause, goal.test (tools), GoalTab, goalToolUtils, slash-command parser, command palette, and the dispatcher registry
  • Live dogfood pass via dev-server-sandbox: kicked off two goals (one auto-completing, one budget-limited), confirmed the synthetic continuation fires within ~180ms of setGoal, the cost accumulator + budget gating advance correctly, and Pause/Resume/Complete/Clear all round-trip through the UI

Risks

Medium. The dispatcher serialization, accounting accumulator, and stream-end hook all touch hot paths shared with heartbeats and sub-agents. Mitigations:

  • EXPERIMENT_IDS.GOALS gates every runtime branch — off by default
  • IdleDispatcher was extracted as a pure refactor in the first wave (no behavior change to HeartbeatService) and covered by 28 dedicated tests before goal logic was layered on
  • All disk writes use workspaceFileLocks + writeFileAtomic; corrupted goal.json is treated as no goal rather than crashing the workspace
  • An integration test pins the regression where the kickoff continuation never fires after init completes (getGoalContinuationRuntimeState reporting isInitializing indefinitely)

Pains

The biggest sink was discovering, mid-dogfood, that the auto-continuation loop never fired because WorkspaceService.getGoalContinuationRuntimeState returned isInitializing: initState != null instead of initState?.status === "running". InitStateManager keeps the persisted status forever after init succeeds, so the eligibility check kept deferring with reason: "initializing". Existing tests stubbed the bridge, so the bug went undetected. Now covered by an integration test wiring the real WorkspaceService through the real dispatcher.

A separate snag: coreServices.ts (used by mux server / dev-server-sandbox) was constructing the goal service but never calling registerGoalContinuationConsumer. Now wired in both coreServices and serviceContainer.

@mintlify
Copy link
Copy Markdown

mintlify Bot commented May 5, 2026

Preview deployment for your docs. Learn more about Mintlify Previews.

Project Status Preview Updated (UTC)
Mux 🟢 Ready View Preview May 5, 2026, 11:57 AM

💡 Tip: Enable Workflows to automatically generate PRs for you.

@ThomasK33
Copy link
Copy Markdown
Member Author

/coder-agents-review

@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: aa89082972

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/node/services/workspaceGoalService.ts Outdated
Comment thread src/node/services/coreServices.ts
Copy link
Copy Markdown

@coder-agents-review coder-agents-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First-pass review (Netero only). This is a mechanical scan; the full review panel has not yet reviewed this PR.

Netero raised 6 findings. 5 were dropped during verification: 3 cited an em-dash convention that does not exist in mux, and 2 cited a Storybook-only testing convention that mux does not follow (88 standalone .test.tsx files exist in src/browser). 1 P2 survives.

The surviving finding is a permanently empty UI section that ships to users with no path to displaying real data. The panel will review after this is addressed.

🤖 This review was automatically generated with Coder Agents.

Comment thread src/browser/features/RightSidebar/GoalTab.tsx Outdated
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

P1 issues addressed in e488c4ebd:

  • Budget wrap-up: send first, only mark budgetLimitInjectedForGoalId on accept; transient send rejection now retries on the next dispatch (regression test added).
  • coreServices: moved IdleDispatcher construction and goal continuation bridge registration into the core factory so mux run and any other createCoreServices consumer gets a working dispatcher; ServiceContainer now shares core.idleDispatcher with HeartbeatService.

Please take another look.

@ThomasK33
Copy link
Copy Markdown
Member Author

/coder-agents-review

Copy link
Copy Markdown

@coder-agents-review coder-agents-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Round 2 is blocked. The open finding from round 1 has no author response and the code is unchanged.

DEREM-6 (P2, GoalTab.tsx:374-380): "Recent continuations" section is a permanent placeholder with no data source. No continuations prop, no schema field, no data pipeline. The author's e488c4ebd commit addressed the two Codex P1 findings but did not address or acknowledge DEREM-6.

Further review is blocked until the author responds to DEREM-6 (fix, acknowledge, or contest) or pushes a commit addressing it. The full review panel will run once this is resolved.

🤖 This review was automatically generated with Coder Agents.

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Keep it up!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@ThomasK33
Copy link
Copy Markdown
Member Author

@coder-agents-review

DEREM-6 addressed in the latest force-push: deleted the empty "Recent continuations" section from GoalTab.tsx along with its tautological test assertion. It was a slop placeholder — there's no schema field, no data pipeline, and no plan to wire one up. Better to ship without dead UI than to ship a permanently empty card.

The full review panel can run now.

@ThomasK33
Copy link
Copy Markdown
Member Author

/coder-agents-review

@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a26bcdd0a4

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/node/services/workspaceGoalService.ts Outdated
Comment thread src/node/services/workspaceGoalService.ts
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Two new P1s addressed in c9848926d:

  • Strip attachments from auto-continuation send options — switched to the canonical pickStartupRetrySendOptions whitelist.
  • Clear lastUserStopAtMsByWorkspace on explicit user resume — paused → active by the user now re-enables the auto-loop.

Regression tests added for both.

@ThomasK33
Copy link
Copy Markdown
Member Author

/coder-agents-review

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. 👍

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Copy link
Copy Markdown

@coder-agents-review coder-agents-review Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Second Netero pass (round 3). The full review panel has not yet reviewed this PR.

DEREM-6 (placeholder UI) verified fixed. 4 new findings: 1 P2, 3 P3. The P2 is a test reliability issue (raw setTimeout for coordination, violating mux AGENTS.md). The P3s are dead code, missing test coverage, and a silently divergent helper. The panel will review after the P2 is addressed.

🤖 This review was automatically generated with Coder Agents.

Comment thread src/node/services/workspaceGoalService.test.ts Outdated
Comment thread src/node/services/workspaceService.ts
Comment thread src/browser/utils/slashCommands/types.ts Outdated
Comment thread src/node/services/workspaceGoalService.ts Outdated
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Two more rounds addressed in 8957831e9. P2 setTimeout-coordination + 3 P3s (cascade test coverage, dead goal-noop, divergent formatCents) all fixed. Please take another look.

@ThomasK33
Copy link
Copy Markdown
Member Author

/coder-agents-review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 99168af5e5

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/node/orpc/router.ts
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed setGoal router validation: getGoal, setGoal, and clearGoal now all verify the workspace exists before calling WorkspaceGoalService, and unknown IDs return safe null/false/invalid_transition results without touching goal files. Updated router coverage to include setGoal.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 762aa9dbe2

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/node/services/workspaceGoalService.ts
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed queued mid-stream goal conflicts: deferred goal mutations now persist the original expectedGoalId and replay it through setGoalImmediately, so a concurrent replacement/clear before stream-end produces a conflict/no-op instead of overwriting the newer goal. Added regression coverage.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 172a4f101d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/browser/utils/goals/setGoalWithConflictRetry.ts Outdated
Comment thread src/browser/utils/goals/setGoalWithConflictRetry.ts
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed both optimistic-concurrency retry comments:

  • expectedGoalId: null now explicitly means “I expected no goal,” while undefined remains “no concurrency check.” The browser retry helper passes null for empty goal state, and the backend returns a typed conflict if a goal appears before drain.
  • The browser retry helper retries only goal_conflict failures and returns invalid-transition/other errors immediately.

Added focused regression coverage and reran static-check.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a419a730d7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/node/services/workspaceGoalService.ts
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed limit-hit unpriced kickoff validation: backend setGoal now rejects budgeted goals that would become either active or budget_limited when the kickoff model has no billable pricing, preventing zero-budget/default-budget limit-hit goals from being created and then stranding on a rejected synthetic wrap-up. Added regression coverage for zero-budget + unpriced kickoff model.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5e1778dd3e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/node/services/workspaceGoalService.ts
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed queued mid-stream unpriced budgeted goals: the streaming/projection branch now runs the same kickoff-model pricing check before storing a pending mutation, returning the typed unpriced-model error immediately instead of projecting success and dropping at stream end. Added regression coverage.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8effa41b5f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/browser/utils/chatCommands.ts
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed non-conflict goal error surfacing: slash goal mutations now show the backend invalid_transition/child_workspace message and reserve the generic “Goal changed in another window” copy only for goal_conflict. Added lifecycle command coverage.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7f9693c862

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/browser/features/RightSidebar/RightSidebar.tsx
Comment thread src/browser/utils/commands/sources.ts Outdated
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed typed goal error surfacing for the Goal tab and command palette: both now preserve the generic conflict message only for goal_conflict and surface backend invalid_transition / child_workspace messages directly. Added/updated focused UI action coverage and static-check is green.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 22701eb1c8

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/browser/utils/commands/sources.ts
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed palette goal failures: command-palette goal mutations no longer throw on typed backend errors. They now surface the backend message through showCommandFeedbackToast, preserving generic copy only for real goal_conflict. Added regression coverage for invalid-transition prompt submission.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3fc92c117a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/node/services/workspaceGoalService.ts
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed skipped continuation re-requesting: transient executeGoalContinuation/wrap-up send rejections now schedule a retry, and currently-streaming/busy eligibility skips defer themselves, so a requireIdle race cannot strand the pending candidate.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a4ed8528f3

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/node/services/agentSession.ts
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed one-shot unpriced continuation capture: WorkspaceGoalService now validates continuation send options for budgeted goals and falls back to priced kickoff/current options when the just-finished stream options are unpriced. Added regression coverage for budgeted goals created during an unpriced one-shot stream.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5e7519d4cf

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/browser/utils/chatCommands.ts Outdated
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed slash goal pricing for one-shot streams: /goal budget checks now validate the selected continuation/send model (sendMessageOptions.model) rather than the active transcript/one-shot model, matching the backend continuation fallback behavior. ChatCommands pricing tests pass and static-check is green.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 27ec856c84

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/browser/features/RightSidebar/RightSidebar.tsx Outdated
Comment thread src/browser/utils/commands/sources.ts Outdated
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed selected-model pricing in palette and GoalTab: both budget guards now validate the selected continuation/send model rather than stale transcript/one-shot model state, aligning with backend continuation fallback behavior. Focused tests and static-check pass locally.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1207e2c810

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/node/services/workspaceService.ts
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed startup recovery ordering: pending compaction follow-ups are now dispatched before goal continuations are re-armed, preventing goal continuations from racing ahead of saved compaction follow-up recovery. Startup/compaction/goal tests and static-check pass locally.

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. 👍

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Adds a workspace-scoped, persistent **Goal** primitive that the agent
auto-continues against an explicit dollar budget after each turn ends.
Inspired by OpenAI Codex's `/goal` but adapted to Mux's primitives.

- `/goal "<objective>" [--budget $X[.YY]] [--turn-cap N]` slash command
  plus parity in the command palette (set / pause / resume / complete /
  clear / edit budget). Workspaces hold at most one active goal.
- New `IdleDispatcher` that serializes per-workspace dispatches and is
  shared with `HeartbeatService`; goals get priority 100 (suppresses
  background heartbeats while a goal is active).
- Cost / budget / turn / elapsed accounting persisted to
  `~/.mux/sessions/<workspaceId>/goal.json` with atomic writes and an
  optimistic-concurrency `expectedGoalId` on `setGoal`.
- Auto-continuation: at every stream-end, if the goal is `Active` and
  the workspace is idle, a synthetic `<untrusted_objective>`-wrapped
  user message is injected to drive the next turn (60s cooldown,
  serialized globally).
- BudgetLimited wrap-up: once `costCents >= budgetCents`, exactly one
  graceful "stop and summarize" continuation fires before further
  dispatches are suppressed.
- Auto-pause hooks: any non-synthetic user message pauses an active
  goal before streaming; partial-message recovery on startup arms a
  user-acknowledgment gate that blocks continuations until the user
  posts again.
- Sub-agent task costs are attributed up to the parent's active goal
  on task completion (children do not inherit the goal).
- Forked workspaces inherit a snapshot of the parent goal but reset
  accounting and default to `Paused` to avoid runaway loops.
- Two model-facing tools: `get_goal` (read-only) and
  `complete_goal(summary)` — conditionally injected when the GOALS
  experiment is on, the workspace has an active/budget-limited goal,
  and the agent is exec-like with editing capabilities.
- UI: left-sidebar `Target` pill on every workspace with a goal,
  conditional right-sidebar Goal tab (status + accounting +
  Pause/Resume/Mark complete/Clear), a settings section under
  Experiments, and custom transcript cards for `get_goal` /
  `complete_goal` that mirror the GoalTab styling.
- Live snapshot fan-out via `WorkspaceService.emit("activity")` so
  pill + tab refresh without a reload.
- Telemetry: lifecycle events (`goal_created/replaced/cleared/paused/
  resumed/completed/budget_limited/budget_changed/continuation_fired/
  wrapup_fired/crash_gate_set`) emitted through `AnalyticsService`.

Gated by `EXPERIMENT_IDS.GOALS` (off by default, user-overridable).

- Dollars (cents internally), not tokens, because Mux supports many
  models with different token economics. `turnCap` is a secondary
  guardrail. Wall-clock time is display-only.
- Compaction streams are excluded from goal accounting; failed /
  aborted streams are counted (provider cost is still paid).
- Synthetic continuation prompts XML-escape the objective inside an
  `<untrusted_objective>` block and explicitly tell the model to
  treat the objective as data, not instructions.

---

_Generated with `mux` • Model: `anthropic:claude-opus-4-7` • Thinking: `max` • Cost: `$768.23`_

<!-- mux-attribution: model=anthropic:claude-opus-4-7 thinking=max costs=768.23 -->
@ThomasK33
Copy link
Copy Markdown
Member Author

@codex review

Addressed selected-model budget validation across Goal UI entry points. Palette and GoalTab now validate the selected continuation/send model (with provider config) rather than stale transcript/one-shot model state, matching backend continuation fallback behavior. Focused palette/GoalTab tests and static-check pass locally.

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Breezy!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant