feat(llm): additive per-call USD cost meter #214
Open
ianu82 wants to merge 1 commit into
Turn the token counts Anton already records per LLM call into a dollar figure a host can surface as "$ this turn / $ this task". Purely additive telemetry: no budget object, no enforcement, no control-flow change (those are explicitly later slices).

- New anton/core/llm/pricing.py: a maintained per-model price table (input/output/cache USD rates per 1M tokens, matched by model-ID prefix like the existing _CONTEXT_WINDOWS table) plus compute_cost(). Unknown models price at 0.0 and None token counts are treated as 0, so an unpriced model or a missing usage field never breaks a turn (mirrors compute_context_pressure's defensive posture).
- Usage gains additive cache_write_tokens / cache_read_tokens / cost_usd fields, all defaulted so every existing construction site stays valid.
- Both providers populate cost_usd (and cache tokens where the SDK reports them) at all 6 Usage construction sites. Anthropic reports cache tokens separately, so they're summed; OpenAI folds cached tokens into prompt_tokens, so they're surfaced for telemetry but not double-priced. cost_usd rides on the existing StreamComplete/usage output.

Adds tests/test_pricing.py (tokens x price table -> expected USD, including cache and zero/None/unknown-model cases).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
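The pricing behaviour described in the commit message could look roughly like the sketch below. It is not the code in anton/core/llm/pricing.py: the model IDs and rates are invented placeholders, the helper `_rates_for` and the parameter names are hypothetical, and picking the longest matching prefix is an assumption about how prefix collisions might be resolved.

```python
from typing import Optional

# Placeholder rates in USD per 1M tokens. These model IDs and numbers are
# invented for illustration; they are not real prices.
_PRICES = {
    "example-large": {"input": 10.0, "output": 30.0, "cache_write": 12.5, "cache_read": 1.0},
    "example-small": {"input": 1.0, "output": 4.0, "cache_write": 1.25, "cache_read": 0.1},
}


def _rates_for(model: Optional[str]) -> Optional[dict]:
    """Return the rates whose key is the longest prefix of `model`, else None."""
    if not model:
        return None
    matches = [prefix for prefix in _PRICES if model.startswith(prefix)]
    # Assumption: when several prefixes match, the most specific one wins.
    return _PRICES[max(matches, key=len)] if matches else None


def compute_cost(
    model: Optional[str],
    input_tokens: Optional[int] = None,
    output_tokens: Optional[int] = None,
    cache_write_tokens: Optional[int] = None,
    cache_read_tokens: Optional[int] = None,
) -> float:
    """Price one call in USD; unknown models and missing counts cost 0.0."""
    rates = _rates_for(model)
    if rates is None:
        return 0.0  # unpriced model: report zero rather than raise
    total = (
        (input_tokens or 0) * rates["input"]  # `or 0` maps None to 0
        + (output_tokens or 0) * rates["output"]
        + (cache_write_tokens or 0) * rates["cache_write"]
        + (cache_read_tokens or 0) * rates["cache_read"]
    )
    return total / 1_000_000  # rates are quoted per 1M tokens
```

With these placeholder rates, a call such as `compute_cost("example-large-2099", 1_000_000, 1_000_000)` prices one million input and one million output tokens, while an unrecognised model ID or a `None` count contributes nothing instead of raising.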
Adds a maintained per-model price table (anton/core/llm/pricing.py, USD/1M tokens, prefix-matched; cache-aware) and compute_cost(...). Usage gains additive cache_write_tokens / cache_read_tokens / cost_usd (all defaulted, so every existing construction site stays valid); the Anthropic + OpenAI adapters populate them.

Purely additive: no agent-loop / control-flow change, no enforcement. Unknown models and zero/None tokens → $0.00 (never breaks a turn).
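To illustrate why defaulted fields keep existing construction sites valid, and how the two cache-accounting styles differ, here is a minimal sketch. The `Usage` shape below is assumed (only the three new field names come from this PR), the two helper functions are hypothetical, and the flat rates are placeholders. Pricing the whole folded prompt at the input rate is just one reading of "not double-priced", not necessarily what the adapters do.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    # Assumed pre-existing fields (names are illustrative).
    input_tokens: int
    output_tokens: int
    # Additive fields from this PR; defaults keep older call sites working.
    cache_write_tokens: int = 0
    cache_read_tokens: int = 0
    cost_usd: float = 0.0


# Placeholder USD rates per 1M tokens, invented for this example.
RATE_IN, RATE_OUT, RATE_CACHE_WRITE, RATE_CACHE_READ = 2.0, 8.0, 2.5, 0.2


def usage_with_separate_cache(input_tokens, output_tokens, cache_write, cache_read):
    """Cache tokens arrive as their own buckets: price each bucket and sum."""
    cost = (
        input_tokens * RATE_IN
        + output_tokens * RATE_OUT
        + cache_write * RATE_CACHE_WRITE
        + cache_read * RATE_CACHE_READ
    ) / 1_000_000
    return Usage(input_tokens, output_tokens, cache_write, cache_read, cost)


def usage_with_folded_cache(prompt_tokens, completion_tokens, cached_tokens):
    """Cached tokens are already inside prompt_tokens: record them for
    telemetry, but do not add a second charge for them."""
    cost = (prompt_tokens * RATE_IN + completion_tokens * RATE_OUT) / 1_000_000
    return Usage(
        prompt_tokens,
        completion_tokens,
        cache_read_tokens=cached_tokens,
        cost_usd=cost,
    )


# An older call site that knows nothing about the new fields still works.
legacy = Usage(input_tokens=10, output_tokens=5)
```

The design point is that the new fields only ever add information: code that never reads `cost_usd` sees the same object it always did.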
Intended use is operator-side / internal only (e.g. a future budget/ledger). Per the consuming product's policy, computed dollar cost is not surfaced to end users: the cowork-server harness forwards tokens consumed only and does not expose cost_usd. This PR just makes the capability available in the library.

Testing: +19 unit tests (tokens × table incl. cache/zero/None/unknown); full suite 1092 passed, 17 skipped.
Budget caps + plan object are deliberately later (control-flow) slices.
🤖 Generated with Claude Code