feat(cost): support per-token cost overrides in cost breakdown #5694

Open
lollinng wants to merge 1 commit into Helicone:main from lollinng:feat/cost-override-headers

@lollinng


Ticket

Closes #5172

Component(s)

  • Packages (@helicone-package/cost)

Type of change

  • New feature

What this does

#5172 asks for per-request cost overrides via headers (Helicone-Input-Token-Cost / Helicone-Output-Token-Cost), because OpenRouter costs vary by the underlying provider and the registry's single rate is wrong for those requests.

This adds the cost-engine foundation that feature needs: an optional costOverride on calculateModelCostBreakdown and modelCostBreakdownFromRegistry:

costOverride?: { inputCostPerToken?: number; outputCostPerToken?: number }
  • applied per field, so a missing field falls back to the registry rate (override just input, just output, or both)
  • uses ?? so an explicit 0 is honored as a free rate rather than treated as unset
  • input/output only; cache and modality costs keep using registry rates

Unit is per-token USD, matching ModelPricing in the registry, so the value multiplies token counts directly with no conversion.
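
A minimal sketch of the per-field resolution described above, not the code in this PR. The `RegistryRates` and `Usage` shapes and the `resolveRates` / `inputOutputCost` helpers are hypothetical names for illustration; only the `costOverride` field names come from the description.

```typescript
// Hypothetical shapes for illustration; the real registry types may differ.
interface CostOverride {
  inputCostPerToken?: number;
  outputCostPerToken?: number;
}

interface RegistryRates {
  inputCostPerToken: number;
  outputCostPerToken: number;
}

interface Usage {
  inputTokens: number;
  outputTokens: number;
}

// Resolve each rate independently. `??` only falls back on null/undefined,
// so an explicit 0 stays 0; `||` would wrongly replace it with the registry rate.
function resolveRates(registry: RegistryRates, override?: CostOverride): RegistryRates {
  return {
    inputCostPerToken: override?.inputCostPerToken ?? registry.inputCostPerToken,
    outputCostPerToken: override?.outputCostPerToken ?? registry.outputCostPerToken,
  };
}

// Rates are per-token USD, so cost is a direct multiplication with no unit conversion.
function inputOutputCost(usage: Usage, rates: RegistryRates): number {
  return (
    usage.inputTokens * rates.inputCostPerToken +
    usage.outputTokens * rates.outputCostPerToken
  );
}

// Made-up example rates, not real registry values.
const registry: RegistryRates = { inputCostPerToken: 0.000002, outputCostPerToken: 0.000008 };
const usage: Usage = { inputTokens: 1000, outputTokens: 500 };

// Override input only: input becomes free, output falls back to the registry rate.
const zeroInput = resolveRates(registry, { inputCostPerToken: 0 });
console.log(zeroInput, inputOutputCost(usage, zeroInput));
```

Cache and modality costs are left out of the sketch because the description says they keep using registry rates.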

How it was tested

jest __tests__/cost/modelCostFromRegistry.test.ts passes (22 tests), including 4 new cases: both-field override, per-field partial override with registry fallback, the zero-override-is-free case (guards the ?? vs || choice), and the no-override registry path.

Proposed wiring (follow-up; I'd like your input first)

To make the headers take effect end to end, the override needs threading from the request into both cost paths:

  1. parse and validate the two headers in worker/src/lib/models/HeliconeHeaders.ts, ignoring any value that is negative or non-numeric
  2. pass it into the proxy cost call in ProxyForwarder.ts, and stash it into heliconeMeta so the authoritative async cost calc in jawn (ResponseBodyHandler.ts) honors it too
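
Step 1 could look roughly like the sketch below. This is an assumption about the follow-up, not code from HeliconeHeaders.ts: `HeaderReader`, `parseCostHeader`, and `parseCostOverride` are hypothetical names, and only the two header names come from #5172. The standard Fetch `Headers` object satisfies `HeaderReader`.

```typescript
interface CostOverride {
  inputCostPerToken?: number;
  outputCostPerToken?: number;
}

// Minimal read-only view of request headers (hypothetical; Fetch `Headers` fits it).
interface HeaderReader {
  get(name: string): string | null;
}

// Parse one header value into a finite, non-negative number.
// Returns undefined for anything else so the caller ignores the header.
function parseCostHeader(raw: string | null): number | undefined {
  if (raw === null || raw.trim() === "") return undefined; // Number("") would be 0
  const value = Number(raw);
  if (!Number.isFinite(value) || value < 0) return undefined;
  return value;
}

// Build the override from both headers; undefined when neither is usable.
function parseCostOverride(headers: HeaderReader): CostOverride | undefined {
  const input = parseCostHeader(headers.get("Helicone-Input-Token-Cost"));
  const output = parseCostHeader(headers.get("Helicone-Output-Token-Cost"));
  if (input === undefined && output === undefined) return undefined;

  const override: CostOverride = {};
  if (input !== undefined) override.inputCostPerToken = input;
  if (output !== undefined) override.outputCostPerToken = output;
  return override;
}
```

Each header is validated on its own, so one bad value does not discard the other; that matches the per-field fallback in the cost engine. `Number()` also accepts forms such as `1e-6`, so a stricter pattern check may be wanted if only plain decimals should pass.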

I scoped this PR to the cost engine because two points are unspecified in the issue and worth your call before I wire the proxy and jawn paths:

  • units: per-token USD (this PR) vs per-million-tokens (OpenRouter's unit). I can switch or accept both
  • unknown models: calculateModelCostBreakdown returns null when the registry has no config for a model, so an override alone won't price a model that isn't in the registry. If OpenRouter pass-through models with no registry entry are in scope, that needs an explicit branch
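
To make the two open questions concrete, here is one possible shape for each, offered as a sketch under assumptions rather than a proposal for the final behavior. `perMillionToPerToken` and `resolveRatesOrNull` are hypothetical helpers, and pricing an unknown model only when both override fields are present is just one candidate policy.

```typescript
interface CostOverride {
  inputCostPerToken?: number;
  outputCostPerToken?: number;
}

interface RegistryRates {
  inputCostPerToken: number;
  outputCostPerToken: number;
}

const TOKENS_PER_MILLION = 1000000;

// Units: if headers carried per-million-token USD, normalize once at the
// boundary so the cost engine keeps working in per-token USD.
function perMillionToPerToken(ratePerMillion: number): number {
  return ratePerMillion / TOKENS_PER_MILLION;
}

// Unknown models: `registry` is null when there is no config for the model.
// Assumed policy: price from the override alone only if both rates are given,
// otherwise keep returning null as today.
function resolveRatesOrNull(
  registry: RegistryRates | null,
  override?: CostOverride
): RegistryRates | null {
  const input = override?.inputCostPerToken;
  const output = override?.outputCostPerToken;

  if (registry !== null) {
    return {
      inputCostPerToken: input ?? registry.inputCostPerToken,
      outputCostPerToken: output ?? registry.outputCostPerToken,
    };
  }
  if (input !== undefined && output !== undefined) {
    return { inputCostPerToken: input, outputCostPerToken: output };
  }
  return null; // no registry entry and an incomplete override: cannot price
}
```

With this policy a partial override on an unregistered model still yields no cost, which avoids silently pricing one side of the request at zero.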

Happy to follow up with the worker + jawn wiring and the header doc once you confirm the units and the unknown-model behavior.

Add an optional costOverride { inputCostPerToken, outputCostPerToken } to
calculateModelCostBreakdown and modelCostBreakdownFromRegistry so callers can
override the registry's per-token input/output rates for a single request.
Overrides apply per field (a missing field falls back to the registry rate)
and use ?? so an explicit 0 is honored as free rather than treated as unset.
This is the cost-engine foundation for header-driven cost overrides (Helicone#5172).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@vercel

vercel Bot commented Jun 12, 2026


@lollinng is attempting to deploy a commit to the Helicone Team on Vercel.

A member of the Team first needs to authorize it.



Successfully merging this pull request may close these issues.

[Feature]: Add support for overriding cost calculations
