fix(core): honor Retry-After header on retried model calls #1283

Open
truffle-dev wants to merge 1 commit into VoltAgent:main from truffle-dev:fix/retry-after-header-in-429-path-1276

Contributor

@truffle-dev truffle-dev commented May 14, 2026

PR Checklist

  • The commit message follows the conventional-commit convention

Bugs / Features

What is the current behavior?

The retry loop in executeWithModelFallback is the single retry-delay site for streamText / generateText / streamObject / generateObject, because their AI-SDK-internal retries are disabled with maxRetries: 0. It always uses local exponential backoff capped at 10 seconds:

const retryDelayMs = Math.min(1000 * 2 ** attemptIndex, 10000);

APICallError carries the provider's response headers, but the retry loop ignores them. When a provider responds 429 with Retry-After: 30, the agent retries after 1–10 seconds and is rate-limited again, and N concurrent agents sharing a provider key end up retrying at roughly the same instant.
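
To make the mismatch concrete, the sketch below wraps the formula quoted above in a function and lists the delays it produces for the first five attempts. The name `legacyRetryDelayMs` is hypothetical and chosen only for this illustration; the formula itself is the one from the description.

```typescript
// The delay formula quoted above, wrapped in a function for illustration.
// `legacyRetryDelayMs` is a hypothetical name, not an identifier from the PR.
function legacyRetryDelayMs(attemptIndex: number): number {
  return Math.min(1000 * 2 ** attemptIndex, 10000);
}

// Delays for attemptIndex 0..4.
const legacyDelays = [0, 1, 2, 3, 4].map(legacyRetryDelayMs);
console.log(legacyDelays); // [1000, 2000, 4000, 8000, 10000]

// A provider answering 429 with `Retry-After: 30` is asking for 30000 ms.
const serverHintMs = 30 * 1000;
const allShorterThanHint = legacyDelays.every((delay) => delay < serverHintMs);
console.log(allShorterThanHint); // true: every individual delay undercuts the hint
```

Even the capped delay of 10000 ms is a third of what the server asked for in this example, which is why each retry is likely to be rejected again.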

What is the new behavior?

Move the retry-delay math into a small retry-after module:

  • parseRetryAfter(value, nowMs?) understands both forms in RFC 7231 §7.1.3 (delta-seconds and HTTP-date).
  • getRetryAfterMs(error, nowMs?) pulls the header off error.responseHeaders under either key spelling (lowercase retry-after or canonical Retry-After).
  • computeRetryDelayMs(error, attemptIndex, nowMs?) returns max(serverHint, exponentialFloor) when a header is present, keeping the exponential floor as a backpressure baseline so Retry-After: 0 still spaces things out. The result is capped at 5 minutes so a misconfigured or hostile server can't pin the agent.
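
The body of parseRetryAfter is not shown on this page, so the following is only a rough sketch of how a parser with the described behavior could look. The edge-case handling is an assumption, in particular that a date in the past yields 0 and that the 5-minute clamp is applied inside the parser; only the function name, its parameters, and MAX_RETRY_AFTER_MS come from the PR text.

```typescript
// Assumed value: the PR describes a 5-minute cap named MAX_RETRY_AFTER_MS.
const MAX_RETRY_AFTER_MS = 5 * 60 * 1000;

// Sketch of a parser for both Retry-After forms (delta-seconds and HTTP-date).
// Returns milliseconds to wait, or null when the value is missing or malformed.
function parseRetryAfter(
  value: string | null | undefined,
  nowMs: number = Date.now(),
): number | null {
  if (typeof value !== "string") return null;
  const trimmed = value.trim();
  if (trimmed === "") return null;

  // delta-seconds: a non-negative decimal integer.
  if (/^\d+$/.test(trimmed)) {
    return Math.min(Number(trimmed) * 1000, MAX_RETRY_AFTER_MS);
  }

  // HTTP-date: Date.parse accepts the "Wed, 21 Oct 2015 07:28:00 GMT" form.
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return null; // malformed value

  // Assumption: a date that is already in the past means "no extra wait".
  return Math.min(Math.max(dateMs - nowMs, 0), MAX_RETRY_AFTER_MS);
}

// Fixed clock 30 seconds before the HTTP-date used below.
const fixedNowMs = Date.UTC(2015, 9, 21, 7, 27, 30);

console.log(parseRetryAfter("30")); // 30000
console.log(parseRetryAfter("600")); // 300000 (clamped to the 5-minute cap)
console.log(parseRetryAfter("Wed, 21 Oct 2015 07:28:00 GMT", fixedNowMs)); // 30000
console.log(parseRetryAfter("garbage")); // null
```

Passing nowMs explicitly, as the PR's signatures allow, keeps the HTTP-date branch deterministic in tests.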

Then agent.ts calls computeRetryDelayMs(error, attemptIndex) instead of computing the delay inline. The hook surface, log shape, and retry-vs-fallback decision are unchanged.

Tests added:

  • retry-after.spec.ts — 18 unit tests covering parsing edge cases (delta-seconds, HTTP-date, malformed values, past dates, safety cap, missing header, lowercase/canonical precedence).
  • agent.spec.ts — one integration test that verifies a Retry-After: 30 on a 429-shaped error flows through to setTimeout as 30000 ms.

fixes #1276

Notes for reviewers

  • The Math.max(serverHint, exponentialFloor) choice is deliberate: when a server returns Retry-After: 0, the agent should still wait the exponential floor on subsequent attempts; otherwise a hot-loop retry storm is possible. If you prefer "server hint wins absolutely," I'm happy to flip it.
  • The 5-minute safety cap (MAX_RETRY_AFTER_MS) is tunable; the value matches what most HTTP clients use as a sane upper bound. I kept it as a module-local constant rather than a config knob to avoid expanding the public surface in this PR.
  • executeWithModelFallback already disables AI SDK internal retries (maxRetries: 0) for all four entry points, so this is the single retry-delay site that needs the change.
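
The trade-off in the first note can be sketched by comparing the two policies side by side. This is an illustration under stated assumptions, not the PR's code: the helper names `exponentialFloorMs`, `delayWithFloor`, and `delayServerWins` are hypothetical, and they take an already-parsed hint in milliseconds rather than the error object that computeRetryDelayMs receives.

```typescript
const MAX_RETRY_AFTER_MS = 5 * 60 * 1000; // 5-minute cap described in the PR

// Exponential floor, using the formula the description quotes from the old code.
function exponentialFloorMs(attemptIndex: number): number {
  return Math.min(1000 * 2 ** attemptIndex, 10000);
}

// Policy described in the PR: the larger of hint and floor wins, then the cap applies.
// `serverHintMs` is null when no usable Retry-After header was found.
function delayWithFloor(serverHintMs: number | null, attemptIndex: number): number {
  const floor = exponentialFloorMs(attemptIndex);
  if (serverHintMs === null) return floor;
  return Math.min(Math.max(serverHintMs, floor), MAX_RETRY_AFTER_MS);
}

// Alternative the author offers to switch to: the server hint wins outright.
function delayServerWins(serverHintMs: number | null, attemptIndex: number): number {
  if (serverHintMs === null) return exponentialFloorMs(attemptIndex);
  return Math.min(serverHintMs, MAX_RETRY_AFTER_MS);
}

// A server that keeps answering `Retry-After: 0` across attempts 0..3.
const attempts = [0, 1, 2, 3];
console.log(attempts.map((i) => delayWithFloor(0, i))); // [1000, 2000, 4000, 8000]
console.log(attempts.map((i) => delayServerWins(0, i))); // [0, 0, 0, 0]

// A 30-second hint dominates the floor under both policies.
console.log(delayWithFloor(30000, 0)); // 30000
console.log(delayServerWins(30000, 0)); // 30000
```

The two policies only diverge when the hint is smaller than the floor, which is exactly the Retry-After: 0 case the note is concerned about.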

Summary by cubic

Honor the provider’s Retry-After header on model retries to respect server backoff and reduce retry storms. Adds robust parsing and uses the server hint as a floor with a 5-minute safety cap; no API changes.

Written for commit b6f5b8c. Summary will update on new commits.

Summary by CodeRabbit

  • Bug Fixes
    • Improved retry handling for model calls to honor server rate-limit guidance while maintaining exponential backoff protection with a 5-minute safety cap.

Review Change Stack

The retry loop in `executeWithModelFallback` always used local exponential
backoff capped at 10 seconds, regardless of what the server asked for.
Under shared provider contention this caused concurrent agents to retry
inside the very window the provider had just told them to wait out,
amplifying load on already-overloaded endpoints.

Move the retry-delay math into a small `retry-after` module that parses
both delta-seconds and HTTP-date forms (RFC 7231 §7.1.3), takes the server
hint as a floor, keeps the exponential floor as a backpressure baseline,
and caps at 5 minutes so a misconfigured or hostile server cannot pin the
agent for hours.

Closes VoltAgent#1276.

changeset-bot Bot commented May 14, 2026

🦋 Changeset detected

Latest commit: b6f5b8c

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package

| Name | Type |
| --- | --- |
| @voltagent/core | Patch |

Contributor

coderabbitai Bot commented May 14, 2026

Walkthrough

This PR introduces RFC 7231 Retry-After header parsing and integrates it into Agent model retry logic. Three new utilities parse and apply server-provided retry delays as a minimum floor combined with exponential backoff. Agent's 429 retry path now respects these headers instead of ignoring them, reducing coordinated amplification under shared provider contention.

Changes

Retry-After Header Support

| Layer / File(s) | Summary |
| --- | --- |
| Retry-After RFC 7231 parsing utilities: packages/core/src/agent/retry-after.ts, packages/core/src/agent/retry-after.spec.ts | parseRetryAfter converts delta-seconds or HTTP-date values to milliseconds with validation and 5-minute clamping; getRetryAfterMs extracts the header from error response objects, checking the lowercase and canonical key spellings; computeRetryDelayMs computes the final delay as the maximum of exponential backoff and the parsed server hint. |
| Agent model retry integration: packages/core/src/agent/agent.ts, packages/core/src/agent/agent.spec.ts | Agent replaces its inline exponential backoff with a call to computeRetryDelayMs, allowing 429 retries to honor server-provided Retry-After headers. A new test verifies that a 429 with retry-after: 30 schedules a 30-second delay before retry. |
| Changeset documentation: .changeset/honor-retry-after-header.md | Documents the behavior change from purely local exponential backoff (10-second cap) to honoring server Retry-After hints with a 5-minute safety maximum. |

🎯 3 (Moderate) | ⏱️ ~25 minutes

🐰 A header came to say,
"Please wait, don't rush this way!"
No more herd stampedes at full speed—
We'll backoff politely when we need. 🎒✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'fix(core): honor Retry-After header on retried model calls' clearly and concisely summarizes the main change: implementing Retry-After header support in model retry logic. |
| Description check | ✅ Passed | The PR description comprehensively covers all template sections: commit convention, linked issue (#1276), tests added, changeset included, current/new behavior, and detailed reviewer notes. Description template fully satisfied. |
| Linked Issues check | ✅ Passed | The PR fully addresses issue #1276 requirements: parses Retry-After header (delta-seconds/HTTP-date per RFC 7231), uses it as minimum delay floor, implements exponential backoff fallback, and applies 5-minute safety cap. |
| Out of Scope Changes check | ✅ Passed | All changes are directly in scope: new retry-after module with parsing/delay logic, agent.ts update to use it, comprehensive unit and integration tests, and changeset documentation. No unrelated modifications detected. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@packages/core/src/agent/retry-after.ts`:
- Around line 69-76: getRetryAfterMs currently only checks
headers["retry-after"] and headers["Retry-After"], which misses mixed-case
names; change the lookup to be case-insensitive by normalizing header keys
(e.g., iterate Object.keys(responseHeaders) and compare key.toLowerCase() ===
"retry-after") or build a lower-cased map before fetching the value, then pass
the found raw value to parseRetryAfter; update references in getRetryAfterMs to
use the normalized lookup of responseHeaders rather than the two exact keys.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 12f4e233-59be-4d06-ac53-723b2cc3d0dd

📥 Commits

Reviewing files that changed from the base of the PR and between 08414ed and b6f5b8c.

📒 Files selected for processing (5)
  • .changeset/honor-retry-after-header.md
  • packages/core/src/agent/agent.spec.ts
  • packages/core/src/agent/agent.ts
  • packages/core/src/agent/retry-after.spec.ts
  • packages/core/src/agent/retry-after.ts

Comment on lines +69 to +76
export function getRetryAfterMs(error: unknown, nowMs: number = Date.now()): number | null {
  const headers = (error as { responseHeaders?: Record<string, string> } | undefined)
    ?.responseHeaders;
  if (!headers || typeof headers !== "object") {
    return null;
  }
  const raw = headers["retry-after"] ?? headers["Retry-After"];
  return parseRetryAfter(raw, nowMs);

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Handle Retry-After header names fully case-insensitively.

Line 75 only checks two exact key spellings. A mixed-case header key will be missed, so the server hint can be ignored unexpectedly.

Proposed fix
 export function getRetryAfterMs(error: unknown, nowMs: number = Date.now()): number | null {
   const headers = (error as { responseHeaders?: Record<string, string> } | undefined)
     ?.responseHeaders;
   if (!headers || typeof headers !== "object") {
     return null;
   }
-  const raw = headers["retry-after"] ?? headers["Retry-After"];
+  const raw = Object.entries(headers).find(
+    ([key]) => key.toLowerCase() === "retry-after",
+  )?.[1];
   return parseRetryAfter(raw, nowMs);
 }
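
The difference between the two lookups in the diff above can be checked in isolation. In this minimal sketch the wrapper names `lookupExact` and `lookupCaseInsensitive` and the sample header objects are hypothetical; the lookup expressions themselves are copied from the removed and added lines.

```typescript
// Lookup from the original code: two exact spellings only.
function lookupExact(headers: Record<string, string>): string | undefined {
  return headers["retry-after"] ?? headers["Retry-After"];
}

// Lookup from the proposed fix: compare lower-cased keys.
function lookupCaseInsensitive(headers: Record<string, string>): string | undefined {
  return Object.entries(headers).find(
    ([key]) => key.toLowerCase() === "retry-after",
  )?.[1];
}

// Hypothetical header objects with different key casings.
const lowercase = { "retry-after": "30" };
const upperCase = { "RETRY-AFTER": "30" };

console.log(lookupExact(lowercase)); // "30"
console.log(lookupExact(upperCase)); // undefined: the hint is silently lost
console.log(lookupCaseInsensitive(lowercase)); // "30"
console.log(lookupCaseInsensitive(upperCase)); // "30"
```

Whether any provider client actually surfaces such a key depends on how responseHeaders is populated upstream, which this page does not show.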

Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

No issues found across 5 files

Development

Successfully merging this pull request may close these issues.

agent.ts — 429 retry path ignores Retry-After, coordinated amplification under shared provider contention
