fix(core): honor Retry-After header on retried model calls#1283
fix(core): honor Retry-After header on retried model calls#1283truffle-dev wants to merge 1 commit into
Conversation
The retry loop in `executeWithModelFallback` always used local exponential backoff capped at 10 seconds, regardless of what the server asked for. Under shared provider contention this caused concurrent agents to converge their retry windows into the same window the provider had just told them to wait past, amplifying load on already-overloaded endpoints. Move the retry-delay math into a small `retry-after` module that parses both delta-seconds and HTTP-date forms (RFC 7231 §7.1.3), takes the server hint as a floor, keeps the exponential floor as a backpressure baseline, and caps at 5 minutes so a misconfigured or hostile server cannot pin the agent for hours. Closes VoltAgent#1276.
🦋 Changeset detectedLatest commit: b6f5b8c The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
📝 WalkthroughWalkthroughThis PR introduces RFC 7231 ChangesRetry-After Header Support
🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 ESLint
ESLint skipped: no ESLint configuration detected in root package.json. To enable, add Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/core/src/agent/retry-after.ts`:
- Around line 69-76: getRetryAfterMs currently only checks
headers["retry-after"] and headers["Retry-After"], which misses mixed-case
names; change the lookup to be case-insensitive by normalizing header keys
(e.g., iterate Object.keys(responseHeaders) and compare key.toLowerCase() ===
"retry-after") or build a lower-cased map before fetching the value, then pass
the found raw value to parseRetryAfter; update references in getRetryAfterMs to
use the normalized lookup of responseHeaders rather than the two exact keys.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 12f4e233-59be-4d06-ac53-723b2cc3d0dd
📒 Files selected for processing (5)
.changeset/honor-retry-after-header.mdpackages/core/src/agent/agent.spec.tspackages/core/src/agent/agent.tspackages/core/src/agent/retry-after.spec.tspackages/core/src/agent/retry-after.ts
| export function getRetryAfterMs(error: unknown, nowMs: number = Date.now()): number | null { | ||
| const headers = (error as { responseHeaders?: Record<string, string> } | undefined) | ||
| ?.responseHeaders; | ||
| if (!headers || typeof headers !== "object") { | ||
| return null; | ||
| } | ||
| const raw = headers["retry-after"] ?? headers["Retry-After"]; | ||
| return parseRetryAfter(raw, nowMs); |
There was a problem hiding this comment.
Handle Retry-After header names fully case-insensitively.
Line 75 only checks two exact key spellings. A mixed-case header key will be missed, so the server hint can be ignored unexpectedly.
Proposed fix
export function getRetryAfterMs(error: unknown, nowMs: number = Date.now()): number | null {
const headers = (error as { responseHeaders?: Record<string, string> } | undefined)
?.responseHeaders;
if (!headers || typeof headers !== "object") {
return null;
}
- const raw = headers["retry-after"] ?? headers["Retry-After"];
+ const raw = Object.entries(headers).find(
+ ([key]) => key.toLowerCase() === "retry-after",
+ )?.[1];
return parseRetryAfter(raw, nowMs);
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| export function getRetryAfterMs(error: unknown, nowMs: number = Date.now()): number | null { | |
| const headers = (error as { responseHeaders?: Record<string, string> } | undefined) | |
| ?.responseHeaders; | |
| if (!headers || typeof headers !== "object") { | |
| return null; | |
| } | |
| const raw = headers["retry-after"] ?? headers["Retry-After"]; | |
| return parseRetryAfter(raw, nowMs); | |
| export function getRetryAfterMs(error: unknown, nowMs: number = Date.now()): number | null { | |
| const headers = (error as { responseHeaders?: Record<string, string> } | undefined) | |
| ?.responseHeaders; | |
| if (!headers || typeof headers !== "object") { | |
| return null; | |
| } | |
| const raw = Object.entries(headers).find( | |
| ([key]) => key.toLowerCase() === "retry-after", | |
| )?.[1]; | |
| return parseRetryAfter(raw, nowMs); | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/core/src/agent/retry-after.ts` around lines 69 - 76, getRetryAfterMs
currently only checks headers["retry-after"] and headers["Retry-After"], which
misses mixed-case names; change the lookup to be case-insensitive by normalizing
header keys (e.g., iterate Object.keys(responseHeaders) and compare
key.toLowerCase() === "retry-after") or build a lower-cased map before fetching
the value, then pass the found raw value to parseRetryAfter; update references
in getRetryAfterMs to use the normalized lookup of responseHeaders rather than
the two exact keys.
PR Checklist
Bugs / Features
What is the current behavior?
The retry loop in
executeWithModelFallback(the single retry-delay site forstreamText/generateText/streamObject/generateObjectafter their AI-SDK-internal retries are disabled withmaxRetries: 0) always used local exponential backoff capped at 10 seconds:APICallErrorcarries the provider's response headers, but they are dropped on the floor. So when a provider responds 429 withRetry-After: 30, the agent tries again in 1–10 seconds and gets rate-limited again, and N concurrent agents under the same provider key converge their retry windows into roughly the same instant.What is the new behavior?
Move the retry-delay math into a small
retry-aftermodule:parseRetryAfter(value, nowMs?)understands both forms in RFC 7231 §7.1.3 (delta-seconds and HTTP-date).getRetryAfterMs(error, nowMs?)pulls the header offerror.responseHeadersin either case (lowercase or canonical).computeRetryDelayMs(error, attemptIndex, nowMs?)returnsmax(serverHint, exponentialFloor)when a header is present, keeping the exponential floor as a backpressure baseline soRetry-After: 0still spaces things out. Result is capped at 5 minutes so a misconfigured or hostile server can't pin the agent.Then
agent.tscallscomputeRetryDelayMs(error, attemptIndex)instead of computing the delay inline. The hook surface, log shape, and retry-vs-fallback decision are unchanged.Tests added:
retry-after.spec.ts— 18 unit tests covering parsing edge cases (delta-seconds, HTTP-date, malformed values, past dates, safety cap, missing header, lowercase/canonical precedence).agent.spec.ts— one integration test that verifies aRetry-After: 30on a 429-shaped error flows through tosetTimeoutas 30000 ms.fixes #1276
Notes for reviewers
Math.max(serverHint, exponentialFloor)choice is deliberate: a server that returnsRetry-After: 0should still wait the exponential floor on subsequent attempts, otherwise a hot-loop retry storm is possible. If you prefer "server hint wins absolutely," I'm happy to flip it.MAX_RETRY_AFTER_MS) is tunable; the value matches what most HTTP clients use as a sane upper bound. I kept it as a module-local constant rather than a config knob to avoid expanding the public surface in this PR.executeWithModelFallbackalready disables AI SDK internal retries (maxRetries: 0) for all four entry points, so this is the single retry-delay site that needs the change.Summary by cubic
Honor the provider’s Retry-After header on model retries to respect server backoff and reduce retry storms. Adds robust parsing and uses the server hint as a floor with a 5-minute safety cap; no API changes.
Retry-After(delta-seconds or HTTP-date) fromerror.responseHeadersinexecuteWithModelFallback.retry-afterhelper (parseRetryAfter,getRetryAfterMs,computeRetryDelayMs) and tests for edge cases; agent updated to callcomputeRetryDelayMs.Written for commit b6f5b8c. Summary will update on new commits.
Summary by CodeRabbit