fix(streamHandler): prevent V8 GC of upstream Response mid-stream #1658
Open
meitalbensinai wants to merge 1 commit into
Under concurrent streaming load, `handleStreamingMode` returns a new
`Response` wrapping `readable`. The upstream `response` is then no
longer referenced from any user-visible variable. The unawaited async
IIFE that pipes upstream → writer captures only `reader` and `writer`
in its closure, not `response` itself.
In Node's undici-backed fetch, the `ReadableStreamDefaultReader` does
not keep its parent `Response` alive — the `Response` owns the
underlying network connection. When V8 GC runs (driven by allocation
churn from concurrent streams, not absolute memory pressure), the
upstream `response` can be collected before the body finishes
streaming. The next `reader.read()` then throws
"Response object has been garbage collected", the IIFE catches it and
closes the writer, and the consumer sees a truncated stream / TLS
close.
The async IIFE itself is also a GC hazard — its promise is not
anchored anywhere. In practice the microtask queue keeps it alive, but
that is not guaranteed under aggressive GC.
Fix: pack `response`, `reader`, and `writer` into a `streamCtx` object
that the closure references explicitly, capture the IIFE promise as
`streamTask`, and anchor both on the returned `readable` (which the
caller's `Response` keeps alive for the duration of the stream).
Reference chain after the fix:
  caller's Response → readable → __pkg_streamCtx → upstream response
                               → __pkg_streamTask → IIFE closure
This keeps the upstream response, the reader, the writer, and the
piping task strongly referenced for the entire stream lifetime, with
zero behavior change for the happy path.
Confirming we hit this on 1.15.2 in production, same `Response object has been garbage collected` + `WritableStream is closed` signature. Thanks for the fix.
Summary
Streamed responses through `handleStreamingMode` can be severed mid-stream under concurrent load. Root cause: V8 garbage-collects the upstream `Response` object before its body finishes streaming, because no live reference to it is held for the stream's lifetime.
This PR holds strong references to the upstream `response`, `reader`, `writer`, and the piping task for the entire stream lifetime by anchoring them on the returned `readable`.
Symptoms observed under load
In the gateway logs:
In the consumer (Node fetch / undici / OpenAI SDK):
The consumer sees `finishReason: 'other'` and a truncated stream, often missing the final chunk(s) including any tool-call payloads.
PR #1306 ("fix: handle stream close failures", merged Sep 2025) names the same `"Response object has been garbage collected"` error in its description, but its scope was to wrap the secondary `writer.close()` failure in a try/catch. That prevents the unhandled rejection from crashing the process, but does not prevent the primary GC of the upstream `Response`. That stream is still lost.
Root cause
In `handleStreamingMode`:
After return, the only externally-reachable references are the caller's new `Response` (wrapping `readable`) and `writable`. The original upstream `response` is no longer referenced from any variable that outlives the function.
The IIFE closure does capture `reader` and `writer`, but not `response` itself. In Node's undici-backed fetch, `ReadableStreamDefaultReader` does not keep its parent `Response` alive: the `Response` is what owns the underlying network connection / dispatcher state. When V8 GC fires (driven by allocation churn from concurrent streams, not absolute memory pressure), the upstream `response` can be collected. The next `reader.read()` then throws `"Response object has been garbage collected"` from inside undici / the OpenAI SDK's stream code.
The IIFE promise itself is also a GC hazard: it is not anchored anywhere. In current V8 it usually stays alive via microtask-queue references, but under sustained load with aggressive GC there is no guarantee.
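The shape of the problem described above can be sketched roughly as follows. This is a simplified illustration, not the project's actual `handleStreamingMode` source: the function name `pipeWithoutAnchor` is hypothetical, and an in-memory `Response` will not reproduce the GC failure, which the PR attributes to undici's network-backed responses.

```typescript
// Hypothetical, simplified version of the pre-fix pattern (not the real source).
function pipeWithoutAnchor(response: Response): Response {
  const { readable, writable } = new TransformStream<Uint8Array, Uint8Array>();
  const writer = writable.getWriter();
  const reader = response.body!.getReader();

  // Unawaited async IIFE: its closure captures `reader` and `writer` only.
  void (async () => {
    try {
      while (true) {
        const result = await reader.read();
        if (result.done) break;
        await writer.write(result.value);
      }
    } catch {
      // Per the PR, a failed `reader.read()` lands here and the stream is cut short.
    } finally {
      try {
        await writer.close();
      } catch {
        // The writer may already be closed or errored.
      }
    }
  })();

  // After this return, nothing in this function's closure refers to `response`.
  return new Response(readable, { headers: response.headers });
}
```

With an in-memory body, `await pipeWithoutAnchor(new Response("hello")).text()` still yields the full body; the point of the sketch is only which objects the closure does and does not reference.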
Why concurrency exposes this
Each concurrent stream allocates a fresh `Response`, a `ReadableStreamDefaultReader`, a `TransformStream` with its buffers, encoder/decoder state, and many `Uint8Array` chunks. At a few dozen concurrent streams, GC is provoked multiple times per second. The wider the GC sweep, the higher the chance it collects an upstream `Response` that the only remaining strong holder (the IIFE closure) does not reference. Frequency scales roughly with concurrent stream count.
Why this is not fixed by newer Node / undici / OpenAI SDK
It is a reference-management bug in how `handleStreamingMode` keeps the upstream `Response` alive. Newer Node versions are more aggressive about freeing unreferenced HTTP resources, which makes the bug worse, not better.
The fix
Pack the upstream `response`, `reader`, and `writer` into a `streamCtx` object that the IIFE references explicitly. Capture the IIFE promise as `streamTask`. Anchor both on the returned `readable` (which the caller's `Response` keeps alive for the duration of the stream):
The chain of strong references after the fix:
  caller's Response → readable → __pkg_streamCtx → upstream response
                               → __pkg_streamTask → IIFE closure
As long as the caller holds the returned `Response` (which it does for the entire stream duration), the entire chain is GC-protected.
Applied identically to the BEDROCK and non-BEDROCK code paths.
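A minimal sketch of the anchoring approach, assuming a simplified pipe with no transformation step. The property names `__pkg_streamCtx` and `__pkg_streamTask` and the `as any` cast come from this PR's description; the function name `pipeWithAnchor` and everything else here is illustrative and may differ from the actual diff.

```typescript
// Illustrative only: a simplified pipe that anchors its context on `readable`.
function pipeWithAnchor(response: Response): Response {
  const { readable, writable } = new TransformStream<Uint8Array, Uint8Array>();

  // Bundle the upstream response with its reader and writer so the closure
  // below holds a strong reference to all three.
  const streamCtx = {
    response,
    reader: response.body!.getReader(),
    writer: writable.getWriter(),
  };

  // Keep the IIFE's promise instead of discarding it.
  const streamTask = (async () => {
    try {
      while (true) {
        const result = await streamCtx.reader.read();
        if (result.done) break;
        await streamCtx.writer.write(result.value);
      }
    } catch {
      // Read or write failed; fall through and close the writer.
    } finally {
      try {
        await streamCtx.writer.close();
      } catch {
        // The writer may already be closed or errored.
      }
    }
  })();

  // Anchor both on the returned stream: readable -> streamCtx -> response.
  const anchored = readable as any;
  anchored.__pkg_streamCtx = streamCtx;
  anchored.__pkg_streamTask = streamTask;

  return new Response(readable, { headers: response.headers });
}
```

The extra properties sit on the stream object only, so the consumer reading from the returned `Response` sees the same bytes as before.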
Why this is safe
- [...] `readable`. The cast is `as any`; they are not exposed to the consumer.
- [...] `streamCtx`: no shared state, scales linearly with concurrency.
- `build` (rollup) passes cleanly on the modified file; no new TypeScript warnings.
- `prettier --check` passes.
What this does NOT fix
Test plan
- `npx rollup -c` builds clean
- `npx prettier --check` passes
- [...] `--expose-gc`); happy to add separately if maintainers want it
Reported in #1659.
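The test plan leaves automated coverage open. As one possible starting point, a concurrency harness could count how many streams come back short. This is a hedged sketch: `countTruncated`, `Pipe`, and the synthetic one-byte chunks are hypothetical names and choices, and in-memory streams only exercise the harness itself. Reproducing the GC failure would need the real gateway path under load.

```typescript
// Hypothetical harness: runs `concurrency` synthetic streams through `pipe`
// and counts how many arrive incomplete or fail outright.
type Pipe = (upstream: Response) => Response;

async function countTruncated(
  pipe: Pipe,
  concurrency: number,
  chunks: number,
): Promise<number> {
  const encoder = new TextEncoder();

  // Each synthetic upstream emits `chunks` one-byte chunks, then closes.
  const makeUpstream = (): Response => {
    let sent = 0;
    const body = new ReadableStream<Uint8Array>({
      pull(controller) {
        if (sent++ < chunks) controller.enqueue(encoder.encode("x"));
        else controller.close();
      },
    });
    return new Response(body);
  };

  const results = await Promise.allSettled(
    Array.from({ length: concurrency }, () => pipe(makeUpstream()).text()),
  );

  // A stream counts as truncated if it rejected or delivered too few bytes.
  return results.filter(
    (r) => r.status !== "fulfilled" || r.value.length !== chunks,
  ).length;
}
```

Passing the identity function `(r) => r` should report no truncated streams; a real run would pass a wrapper around the gateway's streaming handler instead.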