Skip to content

fix(streamHandler): prevent V8 GC of upstream Response mid-stream#1658

Open
meitalbensinai wants to merge 1 commit into
Portkey-AI:mainfrom
meitalbensinai:fix/streamhandler-response-gc
Open

fix(streamHandler): prevent V8 GC of upstream Response mid-stream#1658
meitalbensinai wants to merge 1 commit into
Portkey-AI:mainfrom
meitalbensinai:fix/streamhandler-response-gc

Conversation

@meitalbensinai
Copy link
Copy Markdown

@meitalbensinai meitalbensinai commented May 19, 2026

Summary

Streamed responses through handleStreamingMode can be severed mid-stream under concurrent load. Root cause: V8 garbage-collects the upstream Response object before its body finishes streaming, because no live reference to it is held for the stream's lifetime.

This PR holds strong references to the upstream response, reader, writer, and the piping task for the entire stream lifetime by anchoring them on the returned readable.

Symptoms observed under load

In the gateway logs:

Error during stream processing: openai [ undefined, 'Response object has been garbage collected' ]
Failed to close the writer: openai TypeError [ERR_INVALID_STATE]: Invalid state: WritableStream is closed

In the consumer (Node fetch / undici / OpenAI SDK):

TypeError: terminated
    at Fetch.onAborted (node:internal/deps/undici/undici:...)
    at TLSSocket.onHttpSocketClose (...)
  cause: { name: 'SocketError', message: 'other side closed' }

The consumer sees finishReason: 'other' and a truncated stream — often missing the final chunk(s) including any tool-call payloads.

PR #1306 ("fix: handle stream close failures", merged Sep 2025) names the same "Response object has been garbage collected" error in its description, but its scope was to wrap the secondary writer.close() failure in a try/catch — which prevents the unhandled rejection from crashing the process, but does not prevent the primary GC of the upstream Response. That stream is still lost.

Root cause

In handleStreamingMode:

export function handleStreamingMode(response: Response, ...) {
  const { readable, writable } = new TransformStream();
  const writer  = writable.getWriter();
  const reader  = response.body.getReader();   // ← only reader is captured

  (async () => {
    try {
      for await (const chunk of readStream(reader, ...)) {
        await writer.write(encoder.encode(chunk));
      }
    } catch (error) { console.error(...); }
    finally { try { await writer.close(); } catch (e) { console.error(...); } }
  })();   // ← IIFE promise discarded

  return new Response(readable, response);   // caller only holds THIS
}

After return, the only externally-reachable references are the caller's new Response (wrapping readable) and writable. The original upstream response is no longer referenced from any variable that outlives the function.

The IIFE closure does capture reader and writer, but not response itself. In Node's undici-backed fetch, ReadableStreamDefaultReader does not keep its parent Response alive — the Response is what owns the underlying network connection / dispatcher state. When V8 GC fires (driven by allocation churn from concurrent streams, not absolute memory pressure), the upstream response can be collected. The next reader.read() then throws "Response object has been garbage collected" from inside undici / the OpenAI SDK's stream code.

The IIFE promise itself is also a GC hazard — it is not anchored anywhere. In current V8 it usually stays alive via microtask-queue references, but under sustained load with aggressive GC there is no guarantee.

Why concurrency exposes this

Each concurrent stream allocates a fresh Response, a ReadableStreamDefaultReader, a TransformStream with its buffers, encoder/decoder state, and many Uint8Array chunks. At a few dozen concurrent streams, GC is provoked multiple times per second. The wider the GC sweep, the higher the chance it collects an upstream Response whose only strong reference (the IIFE closure) does not include it. Frequency scales roughly with concurrent stream count.

Why this is not fixed by newer Node / undici / OpenAI SDK

It is a reference-management bug in how handleStreamingMode keeps the upstream Response alive. Newer Node versions are more aggressive about freeing unreferenced HTTP resources, which makes the bug worse, not better.

The fix

Pack the upstream response, reader, and writer into a streamCtx object that the IIFE references explicitly. Capture the IIFE promise as streamTask. Anchor both on the returned readable (which the caller's Response keeps alive for the duration of the stream):

const streamCtx  = { response, reader, writer };
let   streamTask: Promise<void>;

if (proxyProvider === BEDROCK) {
  streamTask = (async () => { /* uses streamCtx.reader / streamCtx.writer */ })();
} else {
  streamTask = (async () => { /* same */ })();
}

(readable as any).__pkg_streamTask = streamTask;
(readable as any).__pkg_streamCtx  = streamCtx;

The chain of strong references after the fix:

caller's Response  →  readable  →  __pkg_streamCtx  →  upstream response
                              →  __pkg_streamTask  →  the IIFE closure

As long as the caller holds the returned Response (which it does for the entire stream duration), the entire chain is GC-protected.

Applied identically to the BEDROCK and non-BEDROCK code paths.

Why this is safe

  • Adds two non-enumerable-style properties to readable. The cast is as any; they are not exposed to the consumer.
  • No behavior change for the success path or the error path.
  • No new throws, no new code paths.
  • Each stream gets its own streamCtx — no shared state, scales linearly with concurrency.
  • build (rollup) passes cleanly on the modified file; no new TypeScript warnings.
  • prettier --check passes.

What this does NOT fix

  • Genuine upstream timeouts / TLS drops — those are real connection errors that need separate handling.
  • Out-of-memory at very high concurrent loads — the fix keeps refs alive longer, which is the goal, but the pod still has finite memory. Operators running at very high concurrency may need to raise memory limits.

Test plan

  • npx rollup -c builds clean
  • npx prettier --check passes
  • A unit test that simulates GC pressure (requires --expose-gc); happy to add separately if maintainers want it
  • Soak test with concurrent streams of varying durations

Reported in #1659.

Under concurrent streaming load, `handleStreamingMode` returns a new
`Response` wrapping `readable`. The upstream `response` is then no
longer referenced from any user-visible variable. The unawaited async
IIFE that pipes upstream → writer captures only `reader` and `writer`
in its closure, not `response` itself.

In Node's undici-backed fetch, the `ReadableStreamDefaultReader` does
not keep its parent `Response` alive — the `Response` owns the
underlying network connection. When V8 GC runs (driven by allocation
churn from concurrent streams, not absolute memory pressure), the
upstream `response` can be collected before the body finishes
streaming. The next `reader.read()` then throws
"Response object has been garbage collected", the IIFE catches it and
closes the writer, and the consumer sees a truncated stream / TLS
close.

The async IIFE itself is also a GC hazard — its promise is not
anchored anywhere. In practice the microtask queue keeps it alive, but
that is not guaranteed under aggressive GC.

Fix: pack `response`, `reader`, and `writer` into a `streamCtx` object
that the closure references explicitly, capture the IIFE promise as
`streamTask`, and anchor both on the returned `readable` (which the
caller's `Response` keeps alive for the duration of the stream).

Reference chain after the fix:
  caller's Response → readable → __pkg_streamCtx → upstream response
                              → __pkg_streamTask → IIFE closure

This keeps the upstream response, the reader, the writer, and the
piping task strongly referenced for the entire stream lifetime, with
zero behavior change for the happy path.
@jacobya-1
Copy link
Copy Markdown

jacobya-1 commented May 19, 2026

Confirming we hit this on 1.15.2 in production, same Response object has been garbage collected + WritableStream is closed signature. Thanks for the fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants