fix(oracle): route maxTokens to maxCompletionTokens for GPT-5 family #1635

Open
fede-kamel wants to merge 1 commit into Portkey-AI:main from fede-kamel:feat/oracle-gpt5-max-completion-tokens

@fede-kamel fede-kamel commented May 6, 2026

Summary

OCI's chat endpoint rejects maxTokens for the GPT-5 family with Use 'maxCompletionTokens' instead, so any GPT-5 request through the gateway currently fails with a 400 unless the caller already knows the OCI quirk and works around it.

This PR detects the GPT-5 family inline (/^openai\.gpt-5/i) and rewrites the parameter on the OCI chat request envelope. Everything else (gpt-4o, Llama, Cohere, Grok, Gemini) continues to use maxTokens unchanged.

Split out from #1537 per @narengogi's review.

What changed

src/providers/oracle/chatComplete.ts:

  • Add a small inline usesMaxCompletionTokens(model) helper (no separate modelConfig.ts module — kept tight).
  • In the chatRequest transform, after building the OCI envelope, swap maxTokens → maxCompletionTokens when the model is in the GPT-5 family.
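
As a rough illustration of the two bullets above, the swap could look something like the sketch below. Only the regex and the helper name `usesMaxCompletionTokens` come from this PR description; the `OciEnvelope` and `OciChatRequest` types and the `routeMaxTokens` function are hypothetical stand-ins, and the real transform in src/providers/oracle/chatComplete.ts may be structured differently.

```typescript
// Hypothetical, simplified shape of the OCI chat request envelope.
// The real envelope in the gateway carries more fields than this.
interface OciChatRequest {
  maxTokens?: number;
  maxCompletionTokens?: number;
  messages?: unknown[];
}

interface OciEnvelope {
  chatRequest: OciChatRequest;
}

// Regex as quoted in the PR description.
const GPT5_FAMILY = /^openai\.gpt-5/i;

// Helper name from the PR; the body is an assumption based on the regex.
function usesMaxCompletionTokens(model: string): boolean {
  return GPT5_FAMILY.test(model);
}

// Hypothetical post-processing step: move maxTokens to maxCompletionTokens
// for GPT-5 family models and leave every other envelope untouched.
function routeMaxTokens(model: string, envelope: OciEnvelope): OciEnvelope {
  const { maxTokens, ...rest } = envelope.chatRequest;
  if (!usesMaxCompletionTokens(model) || maxTokens === undefined) {
    return envelope;
  }
  return {
    ...envelope,
    chatRequest: { ...rest, maxCompletionTokens: maxTokens },
  };
}
```

Because the pattern is anchored only at the start and is case-insensitive, it also matches suffixed IDs such as openai.gpt-5-mini, while IDs such as openai.gpt-4o fall through unchanged.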

src/providers/oracle/chatComplete.test.ts:

  • openai.gpt-5 and openai.gpt-5-mini route to maxCompletionTokens.
  • openai.gpt-4o, meta.llama-3.3-70b-instruct, cohere.command-r-plus-08-2024, xai.grok-3-mini, google.gemini-2.5-flash keep maxTokens.
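
The cases listed above lend themselves to a table-driven check. The sketch below is not the test file from this PR: it uses a local stand-in predicate (`matchesGpt5Family`, a hypothetical name) built from the regex in the description, and plain TypeScript rather than any particular test runner.

```typescript
// Stand-in for the PR's inline helper, built from the regex in the description.
const gpt5Pattern = /^openai\.gpt-5/i;
const matchesGpt5Family = (model: string): boolean => gpt5Pattern.test(model);

// Model IDs from the PR description, paired with whether they should
// route to maxCompletionTokens (true) or keep maxTokens (false).
const routingCases: [string, boolean][] = [
  ["openai.gpt-5", true],
  ["openai.gpt-5-mini", true],
  ["openai.gpt-4o", false],
  ["meta.llama-3.3-70b-instruct", false],
  ["cohere.command-r-plus-08-2024", false],
  ["xai.grok-3-mini", false],
  ["google.gemini-2.5-flash", false],
];

// Collect every model whose routing decision differs from the expectation.
const routingFailures = routingCases
  .filter(([model, expected]) => matchesGpt5Family(model) !== expected)
  .map(([model]) => model);

console.log(
  routingFailures.length === 0
    ? "all cases pass"
    : `failing: ${routingFailures.join(", ")}`
);
```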

Diff: +63 / −1, two files.

Verified

  • npm run build
  • npm run format:check
  • Type-clean (tsc --noEmit reports no errors in src/providers/oracle/)
  • Verified live against inference.generativeai.us-chicago-1.oci.oraclecloud.com/20231130/actions/chat for openai.gpt-5 (now succeeds; previously returned 400 "Use 'maxCompletionTokens' instead").

Related

OCI's chat endpoint rejects `maxTokens` for GPT-5+ models with
`Use 'maxCompletionTokens' instead`, so requests through the gateway
fail with a 400 unless the caller knows the OCI-specific quirk.

Detect the GPT-5 family inline (`/^openai\.gpt-5/i`) and rewrite the
parameter on the OCI chat request envelope. All other model families
(gpt-4o, Llama, Cohere, Grok, Gemini) continue to use `maxTokens`
unchanged.

Tests cover gpt-5, gpt-5-mini, gpt-4o (negative), and the Llama /
Cohere / Grok / Gemini families.
@fede-kamel
Author

@roh26it @narengogi @VisargD — companion to #1537. Rebased on main, mergeable, e2e-verified live against openai.gpt-5 on us-chicago-1 (HTTP 200; without this fix OCI returns 400 "Use 'maxCompletionTokens' instead"). Diff is +63 / −1, two files. See #1537 for the broader context. Happy to take review feedback.

@fede-kamel
Author

Friendly review ping @narengogi — this is the GPT-5 maxCompletionTokens split you asked for back when #1537 was trimmed. Description updated to reflect that it supersedes #1540's GPT-5 portion (this one uses an inline regex; the old one had a separate modelConfig.ts module). CI green, mergeable, ~63 lines + tests.
