fix(huggingface): use HF inference router for default base URL #1631

Draft
cybertron288 wants to merge 1 commit into Portkey-AI:main from cybertron288:fix/issue-1626

@cybertron288

Closes #1626

What

Update the default Hugging Face base URL from https://api-inference.huggingface.co (the legacy HF Serverless Inference API) to https://router.huggingface.co (the new HF Inference Providers OpenAI-compatible router). Also drop the per-model path from the default endpoint construction, since the router takes the model in the request body.
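
To make the before/after concrete, here is a minimal sketch of the URL construction this PR describes. It is not the gateway's actual code: the function names, the `EndpointKind` type, and the `/v1/completions` path are assumptions for illustration. Only the two base URLs, the `/models/{model}` segment, the `/v1/chat/completions` path, and the `huggingfaceBaseUrl` option name come from this PR.

```typescript
// Illustrative sketch only: names and shapes are hypothetical, not the gateway's real API.
type EndpointKind = "chatComplete" | "complete";

interface HuggingFaceOptions {
  // Custom dedicated-endpoint override (option name taken from the PR text).
  huggingfaceBaseUrl?: string;
}

const LEGACY_BASE_URL = "https://api-inference.huggingface.co";
const ROUTER_BASE_URL = "https://router.huggingface.co";

function stripTrailingSlash(url: string): string {
  return url.replace(/\/+$/, "");
}

function endpointPath(kind: EndpointKind): string {
  // "/v1/completions" is an assumed path for the non-chat case.
  return kind === "chatComplete" ? "/v1/chat/completions" : "/v1/completions";
}

// Before: the default URL embeds the model in the path; the override does not.
function legacyUrl(
  opts: HuggingFaceOptions,
  kind: EndpointKind,
  model: string
): string {
  if (opts.huggingfaceBaseUrl) {
    return stripTrailingSlash(opts.huggingfaceBaseUrl) + endpointPath(kind);
  }
  return `${LEGACY_BASE_URL}/models/${model}${endpointPath(kind)}`;
}

// After: one code path; the model is expected in the request body instead.
function routerUrl(opts: HuggingFaceOptions, kind: EndpointKind): string {
  const base = opts.huggingfaceBaseUrl
    ? stripTrailingSlash(opts.huggingfaceBaseUrl)
    : ROUTER_BASE_URL;
  return base + endpointPath(kind);
}
```

With no override, `routerUrl({}, "chatComplete")` yields `https://router.huggingface.co/v1/chat/completions`, while a custom `huggingfaceBaseUrl` produces the same URL under both the old and new construction.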

Why

The legacy serverless API endpoint with the /models/{model}/v1/chat/completions path has been superseded by HF's unified Inference Providers router (per https://huggingface.co/docs/inference-providers). Issue #1626 reports that the documented Hugging Face setup no longer works through Portkey and points to the outdated default URL.

Notes

  • Opened as draft — please review before marking ready.
  • The huggingfaceBaseUrl override path already used a bare /v1/chat/completions endpoint (no /models/{model}), so dedicated-endpoint users are unaffected.
  • Verification: no test suite was run locally, since the change is a config-only edit. Reviewers should sanity-check the new URL and the assumption that the inference router accepts the OpenAI-compatible chat-completions / completions schema with the model in the request body.
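
For reviewers who want to sanity-check that assumption, a request along the following lines could be built and sent to the router. This is a hedged sketch: `buildChatRequest`, the `ChatMessage` type, and the placeholder model ID and token are hypothetical, and the bearer-token header reflects the usual OpenAI-compatible convention rather than anything verified in this PR.

```typescript
// Hypothetical helper for a manual check; not part of the gateway.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface BuiltRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

const HF_ROUTER_URL = "https://router.huggingface.co";
const CHAT_COMPLETIONS_PATH = "/v1/chat/completions";

function buildChatRequest(
  model: string,
  apiKey: string,
  messages: ChatMessage[]
): BuiltRequest {
  return {
    // No /models/{model} segment: the URL is the same for every model.
    url: HF_ROUTER_URL + CHAT_COMPLETIONS_PATH,
    method: "POST",
    headers: {
      // Assumed auth scheme: bearer token, as in OpenAI-compatible APIs.
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    // The model travels in the JSON body rather than the URL path.
    body: JSON.stringify({ model, messages }),
  };
}

// Placeholder values; substitute a real model ID and token when testing.
const example = buildChatRequest("org/model-name", "hf_placeholder_token", [
  { role: "user", content: "ping" },
]);
```

Passing `example.url` together with the `method`, `headers`, and `body` fields to `fetch` and checking for a 200 response with a `choices` array would exercise the assumption end to end.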

…ey-AI#1626)

The default base URL 'https://api-inference.huggingface.co' is the
legacy HF Serverless Inference API endpoint, which has been deprecated
in favor of the unified HF Inference Providers router. The legacy
endpoint with the per-model path '/models/{model}/v1/chat/completions'
is no longer the recommended OpenAI-compatible entrypoint.

Switch the default base URL to 'https://router.huggingface.co', which
is the OpenAI-compatible inference router and accepts the model in the
request body rather than the URL path. Custom huggingfaceBaseUrl
behavior is unchanged: it already ignored the per-model path and used
'/v1/chat/completions' directly, so the unified getEndpoint matches
existing dedicated-endpoint behavior.
Development

Successfully merging this pull request may close these issues.

Hugging Face provider does not work with the documented setup
