fix(huggingface): use HF inference router for default base URL #1631
Draft
cybertron288 wants to merge 1 commit into
…ey-AI#1626) The default base URL 'https://api-inference.huggingface.co' is the legacy HF Serverless Inference API endpoint, which has been deprecated in favor of the unified HF Inference Providers router. The legacy endpoint with the per-model path '/models/{model}/v1/chat/completions' is no longer the recommended OpenAI-compatible entrypoint. Switch the default base URL to 'https://router.huggingface.co', which is the OpenAI-compatible inference router and accepts the model in the request body rather than the URL path. Custom huggingfaceBaseUrl behavior is unchanged: it already ignored the per-model path and used '/v1/chat/completions' directly, so the unified getEndpoint matches existing dedicated-endpoint behavior.
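To make "the model in the request body rather than the URL path" concrete, here is a minimal sketch of how a chat completion request to the router might be assembled. The helper name `buildRouterChatRequest`, the `BuiltRequest` shape, the placeholder model ID, and the placeholder token are all hypothetical and are not taken from the gateway code; only the base URL and the `/v1/chat/completions` path come from the commit message above. Bearer-token authorization is assumed, as is usual for OpenAI-compatible endpoints.

```typescript
// Illustrative only: hypothetical helper, not the gateway's implementation.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface BuiltRequest {
  url: string;
  method: 'POST';
  headers: Record<string, string>;
  body: string;
}

function buildRouterChatRequest(
  model: string,
  messages: ChatMessage[],
  apiKey: string
): BuiltRequest {
  return {
    // The URL is the same for every model; no '/models/{model}' segment.
    url: 'https://router.huggingface.co/v1/chat/completions',
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiKey}`, // assumed auth scheme
      'Content-Type': 'application/json',
    },
    // The model travels in the JSON body, OpenAI-style.
    body: JSON.stringify({ model, messages }),
  };
}

// Placeholder model ID and token, for illustration only.
const request = buildRouterChatRequest(
  'example-org/example-model',
  [{ role: 'user', content: 'Hello' }],
  'hf_placeholder_token'
);
```

The resulting object could be passed to `fetch(request.url, request)`; the sketch stops short of sending anything so that it runs without network access.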
Closes #1626
What
Update the default Hugging Face base URL from https://api-inference.huggingface.co (the legacy HF Serverless Inference API) to https://router.huggingface.co (the new HF Inference Providers OpenAI-compatible router), and drop the per-model path from the default endpoint construction since the router takes the model in the request body.
Why
The legacy serverless API endpoint with the /models/{model}/v1/chat/completions path has been superseded by HF's unified Inference Providers router (per https://huggingface.co/docs/inference-providers). Issue #1626 reports the documented Hugging Face setup no longer working through Portkey, and points at the outdated default URL.
Notes
huggingfaceBaseUrl override path already used a bare /v1/chat/completions endpoint (no /models/{model}), so dedicated-endpoint users are unaffected.
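The before/after URL construction described in this PR can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual diff: the function names `legacyChatUrl` and `routerChatUrl`, the `HuggingFaceOptions` type, and the example model ID and endpoint host are hypothetical. Only the two base URLs, the path segments, and the `huggingfaceBaseUrl` option name come from the description above.

```typescript
// Illustrative only: hypothetical helpers, not the gateway's implementation.
const LEGACY_BASE_URL = 'https://api-inference.huggingface.co';
const ROUTER_BASE_URL = 'https://router.huggingface.co';
const CHAT_COMPLETIONS_PATH = '/v1/chat/completions';

interface HuggingFaceOptions {
  // Custom base URL, e.g. for a dedicated endpoint.
  huggingfaceBaseUrl?: string;
}

// Old behavior: the per-model path is added only when no override is set.
function legacyChatUrl(model: string, opts: HuggingFaceOptions = {}): string {
  if (opts.huggingfaceBaseUrl) {
    return `${opts.huggingfaceBaseUrl}${CHAT_COMPLETIONS_PATH}`;
  }
  return `${LEGACY_BASE_URL}/models/${model}${CHAT_COMPLETIONS_PATH}`;
}

// New behavior: one code path; the model is no longer part of the URL.
function routerChatUrl(opts: HuggingFaceOptions = {}): string {
  const base = opts.huggingfaceBaseUrl ?? ROUTER_BASE_URL;
  return `${base}${CHAT_COMPLETIONS_PATH}`;
}

// Placeholder values for illustration.
const exampleModel = 'example-org/example-model';
const dedicated: HuggingFaceOptions = {
  huggingfaceBaseUrl: 'https://my-endpoint.example.com',
};

const defaultBefore = legacyChatUrl(exampleModel);
const defaultAfter = routerChatUrl();
const overrideBefore = legacyChatUrl(exampleModel, dedicated);
const overrideAfter = routerChatUrl(dedicated);
```

In this sketch only the default URL changes between the two versions; `overrideBefore` and `overrideAfter` are identical, which mirrors the note that dedicated-endpoint users are unaffected.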