Preserve pi-acp model metadata through LiteLLM proxy #803
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -170,6 +170,48 @@ async def fake_start(**kwargs): | |
| ) | ||
|
|
||
|
|
||
| @pytest.mark.asyncio | ||
| async def test_pi_acp_proxy_preserves_provider_model_metadata(monkeypatch): | ||
| """Guards PR #803: Pi metadata follows the LiteLLM alias in proxy mode.""" | ||
|
|
||
| async def fake_start(**kwargs): | ||
| return FakeLiteLLMServer("http://172.17.0.1:45678", kwargs["route"]) | ||
|
|
||
| monkeypatch.setattr(runtime_mod, "_start_host_litellm", fake_start) | ||
| provider_models = [ | ||
| { | ||
| "id": "Qwen/Qwen3-4B", | ||
| "name": "Qwen/Qwen3-4B", | ||
| "reasoning": False, | ||
| "input": ["text"], | ||
| "contextWindow": 16384, | ||
| "maxTokens": 1024, | ||
| } | ||
| ] | ||
|
|
||
| updated, provider_runtime = await ensure_litellm_runtime( | ||
| agent="pi-acp", | ||
| agent_env={ | ||
| "BENCHFLOW_PROVIDER_BASE_URL": "http://172.17.0.1:8000/v1", | ||
| "BENCHFLOW_PROVIDER_API_KEY": "dummy", | ||
| "BENCHFLOW_PROVIDER_MODELS": json.dumps(provider_models), | ||
| }, | ||
| model="vllm/Qwen/Qwen3-4B", | ||
| runtime=None, | ||
| environment="docker", | ||
| session_id="run-1", | ||
| usage_tracking="required", | ||
| ) | ||
|
|
||
| assert provider_runtime is not None | ||
| assert updated["BENCHFLOW_PROVIDER_MODEL"] == "benchflow-vllm-Qwen-Qwen3-4B" | ||
| models = json.loads(updated["BENCHFLOW_PROVIDER_MODELS"]) | ||
| alias = next(m for m in models if m["id"] == "benchflow-vllm-Qwen-Qwen3-4B") | ||
| assert alias["name"] == "benchflow-vllm-Qwen-Qwen3-4B" | ||
| assert alias["maxTokens"] == 1024 | ||
| assert alias["contextWindow"] == 16384 | ||
|
Comment on lines +206 to +212
Contributor
The test confirms … Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
||
|
|
||
|
|
||
| @pytest.mark.asyncio | ||
| async def test_runtime_reuse_and_stop(monkeypatch): | ||
| created = [] | ||
|
|
||
If `_provider_models_for_proxy_alias` returns `None` (e.g., `BENCHFLOW_PROVIDER_MODELS` is set but none of its entries has an `id` matching any variant in `wanted`), `updated["BENCHFLOW_PROVIDER_MODELS"]` is left unchanged, so it keeps the original entry list, which does not contain the alias ID. Pi-acp would then receive a model list without the proxied alias, causing the same lookup failure this PR aims to fix, with no log to indicate the miss. Adding a debug/warning log on the `None` return paths in `_provider_models_for_proxy_alias` would make this much easier to diagnose in production.
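
A minimal sketch of what logging on the `None` return paths could look like. The real `_provider_models_for_proxy_alias` is not shown in this diff, so the function below is a hypothetical stand-in: its `(raw_models, wanted, alias)` signature, the logger name, and the way the alias entry is built are all assumptions for illustration, not the project's actual code.

```python
import json
import logging

# Hypothetical logger name; the project may use a different one.
logger = logging.getLogger("benchflow.litellm_runtime")


def provider_models_for_proxy_alias(raw_models, wanted, alias):
    """Hypothetical stand-in for _provider_models_for_proxy_alias.

    Returns a JSON string with an alias entry appended, or None on a miss.
    Every None path logs why the alias was not added.
    """
    if not raw_models:
        logger.debug("BENCHFLOW_PROVIDER_MODELS is unset; no alias added for %s", alias)
        return None

    try:
        models = json.loads(raw_models)
    except json.JSONDecodeError:
        logger.warning("BENCHFLOW_PROVIDER_MODELS is not valid JSON; alias %s not added", alias)
        return None

    if not isinstance(models, list):
        logger.warning("BENCHFLOW_PROVIDER_MODELS is not a JSON list; alias %s not added", alias)
        return None

    # Find the first entry whose id matches one of the wanted model-name variants.
    match = next(
        (m for m in models if isinstance(m, dict) and m.get("id") in wanted),
        None,
    )
    if match is None:
        logger.warning(
            "No BENCHFLOW_PROVIDER_MODELS entry matches %s; alias %s not added",
            sorted(wanted),
            alias,
        )
        return None

    # Copy the matched metadata (contextWindow, maxTokens, ...) under the alias id.
    alias_entry = {**match, "id": alias, "name": alias}
    return json.dumps(models + [alias_entry])
```

With this shape, a miss such as `provider_models_for_proxy_alias(raw, {"other-model"}, "benchflow-vllm-Qwen-Qwen3-4B")` still returns `None`, but the warning names both the variants that were tried and the alias that was skipped. A companion test could use pytest's `caplog` fixture to assert that the warning is emitted.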