test(agents): contract tests for _parse_native_tool_call shapes#6035
test(agents): contract tests for _parse_native_tool_call shapes#6035alvinttang wants to merge 1 commit into
Conversation
Locks in the wire-format contract for every native tool-call shape CrewAgentExecutor parses so provider-specific argument bugs (crewAIInc#4972 — Bedrock Converse 'input' silently dropped) can't quietly regress. Coverage: - OpenAI-style object (function.arguments JSON string) - Gemini-style function_call object (Struct mapping coerced to dict) - Anthropic-style object (name + input on the call object itself) - Dict variants: OpenAI nested function, Bedrock toolUseId+input, mixed payloads with empty function + populated input - Negative cases: empty input dict stays empty, fully missing args returns empty dict, unknown shape returns None Each assertion checks the parsed (call_id, func_name, func_args) tuple before pydantic validation or tool dispatch runs, so a future truthy-default regression (the original crewAIInc#4972 cause) fails fast at parse time with a clear assertion message. Refs crewAIInc#4972
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughThis PR adds a comprehensive contract test module for ChangesNative Tool Call Parsing Contract Tests
Estimated Code Review Effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Refs #4972.
@andrewclayman flagged on the issue that this kind of provider-specific tool-call bug really wants a small contract test per shape. The original fix landed in 25fcf39, but there's no unit-level coverage that would have caught the truthy-default regression at parse time. This PR adds that net.
What this locks in
Every native tool-call shape
_parse_native_tool_callhandles, asserted on the parsed tuple before pydantic validation or tool dispatch runs:function.argumentsJSON string)function_callobject (Struct mapping coerced to dict)name+inputon the call object itself)toolUseId+input, mixed payloads with empty function + populated inputinputdict stays empty, fully missing args returns empty dict, unknown shape returnsNoneRED (simulated regression)
Reintroduced the original truthy-default bug locally —
func_info.get("arguments", "{}")instead offunc_info.get("arguments")— and reran the suite:The five failures point straight at the dict-branch fall-through.
GREEN (current main)
Pure test-only PR. No production code changed.
Summary by CodeRabbit