Python: Fix duplicate item error for tool-using agents under service-side storage (#3295) by moonbox3 · Pull Request #5690 · microsoft/agent-framework

moonbox3 · 2026-05-07T10:00:11Z

Motivation and Context

Fixes #3295. When an agent uses tools and any form of service-side storage is in play (a session, a RedisHistoryProvider, a CosmosHistoryProvider, or a workflow that replays prior conversation), the second turn fails against the OpenAI Responses API with:

Error code: 400 - "Duplicate item found with id fc_xxx. Remove duplicate items from your input and try again."

Three field reports converge on the same root cause:

Multi-agent workflow with tools, full conversation replayed across executors (the original report).
Single agent with RedisHistoryProvider and tools, two turns (pamelafox).
Single agent with CosmosHistoryProvider and tools on Foundry, two turns (CristinaStn, still reproducing on 1.1.0).

In every case the prior turn's function-call item, persisted with its server-issued fc_* id in additional_properties, gets replayed inline in the next request while previous_response_id (or conversation) also points the server at the same prior response. The server sees the same fc_* id twice and rejects the request. The same shape reproduces with rs_* (reasoning), approval, and local-shell-call IDs.

The current workaround is to drop tools= from any agent that uses persistent history, which defeats the point.

Description

The fix lives at one chokepoint: the OpenAI Responses chat client's outbound message preparation, gated on the existing request_uses_service_side_storage flag. All three reported repro paths flow through this function, so a single change closes them all.

Behavior contract. When the outgoing request carries previous_response_id, conversation_id, or conversation:

function_call, reasoning, approval-request/response, and local-shell-call items are dropped from the wire input array. The server already has them from the prior response.
function_result items are kept untouched. They reference the prior call by call_id, which the server resolves through previous_response_id. This is the load-bearing wire-format contract.

When the outgoing request carries no storage marker, behavior is byte-identical to today.

This mirrors the existing reasoning-item treatment that was already in place at the same chokepoint, generalizing it to all server-issued item identities.

Defense in depth at the workflow layer. A second commit clears service_session_id in AgentExecutor.run() when the incoming AgentExecutorRequest carries function-call/result items. The wire-level "Duplicate item" error is already closed by the chat-client strip; this commit makes the executor wiring reflect intent (the executor is replaying a conversation it does not own, so it should not present continuation context from a prior response at all). The from_response handoff path is unchanged and continues to preserve service_session_id for legitimate continuation.

Live-API validation. Before merging, the wire-format contract was validated against live OpenAI and Foundry endpoints across four repro shapes:

Path	OpenAI	Foundry
Baseline (no storage)	OK	OK
Session continuation with tools	OK	OK
`RedisHistoryProvider` with tools, two turns	OK	OK
Direct API: `previous_response_id` + only `function_call_output` (no inline `function_call`)	OK	n/a

The third row is the load-bearing probe: it confirms the OpenAI Responses API does pair function_call_output.call_id against the prior response's stored function_call when previous_response_id is set, with no inline function_call required. CosmosHistoryProvider was not run separately because it shares the chat-client chokepoint with Redis; if Redis passes, Cosmos passes by the same code path.

No public API change. No persistence-format change. No new configuration knobs.

Contribution Checklist

The code builds clean without any errors or warnings
The PR follows the Contribution Guidelines
All unit tests pass, and I have added new tests where possible
Is this a breaking change? No. Wire format change is subtractive and only on requests that would otherwise 400.

Copilot

Pull request overview

This PR targets a 400 “Duplicate item found with id fc_* …” error encountered when tool-using agents run with service-side storage (e.g., previous_response_id / conversation_id), by changing how the OpenAI Responses client prepares outbound input items during continuation requests and adding a workflow-layer safeguard.

Changes:

Update the OpenAI Responses chat client message preparation to (conditionally) strip server-issued response items from the outgoing input when service-side storage is in use.
Add a defense-in-depth change in AgentExecutor.run() to clear service_session_id when a workflow replays tool history via AgentExecutorRequest.
Update/add unit tests across OpenAI/Foundry clients and workflow tests; add an issue-tracking markdown writeup.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
python/packages/openai/agent_framework_openai/_chat_client.py	Adds a request-level storage flag to message preparation and strips server-issued items (reasoning/function_call/approval/local-shell-call) on continuation requests.
python/packages/openai/tests/openai/test_openai_chat_client.py	Updates existing tests for the new signature/behavior and adds regression tests for stripping behavior under storage.
python/packages/foundry/tests/foundry/test_foundry_chat_client.py	Updates Foundry test usage to pass the new `request_uses_service_side_storage` flag.
python/packages/core/agent_framework/_workflows/_agent_executor.py	Clears `service_session_id` when a run request includes tool items (full-history replay scenario).
python/packages/core/tests/workflow/test_full_conversation.py	Removes the previous xfail and validates `service_session_id` is cleared on full-history replay.
issues/done/3295-02-strip-rule-fix.md	Adds a tracking doc describing the intended fix/contract and acceptance criteria.

github-actions

Automated Code Review

Reviewers: 4 | Confidence: 91%

✓ Correctness

The PR correctly fixes the duplicate item error by stripping server-issued response items (function_call, reasoning, approval, local-shell-call) from the wire input when service-side storage is active. The logic is sound: strip_server_items is correctly derived from uses_service_side_storage, the function_result/function_call_output is correctly preserved (keeping call_id for server-side pairing), and the executor defense-in-depth only fires for the coordinator-replay path (AgentExecutorRequest with tool items). The from_response path is properly left unchanged. Reasoning items becoming unconditionally dropped is intentional and correct—the old code that kept them under storage+function_call was actually vulnerable to the same rs_* duplicate issue. No correctness bugs found.

✓ Security Reliability

The PR implements a well-structured fix for duplicate item errors when using service-side storage with tool-calling agents. The primary fix is at the chat-client message preparation chokepoint, gated on the existing request_uses_service_side_storage flag. A defense-in-depth change clears service_session_id in the executor when replayed tool items are detected. No security vulnerabilities, resource leaks, or unhandled failure modes were identified. The Content.additional_properties is always a dict (never None per _types.py:528-530), Message.contents is always a list (per _types.py:1744-1747), and session state mutation is safe in Python's async model. The default parameter value of request_uses_service_side_storage=True is fail-safe (strips potential duplicates rather than causing 400 errors).

✓ Test Coverage

The PR has strong test coverage. New regression tests directly validate the core fix (strip function_call/reasoning/approval/shell-call items under service-side storage, keep them without). All four affected content types have dedicated storage-on/off test cases. The executor defense-in-depth clearing is validated via an integration test (previously xfail, now passing). Existing tests were properly updated to explicitly pass the new flag. No significant test coverage gaps found.

✗ Design Approach

The PR’s new “strip all reasoning under service-side storage” rule conflicts with existing, still-authoritative OpenAI chat-client tests that require reasoning items to be preserved on continuation turns, except for attributed replay history. I did not find other design-approach issues with comparable evidence.

Suggestions

Keep the new strip rule for duplicated server-issued items (function_call, approval, local-shell-call results) but preserve the existing reasoning contract: continuation requests should still carry live reasoning items, while only attributed-replay reasoning is omitted.

Automated review by moonbox3's agents

moonbox3 · 2026-05-07T10:27:56Z

Python Test Coverage Report •

File	Stmts	Miss	Cover	Missing
packages/openai/agent_framework_openai
_chat_client.py	1045	139	86%	276, 289, 614–617, 621–622, 628–629, 664–670, 677–679, 700, 708, 731, 849, 948, 1007, 1009, 1011, 1013, 1079, 1093, 1173, 1183, 1188, 1231, 1342–1343, 1358, 1566, 1571, 1575–1577, 1581–1582, 1665, 1675, 1702, 1708, 1718, 1724, 1729, 1735, 1740–1741, 1760, 1763–1766, 1780, 1782, 1790–1791, 1803, 1845, 1935, 1957–1958, 1973–1974, 1992–1993, 2036, 2202, 2240–2241, 2257, 2259, 2339–2347, 2377, 2487, 2522, 2537, 2557–2567, 2580, 2591–2595, 2609, 2623–2634, 2643, 2675–2678, 2688–2689, 2700–2702, 2716–2718, 2728–2729, 2735, 2750
TOTAL	33819	3906	88%

Python Unit Test Overview

Tests	Skipped	Failures	Errors	Time
6662	30 💤	0 ❌	0 🔥	1m 42s ⏱️

…t#3295) Fixes microsoft#3295. When the OpenAI Responses chat client sends a request that carries previous_response_id / conversation_id / conversation, the server already has the prior turn's response items and rejects duplicates with "Duplicate item found with id fc_xxx". The chat client was re-sending them inline whenever the input messages still carried the items in additional_properties (workflow replay, history providers, etc.), which broke any tool-using agent with persistent history. Decisions: - Single chokepoint: _prepare_message_for_openai. When the resulting request uses service-side storage, drop function_call, reasoning, approval-request/response, and local-shell-call items from the wire input. Keep function_result with its call_id; the server pairs it to the prior function_call via that key. - function_result is preserved unconditionally except for the local-shell variant, which carries its own server-issued item id. - No public API change. Wire format change is subtractive and only on requests that would otherwise 400. - Re-pointed the strict-xfail in test_full_conversation.py from microsoft#4047 to microsoft#3295. Kept xfail because the test asserts executor-level session-id clearing, which is the defense-in-depth half tracked by 3295-03; this slice closes the wire-level half. Files: - python/packages/openai/agent_framework_openai/_chat_client.py: strip rule applied alongside the existing reasoning-item branch. - python/packages/openai/tests/openai/test_openai_chat_client.py: four new tests pin the contract (function_call, approval, local-shell-call stripped under storage; everything kept without storage). Updated pre-existing tests that exercised the storage-on path to either pass request_uses_service_side_storage=False explicitly or assert the new strip behavior. - python/packages/foundry/tests/foundry/test_foundry_chat_client.py: same explicit storage-off opt-in for the inherited test. - python/packages/core/tests/workflow/test_full_conversation.py: re-pointed xfail reason to microsoft#3295 and the executor-level follow-up. Notes for next iteration: - 3295-01 (HITL wire-format validation against live OpenAI/Foundry) was not run; it requires the user's API credentials. The PRD design is locked but the empirical confirmation is still pending. If script 3 fails on either provider, this slice may need to be revisited. - 3295-03 (clear service_session_id in AgentExecutor on full-history replay) remains open. After it lands the xfail in test_full_conversation.py can be removed. - pytest was not run in this iteration because uv-based pytest commands required interactive approval. Validation rests on careful reading; next iteration should run the openai + core test suites.

Copilot AI review requested due to automatic review settings May 7, 2026 10:00

moonbox3 added documentation Improvements or additions to documentation python labels May 7, 2026

Copilot started reviewing on behalf of moonbox3 May 7, 2026 10:01 View session

moonbox3 force-pushed the 3295-fix branch from f6601b3 to 509de0e Compare May 7, 2026 10:04

moonbox3 commented May 7, 2026

View reviewed changes

Comment thread python/packages/openai/agent_framework_openai/_chat_client.py Outdated

Copilot AI reviewed May 7, 2026

View reviewed changes

Comment thread python/packages/openai/agent_framework_openai/_chat_client.py Outdated

Comment thread python/packages/openai/tests/openai/test_openai_chat_client.py

github-actions Bot reviewed May 7, 2026

View reviewed changes

moonbox3 force-pushed the 3295-fix branch from 509de0e to 5b08fc2 Compare May 7, 2026 10:25

lokitoth reviewed May 7, 2026

View reviewed changes

Comment thread python/packages/core/agent_framework/_workflows/_agent_executor.py Outdated

lokitoth approved these changes May 7, 2026

View reviewed changes

TaoChenOSU reviewed May 7, 2026

View reviewed changes

Comment thread python/packages/core/agent_framework/_workflows/_agent_executor.py Outdated

moonbox3 force-pushed the 3295-fix branch 3 times, most recently from e7e73e4 to 11dce09 Compare May 7, 2026 22:35

moonbox3 force-pushed the 3295-fix branch from 11dce09 to dee29da Compare May 7, 2026 22:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Python: Fix duplicate item error for tool-using agents under service-side storage (#3295)#5690

Python: Fix duplicate item error for tool-using agents under service-side storage (#3295)#5690
moonbox3 wants to merge 1 commit intomicrosoft:mainfrom
moonbox3:3295-fix

moonbox3 commented May 7, 2026

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

github-actions Bot left a comment

Uh oh!

moonbox3 commented May 7, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

moonbox3 commented May 7, 2026

Motivation and Context

Description

Contribution Checklist

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Automated Code Review

✓ Correctness

✓ Security Reliability

✓ Test Coverage

✗ Design Approach

Suggestions

Uh oh!

moonbox3 commented May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Python Unit Test Overview

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

moonbox3 commented May 7, 2026 •

edited

Loading