feat(context): preserve trailing messages when compressing context so the user's latest instructions are not lost #8423
Hey - I've left some high level feedback:
- Consider making `PRESERVE_TAIL_CHARS` configurable (e.g., via a constructor parameter or settings object) rather than a fixed module-level constant so different deployments can tune the trade-off between context detail and token usage.
- In `extract_text_from_messages`, only `TextPart` elements are considered from list content; if other part types (e.g., code or other structured text parts) exist or are added later, you may want to handle them explicitly or document why they are intentionally omitted from the preserved tail.
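One way the first suggestion could look is sketched below. `TailSettings` and `take_tail` are hypothetical names used only for illustration; the PR itself defines just the module-level constant, and how `LLMSummaryCompressor` would receive such a setting is an assumption.

```python
from dataclasses import dataclass

# Mirrors the 10K default stated in the PR description.
DEFAULT_PRESERVE_TAIL_CHARS = 10_000


@dataclass
class TailSettings:
    """Hypothetical settings object; not part of the PR."""

    preserve_tail_chars: int = DEFAULT_PRESERVE_TAIL_CHARS

    def __post_init__(self) -> None:
        if self.preserve_tail_chars < 0:
            raise ValueError("preserve_tail_chars must be non-negative")


def take_tail(text: str, settings: TailSettings) -> str:
    """Return at most `settings.preserve_tail_chars` characters from the end of `text`."""
    limit = settings.preserve_tail_chars
    if limit == 0:
        # text[-0:] is the same as text[0:], i.e. the whole string,
        # so a limit of zero has to be handled explicitly.
        return ""
    return text[-limit:]
```

A deployment that wants a smaller budget could then pass `TailSettings(preserve_tail_chars=4_000)` instead of editing the module.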
Code Review
This pull request introduces a mechanism to preserve the tail of the original conversation text (up to 10,000 characters) when compressing context, allowing the LLM to retain both a high-level summary and recent conversation details. Feedback on these changes suggests enhancing the text extraction logic to include and format tool_calls (and importing ToolCall accordingly) to prevent losing critical tool-execution context when message content is null. Additionally, it is recommended to prepend an ellipsis to the tail text when it is truncated to signal to the LLM that the conversation history has been shortened.
    def extract_text_from_messages(messages: list[Message]) -> str:
        """Extract text content from a list of messages into a single string.

        Each message is formatted as "[role]: content" for readability.

        Args:
            messages: The messages to extract text from.

        Returns:
            A concatenated string of all text content.
        """
        parts: list[str] = []
        for msg in messages:
            if msg.content is None:
                continue
            if isinstance(msg.content, str):
                parts.append(f"[{msg.role}]: {msg.content}")
            elif isinstance(msg.content, list):
                text_segments = [
                    part.text for part in msg.content if isinstance(part, TextPart)
                ]
                if text_segments:
                    parts.append(f"[{msg.role}]: {''.join(text_segments)}")
        return "\n".join(parts)
When compressing messages, assistant messages often contain tool_calls instead of or in addition to standard text content (with content being None or empty). If we skip messages where content is None, we completely lose the context of what tools the assistant called, which makes the subsequent tool response messages in the history highly confusing to the LLM.
We should update extract_text_from_messages to also extract and format tool_calls when they are present.
    def extract_text_from_messages(messages: list[Message]) -> str:
        """Extract text content from a list of messages into a single string.

        Each message is formatted as "[role]: content" for readability.

        Args:
            messages: The messages to extract text from.

        Returns:
            A concatenated string of all text content.
        """
        parts: list[str] = []
        for msg in messages:
            msg_text = ""
            if isinstance(msg.content, str):
                msg_text = msg.content
            elif isinstance(msg.content, list):
                text_segments = [
                    part.text for part in msg.content if isinstance(part, TextPart)
                ]
                if text_segments:
                    msg_text = "".join(text_segments)
            if msg.tool_calls:
                tool_calls_desc = []
                for tool_call in msg.tool_calls:
                    if isinstance(tool_call, ToolCall):
                        name = tool_call.function.name
                        args = tool_call.function.arguments
                    elif isinstance(tool_call, dict):
                        func = tool_call.get("function", {})
                        name = func.get("name", "")
                        args = func.get("arguments", "")
                    else:
                        continue
                    tool_calls_desc.append(f"call tool: {name}({args})")
                if tool_calls_desc:
                    extra = "; ".join(tool_calls_desc)
                    msg_text = f"{msg_text} [{extra}]" if msg_text else f"[{extra}]"
            if msg_text:
                parts.append(f"[{msg.role}]: {msg_text}")
        return "\n".join(parts)

     from typing import TYPE_CHECKING, Protocol, runtime_checkable

    -from ..message import Message
    +from ..message import Message, TextPart
    tail_text = extract_text_from_messages(messages_to_summarize)
    if len(tail_text) > PRESERVE_TAIL_CHARS:
        tail_text = tail_text[-PRESERVE_TAIL_CHARS:]
When the tail text exceeds PRESERVE_TAIL_CHARS and is truncated, prepending an ellipsis (...) helps indicate to the LLM that the text has been truncated from a longer conversation history.
    -tail_text = extract_text_from_messages(messages_to_summarize)
    -if len(tail_text) > PRESERVE_TAIL_CHARS:
    -    tail_text = tail_text[-PRESERVE_TAIL_CHARS:]
    +tail_text = extract_text_from_messages(messages_to_summarize)
    +if len(tail_text) > PRESERVE_TAIL_CHARS:
    +    tail_text = "..." + tail_text[-PRESERVE_TAIL_CHARS:]
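The same idea can be written as a small standalone helper, which makes the edge cases easier to test. This is only a sketch: `truncate_tail` and its `marker` parameter are hypothetical names, not part of the PR. Note that the marker is added on top of the limit, so a truncated result can be up to `limit + len(marker)` characters long.

```python
PRESERVE_TAIL_CHARS = 10_000  # value stated in the PR description


def truncate_tail(
    text: str, limit: int = PRESERVE_TAIL_CHARS, marker: str = "..."
) -> str:
    """Keep the last `limit` characters, prefixing `marker` only when text was cut."""
    if limit <= 0:
        # Nothing may be kept; signal the cut if there was any text at all.
        return marker if text else ""
    if len(text) <= limit:
        return text  # short enough, returned unchanged with no marker
    return marker + text[-limit:]
```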
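Motivation
When compressing context, `LLMSummaryCompressor` uses only the LLM-generated summary as the compressed message. The summary is highly condensed and drops details such as specific code snippets, error messages, and configuration parameters, so the model "forgets" them in later turns. This PR keeps the summary and additionally preserves the last 10K characters of raw conversation text from the tail of the compressed messages, so the model can still refer to the most recent concrete context from before compression.
Note: codex preserves 20K by default; the 10K here is a conservative estimate.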
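Modifications
- Added the `PRESERVE_TAIL_CHARS` constant (10000), which controls how many tail characters are preserved.
- Added the `extract_text_from_messages()` helper function, which extracts text content from a list of messages.
- Modified `LLMSummaryCompressor.__call__()` to append the raw tail text to the compressed message after the summary is generated.

This is NOT a breaking change.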
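As a rough illustration of the last modification, the sketch below shows how a summary and a preserved tail might be combined into one compressed message body. The function name `build_compressed_content` and the section label are assumptions made for this example; the PR description does not show the exact format that `LLMSummaryCompressor.__call__()` produces.

```python
PRESERVE_TAIL_CHARS = 10_000  # value stated in the PR description


def build_compressed_content(
    summary: str, tail_text: str, limit: int = PRESERVE_TAIL_CHARS
) -> str:
    """Combine an LLM summary with the raw tail of the summarized conversation.

    Hypothetical helper: the label below is illustrative, not the PR's format.
    """
    if limit <= 0:
        tail_text = ""  # avoid text[-0:], which would keep the whole string
    elif len(tail_text) > limit:
        tail_text = tail_text[-limit:]
    if not tail_text:
        return summary  # nothing to preserve, fall back to summary only
    return f"{summary}\n\n[Recent conversation before compression]\n{tail_text}"
```

Keeping the summary first and the raw tail last means the most recent concrete details sit closest to the next user turn.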
Screenshots or Test Results
All existing tests pass; no new dependencies were added.
Checklist
Summary by Sourcery
Preserve recent raw conversation details alongside LLM-generated summaries during context compression to reduce information loss.