Fix GQA out-of-bounds read in past KV buffer (CPU) by tianleiwu · Pull Request #29447 · microsoft/onnxruntime

tianleiwu · 2026-06-30T20:46:16Z

Description

Fixes an out-of-bounds (OOB) read in the CPU GroupQueryAttention (GQA) operator. During token generation (decode), ConcatStateChunkGQA copies seqlens_k + 1 - sequence_length rows out of the past key/value buffers. The existing validation only bounded the present buffer write (seqlens_k < present_kv_seqlen), but the present buffer can be larger than the past buffer when total_sequence_length exceeds the past sequence length. A large seqlens_k combined with a small past buffer therefore read past the end of the past key/value tensors.

Motivation and Context

present_kv_seqlen = max(total_sequence_length, past_sequence_length), so the pre-existing seqlens_k < present_kv_seqlen check does not bound the past-side read when total_sequence_length > past_sequence_length. With a crafted seqlens_k, the decode path in ConcatStateChunkGQA reads seqlens_k + 1 - sequence_length rows from the smaller past buffer, causing an OOB read.

Key Changes

File	Change
`onnxruntime/contrib_ops/cpu/bert/group_query_attention.cc`	Add a per-batch bound in the `seqlens_k` validation block: for the decode case (`past_kv_seqlen > 0 && kv_sequence_length != 0 && !is_first_prompt`), reject when `seqlens_k + 1 - sequence_length > past_kv_seqlen`.
`onnxruntime/test/contrib_ops/group_query_attention_op_test.cc`	Add regression test `SeqlensKExceedsPastBuffer_OOBRead` exercising a large `seqlens_k` against a small past buffer.

Shared KV (kv_sequence_length == 0) appends no new KV and its past read is already bounded by the present-buffer check together with the total_sequence_length <= seqlen_past_kv_cache enforcement in the apply-attention paths, so it needs no additional check.

Testing Notes

Build and run the GQA tests:

cmake --build build/Linux/Debug --target onnxruntime_provider_test -j$(nproc)
./build/Linux/Debug/onnxruntime_provider_test --gtest_filter="*GroupQueryAttention*"

All 41 CPU GQA tests pass, including the new SeqlensKExceedsPastBuffer_OOBRead regression and the existing shared-KV CPU cases.
lintrunner reports no issues on the changed files.

Copilot

Pull request overview

This PR addresses a security/correctness issue in the CPU GroupQueryAttention contrib op where seqlens_k validation did not fully bound reads from the past KV cache during non-prompt execution, potentially leading to an out-of-bounds read. It adds an additional bounds check for the number of past rows copied and introduces a regression test covering the reported scenario.

Changes:

Add a decode-path validation to ensure (seqlens_k + 1 - sequence_length) does not exceed the past KV buffer sequence length.
Add a regression test (SeqlensKExceedsPastBuffer_OOBRead) to exercise the large-seqlens_k / small-past-buffer case.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
`onnxruntime/contrib_ops/cpu/bert/group_query_attention.cc`	Adds past-buffer bounds validation for decode to prevent past KV OOB reads.
`onnxruntime/test/contrib_ops/group_query_attention_op_test.cc`	Adds a regression test to ensure invalid `seqlens_k` is rejected when it would over-read past KV.

+      // Shared KV (kv_sequence_length == 0) appends no new KV and its past read is already
+      // bounded by the present-buffer check together with the total_sequence_length <=
+      // seqlen_past_kv_cache enforcement in the apply-attention paths, so it needs no check here.
+      if (past_kv_seqlen > 0 && parameters.kv_sequence_length != 0 && !parameters.is_first_prompt) {


tianleiwu added 2 commits June 30, 2026 19:45

Fix GQA oob read in past buffer

ae4bd77

refine check

b0e7209

tianleiwu requested a review from Copilot June 30, 2026 23:01

Copilot started reviewing on behalf of tianleiwu June 30, 2026 23:02 View session

Copilot AI reviewed Jun 30, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix GQA out-of-bounds read in past KV buffer (CPU)#29447

Fix GQA out-of-bounds read in past KV buffer (CPU)#29447
tianleiwu wants to merge 2 commits into
microsoft:mainfrom
tianleiwu:tlwu/20260630/icm_gqa_oob

tianleiwu commented Jun 30, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

tianleiwu commented Jun 30, 2026

Description

Motivation and Context

Key Changes

Testing Notes

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants