Skip to content

Fix rotary embedding oob issue#29014

Open
apsonawane wants to merge 8 commits into
mainfrom
asonawane/rotary
Open

Fix rotary embedding oob issue#29014
apsonawane wants to merge 8 commits into
mainfrom
asonawane/rotary

Conversation

@apsonawane

Copy link
Copy Markdown
Contributor

This pull request improves the validation logic for the RotaryEmbedding operator to prevent out-of-bounds reads when the rotary embedding dimension derived from cos_cache exceeds the input tensor's hidden_size. It also adds dedicated unit tests to verify that this validation triggers as expected.

Validation improvements:

  • Added a check in rotary_embedding_helper.h to ensure that the effective rotary embedding dimension (cos_cache width × 2) does not exceed hidden_size when rotary_embedding_dim is 0, returning an error if the condition is violated.

Unit test additions:

  • Added ContribRotaryEmbedding_RejectsCosCacheExceedsHiddenSize test in rotary_embedding_op_test.cc to verify that an invalid configuration is correctly rejected in the contrib op.
  • Added RotaryEmbedding_RejectsCosCacheExceedsHiddenSize test in rotary_embedding_op_test.cc (providers/cpu/llm) to verify the same validation in the mainline op.

@apsonawane apsonawane enabled auto-merge (squash) June 11, 2026 19:37

@tianleiwu tianleiwu left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for hardening RotaryEmbedding validation. The contrib-op hidden_size guard (rank-3 / num_heads==0 path) is the meaningful fix and looks correct. A few non-blocking points:

  1. Mainline op (core/providers/cpu/llm/rotary_embedding_helper.h) additions look redundant. In both branches the new guard checks cache_width*2 > head_size immediately before the existing exact-equality check cos_cache_dims[2]/[1] != (rotary>0 ? rotary : head_size)/2. Since head_size is even here, any input failing the new guard already fails the equality check, so the mainline op was already rejecting these inputs — the new blocks only change which error message fires first. If this is defense-in-depth / clearer messaging that's fine, but it does not change behavior, and the two copy-pasted blocks could be a small shared helper.

  2. Contrib width-check rewrite is a behavior tightening. The new expected_cache_width = rotary_embedding_dim > 0 ? rotary/2 : head_size/2 rejects the case where rotary_embedding_dim > 0 and cos_cache width equals head_size/2, which the old condition accepted. This aligns contrib with mainline (good) but could reject a previously-valid model — please confirm it's intentional.

  3. Test assertions loosened. The two updated existing tests now match only "cos_cache dimension", a broad substring that also overlaps the unchanged equality-check message. Consider a more specific fragment (e.g. "exceeds head_size") so a future regression that swaps the failing validation path is still caught. New tests look good.

No correctness regressions found.

Comment thread onnxruntime/contrib_ops/cpu/bert/rotary_embedding_helper.h Outdated
Comment thread onnxruntime/core/providers/cpu/llm/rotary_embedding_helper.h Outdated

@tianleiwu tianleiwu left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-reviewed at head 89def7b. My two earlier threads (redundant mainline guards; contrib width-check tightening) are addressed — the redundant guards were removed, the contrib equality check reverted to the original, and the two existing tests now assert the specific exceeds head_size fragment. Thanks.

Two points remain on the current head:

  1. Test coverage gap (inline). The new *_RejectsCosCacheExceedsHiddenSize tests set num_heads=1, so the inferred head_size equals hidden_size (64) and effective_rotary_dim=128 > head_size — i.e. they fire the first guard (exceeds head_size, which the assertion confirms), not the new hidden_size guard. The genuinely new effective_rotary_dim > hidden_size guard (the actual OOB fix this PR targets) only fires when head_size is inferred from the cache (head_size==0 on entry, e.g. a rank-3 input with num_heads unset), and is currently untested.

  2. Nit. The only change to onnxruntime/core/providers/cpu/llm/rotary_embedding_helper.h is a stray blank-line insertion — no validation was actually added to the mainline op; it already rejects this input via the existing cos_cache_dims[1]/[2] equality check that the mainline test relies on. The PR description's claim of adding mainline validation slightly overstates this; consider dropping the blank line.

Comment thread onnxruntime/test/contrib_ops/rotary_embedding_op_test.cc

@tianleiwu tianleiwu left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The OOB fix is correct and well-scoped. The contrib op is the only path that infers head_size from cos_cache (when num_heads is unset on a rank-3 input), and in that path the existing equality check cannot catch the mismatch because head_size is derived from cache_width (so head_size/2 == cache_width by construction). The new effective_rotary_dim > hidden_size guard closes that gap, and the _NoNumHeads test now exercises it.

Leaving the mainline ONNX op (core/providers/cpu/llm/rotary_embedding_helper.h) unchanged is the right call: it requires num_heads for rank-3 input, so head_size is always hidden_size / num_heads and never inferred from the cache — the existing dimension-equality check already rejects the over-sized cache. The added mainline test documents that existing behavior.

No blocking issues. One minor test-naming nit inline.

Comment thread onnxruntime/test/contrib_ops/rotary_embedding_op_test.cc Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants