Fix rotary embedding oob issue #29014
tianleiwu
left a comment
Thanks for hardening RotaryEmbedding validation. The contrib-op hidden_size guard (rank-3 / num_heads==0 path) is the meaningful fix and looks correct. A few non-blocking points:
- Mainline op (`core/providers/cpu/llm/rotary_embedding_helper.h`) additions look redundant. In both branches the new guard checks `cache_width*2 > head_size` immediately before the existing exact-equality check `cos_cache_dims[2]/[1] != (rotary>0 ? rotary : head_size)/2`. Since `head_size` is even here, any input failing the new guard already fails the equality check, so the mainline op was already rejecting these inputs; the new blocks only change which error message fires first. If this is defense-in-depth / clearer messaging that's fine, but it does not change behavior, and the two copy-pasted blocks could be a small shared helper.
- Contrib width-check rewrite is a behavior tightening. The new `expected_cache_width = rotary_embedding_dim > 0 ? rotary/2 : head_size/2` rejects the case where `rotary_embedding_dim > 0` and `cos_cache` width equals `head_size/2`, which the old condition accepted. This aligns contrib with mainline (good) but could reject a previously-valid model; please confirm it's intentional.
- Test assertions loosened. The two updated existing tests now match only `"cos_cache dimension"`, a broad substring that also overlaps the unchanged equality-check message. Consider a more specific fragment (e.g. `"exceeds head_size"`) so a future regression that swaps the failing validation path is still caught. New tests look good.
No correctness regressions found.
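The first two points of this review can be checked with a small standalone model. This is only a sketch: the function names are hypothetical, the old contrib condition is an assumed shape inferred from the comment (it accepts either width), and the brute-force loop additionally assumes `rotary_dim` is even and no larger than `head_size`. None of it is the actual onnxruntime code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the extra guard added in the first draft of the PR.
bool NewGuardRejects(int64_t cache_width, int64_t head_size) {
  return cache_width * 2 > head_size;
}

// Model of the existing exact-equality check on the cos_cache width.
bool EqualityCheckRejects(int64_t cache_width, int64_t rotary_dim,
                          int64_t head_size) {
  assert(head_size % 2 == 0);  // the review relies on head_size being even
  const int64_t expected = (rotary_dim > 0 ? rotary_dim : head_size) / 2;
  return cache_width != expected;
}

// Brute force: whenever the new guard rejects, the equality check rejects too.
// Assumption (not from the review): rotary_dim is even and <= head_size.
bool GuardIsRedundant(int64_t max_head_size) {
  for (int64_t head_size = 2; head_size <= max_head_size; head_size += 2) {
    for (int64_t rotary_dim = 0; rotary_dim <= head_size; rotary_dim += 2) {
      for (int64_t cache_width = 0; cache_width <= 2 * head_size; ++cache_width) {
        if (NewGuardRejects(cache_width, head_size) &&
            !EqualityCheckRejects(cache_width, rotary_dim, head_size)) {
          return false;  // found an input only the new guard would catch
        }
      }
    }
  }
  return true;
}

// Assumed shape of the old, more lenient contrib width check.
bool OldContribAccepts(int64_t cache_width, int64_t rotary_dim,
                       int64_t head_size) {
  return cache_width == head_size / 2 ||
         (rotary_dim > 0 && cache_width == rotary_dim / 2);
}

// The rewritten contrib check as described in the review.
bool NewContribAccepts(int64_t cache_width, int64_t rotary_dim,
                       int64_t head_size) {
  const int64_t expected = rotary_dim > 0 ? rotary_dim / 2 : head_size / 2;
  return cache_width == expected;
}
```

Under these assumptions `GuardIsRedundant(64)` holds, and the input `cache_width=32, rotary_dim=32, head_size=64` is the kind of case the tightening would start rejecting.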
…ib width-check tightening, tighten test assertions
tianleiwu
left a comment
Re-reviewed at head 89def7b. My two earlier threads (redundant mainline guards; contrib width-check tightening) are addressed — the redundant guards were removed, the contrib equality check reverted to the original, and the two existing tests now assert the specific exceeds head_size fragment. Thanks.
Two points remain on the current head:
- Test coverage gap (inline). The new `*_RejectsCosCacheExceedsHiddenSize` tests set `num_heads=1`, so the inferred `head_size` equals `hidden_size` (64) and `effective_rotary_dim=128 > head_size`; i.e. they fire the first guard (`exceeds head_size`, which the assertion confirms), not the new `hidden_size` guard. The genuinely new `effective_rotary_dim > hidden_size` guard (the actual OOB fix this PR targets) only fires when `head_size` is inferred from the cache (`head_size==0` on entry, e.g. a rank-3 input with `num_heads` unset), and is currently untested.
- Nit. The only change to `onnxruntime/core/providers/cpu/llm/rotary_embedding_helper.h` is a stray blank-line insertion; no validation was actually added to the mainline op, which already rejects this input via the existing `cos_cache_dims[1]/[2]` equality check that the mainline test relies on. The PR description's claim of adding mainline validation slightly overstates this; consider dropping the blank line.
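To make the coverage gap concrete, here is a minimal sketch of the guard ordering described above. The `GuardOutcome` enum and `Validate` function are hypothetical, and the control flow is an assumption reconstructed from this comment, not a copy of the contrib helper.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical outcome labels; the real helper returns a Status with a message.
enum class GuardOutcome { kOk, kExceedsHeadSize, kExceedsHiddenSize };

// Assumed ordering of the two guards for a rank-3 input.
// num_heads == 0 models "num_heads unset".
GuardOutcome Validate(int64_t hidden_size, int64_t num_heads,
                      int64_t rotary_embedding_dim, int64_t cache_width) {
  assert(hidden_size > 0 && num_heads >= 0);
  // head_size stays 0 on entry when num_heads is unset.
  int64_t head_size = num_heads > 0 ? hidden_size / num_heads : 0;
  const int64_t effective_rotary_dim =
      rotary_embedding_dim > 0 ? rotary_embedding_dim : cache_width * 2;

  // First guard: only meaningful when head_size is already known.
  if (head_size > 0 && effective_rotary_dim > head_size) {
    return GuardOutcome::kExceedsHeadSize;
  }
  // New guard: head_size must be inferred from the cache, so bound it by
  // hidden_size before trusting it.
  if (head_size == 0 && effective_rotary_dim > hidden_size) {
    return GuardOutcome::kExceedsHiddenSize;
  }
  return GuardOutcome::kOk;
}
```

In this model, `Validate(64, 1, 0, 64)` stops at the first guard, while `Validate(64, 0, 0, 64)` is the only variant that reaches the `hidden_size` guard, which is the case the comment asks a test to cover.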
… add rank-3 no-num_heads test for hidden_size guard
tianleiwu
left a comment
The OOB fix is correct and well-scoped. The contrib op is the only path that infers head_size from cos_cache (when num_heads is unset on a rank-3 input), and in that path the existing equality check cannot catch the mismatch because head_size is derived from cache_width (so head_size/2 == cache_width by construction). The new effective_rotary_dim > hidden_size guard closes that gap, and the _NoNumHeads test now exercises it.
Leaving the mainline ONNX op (core/providers/cpu/llm/rotary_embedding_helper.h) unchanged is the right call: it requires num_heads for rank-3 input, so head_size is always hidden_size / num_heads and never inferred from the cache — the existing dimension-equality check already rejects the over-sized cache. The added mainline test documents that existing behavior.
No blocking issues. One minor test-naming nit inline.
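The "by construction" argument can be illustrated with a simplified model. The helper names below are hypothetical, and treating the rotation as touching `head_size` contiguous elements of a `hidden_size`-wide row is an assumption made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// When num_heads is unset, head_size is derived from the cache width.
int64_t InferHeadSizeFromCache(int64_t cache_width) {
  assert(cache_width >= 0);
  return cache_width * 2;
}

// The existing equality check: cache width must be half of head_size.
bool EqualityCheckPasses(int64_t cache_width, int64_t head_size) {
  return cache_width == head_size / 2;
}

// Simplified model: how many elements past the end of one hidden_size-wide
// row a rotation over the inferred head would touch.
int64_t ElementsPastRowEnd(int64_t hidden_size, int64_t cache_width) {
  const int64_t head_size = InferHeadSizeFromCache(cache_width);
  return std::max<int64_t>(0, head_size - hidden_size);
}
```

Because `head_size` is computed from `cache_width`, `EqualityCheckPasses(w, InferHeadSizeFromCache(w))` is true for every `w`, so it cannot reject an oversized cache; in this model only a comparison against `hidden_size` separates `cache_width=32` (no overrun) from `cache_width=64` (64 elements past the row) when `hidden_size` is 64.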
This pull request improves the validation logic for the RotaryEmbedding operator to prevent out-of-bounds reads when the rotary embedding dimension derived from `cos_cache` exceeds the input tensor's `hidden_size`. It also adds dedicated unit tests to verify that this validation triggers as expected.

Validation improvements:
- Added a check in `rotary_embedding_helper.h` to ensure that the effective rotary embedding dimension (`cos_cache` width × 2) does not exceed `hidden_size` when `rotary_embedding_dim` is 0, returning an error if the condition is violated.

Unit test additions:
- Added `ContribRotaryEmbedding_RejectsCosCacheExceedsHiddenSize` test in `rotary_embedding_op_test.cc` to verify that an invalid configuration is correctly rejected in the contrib op.
- Added `RotaryEmbedding_RejectsCosCacheExceedsHiddenSize` test in `rotary_embedding_op_test.cc` (providers/cpu/llm) to verify the same validation in the mainline op.