Validate B/scales/zero_points shape in MatMulNBits::PrePack by apsonawane · Pull Request #29445 · microsoft/onnxruntime

apsonawane · 2026-06-30T17:54:56Z

MatMulNBits::PrePack ran at session initialization and called the MLAS pack routines using byte counts derived from the node attributes (N, K, bits, block_size) without ever comparing those attributes to the actual tensor Shape(). A crafted .onnx whose attributes overstate the real B (or scales / zero_points) extent triggered a heap-buffer-overflow READ inside MlasQNBitGemmPackQuantBData / MlasLutGemmPack during OrtApis::CreateSession (no Run() required).
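
To make the size of the overread concrete, here is a minimal standalone sketch of the arithmetic. It does not use any onnxruntime code. The `Attrs` struct and both helper functions are hypothetical names, and the formulas `k_blocks = ceil(K / block_size)` and `blob_size = ceil(block_size * bits / 8)` are assumptions about how the packed B extent follows from the attributes.

```cpp
#include <cstdint>

// Hypothetical mirror of the node attributes named in the description.
struct Attrs {
  int64_t N;
  int64_t K;
  int64_t bits;
  int64_t block_size;
};

// Bytes a pack routine would read from B if it trusted the attributes alone.
// Assumed layout: N rows, k_blocks blocks per row, blob_size bytes per block.
int64_t AttributeDerivedBBytes(const Attrs& a) {
  const int64_t k_blocks = (a.K + a.block_size - 1) / a.block_size;
  const int64_t blob_size = (a.block_size * a.bits + 7) / 8;
  return a.N * k_blocks * blob_size;
}

// Bytes read past the end of the real buffer; 0 when the buffer is big enough.
int64_t OverreadBytes(const Attrs& a, int64_t actual_bytes) {
  const int64_t wanted = AttributeDerivedBBytes(a);
  return wanted > actual_bytes ? wanted - actual_bytes : 0;
}
```

Under these assumptions, attributes of N = 4096, K = 4096, bits = 4, block_size = 32 imply 8388608 bytes of B, so a model that ships only a 16-byte B initializer would cause a read of 8388592 bytes past the buffer.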

The canonical shape check already lives in matmul_nbits_helper::CheckInputs, but it is invoked only from Compute(), after PrePack has already performed the out-of-bounds read. By then the original B tensor has been replaced with nullptr in the kernel context, so the Compute-time check never re-validates it.

Fix: at the top of PrePack, after the existing early-return guards and before any tensor.DataRaw() read, validate the incoming initializer's Shape() against the attribute-derived shape:

B -> (N, k_blocks, blob_size)
scales -> (N * k_blocks) or (N, k_blocks)
zero_points -> uint8: (N * zp_blob) or (N, zp_blob); else (N * k_blocks) or (N, k_blocks)

A mismatch returns INVALID_ARGUMENT so the session fails to load rather than reading past the buffer.
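
The shape rules above can be sketched as a small standalone check. This is not the code from the PR. `NBitsAttrs`, `Shape`, and the three `IsValid...` functions are hypothetical names, and the definitions of `k_blocks`, `blob_size`, and `zp_blob` are assumptions, since the description names them without defining them.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Shape = std::vector<int64_t>;

// Hypothetical mirror of the node attributes.
struct NBitsAttrs {
  int64_t N;
  int64_t K;
  int64_t bits;
  int64_t block_size;
};

inline int64_t CeilDiv(int64_t a, int64_t b) {
  assert(b > 0);
  return (a + b - 1) / b;
}

// Assumed definitions of the derived quantities.
inline int64_t KBlocks(const NBitsAttrs& a) { return CeilDiv(a.K, a.block_size); }
inline int64_t BlobSize(const NBitsAttrs& a) { return CeilDiv(a.block_size * a.bits, 8); }
inline int64_t ZpBlob(const NBitsAttrs& a) { return CeilDiv(KBlocks(a) * a.bits, 8); }

// B must be exactly (N, k_blocks, blob_size).
bool IsValidBShape(const NBitsAttrs& a, const Shape& s) {
  return s == Shape{a.N, KBlocks(a), BlobSize(a)};
}

// scales may be flattened (N * k_blocks) or two-dimensional (N, k_blocks).
bool IsValidScalesShape(const NBitsAttrs& a, const Shape& s) {
  return s == Shape{a.N * KBlocks(a)} || s == Shape{a.N, KBlocks(a)};
}

// Packed uint8 zero_points use zp_blob bytes per row; any other element type
// follows the same shape rule as scales.
bool IsValidZeroPointsShape(const NBitsAttrs& a, const Shape& s, bool is_uint8) {
  if (!is_uint8) return IsValidScalesShape(a, s);
  return s == Shape{a.N * ZpBlob(a)} || s == Shape{a.N, ZpBlob(a)};
}
```

With N = 8, K = 64, bits = 4, block_size = 32, these assumptions give k_blocks = 2, blob_size = 16, and zp_blob = 1. B must then be (8, 2, 16), scales (16) or (8, 2), and uint8 zero_points (8) or (8, 1). In the real kernel a failed check would map to the INVALID_ARGUMENT status rather than a bool.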

Copilot

Pull request overview

This PR hardens the CPU MatMulNBits contrib op against malformed models by adding early shape validation in MatMulNBits<T1>::PrePack() so that session initialization rejects inconsistent initializers before any MLAS packing routine can read past the provided buffers.

Changes:

Add attribute-derived initializer shape checks for B, scales, and zero_points at the top of MatMulNBits<T1>::PrePack().
Add new unit tests that expect session creation to fail (pre-Compute()) for mismatched initializer shapes, plus a compatibility test for legacy flattened scales/zero_points layouts.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc	Adds new PrePack-time shape validation intended to prevent OOB reads during weight packing at session init.
onnxruntime/test/contrib_ops/matmul_4bits_test.cc	Adds tests that exercise PrePack-time rejection for malformed initializer shapes and verifies legacy flattened layouts remain accepted.

apsonawane added 2 commits June 30, 2026 10:49

add unit tests

495fbd2

apsonawane requested a review from Copilot June 30, 2026 18:17

apsonawane enabled auto-merge (squash) June 30, 2026 18:17

Copilot started reviewing on behalf of apsonawane June 30, 2026 18:17 View session

Copilot AI reviewed Jun 30, 2026

Comment thread onnxruntime/contrib_ops/cpu/quantization/matmul_nbits.cc Outdated

apsonawane added 2 commits June 30, 2026 11:58

Address comments

bcba24f

Fix pipeline

43c1b16

apsonawane wants to merge 4 commits into main from asonawane/edge-3
