Harden ConvTranspose pad computation with SafeInt and consistency checks by yuslepukhin · Pull Request #29446 · microsoft/onnxruntime

yuslepukhin · 2026-06-30T18:38:39Z

This pull request strengthens validation and error handling for the ConvTranspose operator in ONNX Runtime, particularly when using explicit output_shape attributes. It adds comprehensive checks to prevent inconsistent or invalid configurations, improves arithmetic safety to guard against integer overflows, and introduces a suite of targeted unit tests to verify these behaviors.

Validation and Error Handling Improvements:

Added stricter input validation in ConvTransposeAttributes::ComputePadAndOutputShape to ensure that all relevant parameters (output_shape, input size, stride, kernel, dilation, and output padding) are within valid ranges and to provide clear error messages when they are not. This includes checks that all values are positive and that output padding is non-negative.
Added a consistency check to verify that the explicit output_shape is compatible with the input dimensions and convolution parameters, preventing buffer overruns and logical inconsistencies.

Arithmetic Safety:

Updated ComputeTotalPad to use SafeInt for all intermediate arithmetic, ensuring that integer overflows are detected and handled safely instead of producing undefined behavior.

Testing Enhancements:

Added a comprehensive set of unit tests for ConvTranspose with explicit output_shape, including cases for invalid, inconsistent, and overflow-prone configurations, as well as valid edge cases (e.g., 1D, 2D, and 3D, large batch sizes, group > 1, and cases requiring padding). These tests verify that invalid configurations are rejected and that valid ones work as expected.

These changes collectively improve the robustness, correctness, and maintainability of the ConvTranspose operator's implementation and its handling of explicit output shapes.

- Wrap ComputeTotalPad arithmetic in SafeInt<int64_t> to prevent signed integer overflow (matching ComputePad and ComputeOutputShape in the same header). - Add input validation in the explicit output_shape path of ComputeTransposePadAndOutputShape (stride, kernel, dilation, adj, in_size, out_size). - Add a post-computation consistency check verifying that the forward-conv re-derivation of input size matches the actual input size, preventing pad/buffer size mismatches. - Add unit tests covering: inconsistent output_shape (1D, 2D, 3D), zero output_shape, stride/dilation variants, group>1, large batch, valid output_shape requiring padding, and arithmetic overflow (guarded by ORT_NO_EXCEPTIONS).

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

The OverflowInPadComputation test was crashing during AddInput because TensorShape::Size() overflowed for the huge W dimensions. Fix by using a valid small W shape {1,1,3,3} with kernel_shape={3,3} attribute and extreme dilation values that trigger SafeInt overflow in the pad computation arithmetic instead.

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copilot

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

titaiwangms · 2026-06-30T22:29:14Z

Review summary — ConvTranspose explicit `output_shape` hardening

Bottom line: the core change is sound and a genuine safety improvement. The central question — does the new derived_in != in_size check falsely reject spec-legal configs? — resolves in the PR's favor. No blocking issues. The most valuable finding is a pre-existing OOB that this PR leaves unfixed on a sibling code path (out of scope here, but worth a follow-up).

The crux, verified

Col2im (in conv_transpose.cc) is driven by the output dims and consumes exactly derived_in = (out + pads − dkernel)/stride + 1 columns from a buffer sized for in_size. So derived_in > in_size is a real heap over-read, not a cosmetic mismatch.

Algebra: derived_in = in + floor(adj/stride).

adj < stride → passes for the entire legal auto-pad range (out ≤ natural). ✔ No false rejection.
Rejected cases are exactly out > natural (spec-illegal — implies negative pads) and adj ≥ stride (only reachable when dilation > stride). In both, the existing Col2im would have over-read — so no correct behavior is lost; the check converts a latent OOB into a clean Status. ONNX spec prose is ambiguous here ("less than the corresponding stride/dilation"), and the onnx reference enforces no such bound, so the fail-safe is defensible.

Findings

🔴 Critical — pre-existing & out-of-scope (does not block this PR)

The same Col2im OOB survives on the implicit path (no output_shape, via output_padding). Reachable example: X[1,1,3,1], strides=[2,1], dilations=[3,1], kernel=[3,1], output_padding=[2,0] passes the adj < max(stride,dilation) pre-check → out=13, Col2im wants 4 columns from a 3-column buffer → over-read. This PR only guards the explicit-output_shape branch. Recommend extending the same derived_in guard (or tightening to adj < stride) to the implicit path in this or a follow-up PR.

🟡 Minor (in-scope, non-blocking)

Misleading diagnostic: when adj ≥ stride, the error blames output_shape though the real culprit is the output_padding×dilation interaction. Consider an explicit adj < stride guard or reworded message.
Missing positive test for valid explicit output_shape with output_padding > 0 — all new success tests use adj=0, so the adj term in the new check is untested on the success side.
Overflow tests use an empty expected-error substring ("") → they can pass on unrelated failures; assert a stable substring.
Test magic numbers contradict their comments (4611686018427387903 is INT64_MAX/2, not /4; 2305843009213693952 is 2^61). Prefer named constexpr expressions.
Comment vs. code: the comment describes a one-sided "exceeds" condition but the code tests !=. Harmless (equivalent in the reachable space) — a half-line note that derived_in < in_size is unreachable for adj ≥ 0 would prevent a future reader from suspecting over-strict rejection.
Failure tests exclude CUDA/WebGPU with a comment claiming they "perform their own validation," but they share ComputePadsAndOutputShape and would hit the same check — likely a coverage gap + inaccurate comment (worth confirming the routing).

🔵 Follow-up / out-of-scope (pre-existing)

WebNN (builder_utils.cc) and QNN (conv_op_builder.cc) do their own raw arithmetic / attribute parsing, bypassing both the new SafeInt protection (WebNN's in_size*stride can still overflow) and the consistency check. Consider unifying through ComputeTransposePadAndOutputShape.

✅ Confirmed clean

Dropping constexpr from ComputeTotalPad: no caller uses it in a constant-expression context. No build break.

Reviewed with a multi-model review pass (readability / code / critical / deep-spec / integration); crux verified by hand against the kernel, the ONNX 1.22 schema, and the onnx reference implementation.

- Add derived_in consistency check to the implicit path (no output_shape) to catch output_padding >= stride when dilation > stride, preventing the same Col2im OOB read on the implicit code path. - Add boundary test (output_shape=9, natural=7) for stride=2 explicit output_shape with explanation of integer division truncation at 8. - Add implicit path OOB test: output_padding=2 >= stride=2 with dilation=3 passes adj < max(stride, dilation) but fails consistency. - Add output_padding success test reusing known-correct values from ConvTranspose_2D_outputpadding_strides2. - Fix diagnostic: use output_padding instead of adj in error messages. - Add comment that derived_in < in_size is algebraically unreachable. - Use named constexpr for magic numbers in overflow tests. - Assert stable 'Integer overflow' substring instead of empty string. - Fix EP exclusion comments to accurately describe routing.

titaiwangms

Thanks!

DML does not support 3D ConvTranspose and throws 'The parameter is incorrect' during session initialization.

…cks (#29446) This pull request strengthens validation and error handling for the ConvTranspose operator in ONNX Runtime, particularly when using explicit `output_shape` attributes. It adds comprehensive checks to prevent inconsistent or invalid configurations, improves arithmetic safety to guard against integer overflows, and introduces a suite of targeted unit tests to verify these behaviors. **Validation and Error Handling Improvements:** * Added stricter input validation in `ConvTransposeAttributes::ComputePadAndOutputShape` to ensure that all relevant parameters (`output_shape`, input size, stride, kernel, dilation, and output padding) are within valid ranges and to provide clear error messages when they are not. This includes checks that all values are positive and that output padding is non-negative. * Added a consistency check to verify that the explicit `output_shape` is compatible with the input dimensions and convolution parameters, preventing buffer overruns and logical inconsistencies. **Arithmetic Safety:** * Updated `ComputeTotalPad` to use `SafeInt` for all intermediate arithmetic, ensuring that integer overflows are detected and handled safely instead of producing undefined behavior. **Testing Enhancements:** * Added a comprehensive set of unit tests for `ConvTranspose` with explicit `output_shape`, including cases for invalid, inconsistent, and overflow-prone configurations, as well as valid edge cases (e.g., 1D, 2D, and 3D, large batch sizes, group > 1, and cases requiring padding). These tests verify that invalid configurations are rejected and that valid ones work as expected. These changes collectively improve the robustness, correctness, and maintainability of the ConvTranspose operator's implementation and its handling of explicit output shapes.

yuslepukhin requested a review from Copilot June 30, 2026 18:38

Copilot started reviewing on behalf of yuslepukhin June 30, 2026 18:39 View session

Copilot AI reviewed Jun 30, 2026

yuslepukhin requested a review from Copilot June 30, 2026 21:06

Copilot started reviewing on behalf of yuslepukhin June 30, 2026 21:07 View session

Copilot AI reviewed Jun 30, 2026

titaiwangms requested a review from Copilot June 30, 2026 22:19

Copilot started reviewing on behalf of titaiwangms June 30, 2026 22:20 View session

Copilot AI reviewed Jun 30, 2026

View reviewed changes

Comment thread onnxruntime/core/providers/cpu/nn/conv_transpose_attributes.h

Comment thread onnxruntime/test/providers/cpu/nn/conv_transpose_op_test.cc

titaiwangms previously approved these changes Jun 30, 2026

View reviewed changes

Exclude DML EP from 3D ConvTranspose test

3e9d8f2

DML does not support 3D ConvTranspose and throws 'The parameter is incorrect' during session initialization.

yuslepukhin dismissed titaiwangms’s stale review via 3e9d8f2 July 1, 2026 15:41

titaiwangms approved these changes Jul 1, 2026

View reviewed changes

yuslepukhin merged commit 6f31cd6 into main Jul 1, 2026
86 checks passed

yuslepukhin deleted the yuslepukhin/compute_total_pad branch July 1, 2026 17:33

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Harden ConvTranspose pad computation with SafeInt and consistency checks#29446

Harden ConvTranspose pad computation with SafeInt and consistency checks#29446
yuslepukhin merged 4 commits into
mainfrom
yuslepukhin/compute_total_pad

yuslepukhin commented Jun 30, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

titaiwangms commented Jun 30, 2026

Uh oh!

titaiwangms left a comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

yuslepukhin commented Jun 30, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

titaiwangms commented Jun 30, 2026

Review summary — ConvTranspose explicit output_shape hardening

The crux, verified

Findings

Uh oh!

titaiwangms left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Review summary — ConvTranspose explicit `output_shape` hardening