Fix signed-int overflow in SamplingState::Init to prevent heap-buffer-overflow #29443
Open
apsonawane wants to merge 5 commits into
…erflow

`SamplingState<T>::Init` computed `int total_count = batch_size * vocab_size` as a bare `int * int` multiply with model-controlled operands, then wrapped the already-overflowed result in `SafeInt<size_t>`. `SafeInt` rejected the negative-wrap case but silently accepted positive-wrap (e.g. `4 * 0x40000001` wraps to `4`), under-sizing `sorted_scores` / `cumulative_probs`. The companion `next_token_scores` buffer sizes the same product correctly via `SafeInt<size_t>(batch_size) * vocab_size`, so the later `memcpy` in `SamplingCpuHelper::Sample` copies the large size into the small buffer: a heap-buffer-overflow WRITE triggerable by a hostile `.onnx` model with a `com.microsoft::Sampling` node.

Fix: compute the product in `SafeInt`'s checked domain by casting an operand first, matching the pattern already used for `next_token_scores`. Apply the same operand-first pattern to the `batch_size * max_iter` site and to `SafeInt<size_t>(batch_size + 1)` (which itself could wrap in `int`).
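The positive-wrap case is easy to reproduce outside the codebase. The sketch below is illustrative only: `WrappedLow32` and `CheckedElementCount` are hypothetical helpers, not SafeInt or ONNX Runtime APIs, and the wrap is emulated in unsigned arithmetic because a real signed overflow is undefined behavior in C++. It assumes C++17 for `std::optional`.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: low 32 bits of a * b, i.e. the bit pattern a wrapping
// 32-bit multiply would leave behind. Computed in uint64_t so the
// illustration itself has no undefined behavior.
inline std::uint32_t WrappedLow32(std::int32_t a, std::int32_t b) {
  const std::uint64_t wide =
      static_cast<std::uint64_t>(static_cast<std::uint32_t>(a)) *
      static_cast<std::uint64_t>(static_cast<std::uint32_t>(b));
  return static_cast<std::uint32_t>(wide & 0xFFFFFFFFu);
}

// Hypothetical stand-in for the operand-first pattern: widen each operand to
// size_t before multiplying, and report failure instead of wrapping.
inline std::optional<std::size_t> CheckedElementCount(int batch_size,
                                                      int vocab_size) {
  if (batch_size < 0 || vocab_size < 0) return std::nullopt;
  const std::size_t a = static_cast<std::size_t>(batch_size);
  const std::size_t b = static_cast<std::size_t>(vocab_size);
  if (a != 0 && b > std::numeric_limits<std::size_t>::max() / a) {
    return std::nullopt;  // product does not fit in size_t
  }
  return a * b;
}
```

With `batch_size = 4` and `vocab_size = 0x40000001`, `WrappedLow32` yields `4`, a small positive value that a check applied after the multiply cannot tell apart from a legitimate count. `CheckedElementCount` either returns the full product or no value at all, so an allocation sized from it can never be smaller than the copy that follows.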
Pull request overview
This PR hardens SamplingState::Init in the generation/transformers greedy-search implementation by moving buffer element-count computations into SafeInt<size_t> so integer overflow can’t lead to under-allocation and downstream memory errors.
Changes:
- Compute `batch_size * vocab_size` using `SafeInt<size_t>` to prevent overflow before buffer allocation.
- Reuse the checked `total_count` across CPU/CUDA allocations in `SamplingState`.
This pull request improves the safety of buffer size calculations in the `SamplingState` initialization logic by ensuring that all multiplications involving `batch_size` and `vocab_size` are performed using `SafeInt<size_t>`. This prevents integer overflow bugs that could lead to under-allocated buffers and memory errors.

Buffer allocation safety improvements:

- Buffer size calculations involving `batch_size` and `vocab_size` now use `SafeInt<size_t>` to ensure checked arithmetic, preventing silent integer overflows that could cause heap-buffer-overflow issues. This includes allocations for both CPU and CUDA buffers in `SamplingState`.
- The allocation for `h_sampled_all` now also safely casts `max_iter` to `size_t` before multiplication, further protecting against overflow.

These changes make the code more robust and secure, especially when handling large or model-controlled input sizes.
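The same cast-first idea applies to the `batch_size * max_iter` and `batch_size + 1` sites. The following is a minimal sketch under stated assumptions, not the PR's code: `ToSize`, `CheckedMul`, `CheckedAdd`, `SampledAllCount`, and `BatchPlusOneCount` are hypothetical names, and the helpers throw `std::overflow_error` as a stand-in for whatever error the real checked-integer type raises.

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>

// Hypothetical helper: convert a dimension to size_t, rejecting negatives.
inline std::size_t ToSize(int v) {
  if (v < 0) throw std::overflow_error("negative dimension");
  return static_cast<std::size_t>(v);
}

// Hypothetical helper: multiply in size_t and throw instead of wrapping.
inline std::size_t CheckedMul(std::size_t a, std::size_t b) {
  if (a != 0 && b > std::numeric_limits<std::size_t>::max() / a) {
    throw std::overflow_error("size_t multiply overflow");
  }
  return a * b;
}

// Hypothetical helper: add in size_t and throw instead of wrapping.
inline std::size_t CheckedAdd(std::size_t a, std::size_t b) {
  if (b > std::numeric_limits<std::size_t>::max() - a) {
    throw std::overflow_error("size_t add overflow");
  }
  return a + b;
}

// Element count for a batch_size * max_iter buffer: each operand is widened
// before the multiply, so no int * int product is ever formed.
inline std::size_t SampledAllCount(int batch_size, int max_iter) {
  return CheckedMul(ToSize(batch_size), ToSize(max_iter));
}

// Element count for a batch_size + 1 buffer: the addition happens after
// widening, so batch_size == INT_MAX cannot wrap in int.
inline std::size_t BatchPlusOneCount(int batch_size) {
  return CheckedAdd(ToSize(batch_size), 1);
}
```

The point of the design is ordering: the conversion to the wide, checked type happens before any arithmetic, so the check sees the true operands rather than an already-wrapped result.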