14 changes: 14 additions & 0 deletions Lib/test/test_pyexpat.py
@@ -712,6 +712,20 @@ def test_change_size_2(self):
         parser.Parse(xml2, True)
         self.assertEqual(self.n, 4)

+    @support.requires_resource('cpu')
+    @support.requires_resource('walltime')
+    def test_heap_overflow(self):
+        # See https://github.com/python/cpython/issues/148441
+        parser = expat.ParserCreate()
+        parser.buffer_text = True
+        parser.buffer_size = 2**31 - 1  # INT_MAX
+        def handler(text):
+            pass
+        N = 2049 * (1 << 20) - 3  # 2,148,532,221 bytes of character data
+        parser.CharacterDataHandler = handler
+        xml_data = b"<r>" + b"A" * N + b"</r>"
+        self.assertEqual(parser.Parse(xml_data, True), 1)
+
 class ElementDeclHandlerTest(unittest.TestCase):
     def test_trigger_leak(self):
         # Unfixed, this test would leak the memory of the so-called
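
Reviewer note: the new test needs roughly 2 GiB of character data, so it is gated behind the `cpu` and `walltime` resources. As a scaled-down sketch of the same parser setup (it does not reproduce the overflow), the snippet below runs a small document through a parser with `buffer_text` enabled and checks that all character data still reaches the handler. The helper name `collect_character_data` and the sizes 100 and 16 are illustrative assumptions, not part of the PR.

```python
from xml.parsers import expat


def collect_character_data(xml_data, buffer_size):
    """Parse xml_data with text buffering on; return (status, chunks).

    Hypothetical helper for illustration only; it is not part of the PR.
    """
    chunks = []
    parser = expat.ParserCreate()
    parser.buffer_text = True          # same setup as test_heap_overflow
    parser.buffer_size = buffer_size   # must be a positive integer
    parser.CharacterDataHandler = chunks.append
    status = parser.Parse(xml_data, True)  # 1 on success
    return status, chunks


# Small stand-in for the 2 GiB payload used by the real test.
status, chunks = collect_character_data(b"<r>" + b"A" * 100 + b"</r>", 16)
print(status, sum(len(chunk) for chunk in chunks))
```

How the data is split across handler calls depends on the buffer size and on how Expat delivers it, so the sketch only checks the concatenated result.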
@@ -0,0 +1,2 @@
Fix a heap buffer overflow in the pyexpat CharacterDataHandler, which is caused
by the sum of two signed integers overflowing.
2 changes: 1 addition & 1 deletion Modules/pyexpat.c
@@ -393,7 +393,7 @@ my_CharacterDataHandler(void *userData, const XML_Char *data, int len)
     if (self->buffer == NULL)
         call_character_handler(self, data, len);
     else {
-        if ((self->buffer_used + len) > self->buffer_size) {
+        if (len > (self->buffer_size - self->buffer_used)) {
Contributor

This looks like a fix to an integer overflow rather than a buffer overflow. Am I missing something?

Member

The cause is indeed an integer overflow, but I think the ASAN report says "heap-buffer overflow" in this case (so the result is a buffer overflow at the end). I'll just write "a crash" (people don't really care about the cause/exact result: if it crashes, it's bad; they can read the issue for more details).

Member

I've updated the title to reflect what the PR did but I'll stay vague in the NEWS.
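
For anyone following the thread, here is a minimal sketch of why the rewritten condition matters. Python integers do not overflow, so the sketch emulates a 32-bit `int` with `ctypes.c_int32`. In C, signed overflow is undefined behavior; two's-complement wraparound is only the commonly observed outcome, and that is what is assumed here. The function names and the sample values (`buffer_used = INT_MAX - 10`, `length = 100`) are hypothetical and chosen only to make the sum exceed `INT_MAX`.

```python
import ctypes

INT_MAX = 2**31 - 1


def wrap_int32(value):
    """Reduce a Python int to a 32-bit two's-complement value (assumed C behavior)."""
    return ctypes.c_int32(value).value


def old_check(buffer_used, length, buffer_size):
    # Pre-patch condition: the sum is formed first and may wrap negative.
    return wrap_int32(buffer_used + length) > buffer_size


def new_check(buffer_used, length, buffer_size):
    # Patched condition: with 0 <= buffer_used <= buffer_size the
    # subtraction stays within [0, INT_MAX], so nothing wraps.
    return length > wrap_int32(buffer_size - buffer_used)


# Hypothetical state: buffer almost full, buffer_size at INT_MAX.
buffer_size = INT_MAX
buffer_used = INT_MAX - 10
length = 100

print(old_check(buffer_used, length, buffer_size))  # False: flush is skipped
print(new_check(buffer_used, length, buffer_size))  # True: flush happens
```

When the old check wrongly reports that the data fits, the flush is skipped and the subsequent copy runs past the end of the buffer, which matches the "heap-buffer overflow" wording mentioned above.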

             if (flush_character_buffer(self) < 0)
                 return;
             /* handler might have changed; drop the rest on the floor