Fuzzes the CPython _json C module (Modules/_json.c) through JSONDecoder.decode() and JSONDecoder.raw_decode(), with the method chosen per input via FuzzedDataProvider. Input bytes are decoded as latin-1, so every byte value maps to a distinct code point and the full 0–255 byte space is preserved at the parser boundary. By contrast, json.py feeds UTF-8 with errors="replace", which collapses every invalid sequence to U+FFFD and sharply shrinks the effective input space. This fuzzer also exercises raw_decode(), including its end-position reporting when trailing data follows the document, which json.py never calls. It drops the dumps/loads roundtrip to focus purely on decoder hardening rather than re-encoding already-valid objects.