Reject URLs with multiple brackets in host component#1661

Open
rodrigobnogueira wants to merge 2 commits into aio-libs:master from rodrigobnogueira:fix-malformed-bracketed-hostnames

@rodrigobnogueira
Member

What do these changes do?

Fixes a parsing edge case where URLs containing multiple bracket characters in the host authority component were silently canonicalized to an unintended host.

Previously, split_netloc used str.partition to extract the content between the first [ and the first ]. When more than one bracket pair was present, this captured content such as :localhost[, which contains a colon and therefore passed the IPv6 content check. After bracket-stripping in the encoder, the host ultimately resolved to localhost.
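The partition behavior described above can be reproduced with a few lines of plain Python. This is a simplified sketch rather than yarl's actual split_netloc; the helper name extract_bracketed is hypothetical and exists only to show what "first [ to first ]" captures for the example URL.

```python
def extract_bracketed(netloc: str) -> str:
    """Return the text between the first '[' and the first ']'.

    Hypothetical helper that mimics the partition-based extraction
    described in this PR; it is not the library's real code.
    """
    _, _, after_open = netloc.partition("[")
    bracketed, _, _ = after_open.partition("]")
    return bracketed


# Authority part of http://[:localhost[]].google:80
malformed = "[:localhost[]].google:80"
captured = extract_bracketed(malformed)

print(repr(captured))   # the captured text still contains a stray '['
print(":" in captured)  # a colon is present, so a naive "looks like IPv6" check passes
```

For a well-formed authority such as [::1]:8080 the same helper returns ::1, which is why the problem only shows up when extra brackets are present.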

Both split_url() and split_netloc() now validate that:

  • exactly one [ and one ] appear in the authority/hostinfo, and
  • [ starts the host subcomponent (per RFC 3986 IP-literal grammar)
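The two checks listed above can be sketched as a standalone helper. This is a minimal sketch under stated assumptions, not the code from this PR: the function name validate_bracketed_hostinfo is hypothetical, and it assumes hostinfo is the authority with any userinfo already removed. The error message mirrors the one quoted in this description.

```python
def validate_bracketed_hostinfo(hostinfo: str) -> None:
    """Raise ValueError if brackets in hostinfo are malformed.

    Hypothetical illustration of the two rules:
    1. exactly one '[' and one ']' may appear, and
    2. '[' must be the first character of the host subcomponent.
    """
    if "[" not in hostinfo and "]" not in hostinfo:
        return  # no IP-literal syntax involved, nothing to check here

    if hostinfo.count("[") != 1 or hostinfo.count("]") != 1:
        raise ValueError("Invalid IPv6 URL")

    if not hostinfo.startswith("["):
        raise ValueError("Invalid IPv6 URL")


def is_accepted(hostinfo: str) -> bool:
    """Convenience wrapper so the checks can be exercised without try/except."""
    try:
        validate_bracketed_hostinfo(hostinfo)
    except ValueError:
        return False
    return True


print(is_accepted("[::1]:8080"))                # single pair at the start
print(is_accepted("[:localhost[]].google:80"))  # multiple brackets
print(is_accepted("prefix[::1]"))               # text before the bracket
```

The sketch only covers bracket placement; whether the bracketed content is a valid IPv6 address is a separate check.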

This is a companion fix to #1654, which addressed text before a single bracket pair; this addresses multiple bracket pairs.

Are there changes in behavior for the user?

Yes. Previously accepted malformed URLs such as http://[:localhost[]].google:80 now raise ValueError: Invalid IPv6 URL, consistent with other malformed IPv6 literals.

Related issues and pull requests

Complements #1654.

psf-chronographer (bot) added the label bot:chronographer:provided ("There is a change note present in this PR") on Apr 21, 2026
rodrigobnogueira force-pushed the fix-malformed-bracketed-hostnames branch from 5156ddf to 702ae36 on April 21, 2026 16:29
Fixes host-confusion parsing where URLs containing multiple bracket
characters in the authority (e.g. http://[:localhost[]].google:80)
were silently canonicalized to an unintended host.

Both split_url() and split_netloc() now raise ValueError when:
- more than one '[' or ']' appears in the netloc/hostinfo, or
- '[' does not start the host subcomponent (per RFC 3986 IP-literal)

Adds 7 regression tests covering the affected code paths.
@codspeed-hq

codspeed-hq Bot commented Apr 21, 2026

Merging this PR will not alter performance

✅ 99 untouched benchmarks


Comparing rodrigobnogueira:fix-malformed-bracketed-hostnames (9584179) with master (e522ab0)

@rodrigobnogueira
Member Author

Just a heads-up regarding the CI failure: the pyupgrade pre-commit check is crashing on the Python 3.14 job with a TypeError: cannot use a bytes pattern on a string-like object.

This is a known upstream incompatibility between pyupgrade and recent Python 3.14 pre-releases (where tokenize.cookie_re was changed to a bytes pattern). It's an unconditional crash inside pyupgrade itself and is unrelated to the code changes in this PR.

pyupgrade....................................................................Failed
- hook id: pyupgrade
- exit code: 1

Traceback (most recent call last):
  File "/pc/clone/GRltiHWiQruk2pUmRoCwuw/py_env-python3.14/bin/pyupgrade", line 7, in <module>
    sys.exit(main())
             ~~~~^^
  File "/pc/clone/GRltiHWiQruk2pUmRoCwuw/py_env-python3.14/lib/python3.14/site-packages/pyupgrade/_main.py", line 393, in main
    ret |= _fix_file(filename, args)
           ~~~~~~~~~^^^^^^^^^^^^^^^^
  File "/pc/clone/GRltiHWiQruk2pUmRoCwuw/py_env-python3.14/lib/python3.14/site-packages/pyupgrade/_main.py", line 327, in _fix_file
    contents_text = _fix_tokens(contents_text)
  File "/pc/clone/GRltiHWiQruk2pUmRoCwuw/py_env-python3.14/lib/python3.14/site-packages/pyupgrade/_main.py", line 297, in _fix_tokens
    tokenize.cookie_re.match(token.src)
    ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^
TypeError: cannot use a bytes pattern on a string-like object
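The error class in the traceback above can be reproduced in isolation with the standard library: a regular expression compiled from a bytes pattern refuses to match a str. The pattern below is an illustrative stand-in for a coding-cookie regex, not the actual tokenize.cookie_re, and try_match is a hypothetical helper.

```python
import re

# Illustrative bytes pattern (an assumption, not the real tokenize.cookie_re).
bytes_cookie_re = re.compile(rb"^[ \t\f]*#.*?coding[:=][ \t]*([-\w.]+)")


def try_match(source):
    """Return True/False for a match, or the error text if matching fails."""
    try:
        return bytes_cookie_re.match(source) is not None
    except TypeError as exc:
        return str(exc)


# Matching bytes against a bytes pattern works.
print(try_match(b"# -*- coding: utf-8 -*-"))

# Matching str against a bytes pattern raises TypeError, as in the CI log.
print(try_match("# -*- coding: utf-8 -*-"))
```

A caller that passes decoded text, as the token.src argument in the traceback suggests, hits this error regardless of the file contents, which is consistent with the crash being independent of this PR's changes.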

rodrigobnogueira force-pushed the fix-malformed-bracketed-hostnames branch from 2195606 to a52a3dd on April 21, 2026 16:36
@codecov

codecov Bot commented Apr 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.48%. Comparing base (2f180d1) to head (9584179).
⚠️ Report is 2 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #1661   +/-   ##
=======================================
  Coverage   99.47%   99.48%           
=======================================
  Files          30       30           
  Lines        5942     5988   +46     
  Branches      283      286    +3     
=======================================
+ Hits         5911     5957   +46     
  Misses         22       22           
  Partials        9        9           
Flag Coverage Δ
CI-GHA 99.48% <100.00%> (+<0.01%) ⬆️
MyPy 97.65% <100.00%> (+0.01%) ⬆️
OS-Linux 99.71% <100.00%> (+<0.01%) ⬆️
OS-Windows 98.42% <100.00%> (+<0.01%) ⬆️
OS-macOS 98.57% <100.00%> (+<0.01%) ⬆️
Py-3.10.11 98.40% <100.00%> (+<0.01%) ⬆️
Py-3.10.20 99.63% <100.00%> (+<0.01%) ⬆️
Py-3.11.15 99.63% <100.00%> (+<0.01%) ⬆️
Py-3.11.9 98.40% <100.00%> (+<0.01%) ⬆️
Py-3.12.10 98.40% <100.00%> (+<0.01%) ⬆️
Py-3.12.13 99.63% <100.00%> (+<0.01%) ⬆️
Py-3.13.12 ?
Py-3.13.13 99.68% <100.00%> (?)
Py-3.13.13t 99.68% <100.00%> (+<0.01%) ⬆️
Py-3.14.3 ?
Py-3.14.4 99.68% <100.00%> (?)
Py-3.14.4t 99.68% <100.00%> (+<0.01%) ⬆️
Py-pypy3.10.16-7.3.19 ?
Py-pypy3.11.15-7.3.21 99.29% <100.00%> (?)
VM-macos-latest 98.57% <100.00%> (+<0.01%) ⬆️
VM-ubuntu-latest 99.71% <100.00%> (+<0.01%) ⬆️
VM-windows-latest 98.42% <100.00%> (+<0.01%) ⬆️
pytest 99.73% <100.00%> (+<0.01%) ⬆️

Labels

bot:chronographer:provided There is a change note present in this PR
