-
Notifications
You must be signed in to change notification settings - Fork 2k
Go: add experimental query for IDNA digit-fold IP-literal smuggling #21784
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
astrogilda
wants to merge
5
commits into
github:main
Choose a base branch
from
astrogilda:experimental-go-idna-ip-literal-smuggle
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+1,397
−0
Open
Changes from 4 commits
Commits
Show all changes
5 commits
Select commit
Hold shift + click to select a range
92e52e2
Go: add experimental query for IDNA digit-fold IP-literal smuggling
astrogilda 6392c52
docs: align IDNA digit-fold qhelp/qll/ql with implemented model
astrogilda 771b433
test: tighten IDNA digit-fold fixture coverage and refresh baseline
astrogilda 611b7cc
style: gofmt negatives.go fixture
astrogilda e8e046e
Fix QL formatting
owen-mc File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
153 changes: 153 additions & 0 deletions
153
go/ql/src/experimental/CWE-918/IdnaIpLiteralSmuggle.qhelp
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,153 @@ | ||
| <!DOCTYPE qhelp PUBLIC | ||
| "-//Semmle//qhelp//EN" | ||
| "qhelp.dtd"> | ||
| <qhelp> | ||
|
|
||
| <overview> | ||
| <p> | ||
| The Go module <code>golang.org/x/net/idna</code> implements UTS-46 IDNA | ||
| processing. On the <code>Lookup</code> and <code>Display</code> profiles | ||
| (and any profile constructed via <code>idna.New(idna.MapForLookup(), ...)</code>), | ||
| both <code>(*Profile).ToASCII</code> and <code>(*Profile).ToUnicode</code> | ||
| apply an NFKC-based character map that folds <strong>100 distinct | ||
| non-ASCII Unicode digit codepoints</strong> to their ASCII equivalents. | ||
| The 100 codepoints partition into the following Unicode-block ranges: | ||
| </p> | ||
| <ul> | ||
| <li>Latin-1 superscripts (U+00B2, U+00B3, U+00B9): 3 codepoints</li> | ||
| <li>Mathematical superscripts (U+2070, U+2074..U+2079): 7 codepoints</li> | ||
| <li>Mathematical subscripts (U+2080..U+2089): 10 codepoints</li> | ||
| <li>Circled digits (U+2460..U+2468, U+24EA): 10 codepoints</li> | ||
| <li>Fullwidth digits (U+FF10..U+FF19): 10 codepoints</li> | ||
| <li>Mathematical Alphanumeric Symbols digits, spanning bold, | ||
| double-struck, sans-serif, sans-serif-bold, and monospace styles | ||
| (U+1D7CE..U+1D7FF): 50 codepoints</li> | ||
| <li>Segmented digits (U+1FBF0..U+1FBF9): 10 codepoints</li> | ||
| </ul> | ||
| <p> | ||
| Devanagari digits (U+0966..U+096F) are <strong>not</strong> in scope: | ||
| empirical testing against <code>golang.org/x/net/idna v0.53.0</code> | ||
| confirms they do not fold to ASCII via UTS-46. The | ||
| <code>Registration</code> profile is structurally covered by the rule | ||
| but disallows every fold codepoint at the rune-validation stage, so a | ||
| caller that respects the returned <code>error</code> never sees a | ||
| smuggled literal from that profile in practice. | ||
| </p> | ||
| <p> | ||
| The library contains no IP-literal detection. A caller that applies UTS-46 | ||
| mapping to an attacker-controlled host string and consumes the result in a | ||
| network sink without rechecking against IP-literal parsers receives a | ||
| valid ASCII IPv4 literal back as the "domain name" output. Any downstream | ||
| allowlist check, SSRF guard, NoProxy match, or TLS-SNI router that does | ||
| not re-check the post-IDNA result is bypassed. The anti-pattern also | ||
| applies to callers that do a pre-IDNA <code>net.ParseIP</code> check and | ||
| think it is sufficient: the smuggled host is not ASCII, so the pre-IDNA | ||
| check rejects it as non-IP, and the post-IDNA value (now a numeric | ||
| literal) reaches the sink unguarded. | ||
| </p> | ||
| <p> | ||
| IPv6 is out of scope: <code>:</code> is a UTS-46 disallowed character; | ||
| bare-IPv6 inputs are rejected by IDNA rune-validation before any | ||
| digit-fold mapping runs. | ||
| </p> | ||
| <p> | ||
| Sinks where the smuggled literal becomes exploitable include | ||
| <code>net.JoinHostPort</code>, <code>net.Dial</code>, | ||
| <code>(*http.Request).URL.Host</code>, <code>(*tls.Config).ServerName</code>, | ||
| <code>(*http.Cookie).Domain</code>, and any HTTP client request URL | ||
| constructed from the mapped value. | ||
| </p> | ||
| </overview> | ||
|
|
||
| <recommendation> | ||
| <p> | ||
| Either: | ||
| </p> | ||
| <ol> | ||
| <li> | ||
| Use a strict IDNA profile option that returns an error if the mapped | ||
| output parses as an IP literal, if your IDNA library exposes one. | ||
| </li> | ||
| <li> | ||
| Apply the explicit safe pattern: after the IDNA mapping call, strip | ||
| trailing dots from the result and parse it. Reject if | ||
| <code>net.ParseIP</code> returns a non-<code>nil</code> address, or if | ||
| <code>netip.ParseAddr</code> returns no error (note the inverted | ||
| convention: <code>netip.ParseAddr</code> reports a successfully parsed | ||
| address via <code>err == nil</code>, not via a non-zero return). The | ||
| trailing-dot strip is required because <code>"0.¹.0.0."</code> maps to | ||
| <code>"0.1.0.0."</code>, which a bare <code>net.ParseIP</code> rejects | ||
| on its own yet is still an IP literal for routing purposes; the strip | ||
| exposes the literal so the parser sees it. | ||
| </li> | ||
| </ol> | ||
| </recommendation> | ||
|
|
||
| <example> | ||
| <p> | ||
| Vulnerable pattern. The host string is mapped through the IDNA profile | ||
| and reaches a network sink with no post-IDNA IP-literal recheck: | ||
| </p> | ||
|
|
||
| <sample src="IdnaIpLiteralSmuggleBad.go"/> | ||
|
|
||
| <p> | ||
| Safe pattern. Post-IDNA trailing-dot strip followed by | ||
| <code>net.ParseIP</code> recheck: | ||
| </p> | ||
|
|
||
| <sample src="IdnaIpLiteralSmuggleGood.go"/> | ||
|
|
||
| <p> | ||
| The safe pattern accepts three trailing-dot strip forms. They are | ||
| <strong>not</strong> equivalent in coverage: | ||
| </p> | ||
| <ul> | ||
| <li><code>strings.TrimRight(ace, ".")</code>: strict form. Strips | ||
| all trailing dots, so the multi-dot residue produced when UTS-46 | ||
| maps the fullwidth dot U+FF0E or the ideographic dot U+3002 next | ||
| to ASCII dots is fully removed.</li> | ||
| <li><code>strings.TrimSuffix(ace, ".")</code>: lenient form. Strips | ||
| only one trailing dot. Sufficient for the canonical | ||
| <code>"0.1.0.0."</code> shape but leaves residue if multiple | ||
| trailing dots were produced by mapping.</li> | ||
| <li><code>if strings.HasSuffix(ace, ".") { ace = ace[:len(ace)-1] }</code>: | ||
| manual single-dot slice. Behaves identically to | ||
| <code>TrimSuffix</code> in coverage and inherits the same | ||
| multi-dot-residue limitation.</li> | ||
| </ul> | ||
| <p> | ||
| Callers whose threat model includes the multi-trailing-dot variant | ||
| should prefer <code>strings.TrimRight</code>. After the strip, parse | ||
| with <code>netip.ParseAddr</code> (preferred) or <code>net.ParseIP</code> | ||
| and reject if the value parses as an IP literal (<code>err == nil</code> | ||
| for the former, non-<code>nil</code> return for the latter). | ||
| </p> | ||
| </example> | ||
|
|
||
| <references> | ||
|
|
||
| <li> | ||
| Unicode Technical Standard #46 (IDNA Compatibility Processing): | ||
| <a href="https://www.unicode.org/reports/tr46/">https://www.unicode.org/reports/tr46/</a> | ||
| </li> | ||
| <li> | ||
| <code>golang.org/x/net/idna</code> package documentation: | ||
| <a href="https://pkg.go.dev/golang.org/x/net/idna">https://pkg.go.dev/golang.org/x/net/idna</a> | ||
| </li> | ||
| <li> | ||
| WHATWG URL Standard, <code>ends_in_a_number</code> host parser check | ||
| (prior art for IP-literal detection in URL parsers): | ||
| <a href="https://url.spec.whatwg.org/#ends-in-a-number-checker">https://url.spec.whatwg.org/#ends-in-a-number-checker</a> | ||
| </li> | ||
| <li> | ||
| CWE-918: Server-Side Request Forgery (SSRF): | ||
| <a href="https://cwe.mitre.org/data/definitions/918.html">https://cwe.mitre.org/data/definitions/918.html</a> | ||
| </li> | ||
| <li> | ||
| CWE-020: Improper Input Validation: | ||
| <a href="https://cwe.mitre.org/data/definitions/20.html">https://cwe.mitre.org/data/definitions/20.html</a> | ||
| </li> | ||
|
|
||
| </references> | ||
| </qhelp> |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,36 @@ | ||
| /** | ||
| * @name IDNA digit-fold IP-literal smuggling via UTS-46 NFKC mapping | ||
| * @description An untrusted hostname flows through | ||
| * `(*golang.org/x/net/idna.Profile).ToASCII` or `.ToUnicode` | ||
| * on a digit-folding profile (which folds 100 non-ASCII | ||
| * Unicode digit codepoints to ASCII via UTS-46 NFKC) and | ||
| * reaches a security-relevant hostname sink without a | ||
| * post-IDNA IP-literal recheck. A caller that omits the | ||
| * recheck (or only runs `net.ParseIP` BEFORE the mapping | ||
| * call) will accept a smuggled IPv4 literal such as | ||
| * `"0.¹.0.0"` (which maps to `"0.1.0.0"`). Scope is IPv4 | ||
| * only because IPv6 colons are rejected by IDNA | ||
| * rune-validation before UTS-46 mapping runs. | ||
| * @id go/idna-ip-literal-smuggle | ||
| * @kind path-problem | ||
| * @problem.severity warning | ||
| * @security-severity 8.1 | ||
| * @precision high | ||
| * @tags security | ||
| * experimental | ||
| * external/cwe/cwe-918 | ||
| * external/cwe/cwe-020 | ||
| * @requires codeql/go-all >= 0.6.0 | ||
| */ | ||
|
|
||
| import go | ||
| import IdnaIpLiteralSmuggle | ||
| import Flow::PathGraph | ||
|
|
||
| from | ||
| Flow::PathNode source, | ||
| Flow::PathNode sink | ||
| where Flow::flowPath(source, sink) | ||
| select sink.getNode(), source, sink, | ||
| "Untrusted hostname from $@ flows through a `golang.org/x/net/idna` mapping call (which performs UTS-46 NFKC digit folding) and reaches this hostname sink without a post-IDNA `net.ParseIP` (or `netip.ParseAddr`) recheck on the trailing-dot-stripped value.", | ||
| source.getNode(), "this user-controlled value" | ||
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Uh oh!
There was an error while loading. Please reload this page.