[test] Add cross-surface resource-naming conformance ratchet #6450
joewiz wants to merge 3 commits
First increment of the naming conformance harness (resource-naming tasking, PR A).
Boots ExistWebServer (REST + WebDAV/Jackrabbit + XML-RPC on one port), stores a corpus
of "awkward" resource names through WebDAV (raw HTTP PUT, explicit on-the-wire
encoding), reports what name actually landed in storage (via eXist's native REST
collection listing — the WebDAV test conf.xml registers no XQuery modules, so the
harness depends on none), and reads each back by the requested name via WebDAV and REST.
It prints a matrix of current behavior (visible in the CI log) and enforces a RATCHET:
the set of names that fail to round-trip cross-surface must exactly equal the documented
KNOWN_FAILURES allowlist. So merging this immediately guards every already-correct name
against regression; when a naming fix lands and a listed name starts round-tripping, the
test fails and tells you to remove it from the allowlist (locking in the improvement);
and a name that is not listed can never silently regress. Verified both failure
directions ("regressed" and "now round-trips — remove from KNOWN_FAILURES").
The harness is module- and WebDAV-client-independent: the collection is created with a
REST PUT (auto-create), probes are raw HTTP, and probe content is valid XML (corpus
names end in .xml, which eXist parses on store).
Current allowlist (5 names that don't yet round-trip cross-surface): plus, at,
ampersand, parens, apostrophe — i.e. several sub-delim/reserved characters. XML-RPC and
the full N×N cross-surface matrix are TODO follow-ups noted in the class Javadoc.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
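The ratchet rule described in this commit message can be sketched as a plain set comparison. This is a minimal illustration, not the harness code: the class name `RatchetSketch` and the method `check` are hypothetical, and the real test presumably reports mismatches through JUnit assertions rather than a returned string.

```java
import java.util.Set;
import java.util.TreeSet;

public class RatchetSketch {

    /**
     * Compares the names that failed in this run with the documented allowlist.
     * Returns an empty string when both sets are equal, otherwise a message
     * naming every mismatch in either direction.
     */
    static String check(final Set<String> knownFailures, final Set<String> actualFailures) {
        // Failed now, but not on the allowlist: a regression.
        final Set<String> regressed = new TreeSet<>(actualFailures);
        regressed.removeAll(knownFailures);

        // On the allowlist, but round-tripped now: the allowlist is stale.
        final Set<String> nowPassing = new TreeSet<>(knownFailures);
        nowPassing.removeAll(actualFailures);

        final StringBuilder msg = new StringBuilder();
        if (!regressed.isEmpty()) {
            msg.append("regressed: ").append(regressed).append(' ');
        }
        if (!nowPassing.isEmpty()) {
            msg.append("now round-trips, remove from KNOWN_FAILURES: ").append(nowPassing);
        }
        return msg.toString().trim();
    }

    public static void main(final String[] args) {
        // Hypothetical allowlist and run result, for illustration only.
        System.out.println(check(Set.of("plus", "at"), Set.of("plus", "space")));
    }
}
```

Requiring exact equality (rather than "actual failures are a subset of the allowlist") is what makes the check fail in both directions.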
Composing the WebDAV URL with the multi-argument URI constructor would make the custom encoding logic obsolete. Call sites such as
final URL url = URI.create(webdavBase() + encodePathSegments(dbPath)).toURL();
could then be replaced with
final URL url = createWebdavUrl(dbPath);
using this method:
private URL createWebdavUrl(final String dbPath) throws URISyntaxException, MalformedURLException {
    // The multi-argument constructor quotes illegal path characters itself.
    return new URI("http", null, "localhost", existWebServer.getPort(), "/webdav" + dbPath, null, null).toURL();
}
Also, I would prefer to use the new HttpClient API introduced with Java 11.
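For illustration, a WebDAV PUT built with the Java 11 API might look like the sketch below. The class name `WebdavPutSketch`, the method `buildPut`, the port 8080, and the `Content-Type` value are assumptions made for the example. The request is only built, not sent; sending it would need an `HttpClient` and a running server.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

public class WebdavPutSketch {

    /** Builds (but does not send) a PUT request for a database path under /webdav. */
    static HttpRequest buildPut(final int port, final String dbPath, final String body) {
        final URI uri;
        try {
            // The multi-argument constructor quotes illegal path characters (e.g. a space).
            uri = new URI("http", null, "localhost", port, "/webdav" + dbPath, null, null);
        } catch (final URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
        return HttpRequest.newBuilder(uri)
                .header("Content-Type", "application/xml")
                .PUT(HttpRequest.BodyPublishers.ofString(body, StandardCharsets.UTF_8))
                .build();
    }

    public static void main(final String[] args) {
        final HttpRequest request = buildPut(8080, "/db/test/a b.xml", "<probe/>");
        System.out.println(request.method() + " " + request.uri());
        // To send: HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())
    }
}
```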
Address review feedback on eXist-db#6450: switch the harness from HttpURLConnection to java.net.http.HttpClient, and build request URLs with the multi-argument java.net.URI constructor instead of a custom path-segment encoder.

This changes what goes on the wire for RFC 3986 sub-delimiters: the URI constructor leaves + @ & ( ) ' literal in the path (as a browser or curl does), whereas the previous URLEncoder-based helper percent-encoded them. Under conventional encoding, every corpus name -- including those five plus the non-ASCII names -- round-trips cross-surface between WebDAV and REST. The five "failures" the earlier revision recorded were artifacts of percent-encoding sub-delimiters, not of eXist's storage.

KNOWN_FAILURES is therefore now empty: the ratchet guards the all-green state and fails if any corpus name ever stops round-tripping cross-surface.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
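The encoding difference this commit describes can be reproduced in isolation. In the sketch below, `viaUrlEncoder` is a hypothetical stand-in for the earlier helper (its actual source is not shown in this conversation), and the class name, port, and `/webdav/db/` prefix are assumptions for the example.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SubDelimEncodingSketch {

    /** Raw path as produced by the multi-argument URI constructor. */
    static String viaUriConstructor(final String leafName) {
        try {
            return new URI("http", null, "localhost", 8080, "/webdav/db/" + leafName, null, null)
                    .getRawPath();
        } catch (final URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Hypothetical URLEncoder-based encoding: form-encode, then turn '+' (space) into %20. */
    static String viaUrlEncoder(final String leafName) {
        return "/webdav/db/" + URLEncoder.encode(leafName, StandardCharsets.UTF_8).replace("+", "%20");
    }

    public static void main(final String[] args) {
        for (final String name : new String[] {"a+b.xml", "a@b.xml", "a&b.xml", "(a).xml", "it's.xml"}) {
            System.out.println(viaUriConstructor(name) + "  vs  " + viaUrlEncoder(name));
        }
    }
}
```

The constructor leaves all five characters literal because java.net.URI treats them as legal in a path, while URLEncoder implements form encoding and escapes everything outside letters, digits, and `. - * _`.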
private static final String MARKER = "naming-probe-content";
private static final String CONTENT = "<probe>" + MARKER + "</probe>";

private static final HttpClient HTTP = HttpClient.newHttpClient();
Create this within the createTestCollection() method, as the client needs to be closed at the end of the test (within an @AfterClass-annotated method).
Address review feedback on eXist-db#6450: create the HttpClient in createTestCollection() (@BeforeClass) and close it in removeTestCollection() (@AfterClass), rather than holding it in a static final field. HttpClient is AutoCloseable as of Java 21, so it is closed once after the test class runs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
[This response was co-authored with Claude Code. -Joe]

Thanks Patrick, adopted both: the harness now uses

Your suggestion turned out to do more than simplify the code: it corrected the test. The

So those five "failures" were artifacts of the harness percent-encoding sub-delimiters, not eXist mis-storing names. eXist may store a percent-encoded form (the

(One genuinely interesting residual the percent-encoded run hinted at:

Pushed as
[This PR was co-authored with Claude Code. -Joe]
Summary
Adds a test that pins how eXist handles "awkward" resource names across API surfaces: the missing coverage for the resource-naming-stability work (issues #3795, #3665, #1824, #5299, #1612). It changes no behavior. It prints the current behavior as a matrix and enforces a ratchet: the set of names that fail to round-trip cross-surface must exactly equal a documented KNOWN_FAILURES allowlist.
The effect: the moment this merges, every name that already works is guarded against regression, and as each naming fix lands you remove an entry from the allowlist to lock the improvement in. The test can neither silently regress (an unlisted name that breaks fails the build) nor silently drift (a listed name that starts working fails the build and tells you to remove it).
There is no cross-surface special-character test in the suite today; this is the first.
What it does
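For a corpus of awkward leaf names (space, +, literal %, literal %20, #, @, &, parentheses, apostrophe, café, Cyrillic, CJK), per name it:
- stores the name through WebDAV (raw HTTP PUT, so the on-the-wire encoding is explicit);
- reports what name actually landed in storage (via the REST collection listing);
- reads it back by the requested name via WebDAV and via REST;
- enforces the ratchet (the set of names that fail to round-trip cross-surface must equal KNOWN_FAILURES).
Design notes:
- The WebDAV test conf.xml registers no builtin-modules, so the harness uses the REST collection listing rather than xmldb:*.
- The collection is created with a REST PUT (auto-create) and every probe is raw HTTP.
- Probe content is valid XML, because corpus names end in .xml, which eXist parses on store.
What the matrix shows today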
So today: every name with a non-unreserved character is stored percent-encoded; WebDAV self-reads always round-trip; reading by the requested name via REST fails specifically for sub-delim/reserved characters (+, @, &, ( / ), '). Those five are the current KNOWN_FAILURES allowlist.
What devs see in CI
Tests run: 1, Failures: 0.
Scope / follow-ups (noted in the class Javadoc)
/xmlrpc).
Where it lives
In the extensions/webdav test module: the only place that can drive REST and WebDAV and XML-RPC against one database in one test (module dependency direction means exist-core can't reach the WebDAV extension). A dedicated cross-surface integration-test module would be the alternative if preferred.
Test plan
Tests run: 1, Failures: 0); matrix prints to the build log.
Related
+decode), xmldb:rename() function fails if the collection name contains spaces #5299 (spaces), URLs with non-ascii characters cause exception #1612 (non-ASCII).