[feature] Native EXPath HTTP Client (java.net.http + Methanol); remove Apache HttpClient#6482
[feature] Native EXPath HTTP Client (java.net.http + Methanol); remove Apache HttpClient#6482joewiz wants to merge 8 commits into
Conversation
Rewrites the HTTP-client side of the integration test suite onto the JDK java.net.http.HttpClient, removing Apache HttpClient from the test scope (the WebDAV tests are migrated separately in the following commit). - AbstractHttpTest: the shared test HTTP infrastructure now builds on java.net.http.HttpClient, exposing newHttpClient(), authenticatedRequest(), basicAuthorizationHeader(), executeForStatus(), executeForStatusAndBody(), assertRequestResponse() and the HttpResponseResult record (no Apache HC). - Integration tests across exist-core, restxq, file and persistentlogin migrated to those helpers; behavior and assertions unchanged. Requests that need preemptive HTTP Basic auth set the Authorization header explicitly. - GetParameterTest sends a multipart/form-data body, so it uses Methanol's MultipartBodyPublisher (Methanol added to the BOM and as an exist-core test dependency; exist-core also now publishes its test-jar for the new module). - Dropped the now-unused Apache HttpClient test dependencies from exist-core, restxq, file and persistentlogin. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ip tests The WebDAV integration tests were the last test code pulling in Apache HttpClient: they drove the server through the milton-client WebDAV library, which depends on Apache HttpClient. Replace them with JDK java.net.http round-trip tests and finish removing the now-unused Milton stack (the WebDAV server itself already runs on Apache Jackrabbit since the Jackrabbit migration). Why delete rather than port the old tests: their WebDAV *protocol* coverage (COPY, MOVE, LOCK/UNLOCK, PROPFIND, DELETE) is already provided, far more thoroughly, by the litmus compliance suite that runs in container CI (exist-docker/src/test/bats/ 04-webdav-litmus.bats -- 98/98: basic 16, copymove 13, props 33, locks 36). The old milton-client JUnit tests were both redundant with litmus and coupled to the milton client library this PR removes. WebDavRoundTripTest is intentionally narrower and complementary: it covers the eXist-specific concern litmus does not -- that content stored over WebDAV round-trips faithfully through eXist's storage and serialization. - Delete the milton-client-based tests (Copy/Delete/Lock/Rename/Replace/Serialization/ StoreAndRetrieve/CData) and the com.ettrema cache + AlwaysBasicPreAuth test helpers. - Add WebDavRoundTripTest with a small WebDavHttpClient helper (JDK java.net.http, PUT/GET/DELETE against /webdav/db). It asserts exact round-trip of an XML DOCTYPE, an XML declaration, a CDATA section, namespace declarations, non-ASCII content, and a binary document. - standalone webapp web.xml: map the WebDAV path to the Jackrabbit ExistWebdavServlet instead of the long-deleted MiltonWebDAVServlet (a dangling reference on develop that the round-trip test now depends on being correct). - Drop the milton-client (and its Apache httpclient/httpcore) dependencies from the webdav pom, the com.bradmcevoy log4j2 logger, and the ettrema dependabot group. - BUILD.md: document running the WebDAV round-trip tests. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds extensions/modules/http-client: a native implementation of the EXPath HTTP Client (http:send-request, namespace http://expath.org/ns/http-client) built on the JDK java.net.http.HttpClient, augmented by Methanol. No Apache HttpClient and no third-party EXPath HTTP library. - The client is a Methanol client with autoAcceptEncoding (advertises Accept-Encoding and transparently decodes gzip/deflate) and a read (inactivity) timeout. - Multipart request bodies are built with Methanol's MultipartBodyPublisher (the JDK client has no multipart support); each http:body part keeps its own Content-Type. - ResponseHandler relies on the client for transfer decoding rather than hand-rolling gzip handling. Includes a self-contained integration test (in-process com.sun.net.httpserver target plus embedded eXist): 70 send-request tests and 43 content-type unit tests, all green. This is the production home of the work prototyped at eXist-db/exist-http-client. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
d6b5b3f to
0496911
Compare
|
[This response was co-authored with Claude Code. -Joe] A note on the WebDAV test change in this PR, since it may raise eyebrows in review: The deleted Their WebDAV protocol coverage (COPY, MOVE, LOCK/UNLOCK, PROPFIND, DELETE) is, however, already provided — and far more thoroughly — by the litmus compliance suite that runs in container CI (
|
|
@joewiz the repeated failures in the docker job are a concern. Exist has errors in the logs upon a clean boot. |
line-o
left a comment
There was a problem hiding this comment.
I bet we have these constants in some package we already depend on.
…p http-client-java Switches the http://expath.org/ns/http-client registration in every conf.xml to the new native module (org.exist.xquery.modules.httpclient.HttpClientModule) and removes the old implementation, which depended on Apache HttpClient and the third-party EXPath http-client-java library. - extensions/expath: delete the old SendRequestFunction, HttpClientModule, EXistResult, EXistTreeBuilder and the now-orphaned org.expath.tools adapters; the module keeps only its Zip functions. Drop the http-client-java, tools-java, apache-mime4j-core and HC4 httpcore dependencies (and the now-unused junit). - exist-distribution: add the new exist-http-client module as a runtime dependency so it ships in the assembled distribution (and container). The old HttpClientModule shipped inside exist-expath; without this, the repointed conf.xml registration fails to load the class on a clean boot (ClassNotFoundException). - restxq: its XQSuite tests resolve http:send-request from the EXPath HTTP client, so swap the exist-expath test dependency for the new exist-http-client module (the client no longer lives in exist-expath). - debuggee: migrate HttpSession's XDEBUG_SESSION form POST from Apache HttpClient to the JDK java.net.http.HttpClient; drop the Apache HttpClient dependency from its pom. No production code uses Apache HttpClient after this change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…e build With the EXPath HTTP client now native (java.net.http + Methanol), the integration tests on the JDK client, debuggee migrated, and the milton-client WebDAV tests replaced, nothing in the code base uses Apache HttpClient. Remove it from the build: - exist-parent BOM: drop the httpcore/httpclient/httpmime/fluent-hc (Apache HC4) managed dependencies and the apache.httpcomponents.* version properties; drop the now-unused milton-api/milton-client/milton-servlet managed dependencies and their version properties (Methanol stays). - exist-installer: drop the removed version properties from the IzPack includeProperties. `git grep org.apache.httpcomponents` over the poms and `import org.apache.hc` / `org.apache.http` over the sources now return nothing. Full reactor build is green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
0496911 to
8217ed6
Compare
Addresses review feedback: replace the remaining magic HTTP status integers in the migrated integration tests with java.net.HttpURLConnection.HTTP_* constants, matching the rest of the migrated suite (most of which already uses them; removing Apache HttpClient's HttpStatus.SC_* had left a few call sites on raw ints). - ControllerTest: 200 -> HTTP_OK, 201 -> HTTP_CREATED - JmxRemoteTest: 200 -> HTTP_OK - WebDavRoundTripTest: 201 -> HTTP_CREATED, 200 -> HTTP_OK Scoped to the tests this PR already touches; a wider sweep of status-code literals across the rest of the codebase is left as a separate change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
[This response was co-authored with Claude Code. -Joe] Thanks for catching it — root-caused and fixed. The clean-boot error was: This was a real packaging bug, not a test-only issue: the old Fixed by adding |
…hentication The EXPath HTTP Client spec (3.3) defines @send-authorization=false (the default) as: send the request without credentials, and only if the server answers with a 401 challenge, resend with the credentials. The native module previously implemented preemptive auth only — it required @send-authorization='true' and otherwise ignored the credentials — so the spec's default auth mode did not work. That is a regression versus both eXist's previous EXPath client and the BaseX reference implementation, which both perform challenge-response. - RequestBuilder: withhold credentials unless @send-authorization is true; add challengeResponse() to answer a 401 WWW-Authenticate challenge, computing a Basic or RFC 2617 Digest (MD5, qop=auth) Authorization header from the request credentials. build() now accepts an explicit Authorization header for the re-send. The body and auth assembly is split into helpers to keep NPath in check. - SendRequestFunction: on a 401 with credentials present and not yet sent, re-send once with the computed Authorization header. - Tests: add basicChallengeResponse and digestChallengeResponse — the test server issues the challenge and validates the Digest response hash end-to-end (72 module tests pass). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
line-o
left a comment
There was a problem hiding this comment.
LGTM!
This will make our lives much easier going forward.
It will be interesting to learn if we get failures in apps making heavy use of http-client.
The remaining codacy warning is in a test class which I would vote to ignore for the moment.
|
Does this PR fix #4256 |
eXist-db#4256) Adds connectionErrorIsCatchableAsExpathHC001, mirroring the reproducer in eXist-db#4256: a send-request to an unreachable host must surface the EXPath error expath-err:HC001 (namespace http://expath.org/ns/error), catchable from XQuery, rather than the raw org.expath.httpclient.HttpClientException the old Apache-based client raised. The native module maps connection/IO failures to HC001, so the issue is fixed; this test pins it. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
[This response was co-authored with Claude Code. -Joe] Yes — it fixes #4256. The old client raised a raw On apps that make heavy use of Agreed on the remaining Codacy warning — it's the in-process test server's setup method; happy to leave it for now (it's actually lower than before this PR's additions). Thanks for the review! |
| <artifactId>jsr305</artifactId> | ||
| </dependency> | ||
|
|
||
| <dependency> |
There was a problem hiding this comment.
did not expect unit to disappear here. No tests remaining?
There was a problem hiding this comment.
[This response was co-authored with Claude Code. -Joe]
Right — HttpClientTest was the only test in exist-expath, and it exercised the old Apache-HttpClient-based EXPath client that this PR removes. With it gone, there are no test classes or test runner left in the module (only the log4j2.xml/conf.xml test resources), so junit is genuinely unused here. The HTTP client's tests now live in the new exist-http-client module.
Those leftover log4j2.xml/conf.xml test resources are now orphaned too — happy to strip them for a fully clean src/test, or leave them in case the module grows tests again. Whichever you'd prefer.
| */ | ||
| package org.exist.xquery.modules.httpclient; | ||
|
|
||
| import com.sun.net.httpserver.HttpServer; |
There was a problem hiding this comment.
I'd stay out of the com.sun packages. These are not guaranteed to be present.
a better route and sustainable route is to use wire mock.
There was a problem hiding this comment.
[This response was co-authored with Claude Code. -Joe]
Fair flag — sharing what we found so you and the devs can steer it. com.sun.net.httpserver here is the exported public API of the JDK jdk.httpserver module (documented since JDK 9), not a sun.* internal, so it is present in every standard JDK; it let the tests stand up a tiny in-process target with no extra dependency. There's no existing precedent in eXist's own tests either way — neither com.sun.net.httpserver nor WireMock is used today. WireMock would be the more conventional choice but adds a test dependency and a rewrite of the in-process server. No strong preference from our side; happy to go with whatever you and the team settle on as the house convention for this kind of HTTP test target.
|
I see that on some locations "preemptive authorization" is applied. For running the tests this makes sense as this will potentially save a handshake, But I am not sure if this is switched on for every outbound HTTPcall, as this might leak sensitive data. |
|
About performance/heavy load as asked by @line-o ; I need to deep dive here. If we are able to use the connection pool and have good (default) Executor management, I expect it will work out well. But the proof is in the pudding. |
|
[This response was co-authored with Claude Code. -Joe] On preemptive authorization: it's opt-in per request, not applied to every outbound call. Credentials are sent preemptively only when the request sets On performance / connection pooling and Executor management: good idea, and it looks readily achievable — the per-request client construction is the single seam where a shared, pooled client (with sensible Executor defaults, proxy, etc.) would slot in. Probably cleanest as a follow-on, but we're happy to fold it into this PR if you'd prefer it here. And agreed — the proof is in the pudding; glad to benchmark once there's a pooled path to measure. |
[This PR was co-authored with Claude Code. -Joe]
Summary
Replaces eXist's EXPath HTTP Client (
http:send-request) with a native implementation built on the JDKjava.net.http.HttpClient, augmented by Methanol (thanks to @dizzzz for the suggestion to investigate Methanol), and removes Apache HttpClient from the code base entirely — production and tests. Along the way it brings the client closer to the EXPath spec: full authentication (preemptive and challenge-response, Basic and Digest) and spec-compliant error codes.After this PR,
git grep org.apache.httpcomponentsover the poms andimport org.apache.http/import org.apache.hcover the sources both return nothing.This is the rewrite requested in #5771, and it also fixes #4256 (non-spec-compliant errors). It supersedes #6473 (which migrated to Apache HttpClient 5 as an intermediate step) and #6474 (the draft HC5 EXPath prerelease) — both can be closed once this lands. With the native module now living in core, the eXist-db/exist-http-client prototype repo can also be archived after merge.
Why
eXist's EXPath HTTP client depended on Apache HttpClient (HC4) and the third-party
http-client-javaEXPath library, and the integration test suite drove the server through HC4's fluent client (plus, for WebDAV, themilton-clientlibrary, which itself pulls in Apache HttpClient). Methanol is a thin, Apache-2.0 augmentation of the JDK client: it adds transparent gzip/deflate decoding, a read (inactivity) timeout,MediaType, andMultipartBodyPublisher— the pieces the bare JDK client lacks — while pulling in no Apache HttpClient.What changed
Native EXPath HTTP Client module (
extensions/modules/http-client)org.exist.xquery.modules.httpclientmodule implementinghttp:send-request(namespacehttp://expath.org/ns/http-client) onjava.net.http.HttpClient. Thehttp:requestelement, namespace, and function signatures are unchanged, so existing queries are unaffected.autoAcceptEncoding(advertisesAccept-Encodingand transparently decodes gzip/deflate) and a read timeout.MultipartBodyPublisher(the JDK client has no multipart support); eachhttp:bodypart keeps its ownContent-Type.Authentication (per EXPath spec §3.3)
@send-authorization='true'sends Basic credentials preemptively.Authorizationheader — Basic or RFC 2617 Digest (MD5,qop=auth). This challenge-response mode was missing from the initial port; it matches the previous eXist client and the BaseX reference implementation.Error handling (fixes #4256)
expath-err:HC001, timeouts asexpath-err:HC006, in the EXPath error namespace (http://expath.org/ns/error) — catchable from XQuery, rather than the raworg.expath.httpclient.HttpClientExceptionthe old client raised.Old EXPath client removed
org.expath.existHTTP client (SendRequestFunction,HttpClientModule, theEXistResult/EXistTreeBuilderadapters, and the orphanedorg.expath.toolsmodel classes). Theexist-expathmodule keeps only its Zip functions. Dropshttp-client-java,tools-java,apache-mime4j-coreand the HC4httpcoredependency.debuggee'sHttpSession(theXDEBUG_SESSIONform POST) is migrated to the JDK client.conf.xmlre-pointshttp://expath.org/ns/http-clientto the native module.Tests migrated off Apache HttpClient
AbstractHttpTestnow builds onjava.net.http.HttpClient; the integration tests across exist-core, restxq, file and persistentlogin use it.GetParameterTest(multipart) uses Methanol'sMultipartBodyPublisher.milton-client. They are replaced with JDKjava.net.httpround-trip tests (WebDavRoundTripTest), and the now-unused Milton stack is removed (the WebDAV server already runs on Apache Jackrabbit). The standalone webappweb.xmlis corrected to map the WebDAV path to the JackrabbitExistWebdavServlet(develop still referenced the long-deletedMiltonWebDAVServlet).Build
exist-distributionbundles the newexist-http-clientmodule (runtime dependency) so the repointedconf.xmlresolves the module class at runtime.exist-parentBOM: drop the Apache HttpClient (HC4) and Milton managed dependencies and their version properties; add Methanol.Spec references
http:send-request, thehttp:requestelement, the response format, authentication (§3.3), and error codes (HC001/HC006).Test plan
http:send-requestintegration tests (in-processcom.sun.net.httpservertarget + embedded eXist) and 43ContentTypeHelperunit tests — all green. Coverage includes HTTP methods, request/response bodies, multipart, gzip/deflate decoding, Content-Type classification (XML/HTML/text/JSON/binary), redirects, timeouts (HC006), status codes, multiple response headers, and authentication: Basic preemptive, Basic and Digest challenge-response (the test server validates the Digest response hash end-to-end), and connection errors catchable asexpath-err:HC001(EXPath HttpClient errors are not spec compliant #4256).WebDavRoundTripTestgreen — exact round-trip of XML DOCTYPE, XML declaration, CDATA, namespaces, non-ASCII content, and a binary document, over the JDK client against the Jackrabbit servlet.clean install) green; full unit suite green; container CI (incl. litmus WebDAV compliance) exercised.Notes for reviewers
NPathComplexityon the test classSendRequestFunctionTest.startHttpServer()(the in-process test server's endpoint registration). It is below where it started before this PR's auth additions; per review it can be ignored for now.milton-clientis an Apache HttpClient consumer, so retiring those tests is required. Their WebDAV protocol coverage is provided by the litmus compliance suite in container CI (98/98);WebDavRoundTripTestcovers the complementary eXist content-fidelity angle.http-client-javaremoves eXist's dependency on the upstream EXPath HTTP client, so merging this removes the eXist-related motivation from Cut a release from main (Apache HttpComponents 5) expath/expath-http-client-java#59 (that issue can remain open on its own merits).Closes #5771
Closes #4256