Commit 0db826a

Merge branch '3.7-dev' into 3.8-dev
Note that this moves the session reuse entry from 3.8.1 to 3.7.6 where it belongs.
2 parents e0a65c7 + 8e286c5 commit 0db826a

5 files changed

Lines changed: 250 additions & 133 deletions

docs/src/dev/provider/index.asciidoc

Lines changed: 10 additions & 0 deletions
@@ -1196,6 +1196,16 @@ in 3.5.0, but has been added back as of 3.5.2. Servers wishing to be compatible
11961196
this message (which is what Gremlin Server does as of 3.5.0). Drivers wishing to be compatible with servers prior to
11971197
3.3.11 may continue to send the message on calls to `close()`, otherwise such code can be removed.
11981198
1199+
NOTE: As of 3.7.6/3.8.1, the session lifecycle contract described above where sessions are cleaned up only when the
1200+
underlying connection closes can be modified by the `closeSessionPostGraphOp` server setting. When enabled, the
1201+
server will close a session after a successful TX_COMMIT or TX_ROLLBACK bytecode request, independent of the
1202+
connection state. This allows multiple short-lived transaction sessions to share a single WebSocket connection over
1203+
time, which is required by the Java GLV's `reuseConnectionsForSessions` option. This setting defaults to `false`
1204+
to preserve the established 3.5.0 behavior. Providers implementing session support should be aware that when this
1205+
setting is enabled, sessions and connections no longer have a one-to-one lifecycle relationship, as a single connection
1206+
may host many sessions sequentially. Providers wishing to support this capability are encouraged to use the same
1207+
`closeSessionPostGraphOp` configuration name for consistency across the TinkerPop ecosystem.
1208+
11991209
**`authentication` operation arguments**
12001210
12011211
[width="100%",cols="2,2,9",options="header"]

docs/src/reference/gremlin-applications.asciidoc

Lines changed: 1 addition & 1 deletion
@@ -1016,7 +1016,7 @@ The following table describes the various YAML configuration options that Gremli
10161016
|authorization.authorizer |The fully qualified classname of an `Authorizer` implementation to use. |_none_
10171017
|authorization.config |A `Map` of configuration settings to be passed to the `Authorizer` when it is constructed. The settings available are dependent on the implementation. |_none_
10181018
|channelizer |The fully qualified classname of the `Channelizer` implementation to use. A `Channelizer` is a "channel initializer" which Gremlin Server uses to define the type of processing pipeline to use. By allowing different `Channelizer` implementations, Gremlin Server can support different communication protocols (e.g. WebSocket). |`WebSocketChannelizer`
1019-
|closeSessionPostGraphOp |Controls whether a `Session` will be closed by the server after a successful TX_COMMIT or TX_ROLLBACK bytecode request. |_false_
1019+
|closeSessionPostGraphOp |Controls whether a `Session` will be closed by the server after a successful TX_COMMIT or TX_ROLLBACK bytecode request. This setting should be enabled when clients use the `reuseConnectionsForSessions` option (see <<gremlin-java-connection-reuse>>), which allows transaction sessions to share pooled connections. Without this setting, sessions opened by `reuseConnectionsForSessions` will not be cleaned up after commit or rollback and will remain open on the server until the session timeout expires, leading to unnecessary resource consumption. Defaults to `false` for backward compatibility. |_false_
10201020
|enableAuditLog |The `AuthenticationHandler`, `AuthorizationHandler` and processors can issue audit logging messages with the authenticated user, remote socket address and requests with a gremlin query. For privacy reasons, the default value of this setting is false. The audit logging messages are logged at the INFO level via the `audit.org.apache.tinkerpop.gremlin.server` logger, which can be configured using the `logback.xml` file. |_false_
10211021
|graphManager |The fully qualified classname of the `GraphManager` implementation to use. A `GraphManager` is a class that adheres to the TinkerPop `GraphManager` interface, allowing custom implementations for storing and managing graph references, as well as defining custom methods to open and close graph instantiations. To prevent Gremlin Server from starting when all graphs fail, the `CheckedGraphManager` can be used.|`DefaultGraphManager`
10221022
|graphs |A `Map` of `Graph` configuration files where the key of the `Map` becomes the name to which the `Graph` will be bound and the value is the file name of a `Graph` configuration file. |_none_

docs/src/reference/gremlin-variants.asciidoc

Lines changed: 113 additions & 9 deletions
@@ -888,6 +888,112 @@ Please see the link:https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/
888888
Transactions with Java are best described in <<transactions,The Traversal - Transactions>> section of this
889889
documentation as Java covers both embedded and remote use cases.
890890
891+
[[gremlin-java-connection-reuse]]
892+
=== Connection Reuse for Transactions
893+
894+
By default, each call to `g.tx()` opens a new dedicated WebSocket connection for the session that backs the
895+
transaction. For workloads that issue many short-lived transactions, the overhead of repeatedly establishing and
896+
tearing down WebSocket connections can become significant, particularly when the client and server are separated by
897+
network latency.
898+
899+
The `reuseConnectionsForSessions` option on `Cluster.Builder` changes this behavior so that transaction sessions
900+
borrow connections from the existing connection pool instead of creating dedicated ones. When a transaction commits or
901+
rolls back, the borrowed connection is returned to the pool and becomes available for the next transaction.
902+
903+
==== Enabling Connection Reuse
904+
905+
This feature requires configuration on both the client and the server (if using Gremlin Server).
906+
907+
On the client, enable `reuseConnectionsForSessions` when building the `Cluster`:
908+
909+
[source,java]
910+
----
911+
Cluster cluster = Cluster.build("localhost")
912+
.reuseConnectionsForSessions(true)
913+
.create();
914+
GraphTraversalSource g = traversal().withRemote(DriverRemoteConnection.using(cluster));
915+
----
916+
917+
The same setting can be specified in a YAML configuration file used with `Cluster.open()`:
918+
919+
[source,yaml]
920+
----
921+
reuseConnectionsForSessions: true
922+
----
923+
924+
On servers based on Gremlin Server, enable `closeSessionPostGraphOp` so that sessions are closed immediately after
925+
a commit or rollback completes:
926+
927+
[source,yaml]
928+
----
929+
# gremlin-server.yaml
930+
closeSessionPostGraphOp: true
931+
----
932+
933+
IMPORTANT: Both settings must be configured together. If `reuseConnectionsForSessions` is enabled on the client but
934+
`closeSessionPostGraphOp` is not enabled on the server, sessions will not be cleaned up after commit or rollback.
935+
These leaked sessions will accumulate on the server until the configured session timeout is reached, consuming server
936+
resources unnecessarily.
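
The buildup described in the note above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyServer` and its methods are hypothetical names invented for illustration, they are not Gremlin Server classes, and the session timeout is deliberately not modeled. The model only shows why the number of open sessions grows with each transaction when the connection stays open and the server does not close the session after a commit.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

/** Toy model of server-side session bookkeeping; not Gremlin Server code. */
final class SessionLeakSketch {

    /** Hypothetical stand-in for a server that tracks open sessions. */
    static final class ToyServer {
        private final boolean closeSessionPostGraphOp;
        private final Set<String> openSessions = new HashSet<>();

        ToyServer(boolean closeSessionPostGraphOp) {
            this.closeSessionPostGraphOp = closeSessionPostGraphOp;
        }

        /** A transaction begins by opening a session under a fresh id. */
        String begin() {
            String sessionId = UUID.randomUUID().toString();
            openSessions.add(sessionId);
            return sessionId;
        }

        /** On commit, the session is closed only when the setting is enabled. */
        void commit(String sessionId) {
            if (closeSessionPostGraphOp) {
                openSessions.remove(sessionId);
            }
            // Otherwise the session lingers: a reused connection is never closed,
            // so nothing signals the end of the session (timeout not modeled here).
        }

        int openSessionCount() {
            return openSessions.size();
        }
    }

    /** Runs begin/commit cycles and returns how many sessions are left open. */
    static int sessionsLeftOpen(boolean closeSessionPostGraphOp, int transactions) {
        ToyServer server = new ToyServer(closeSessionPostGraphOp);
        for (int i = 0; i < transactions; i++) {
            server.commit(server.begin());
        }
        return server.openSessionCount();
    }

    public static void main(String[] args) {
        System.out.println("setting disabled: " + sessionsLeftOpen(false, 100)); // 100
        System.out.println("setting enabled:  " + sessionsLeftOpen(true, 100));  // 0
    }
}
```

In a real deployment the lingering sessions would eventually expire at the configured timeout, so the count is bounded by the transaction rate multiplied by the timeout rather than growing without limit.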
937+
938+
NOTE: Some Remote Gremlin Providers may handle session cleanup automatically and may not require explicit
939+
`closeSessionPostGraphOp` configuration. Consult the provider's documentation to determine whether this behavior is
940+
enabled by default, requires explicit configuration, or is unsupported.
941+
942+
==== Usage
943+
944+
The transaction API itself does not change when connection reuse is enabled. The standard pattern of `begin`, mutate,
945+
and `commit` or `rollback` applies:
946+
947+
[source,java]
948+
----
949+
Cluster cluster = Cluster.build("localhost")
950+
.reuseConnectionsForSessions(true)
951+
.create();
952+
GraphTraversalSource g = traversal().withRemote(DriverRemoteConnection.using(cluster));
953+
954+
GraphTraversalSource gtx = g.tx().begin();
955+
gtx.addV("person").property("name", "marko").iterate();
956+
gtx.addV("software").property("name", "lop").iterate();
957+
gtx.tx().commit();
958+
----
959+
960+
After `commit()` or `rollback()`, the connection is returned to the pool. A subsequent call to `g.tx().begin()` will
961+
borrow a connection from the pool again, potentially reusing the same underlying WebSocket connection.
962+
963+
A `GraphTraversalSource` obtained from `begin()` cannot be reused after its transaction has been committed or rolled
964+
back. Attempting to do so will result in an exception. A fresh call to `g.tx().begin()` is required for each new
965+
transaction.
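
The borrow-and-return behavior and the single-use rule for the transaction's `GraphTraversalSource` can be sketched with a toy model. Everything here is an assumption made for illustration: `ToyPool` and `ToyTx` are hypothetical names, the pool lends each connection exclusively (the real driver may share a connection among several requests), and `IllegalStateException` is a placeholder because the text above does not name the exception type the driver throws.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of borrowing pooled connections for transactions; not driver code. */
final class BorrowedConnectionSketch {

    /** Hypothetical pool that hands out connection ids and takes them back. */
    static final class ToyPool {
        private final Deque<Integer> idle = new ArrayDeque<>();

        ToyPool(int size) {
            for (int id = 0; id < size; id++) {
                idle.addLast(id);
            }
        }

        /** Plays the role of g.tx().begin(): borrows a connection for a new transaction. */
        ToyTx begin() {
            Integer connectionId = idle.pollFirst();
            if (connectionId == null) {
                throw new IllegalStateException("no idle connection in the toy pool");
            }
            return new ToyTx(this, connectionId);
        }

        void giveBack(int connectionId) {
            idle.addFirst(connectionId);
        }
    }

    /** Hypothetical transaction handle that is usable only until it finishes. */
    static final class ToyTx {
        private final ToyPool pool;
        final int connectionId;
        private boolean finished;

        ToyTx(ToyPool pool, int connectionId) {
            this.pool = pool;
            this.connectionId = connectionId;
        }

        void mutate() {
            ensureOpen();
        }

        void commit() {
            finish();
        }

        void rollback() {
            finish();
        }

        /** Marks the transaction finished and returns the connection to the pool. */
        private void finish() {
            ensureOpen();
            finished = true;
            pool.giveBack(connectionId);
        }

        private void ensureOpen() {
            if (finished) {
                throw new IllegalStateException("transaction already committed or rolled back");
            }
        }
    }

    public static void main(String[] args) {
        ToyPool pool = new ToyPool(1);
        ToyTx first = pool.begin();
        first.mutate();
        first.commit();
        ToyTx second = pool.begin(); // a fresh begin() is needed for each transaction
        System.out.println(first.connectionId == second.connectionId); // true
        second.rollback();
    }
}
```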
966+
967+
==== Concurrent Transactions
968+
969+
Multiple transactions can be open simultaneously. Each transaction gets its own server-side session regardless of
970+
whether the underlying connections are shared. Because the underlying connection is borrowed rather than created, other
971+
settings on the `Cluster` such as `minConnectionPoolSize` and `maxSimultaneousUsagePerConnection` will have an effect
972+
on how connections are borrowed. These settings may need to be tuned when many transactions run concurrently.
973+
974+
==== Restrictions
975+
976+
Connection reuse for transactions has the following restrictions:
977+
978+
* It is designed for short-lived transaction sessions that follow the begin/mutate/commit-or-rollback pattern. It
979+
should not be used for classic long-running sessions such as those used with a remote console. For long-running
980+
sessions, use the standard `cluster.connect(sessionId)` approach described in the
981+
<<sessions,Considering Sessions>> section.
982+
* It is not compatible with `HttpChannelizer`. Attempting to call `tx()` when the driver is configured with
983+
`HttpChannelizer` will throw an `IllegalStateException`. This restriction applies regardless of the
984+
`reuseConnectionsForSessions` setting.
985+
986+
==== When to Use
987+
988+
Connection reuse provides the greatest benefit when:
989+
990+
* Network latency between client and server is significant (e.g. cross-region deployments).
991+
* Transactions are lightweight (few operations per transaction).
992+
* Many short-lived transactions are issued in sequence or concurrently.
993+
994+
For local deployments or transactions that perform substantial graph mutations, the connection setup overhead is a
995+
smaller proportion of the total transaction time and the benefit is correspondingly smaller.
996+
891997
[[gremlin-java-serialization]]
892998
=== Serialization
893999
@@ -2941,15 +3047,13 @@ therefore cardinality functions that take a value like `list()`, `set()`, and `s
29413047
[[gremlin-python-limitations]]
29423048
=== Limitations
29433049
2944-
* Traversals that return a `Set` *might* be coerced to a `List` in Python. In the case of Python, number equality
2945-
is different from JVM languages which produces different `Set` results when those types are in use. When this case
2946-
is detected during deserialization, the `Set` is coerced to a `List` so that traversals return consistent
2947-
results within a collection across different languages. If a `Set` is needed then convert `List` results
2948-
to `Set` manually.
2949-
* Traversals that return a `Set` containing non-hashable items, such as `Dictionary`, `Set` and `List`, will be coerced
2950-
into a `List` during deserialization. Python requires set elements to be hashable, for which Gremlin does not. If a
2951-
`Set` is needed, convert elements to hashable equivalents manually (e.g. `dict` to `HashableDict`, `list` to `tuple`,
2952-
`set` to `frozenset`).
3050+
* Traversals that return a `Set` may be coerced to a `List` in Python in two cases. First, when the `Set` contains
3051+
mixed numeric types (e.g. `int` and `float`), because Python number equality differs from the JVM — a Java `Set` of
3052+
`[1, 1.0d]` has two elements, but Python considers `1 == 1.0` and would collapse them to one, so the `Set` is coerced to
3053+
a `List` to preserve all elements consistently across languages. Second, when the `Set` contains non-hashable items such
3054+
as `Dictionary`, `Set`, or `List`, because Python requires set elements to be hashable while Gremlin does not, the `Set`
3055+
is also coerced to a `List`. For this case, if a `Set` is needed, convert elements to hashable equivalents manually
3056+
(e.g. `dict` to `HashableDict`, `list` to `tuple`, `set` to `frozenset`).
29533057
* Gremlin is capable of returning `Dictionary` results that use non-hashable keys (e.g. Dictionary as a key) and Python
29543058
does not support that at a language level. Using GraphSON 3.0 or GraphBinary (after 3.5.0) makes it possible to return
29553059
such results. In all other cases, Gremlin that returns such results will need to be re-written to avoid that sort of

docs/src/upgrade/release-3.7.x.asciidoc

Lines changed: 126 additions & 1 deletion
@@ -61,7 +61,7 @@ Gremlin Javascript now supports Node 22 and 24 alongside Node 20.
6161
6262
Gremlin Go has been upgraded to Go version 1.25.
6363
64-
==== Python Set Deserialization with Non-Hashable Elements
64+
==== Python Set-to-List Fallback
6565
6666
Traversals that return a `Set` containing non-hashable items (such as `Dictionary`, `Set`, or `List`) previously caused
6767
a `TypeError` during deserialization in Gremlin-Python. These results are now coerced to a `List` to avoid errors. This
@@ -70,10 +70,135 @@ Python hashable types manually (e.g. `dict` to `HashableDict`, `list` to `tuple`
7070
7171
See: link:https://issues.apache.org/jira/browse/TINKERPOP-3232[TINKERPOP-3232]
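
The Gremlin-Python Limitations entry changed in this commit also states that a Java `Set` of `[1, 1.0d]` has two elements, whereas Python treats `1 == 1.0` as equal and would collapse them. The JVM side of that claim can be checked with a minimal, standard-library-only sketch. The class and method names are hypothetical and chosen only for this illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Shows that a Java Set keeps Integer 1 and Double 1.0 as distinct elements. */
final class MixedNumericSetSketch {

    static Set<Object> mixedNumericSet() {
        Set<Object> values = new LinkedHashSet<>();
        values.add(1);    // autoboxed to Integer
        values.add(1.0d); // autoboxed to Double
        return values;
    }

    public static void main(String[] args) {
        // Integer.equals(Double) is false, so both values are retained.
        System.out.println(mixedNumericSet().size()); // 2
    }
}
```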
7272
73+
==== Remote Transaction Improvements
74+
75+
The Java driver now supports reusing existing pooled WebSocket connections for session-based requests rather than
76+
establishing a dedicated connection per session. This behavior is controlled by the `Cluster.Builder` option
77+
`reuseConnectionsForSessions`, which defaults to `false`.
78+
79+
When enabled, a `Client.SessionedChildClient` will attempt to borrow a connection from the connection pool of a standard
80+
`Client` rather than opening its own WebSocket connection. This avoids the overhead of the TCP handshake and WebSocket
81+
upgrade for each session, which can be significant when issuing many short-lived transactions.
82+
83+
[source,java]
84+
----
85+
// Enable connection reuse for sessions
86+
Cluster cluster = Cluster.build(host)
87+
.reuseConnectionsForSessions(true)
88+
.create();
89+
----
90+
91+
This feature was designed specifically for use with remote transactions, where sessions are short-lived and terminate
92+
after a `commit()` or `rollback()`. It should not be used for classic long-running session use cases where a session
93+
is used for purposes other than transactions, such as a remote console.
94+
95+
===== Server Configuration
96+
97+
When using `reuseConnectionsForSessions`, the server should be configured to close sessions immediately after a graph
98+
operation such as `commit()` or `rollback()` completes. Without this behavior, sessions may remain open until the session
99+
timeout expires, potentially leading to a buildup of idle sessions on the server side.
100+
101+
Some remote graph providers handle this automatically and require no additional configuration. For the reference Gremlin
102+
Server, this is controlled by the `closeSessionPostGraphOp` setting, which should be set to `true`. Users of other graph
103+
providers should consult their provider's documentation to determine whether this behavior is enabled by default,
104+
requires explicit configuration or is unsupported.
105+
106+
[source,yaml]
107+
----
108+
# gremlin-server.yaml
109+
closeSessionPostGraphOp: true
110+
----
111+
112+
IMPORTANT: Failing to enable `closeSessionPostGraphOp` on the server when using `reuseConnectionsForSessions` on the
113+
client will result in sessions that are not properly cleaned up. These leaked sessions will accumulate until the
114+
configured `sessionLifetimeTimeout` is reached, consuming server resources unnecessarily.
115+
116+
===== Performance
117+
118+
Performance was measured with an ad-hoc benchmark application. The application executes a configurable number of
119+
complete transaction lifecycles (begin, mutate, commit) and reports throughput and latency percentiles. Each transaction
120+
opens a session, submits one or more `addV()` operations, commits, and closes the session.
121+
122+
The benchmark varies the following parameters:
123+
124+
* *Concurrent clients* (`threads`): The number of threads issuing transactions simultaneously. A value of 1 means
125+
transactions are executed sequentially by a single client. Higher values simulate multiple application threads or
126+
service instances issuing transactions concurrently against the same server.
127+
* *Connection pool size* (`pool`): The number of WebSocket connections maintained in the pool when
128+
`reuseConnectionsForSessions` is enabled. When reuse is disabled, each session creates its own dedicated connection
129+
and this parameter does not apply (shown as `n/a`).
130+
* *Transaction weight* (`weight`): "light" transactions perform a single `addV()` plus commit. "heavy" transactions
131+
perform ten `addV()` operations plus commit, simulating a more substantial unit of work per transaction.
132+
133+
Tests were conducted both locally (client and server on the same machine) and remotely (client on the US west coast,
134+
server on the US east coast) to isolate the effect of network latency on connection setup overhead. Each scenario
135+
executed 1000 transactions after a warmup phase of 50 transactions.
136+
137+
*Local Results (same machine)*
138+
139+
[cols="3,1,1,1", options="header"]
140+
|=========================================================
141+
|Configuration |No-Reuse (tx/s) |Best-Reuse (tx/s) |Speedup
142+
|1 client, light |23.1 |26.7 |1.16x
143+
|8 clients, light |25.2 |28.5 |1.13x
144+
|16 clients, light |25.4 |27.9 |1.10x
145+
|1 client, heavy |26.0 |26.9 |1.03x
146+
|8 clients, heavy |26.4 |27.9 |1.06x
147+
|16 clients, heavy |25.8 |26.5 |1.03x
148+
|=========================================================
149+
150+
*Remote Results (west coast to east coast)*
151+
152+
[cols="3,1,1,1", options="header"]
153+
|=========================================================
154+
|Configuration |No-Reuse (tx/s) |Best-Reuse (tx/s) |Speedup
155+
|1 client, light |3.6 |7.6 |2.10x
156+
|8 clients, light |15.6 |23.0 |1.48x
157+
|16 clients, light |15.4 |25.3 |1.64x
158+
|1 client, heavy |1.4 |1.8 |1.26x
159+
|8 clients, heavy |9.2 |10.8 |1.17x
160+
|16 clients, heavy |14.5 |15.9 |1.10x
161+
|=========================================================
162+
163+
The "Best-Reuse" column reflects the highest throughput observed across all tested pool sizes (2, 4, and 8 connections)
164+
for each scenario.
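
The Speedup column can be read as the reuse throughput divided by the no-reuse throughput. The following sketch shows that arithmetic for two rows of the tables above. The class and method names are hypothetical. Because the tables list throughput rounded to one decimal place, a ratio recomputed from those figures may differ from the reported speedup in the last digit.

```java
import java.util.Locale;

/** Recomputes speedup figures from the rounded throughput values in the tables. */
final class SpeedupSketch {

    /** Ratio of throughput with connection reuse to throughput without it. */
    static double speedup(double noReuseTxPerSec, double reuseTxPerSec) {
        return reuseTxPerSec / noReuseTxPerSec;
    }

    /** The same comparison expressed as a percentage improvement. */
    static double improvementPercent(double noReuseTxPerSec, double reuseTxPerSec) {
        return (speedup(noReuseTxPerSec, reuseTxPerSec) - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        // Remote, 16 clients, light: 15.4 tx/s without reuse, 25.3 tx/s with reuse.
        System.out.println(String.format(Locale.ROOT, "%.2fx", speedup(15.4, 25.3))); // 1.64x
        // Local, 8 clients, heavy: 26.4 tx/s without reuse, 27.9 tx/s with reuse.
        System.out.println(String.format(Locale.ROOT, "%.0f%%", improvementPercent(26.4, 27.9))); // 6%
    }
}
```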
165+
166+
The benefit of connection reuse is most pronounced in remote scenarios with light transactions. When the network
167+
round-trip cost is high and the transaction payload is small, the WebSocket connection setup overhead represents a
168+
larger proportion of the total transaction time. In the single-client remote light workload, connection reuse yielded a
169+
2.10x throughput improvement, as the connection handshake cost dominated the per-transaction time. With 16 concurrent
170+
clients in the same remote light scenario, throughput improved from 15.4 tx/s to 25.3 tx/s (1.64x), as the connection
171+
pool amortized the setup cost across many parallel sessions.
172+
173+
As transaction weight increases, the relative benefit diminishes because the graph operations themselves become the
174+
bottleneck rather than connection setup. In the local heavy workload scenarios, the improvement was only 3-6%, as the
175+
connection overhead was already negligible relative to the cost of the graph mutations. Even in the remote heavy
176+
scenarios, the improvement ranged from 10-26%, as the ten `addV()` operations per transaction shifted the time
177+
distribution toward server-side processing.
178+
179+
In summary, `reuseConnectionsForSessions` provides the greatest benefit when:
180+
181+
* Network latency between client and server is significant (remote deployments)
182+
* Transactions are lightweight (few operations per transaction)
183+
* Many short-lived transactions are issued in sequence or concurrently
184+
185+
See: link:https://issues.apache.org/jira/browse/TINKERPOP-3213[TINKERPOP-3213]
186+
73187
=== Upgrading for Providers
74188
75189
==== Graph System Providers
76190
191+
===== Session Changes
192+
193+
An option has been added to the Java GLV (`reuseConnectionsForSessions`) that allows for borrowing open WebSocket
194+
connections for sessions. This is primarily to reduce the overhead of new connection setup per session. This can lead
195+
to large performance gains in remote transaction scenarios where there are many small mutation traversals.
196+
197+
This option is disabled by default on the driver but providers may want to add an option that will allow sessions to end
198+
on the successful completion of a graph operation (commit/rollback). This will prevent a buildup of sessions if a user
199+
has enabled this option, as the driver will *not* close the underlying WebSocket connection as a signal to end the
200+
session. Gremlin Server has added an option like this called `closeSessionPostGraphOp`. Remote graph providers are
201+
encouraged to add the same functionality.
77202
78203
==== Graph Driver Providers
79204
