perf(txpool): tombstone eviction-order removals in AA 2D pool #5598

Draft
mattsse wants to merge 1 commit into main from mattsse/txpool-saturation-opt


mattsse commented Jun 10, 2026

Contributor

At high TPS, a large share of the txpool maintenance thread's time is spent in `AA2dPool::on_state_updates` removing mined transactions from the eviction-order sets, because every removal recomputes the transaction's priority (`effective_tip_per_gas`) and performs a keyed `BTreeSet` removal. With this change, removals only flip a liveness flag. Tombstoned entries are skipped during eviction scans and pops, and each set is compacted once more than half of its entries are stale, making removal O(1) amortized.
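To make the mechanism concrete, here is a minimal sketch of a tombstoned eviction-order set. It is not the PR's code: `Entry`, `EvictionSet`, and all method names are hypothetical, and the sketch allocates one flag per entry for simplicity.

```rust
use std::cmp::Ordering;
use std::collections::BTreeSet;
use std::sync::atomic::{AtomicBool, Ordering as AtomicOrdering};
use std::sync::Arc;

/// Hypothetical eviction-order key, ordered by (priority, id).
/// The liveness flag is deliberately not part of the ordering.
struct Entry {
    priority: u128,
    id: u64,
    alive: Arc<AtomicBool>,
}

impl PartialEq for Entry {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}
impl Eq for Entry {}
impl PartialOrd for Entry {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl Ord for Entry {
    fn cmp(&self, other: &Self) -> Ordering {
        (self.priority, self.id).cmp(&(other.priority, other.id))
    }
}

#[derive(Default)]
struct EvictionSet {
    set: BTreeSet<Entry>,
    /// Number of tombstoned entries still stored in `set`.
    stale: usize,
}

impl EvictionSet {
    /// Inserts an entry and returns the flag its owner keeps for later removal.
    fn insert(&mut self, priority: u128, id: u64) -> Arc<AtomicBool> {
        let alive = Arc::new(AtomicBool::new(true));
        self.set.insert(Entry { priority, id, alive: alive.clone() });
        alive
    }

    /// Removal flips the flag: no priority recomputation, no keyed lookup.
    fn tombstone(&mut self, alive: &AtomicBool) {
        // `swap` returns the previous value, so a repeated removal counts once.
        if alive.swap(false, AtomicOrdering::Relaxed) {
            self.stale += 1;
            // Compact once more than half of the entries are stale.
            if self.stale * 2 > self.set.len() {
                self.compact();
            }
        }
    }

    /// Drops every tombstoned entry in one pass.
    fn compact(&mut self) {
        self.set.retain(|e| e.alive.load(AtomicOrdering::Relaxed));
        self.stale = 0;
    }

    /// Pops the lowest-priority live entry, discarding tombstones on the way.
    fn pop_worst(&mut self) -> Option<u64> {
        while let Some(e) = self.set.pop_first() {
            // Mark popped live entries dead so a later `tombstone` is a no-op.
            if e.alive.swap(false, AtomicOrdering::Relaxed) {
                return Some(e.id);
            }
            self.stale -= 1;
        }
        None
    }
}
```

Each entry is visited by `compact` at most once after being tombstoned, which is where the amortized O(1) removal cost comes from in this sketch.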

For regular 2D transactions the flag lives on the already `Arc`-shared `AA2dInternalTransaction`, so tombstoning adds no allocation to the insertion path. Expiring nonce transactions carry an `Arc`-wrapped flag shared between the pool entry and its eviction-order keys. As a side effect, `BestAA2dTransactions` snapshots now skip expiring nonce transactions that were mined or evicted after the snapshot was taken instead of yielding them.
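The snapshot side effect can be illustrated with a small sketch. The types below (`PooledTx`, `BestSnapshot`, `snapshot`) are hypothetical stand-ins, not the pool's real types; the point is only that a snapshot which clones the shared flag observes removals made after it was taken.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical pool entry: the flag is shared with every clone of the entry.
#[derive(Clone)]
struct PooledTx {
    hash: u64,
    alive: Arc<AtomicBool>,
}

/// Hypothetical snapshot iterator over transactions captured at one point in time.
struct BestSnapshot {
    txs: std::vec::IntoIter<PooledTx>,
}

impl Iterator for BestSnapshot {
    type Item = u64;

    fn next(&mut self) -> Option<u64> {
        // Skip entries tombstoned after the snapshot was taken.
        self.txs
            .by_ref()
            .find(|tx| tx.alive.load(Ordering::Relaxed))
            .map(|tx| tx.hash)
    }
}

/// Clones the entries; the cloned `Arc`s still point at the same flags.
fn snapshot(pool: &[PooledTx]) -> BestSnapshot {
    BestSnapshot { txs: pool.to_vec().into_iter() }
}
```

If the second of three entries is tombstoned after `snapshot` returns, iterating the snapshot yields only the first and third hashes.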

Includes a new criterion benchmark (`cargo bench -p tempo-transaction-pool --bench aa_2d_pool`) covering insertion into a pool at capacity (every insert evicts) and state updates that mine many transactions at once, for both regular 2D and expiring nonce transactions.

Re-measured after #5602 and #5603 landed, against current main (median of six interleaved runs, pool saturated at 10k transactions):

| scenario | main | this PR | change |
|---|---|---|---|
| on_state_updates/expiring_mined | 3.23 ms | 2.59 ms | -20% |
| on_state_updates/2d_mined | 5.45 ms | 4.49 ms | -18% |
| add_at_capacity / add_fill | | | unchanged within noise |

Trade-off: tombstoned entries keep their transaction Arcs alive until the next compaction, so the eviction-order sets can transiently hold up to twice as many entries as the pool.
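
The factor of two follows from the compaction threshold, assuming the threshold is checked after every removal (an assumption; where exactly the check runs is not shown here). With $\ell$ live and $s$ tombstoned entries in a set, no compaction is pending while at most half of the entries are stale:

$$
s \le \frac{\ell + s}{2} \iff s \le \ell \implies \ell + s \le 2\ell
$$

so between compactions a set holds at most $2\ell$ entries, where $\ell$ is the number of transactions in the pool that the set tracks.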

mattsse commented Jun 10, 2026

Contributor Author

derek bench

decofe commented Jun 10, 2026

Member

cc @mattsse

❌ Benchmark complete: Regression View job

❌ Bench Comparison: Regression

Refs: main vs mattsse/txpool-saturation-opt
Criteria: 95% run-bootstrap CI must clear floor; cells show delta (+/-CI/floor).
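
One reading of this criterion that agrees with every annotated cell in both benchmark reports on this page is: a change is flagged only when the delta still exceeds the floor after subtracting the CI half-width. The sketch below encodes that reading; it is inferred from the reported numbers, not taken from the bench tool, and `Verdict` and `classify` are hypothetical names.

```rust
/// Hypothetical outcome of comparing one metric between baseline and feature.
#[derive(Debug, PartialEq)]
enum Verdict {
    /// The change clears the floor even at the near edge of its CI.
    Significant,
    /// The change cannot be distinguished from noise under this rule.
    NoDifference,
}

/// All arguments are in percent, as printed in the `delta (+/-CI/floor)` cells.
/// Assumed rule: |delta| - CI must be greater than the floor.
fn classify(delta_pct: f64, ci_pct: f64, floor_pct: f64) -> Verdict {
    if delta_pct.abs() - ci_pct > floor_pct {
        Verdict::Significant
    } else {
        Verdict::NoDifference
    }
}
```

For example, `classify(-1.65, 0.48, 0.55)` is `Significant` (the ❌ TPS Mean cell of the first report), while `classify(-1.07, 0.75, 0.55)` is `NoDifference` (the ⚪ TPS Mean cell of the second). Whether a significant change counts as a regression or an improvement depends on the metric's direction, which this sketch does not model.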

Configuration

  • Derek command: derek bench mode=e2e preset=tip20 duration=90 bloat=100 tps=50000 accounts=1000 max-concurrent-requests=500 baseline=main feature=mattsse/txpool-saturation-opt baseline-hardfork="" feature-hardfork="" gas-limit=1000000000 run-pairs=3 otlp=true metrics=false no-cache=false force-bloat=false
  • Bloat: 100000 MiB
  • Preset: tip20
  • Target TPS: 50000
  • Duration: 90s
  • Run pairs: 3
  • Baseline blocks: 480
  • Feature blocks: 483

Tempo Metrics

| Metric | Baseline | Feature | Delta |
|---|---|---|---|
| TPS Mean | 21891 | 21529 | -1.65% ❌ (+/-0.48/floor 0.55) |
| Gas Throughput [Mgas/s] | 1112.9 | 1094.4 | -1.66% ❌ (+/-0.48/floor 0.50) |
| Block Time Mean [ms] | 555.7 | 552.6 | -0.56% ⚪ (+/-0.34/floor 0.40) |
| Block Time P50 [ms] | 555.0 | 553.0 | -0.36% ⚪ (+/-0.39/floor 0.70) |
| Block Time P90 [ms] | 584.0 | 584.0 | +0.00% ⚪ (+/-1.03/floor 0.70) |
| Block Time P99 [ms] | 603.0 | 619.0 | +2.65% ⚪ (+/-2.05/floor 1.60) |
| Serialized Block Size / Tx P50 [B/tx] | 251.1 | 251.1 | +0.00% ⚪ (+/-0.00/floor 0.70) |
| Serialized Block Size / Tx P90 [B/tx] | 251.1 | 251.1 | +0.00% ⚪ (+/-0.00/floor 0.70) |
| Serialized Block Size / Tx P99 [B/tx] | 251.1 | 251.1 | +0.00% ⚪ (+/-0.00/floor 0.70) |

Builder

| Metric | Baseline | Feature | Delta |
|---|---|---|---|
| Gas Throughput [Mgas/s] | 2627.5 | 2565.1 | -2.37% ❌ (+/-0.60/floor 0.70) |
| P50 [ms] | 233.9 | 234.1 | +0.09% ⚪ (+/-0.40/floor 0.35) |
| P90 [ms] | 248.8 | 250.8 | +0.80% ⚪ (+/-0.80/floor 0.70) |
| P99 [ms] | 290.6 | 297.0 | +2.20% ⚪ (+/-1.92/floor 0.95) |

Builder details

| Metric | Baseline | Feature | Delta |
|---|---|---|---|
| Finish P50 [ms] | 16.8 | 16.3 | -2.98% |
| Finish P90 [ms] | 26.2 | 25.2 | -3.82% |
| Finish P99 [ms] | 33.0 | 33.2 | +0.61% |
| Pool Fetch P50 [ms] | 8.6 | 11.9 | +38.37% |
| Pool Fetch P90 [ms] | 22.1 | 26.6 | +20.36% |
| Pool Fetch P99 [ms] | 28.1 | 54.8 | +95.02% |
| Included Tx Exec P50 [ms] | - | - | |
| Included Tx Exec P90 [ms] | - | - | |
| Included Tx Exec P99 [ms] | - | - | |
| Invalid Tx Exec P50 [ms] | - | - | |
| Invalid Tx Exec P90 [ms] | - | - | |
| Invalid Tx Exec P99 [ms] | - | - | |
| Invalid Tx Attempts P50 | 0.0 | 0.0 | 0.00% |
| Invalid Tx Attempts P90 | 0.0 | 0.0 | 0.00% |
| Invalid Tx Attempts P99 | 0.0 | 0.0 | 0.00% |
| Invalid Tx Skips | 0 | 0 | 0.00% |
| Nonce Too Low Skips | 0 | 0 | 0.00% |
| Serialized Block Size P50 [KiB] | 2987.7 | 2930.8 | -1.90% |
| Serialized Block Size P90 [KiB] | 3326.2 | 3252.1 | -2.23% |
| Serialized Block Size P99 [KiB] | 3531.3 | 3545.8 | +0.41% |
| Fill Overhead P50 [ms] | - | - | |
| Fill Overhead P90 [ms] | - | - | |
| Fill Overhead P99 [ms] | - | - | |
| Fill Idle P50 [ms] | 0.0 | 0.0 | 0.00% |
| Fill Idle P90 [ms] | 52.0 | 54.0 | +3.85% |
| Fill Idle P99 [ms] | 132.0 | 136.0 | +3.03% |

Validator

| Metric | Baseline | Feature | Delta |
|---|---|---|---|
| Gas Throughput [Mgas/s] | 2642.3 | 2625.1 | -0.65% ⚪ (+/-0.36/floor 0.65) |
| P50 [ms] | 235.8 | 232.8 | -1.27% ⚪ (+/-0.79/floor 1.55) |
| P90 [ms] | 257.1 | 254.0 | -1.21% ⚪ (+/-0.95/floor 1.55) |
| P99 [ms] | 270.7 | 267.9 | -1.03% ⚪ (+/-1.97/floor 2.05) |


At high TPS the txpool maintenance thread spends a large share of its time
removing mined transactions from the eviction-order BTreeSets: every removal
recomputed the transaction's priority (effective tip) and performed a keyed
BTreeSet removal. Removals now only flip a liveness flag; tombstoned entries
are skipped during eviction scans and compacted once more than half of a set
is stale, making removal O(1) amortized.

For regular 2D transactions the flag lives on the Arc-shared
AA2dInternalTransaction, so tombstoning adds no allocation to the insertion
path. Expiring nonce transactions carry an Arc'd flag shared between the
pool entry and its eviction-order keys. As a side effect, best-transaction
snapshots now skip expiring nonce transactions that were mined or evicted
after the snapshot was taken.

Benchmarked with the new aa_2d_pool benchmark against main (median of six
interleaved runs, pool saturated at 10k transactions):
- on_state_updates/expiring_mined: 3.23ms -> 2.59ms (-20%)
- on_state_updates/2d_mined:       5.45ms -> 4.49ms (-18%)
- add benches unchanged within noise

@mattsse force-pushed the mattsse/txpool-saturation-opt branch from 3aea50f to 5454528 on June 10, 2026 23:49

Thegreatsura pushed a commit to Thegreatsura/tempo that referenced this pull request Jun 11, 2026
…empoxyz#5602)

Profiling a saturated AA 2D pool at 50k TPS showed the `txs_by_sender`
lookups account for roughly a third of insertion time: every insert did
a `get` for the per-sender limit check plus a separate `entry` to
increment the count. These are now a single map operation.
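
As an illustration of merging the limit check and the increment into one map operation, here is a minimal sketch using the standard `HashMap` entry API. The function name, the `u64` sender key, and the `max_per_sender` parameter are hypothetical; the real `txs_by_sender` map and its key type may differ.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Checks the per-sender limit and bumps the count with a single lookup.
/// Returns `false` (leaving the map unchanged) if the sender is at the limit.
fn try_reserve_slot(
    counts: &mut HashMap<u64, usize>,
    sender: u64,
    max_per_sender: usize,
) -> bool {
    match counts.entry(sender) {
        Entry::Occupied(mut slot) => {
            if *slot.get() >= max_per_sender {
                return false;
            }
            *slot.get_mut() += 1;
            true
        }
        Entry::Vacant(slot) => {
            if max_per_sender == 0 {
                return false;
            }
            slot.insert(1);
            true
        }
    }
}
```

Compared with a `get` followed by a separate `entry` call, the key is hashed and probed once per insert.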

Two more lookups removed on the same path: the `by_hash` duplicate check
is skipped for expiring nonce transactions since the expiring nonce hash
entry already rejects duplicates, and the descendant promotion scan
exits once it reaches an already-pending transaction.

Measured with the benchmark from tempoxyz#5598 (10k transaction pool): filling
the pool improves ~40% for both expiring and regular 2D nonce
transactions, inserting into a 2D pool at capacity ~16%.

Extracted from tempoxyz#5598.

Thegreatsura pushed a commit to Thegreatsura/tempo that referenced this pull request Jun 11, 2026
…5603)

Resolving the active hardfork walked the chain spec's fork schedule and
cloned the chain spec `Arc` once per validated transaction in
`validate_one` and once per inserted AA 2D transaction in
`add_validated_transaction`.

The validator now caches the active hardfork as an `AtomicU8` holding
its `TempoHardfork::VARIANTS` index, updated in `on_new_head_block`
where the EVM environment for the new tip is already constructed. The
new `TempoHardfork::variant_index`/`from_variant_index` helpers handle
the atomic round-trip.
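
A rough sketch of the caching pattern described above, using a stand-in enum. `Fork`, its variants, and `ForkCache` are hypothetical; only the idea of storing a `VARIANTS` index in an `AtomicU8` and converting back on read is taken from the commit message, and the real helper signatures may differ.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// Stand-in for the real hardfork enum; variant names are placeholders.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Fork {
    Genesis,
    Alpha,
    Beta,
}

impl Fork {
    const VARIANTS: [Fork; 3] = [Fork::Genesis, Fork::Alpha, Fork::Beta];

    /// Position of `self` in `VARIANTS`.
    fn variant_index(self) -> u8 {
        Self::VARIANTS
            .iter()
            .position(|f| *f == self)
            .expect("every variant is listed in VARIANTS") as u8
    }

    /// Inverse of `variant_index`; `None` for an out-of-range index.
    fn from_variant_index(index: u8) -> Option<Self> {
        Self::VARIANTS.get(index as usize).copied()
    }
}

/// Caches the active fork so hot paths avoid walking the fork schedule.
struct ForkCache(AtomicU8);

impl ForkCache {
    fn new(active: Fork) -> Self {
        Self(AtomicU8::new(active.variant_index()))
    }

    /// Called once per new head, where the active fork is already known.
    fn on_new_head(&self, active: Fork) {
        self.0.store(active.variant_index(), Ordering::Relaxed);
    }

    /// Cheap read used per validated or inserted transaction.
    fn active(&self) -> Fork {
        Fork::from_variant_index(self.0.load(Ordering::Relaxed))
            .expect("only valid indices are ever stored")
    }
}
```

The read path is a single atomic load plus an array index, with no `Arc` clone.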

Extracted from tempoxyz#5598.

mattsse commented Jun 11, 2026

Contributor Author

derek bench

decofe commented Jun 11, 2026

Member

cc @mattsse

⚪ Benchmark complete: No Difference View job

⚪ Bench Comparison: No Difference

Refs: main vs mattsse/txpool-saturation-opt
Criteria: 95% run-bootstrap CI must clear floor; cells show delta (+/-CI/floor).

Configuration

  • Derek command: derek bench mode=e2e preset=tip20 duration=90 bloat=100 tps=50000 accounts=1000 max-concurrent-requests=500 baseline=main feature=mattsse/txpool-saturation-opt baseline-hardfork="" feature-hardfork="" gas-limit=1000000000 run-pairs=3 otlp=true metrics=false no-cache=false force-bloat=false
  • Bloat: 100000 MiB
  • Preset: tip20
  • Target TPS: 50000
  • Duration: 90s
  • Run pairs: 3
  • Baseline blocks: 476
  • Feature blocks: 477

Tempo Metrics

| Metric | Baseline | Feature | Delta |
|---|---|---|---|
| TPS Mean | 21680 | 21449 | -1.07% ⚪ (+/-0.75/floor 0.55) |
| Gas Throughput [Mgas/s] | 1102.1 | 1090.4 | -1.06% ⚪ (+/-0.75/floor 0.50) |
| Block Time Mean [ms] | 560.9 | 561.0 | +0.02% ⚪ (+/-0.41/floor 0.40) |
| Block Time P50 [ms] | 561.0 | 562.0 | +0.18% ⚪ (+/-0.48/floor 0.70) |
| Block Time P90 [ms] | 603.0 | 606.0 | +0.50% ⚪ (+/-0.86/floor 0.70) |
| Block Time P99 [ms] | 623.0 | 638.0 | +2.41% ⚪ (+/-2.33/floor 1.60) |
| Serialized Block Size / Tx P50 [B/tx] | 251.1 | 251.1 | +0.00% ⚪ (+/-0.00/floor 0.70) |
| Serialized Block Size / Tx P90 [B/tx] | 251.1 | 251.1 | +0.00% ⚪ (+/-0.00/floor 0.70) |
| Serialized Block Size / Tx P99 [B/tx] | 251.1 | 251.1 | +0.00% ⚪ (+/-0.00/floor 0.70) |

Builder

| Metric | Baseline | Feature | Delta |
|---|---|---|---|
| Gas Throughput [Mgas/s] | 2620.2 | 2599.1 | -0.81% ⚪ (+/-1.22/floor 0.95) |
| P50 [ms] | 233.3 | 234.0 | +0.30% ⚪ (+/-0.69/floor 0.45) |
| P90 [ms] | 250.7 | 249.9 | -0.32% ⚪ (+/-1.42/floor 0.90) |
| P99 [ms] | 297.2 | 306.1 | +2.99% ⚪ (+/-1.83/floor 1.25) |

Builder details

| Metric | Baseline | Feature | Delta |
|---|---|---|---|
| Finish P50 [ms] | 17.3 | 16.5 | -4.62% |
| Finish P90 [ms] | 25.0 | 23.9 | -4.40% |
| Finish P99 [ms] | 29.8 | 30.5 | +2.35% |
| Pool Fetch P50 [ms] | 9.6 | 12.0 | +25.00% |
| Pool Fetch P90 [ms] | 21.5 | 25.9 | +20.47% |
| Pool Fetch P99 [ms] | 26.8 | 49.6 | +85.07% |
| Included Tx Exec P50 [ms] | - | - | |
| Included Tx Exec P90 [ms] | - | - | |
| Included Tx Exec P99 [ms] | - | - | |
| Invalid Tx Exec P50 [ms] | - | - | |
| Invalid Tx Exec P90 [ms] | - | - | |
| Invalid Tx Exec P99 [ms] | - | - | |
| Invalid Tx Attempts P50 | 0.0 | 0.0 | 0.00% |
| Invalid Tx Attempts P90 | 0.0 | 0.0 | 0.00% |
| Invalid Tx Attempts P99 | 0.0 | 0.0 | 0.00% |
| Invalid Tx Skips | 0 | 0 | 0.00% |
| Nonce Too Low Skips | 0 | 0 | 0.00% |
| Serialized Block Size P50 [KiB] | 2988.4 | 2954.6 | -1.13% |
| Serialized Block Size P90 [KiB] | 3355.6 | 3337.7 | -0.53% |
| Serialized Block Size P99 [KiB] | 3550.2 | 3551.9 | +0.05% |
| Fill Overhead P50 [ms] | - | - | |
| Fill Overhead P90 [ms] | - | - | |
| Fill Overhead P99 [ms] | - | - | |
| Fill Idle P50 [ms] | 0.0 | 0.0 | 0.00% |
| Fill Idle P90 [ms] | 56.0 | 54.0 | -3.57% |
| Fill Idle P99 [ms] | 133.0 | 146.0 | +9.77% |

Validator

| Metric | Baseline | Feature | Delta |
|---|---|---|---|
| Gas Throughput [Mgas/s] | 2576.2 | 2583.6 | +0.29% ⚪ (+/-0.69/floor 0.65) |
| P50 [ms] | 244.8 | 243.6 | -0.49% ⚪ (+/-1.79/floor 1.55) |
| P90 [ms] | 274.6 | 273.3 | -0.47% ⚪ (+/-1.40/floor 1.55) |
| P99 [ms] | 291.6 | 289.7 | -0.65% ⚪ (+/-2.47/floor 2.05) |

