Bulk vtxo pubkey lookup + chunked WatchScripts on startup #1084
bitcoin-coder-bob wants to merge 7 commits into
restoreWatchingVtxos and the matching Stop() path previously fired one GetVtxoPubKeysByCommitmentTxid call per sweepable round. At 1000 rounds this took ~46s on sqlite (1001 queries); at 10000 it did not complete in 3 minutes. The view's LIKE-based membership scan over the group_concat'd commitments column scales worse than O(N).
Add a bulk variant GetVtxoPubKeysByCommitmentTxids on the VtxoRepository interface and all three backends (sqlite, postgres, badger). Replace both N+1 loops in internal/core/application/service.go with a single bulk call. The new sqlc query uses sqlc.slice in sqlite (twice, because the generator only rewrites the first occurrence per query) and ANY($1::text[]) in postgres. A new migration adds an index on vtxo_commitment_txid(commitment_txid) so the bulk join is fast.
Benchmark (Go testing.B, sqlite + badger, localhost):
| rounds | per-txid | bulk | speedup |
| 10 | 3.08ms | 0.30ms | 10x |
| 100 | 257ms | 2.27ms | 113x |
| 1000 | 45.46s | 35.1ms | 1297x |
| 5000 | - | 266ms | - |
DB calls drop from 1+N to 2 regardless of N.
The cross-process WatchScripts gRPC call from arkd to arkd-wallet also hit the default 4 MiB max-message-size at 100k+ scripts. Chunk the scripts list in walletclient (WatchScripts and UnwatchScripts) at watchScriptsChunkSize (default 2000, ~150 KiB per call).
Tests added:
- TestChunkStrings + TestWalletClientWatchScriptsChunking + TestWalletClientUnwatchScriptsChunking exercise chunk boundaries, large inputs, and mid-chunk error short-circuit
- TestVtxoPubKeysBulkMatchesLoop seeds 50 rounds and asserts the bulk result equals the union of per-txid results
- Bulk method coverage added inline in testVtxoRepository for the existing service_test loop
- BenchmarkGetVtxoPubKeysByCommitmentTxid_PerTxidLoop and BenchmarkGetVtxoPubKeysByCommitmentTxids_Bulk for repeatable A/B
Throwaway instrumentation kept on the branch (will be removed in a follow-up commit once we're done re-bench'ing): nbxplorer RoundTripper profiler, startup phase markers in arkd and arkd-wallet, cmd/arkd-bench-seed for seeding synthetic sweepable rounds.
Walkthrough
Adds a bulk VTXO pubkey lookup across commitment txids (Badger/Postgres/SQLite), migrations and sqlc queries, refactors service Stop/restoreWatchingVtxos to use the bulk path and validate returned pubkeys, and implements wallet client WatchScripts/UnwatchScripts chunking with tests and benchmarks.
Changes
- Bulk VTXO Pubkey Query Optimization
- Application Service Refactoring
- Wallet Client Script Chunking
Code Review: arkd#1084 — Bulk vtxo pubkey query + chunked WatchScripts
The core optimization (N+1 → 2 DB calls, chunked gRPC) is well-motivated and the benchmark numbers are compelling. Tests are thorough. However, there are issues that must be addressed before merge.
🔴 Must Fix
1. Badger amount filter: >= in query vs > in dedup loop
internal/infrastructure/db/badger/vtxo_repo.go — new GetVtxoPubKeysByCommitmentTxids
The badgerhold query uses .Ge(amountFilter) (≥), but the Go dedup loop uses strict >:
    // Lines in new method (two occurrences):
    if vtxo.Amount > amountFilter {
        pubkeyMap[vtxo.PubKey] = struct{}{}
    }
The SQL backends use WHERE v.amount >= $1 consistently. A VTXO with amount == amountFilter will be included by sqlite/postgres but excluded by badger. This is copied from the pre-existing singular method (same bug there), but since you're touching this code, please fix both.
Today's callers pass 0 so it's not biting in production, but the contract is >= per the SQL definitions, and a future caller passing a real threshold will get inconsistent results across backends.
2. Throwaway instrumentation must not merge to master
The PR description says "Throwaway instrumentation kept on the branch (will be removed in a follow-up commit once we're done re-bench'ing)" — but this PR targets master. The following must be removed or split to a separate non-merge branch:
- cmd/arkd-bench-seed/main.go (entire binary, 255 lines)
- internal/core/application/startup_profile.go (20 lines + all startupProfilePhase() calls in service.go)
- pkg/arkd-wallet/core/infrastructure/nbxplorer/profile.go (205 lines, includes init() that logs on every import)
- pkg/arkd-wallet/core/infrastructure/nbxplorer/service.go: the newProfilingHTTPClient() swap adds a regex-heavy RoundTripper to every NBXplorer HTTP call in production
- pkg/arkd-wallet/config/config.go: PhaseMark() calls
- pkg/arkd-wallet/core/application/wallet/service.go: PhaseMark() call + import
- pkg/arkd-wallet/interface/grpc/handlers/wallet_handler.go: PhaseMark() + LogSummary() calls + imports
This is ~500 lines of debug code that adds runtime overhead (mutex + append on every HTTP call, 6 compiled regexes at init), pollutes logs with [startup-profile] noise, and leaks a local developer path (/home/bob/.claude/plans/...).
3. chunkStrings silent pass-through on size <= 0
internal/infrastructure/wallet/wallet_client.go
    if size <= 0 || len(in) == 0 {
        return [][]string{in}
    }
If watchScriptsChunkSize is accidentally set to 0 or negative, this silently sends the entire slice as one chunk, which is exactly the scenario chunking is supposed to prevent. This should either panic (it's a programming error) or clamp to a safe default with a log warning.
🟡 Should Fix
4. watchScriptsChunkSize is a mutable package-level var
internal/infrastructure/wallet/wallet_client.go
    var watchScriptsChunkSize = 2000
Package-level mutable state kept for testability is a known Go antipattern. Any goroutine could race on this. Options: unexported const + test-only override via a struct field, or a //go:build !production guard.
5. SQLite dual-slice parameter is fragile
internal/infrastructure/db/sqlite/vtxo_repo.go
The caller must pass the same slice as both CommitmentTxids and CommitmentTxidsAlt. If someone later passes different slices, results will silently diverge. Add a comment on the SelectVtxoPubKeysByCommitmentTxidsParams struct or add a validation check.
✅ Looks Good
- Semantic equivalence of bulk vs singular queries: The switch from vtxo_vw + LIKE to vtxo + EXISTS against vtxo_commitment_txid is correct and actually more robust (no comma-parsing edge cases). Confirmed by TestVtxoPubKeysBulkMatchesLoop.
- New index idx_vtxo_commitment_txid_commitment_txid: correct, matches the EXISTS subquery join pattern.
- Chunked WatchScripts/UnwatchScripts: clean implementation with good test coverage (boundary, error short-circuit, reassembly).
- No cross-repo breakage: VtxoRepository is internal to arkd; no external repos import it. The new method is additive.
- Test coverage: bulk-matches-loop equivalence test, inline service_test coverage, chunk boundary tests, error propagation tests. Well done.
⚠️ Protocol-Critical: Human Review Required
This PR modifies restoreWatchingVtxos() and Stop() in service.go — these paths determine which VTXOs are monitored for sweeps after restart and shutdown. A bug here means missed sweep notifications → users lose funds. The refactoring looks semantically correct to me, but per policy this requires explicit human sign-off before merge.
Follow-up Review (new commits since last review)
The new commit 4cd3a099 is a gofmt pass on cmd/arkd-bench-seed/main.go only. None of the 3 must-fix issues from my previous review have been addressed. Repeating them here for tracking:
🔴 Still Open — Must Fix
1. Badger > vs >= amount filter — internal/infrastructure/db/badger/vtxo_repo.go
Two occurrences of vtxo.Amount > amountFilter should be >= to match sqlite/postgres WHERE v.amount >= $1. A VTXO with amount == amountFilter is silently dropped by badger but kept by SQL backends.
2. Throwaway instrumentation still in the diff — ~500 lines of debug/profiling code (startup_profile.go, profile.go, PhaseMark() calls, cmd/arkd-bench-seed/, newProfilingHTTPClient()) targeting master. This adds runtime overhead (mutex + append per HTTP call, 6 compiled regexes at init) and leaks a local dev path. Must be removed before merge.
3. chunkStrings silent pass-through on size <= 0 — internal/infrastructure/wallet/wallet_client.go:~line with size <= 0
Still returns the entire slice as one chunk when size is ≤ 0, defeating the purpose of chunking. Should panic or clamp to a safe default.
🟡 Still Open — Should Fix
4. watchScriptsChunkSize mutable package var — data race risk if tests run in parallel.
5. SQLite dual-slice parameter fragility — same slice passed as both CommitmentTxids and CommitmentTxidsAlt with no guard or comment.
⚠️ Protocol-Critical: Human Review Still Required
restoreWatchingVtxos() and Stop() determine which VTXOs are monitored for sweeps. A bug here = missed sweep = user fund loss. Requires explicit human sign-off before merge regardless of code quality.
Please address the must-fix items before the next push.
Follow-up Review — commit 4f9b556
All 3 must-fix and both should-fix items from my previous reviews are resolved:
🔴 Must Fix — ✅ Resolved
1. Badger > → >= amount filter — Fixed in both GetVtxoPubKeysByCommitmentTxid (singular) and GetVtxoPubKeysByCommitmentTxids (bulk) in internal/infrastructure/db/badger/vtxo_repo.go. All 4 occurrences now use >= to match the sqlite/postgres WHERE v.amount >= $1 contract. Good comments added explaining the rationale.
2. Throwaway instrumentation removed — All ~500 lines of debug code deleted:
- cmd/arkd-bench-seed/main.go ✅
- internal/core/application/startup_profile.go ✅
- pkg/arkd-wallet/core/infrastructure/nbxplorer/profile.go ✅
- All startupProfilePhase() / PhaseMark() / LogSummary() call sites ✅
- newProfilingHTTPClient() → plain &http.Client{} ✅
3. chunkStrings size ≤ 0 handling — Now panics with a descriptive message. TestChunkStringsBadSizePanics covers {0, -1, -1000}. Production path is guarded by effectiveChunkSize() which always returns > 0. Correct design.
🟡 Should Fix — ✅ Resolved
4. Mutable package var → struct field — watchScriptsChunkSize var replaced with const defaultWatchScriptsChunkSize + per-instance chunkSize field on walletDaemonClient. Tests use newTestClient(fake, size) — no global mutation, parallel-safe. New test default_chunk_size_used_when_unset confirms the fallback path.
5. SQLite dual-slice fragility — Expanded doc comment on GetVtxoPubKeysByCommitmentTxids now clearly explains why two fields exist and that they MUST receive the same slice. Inline comment at the call site reinforces this. The wrapper method is the single call path, preventing misuse.
No new issues found in 4f9b556
The diff is a clean subtraction of debug code plus targeted fixes. No new logic, no new code paths, no behavioral changes beyond the corrections above.
⚠️ Protocol-Critical: Human Review Still Required
This PR modifies restoreWatchingVtxos() and Stop() in service.go — these paths determine which VTXOs are monitored for sweeps after restart/shutdown. A bug here means missed sweep notifications → users lose funds. The refactoring looks semantically equivalent to me (confirmed by TestVtxoPubKeysBulkMatchesLoop), but per policy this requires explicit human sign-off before merge.
Approving from my side — over to a human reviewer for the final ✅.
@arkanaai @coderabbitai review
Code Review — #1084
Reviewer: Arkana (automated)
Verdict: Request changes (one medium finding + protocol-critical flag requiring human sign-off)
Overall Assessment
Excellent PR. The N+1 → bulk query conversion is well-motivated with concrete benchmarks, the chunked gRPC calls solve a real message-size failure, the >= parity fix in badger is correct, and test coverage is thorough. The sqlc dual-slice workaround for SQLite is well-documented.
Most of this is ready to merge. Two findings below — one medium, one informational — plus the mandatory human review flag.
🟡 MEDIUM — SQLite parameter count limit at high round counts
Files: internal/infrastructure/db/sqlite/sqlc/queries/query.sql.go, internal/infrastructure/db/sqlite/vtxo_repo.go
The SQLite bulk query expands commitment_txids and commitment_txids_alt into individual ? placeholders. Since both slices are the same data, total bound params = 1 + 2N where N = number of sweepable rounds.
modernc.org/sqlite (v1.33.1 per go.mod) compiles SQLite with SQLITE_MAX_VARIABLE_NUMBER = 32766. That means:
- 10,000 rounds → 20,001 params ✅
- 16,383 rounds → 32,767 params ❌ exceeds limit, query will fail
Granted, 16K+ sweepable (unsettled/unswept) rounds is extreme, but this is a silent cliff that would crash startup with an opaque SQLite error. Two options:
- Preferred: Batch the txid slice in the Go wrapper (e.g., chunks of 5000) and deduplicate results in Go. The postgres path doesn't need this since ANY($2::text[]) is a single bind.
- Acceptable: Document the limit and add a log warning if len(commitmentTxids) > 15000.
Not a ship-blocker for typical deployments, but worth hardening before this code encounters a long-running node that hasn't swept in a while.
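The arithmetic behind this finding can be checked with a short sketch. It assumes the figures quoted above (a cap of 32766 bound variables and 1 + 2N placeholders per call); the function names are hypothetical and not part of the PR.

```go
package main

import "fmt"

// sqliteMaxVariableNumber is the bound-parameter cap quoted in the review.
// Treat it as an assumption about the SQLite build in use.
const sqliteMaxVariableNumber = 32766

// boundParams returns how many placeholders the bulk query binds for n
// commitment txids: one for the amount filter plus two copies of the slice.
func boundParams(n int) int { return 1 + 2*n }

// maxTxidsPerQuery returns the largest n for which boundParams(n) <= limit.
func maxTxidsPerQuery(limit int) int {
	if limit < 1 {
		return 0
	}
	return (limit - 1) / 2
}

// describe formats one row of the comparison for a given round count.
func describe(n int) string {
	p := boundParams(n)
	return fmt.Sprintf("rounds=%d params=%d fits=%t", n, p, p <= sqliteMaxVariableNumber)
}

func main() {
	for _, n := range []int{10000, 16382, 16383} {
		fmt.Println(describe(n))
	}
	fmt.Println("max txids per query:", maxTxidsPerQuery(sqliteMaxVariableNumber))
}
```

Under these assumptions the last round count that still fits in one query is 16382, one below the failing case listed above.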
ℹ️ INFO — Behavior change in Stop() error handling
File: internal/core/application/service.go:428-437
Old code: per-txid loop with continue on error — partial results were still passed to stopWatchingVtxos. If 1 of 1000 lookups failed, the other 999 rounds' scripts were still unwatched.
New code: bulk query with if err != nil { warn } else { stopWatching } — all-or-nothing. If the single bulk query fails, zero scripts get unwatched.
This is acceptable because: (a) a single query is less failure-prone than N queries, (b) shutdown unwatching is best-effort (the wallet daemon handles reconnect), and (c) the retry loop in stopWatchingVtxos already handles transient gRPC failures. Just flagging for awareness.
✅ Things that look correct
- Bulk SQL queries (postgres + sqlite): Both correctly query vtxo directly (not vtxo_vw), check both v.commitment_txid and the vtxo_commitment_txid join table, and return DISTINCT. The postgres version reuses the same $2::text[] bind cleanly.
- New index: idx_vtxo_commitment_txid_commitment_txid on vtxo_commitment_txid(commitment_txid) is the correct column, IF NOT EXISTS makes it idempotent, and the down migration drops it. Both sqlite and postgres migrations are identical, which is correct.
- Badger >= parity fix (internal/infrastructure/db/badger/vtxo_repo.go:327,332): The old > was a real bug. The badgerhold query used .Ge(amountFilter) (inclusive) but the Go dedup loop used > (exclusive), silently dropping VTXOs with amount == amountFilter. Currently harmless since both callers pass amountFilter=0, but the fix makes the contract correct for future callers. Good catch.
- Badger bulk implementation: The MatchFunc approach for the CommitmentTxids slice intersection is the right call given badgerhold's limitations. The wanted map lookup is O(1) per txid, keeping the MatchFunc efficient. The redundant >= amountFilter check in the dedup loop (after .Ge(amountFilter) in the query) is harmless belt-and-suspenders.
- Chunked WatchScripts/UnwatchScripts: Clean implementation. defaultWatchScriptsChunkSize = 2000 at ~75 bytes/script ≈ 150 KiB per RPC is conservative. The effectiveChunkSize() fallback pattern is clean. Panicking on size <= 0 is the right call for an internal helper.
- Partial chunk failure + retry: If chunk 3/5 fails in WatchScripts, the error propagates up. In stopWatchingVtxos, the outer for { ... continue } loop retries all chunks, and re-watching already-watched scripts is idempotent. Correct.
- Test coverage: TestVtxoPubKeysBulkMatchesLoop is the most important test (asserts bulk == loop union). The chunkStrings table tests + panic test + fake gRPC client tests are comprehensive. The inline service_test.go additions cover nil slice, non-existent txid, dedup assertion, and amount filtering.
- No cross-repo impact: VtxoRepository is internal. No SDKs or external consumers implement it. The singular GetVtxoPubKeysByCommitmentTxid is preserved on the interface, so there is no breakage.
🔴 PROTOCOL-CRITICAL FLAG
This PR touches VTXO script watching — the mechanism by which arkd detects on-chain VTXO movements (unilateral exits, forfeit transactions, sweep confirmations). If scripts are not watched after restart, the server will miss on-chain events and fail to respond to exits or trigger forfeits.
The code change is correct and strictly better than the status quo, but per repo policy: a human must review and approve protocol-critical changes before merge.
@bitcoin-coder-bob — please address the SQLite parameter limit concern above, then this is ready for human sign-off.
Actionable comments posted: 1
🧹 Nitpick comments (1)
internal/infrastructure/db/service_test.go (1)
1027-1042: ⚡ Quick win: Consider verifying the complete union result.
The test checks deduplication correctly and verifies that tapkey1, tapkey2, and tapkey3 are present. However, it doesn't verify that keys from commitmentTxid (specifically pubkey and pubkey2 from the setup at lines 676-745) are also in the result. Adding assertions for those would provide more complete coverage of the union semantics described in the comment.
✨ Optional enhancement to verify complete union:
     for k, n := range seen {
         require.Equalf(t, 1, n, "duplicate pubkey %s in bulk result", k)
     }
    +// Verify complete union: keys from both commitment txids
     require.Contains(t, bulkKeys, "tapkey1")
     require.Contains(t, bulkKeys, "tapkey2")
     require.Contains(t, bulkKeys, "tapkey3")
    +require.Contains(t, bulkKeys, pubkey)
    +require.Contains(t, bulkKeys, pubkey2)
Inline comments:
In `@internal/core/application/service.go`:
- Around line 3648-3653: The loop that builds watch scripts only checks
len(key)==64 but doesn't validate hex content; update the validation so that
each key is decoded (e.g., hex.DecodeString(key)) and ensure it yields exactly
32 bytes before appending to scripts (keep the existing fmt.Sprintf("5120%s",
key) for valid keys), thereby skipping non-hex or wrong-size values to prevent
WatchScripts failures during startup recovery.
📒 Files selected for processing (17)
- internal/core/application/service.go
- internal/core/domain/vtxo_repo.go
- internal/infrastructure/db/badger/vtxo_repo.go
- internal/infrastructure/db/postgres/migration/20260527150000_vtxo_commitment_txid_index.down.sql
- internal/infrastructure/db/postgres/migration/20260527150000_vtxo_commitment_txid_index.up.sql
- internal/infrastructure/db/postgres/sqlc/queries/query.sql.go
- internal/infrastructure/db/postgres/sqlc/query.sql
- internal/infrastructure/db/postgres/vtxo_repo.go
- internal/infrastructure/db/service_test.go
- internal/infrastructure/db/sqlite/migration/20260527150000_vtxo_commitment_txid_index.down.sql
- internal/infrastructure/db/sqlite/migration/20260527150000_vtxo_commitment_txid_index.up.sql
- internal/infrastructure/db/sqlite/sqlc/queries/query.sql.go
- internal/infrastructure/db/sqlite/sqlc/query.sql
- internal/infrastructure/db/sqlite/vtxo_repo.go
- internal/infrastructure/db/vtxo_repo_bench_test.go
- internal/infrastructure/wallet/wallet_client.go
- internal/infrastructure/wallet/wallet_client_test.go
Actionable comments posted: 0
@coderabbitai @arkanaai review
✅ Actions performed: Review triggered.
Arkana Review - PR #1084: Bulk vtxo pubkey lookup + chunked WatchScripts
Solid performance work. The root causes are correctly diagnosed, the fixes are clean, and the benchmark numbers are credible. A few items worth addressing before merge:
🟡 Behavioural change in
Arkana Review — PR #1084 (incremental, commit d45526b)
All three items from my previous review have been addressed:
- Multi-batch SQLite test: vtxo_repo_batching_test.go covers batch sizes {1, 2, 3, rounds-1, rounds, rounds+1, 0}, exercises dedup across batch boundaries via cross-linked commitment txids in the join table, and tests min-amount filtering separately. Good boundary coverage.
- Explicit Stop() log: Warning now states the consequence clearly: wallet may keep watching scripts until next restart. Makes the all-or-nothing behavior visible to operators.
- restoreWatchingVtxos debug log: Now includes both script count and round count for boot diagnostics.
The export_test.go pattern is the correct Go idiom for test-only access to unexported helpers.
Approving from an automated review perspective. Since this PR touches VTXO lookup paths used in startup recovery and shutdown (protocol-adjacent), a human reviewer should still sign off before merge.
Actionable comments posted: 1
🧹 Nitpick comments (1)
internal/infrastructure/db/sqlite/vtxo_repo_batching_test.go (1)
96-108: ⚡ Quick win: Add a boundary case where amount == min_amount.
The seeded amounts (100/5000/200/7500) all sit clearly above or below min_amount=1000, so this test passes identically whether the predicate is >= min_amount or > min_amount. Since this PR explicitly aligns the badger backend to >= for parity, a vtxo with amount == 1000 would lock the inclusive boundary and guard against a future > / >= regression.
💚 Suggested boundary coverage:
     insertVtxoRow(t, db, "vtxo-a-low", 0, "pubkey-a-low", 100, commitmentTxids[0])
     insertVtxoRow(t, db, "vtxo-a-high", 0, "pubkey-a-high", 5000, commitmentTxids[0])
    +insertVtxoRow(t, db, "vtxo-a-eq", 0, "pubkey-a-eq", 1000, commitmentTxids[0])
     insertVtxoRow(t, db, "vtxo-b-low", 0, "pubkey-b-low", 200, commitmentTxids[1])
     insertVtxoRow(t, db, "vtxo-b-high", 0, "pubkey-b-high", 7500, commitmentTxids[1])
    @@
    -require.ElementsMatch(t, []string{"pubkey-a-high", "pubkey-b-high"}, got)
    +require.ElementsMatch(t,
    +    []string{"pubkey-a-eq", "pubkey-a-high", "pubkey-b-high"}, got)
Inline comments:
In `@internal/infrastructure/db/sqlite/vtxo_repo_batching_test.go`:
- Around line 43-54: The test currently misses the inclusive boundary for
minAmount; in TestGetVtxoPubKeysByCommitmentTxidsBatched_MinAmount add a vtxo
whose amount equals the minAmount (1000) using the same helper used for other
vtxos (e.g., insertVtxoRow) and ensure its pubkey is included in the expected
set so the query's v.amount >= min_amount behavior is asserted; keep the
existing join-table cross-link setup (insertVtxoCommitmentTxidRow) unchanged so
dedup behavior is still exercised.
📒 Files selected for processing (4)
- internal/core/application/service.go
- internal/infrastructure/db/sqlite/export_test.go
- internal/infrastructure/db/sqlite/vtxo_repo.go
- internal/infrastructure/db/sqlite/vtxo_repo_batching_test.go
✅ Files skipped from review due to trivial changes (1)
- internal/infrastructure/db/sqlite/export_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
- internal/core/application/service.go
- internal/infrastructure/db/sqlite/vtxo_repo.go
Arkana Review — PR #1084 (incremental, commit 9de0251)
New commit adds an amount == min_amount boundary row to TestGetVtxoPubKeysByCommitmentTxidsBatched_MinAmount, explicitly locking the inclusive >= predicate across all backends. This was the last open item from my previous review.
All previous findings resolved. LGTM. ✅
Summary
arkd startup was dominated by an N+1 query in restoreWatchingVtxos whose wall-clock time grew worse than linearly with sweepable-round count, and the cross-process WatchScripts gRPC call broke at ~60k scripts due to the default 4 MiB gRPC max-message size. This PR fixes both.
Why the old code was slow
The pre-fix code in internal/core/application/service.go (called once during Start() and once during Stop()):
Two compounding problems:
1. N+1 query pattern. For N sweepable rounds we fired N+1 SQL queries: one for the round list, then one per round to fetch its vtxo pubkeys. Each query crossed the sql/sqlc boundary, prepared a statement, executed, scanned, and serialized rows back. The per-query fixed cost dominated.
2. The underlying SQL did not scale. Each per-round query hit SelectVtxoPubKeysByCommitmentTxid, which scans vtxo_vw and uses a LIKE '%,<txid>,%' membership test over a group_concat'd commitments column. That column has no usable index for the LIKE predicate, so SQLite did a full view scan per call. As the vtxo table grew, per-call latency grew with it. We measured:
That is super-linear growth: per-call cost was multiplying by ~10× as the table size grew by 10×.
After this lookup finished, the application unconditionally sent all collected scripts as a single scanner.WatchScripts call. arkd talks to arkd-wallet via gRPC, so this turned into a single /arkwallet.v1.WalletService/WatchScripts RPC carrying every script in one message. At 100 000 scripts the payload was ~7 MB, well over the gRPC default 4 MiB receive cap, so the RPC failed with ResourceExhausted. The DB stall masked this in practice (no one ever got past the loop on a populated node), but the failure mode was real.
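As a rough cross-check of the payload figures above, the following sketch estimates the message size. It assumes each script travels as a 68-character hex string ("5120" plus a 32-byte key in hex) inside a repeated string field with about 2 bytes of per-item framing; both the wire layout and the helper names are assumptions for illustration, and real messages carry some additional overhead.

```go
package main

import "fmt"

const (
	scriptHexLen       = 68              // "5120" + 64 hex chars for a 32-byte key
	perItemOverhead    = 2               // assumed: 1-byte field tag + 1-byte length prefix
	grpcDefaultMaxRecv = 4 * 1024 * 1024 // gRPC default receive cap (4 MiB)
)

// estimatePayloadBytes approximates the encoded size of n scripts.
func estimatePayloadBytes(n int) int {
	return n * (scriptHexLen + perItemOverhead)
}

// maxScriptsPerMessage approximates how many scripts fit under the cap.
func maxScriptsPerMessage() int {
	return grpcDefaultMaxRecv / (scriptHexLen + perItemOverhead)
}

// summary formats the estimate for a given script count.
func summary(n int) string {
	size := estimatePayloadBytes(n)
	return fmt.Sprintf("%d scripts ~ %d bytes, fits=%t", n, size, size <= grpcDefaultMaxRecv)
}

func main() {
	fmt.Println(summary(2000))
	fmt.Println(summary(100000))
	fmt.Println("approx. scripts per 4 MiB message:", maxScriptsPerMessage())
}
```

Under these assumptions the estimate lands near the ~7 MB and ~60k figures quoted in the text, which suggests the two numbers describe the same limit from different directions.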
How we fixed it
1. Bulk DB query
New GetVtxoPubKeysByCommitmentTxids(ctx, txids []string, minAmount uint64) ([]string, error) on the VtxoRepository interface, implemented in the sqlite, postgres, and badger backends. It returns the deduplicated set of vtxo pubkeys for any of the given commitment_txids. The application now calls it once with the full list from GetSweepableRounds. Both N+1 loops (the startup-time restoreWatchingVtxos and the shutdown-time Stop() path) are collapsed.
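The difference in call counts can be illustrated with a small in-memory stand-in. This is only a sketch: countingRepo and its methods are hypothetical and are not the VtxoRepository interface, but they mirror the shape of the per-round loop and the single bulk call.

```go
package main

import (
	"fmt"
	"sort"
)

// countingRepo is a hypothetical in-memory stand-in for the vtxo repository.
// It counts calls so the two access patterns can be compared.
type countingRepo struct {
	byTxid map[string][]string // commitment txid -> vtxo pubkeys
	calls  int
}

func (r *countingRepo) sweepableRounds() []string {
	r.calls++
	txids := make([]string, 0, len(r.byTxid))
	for txid := range r.byTxid {
		txids = append(txids, txid)
	}
	sort.Strings(txids)
	return txids
}

func (r *countingRepo) pubKeysByTxid(txid string) []string {
	r.calls++
	return r.byTxid[txid]
}

func (r *countingRepo) pubKeysByTxids(txids []string) []string {
	r.calls++
	var all []string
	for _, txid := range txids {
		all = append(all, r.byTxid[txid]...)
	}
	return dedupSorted(all)
}

// dedupSorted removes duplicates and sorts for a stable comparison.
func dedupSorted(keys []string) []string {
	seen := make(map[string]struct{}, len(keys))
	out := make([]string, 0, len(keys))
	for _, k := range keys {
		if _, ok := seen[k]; !ok {
			seen[k] = struct{}{}
			out = append(out, k)
		}
	}
	sort.Strings(out)
	return out
}

// perRoundLoop is the old shape: one call for the rounds, one per round.
func perRoundLoop(r *countingRepo) []string {
	var all []string
	for _, txid := range r.sweepableRounds() {
		all = append(all, r.pubKeysByTxid(txid)...)
	}
	return dedupSorted(all)
}

// bulkLookup is the new shape: one call for the rounds, one for all pubkeys.
func bulkLookup(r *countingRepo) []string {
	return r.pubKeysByTxids(r.sweepableRounds())
}

// report formats a result together with the number of repository calls.
func report(name string, keys []string, calls int) string {
	return fmt.Sprintf("%s: keys=%v calls=%d", name, keys, calls)
}

func main() {
	data := map[string][]string{"r1": {"a", "b"}, "r2": {"b", "c"}, "r3": {"d"}}
	loop := &countingRepo{byTxid: data}
	bulk := &countingRepo{byTxid: data}
	fmt.Println(report("loop", perRoundLoop(loop), loop.calls))
	fmt.Println(report("bulk", bulkLookup(bulk), bulk.calls))
}
```

With three rounds the loop makes 1+3 calls while the bulk path makes 2, and both return the same deduplicated set, which is the property TestVtxoPubKeysBulkMatchesLoop is described as asserting against the real backends.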
The new sqlc query reads vtxo directly instead of going through vtxo_vw, because the view's group_concat aggregation is what forced the LIKE membership scan in the first place. The bulk query matches against either the vtxo's root commitment_txid or against vtxo_commitment_txid joined on the supplied txid set.
- sqlite: IN (sqlc.slice('commitment_txids')) for the direct column match; an EXISTS (SELECT 1 FROM vtxo_commitment_txid ... IN (sqlc.slice('commitment_txids_alt'))) for the join-table match. Two slice placeholders are needed because sqlc's sqlite generator rewrites only the first occurrence of a given slice name per query; the Go caller passes the same []string to both fields, enforced by the wrapper method.
- postgres: = ANY(@commitment_txids::text[]) for both clauses (postgres reuses the bound array without ceremony).
The script-build loop in restoreWatchingVtxos now validates each pubkey via hex.DecodeString and rejects values that do not decode to exactly 32 bytes, so a single corrupted DB row cannot poison the WatchScripts payload on startup recovery.
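A minimal sketch of that validation step follows. The "5120" prefix comes from the review comment quoted earlier in this thread; the function name buildWatchScripts and the skipped counter are hypothetical, and the real loop in service.go may log or handle rejects differently.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// buildWatchScripts turns hex-encoded 32-byte pubkeys into script strings,
// skipping anything that is not valid hex or not exactly 32 bytes long.
// The name and the skipped counter are illustrative, not taken from the PR.
func buildWatchScripts(pubkeys []string) (scripts []string, skipped int) {
	for _, key := range pubkeys {
		raw, err := hex.DecodeString(key)
		if err != nil || len(raw) != 32 {
			skipped++
			continue
		}
		scripts = append(scripts, fmt.Sprintf("5120%s", key))
	}
	return scripts, skipped
}

func main() {
	valid := ""
	for i := 0; i < 32; i++ {
		valid += "ab"
	}
	notHex := ""
	for i := 0; i < 32; i++ {
		notHex += "zz" // right length, wrong alphabet
	}
	scripts, skipped := buildWatchScripts([]string{valid, notHex, "abcd"})
	fmt.Println(len(scripts), "scripts,", skipped, "skipped")
}
```

A length check alone would accept the 64-character non-hex value; decoding first is what rejects it.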
2. New index
vtxo_commitment_txid had a composite primary key (vtxo_txid, vtxo_vout, commitment_txid), so lookups by commitment_txid alone could not use the PK index. Migration 20260527150000_vtxo_commitment_txid_index adds idx_vtxo_commitment_txid_commitment_txid on vtxo_commitment_txid(commitment_txid) so the bulk join is an index seek instead of a table scan.
3. Chunked WatchScripts / UnwatchScripts
`internal/infrastructure/wallet/wallet_client.go` now splits the scripts list into chunks of `defaultWatchScriptsChunkSize` (2000, ~150 KiB per RPC) and sends them sequentially. The chunk size is a private struct field on `walletDaemonClient` that defaults to the package const; only tests construct clients with a smaller override. `chunkStrings` panics on `size <= 0` rather than silently returning the input as one chunk, since the production constructor never exposes a path to a non-positive size. Same treatment for `UnwatchScripts`. Caller-visible semantics are unchanged for any script count below the default cap.
4. Badger `>=` parity fix
The badger backend's existing `GetVtxoPubKeysByCommitmentTxid` and the new bulk variant both had a strict `>` comparison in their dedup loops, while the badgerhold query used `.Ge(amountFilter)` and the sqlite/postgres queries use `WHERE v.amount >= $1`. A VTXO with `amount == amountFilter` was silently dropped on badger but kept on the SQL backends. Both occurrences fixed to `>=` so all three backends agree on the contract.
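To make the boundary case concrete, here is a small sketch of a dedup loop with the inclusive comparison. The `vtxo` struct and `dedupPubKeys` are simplified, hypothetical stand-ins for the badger code.

```go
package main

import "fmt"

// vtxo is a simplified stand-in carrying only the fields the filter needs.
type vtxo struct {
	PubKey string
	Amount uint64
}

// dedupPubKeys keeps each pubkey once, for vtxos with Amount >= amountFilter.
// The previous strict ">" check would have dropped Amount == amountFilter.
func dedupPubKeys(vtxos []vtxo, amountFilter uint64) []string {
	seen := make(map[string]struct{}, len(vtxos))
	pubkeys := make([]string, 0, len(vtxos))
	for _, v := range vtxos {
		if v.Amount < amountFilter { // equivalent to keeping Amount >= amountFilter
			continue
		}
		if _, dup := seen[v.PubKey]; dup {
			continue
		}
		seen[v.PubKey] = struct{}{}
		pubkeys = append(pubkeys, v.PubKey)
	}
	return pubkeys
}

func main() {
	vtxos := []vtxo{
		{PubKey: "a", Amount: 999},
		{PubKey: "b", Amount: 1000}, // exactly at the filter: must be kept
		{PubKey: "b", Amount: 1500}, // duplicate pubkey: reported once
		{PubKey: "c", Amount: 2000},
	}
	fmt.Println(dedupPubKeys(vtxos, 1000)) // prints [b c]
}
```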
5. SQLite parameter-limit batching
`modernc.org/sqlite` caps total bound parameters per query at `SQLITE_MAX_VARIABLE_NUMBER = 32766`. Our bulk query binds the commitment_txids list into two distinct slice placeholders (see section 1), so the total parameter count per call is `1 + 2N`. Above ~16 380 rounds a single call would fail at bind time with an opaque "too many SQL variables" error.

The sqlite wrapper now splits the input into batches of `sqliteVtxoPubKeysBatchSize = 5000` (= 10 001 bound params worst case, generous headroom) and merges the deduplicated results in Go. For typical inputs (≤ 5000 txids) the loop runs exactly once and is indistinguishable from a single underlying query call. Postgres doesn't need batching because `ANY(@commitment_txids::text[])` binds the array as one parameter.

The batching is exposed as a private helper that takes `batchSize` as an argument, so a future regression test can drive the multi-batch path against a low batch size without touching production code.
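A rough sketch of the batching shape and the parameter arithmetic behind it. The names `lookupBatched`, `queryFunc`, and `boundParams` are hypothetical, and the real helper's signature (which presumably also takes a context and the amount filter) is not reproduced here.

```go
package main

import (
	"errors"
	"fmt"
)

// sqliteMaxVariableNumber is the per-query bind limit quoted above.
const sqliteMaxVariableNumber = 32766

// boundParams is the bind count for one query: one amount parameter plus
// the txid list bound twice (once per slice placeholder).
func boundParams(numTxids int) int { return 1 + 2*numTxids }

// queryFunc stands in for one underlying sqlite query over a batch of txids.
type queryFunc func(txids []string) ([]string, error)

// lookupBatched runs query once per batch of at most batchSize txids and
// merges the results, keeping each pubkey once in first-seen order.
func lookupBatched(txids []string, batchSize int, query queryFunc) ([]string, error) {
	if batchSize <= 0 {
		return nil, errors.New("batch size must be positive")
	}
	seen := make(map[string]struct{})
	var merged []string
	for start := 0; start < len(txids); start += batchSize {
		end := start + batchSize
		if end > len(txids) {
			end = len(txids)
		}
		pubkeys, err := query(txids[start:end])
		if err != nil {
			return nil, fmt.Errorf("batch starting at index %d: %w", start, err)
		}
		for _, pk := range pubkeys {
			if _, dup := seen[pk]; dup {
				continue
			}
			seen[pk] = struct{}{}
			merged = append(merged, pk)
		}
	}
	return merged, nil
}

func main() {
	queries := 0
	fake := func(txids []string) ([]string, error) {
		queries++
		// Every batch returns one shared pubkey plus one unique to the batch.
		return []string{"shared", "pk-" + txids[0]}, nil
	}
	merged, err := lookupBatched([]string{"a", "b", "c", "d", "e"}, 2, fake)
	fmt.Println(merged, queries, err)
	fmt.Println(boundParams(5000), boundParams(5000) <= sqliteMaxVariableNumber)
}
```

Driving the helper with a batch size of 2 exercises the multi-batch merge path without needing thousands of rows.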
Tunable constants
Two constants control the chunking / batching behaviour. Both are currently declared at package scope in their respective source files with sensible defaults; happy to relocate either to the config layer (env var via `viper` in `internal/config/config.go`, or a knob on the appropriate service struct) if reviewers prefer deployment-time overrides without recompiling.

| Constant | File | Default | Purpose |
|---|---|---|---|
| `defaultWatchScriptsChunkSize` | `internal/infrastructure/wallet/wallet_client.go` | 2000 | Scripts per `WatchScripts`/`UnwatchScripts` gRPC call. At ~75 bytes/script this is ~150 KiB per payload, comfortably under the default gRPC 4 MiB cap. |
| `sqliteVtxoPubKeysBatchSize` | `internal/infrastructure/db/sqlite/vtxo_repo.go` | 5000 | Txids per underlying sqlite query, so that bound parameters (`1 + 2N`) stay under `SQLITE_MAX_VARIABLE_NUMBER = 32766`. At 5000 the worst case is 10 001 params, ~3× headroom. |

Both are also test-injectable today: `defaultWatchScriptsChunkSize` is overridden via the private `chunkSize` field on `walletDaemonClient`, and `sqliteVtxoPubKeysBatchSize` is passed as a `batchSize` argument to the testable inner helper `getVtxoPubKeysByCommitmentTxidsBatched`.
Expected and measured performance
Per-iteration `restoreWatchingVtxos` cost on sqlite was measured with Go `testing.B` (`-benchtime=1x`, local laptop).

DB-call count is constant at 2 (one `GetSweepableRounds`, one `GetVtxoPubKeysByCommitmentTxids`) for N up to `sqliteVtxoPubKeysBatchSize`, and `1 + ceil(N / 5000)` calls beyond that, since each additional batch adds one underlying query. The same savings apply to graceful-shutdown `stopWatchingVtxos`, so shutdown on a populated node goes from minutes to sub-second.
End-to-end startup measurement at the (1000 rounds, 10 vtxos/round) scale: `appsvc.restoreWatchingVtxos` wall-clock dropped from 42.17s to 0.6s in our integration test (the higher number includes `GetSweepableRounds` and the WatchScripts HTTP round trip, which the benchmark isolates out).
Tests
- `internal/infrastructure/wallet/wallet_client_test.go` - 14 sub-tests covering `chunkStrings`, `WatchScripts` and `UnwatchScripts`: nil/empty input → no gRPC calls, exact chunk boundary, single chunk under limit, splits above boundary, large inputs round-trip intact, mid-chunk error short-circuits subsequent chunks, default chunk size used when unset on the client, and a `TestChunkStringsBadSizePanics` that asserts the helper panics on size 0 / -1 / -1000
- `internal/infrastructure/db/vtxo_repo_bench_test.go`:
  - `TestVtxoPubKeysBulkMatchesLoop` seeds 50 sweepable rounds and asserts the bulk method returns the exact deduplicated union of per-txid results
  - `BenchmarkGetVtxoPubKeysByCommitmentTxid_PerTxidLoop` at `rounds={10, 100, 1000}`
  - `BenchmarkGetVtxoPubKeysByCommitmentTxids_Bulk` at `rounds={10, 100, 1000, 5000}`
- `internal/infrastructure/db/service_test.go` - bulk method coverage added inline in `testVtxoRepository`: single-txid match, multi-txid union (asserting `tapkey1`/`tapkey2`/`tapkey3` from one commitment txid plus `pubkey`/`pubkey2` from the other, with explicit no-dups assertion), amount filter, nil slice, non-existent txid
When re-running the benchmarks, `-benchtime=1x` is the recommended setting: the slow per-txid case is inherently slow and we want a single representative measurement, not Go's auto-scaling. With `-timeout=10m` no run timed out.
Files
Production:
- `internal/core/domain/vtxo_repo.go` - interface method
- `internal/infrastructure/db/sqlite/sqlc/query.sql` + regenerated artifact
- `internal/infrastructure/db/postgres/sqlc/query.sql` + regenerated artifact
- `internal/infrastructure/db/sqlite/vtxo_repo.go` - bulk method plus parameter-limit batching helper
- `internal/infrastructure/db/postgres/vtxo_repo.go` - bulk method
- `internal/infrastructure/db/badger/vtxo_repo.go` - bulk method, plus `>=` parity fix on both the pre-existing singular method and the new bulk method
- `internal/infrastructure/db/{sqlite,postgres}/migration/20260527150000_vtxo_commitment_txid_index.{up,down}.sql`
- `internal/core/application/service.go` - both N+1 loops replaced, plus hex-decode validation of each tap key before script build
- `internal/infrastructure/wallet/wallet_client.go` - chunking with struct-field chunk size

Tests:
- `internal/infrastructure/wallet/wallet_client_test.go`
- `internal/infrastructure/db/vtxo_repo_bench_test.go`
- `internal/infrastructure/db/service_test.go` (inline additions)

Summary by CodeRabbit
Performance
New Features
Bug Fixes
Tests