Skip to content

Bulk vtxo pubkey lookup + chunked WatchScripts on startup#1084

Open
bitcoin-coder-bob wants to merge 7 commits into
masterfrom
bob/nbexplorer-enhancement
Open

Bulk vtxo pubkey lookup + chunked WatchScripts on startup#1084
bitcoin-coder-bob wants to merge 7 commits into
masterfrom
bob/nbexplorer-enhancement

Conversation

@bitcoin-coder-bob
Copy link
Copy Markdown
Collaborator

@bitcoin-coder-bob bitcoin-coder-bob commented May 27, 2026

🏷️ Branch naming nit: the branch is called bob/nbexplorer-enhancement,
which is misleading. The PR isn't about NBXplorer; it's about the
arkd-side bulk vtxo pubkey lookup and the chunked WatchScripts gRPC.
The name is a leftover from where the investigation started (profiling
NBXplorer usage to find what was slow on boot). Not worth renaming
mid-review since GitHub's branch-rename API auto-closed the PR when I
tried it, and the review history is more valuable than a tidy branch
name. The branch is deleted on merge anyway.

Summary

arkd startup was dominated by an N+1 query in restoreWatchingVtxos whose
wall-clock time grew worse than linearly with sweepable-round count, and
the cross-process WatchScripts gRPC call broke at ~60k scripts due to
the default 4 MiB gRPC max-message size. This PR fixes both.

Why the old code was slow

The pre-fix code in internal/core/application/service.go (called once
during Start() and once during Stop()):

commitmentTxIds, _ := s.repoManager.Rounds().GetSweepableRounds(ctx)
for _, commitmentTxId := range commitmentTxIds {
    tapKeys, _ := s.repoManager.Vtxos().
        GetVtxoPubKeysByCommitmentTxid(ctx, commitmentTxId, 0)
    // ...
}

Two compounding problems:

  1. N+1 query pattern. For N sweepable rounds we fired N+1 SQL
    queries: one for the round list, then one per round to fetch its
    vtxo pubkeys. Each query crossed the sql/sqlc boundary, prepared a
    statement, executed, scanned, and serialized rows back. The
    per-query fixed cost dominated.

  2. The underlying SQL did not scale. Each per-round query hit
    SelectVtxoPubKeysByCommitmentTxid, which scans vtxo_vw and uses
    a LIKE '%,<txid>,%' membership test over a group_concat'd
    commitments column. That column has no usable index for the LIKE
    predicate, so SQLite did a full view scan per call. As the vtxo
    table grew, per-call latency grew with it. We measured:

    Rounds Scripts (= rounds × 10) restoreWatchingVtxos wall-clock per-call avg
    0 0 <1ms (skipped)
    10 100 16ms 1.6ms
    100 1 000 307ms 3.1ms
    1 000 10 000 46.9s 46ms
    10 000 100 000 did not complete in 3 min n/a

    That is super-linear growth: per-call cost was multiplying by ~10×
    as the table size grew by 10×.

After this lookup finished, the application unconditionally sent all
collected scripts as a single scanner.WatchScripts call. arkd talks
to arkd-wallet via gRPC, so this turned into a single
/arkwallet.v1.WalletService/WatchScripts RPC carrying every script
in one message. At 100 000 scripts the payload was ~7 MB, well over
the gRPC default 4 MiB receive cap, so the RPC failed with
ResourceExhausted. The DB stall masked this in practice (no one
ever got past the loop on a populated node), but the failure mode was
real.

How we fixed it

1. Bulk DB query

New GetVtxoPubKeysByCommitmentTxids(ctx, txids []string, minAmount uint64) ([]string, error) on the VtxoRepository interface, implemented in
the sqlite, postgres, and badger backends. Returns the deduplicated
set of vtxo pubkeys for any of the given commitment_txids. The
application now calls it once with the full list from
GetSweepableRounds. Both N+1 loops (the startup-time
restoreWatchingVtxos and the shutdown-time Stop() path) are
collapsed.

The new sqlc query reads vtxo directly instead of going through
vtxo_vw, because the view's group_concat aggregation is what
forced the LIKE membership scan in the first place. The bulk query
matches against either the vtxo's root commitment_txid or against
vtxo_commitment_txid joined on the supplied txid set.

  • sqlite: IN (sqlc.slice('commitment_txids')) for the direct
    column match; an EXISTS (SELECT 1 FROM vtxo_commitment_txid ... IN (sqlc.slice('commitment_txids_alt'))) for the join-table match.
    Two slice placeholders are needed because sqlc's sqlite generator
    rewrites only the first occurrence of a given slice name per query;
    the Go caller passes the same []string to both fields, enforced
    by the wrapper method.
  • postgres: = ANY(@commitment_txids::text[]) for both clauses
    (postgres reuses the bound array without ceremony).

The script-build loop in restoreWatchingVtxos now validates each
pubkey via hex.DecodeString and rejects values that do not decode
to exactly 32 bytes, so a single corrupted DB row cannot poison the
WatchScripts payload on startup recovery.

2. New index

vtxo_commitment_txid had a composite primary key
(vtxo_txid, vtxo_vout, commitment_txid), so lookups by
commitment_txid alone could not use the PK index. Migration
20260527150000_vtxo_commitment_txid_index adds
idx_vtxo_commitment_txid_commitment_txid on
vtxo_commitment_txid(commitment_txid) so the bulk join is an
index seek instead of a table scan.

3. Chunked WatchScripts / UnwatchScripts

internal/infrastructure/wallet/wallet_client.go now splits the
scripts list into chunks of defaultWatchScriptsChunkSize (2000,
~150 KiB per RPC) and sends them sequentially. The chunk size is a
private struct field on walletDaemonClient that defaults to the
package const; only tests construct clients with a smaller override.
chunkStrings panics on size <= 0 rather than silently returning
the input as one chunk, since the production constructor never
exposes a path to a non-positive size. Same treatment for
UnwatchScripts. Caller-visible semantics are unchanged for any
script count below the default cap.

4. Badger >= parity fix

The badger backend's existing GetVtxoPubKeysByCommitmentTxid and
the new bulk variant both had a strict > comparison in their dedup
loops, while the badgerhold query used .Ge(amountFilter) and the
sqlite/postgres queries use WHERE v.amount >= $1. A VTXO with
amount == amountFilter was silently dropped on badger but kept on
the SQL backends. Both occurrences fixed to >= so all three
backends agree on the contract.

5. SQLite parameter-limit batching

modernc.org/sqlite caps total bound parameters per query at
SQLITE_MAX_VARIABLE_NUMBER = 32766. Our bulk query binds the
commitment_txids list into two distinct slice placeholders (see #1),
so the total parameter count per call is 1 + 2N. Above ~16 380
rounds a single call would fail at bind time with an opaque "too many
SQL variables" error.

The sqlite wrapper now splits the input into batches of
sqliteVtxoPubKeysBatchSize = 5000 (= 10 001 bound params worst
case, generous headroom) and merges the deduplicated results in Go.
For typical inputs (≤ 5000 txids) the loop runs exactly once and is
indistinguishable from a single underlying query call. Postgres
doesn't need batching because ANY(@commitment_txids::text[])
binds the array as one parameter.

The batching is exposed as a private helper that takes batchSize
as an argument, so a future regression test can drive the multi-batch
path against a low batch size without touching production code.

Tunable constants

Two constants control the chunking / batching behaviour. Both are
currently declared at package scope in their respective source files
with sensible defaults; happy to relocate either to the config layer
(env var via viper in internal/config/config.go, or a knob on the
appropriate service struct) if reviewers prefer deployment-time
overrides without recompiling.

Constant File Default Purpose
defaultWatchScriptsChunkSize internal/infrastructure/wallet/wallet_client.go 2000 Max scripts per WatchScripts / UnwatchScripts gRPC call. At ~75 bytes/script this is ~150 KiB per payload, comfortably under the default gRPC 4 MiB cap.
sqliteVtxoPubKeysBatchSize internal/infrastructure/db/sqlite/vtxo_repo.go 5000 Max commitment_txids per bulk query call so total bound params (1 + 2N) stay under SQLITE_MAX_VARIABLE_NUMBER = 32766. At 5000 the worst case is 10 001 params, ~3× headroom.

Both are also test-injectable today: defaultWatchScriptsChunkSize is
overridden via the private chunkSize field on walletDaemonClient,
and sqliteVtxoPubKeysBatchSize is passed as a batchSize argument to
the testable inner helper getVtxoPubKeysByCommitmentTxidsBatched.

Expected and measured performance

Per-iteration restoreWatchingVtxos cost on sqlite (Go testing.B,
-benchtime=1x, local laptop):

Rounds Scripts Old (per-txid loop) New (bulk) Speedup DB calls before DB calls after
10 100 3.08ms 0.30ms 10× 11 2
100 1 000 257ms 2.27ms 113× 101 2
1 000 10 000 45.46s 35.1ms 1297× 1 001 2
5 000 50 000 (not measured; extrapolation > 4 min) 266ms (>900×) 5 001 2
10 000 100 000 did not complete in 3 min <1s in our integration run (>>1000×) 10 001 2

DB-call count is constant at 2 regardless of N (one
GetSweepableRounds, one GetVtxoPubKeysByCommitmentTxids) up to
sqliteVtxoPubKeysBatchSize, and 2 + ceil(N / 5000) calls beyond
that. The same savings apply to graceful-shutdown
stopWatchingVtxos, so shutdown on a populated node goes from
minutes to sub-second.

End-to-end startup measurement at the (1000 rounds, 10 vtxos/round)
scale: appsvc.restoreWatchingVtxos wall-clock dropped from 42.17s
to 0.6s in our integration test (the higher number includes
GetSweepableRounds and the WatchScripts HTTP RT, which the
benchmark isolates out).

Tests

  • internal/infrastructure/wallet/wallet_client_test.go - 14
    sub-tests covering chunkStrings, WatchScripts and
    UnwatchScripts: nil/empty input → no gRPC calls, exact chunk
    boundary, single chunk under limit, splits above boundary, large
    inputs round-trip intact, mid-chunk error short-circuits subsequent
    chunks, default chunk size used when unset on the client, and a
    TestChunkStringsBadSizePanics that asserts the helper panics on
    size 0 / -1 / -1000
  • internal/infrastructure/db/vtxo_repo_bench_test.go:
    • TestVtxoPubKeysBulkMatchesLoop seeds 50 sweepable rounds and
      asserts the bulk method returns the exact deduplicated union of
      per-txid results
    • BenchmarkGetVtxoPubKeysByCommitmentTxid_PerTxidLoop at
      rounds={10, 100, 1000}
    • BenchmarkGetVtxoPubKeysByCommitmentTxids_Bulk at
      rounds={10, 100, 1000, 5000}
  • internal/infrastructure/db/service_test.go - bulk method coverage
    added inline in testVtxoRepository: single-txid match, multi-txid
    union (asserting tapkey1/tapkey2/tapkey3 from one commitment
    txid plus pubkey/pubkey2 from the other, with explicit no-dups
    assertion), amount filter, nil slice, non-existent txid

To re-run benchmarks:

go test -count=1 -run=NONE -bench BenchmarkGetVtxoPubKeys \
  -benchtime=1x -timeout=10m ./internal/infrastructure/db/

-benchtime=1x is the recommended setting: the slow per-txid case is
inherently slow and we want a single representative measurement, not
Go's auto-scaling. With -timeout=10m no run timed out.

Files

Production:

  • internal/core/domain/vtxo_repo.go - interface method
  • internal/infrastructure/db/sqlite/sqlc/query.sql + regenerated
    artifact
  • internal/infrastructure/db/postgres/sqlc/query.sql + regenerated
    artifact
  • internal/infrastructure/db/sqlite/vtxo_repo.go - bulk method plus
    parameter-limit batching helper
  • internal/infrastructure/db/postgres/vtxo_repo.go - bulk method
  • internal/infrastructure/db/badger/vtxo_repo.go - bulk method,
    plus >= parity fix on both the pre-existing singular method and
    the new bulk method
  • internal/infrastructure/db/{sqlite,postgres}/migration/20260527150000_vtxo_commitment_txid_index.{up,down}.sql
  • internal/core/application/service.go - both N+1 loops replaced,
    plus hex-decode validation of each tap key before script build
  • internal/infrastructure/wallet/wallet_client.go - chunking with
    struct-field chunk size

Tests:

  • internal/infrastructure/wallet/wallet_client_test.go
  • internal/infrastructure/db/vtxo_repo_bench_test.go
  • internal/infrastructure/db/service_test.go (inline additions)

Summary by CodeRabbit

  • Performance

    • Bulk DB lookups for script recovery and chunked gRPC calls for watch/unwatch to reduce repeated per-item work; watch/unwatch now split large requests and short-circuit on errors
    • Added database indices to improve query performance
  • New Features

    • Bulk retrieval of VTXO pubkeys across multiple commitment IDs with validation and deduplication
  • Bug Fixes

    • VTXO amount filter now includes values equal to the minimum threshold
  • Tests

    • New unit and benchmark tests covering bulk lookups, batching, chunking, and edge cases

Review Change Stack

restoreWatchingVtxos and the matching Stop() path previously fired one
GetVtxoPubKeysByCommitmentTxid call per sweepable round. At 1000 rounds
this took ~46s on sqlite (1001 queries); at 10000 it did not complete
in 3 minutes. The view's LIKE-based membership scan over the
group_concat'd commitments column scales worse than O(N).

Add a bulk variant GetVtxoPubKeysByCommitmentTxids on the VtxoRepository
interface and all three backends (sqlite, postgres, badger). Replace
both N+1 loops in internal/core/application/service.go with a single
bulk call. New sqlc query uses sqlc.slice in sqlite (twice, because
the generator only rewrites the first occurrence per query) and
ANY($1::text[]) in postgres. New migration adds an index on
vtxo_commitment_txid(commitment_txid) so the bulk join is fast.

Benchmark (Go testing.B, sqlite + badger, localhost):

  rounds=10    per-txid  3.08ms     bulk    0.30ms     10x
  rounds=100   per-txid  257ms      bulk    2.27ms     113x
  rounds=1000  per-txid  45.46s     bulk    35.1ms     1297x
  rounds=5000  bulk      266ms

DB calls drop from 1+N to 2 regardless of N.

The cross-process WatchScripts gRPC call from arkd to arkd-wallet
also hit the default 4 MiB max-message-size at 100k+ scripts. Chunk
the scripts list in walletclient (WatchScripts and UnwatchScripts)
at watchScriptsChunkSize (default 2000, ~150 KiB per call).

Tests added:
- TestChunkStrings + TestWalletClientWatchScriptsChunking +
  TestWalletClientUnwatchScriptsChunking exercise chunk boundaries,
  large inputs, and mid-chunk error short-circuit
- TestVtxoPubKeysBulkMatchesLoop seeds 50 rounds and asserts the bulk
  result equals the union of per-txid results
- Bulk method coverage added inline in testVtxoRepository for the
  existing service_test loop
- BenchmarkGetVtxoPubKeysByCommitmentTxid_PerTxidLoop and
  BenchmarkGetVtxoPubKeysByCommitmentTxids_Bulk for repeatable A/B

Throwaway instrumentation kept on the branch (will be removed in a
follow-up commit once we're done re-bench'ing): nbxplorer
RoundTripper profiler, startup phase markers in arkd and arkd-wallet,
cmd/arkd-bench-seed for seeding synthetic sweepable rounds.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 27, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d24c25d7-2211-41ef-b8d8-d17ecea1876d

📥 Commits

Reviewing files that changed from the base of the PR and between d45526b and 9de0251.

📒 Files selected for processing (1)
  • internal/infrastructure/db/sqlite/vtxo_repo_batching_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/infrastructure/db/sqlite/vtxo_repo_batching_test.go

Walkthrough

Adds a bulk VTXO pubkey lookup across commitment txids (Badger/Postgres/SQLite), migrations and sqlc queries, refactors service Stop/restoreWatchingVtxos to use the bulk path and validate returned pubkeys, and implements wallet client WatchScripts/UnwatchScripts chunking with tests and benchmarks.

Changes

Bulk VTXO Pubkey Query Optimization

Layer / File(s) Summary
Domain interface and amount filter contract
internal/core/domain/vtxo_repo.go, internal/infrastructure/db/badger/vtxo_repo.go
VtxoRepository interface extended with GetVtxoPubKeysByCommitmentTxids bulk method; amount filtering changed from > to >=.
Database indexes for commitment_txid lookup
internal/infrastructure/db/postgres/migration/..., internal/infrastructure/db/sqlite/migration/...
Postgres and SQLite migrations add idx_vtxo_commitment_txid_commitment_txid index on vtxo_commitment_txid(commitment_txid) with IF NOT EXISTS and matching down scripts.
Badger bulk pubkey implementation
internal/infrastructure/db/badger/vtxo_repo.go
Badger repository implements bulk scans (root and slice-intersection), applies Amount >= filter, and returns deduplicated pubkeys.
Postgres sqlc query and repository
internal/infrastructure/db/postgres/sqlc/query.sql, internal/infrastructure/db/postgres/sqlc/queries/query.sql.go, internal/infrastructure/db/postgres/vtxo_repo.go
sqlc bulk query selects DISTINCT pubkeys via direct match or EXISTS against join table; repository wraps query with empty-input early-return.
SQLite sqlc query, batching, and repository
internal/infrastructure/db/sqlite/sqlc/query.sql, internal/infrastructure/db/sqlite/sqlc/queries/query.sql.go, internal/infrastructure/db/sqlite/vtxo_repo.go, internal/infrastructure/db/sqlite/export_test.go
sqlc bulk query uses two IN clause params; repository batches commitment txids to respect bind limits, binds both params per chunk, deduplicates merged results, and exposes a test wrapper.
Bulk query tests and benchmarks
internal/infrastructure/db/service_test.go, internal/infrastructure/db/vtxo_repo_bench_test.go, internal/infrastructure/db/sqlite/vtxo_repo_batching_test.go
Unit tests validate deduplicated union semantics and min_amount filtering; benchmarks compare per-txid loop vs bulk query; SQLite batching tests validate batch merging and thresholds.

Application Service Refactoring

Layer / File(s) Summary
Service method refactoring to bulk lookup
internal/core/application/service.go
Stop() and restoreWatchingVtxos() refactored to use a single bulk pubkey fetch for all sweepable commitment txids, validate/convert returned pubkeys to scripts, short-circuit on empty inputs/results, and call WatchScripts with chunking-aware payloads.

Wallet Client Script Chunking

Layer / File(s) Summary
Wallet client chunking implementation
internal/infrastructure/wallet/wallet_client.go
walletDaemonClient adds chunkSize field and helpers; WatchScripts and UnwatchScripts split scripts into sequential chunks and issue one gRPC call per chunk; empty input returns immediately.
Wallet client chunking tests
internal/infrastructure/wallet/wallet_client_test.go
Tests verify chunkStrings correctness and panic on invalid sizes; WatchScripts/UnwatchScripts tests validate chunking boundaries, ordering, error propagation, short-circuiting, and default chunk-size fallback.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

  • altafan
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 61.90% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Bulk vtxo pubkey lookup + chunked WatchScripts on startup' directly and accurately summarizes the two primary changes: (1) implementing bulk database queries for vtxo pubkey lookup to replace N+1 patterns, and (2) adding chunked gRPC message handling for WatchScripts during startup.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch bob/nbexplorer-enhancement

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@bitcoin-coder-bob bitcoin-coder-bob changed the title Speed up arkd startup: bulk vtxo pubkey query + chunked WatchScripts Bulk vtxo pubkey lookup + chunked WatchScripts on startup May 27, 2026
Copy link
Copy Markdown
Contributor

@arkanaai arkanaai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review: arkd#1084 — Bulk vtxo pubkey query + chunked WatchScripts

The core optimization (N+1 → 2 DB calls, chunked gRPC) is well-motivated and the benchmark numbers are compelling. Tests are thorough. However, there are issues that must be addressed before merge.


🔴 Must Fix

1. Badger amount filter: >= in query vs > in dedup loop
internal/infrastructure/db/badger/vtxo_repo.go — new GetVtxoPubKeysByCommitmentTxids

The badgerhold query uses .Ge(amountFilter) (≥), but the Go dedup loop uses strict >:

// Lines in new method — two occurrences:
if vtxo.Amount > amountFilter {
    pubkeyMap[vtxo.PubKey] = struct{}{}
}

The SQL backends use WHERE v.amount >= $1 consistently. A VTXO with amount == amountFilter will be included by sqlite/postgres but excluded by badger. This is copied from the pre-existing singular method (same bug there), but since you're touching this code, please fix both.

Today's callers pass 0 so it's not biting in production, but the contract is >= per the SQL definitions, and a future caller passing a real threshold will get inconsistent results across backends.

2. Throwaway instrumentation must not merge to master

The PR description says "Throwaway instrumentation kept on the branch (will be removed in a follow-up commit once we're done re-bench'ing)" — but this PR targets master. The following must be removed or split to a separate non-merge branch:

  • cmd/arkd-bench-seed/main.go (entire binary, 255 lines)
  • internal/core/application/startup_profile.go (20 lines + all startupProfilePhase() calls in service.go)
  • pkg/arkd-wallet/core/infrastructure/nbxplorer/profile.go (205 lines, includes init() that logs on every import)
  • pkg/arkd-wallet/core/infrastructure/nbxplorer/service.go — the newProfilingHTTPClient() swap adds a regex-heavy RoundTripper to every NBXplorer HTTP call in production
  • pkg/arkd-wallet/config/config.goPhaseMark() calls
  • pkg/arkd-wallet/core/application/wallet/service.goPhaseMark() call + import
  • pkg/arkd-wallet/interface/grpc/handlers/wallet_handler.goPhaseMark() + LogSummary() calls + imports

This is ~500 lines of debug code that adds runtime overhead (mutex + append on every HTTP call, 6 compiled regexes at init), pollutes logs with [startup-profile] noise, and leaks a local developer path (/home/bob/.claude/plans/...).

3. chunkStrings silent pass-through on size <= 0
internal/infrastructure/wallet/wallet_client.go

if size <= 0 || len(in) == 0 {
    return [][]string{in}
}

If watchScriptsChunkSize is accidentally set to 0 or negative, this silently sends the entire slice as one chunk — exactly the scenario chunking is supposed to prevent. This should either panic (it's a programming error) or clamp to a safe default with a log warning.


🟡 Should Fix

4. watchScriptsChunkSize is a mutable package-level var
internal/infrastructure/wallet/wallet_client.go

var watchScriptsChunkSize = 2000

Exported mutability for testability is a known Go antipattern. Any goroutine could race on this. Options: unexported const + test-only override via a struct field, or a //go:build !production guard.

5. SQLite dual-slice parameter is fragile
internal/infrastructure/db/sqlite/vtxo_repo.go

The caller must pass the same slice as both CommitmentTxids and CommitmentTxidsAlt. If someone later passes different slices, results will silently diverge. Add a comment on the SelectVtxoPubKeysByCommitmentTxidsParams struct or add a validation check.


✅ Looks Good

  • Semantic equivalence of bulk vs singular queries: The switch from vtxo_vw + LIKE to vtxo + EXISTS against vtxo_commitment_txid is correct and actually more robust (no comma-parsing edge cases). Confirmed by TestVtxoPubKeysBulkMatchesLoop.
  • New index idx_vtxo_commitment_txid_commitment_txid — correct, matches the EXISTS subquery join pattern.
  • Chunked WatchScripts/UnwatchScripts — clean implementation with good test coverage (boundary, error short-circuit, reassembly).
  • No cross-repo breakageVtxoRepository is internal to arkd; no external repos import it. The new method is additive.
  • Test coverage — bulk-matches-loop equivalence test, inline service_test coverage, chunk boundary tests, error propagation tests. Well done.

⚠️ Protocol-Critical: Human Review Required

This PR modifies restoreWatchingVtxos() and Stop() in service.go — these paths determine which VTXOs are monitored for sweeps after restart and shutdown. A bug here means missed sweep notifications → users lose funds. The refactoring looks semantically correct to me, but per policy this requires explicit human sign-off before merge.

Copy link
Copy Markdown
Contributor

@arkanaai arkanaai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up Review (new commits since last review)

The new commit 4cd3a099 is a gofmt pass on cmd/arkd-bench-seed/main.go only. None of the 3 must-fix issues from my previous review have been addressed. Repeating them here for tracking:


🔴 Still Open — Must Fix

1. Badger > vs >= amount filterinternal/infrastructure/db/badger/vtxo_repo.go
Two occurrences of vtxo.Amount > amountFilter should be >= to match sqlite/postgres WHERE v.amount >= $1. A VTXO with amount == amountFilter is silently dropped by badger but kept by SQL backends.

2. Throwaway instrumentation still in the diff — ~500 lines of debug/profiling code (startup_profile.go, profile.go, PhaseMark() calls, cmd/arkd-bench-seed/, newProfilingHTTPClient()) targeting master. This adds runtime overhead (mutex + append per HTTP call, 6 compiled regexes at init) and leaks a local dev path. Must be removed before merge.

3. chunkStrings silent pass-through on size <= 0internal/infrastructure/wallet/wallet_client.go:~line with size <= 0
Still returns the entire slice as one chunk when size is ≤ 0, defeating the purpose of chunking. Should panic or clamp to a safe default.


🟡 Still Open — Should Fix

4. watchScriptsChunkSize mutable package var — data race risk if tests run in parallel.

5. SQLite dual-slice parameter fragility — same slice passed as both CommitmentTxids and CommitmentTxidsAlt with no guard or comment.


⚠️ Protocol-Critical: Human Review Still Required

restoreWatchingVtxos() and Stop() determine which VTXOs are monitored for sweeps. A bug here = missed sweep = user fund loss. Requires explicit human sign-off before merge regardless of code quality.

Please address the must-fix items before the next push.

Copy link
Copy Markdown
Contributor

@arkanaai arkanaai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up Review — commit 4f9b556

All 3 must-fix and both should-fix items from my previous reviews are resolved:

🔴 Must Fix — ✅ Resolved

1. Badger >>= amount filter — Fixed in both GetVtxoPubKeysByCommitmentTxid (singular) and GetVtxoPubKeysByCommitmentTxids (bulk) in internal/infrastructure/db/badger/vtxo_repo.go. All 4 occurrences now use >= to match the sqlite/postgres WHERE v.amount >= $1 contract. Good comments added explaining the rationale.

2. Throwaway instrumentation removed — All ~500 lines of debug code deleted:

  • cmd/arkd-bench-seed/main.go
  • internal/core/application/startup_profile.go
  • pkg/arkd-wallet/core/infrastructure/nbxplorer/profile.go
  • All startupProfilePhase() / PhaseMark() / LogSummary() call sites ✅
  • newProfilingHTTPClient() → plain &http.Client{}

3. chunkStrings size ≤ 0 handling — Now panics with a descriptive message. TestChunkStringsBadSizePanics covers {0, -1, -1000}. Production path is guarded by effectiveChunkSize() which always returns > 0. Correct design.

🟡 Should Fix — ✅ Resolved

4. Mutable package var → struct fieldwatchScriptsChunkSize var replaced with const defaultWatchScriptsChunkSize + per-instance chunkSize field on walletDaemonClient. Tests use newTestClient(fake, size) — no global mutation, parallel-safe. New test default_chunk_size_used_when_unset confirms the fallback path.

5. SQLite dual-slice fragility — Expanded doc comment on GetVtxoPubKeysByCommitmentTxids now clearly explains why two fields exist and that they MUST receive the same slice. Inline comment at the call site reinforces this. The wrapper method is the single call path, preventing misuse.

No new issues found in 4f9b556

The diff is a clean subtraction of debug code plus targeted fixes. No new logic, no new code paths, no behavioral changes beyond the corrections above.


⚠️ Protocol-Critical: Human Review Still Required

This PR modifies restoreWatchingVtxos() and Stop() in service.go — these paths determine which VTXOs are monitored for sweeps after restart/shutdown. A bug here means missed sweep notifications → users lose funds. The refactoring looks semantically equivalent to me (confirmed by TestVtxoPubKeysBulkMatchesLoop), but per policy this requires explicit human sign-off before merge.

Approving from my side — over to a human reviewer for the final ✅.

@bitcoin-coder-bob
Copy link
Copy Markdown
Collaborator Author

@arkanaai @coderabbitai review

@arkanaai
Copy link
Copy Markdown
Contributor

arkanaai Bot commented May 28, 2026

Arkana review — Bulk vtxo pubkey lookup + chunked WatchScripts on startup

Overall this is a well-motivated, well-documented fix. The N+1 pattern in restoreWatchingVtxos was a real scalability cliff and the benchmark numbers make the impact undeniable. The chunked gRPC path closes the second independent failure mode cleanly. A few things worth discussing before this comes out of draft:


🔴 Potential blocker — SQLite max variable count

SelectVtxoPubKeysByCommitmentTxids on SQLite expands both IN (…) clauses with one ? placeholder per txid. SQLite's SQLITE_MAX_VARIABLE_NUMBER defaults to 999 in many builds (it was raised to 32766 in SQLite ≥ 3.32, but the effective limit depends on how the embedded library was compiled). A node with > ~500 sweepable rounds will bind > 999 variables across the two slices and will hit:

SQLITE_RANGE: bind or column index out of range

at runtime (not at compile time, not in tests with 50 rounds). This deserves either:

  • a fallback that batches the IN clause itself (e.g. 400 txids per sub-query, UNIONed), or
  • a startup assertion / documented minimum SQLite version, or
  • a check that logs a warning when len(commitmentTxIds) > some_safe_threshold and falls back to the old per-txid loop.

Postgres is fine (it uses ANY($n::text[]) which has no practical array-size limit).


🟡 Notable correctness fix — badger >>=

The >>= fix in GetVtxoPubKeysByCommitmentTxid (singular, pre-existing) is a silent cross-backend divergence that was already shipped. Any badger node that called this with withMinimumAmount > 0 was silently dropping vtxos with amount == withMinimumAmount. Since the callers in this PR all pass 0, it doesn't affect the new code paths, but it does affect any future caller or any existing usage I might have missed. Worth confirming there are no callers with non-zero amountFilter in production that would now see a different (larger) result set after this fix lands.


🟡 CommitmentTxidsAlt footgun in generated SQLite params

type SelectVtxoPubKeysByCommitmentTxidsParams struct {
    MinAmount          int64
    CommitmentTxids    []string
    CommitmentTxidsAlt []string  // MUST == CommitmentTxids
}

The generated struct is exported and the constraint "both fields must be the same slice" is only documented in comments. Any future caller that goes through the generated Queries.SelectVtxoPubKeysByCommitmentTxids directly (bypassing the repo wrapper) could silently produce wrong results. Since sqlc regenerates this struct, the safest mitigation is probably to keep the repo wrapper as the sole entry point and add a //nolint or //go:generate note that the generated function is not for direct use. Or rename the exported query so it's obviously internal (selectVtxoPubKeysByCommitmentTxidsRaw etc.) — though that would fight sqlc's naming conventions. The PR's existing doc comment is good; just flagging the risk.


🟡 Progress logging removed

The old restoreWatchingVtxos had a milestone log:

log.Debugf("restore watching vtxos: %d%%...", milestone*10)

This is now gone. On a node with 10k+ rounds, startup is fast now (good!), but operators lose visibility into where startup is spending time. A single log.Infof("restoreWatchingVtxos: %d scripts for %d rounds", len(scripts), len(commitmentTxIds)) at the end of the function would replace the progress bar with a one-liner that's just as useful.


✅ Things that look good

  • Stop() semantics improved: the old code called stopWatchingVtxos even if some per-round fetches failed (it just skipped those rounds silently). The new code either gets all keys or logs and skips entirely — which is cleaner.
  • chunkStrings panic on size ≤ 0: correct call. Silent fallback to one big chunk would defeat the entire purpose.
  • fakeWalletClient error injection: the failOnCallIdx logic is correct (call is recorded before the fail check, test assertion matches).
  • Migration files: both up/down look correct; IF NOT EXISTS / IF EXISTS guards are present.
  • Postgres implementation: ANY($2::text[]) reused in both clauses is valid PostgreSQL — the planner sees it as one bind, no duplication issue.
  • TestVtxoPubKeysBulkMatchesLoop: solid correctness oracle — 50 rounds × 10 unique pubkeys, bulk == loop union.
  • Badger wanted map for O(1) intersection: correct, avoids O(N²) in the match func.

Minor nits

  • Badger GetVtxoPubKeysByCommitmentTxids applies the amount filter twice: once in the Ge(amountFilter) query and again in the Go loop. Redundant but harmless (consistent with the singular variant).
  • chunkStrings doc says returned slices share backing storage with in. This is safe for sequential gRPC sends, but worth a note that callers must not mutate in between chunk iterations (currently they don't, just future-proofing).

The SQLite variable limit is the one thing I'd want resolved before merge — it could silently break production nodes once round counts climb past a few hundred. Everything else is either style or low-risk. Happy to help think through the batching approach if useful.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 28, 2026

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Get PR details and file list
gh pr view 1084 --repo arkade-os/arkd --json title,body,files,additions,deletions,commits 2>/dev/null | jq '{title, additions, deletions, files: [.files[].path]}'

Length of output: 279


🏁 Script executed:

#!/bin/bash
# Get the full diff
gh pr diff 1084 --repo arkade-os/arkd 2>/dev/null

Length of output: 155


🏁 Script executed:

#!/bin/bash
# Fetch PR file list via GitHub API
gh api repos/arkade-os/arkd/pulls/1084/files --paginate 2>/dev/null | jq '[.[] | {filename, status, additions, deletions}]'

Length of output: 2808


🏁 Script executed:

#!/bin/bash
# Fetch the raw diff via GitHub API
gh api repos/arkade-os/arkd/pulls/1084 \
  -H "Accept: application/vnd.github.v3.diff" 2>/dev/null

Length of output: 43206



@bitcoin-coder-bob: Overall this is a well-executed performance PR. The bulk-query collapse, the gRPC chunking, and especially the badger >= parity fix are all correct. The benchmarks and the TestVtxoPubKeysBulkMatchesLoop correctness assertion are exactly what is needed here. A few things worth discussing:


Medium – exported CommitmentTxidsAlt is a latent footgun (sqlite)

SelectVtxoPubKeysByCommitmentTxidsParams exposes CommitmentTxidsAlt as a public field:

// internal/infrastructure/db/sqlite/sqlc/queries/query.sql.go
type SelectVtxoPubKeysByCommitmentTxidsParams struct {
    MinAmount          int64
    CommitmentTxids    []string
    CommitmentTxidsAlt []string   // must equal CommitmentTxids – enforced only by docs
}

The comment says "Passing different slices to the two fields would silently produce wrong results", but there is nothing preventing a future caller from doing exactly that. Since query.sql.go is regenerated, the struct cannot be changed there, but the wrapper in sqlite/vtxo_repo.go is the only intended entry point. Consider adding a runtime guard in the wrapper:

func (v *vtxoRepository) GetVtxoPubKeysByCommitmentTxids(
    ctx context.Context, commitmentTxids []string, withMinimumAmount uint64,
) ([]string, error) {
    if len(commitmentTxids) == 0 {
        return nil, nil
    }
    // CommitmentTxids and CommitmentTxidsAlt MUST be identical – see
    // SelectVtxoPubKeysByCommitmentTxidsParams for why two slice fields exist.
    taprootKeys, err := v.querier.SelectVtxoPubKeysByCommitmentTxids(ctx,
        queries.SelectVtxoPubKeysByCommitmentTxidsParams{
            MinAmount:          int64(withMinimumAmount),
            CommitmentTxids:    commitmentTxids,
            CommitmentTxidsAlt: commitmentTxids, // same slice – intentional
        })
    ...
}

That is already what the code does, and the comment is there. The residual risk is that SelectVtxoPubKeysByCommitmentTxids (the generated method) is visible to any code in the queries package. Acceptable for now, but worth a note in the interface doc that callers must use the repository wrapper, not the sqlc method directly.


Low – redundant amount filter in badger bulk method

In internal/infrastructure/db/badger/vtxo_repo.go, GetVtxoPubKeysByCommitmentTxids already constrains the badgerhold query with .Ge(amountFilter), so the subsequent in-memory check is always true:

query1 := badgerhold.Where("RootCommitmentTxid").In(idxIfaces...).And("Amount").Ge(amountFilter)
vtxos1, err := r.findVtxos(ctx, query1)
...
for _, vtxo := range vtxos1 {
    if vtxo.Amount >= amountFilter {  // always true – Ge already guaranteed this
        pubkeyMap[vtxo.PubKey] = struct{}{}
    }
}

This mirrors the singular method's style (so it is at least consistent), but it would be cleaner to drop the redundant guard from the bulk method since the query already does the filtering.


Low – partial registration on mid-chunk WatchScripts error

Pre-fix, a WatchScripts failure was all-or-nothing (one giant message). Post-fix, if chunk k of N fails, chunks 1…k-1 are already registered and k+1…N are not. The scripts are not removed on error, so a subsequent retry would re-register already-watched scripts (presumably idempotent) and catch the missing ones. This is fine in practice but worth documenting at the call site in restoreWatchingVtxos so a future reader knows why a partial failure is safe to retry.


Minor – progress logging removed from restoreWatchingVtxos

The old milestone logging (restore watching vtxos: 10%…100%) has been dropped. Given that the function now completes in <1 s regardless of N, this is the right call. Operators who saw that log at startup with large round counts can stop worrying about it.


Nits

  • internal/core/application/service.go: len(scripts) <= 0len(scripts) == 0 – good cosmetic fix.
  • Postgres query.sql.go: the removed trailing space on SELECT DISTINCT v.pubkey is a nice cleanup.
  • chunkStrings panic message is clear and testable; TestChunkStringsBadSizePanics covers it properly.
  • rows.Close() called in both defer and explicit error-checked form in the generated postgres/sqlite code – standard sqlc pattern, no issue.

Overall: the N+1 → O(1) collapse and the chunked gRPC transport fix are correct and well-tested. The only item I'd consider addressing before merge is leaving a note (or a //nolint-style warning) on the exported CommitmentTxidsAlt field to make the intended-single-entry-point constraint more visible to future contributors.

@bitcoin-coder-bob bitcoin-coder-bob marked this pull request as ready for review May 28, 2026 19:08
Copy link
Copy Markdown
Contributor

@arkanaai arkanaai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review — #1084

Reviewer: Arkana (automated)
Verdict: Request changes (one medium finding + protocol-critical flag requiring human sign-off)


Overall Assessment

Excellent PR. The N+1 → bulk query conversion is well-motivated with concrete benchmarks, the chunked gRPC calls solve a real message-size failure, the >= parity fix in badger is correct, and test coverage is thorough. The sqlc dual-slice workaround for SQLite is well-documented.

Most of this is ready to merge. Two findings below — one medium, one informational — plus the mandatory human review flag.


🟡 MEDIUM — SQLite parameter count limit at high round counts

Files: internal/infrastructure/db/sqlite/sqlc/queries/query.sql.go, internal/infrastructure/db/sqlite/vtxo_repo.go

The SQLite bulk query expands commitment_txids and commitment_txids_alt into individual ? placeholders. Since both slices are the same data, total bound params = 1 + 2N where N = number of sweepable rounds.

modernc.org/sqlite (v1.33.1 per go.mod) compiles SQLite with SQLITE_MAX_VARIABLE_NUMBER = 32766. That means:

  • 10,000 rounds → 20,001 params ✅
  • 16,383 rounds → 32,767 params ❌ exceeds limit, query will fail

Granted, 16K+ sweepable (unsettled/unswept) rounds is extreme, but this is a silent cliff that would crash startup with an opaque SQLite error. Two options:

  1. Preferred: Batch the txid slice in the Go wrapper (e.g., chunks of 5000) and deduplicate results in Go. The postgres path doesn't need this since ANY($2::text[]) is a single bind.
  2. Acceptable: Document the limit and add a log warning if len(commitmentTxids) > 15000.

Not a ship-blocker for typical deployments, but worth hardening before this code encounters a long-running node that hasn't swept in a while.


ℹ️ INFO — Behavior change in Stop() error handling

File: internal/core/application/service.go:428-437

Old code: per-txid loop with continue on error — partial results were still passed to stopWatchingVtxos. If 1 of 1000 lookups failed, the other 999 rounds' scripts were still unwatched.

New code: bulk query with if err != nil { warn } else { stopWatching } — all-or-nothing. If the single bulk query fails, zero scripts get unwatched.

This is acceptable because: (a) a single query is less failure-prone than N queries, (b) shutdown unwatching is best-effort (the wallet daemon handles reconnect), and (c) the retry loop in stopWatchingVtxos already handles transient gRPC failures. Just flagging for awareness.


✅ Things that look correct

  • Bulk SQL queries (postgres + sqlite): Both correctly query vtxo directly (not vtxo_vw), check both v.commitment_txid and the vtxo_commitment_txid join table, and return DISTINCT. The postgres version reuses the same $2::text[] bind cleanly.

  • New index: idx_vtxo_commitment_txid_commitment_txid on vtxo_commitment_txid(commitment_txid) — correct column, IF NOT EXISTS makes it idempotent, down migration drops it. Both sqlite and postgres migrations are identical, which is correct.

  • Badger >= parity fix (internal/infrastructure/db/badger/vtxo_repo.go:327,332): The old > was a real bug. The badgerhold query used .Ge(amountFilter) (inclusive) but the Go dedup loop used > (exclusive), silently dropping VTXOs with amount == amountFilter. Currently harmless since both callers pass amountFilter=0, but the fix makes the contract correct for future callers. Good catch.

  • Badger bulk implementation: The MatchFunc approach for the CommitmentTxids slice intersection is the right call given badgerhold's limitations. The wanted map lookup is O(1) per txid, keeping the MatchFunc efficient. The redundant >= amountFilter check in the dedup loop (after .Ge(amountFilter) in the query) is harmless belt-and-suspenders.

  • Chunked WatchScripts/UnwatchScripts: Clean implementation. defaultWatchScriptsChunkSize = 2000 at ~75 bytes/script ≈ 150 KiB per RPC is conservative. The effectiveChunkSize() fallback pattern is clean. Panicking on size <= 0 is the right call for an internal helper.

  • Partial chunk failure + retry: If chunk 3/5 fails in WatchScripts, the error propagates up. In stopWatchingVtxos, the outer for { ... continue } loop retries all chunks — re-watching already-watched scripts is idempotent. Correct.

  • Test coverage: TestVtxoPubKeysBulkMatchesLoop is the most important test (asserts bulk == loop union). The chunkStrings table tests + panic test + fake gRPC client tests are comprehensive. The inline service_test.go additions cover nil slice, non-existent txid, dedup assertion, and amount filtering.

  • No cross-repo impact: VtxoRepository is internal. No SDKs or external consumers implement it. The singular GetVtxoPubKeysByCommitmentTxid is preserved on the interface — no breakage.


🔴 PROTOCOL-CRITICAL FLAG

This PR touches VTXO script watching — the mechanism by which arkd detects on-chain VTXO movements (unilateral exits, forfeit transactions, sweep confirmations). If scripts are not watched after restart, the server will miss on-chain events and fail to respond to exits or trigger forfeits.

The code change is correct and strictly better than the status quo, but per repo policy: a human must review and approve protocol-critical changes before merge.

@bitcoin-coder-bob — please address the SQLite parameter limit concern above, then this is ready for human sign-off.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
internal/infrastructure/db/service_test.go (1)

1027-1042: ⚡ Quick win

Consider verifying the complete union result.

The test checks deduplication correctly and verifies that tapkey1, tapkey2, and tapkey3 are present. However, it doesn't verify that keys from commitmentTxid (specifically pubkey and pubkey2 from the setup at lines 676-745) are also in the result. Adding assertions for those would provide more complete coverage of the union semantics described in the comment.

✨ Optional enhancement to verify complete union
 	for k, n := range seen {
 		require.Equalf(t, 1, n, "duplicate pubkey %s in bulk result", k)
 	}
+	// Verify complete union: keys from both commitment txids
 	require.Contains(t, bulkKeys, "tapkey1")
 	require.Contains(t, bulkKeys, "tapkey2")
 	require.Contains(t, bulkKeys, "tapkey3")
+	require.Contains(t, bulkKeys, pubkey)
+	require.Contains(t, bulkKeys, pubkey2)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/infrastructure/db/service_test.go` around lines 1027 - 1042, The
test currently verifies deduplication and presence of tapkey1/2/3 after calling
svc.Vtxos().GetVtxoPubKeysByCommitmentTxids(ctx, []string{otherCommitmentTxid,
commitmentTxid}, 0) but misses asserting that the keys from commitmentTxid
(setup variables pubkey and pubkey2) are included; update the test to also
require.Contains(t, bulkKeys, pubkey) and require.Contains(t, bulkKeys, pubkey2)
(or assert the full expected union length) after the deduplication check so the
union semantics are fully validated.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/core/application/service.go`:
- Around line 3648-3653: The loop that builds watch scripts only checks
len(key)==64 but doesn't validate hex content; update the validation so that
each key is decoded (e.g., hex.DecodeString(key)) and ensure it yields exactly
32 bytes before appending to scripts (keep the existing fmt.Sprintf("5120%s",
key) for valid keys), thereby skipping non-hex or wrong-size values to prevent
WatchScripts failures during startup recovery.

---

Nitpick comments:
In `@internal/infrastructure/db/service_test.go`:
- Around line 1027-1042: The test currently verifies deduplication and presence
of tapkey1/2/3 after calling svc.Vtxos().GetVtxoPubKeysByCommitmentTxids(ctx,
[]string{otherCommitmentTxid, commitmentTxid}, 0) but misses asserting that the
keys from commitmentTxid (setup variables pubkey and pubkey2) are included;
update the test to also require.Contains(t, bulkKeys, pubkey) and
require.Contains(t, bulkKeys, pubkey2) (or assert the full expected union
length) after the deduplication check so the union semantics are fully
validated.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d52cf679-b965-4026-98ef-705e3e199efa

📥 Commits

Reviewing files that changed from the base of the PR and between c4f1632 and 4f9b556.

📒 Files selected for processing (17)
  • internal/core/application/service.go
  • internal/core/domain/vtxo_repo.go
  • internal/infrastructure/db/badger/vtxo_repo.go
  • internal/infrastructure/db/postgres/migration/20260527150000_vtxo_commitment_txid_index.down.sql
  • internal/infrastructure/db/postgres/migration/20260527150000_vtxo_commitment_txid_index.up.sql
  • internal/infrastructure/db/postgres/sqlc/queries/query.sql.go
  • internal/infrastructure/db/postgres/sqlc/query.sql
  • internal/infrastructure/db/postgres/vtxo_repo.go
  • internal/infrastructure/db/service_test.go
  • internal/infrastructure/db/sqlite/migration/20260527150000_vtxo_commitment_txid_index.down.sql
  • internal/infrastructure/db/sqlite/migration/20260527150000_vtxo_commitment_txid_index.up.sql
  • internal/infrastructure/db/sqlite/sqlc/queries/query.sql.go
  • internal/infrastructure/db/sqlite/sqlc/query.sql
  • internal/infrastructure/db/sqlite/vtxo_repo.go
  • internal/infrastructure/db/vtxo_repo_bench_test.go
  • internal/infrastructure/wallet/wallet_client.go
  • internal/infrastructure/wallet/wallet_client_test.go

Comment thread internal/core/application/service.go Outdated
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 28, 2026

Actionable comments posted: 0

@bitcoin-coder-bob bitcoin-coder-bob deleted the bob/nbexplorer-enhancement branch May 29, 2026 15:53
@bitcoin-coder-bob bitcoin-coder-bob restored the bob/nbexplorer-enhancement branch May 29, 2026 15:53
@bitcoin-coder-bob
Copy link
Copy Markdown
Collaborator Author

@coderabbitai @arkanaai review

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 29, 2026

@bitcoin-coder-bob: Sure! I'll kick off a fresh review of the PR now.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@arkanaai
Copy link
Copy Markdown
Contributor

arkanaai Bot commented May 29, 2026

Arkana Review — PR #1084: Bulk vtxo pubkey lookup + chunked WatchScripts

Solid performance work. The root causes are correctly diagnosed, the fixes are clean, and the benchmark numbers are credible. A few items worth addressing before merge:


🟡 Behavioural change in Stop() — reduced error tolerance

Old code (per-round loop):

for _, commitmentTxId := range commitmentTxIds {
    keys, err := s.repoManager.Vtxos().GetVtxoPubKeysByCommitmentTxid(...)
    if err != nil {
        log.WithError(err).Warn("failed to get vtxo tap keys")
        continue  // partial success — remaining rounds still processed
    }
    tapkeys = append(tapkeys, keys...)
}
s.stopWatchingVtxos(tapkeys)

New code:

tapkeys, err := s.repoManager.Vtxos().GetVtxoPubKeysByCommitmentTxids(...)
if err != nil {
    log.WithError(err).Warn("failed to get vtxo tap keys")
} else {
    s.stopWatchingVtxos(tapkeys)  // all-or-nothing
}

If the single bulk query fails, stopWatchingVtxos is skipped entirely — the wallet daemon keeps watching scripts that should have been unwatched on shutdown. In practice this is low risk (why is the DB failing at shutdown?), but it's a regression from the old continue-on-error behaviour. A log line noting "skipping UnwatchScripts due to bulk query failure" would at least make the state explicit.


🟡 Progress logging removed from restoreWatchingVtxos

The old loop emitted restore watching vtxos: N%... debug milestones, which was useful for diagnosing slow boots. The new code is so fast those milestones are irrelevant at normal scale — but at extreme scale (hundreds of thousands of rounds, post-batching across multiple DB calls) that visibility disappears.

Suggested: after the bulk query returns, log once at Debug level with the count of rounds and scripts, e.g.:

log.Debugf("restoreWatchingVtxos: %d sweepable rounds → %d scripts", len(commitmentTxIds), len(scripts))

🟡 Multi-batch SQLite path has no test

The PR description notes the batching helper accepts batchSize for test injection, but no test actually drives the multi-batch path. With sqliteVtxoPubKeysBatchSize = 5000, a single test with, say, batchSize=2 and a handful of txids would cover the slice-and-merge logic. Without it, a regression in the loop boundary (off-by-one in start/end, missed dedup across batches) would go undetected.


🟢 Badger MatchFunc — expected linear scan, document it

The query2 MatchFunc for CommitmentTxids does a full scan of the vtxo store per call. The SQL backends avoid this via the indexed join. This is fine for badger (dev/embedded use), but worth a one-liner comment so the next reader doesn't wonder why it wasn't indexed:

// badgerhold has no "slice intersects set" index operator; this is a
// full vtxo store scan filtered in-memory. Acceptable for the badger
// (dev/embedded) backend; SQL backends use an indexed JOIN instead.

🟢 batchSize <= 0 semantics differ from chunkStrings

chunkStrings panics on size <= 0 (correct — production caller always uses effectiveChunkSize()). The sqlite batch helper silently promotes <= 0 to len(input) (one giant batch). The distinction is fine and intentional, but the comment on getVtxoPubKeysByCommitmentTxidsBatched says "batchSize <= 0 is treated as 'no batching' and runs a single call" which could confuse someone who expected the same panic behaviour. Consider adding "this differs from chunkStrings which panics, because this helper is only ever called from the public method with sqliteVtxoPubKeysBatchSize > 0."


✅ What looks good

  • Bulk query correctness: Using vtxo directly instead of vtxo_vw to avoid the LIKE '%,<txid>,%' scan is the right call. The new index idx_vtxo_commitment_txid_commitment_txid makes the EXISTS clause an index seek on all backends.
  • Postgres ANY() reuse: Single bound parameter reused in both clauses — clean.
  • SQLite dual-slice workaround: The sqlc limitation is well-documented, the two-field struct is clearly named, and the single public method enforces identical slices.
  • >= parity fix on badger: Catching the > vs >= discrepancy across backends is exactly the kind of subtle cross-backend drift that causes hard-to-reproduce bugs. Good catch.
  • Hex-decode validation in restoreWatchingVtxos is strictly better than the old len(key) != 64 check (a 64-char non-hex string would have passed the old guard).
  • chunkStrings panics on size <= 0: Correct call — effectiveChunkSize() ensures it never fires in production.
  • Test coverage for the wallet client chunking is comprehensive (nil/empty, exact boundary, splits, large round-trip, mid-chunk error short-circuit, default fallback).
  • Migration files have matching up/down pairs.
  • Benchmark methodology (single -benchtime=1x iteration with explicit timeout) is appropriate given the intentionally-slow baseline.

Summary

Three issues to address before merge:

  1. Must-have: Test for the multi-batch SQLite path (low effort, high value).
  2. Should-have: Log a warning/debug message in the Stop() error path noting that UnwatchScripts is being skipped.
  3. Nice-to-have: Single Debug log in restoreWatchingVtxos with round/script count; Badger MatchFunc comment.

No protocol-critical paths touched. Flagging for human approval before merge per standard policy.

Copy link
Copy Markdown
Contributor

@arkanaai arkanaai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arkana Review — PR #1084 (incremental, commit d45526b)

All three items from my previous review have been addressed:

  1. Multi-batch SQLite testvtxo_repo_batching_test.go covers batch sizes {1, 2, 3, rounds-1, rounds, rounds+1, 0}, exercises dedup across batch boundaries via cross-linked commitment txids in the join table, and tests min-amount filtering separately. Good boundary coverage.

  2. Explicit Stop() log — Warning now states the consequence clearly: wallet may keep watching scripts until next restart. Makes the all-or-nothing behavior visible to operators.

  3. restoreWatchingVtxos debug log — Now includes both script count and round count for boot diagnostics.

The export_test.go pattern is the correct Go idiom for test-only access to unexported helpers.

Approving from an automated review perspective. Since this PR touches VTXO lookup paths used in startup recovery and shutdown (protocol-adjacent), a human reviewer should still sign off before merge.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
internal/infrastructure/db/sqlite/vtxo_repo_batching_test.go (1)

96-108: ⚡ Quick win

Add a boundary case where amount == min_amount.

The seeded amounts (100/5000/200/7500) all sit clearly above or below min_amount=1000, so this test passes identically whether the predicate is >= min_amount or > min_amount. Since this PR explicitly aligns the badger backend to >= for parity, a vtxo with amount == 1000 would lock the inclusive boundary and guard against a future >/>= regression.

💚 Suggested boundary coverage
 	insertVtxoRow(t, db, "vtxo-a-low", 0, "pubkey-a-low", 100, commitmentTxids[0])
 	insertVtxoRow(t, db, "vtxo-a-high", 0, "pubkey-a-high", 5000, commitmentTxids[0])
+	insertVtxoRow(t, db, "vtxo-a-eq", 0, "pubkey-a-eq", 1000, commitmentTxids[0])
 	insertVtxoRow(t, db, "vtxo-b-low", 0, "pubkey-b-low", 200, commitmentTxids[1])
 	insertVtxoRow(t, db, "vtxo-b-high", 0, "pubkey-b-high", 7500, commitmentTxids[1])
@@
-	require.ElementsMatch(t, []string{"pubkey-a-high", "pubkey-b-high"}, got)
+	require.ElementsMatch(t,
+		[]string{"pubkey-a-eq", "pubkey-a-high", "pubkey-b-high"}, got)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/infrastructure/db/sqlite/vtxo_repo_batching_test.go` around lines 96
- 108, Add a boundary case where a vtxo has amount exactly equal to min_amount
(1000) so the test verifies inclusive behavior; seed an additional row with
insertVtxoRow (e.g., "vtxo-equal", amount 1000) tied to one of the
commitmentTxids, call sqlitedb.NewVtxoRepository and then
sqlitedb.GetVtxoPubKeysByCommitmentTxidsBatched(ctx, repo, commitmentTxids,
1000, 1) and assert that the pubkey for the equal-amount vtxo is included in the
returned slice (since the expected predicate is >= min_amount).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@internal/infrastructure/db/sqlite/vtxo_repo_batching_test.go`:
- Around line 43-54: The test currently misses the inclusive boundary for
minAmount; in TestGetVtxoPubKeysByCommitmentTxidsBatched_MinAmount add a vtxo
whose amount equals the minAmount (1000) using the same helper used for other
vtxos (e.g., insertVtxoRow) and ensure its pubkey is included in the expected
set so the query's v.amount >= min_amount behavior is asserted; keep the
existing join-table cross-link setup (insertVtxoCommitmentTxidRow) unchanged so
dedup behavior is still exercised.

---

Nitpick comments:
In `@internal/infrastructure/db/sqlite/vtxo_repo_batching_test.go`:
- Around line 96-108: Add a boundary case where a vtxo has amount exactly equal
to min_amount (1000) so the test verifies inclusive behavior; seed an additional
row with insertVtxoRow (e.g., "vtxo-equal", amount 1000) tied to one of the
commitmentTxids, call sqlitedb.NewVtxoRepository and then
sqlitedb.GetVtxoPubKeysByCommitmentTxidsBatched(ctx, repo, commitmentTxids,
1000, 1) and assert that the pubkey for the equal-amount vtxo is included in the
returned slice (since the expected predicate is >= min_amount).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f2d80562-f2a9-45bd-a534-e0bd6f0cdab4

📥 Commits

Reviewing files that changed from the base of the PR and between 7a79de6 and d45526b.

📒 Files selected for processing (4)
  • internal/core/application/service.go
  • internal/infrastructure/db/sqlite/export_test.go
  • internal/infrastructure/db/sqlite/vtxo_repo.go
  • internal/infrastructure/db/sqlite/vtxo_repo_batching_test.go
✅ Files skipped from review due to trivial changes (1)
  • internal/infrastructure/db/sqlite/export_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • internal/core/application/service.go
  • internal/infrastructure/db/sqlite/vtxo_repo.go

Comment thread internal/infrastructure/db/sqlite/vtxo_repo_batching_test.go
Copy link
Copy Markdown
Contributor

@arkanaai arkanaai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Arkana Review — PR #1084 (incremental, commit 9de0251)

New commit adds an amount == min_amount boundary row to TestGetVtxoPubKeysByCommitmentTxidsBatched_MinAmount, explicitly locking the inclusive >= predicate across all backends. This was the last open item from my previous review.

All previous findings resolved. LGTM. ✅

⚠️ Protocol-critical reminder: This PR touches VTXO lookup paths used during startup recovery and shutdown. Human sign-off is still required before merge per protocol-critical review policy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant