Merged
152 commits
1f352ad
Refactor: split benchmark.sh into install/start/check/stop/load/query…
alexey-milovidov May 7, 2026
1b050f1
Merge branch 'main' into refactor/per-system-script-interface
alexey-milovidov May 7, 2026
0001b85
Refactor mongodb and polars to the new per-system layout
alexey-milovidov May 7, 2026
98c67c4
clickhouse-datalake{,-partitioned}: create the table once in load, re…
alexey-milovidov May 7, 2026
b5d60e8
clickhouse/query: drop the cat shim — clickhouse-client reads stdin n…
alexey-milovidov May 7, 2026
b95012a
clickhouse/start: drop idempotency check; let bench_check_loop verify…
alexey-milovidov May 7, 2026
94794b5
Add missing change
alexey-milovidov May 7, 2026
fb44010
Merge branch 'main' into refactor/per-system-script-interface
alexey-milovidov May 8, 2026
eb9821f
trino{,-partitioned,-datalake,-datalake-partitioned}: refactor to per…
alexey-milovidov May 8, 2026
d509fa0
presto{,-partitioned,-datalake,-datalake-partitioned}: refactor to pe…
alexey-milovidov May 8, 2026
61515ff
datafusion-vortex{,-partitioned}: refactor to per-system layout
alexey-milovidov May 9, 2026
00de41d
quickwit: refactor to per-system layout
alexey-milovidov May 9, 2026
4a44596
gizmosql/util.sh: fixed PID-file path
alexey-milovidov May 9, 2026
f8636fa
Merge branch 'main' into refactor/per-system-script-interface
alexey-milovidov May 9, 2026
53de4b8
lib/benchmark-common.sh: silence start/stop output
alexey-milovidov May 9, 2026
f1ba3af
duckdb*/install: symlink duckdb into /usr/local/bin
alexey-milovidov May 9, 2026
5929597
byconity/install: install docker compose v2 plugin if missing
alexey-milovidov May 9, 2026
e2669c4
concurrent: ClickBench QPS benchmark with N persistent connections
alexey-milovidov May 9, 2026
055a3ac
{chdb,chdb-parquet-partitioned,hyper,hyper-parquet,sail,sail-partitio…
alexey-milovidov May 9, 2026
e2e4eb3
lib/benchmark-common.sh: surface the actual ./check failure on timeout
alexey-milovidov May 9, 2026
9633ba5
cloudberry/install: fail fast on non-RHEL hosts
alexey-milovidov May 9, 2026
c288eab
cloud-init.sh.in: export HOME so child install/query scripts find it
alexey-milovidov May 9, 2026
0312937
monetdb/install: actually initialize and start monetdbd
alexey-milovidov May 9, 2026
7a21c35
paradedb{,-partitioned}/install: bump image tag to a published one
alexey-milovidov May 9, 2026
f6e0172
selectdb/install: switch to download.velodb.io Apache Doris build
alexey-milovidov May 9, 2026
f86b41c
greenplum: switch to woblerr/greenplum Docker image
alexey-milovidov May 9, 2026
af13b01
cloud-init: forward operator-side env vars (YT_PROXY, YT_TOKEN, CHYT_…
alexey-milovidov May 9, 2026
98c68a2
hologres/benchmark.sh: drop yum dep on host, use a docker psql shim
alexey-milovidov May 9, 2026
3971980
cloudberry: run the upstream RHEL build inside a docker container
alexey-milovidov May 9, 2026
18fd6fc
paradedb-partitioned: mark as historical
alexey-milovidov May 9, 2026
db1a2a1
paradedb: rework around pg_search after pg_lakehouse removal
alexey-milovidov May 9, 2026
3aebf66
monetdb: fix the password and dbfarm-init paths so ./check actually p…
alexey-milovidov May 9, 2026
fdd10f4
databend: stagger meta/query startup, raise fd limit and check timeout
alexey-milovidov May 9, 2026
9db2d3a
byconity: give the docker-compose chain time to settle
alexey-milovidov May 9, 2026
51cc62a
move download-hits-* scripts into lib/
alexey-milovidov May 9, 2026
558053e
spark-velox: add per-system-script-interface entry
alexey-milovidov May 9, 2026
96337ba
{tablespace,tembo-olap}/queries.sql: add trailing newline
alexey-milovidov May 9, 2026
f236e7b
cloud-init: add a 16 GB swapfile
alexey-milovidov May 9, 2026
3101589
druid, pinot: bump to currently-published versions
alexey-milovidov May 9, 2026
87717c0
sail{,-partitioned}/install: --ignore-installed past Ubuntu 24.04 pac…
alexey-milovidov May 9, 2026
c28bb40
siglens/install: set HOME and GOPATH before `go mod tidy`
alexey-milovidov May 9, 2026
4aaba80
datafusion-vortex{,-partitioned}/install: idempotent submodule update
alexey-milovidov May 9, 2026
fd64aef
elasticsearch/load.py: bump bulk request timeout 30s -> 300s
alexey-milovidov May 9, 2026
2e23b0e
tidb/start: wait for tiflash to register before returning
alexey-milovidov May 9, 2026
11ca02f
byconity: bump image from 0.1.0-GA to 1.0.1-hotfix1
alexey-milovidov May 9, 2026
2611f51
cockroachdb, sirius: bump BENCH_CHECK_TIMEOUT past 300 s default
alexey-milovidov May 9, 2026
591350d
duckdb-vortex{,-partitioned}: stamp HOME so vcpkg / extension cache work
alexey-milovidov May 9, 2026
3b14156
README: document the systems whose upstream distribution has gone away
alexey-milovidov May 9, 2026
4c7d1d5
Revert "README: document the systems whose upstream distribution has …
alexey-milovidov May 9, 2026
55f2e31
{vertica,oxla,kinetica,heavyai,infobright}/README.md: note these are …
alexey-milovidov May 9, 2026
de34928
{vertica,oxla,kinetica,heavyai,infobright}/README.md: say "dead" plainly
alexey-milovidov May 9, 2026
8189771
kinetica: fetch kisql binary from raw.githubusercontent.com
alexey-milovidov May 9, 2026
9907949
heavyai: switch to omnisci/core-os-cpu:v5.10.2 docker image
alexey-milovidov May 9, 2026
faf36df
collect-results.sh: ORDER BY time DESC before LIMIT 1 BY
alexey-milovidov May 9, 2026
89b085a
Remove the `concurrent/` directory
alexey-milovidov May 9, 2026
10e05ba
tembo-olap: mark the 2024-02-09 result historical
alexey-milovidov May 9, 2026
b79e224
cloud-init: bump per-run timeout 20000s -> 36000s
alexey-milovidov May 9, 2026
33097f8
byconity/load: drop redundant pigz; lib already decompresses .gz
alexey-milovidov May 9, 2026
5f462f9
druid/install: bump JDK from 11 to 17 to satisfy the version bump
alexey-milovidov May 9, 2026
c5d5c1f
sail{,-partitioned}/install: pass --ignore-installed to get-pip.py too
alexey-milovidov May 9, 2026
ca4c4cb
tidb/start: poll information_schema.cluster_info, not tikv_store_status
alexey-milovidov May 9, 2026
910907e
pinot, sirius: bump BENCH_CHECK_TIMEOUT to fit cold-start
alexey-milovidov May 9, 2026
4eaadfc
datafusion-vortex{,-partitioned}: install libclang-dev; bump v0.34 ->…
alexey-milovidov May 9, 2026
244ea36
questdb/data-size: glob-free du; the materialized view was rejecting …
alexey-milovidov May 9, 2026
c8af85e
lib/benchmark-common.sh: pull last numeric line, not last line
alexey-milovidov May 9, 2026
9aab622
heavyai: drop the omnisql positional db arg; bump check timeout to 900s
alexey-milovidov May 9, 2026
c8125e5
Rename selectdb -> velodb; mark old results historical
alexey-milovidov May 9, 2026
e548292
bytehouse: tag cloud results historical, document the status
alexey-milovidov May 9, 2026
0195f86
velodb: drop the "historical" tag from old results
alexey-milovidov May 9, 2026
5702c3d
cloud-init: make the global benchmark timeout configurable
alexey-milovidov May 9, 2026
25e6eb2
cloud-init: rework BENCHMARK_TIMEOUT as a @timeout@ render placeholder
alexey-milovidov May 9, 2026
cc1204f
{cloud-init,run-benchmark}: drop YT_-related runtime_env forwarding
alexey-milovidov May 9, 2026
8713287
Add some results
alexey-milovidov May 9, 2026
1a03827
velodb/results/20260509: fix the stale "SelectDB" system field
alexey-milovidov May 9, 2026
8447eb4
selectdb: disable BE / FE caches so warm runs are actually warm
alexey-milovidov May 9, 2026
f26f504
lib/benchmark-common.sh: detect silent partial loads after ./load
alexey-milovidov May 9, 2026
15dc0e4
lib/benchmark-common.sh: don't cache data-size between load and main
alexey-milovidov May 9, 2026
5ea85fc
Add more results
alexey-milovidov May 9, 2026
2920580
lib/benchmark-common.sh: actually clear caches before cold runs
alexey-milovidov May 9, 2026
02f9c9e
umbra: harden against silently-misattributed query failures
alexey-milovidov May 9, 2026
42e887c
spark-velox: drop the "Velox" tag from result and template
alexey-milovidov May 9, 2026
64045a2
umbra/results/20260509: remove partial-load result
alexey-milovidov May 9, 2026
332f219
{kinetica,presto*}: fix two systemic causes of all-null query timings
alexey-milovidov May 9, 2026
4ef9095
spark/install: pin pyspark to 3.5.5 (4.0 broke the shared query.py)
alexey-milovidov May 9, 2026
67b6b00
drill/query: actually execute queries (use --run= instead of stdin)
alexey-milovidov May 9, 2026
dbbb1a1
Refactor pg_duckdb*, ursa, yugabytedb to per-system-script-interface
alexey-milovidov May 9, 2026
3349102
mariadb-columnstore: refactor + switch bulk loader from LOAD DATA INF…
alexey-milovidov May 9, 2026
98e28d1
duckdb-memory: convert to a long-lived FastAPI server
alexey-milovidov May 10, 2026
1599cf6
Tag dataframe / Python-HTTP-server entries as "in-memory" everywhere
alexey-milovidov May 10, 2026
2880a63
Add more results
alexey-milovidov May 10, 2026
e2bdcbc
Merge branch 'refactor/per-system-script-interface' of github.com:Cli…
alexey-milovidov May 10, 2026
999b3ec
{clickhouse,clickhouse-tencent}/install: force async_load_databases=f…
alexey-milovidov May 10, 2026
7d255c8
{clickhouse,clickhouse-tencent}/install: switch the override to YAML
alexey-milovidov May 10, 2026
4aeb7f6
{clickhouse,clickhouse-tencent}/create.sql: enable mark + PK cache pr…
alexey-milovidov May 10, 2026
82e6fe0
Revert "{clickhouse,clickhouse-tencent}/create.sql: enable mark + PK …
alexey-milovidov May 10, 2026
b355245
More results
alexey-milovidov May 10, 2026
99327f4
Remove "serverless" tag from local-storage systems
alexey-milovidov May 10, 2026
b71273a
More results
alexey-milovidov May 10, 2026
44c6baf
Merge branch 'refactor/per-system-script-interface' of github.com:Cli…
alexey-milovidov May 10, 2026
7c1f7a3
{clickhouse,clickhouse-tencent}/install: eager-load PK and column sizes
alexey-milovidov May 10, 2026
f0c5963
{clickhouse,clickhouse-tencent}/create.sql: disable auto_statistics_t…
alexey-milovidov May 10, 2026
a82d445
More results
alexey-milovidov May 10, 2026
b805405
lib/benchmark-common.sh: handle Spark progress bar in timing parse
alexey-milovidov May 10, 2026
81d2bc6
duckdb-dataframe: drop SQL whitelist, install pytz
alexey-milovidov May 10, 2026
b290cb2
mariadb-columnstore: rewrite Q29 without REGEXP_REPLACE
alexey-milovidov May 10, 2026
0145e3d
sqlite: rewrite Q29 without REGEXP_REPLACE
alexey-milovidov May 10, 2026
129e1cb
{databend,octosql}/install: pick arch-matched binary
alexey-milovidov May 10, 2026
38120c2
fail fast on arm64 for systems with no arm64 support
alexey-milovidov May 10, 2026
302b31c
pg_ducklake/load: bump duckdb.memory_limit so CTAS doesn't OOM
alexey-milovidov May 10, 2026
ad83140
elasticsearch: single-node discovery so 9.4 bootstrap actually starts
alexey-milovidov May 10, 2026
f73956b
{pandas,polars-dataframe}: eval Python expressions, drop SQL whitelist
alexey-milovidov May 10, 2026
c5b15b6
{presto,presto-datalake,presto-partitioned,presto-datalake-partitione…
alexey-milovidov May 10, 2026
4729051
cratedb/check: wait for shard recovery, not just wire protocol
alexey-milovidov May 10, 2026
321e026
run-benchmark: retry run-instances on InsufficientInstanceCapacity
alexey-milovidov May 10, 2026
1acbc67
run-benchmark: also retry on quota errors
alexey-milovidov May 10, 2026
ba4a5b6
Merge branch 'main' into refactor/per-system-script-interface
alexey-milovidov May 10, 2026
c18a16e
More results
alexey-milovidov May 10, 2026
546b5e9
gizmosql/install: insulate the one-line installer from an unset HOME
alexey-milovidov May 10, 2026
65ba6ac
cedardb/results/20260510: record timed-out runs with partial timings
alexey-milovidov May 10, 2026
84a86f4
cedardb/results/20260510: record timed-out runs with partial timings
alexey-milovidov May 10, 2026
80d85bd
Add error markers
alexey-milovidov May 10, 2026
d6342fd
Merge branch 'refactor/per-system-script-interface' of github.com:Cli…
alexey-milovidov May 10, 2026
51cb9c6
Merge branch 'refactor/per-system-script-interface' of github.com:Cli…
alexey-milovidov May 10, 2026
e0e741e
Merge branch 'refactor/per-system-script-interface' of github.com:Cli…
alexey-milovidov May 10, 2026
519378c
generate-results: skip entries with {"error": ...}
alexey-milovidov May 10, 2026
4dfad38
index.html: tighten selector UI
alexey-milovidov May 10, 2026
15fedf9
index.html: avoid render() during theme bootstrap
alexey-milovidov May 10, 2026
9c7e07f
More results
alexey-milovidov May 10, 2026
e29ce9e
Drop "lukewarm-cold-run" tag from post-refactor results & templates
alexey-milovidov May 10, 2026
5aae012
firebolt{,-parquet,-parquet-partitioned}: actually wait for the cluster
alexey-milovidov May 10, 2026
739d225
Mark Parquet/data-lake/Spark/Sail/Polars systems as stateless
alexey-milovidov May 10, 2026
7fc81ac
index.html: stateless exclusion + tag hover highlights
alexey-milovidov May 10, 2026
caa8dc6
data.generated.js: regenerate after upstream merge
alexey-milovidov May 10, 2026
105ac07
Update results
alexey-milovidov May 10, 2026
06a1efb
Strip "lukewarm-cold-run" from new post-refactor results & firebolt t…
alexey-milovidov May 10, 2026
e8e5c7b
gizmosql/util.sh: stop_gizmosql actually waits; start_gizmosql retries
alexey-milovidov May 10, 2026
aec0a9c
{pandas,polars-dataframe,chdb-dataframe,duckdb-{dataframe,memory},daf…
alexey-milovidov May 10, 2026
31b0340
Reapply "stateless" tag to new Parquet/data-lake/Spark/Sail/Polars re…
alexey-milovidov May 10, 2026
c7b6c4e
Untrack accidentally-committed -old/template.json files
alexey-milovidov May 10, 2026
b282aa4
Rename BENCH_RESTARTABLE -> BENCH_DURABLE; reload non-durable between…
alexey-milovidov May 10, 2026
b54703f
Mark GlareDB systems as stateless
alexey-milovidov May 10, 2026
23e8cd0
New results
alexey-milovidov May 10, 2026
e7fae8f
Merge origin/main into refactor/per-system-script-interface
alexey-milovidov May 10, 2026
cc2655d
lib/benchmark-common.sh: bench_load syncs unconditionally, inside the…
alexey-milovidov May 10, 2026
62e95a1
Tag DuckDB-derivative systems with "DuckDB derivative"
alexey-milovidov May 10, 2026
3ce1279
Tag pg_mooncake and crunchy-bridge-for-analytics as DuckDB derivatives
alexey-milovidov May 10, 2026
1676c25
index.html: per-row remove control
alexey-milovidov May 10, 2026
2c1dad4
index.html: show measurement date in summary rows
alexey-milovidov May 10, 2026
527a3e6
Ensure every result entry has a date
alexey-milovidov May 10, 2026
f923d77
Add concurrent-QPS sustained-throughput test
alexey-milovidov May 11, 2026
6d5ee0a
New results
alexey-milovidov May 11, 2026
f26184d
Add changelog entry
alexey-milovidov May 11, 2026
21 changes: 21 additions & 0 deletions .gitignore
@@ -5,3 +5,24 @@
*.parquet
hits.csv
hits.tsv

# Per-system runtime artifacts produced by benchmark.sh
result.csv
log.txt
load_out.txt
server.log
server.pid
arc_token.txt
data-size.txt
.doris_home
.sirius_env

# Per-system data files
hits.db
mydb
hits.hyper
hits.vortex
*.vortex

# Python venvs created by install scripts
myenv/
15 changes: 14 additions & 1 deletion CHANGELOG.md
@@ -2,6 +2,19 @@

Changes in the benchmark methodology or presentation, as well as major news.

### 2026-05-11
Unified the benchmark scripts across systems by providing a common interface as a set of per-system scripts: `install`, `start`, `check`, `stop`, `load`, `query`, and `data-size`. The dataset download scripts were made common as well, and a general benchmark runner in `lib/` ensures that different systems get equal treatment. This makes it easier to add more ways of testing, different datasets, and scenarios to the benchmark, and it simplifies supporting all 88 systems presented. Note: embedded systems, such as SQLite and the Python duckdb module, are wrapped in a Python HTTP server so that the benchmark can run each query separately.
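
For illustration, a minimal sketch of how such a runner can drive the per-system scripts from a system's directory. This is not the actual code in `lib/benchmark-common.sh` (which also handles timeouts, retries, cache flushing, and result formatting); `queries.sql` and `result.csv` follow the repository's conventions, and the loop structure is simplified:

```bash
#!/bin/bash
# Hypothetical driver for the per-system script interface (simplified sketch).
set -e

./install                               # install the system (idempotent)
./start                                 # start the server
until ./check >/dev/null 2>&1; do       # wait until it accepts queries
    sleep 1
done

./load                                  # create the table and load the hits dataset
./data-size                             # print the on-disk size in bytes

# ./query reads one SQL statement from stdin and reports its runtime.
while read -r sql; do
    for run in 1 2 3; do
        echo "$sql" | ./query >/dev/null 2>>result.csv
    done
done < queries.sql

./stop
```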

Databases are now restarted before measuring the cold run of each query, as requested in [#667](https://github.com/ClickHouse/ClickBench/issues/667) and [#793](https://github.com/ClickHouse/ClickBench/issues/793). This prevents unfair measurements and closes a loophole for systems that do excessive in-process caching without flushing it before the cold run. Flushing the OS page cache before the cold run is also unified, so that all benchmark entries follow the same rules. Notes: for stateless systems (such as query engines on top of Parquet) the restart is a no-op; for non-durable and in-memory systems, the restart before each query also requires reloading the data, and that loading time is included in the cold query measurement.
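
A sketch of what the per-query cold-run preparation amounts to under these rules (illustrative only; the real sequence lives in `lib/benchmark-common.sh`, the semantics of `BENCH_DURABLE` is simplified, and `$QUERY` stands for the current query):

```bash
#!/bin/bash
# Sketch: make the first (cold) run of each query actually cold.
set -e

./stop                                                   # restart the server to drop in-process caches
sync
echo 3 | sudo tee /proc/sys/vm/drop_caches >/dev/null    # flush the OS page cache
./start
until ./check >/dev/null 2>&1; do sleep 1; done

# Non-durable and in-memory systems lose their data on restart, so the reload
# is repeated here and its time counts toward the cold measurement.
if [ "$BENCH_DURABLE" != "yes" ]; then
    ./load
fi

echo "$QUERY" | ./query                                  # cold run; warm runs follow without a restart
```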

Introduced a new measurement: QPS and error rate under a concurrent workload (10 connections for 10 minutes), to demonstrate the advantage of the refactoring. Currently, the metric is not exposed in the benchmark.
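
A rough sketch of such a measurement, assuming ten workers that loop over `./query` for the duration (the real harness keeps persistent connections; here every request spawns a fresh `./query` process for simplicity, and the query text is a placeholder):

```bash
#!/bin/bash
# Sketch: sustained-throughput test with N workers for a fixed duration.
CONNECTIONS=10
DURATION=600                                  # 10 minutes
END=$(( $(date +%s) + DURATION ))

worker() {
    local ok=0 err=0
    while [ "$(date +%s)" -lt "$END" ]; do
        if echo "SELECT COUNT(*) FROM hits" | ./query >/dev/null 2>&1; then
            ok=$((ok + 1))
        else
            err=$((err + 1))
        fi
    done
    echo "$ok $err" > "worker_$1.txt"
}

for i in $(seq 1 "$CONNECTIONS"); do
    worker "$i" &
done
wait

# Aggregate QPS and error rate across all workers.
awk -v dur="$DURATION" '{ ok += $1; err += $2 }
    END { printf "QPS: %.2f  error rate: %.2f%%\n", ok / dur, 100 * err / (ok + err) }' worker_*.txt
```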

Re-ran 88 systems on every machine. Fixed the queries that use regexps for MariaDB and SQLite. Added ARM64 versions for some systems: databend, octosql, octeryx. Switched MariaDB to the faster data loader. An attempt to rerun CedarDB exposed a bug. Added new systems: Trino, Presto, Quickwit. Added a generic runner for pandas and polars. Fixed issues with the Spark variants. Cleaned up some tags. Some systems were found dead: vertica, kinetica, singlestore, heavyai.

Improved the website: the important selectors (open-source, hardware, tuned) are moved to the top and shown horizontally; they also filter the visible options in the other selectors. Hovering the mouse pointer over a system highlights its tags. Added a button on the diagram to remove a system from the report. Added the measurement date to the diagram (as requested in [#639](https://github.com/ClickHouse/ClickBench/issues/639)). Shortened some cloud machine names to reduce clutter. The report methodology (aggregation of the measurements) and the default selection remain unchanged.

(Alexey Milovidov)

### 2026-05-08
Refactored directory structure to keep every historical result - they are organized in directories `system/results/YYYYMMDD/*.json` for each date. Compared to using git history, this unifies the format and structure of the results, making them ready for analysis. You can analyze it with clickhouse-local: `ch "SELECT * FROM '*/results/*/*.json'"` or export the data: `ch "SELECT * FROM '*/results/*/*.json' ORDER BY _path INTO OUTFILE 'results.parquet'"` (Alexey Milovidov).

@@ -56,7 +69,7 @@ The systems on the main chart are distinguished by color (systems from the same

Added the "open-source" and "proprietary" tags, so that you can list only open-source databases. For the reference, Umbra, Hyper, and CedarDB are proprietary.

Removed pointless tags, that some systems attribute to themself. One system misattributed itself as "mysql-compatible", two others added tags with their names, another reported two programming languages, a few systems reported an "analytical" tag, which is pointless, and one system didn't report itself as "ClickHouse-derivative" while being based on the ClickHouse interfaces and architecture.
Removed pointless tags that some systems attribute to themselves. One system misattributed itself as "mysql-compatible", two others added tags with their names, another reported two programming languages, a few systems reported an "analytical" tag, which is pointless, and one system didn't report itself as "ClickHouse-derivative" while being based on the ClickHouse interfaces and architecture.

Some systems provided bogus results on the loading time or data size. For example, one system reported data size 1000 times less, and we didn't notice that. This was corrected. The comparison on the loading time will not include stateless systems that don't require data loading.

207 changes: 4 additions & 203 deletions arc/benchmark.sh
@@ -1,204 +1,5 @@
#!/bin/bash
# Arc ClickBench Complete Benchmark Script (Go Binary Version)
set -e

# ============================================================
# 1. INSTALL ARC FROM .DEB PACKAGE
# ============================================================
echo "Installing Arc from .deb package..."

# Fetch latest Arc version from GitHub releases
echo "Fetching latest Arc version..."
ARC_VERSION=$(curl -s https://api.github.com/repos/Basekick-Labs/arc/releases/latest | grep -oP '"tag_name": "v\K[^"]+')
if [ -z "$ARC_VERSION" ]; then
    echo "Error: Could not fetch latest Arc version from GitHub"
    exit 1
fi
echo "Latest Arc version: $ARC_VERSION"

ARCH=$(uname -m)
if [ "$ARCH" = "aarch64" ] || [ "$ARCH" = "arm64" ]; then
    DEB_URL="https://github.com/Basekick-Labs/arc/releases/download/v${ARC_VERSION}/arc_${ARC_VERSION}_arm64.deb"
    DEB_FILE="arc_${ARC_VERSION}_arm64.deb"
else
    DEB_URL="https://github.com/Basekick-Labs/arc/releases/download/v${ARC_VERSION}/arc_${ARC_VERSION}_amd64.deb"
    DEB_FILE="arc_${ARC_VERSION}_amd64.deb"
fi

echo "Detected architecture: $ARCH -> $DEB_FILE"

if [ ! -f "$DEB_FILE" ]; then
    wget -q "$DEB_URL" -O "$DEB_FILE"
fi

sudo dpkg -i "$DEB_FILE" || sudo apt-get install -f -y
echo "[OK] Arc installed"

# ============================================================
# 2. PRINT SYSTEM INFO (Arc defaults)
# ============================================================
CORES=$(nproc)
TOTAL_MEM_KB=$(grep MemTotal /proc/meminfo | awk '{print $2}')
TOTAL_MEM_GB=$((TOTAL_MEM_KB / 1024 / 1024))
MEM_LIMIT_GB=$((TOTAL_MEM_GB * 80 / 100)) # 80% of system RAM

echo ""
echo "System Configuration:"
echo " CPU cores: $CORES"
echo " Connections: $((CORES * 2)) (cores × 2)"
echo " Threads: $CORES (same as cores)"
echo " Memory limit: ${MEM_LIMIT_GB}GB (80% of ${TOTAL_MEM_GB}GB total)"
echo ""

# ============================================================
# 3. START ARC AND CAPTURE TOKEN FROM LOGS
# ============================================================
echo "Starting Arc service..."

# Check if we already have a valid token from a previous run
if [ -f "arc_token.txt" ]; then
    EXISTING_TOKEN=$(cat arc_token.txt)
    echo "Found existing token file, will verify after Arc starts..."
fi

sudo systemctl start arc

# Wait for Arc to be ready
echo "Waiting for Arc to be ready..."
for i in {1..30}; do
    if curl -sf http://localhost:8000/health > /dev/null 2>&1; then
        echo "[OK] Arc is ready!"
        break
    fi
    if [ $i -eq 30 ]; then
        echo "Error: Arc failed to start within 30 seconds"
        sudo journalctl -u arc --no-pager | tail -50
        exit 1
    fi
    sleep 1
done

# Try to get token - either from existing file or from logs (first run)
ARC_TOKEN=""

# First, check if existing token works
if [ -n "$EXISTING_TOKEN" ]; then
    if curl -sf http://localhost:8000/health -H "x-api-key: $EXISTING_TOKEN" > /dev/null 2>&1; then
        ARC_TOKEN="$EXISTING_TOKEN"
        echo "[OK] Using existing token from arc_token.txt"
    else
        echo "Existing token invalid, looking for new token in logs..."
    fi
fi

# If no valid token yet, try to extract from logs (first run scenario)
if [ -z "$ARC_TOKEN" ]; then
    ARC_TOKEN=$(sudo journalctl -u arc --no-pager | grep -oP '(?:Initial admin API token|Admin API token): \K[^\s]+' | head -1)
    if [ -n "$ARC_TOKEN" ]; then
        echo "[OK] Captured new token from logs"
        echo "$ARC_TOKEN" > arc_token.txt
    else
        echo "Error: Could not find or validate API token"
        echo "If this is not the first run, Arc's database may need to be reset:"
        echo " sudo rm -rf /var/lib/arc/data/arc.db"
        exit 1
    fi
fi

echo "Token: ${ARC_TOKEN:0:20}..."

# ============================================================
# 4. DOWNLOAD DATASET
# ============================================================
DATASET_FILE="hits.parquet"
DATASET_URL="https://datasets.clickhouse.com/hits_compatible/hits.parquet"
EXPECTED_SIZE=14779976446

if [ -f "$DATASET_FILE" ]; then
    CURRENT_SIZE=$(stat -c%s "$DATASET_FILE" 2>/dev/null || stat -f%z "$DATASET_FILE" 2>/dev/null)
    if [ "$CURRENT_SIZE" -eq "$EXPECTED_SIZE" ]; then
        echo "[OK] Dataset already downloaded (14GB)"
    else
        echo "Re-downloading dataset (size mismatch)..."
        rm -f "$DATASET_FILE"
        wget --continue --progress=dot:giga "$DATASET_URL"
    fi
else
    echo "Downloading ClickBench dataset (14GB)..."
    wget --continue --progress=dot:giga "$DATASET_URL"
fi

# ============================================================
# 5. LOAD DATA INTO ARC
# ============================================================
echo "Loading data into Arc..."

# Determine Arc's data directory (default: /var/lib/arc/data)
ARC_DATA_DIR="/var/lib/arc/data"
TARGET_DIR="$ARC_DATA_DIR/clickbench/hits"
TARGET_FILE="$TARGET_DIR/hits.parquet"

sudo mkdir -p "$TARGET_DIR"

if [ -f "$TARGET_FILE" ]; then
    SOURCE_SIZE=$(stat -c%s "$DATASET_FILE" 2>/dev/null || stat -f%z "$DATASET_FILE" 2>/dev/null)
    TARGET_SIZE=$(stat -c%s "$TARGET_FILE" 2>/dev/null || stat -f%z "$TARGET_FILE" 2>/dev/null)
    if [ "$SOURCE_SIZE" -eq "$TARGET_SIZE" ]; then
        echo "[OK] Data already loaded"
    else
        echo "Reloading data (size mismatch)..."
        sudo cp "$DATASET_FILE" "$TARGET_FILE"
    fi
else
    sudo cp "$DATASET_FILE" "$TARGET_FILE"
    echo "[OK] Data loaded to $TARGET_FILE"
fi

# ============================================================
# 6. SET ENVIRONMENT AND RUN BENCHMARK
# ============================================================
export ARC_URL="http://localhost:8000"
export ARC_API_KEY="$ARC_TOKEN"
export DATABASE="clickbench"
export TABLE="hits"

echo ""
echo "Running ClickBench queries (true cold runs)..."
echo "================================================"
./run.sh 2>&1 | tee log.txt

# ============================================================
# 7. STOP ARC AND FORMAT RESULTS
# ============================================================
echo "Stopping Arc..."
sudo systemctl stop arc

# Format results as proper JSON array
cat log.txt | grep -oE '^[0-9]+\.[0-9]+|^null' | \
    awk '{
        if (NR % 3 == 1) printf "[";
        printf "%s", $1;
        if (NR % 3 == 0) print "],";
        else printf ", ";
    }' > results.txt

echo ""
echo "[OK] Benchmark complete!"
echo "================================================"
echo "Load time: 0"
echo "Data size: $EXPECTED_SIZE"
cat results.txt
echo "================================================"

# ============================================================
# 8. CLEANUP
# ============================================================
echo "Cleaning up..."

# Uninstall Arc package
sudo dpkg -r arc || true

# Remove Arc data directory
sudo rm -rf /var/lib/arc

echo "[OK] Cleanup complete"
# Thin shim — actual flow is in lib/benchmark-common.sh.
export BENCH_DOWNLOAD_SCRIPT="download-hits-parquet-single"
export BENCH_DURABLE=yes
exec ../lib/benchmark-common.sh
11 changes: 11 additions & 0 deletions arc/check
@@ -0,0 +1,11 @@
#!/bin/bash
set -e

ARC_URL="${ARC_URL:-http://localhost:8000}"
TOKEN=$(cat arc_token.txt 2>/dev/null || true)

if [ -n "$TOKEN" ]; then
    curl -sf "$ARC_URL/health" -H "x-api-key: $TOKEN" >/dev/null
else
    curl -sf "$ARC_URL/health" >/dev/null
fi
10 changes: 10 additions & 0 deletions arc/data-size
@@ -0,0 +1,10 @@
#!/bin/bash
set -e

# Source parquet file size (loaded into Arc's data directory).
F="/var/lib/arc/data/clickbench/hits/hits.parquet"
if [ -f "$F" ]; then
    sudo stat -c%s "$F"
else
    echo 14779976446
fi
28 changes: 28 additions & 0 deletions arc/install
@@ -0,0 +1,28 @@
#!/bin/bash
set -e

# Install Arc from a .deb release. Idempotent.
if dpkg -l arc 2>/dev/null | grep -q '^ii '; then
    exit 0
fi

ARC_VERSION=$(curl -s https://api.github.com/repos/Basekick-Labs/arc/releases/latest \
    | grep -oP '"tag_name": "v\K[^"]+')
if [ -z "$ARC_VERSION" ]; then
    echo "Error: Could not fetch latest Arc version from GitHub" >&2
    exit 1
fi

ARCH=$(uname -m)
if [ "$ARCH" = "aarch64" ] || [ "$ARCH" = "arm64" ]; then
    DEB_FILE="arc_${ARC_VERSION}_arm64.deb"
else
    DEB_FILE="arc_${ARC_VERSION}_amd64.deb"
fi
DEB_URL="https://github.com/Basekick-Labs/arc/releases/download/v${ARC_VERSION}/${DEB_FILE}"

if [ ! -f "$DEB_FILE" ]; then
    wget -q "$DEB_URL" -O "$DEB_FILE"
fi

sudo dpkg -i "$DEB_FILE" || sudo apt-get install -f -y
20 changes: 20 additions & 0 deletions arc/load
@@ -0,0 +1,20 @@
#!/bin/bash
set -e

# Arc loads the parquet file into its data directory and indexes it on startup.
ARC_DATA_DIR="/var/lib/arc/data"
TARGET_DIR="$ARC_DATA_DIR/clickbench/hits"
TARGET_FILE="$TARGET_DIR/hits.parquet"

sudo mkdir -p "$TARGET_DIR"

if [ -f "$TARGET_FILE" ] && \
   [ "$(stat -c%s hits.parquet)" -eq "$(stat -c%s "$TARGET_FILE")" ]; then
    : # already loaded
else
    sudo cp hits.parquet "$TARGET_FILE"
fi

# Free up local space.
rm -f hits.parquet
sync
49 changes: 49 additions & 0 deletions arc/query
@@ -0,0 +1,49 @@
#!/bin/bash
# Reads a SQL query from stdin, POSTs it to Arc's HTTP API.
# Stdout: query response body (JSON).
# Stderr: query runtime in fractional seconds on the last line (extracted
# from Arc's journal log line `execution_time_ms=N`).
# Exit non-zero on error.
set -e

ARC_URL="${ARC_URL:-http://localhost:8000}"
ARC_API_KEY="${ARC_API_KEY:-$(cat arc_token.txt 2>/dev/null)}"

query=$(cat)

# Build JSON payload with proper escaping.
JSON_PAYLOAD=$(jq -Rs '{sql: .}' <<<"$query")

# Mark journal position so we can locate the matching execution_time_ms entry.
LOG_MARKER=$(date -u +"%Y-%m-%dT%H:%M:%S")

RESPONSE=$(curl -s -w "\n%{http_code}" \
    -X POST "$ARC_URL/api/v1/query" \
    -H "x-api-key: $ARC_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$JSON_PAYLOAD" \
    --max-time 300)

HTTP_CODE=$(printf '%s\n' "$RESPONSE" | tail -1)
BODY=$(printf '%s\n' "$RESPONSE" | head -n -1)

if [ "$HTTP_CODE" != "200" ]; then
    printf 'arc query failed: HTTP %s\n%s\n' "$HTTP_CODE" "$BODY" >&2
    exit 1
fi

# Result body to stdout.
printf '%s\n' "$BODY"

# Extract execution_time_ms from Arc's journal — give it a moment to flush.
sleep 0.1
EXEC_MS=$(sudo journalctl -u arc --since="$LOG_MARKER" --no-pager 2>/dev/null \
    | grep -oP 'execution_time_ms=\K[0-9]+' | tail -1)

if [ -z "$EXEC_MS" ]; then
    echo "Could not extract execution_time_ms from arc journal" >&2
    exit 1
fi

# Convert ms -> seconds and emit on stderr.
awk -v ms="$EXEC_MS" 'BEGIN { printf "%.4f\n", ms / 1000 }' >&2
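
A usage sketch for this script (assuming Arc is running and `arc_token.txt` exists in the current directory; the timing value is illustrative):

```bash
# Result rows go to stdout, the server-side runtime in seconds to stderr.
echo "SELECT COUNT(*) FROM hits" | ./query >result.json 2>time.txt
cat time.txt    # e.g. 0.1234
```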