Draft
25 commits
c05ccba
build: antithesis-sdk dep + instrumented-build support
DAlperin May 15, 2026
a079d3f
storage/persist/catalog: wrap panic and invariant sites with antithes…
DAlperin May 15, 2026
149531f
parallel_workload: pool-backed mode for externally-managed clusters
DAlperin May 15, 2026
36c78fb
parallel_workload: narrow Scenario.Kill swallow to fault-shaped errors
DAlperin May 17, 2026
94940c4
test/antithesis: harness scaffolding (compose, Makefile, workload image)
DAlperin May 15, 2026
58246e9
ci: nightly Antithesis pipeline + compose lint
DAlperin May 15, 2026
72244ca
test/antithesis: workload helpers (pg/mysql/kafka/testdrive/logging)
DAlperin May 15, 2026
f5fe5c6
test/antithesis: kafka source workload drivers
DAlperin May 15, 2026
725bba4
test/antithesis: mysql CDC workload drivers
DAlperin May 15, 2026
9f6e35c
test/antithesis: postgres CDC workload drivers + testdrive singleton
DAlperin May 15, 2026
7244335
test/antithesis: parallel-workload driver + upsert-ancient property
DAlperin May 15, 2026
a753a00
test/antithesis: catalog/persist/reclock SUT-anchored drivers
DAlperin May 15, 2026
f445668
test/antithesis: drivers targeting SinceViolation bug family (#11200 …
DAlperin May 15, 2026
c19c91b
test/antithesis: split monolithic harness into per-workload-group con…
DAlperin May 17, 2026
089fc49
test/antithesis: drop broken recovery anytime drivers + redundant mys…
DAlperin May 17, 2026
34c599a
test/antithesis: drop unused testdrive reset paths
DAlperin May 17, 2026
5100c87
test/antithesis: bump parallel-workload runtime + add failure-rate/se…
DAlperin May 17, 2026
4b2baf4
test/antithesis: wire CancelAction into parallel-workload driver
DAlperin May 17, 2026
b42f200
test/antithesis: tolerate polaris-side fault errors during parallel-w…
DAlperin May 17, 2026
3cb9ebc
testdrive: use Instant for elapsed-time delta, immune to wall-clock f…
DAlperin May 18, 2026
a5d2713
test/antithesis: add upsert-stress group for INC-936 invalid-upsert-s…
DAlperin May 18, 2026
2b9f62a
test/antithesis: add sql-server-cdc group + testdrive-runner for mysq…
DAlperin May 19, 2026
344239b
test/antithesis: parallel-workload SinceViolation coverage (REFRESH E…
DAlperin May 20, 2026
8a70bba
test/antithesis: extract shared fault-tolerance helper; absorb three …
DAlperin May 20, 2026
edf9cb0
test/antithesis: add onboarding README + propose next frameworks to port
DAlperin May 20, 2026
51 changes: 50 additions & 1 deletion Cargo.lock

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions Cargo.toml
@@ -265,6 +265,7 @@ ahash = { version = "0.8.12", default-features = false }
 aho-corasick = "1.1.4"
 allocation-counter = "0"
 anyhow = "1.0.102"
+antithesis_sdk = "0.2.8"
 array-concat = "0.5.5"
 arrayvec = "0.7.6"
 arrow = { version = "57", default-features = false }
48 changes: 35 additions & 13 deletions bin/ci-builder
@@ -18,6 +18,9 @@ set -euo pipefail

 NIGHTLY_RUST_DATE=2026-05-06

+# Allow overriding the container runtime (e.g. MZ_DEV_CI_BUILDER_RUNTIME=podman).
+DOCKER="${MZ_DEV_CI_BUILDER_RUNTIME:-docker}"
+
 workdir=$(pwd)
 cd "$(dirname "$0")/.."

@@ -128,10 +131,14 @@ gid=$(id -g)
 [[ "$gid" -lt 500 ]] && gid=$uid

 build() {
+    local cache_args=()
+    if [[ "$DOCKER" != "podman" ]]; then
+        cache_args+=(--cache-from=materialize/ci-builder:"$cache_tag")
+        cache_args+=(--cache-to=type=inline,mode=max)
+    fi
     # shellcheck disable=SC2086 # intentional splitting of build args string
-    docker buildx build --pull \
-        --cache-from=materialize/ci-builder:"$cache_tag" \
-        --cache-to=type=inline,mode=max \
+    "$DOCKER" buildx build --pull \
+        "${cache_args[@]}" \
         $docker_build_args \
         --tag materialize/ci-builder:"$tag" \
         --tag ghcr.io/materializeinc/materialize/ci-builder:"$tag" \
@@ -181,13 +188,13 @@ case "$cmd" in
         build "$@"
         ;;
     exists)
-        docker manifest inspect "$image_registry"/ci-builder:"$tag" &> /dev/null
+        "$DOCKER" manifest inspect "$image_registry"/ci-builder:"$tag" &> /dev/null
         ;;
     tag)
         echo "$tag"
         ;;
     push)
-        docker login ghcr.io -u materialize-bot --password "$GITHUB_GHCR_TOKEN"
+        "$DOCKER" login ghcr.io -u materialize-bot --password "$GITHUB_GHCR_TOKEN"
         build --push "$@"
         ;;
     run)
@@ -274,6 +281,7 @@ case "$cmd" in
             --env AZURE_SERVICE_ACCOUNT_PASSWORD
             --env AZURE_SERVICE_ACCOUNT_TENANT
             --env GCP_SERVICE_ACCOUNT_JSON
+            --env ANTITHESIS_GCP_SERVICE_ACCOUNT_JSON
             --env GITHUB_TOKEN
             --env GITHUB_GHCR_TOKEN
             --env GPG_KEY
@@ -372,20 +380,26 @@ case "$cmd" in
             )
         fi
         if [[ "$(uname -s)" = Linux ]]; then
-            args+=(
-                --user "$(id -u):$(stat -c %g /var/run/docker.sock)"
-            )
+            if [[ "${MZ_DEV_CI_BUILDER_RUNTIME:-docker}" == "podman" ]]; then
+                args+=(--userns=keep-id)
+            else
+                args+=(
+                    --user "$(id -u):$(stat -c %g /var/run/docker.sock)"
+                )
+            fi

             if [[ $secrets == "true" ]]; then
                 # Allow Docker-in-Docker by mounting the Docker socket in the
                 # container. Host networking allows us to see ports created by
                 # containers that we launch.
                 args+=(
-                    --volume "/var/run/docker.sock:/var/run/docker.sock"
                     --network host
                     --env "DOCKER_TLS_VERIFY=${DOCKER_TLS_VERIFY-}"
                     --env "DOCKER_HOST=${DOCKER_HOST-}"
                 )
+                if [[ -S /var/run/docker.sock ]]; then
+                    args+=(--volume "/var/run/docker.sock:/var/run/docker.sock")
+                fi

                 # Forward Docker configuration too, if available.
                 docker_dir=${DOCKER_CONFIG:-$HOME/.docker}
@@ -431,14 +445,22 @@ case "$cmd" in
         image="$image_registry/ci-builder:$tag"
         # Try downloading the image a few times in case of registry flakiness
         if [[ "${CI:-}" ]]; then
-            if ! docker inspect "$image" > /dev/null 2>&1; then
-                docker pull "$image" || (sleep 3 && docker pull "$image") || (sleep 3 && docker pull "$image") || sleep 3
+            if ! "$DOCKER" inspect "$image" > /dev/null 2>&1; then
+                "$DOCKER" pull "$image" || (sleep 3 && "$DOCKER" pull "$image") || (sleep 3 && "$DOCKER" pull "$image") || sleep 3
             fi
         fi
-        docker run "${args[@]}" "$image" eatmydata "${docker_command[@]}"
+        if [[ "$DOCKER" == "podman" ]]; then
+            # --userns=keep-id already maps the host UID/GID into the
+            # container, so autouseradd is unnecessary. Override the
+            # entrypoint to skip it.
+            args+=(--entrypoint eatmydata)
+            "$DOCKER" run "${args[@]}" "$image" "${docker_command[@]}"
+        else
+            "$DOCKER" run "${args[@]}" "$image" eatmydata "${docker_command[@]}"
+        fi
         ;;
     root-shell)
-        docker exec --interactive --tty --user 0:0 "$(<"$cid_file")" eatmydata ci/builder/root-shell.sh
+        "$DOCKER" exec --interactive --tty --user 0:0 "$(<"$cid_file")" eatmydata ci/builder/root-shell.sh
         ;;
     *)
         printf "unknown command %q\n" "$cmd"
9 changes: 7 additions & 2 deletions ci/builder/Dockerfile
@@ -10,7 +10,7 @@
 # Stage 1: Build a minimum CI Builder image that we can use for the initial
 # steps like `mkpipeline` and `Build`, as well as any tests that are self
 # contained and use other Docker images.
-FROM ubuntu:noble-20260410 AS ci-builder-min
+FROM ubuntu:noble-20260210.1 AS ci-builder-min

 WORKDIR /workdir

@@ -399,8 +399,13 @@ ENV CARGO_HOME=/cargo
 RUN mkdir /cargo && chmod 777 /cargo
 VOLUME /cargo

+# Antithesis coverage instrumentation library (used when --antithesis is passed)
+RUN curl -sSL https://antithesis.com/assets/instrumentation/libvoidstar.so \
+    -o /usr/lib/libvoidstar.so \
+    && ldconfig
+
 # Stage 3: Build a lightweight CI Builder image for console/playwright jobs.
-FROM ubuntu:noble-20260410 AS ci-builder-console
+FROM ubuntu:noble-20260324 AS ci-builder-console

 ARG ARCH_GCC
 ARG ARCH_GO
32 changes: 32 additions & 0 deletions ci/mkpipeline.py
@@ -121,6 +121,12 @@ def main() -> int:
         type=Sanitizer,
         choices=Sanitizer,
     )
+    parser.add_argument(
+        "--antithesis",
+        action="store_true",
+        default=ui.env_is_truthy("CI_ANTITHESIS"),
+        help="enable Antithesis coverage instrumentation",
+    )
     parser.add_argument(
         "--priority",
         type=int,
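
The new `--antithesis` flag takes its default from the `CI_ANTITHESIS` environment variable. A minimal, self-contained sketch of that pattern follows. It makes two assumptions: `env_is_truthy` is a hypothetical stand-in for `ui.env_is_truthy` (whose exact truthiness rules are not shown in this diff), and `build_parser` is an illustrative name that does not appear in `mkpipeline.py`.

```python
import argparse
import os


def env_is_truthy(name: str, env=None) -> bool:
    """Hypothetical stand-in for ui.env_is_truthy; the real helper may differ."""
    env = os.environ if env is None else env
    return env.get(name, "").strip().lower() not in ("", "0", "no", "false")


def build_parser(env=None) -> argparse.ArgumentParser:
    """Build a parser whose --antithesis default comes from the environment."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--antithesis",
        action="store_true",
        # Evaluated once, when the parser is built, not when arguments are parsed.
        default=env_is_truthy("CI_ANTITHESIS", env),
        help="enable Antithesis coverage instrumentation",
    )
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args().antithesis)
```

One consequence of pairing `action="store_true"` with an environment-derived default: once the variable is truthy, the command line offers no way to switch the flag back off.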
@@ -166,6 +172,7 @@ def get_hashes(arch: Arch) -> tuple[str, bool]:
             arch=arch,
             coverage=args.coverage,
             sanitizer=args.sanitizer,
+            antithesis=args.antithesis,
         )
         deps = repo.resolve_dependencies(image for image in repo if image.publish)
         check = deps.check()
@@ -209,6 +216,7 @@ def fetch_hashes() -> None:
                     args.coverage,
                     args.sanitizer,
                     lto,
+                    args.antithesis,
                 )
                 trim_ci_glue_exempt_steps(pipeline)
             else:
@@ -218,9 +226,11 @@ def fetch_hashes() -> None:
                     args.coverage,
                     args.sanitizer,
                     lto,
+                    args.antithesis,
                 )
     truncate_skip_length(pipeline)
     handle_sanitizer_skip(pipeline, args.sanitizer)
+    handle_antithesis_skip(pipeline, args.antithesis)
     increase_agents_timeouts(pipeline, args.sanitizer, args.coverage)
     prioritize_pipeline(pipeline, args.priority)
     switch_jobs_to_aws(pipeline, args.priority)
@@ -240,6 +250,7 @@ def fetch_hashes() -> None:
         args.coverage,
         args.sanitizer,
         lto,
+        args.antithesis,
     )
     add_nightly_deploy_dependency(pipeline, args.pipeline)
     remove_dependencies_on_prs(pipeline, args.pipeline, hash_check)
@@ -328,6 +339,21 @@ def handle_sanitizer_skip(pipeline: Any, sanitizer: Sanitizer) -> None:
                 step["skip"] = True


+def handle_antithesis_skip(pipeline: Any, antithesis: bool) -> None:
+    if antithesis:
+        pipeline.setdefault("env", {})["CI_ANTITHESIS"] = "1"
+
+        for step in steps(pipeline):
+            if step.get("antithesis") == "skip":
+                step["skip"] = True
+
+    else:
+
+        for step in steps(pipeline):
+            if step.get("antithesis") == "only":
+                step["skip"] = True
+
+
 def increase_agents_timeouts(
     pipeline: Any, sanitizer: Sanitizer, coverage: bool
 ) -> None:
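
As a rough illustration of the `handle_antithesis_skip` hunk, the sketch below runs the same skip/only rule against a toy pipeline. It rests on two assumptions. `steps` is a hypothetical stand-in for the helper used in `mkpipeline.py`, assumed here to yield each step dictionary; the real helper may also walk nested groups. The two branches are also folded into a single loop, which should behave the same as the diffed version under that assumption.

```python
from typing import Any, Iterator


def steps(pipeline: Any) -> Iterator[dict]:
    """Hypothetical helper: yield every top-level step of the pipeline."""
    yield from pipeline.get("steps", [])


def handle_antithesis_skip(pipeline: Any, antithesis: bool) -> None:
    """Skip steps whose `antithesis` key conflicts with the current mode."""
    if antithesis:
        # Propagate the mode to every job through the pipeline-level env.
        pipeline.setdefault("env", {})["CI_ANTITHESIS"] = "1"
    # In Antithesis mode, drop steps marked "skip"; otherwise drop "only" steps.
    unwanted = "skip" if antithesis else "only"
    for step in steps(pipeline):
        if step.get("antithesis") == unwanted:
            step["skip"] = True


if __name__ == "__main__":
    # Step ids are made up for the demonstration.
    demo = {
        "steps": [
            {"id": "regular-only", "antithesis": "skip"},
            {"id": "antithesis-only", "antithesis": "only"},
            {"id": "unmarked"},
        ]
    }
    handle_antithesis_skip(demo, antithesis=True)
    for step in demo["steps"]:
        print(step["id"], step.get("skip", False))
```

Steps without an `antithesis` key are left untouched in both modes, so only explicitly marked steps change behavior.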
@@ -711,6 +737,7 @@ def trim_tests_pipeline(
     coverage: bool,
     sanitizer: Sanitizer,
     lto: bool,
+    antithesis: bool = False,
 ) -> None:
     """Trim pipeline steps whose inputs have not changed in this branch.

@@ -731,6 +758,7 @@ def trim_tests_pipeline(
         profile=mzbuild.Profile.RELEASE if lto else mzbuild.Profile.OPTIMIZED,
         coverage=coverage,
         sanitizer=sanitizer,
+        antithesis=antithesis,
     )
     deps = repo.resolve_dependencies(image for image in repo)

@@ -917,6 +945,7 @@ def add_cargo_test_dependency(
     coverage: bool,
     sanitizer: Sanitizer,
     lto: bool,
+    antithesis: bool = False,
 ) -> None:
     """Cargo Test normally doesn't have to wait for the build to complete, but it requires a few images (ubuntu-base, postgres), which are rarely changed. So only add a dependency when those images are not on Dockerhub yet."""
     if pipeline_name not in ("test", "nightly"):
@@ -933,6 +962,7 @@ def add_cargo_test_dependency(
         profile=mzbuild.Profile.RELEASE if lto else mzbuild.Profile.OPTIMIZED,
         coverage=coverage,
         sanitizer=sanitizer,
+        antithesis=antithesis,
     )
     composition = Composition(repo, name="cargo-test")
     deps = composition.dependencies
@@ -1090,6 +1120,8 @@ def remove_mz_specific_keys(pipeline: Any) -> None:
             del step["coverage"]
         if "sanitizer" in step:
             del step["sanitizer"]
+        if "antithesis" in step:
+            del step["antithesis"]
         if "ci_glue_exempt" in step:
             del step["ci_glue_exempt"]
         if (
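
The last hunk adds `antithesis` to the step keys that `remove_mz_specific_keys` deletes, presumably so that keys meaningful only to `mkpipeline.py` do not reach the CI system. The sketch below is a simplified illustration of that idea, not the function from the PR. `strip_mz_keys` and `MZ_ONLY_KEYS` are hypothetical names, only the four keys visible in the hunk are handled, and the real function continues with further logic that the diff truncates.

```python
from typing import Any

# Step keys visible in the hunk above; the real function may handle more.
MZ_ONLY_KEYS = ("coverage", "sanitizer", "antithesis", "ci_glue_exempt")


def strip_mz_keys(pipeline: Any) -> None:
    """Remove generator-only keys from every top-level step, in place."""
    for step in pipeline.get("steps", []):
        for key in MZ_ONLY_KEYS:
            # pop() with a default tolerates steps that never set the key.
            step.pop(key, None)


if __name__ == "__main__":
    demo = {"steps": [{"id": "build", "antithesis": "skip", "coverage": "skip"}]}
    strip_mz_keys(demo)
    print(demo)
```

Using `pop` with a default collapses the repeated `if key in step: del step[key]` pairs into one loop, at the cost of hiding which keys a given step actually carried.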
23 changes: 23 additions & 0 deletions ci/nightly/pipeline.template.yml
@@ -65,6 +65,29 @@ steps:
     branches: "main"
     skip: "currently broken"

+  - id: build-x86_64-antithesis
+    label: ":rust: Build x86_64 (Antithesis)"
+    # Regenerate the antithesis compose YAML before building so the
+    # `antithesis-config` image's fingerprint captures the same
+    # materialized fingerprint we're about to publish — otherwise
+    # Antithesis would try to pull a stale `materialized:mzbuild-…`
+    # whenever the committed YAML lagged behind source changes.
+    command: bin/ci-builder run stable ci/test/build-antithesis.sh
+    inputs:
+      - "*"
+    depends_on: []
+    timeout_in_minutes: 90
+    agents:
+      queue: l-builder-linux-x86_64
+    env:
+      CI_ANTITHESIS: "1"
+    # Antithesis-flavored images get distinct mzbuild fingerprints, so
+    # they coexist with regular GHCR tags. The build is x86_64-only —
+    # Antithesis runs amd64 sandboxes.
+    sanitizer: skip
+    coverage: skip
+    antithesis: skip
+
   - id: build-rust-latest-beta
     label: "Build with Latest Rust Beta"
     command: bin/ci-builder run stable ci/test/rust-beta-build.sh