diff --git a/apps/docs/content/guides/backup.mdx b/apps/docs/content/guides/backup.mdx index fca47ed3..b72afee1 100644 --- a/apps/docs/content/guides/backup.mdx +++ b/apps/docs/content/guides/backup.mdx @@ -8,6 +8,8 @@ Zerops auto-backs up databases and storage daily (00:00-01:00 UTC) with X25519 e ## Supported Services MariaDB, PostgreSQL, Qdrant, Elasticsearch, NATS, Meilisearch, Shared Storage. +**ClickHouse**: not on the standard auto-backup path — back it up with the native `BACKUP ALL ...` SQL command (super user), stored as `.tar.gz`. + **Not supported**: Runtimes, Object Storage (use S3 lifecycle policies), Valkey/KeyDB (in-memory). ## Schedule Options @@ -40,17 +42,15 @@ End-to-end with X25519 per-project keys. Decrypted only on download. 7 days after service or project deletion before backups are permanently removed. ## Backup Formats by Service -| Service | Format | -|---------|--------| -| PostgreSQL | pg_dump | -| MariaDB | mysqldump | -| Elasticsearch | elasticdump (.gz) | -| Meilisearch | .dump | -| Qdrant | .snapshot | -| NATS | .tar.gz | -| Shared Storage | filesystem archive | +| Service | Tool → Format | +|---------|---------------| +| PostgreSQL | `pg_dump` → `.zip` (per-schema custom-format `-Fc` dumps) | +| MariaDB | `mariabackup` → `.xb.gz` (xbstream + gzip) — **not** `mysqldump` (that is the manual-export tool, a different operation) | +| Elasticsearch | elasticdump → `.gz` | +| Meilisearch | `.dump` | +| Qdrant | `.snapshot` | +| NATS | `.tar.gz` | +| Shared Storage | tar → `.tar.gz` | ## Gotchas -1. **Object Storage has no Zerops backup**: Use S3 lifecycle policies or external backup -2. **Valkey/KeyDB not backed up**: In-memory data — use persistence or application-level backup -3. **Backup storage is shared**: All services in a project share the backup quota +- Valkey/KeyDB are not backed up → rely on service persistence or application-level backup. 
diff --git a/apps/docs/content/guides/build-cache.mdx b/apps/docs/content/guides/build-cache.mdx index cf2efbfe..e2f6bb5d 100644 --- a/apps/docs/content/guides/build-cache.mdx +++ b/apps/docs/content/guides/build-cache.mdx @@ -80,15 +80,7 @@ Any change to these zerops.yml fields invalidates **both layers**: --- -## Build Container Specs - -CPU 1-5 cores, RAM 8 GB fixed, Disk 1-100 GB, Timeout 60 min. User `zerops` with **sudo**. Default OS: **Alpine** (use `apt-get` with `os: ubuntu`). - ---- - ## Common Pitfalls -1. **Cascade invalidation**: Changing `prepareCommands` wipes build-layer cache too (e.g., adding `sqlite` to prepare also clears cached `node_modules`) -2. **`cache: false` is misleading**: Only clears `/build/source` cache. Globally installed packages (Go modules, pip packages) persist in the base layer -3. **No-clobber restore**: If source repo contains a file also in cache, **source wins** -- the cached version is silently skipped (logged but does not fail) -4. **Lock file caching**: Cache lock files (`package-lock.json`, `composer.lock`) alongside dependency directories for consistent installs +1. **No-clobber restore**: If source repo contains a file also in cache, **source wins** -- the cached version is silently skipped (logged but does not fail) +2. **Lock file caching**: Cache lock files (`package-lock.json`, `composer.lock`) alongside dependency directories for consistent installs diff --git a/apps/docs/content/guides/cdn.mdx b/apps/docs/content/guides/cdn.mdx index 3b9d2cb8..821ae59e 100644 --- a/apps/docs/content/guides/cdn.mdx +++ b/apps/docs/content/guides/cdn.mdx @@ -48,13 +48,13 @@ DNS TTL: 30 seconds. Geo-steering routes to nearest node. EU Prague is fallback Wildcard must be at end. Use `$` suffix for exact file match. ### Purge via zsc + +Signature: `zsc cdn purge [path]` — the **domain is required first**; the path/pattern is the optional second arg (defaults to `*`). 
Works for **Static-Mode CDN only** — Object-Storage CDN content is purged via the REST API, not `zsc`. Run from a container whose CDN domain is active. ```bash -zsc cdn purge /* # Purge all cached content -zsc cdn purge /images/* # Purge directory -zsc cdn purge /style.css$ # Purge exact file +zsc cdn purge example.com # Purge all cached content for the domain +zsc cdn purge example.com "/images/*" # Purge a directory +zsc cdn purge example.com "/style.css$" # Purge exact file ``` ## Gotchas -1. **30-day fixed TTL**: Cannot be changed — `Cache-Control: max-age=3600` has no effect on CDN -2. **No wildcard domains on static CDN**: `*.domain.com` is not supported -3. **Purge wildcards at end only**: `/images/*.jpg` is invalid — use `/images/*` +1. **CDN URLs are project-scoped env vars**: `${storageCdnUrl}`, `${staticCdnUrl}`, `${apiCdnUrl}` are referenced directly with no hostname prefix (unlike service vars like `${storage_apiUrl}`) diff --git a/apps/docs/content/guides/choose-cache.mdx b/apps/docs/content/guides/choose-cache.mdx index 1debb17d..130f8115 100644 --- a/apps/docs/content/guides/choose-cache.mdx +++ b/apps/docs/content/guides/choose-cache.mdx @@ -9,24 +9,11 @@ description: "**Use Valkey.** KeyDB development has stalled and is effectively d | Need | Choice | Why | |------|--------|-----| -| **Any caching need** | **Valkey** (default) | Active development, full HA, Redis-compatible | +| **Any caching need** | **Valkey** (default) | Active development, optional HA, Redis-compatible | | Legacy KeyDB apps | KeyDB | Only if migrating existing KeyDB deployment | ## Valkey (Default Choice) - Redis-compatible drop-in replacement -- HA: 3 nodes (1 master + 2 replicas) with automatic failover -- Ports: 6379 (non-TLS), 6380 (TLS), 7000 (read replica non-TLS), 7001 (read replica TLS) -- Connection: `redis://${user}:${password}@${hostname}:6379` -- HA detail: Ports 6379/6380 on replicas forward traffic to current master (Zerops-specific, not native Valkey) - -## 
KeyDB (Deprecated) - -- Development activity has slowed significantly -- Port: 6379 -- **Do not use for new projects** - -## Gotchas -1. **HA replication is async**: Brief data loss possible during master failover -2. **Port forwarding is Zerops-specific**: Replicas forward 6379/6380 to master — this is not standard Redis/Valkey behavior -3. **Read replicas use different ports**: 7000/7001 for direct replica reads +- Version: use `valkey@7.2` — the platform rejects `valkey@8` at import (`serviceStackTypeNotFound`) +- Connection: `redis://${hostname}:6379` — Valkey runs **unauthenticated** on Zerops; there are NO `user`/`password` env vars. Do **not** template `${cache_user}`/`${cache_password}` — they don't exist and produce a broken DSN diff --git a/apps/docs/content/guides/choose-database.mdx b/apps/docs/content/guides/choose-database.mdx index fd6231e4..2c415ce5 100644 --- a/apps/docs/content/guides/choose-database.mdx +++ b/apps/docs/content/guides/choose-database.mdx @@ -3,38 +3,24 @@ title: "Choosing a Database on Zerops" description: "**Use PostgreSQL** for everything unless you have a specific reason not to. It's the best-supported database on Zerops with full HA, read replicas, and pgBouncer." --- -**Use PostgreSQL** for everything unless you have a specific reason not to. It's the best-supported database on Zerops with full HA, read replicas, and pgBouncer. +**Use PostgreSQL** for everything unless you have a specific reason not to — the best-supported database on Zerops, with optional HA, read replicas, and pgBouncer. Default mode is **NON_HA** (single node); HA is opt-in and immutable after creation. 
## Decision Matrix | Need | Choice | Why | |------|--------|-----| -| **General-purpose** | **PostgreSQL** (default) | Full HA, read replicas, pgBouncer, best Zerops support | +| **General-purpose** | **PostgreSQL** (default) | Optional HA, read replicas, pgBouncer, best Zerops support | | MySQL compatibility | MariaDB | MaxScale routing, async replication | | Analytics / OLAP | ClickHouse | Columnar storage, ReplicatedMergeTree, 4 protocol ports | ## PostgreSQL (Default Choice) -- HA: 3 nodes (1 primary + 2 replicas) -- Ports: 5432 (primary), 5433 (read replicas), 6432 (external TLS via pgBouncer) -- Connection: `postgresql://${user}:${password}@${hostname}:5432/${db}` -- Read scaling: Use port 5433 for read-heavy workloads +- Connection: the generated `${connectionString}` is `postgresql://${user}:${password}@${hostname}:5432` (no database path). Append `/${dbName}` yourself if your driver needs one — the db-name var is `dbName` (default `db`) ## MariaDB -- HA: MaxScale routing with async replication -- Port: 3306 -- Connection: `mysql://${user}:${password}@${hostname}:3306/${db}` -- Use when: Application requires MySQL wire protocol +- Connection: the generated `${connectionString}` is `mysql://${user}:${password}@${hostname}:3306` (no database path); append `/${dbName}` if your driver needs one ## ClickHouse -- HA: 3 data nodes, replication factor 3 -- Ports: 9000 (native), 8123 (HTTP), 9004 (MySQL), 9005 (PostgreSQL) -- Requires `ReplicatedMergeTree` engine in HA mode -- Use when: Analytics, time-series, OLAP workloads - -## Gotchas -1. **HA mode is immutable**: Cannot switch HA/NON_HA after creation — delete and recreate -2. **No internal TLS**: Use `http://hostname:port` internally — VPN provides encryption -3. 
**PostgreSQL URI scheme**: Some libraries need `postgres://` not `postgresql://` — create a custom env var +- HA: replicated databases use a `Replicated(...)` engine `ON CLUSTER`; tables use a `Replicated*MergeTree` engine (without `ON CLUSTER`) diff --git a/apps/docs/content/guides/choose-queue.mdx b/apps/docs/content/guides/choose-queue.mdx index eb06b130..34b55d03 100644 --- a/apps/docs/content/guides/choose-queue.mdx +++ b/apps/docs/content/guides/choose-queue.mdx @@ -14,40 +14,3 @@ description: "**Use NATS** for most cases (simple, fast, JetStream persistence). | Lightweight pub/sub | NATS — core | Low overhead, 8MB default messages, fire-and-forget | | Durable queues, replay, at-least-once | NATS — JetStream | Persistent streams, durable consumers, ack/redeliver | | Event sourcing / audit logs | Kafka | Indefinite topic retention, strong ordering | - -## NATS (Default Choice) - -NATS exposes **two distinct messaging shapes**. Pick ONE per recipe and write yaml comments / KB content describing only that shape — mixing them confuses porters about what the recipe actually does. - -- **Core pub/sub + queue groups**: `nc.subscribe('subject', { queue: 'workers' })`. No persistence; queue groups load-balance delivery across replicas; lost messages stay lost. HA story: surviving cluster nodes keep delivering, no consumer position to restore. Use when fan-out + load balance + at-most-once is enough. -- **JetStream streams + durable consumers**: opens an explicit stream via `JetStreamManager`, subscribes durably via `js.subscribe(...)`. Persistent message store; replay on reconnect; ack/redeliver. HA story: cluster replicates stream state, acked-but-unprocessed messages survive node loss. Use when at-least-once + replay + persistence are required. - -**Authoring rule**: a recipe's yaml comments and KB bullets should reflect the shape the code actually uses. 
If the worker only calls `nc.subscribe()` with a queue group and never opens a stream, do not invoke JetStream language at HA tiers — the recipe has no stream to replicate. If the worker opens a JetStream stream, the JetStream HA story is the relevant one. - -- Ports: 4222 (client), 8222 (HTTP monitoring) -- Auth: user `zerops` + auto-generated password -- **Connection** — two supported patterns, pick ONE: - - **Separate env vars** (recommended, works with every NATS client library): pass `servers: ${hostname}:${port}` plus `user: ${user}, pass: ${password}` as client-side connect options. The servers list stays credential-free. - - **Opaque connection string**: pass `${connectionString}` directly as the servers option — the platform builds a correctly-formatted URL with embedded auth that the NATS server expects. -- JetStream capability: enabled by default (`JET_STREAM_ENABLED=1`); recipes opt in by writing JetStream client code. Setting `JET_STREAM_ENABLED=0` hard-disables the capability across the project. -- Storage: Up to 40GB memory + 250GB file store -- Max message: 8MB default, 64MB max (`MAX_PAYLOAD`) -- Health check: `GET /healthz` on port 8222 -- **Config changes require restart** (no hot-reload) - -## Kafka - -- Port: 9092 (SASL PLAIN auth) -- Auth: `user` + `password` env vars (auto-generated) -- Bootstrap: `${hostname}:9092` -- HA: 3 brokers, 6 partitions, replication factor 3 -- Storage: Up to 40GB RAM + 250GB persistent -- Topic retention: **Indefinite** (no time or size limits) -- Schema Registry: Port 8081 (if enabled) - -## Gotchas -1. **NATS config changes need restart**: No hot-reload — changing env vars requires service restart -2. **Kafka single-node has no replication**: 1 broker = 3 partitions but zero redundancy -3. **NATS JetStream HA sync interval**: 1-minute sync across nodes — brief data lag possible. Applies only to recipes that actually open JetStream streams; core pub/sub recipes are unaffected. -4. 
**Kafka SASL only**: No anonymous connections — always use the generated credentials -5. **NATS authorization violation from a hand-composed URL**: do not build a `nats://user:pass@host:4222` URL from the separate env vars. Most NATS client libraries will parse the embedded credentials AND separately attempt SASL with the same values, producing a double-auth that the server rejects with `Authorization Violation` on the first CONNECT frame (symptom: startup crash, no successful subscription). Use either the separate env vars passed as connect options (credential-free servers list) or the opaque `${connectionString}` the platform builds for you — both patterns in the Connection section above avoid the double-auth path. diff --git a/apps/docs/content/guides/choose-runtime-base.mdx b/apps/docs/content/guides/choose-runtime-base.mdx index cbb9faff..5c6df98e 100644 --- a/apps/docs/content/guides/choose-runtime-base.mdx +++ b/apps/docs/content/guides/choose-runtime-base.mdx @@ -3,30 +3,22 @@ title: "Choosing a Runtime Base on Zerops" description: "**Use Alpine** as the default base for all services. Use Ubuntu only when you need system packages not available in Alpine. Use Docker only for pre-built images." --- -**Use Alpine** as the default base for all services. Use Ubuntu only when you need system packages not available in Alpine. Use Docker only for pre-built images. +**Use Alpine** as the default base for all services. Switch to Ubuntu only for **glibc** needs (musl incompatibility): CGO-enabled Go, glibc-built Python/C-extension wheels, or the **Deno** runtime (no Alpine build). Needing a package is NOT itself a reason — both bases install packages (`sudo apk add` / `sudo apt-get install`). Use Docker only for pre-built images. 
## Decision Matrix | Need | Choice | Why | |------|--------|-----| | **Any standard app** | **Alpine** (default) | ~5MB, fast, secure, sufficient for 95% of apps | -| System packages (apt) | Ubuntu | Full Debian ecosystem, ~100MB | +| glibc / CGO / C-extensions / Deno | Ubuntu | musl-incompatible binaries; Deno has no Alpine build (~100MB) | | Pre-built Docker images | Docker | VM-based, bring your own image | -| CGO / native libs | Ubuntu | Better glibc compatibility than Alpine's musl | - -## Alpine (Default) - -- Size: ~5MB base -- Package manager: `apk add` -- Best for: All runtimes (Node.js, Python, Go, Rust, Java, PHP, etc.) -- Zerops uses Alpine as default base for all managed runtimes ## Ubuntu - Size: ~100MB base -- Package manager: `apt-get install` -- Version: 24.04 LTS -- Use when: You need packages not available in Alpine, or need glibc (not musl) +- Package manager: `sudo apt-get update && sudo apt-get install -y ` (sudo required) +- Version: 24.04 LTS (22.04 also available) +- Use when: you need glibc (musl incompatibility) — CGO-linked Go, glibc-built C-extensions, or the Deno runtime (no Alpine build). Needing a package is NOT a reason — both bases install packages - Example: Go apps with CGO, Python packages with C extensions that don't compile on musl ## Docker @@ -37,9 +29,3 @@ description: "**Use Alpine** as the default base for all services. Use Ubuntu on - Disk: Can only increase, never decrease without recreation - Build phase runs in containers (not VMs) - **Always use specific version tags** — `:latest` is cached and won't re-pull - -## Gotchas -1. **Alpine uses musl**: Some C libraries may not compile — use Ubuntu if you hit musl issues -2. **Docker is VM-based**: Vertical scaling restarts the VM — expect brief downtime -3. **Docker `:latest` is cached**: Zerops won't re-pull — always use specific tags like `myapp:1.2.3` -4. 
**Docker requires host networking**: Without `--network=host`, the container can't receive traffic diff --git a/apps/docs/content/guides/choose-search.mdx b/apps/docs/content/guides/choose-search.mdx index 88425b86..0f8ee8f4 100644 --- a/apps/docs/content/guides/choose-search.mdx +++ b/apps/docs/content/guides/choose-search.mdx @@ -16,37 +16,19 @@ description: "**Use Meilisearch** for simple full-text search. Use **Elasticsear ## Meilisearch (Default for Simple Search) -- Single-node only (no clustering) -- Port: 7700 - API keys: `masterKey` (admin), `defaultSearchKey` (frontend-safe), `defaultAdminKey` (backend) - Production mode by default (no search preview dashboard) ## Elasticsearch (Advanced / HA) -- Cluster support with multiple nodes -- Port: 9200 (HTTP only) -- Auth: `elastic` user with auto-generated password -- Plugins via `PLUGINS` env var (comma-separated) -- JVM heap: `HEAP_PERCENT` env var (default 50%) -- Min RAM: 0.25 GB +- Plugins via `PLUGINS` (set in `envSecrets`, comma-separated) +- JVM heap: `HEAP_PERCENT` (in `envSecrets`, default 50%) ## Typesense (Fast Autocomplete) -- HA: 3-node Raft consensus - API key via `apiKey` env var (immutable after generation) -- CORS enabled by default -- Recovery time: up to 1 minute during failover (503/500 auto-resolves) - Data persisted at `/var/lib/typesense` ## Qdrant (Vector Search) -- Ports: 6333 (HTTP), 6334 (gRPC) - API keys: `apiKey` (full access), `readOnlyApiKey` (search only) -- HA: 3 nodes with `automaticClusterReplication=true` by default -- **Internal access only** — no public access available - -## Gotchas -1. **Meilisearch has no HA**: Single-node only — for HA full-text search, use Elasticsearch or Typesense -2. **Qdrant is internal-only**: Cannot be exposed publicly — access via your runtime service -3. **Typesense API key is immutable**: Cannot change `apiKey` after service creation -4. 
**Elasticsearch plugins require restart**: Changing `PLUGINS` env var needs service restart diff --git a/apps/docs/content/guides/ci-cd.mdx b/apps/docs/content/guides/ci-cd.mdx index a9b8b2b7..bcf5322f 100644 --- a/apps/docs/content/guides/ci-cd.mdx +++ b/apps/docs/content/guides/ci-cd.mdx @@ -22,14 +22,15 @@ jobs: runs-on: ubuntu-latest steps: - uses: actions/checkout@v4 - - uses: zeropsio/actions@main + - uses: zeropsio/actions@v1.0.2 with: access-token: ${{ secrets.ZEROPS_TOKEN }} - service-id: + service-id: ${{ secrets.ZEROPS_SERVICE_ID }} ``` - `access-token`: From Settings → Access Token Management - `service-id`: From service URL or three-dot menu → Copy Service ID +- The compact `zeropsio/actions` wrapper exposes only `access-token`/`service-id` — it **cannot pass `--setup`**. For a multi-setup `zerops.yaml`, install zcli and run `zcli push --service-id "${{ secrets.ZEROPS_SERVICE_ID }}" --setup ` instead ## GitLab Integration (Webhook) @@ -40,15 +41,13 @@ jobs: 4. Choose trigger: **New tag** (optional regex) or **Push to branch** ## Skip Pipeline -Include `ci skip` or `skip ci` in commit message (case-insensitive). +Include `[ci skip]` or `[skip ci]` (with the square brackets) in the commit message (case-insensitive). ## Disconnect Service detail → Build, Deploy, Run → Stop automatic build trigger. ## Gotchas -1. **Full repo access required**: Webhook integration needs full access to create/manage webhooks -2. **`ci skip` in commit message**: Prevents pipeline trigger — useful for docs-only changes -3. **Service ID not obvious**: Find it in service URL or three-dot menu → Copy Service ID +1. **External/CI deploys leave ZCP unaware**: a webhook or CI `zcli push` does not record the deploy in ZCP local state — the service stays at `deployState=never-deployed`. 
Bridge it with `zerops_workflow action="record-deploy" targetService=""` when you return to the ZCP develop flow ## GitLab CI diff --git a/apps/docs/content/guides/cloudflare.mdx b/apps/docs/content/guides/cloudflare.mdx index fecbeb49..7c79bc62 100644 --- a/apps/docs/content/guides/cloudflare.mdx +++ b/apps/docs/content/guides/cloudflare.mdx @@ -3,7 +3,7 @@ title: "Cloudflare Integration with Zerops" description: "Always use **Full (strict)** SSL mode in Cloudflare — \"Flexible\" causes redirect loops. Shared IPv4 with Cloudflare proxy is not recommended." --- -Always use **Full (strict)** SSL mode in Cloudflare — "Flexible" causes redirect loops. Shared IPv4 with Cloudflare proxy is not recommended. +Use **Full (strict)** SSL mode in Cloudflare for production (plain "Full" is acceptable for testing) — **never "Flexible"**, which causes redirect loops. Shared IPv4 with Cloudflare proxy is not recommended. ## DNS Configuration @@ -12,13 +12,6 @@ Always use **Full (strict)** SSL mode in Cloudflare — "Flexible" causes redire CNAME ``` -### With Cloudflare Proxy (orange cloud) -| IP Type | Record | Proxy | -|---------|--------|-------| -| IPv6 only | `AAAA ` | Proxied | -| Dedicated IPv4 | `A ` | Proxied | -| Shared IPv4 | **Not recommended** | Reverse AAAA lookup issues | - ### DNS-Only (gray cloud) | IP Type | Records Required | |---------|-----------------| @@ -33,12 +26,6 @@ Method B: CNAME *. ACME: CNAME _acme-challenge. .zerops.zone ``` -## SSL/TLS Settings (Cloudflare Dashboard) -- **Encryption mode: Full (strict)** — mandatory -- **Never use "Flexible"** — causes infinite redirect loops -- Enable "Always Use HTTPS" -- WAF exception: Skip rule for `/.well-known/acme-challenge/` (ACME validation) - ## Preparing a Service for Cloudflare Any runtime service (nodejs, go, python, etc.) can be put behind Cloudflare. Steps: @@ -55,13 +42,10 @@ Any runtime service (nodejs, go, python, etc.) can be put behind Cloudflare. Ste 3. 
**Configure Cloudflare DNS** to point to your Zerops project IP 4. **Set SSL mode to "Full (strict)"** in Cloudflare dashboard -**Important**: The `zerops_subdomain enable` tool only works on deployed (ACTIVE) services. For new services, use `enableSubdomainAccess: true` in import YAML. +The `enableSubdomainAccess: true` in step 1 is the portable mechanism — the first deploy (or GUI toggle) activates the L7 route; you do not need a separate enable step. (*In ZCP, `zerops_deploy` auto-enables it on first deploy; see the public-access guide for the eligible-mode list and the `serviceStackIsNotHttp` port-shape rejection in the Gotchas below.*) Internal service-to-service communication must always use `http://` — never `https://`. SSL terminates at the Zerops L7 balancer. ## Gotchas -1. **Flexible SSL = redirect loop**: Zerops forces HTTPS, Cloudflare Flexible sends HTTP → infinite redirect -2. **Shared IPv4 + proxy is broken**: Reverse AAAA lookup doesn't work with Cloudflare proxy on shared IPv4 -3. **ACME challenge needs WAF exception**: Without it, Cloudflare blocks Let's Encrypt validation -4. **Wildcard SSL on Cloudflare Free**: Free plan doesn't proxy wildcard subdomains — use DNS-only or upgrade -5. **Subdomain on undeployed service**: `zerops_subdomain enable` returns "Service stack is not http or https" on READY_TO_DEPLOY services — deploy code first or use `enableSubdomainAccess` in import YAML +1. **Wildcard SSL on Cloudflare Free**: Free plan doesn't proxy wildcard subdomains — use DNS-only or upgrade +2. **"Service stack is not http or https"**: `zerops_subdomain enable` returns this when the service has no HTTP-shaped port (a worker, or a port without `httpSupport: true`) — it's about port shape, not deploy state. A READY_TO_DEPLOY service WITH `httpSupport: true` can be enabled; a deployed worker still cannot. 
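Reviewer sketch (not part of the patch): the Cloudflare guide keeps the rule that internal service-to-service traffic must use `http://`, because SSL terminates at the Zerops L7 balancer. A small check like the one below could flag config values that break it. The function name `uses_https_internally` is hypothetical, and treating any dot-free hostname as a sibling service is an assumption of this sketch, not a platform guarantee (it would also flag `localhost`).

```python
from urllib.parse import urlsplit


def uses_https_internally(url):
    """Return True when an https:// URL targets a bare, dot-free hostname.

    Assumption: a hostname without dots (for example "api") names a sibling
    service on the private network, where plain http:// is expected.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    return parts.scheme == "https" and host != "" and "." not in host
```

One plausible use is to run it over every URL-valued entry of a service's configuration before deploy and fail fast when any entry returns `True`.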
diff --git a/apps/docs/content/guides/deployment-lifecycle.mdx b/apps/docs/content/guides/deployment-lifecycle.mdx index 0336490c..f7e2e081 100644 --- a/apps/docs/content/guides/deployment-lifecycle.mdx +++ b/apps/docs/content/guides/deployment-lifecycle.mdx @@ -84,36 +84,9 @@ Default behavior (`temporaryShutdown: false`): 5. Old container processes terminated 6. Old containers deleted -### temporaryShutdown Behavior +### Deploy strategy & checks -| Setting | Behavior | Downtime | -|---------|----------|----------| -| `false` (default) | New containers start BEFORE old ones stop | **Zero downtime** | -| `true` | Old containers stop BEFORE new ones start | **Temporary downtime** | - -Use `temporaryShutdown: true` only when you cannot run two versions simultaneously (e.g., database migrations, singleton locks). - ---- - -## Readiness Check vs Health Check - -| Aspect | Readiness Check | Health Check | -|--------|----------------|--------------| -| When | **During deploy only** | **Continuously after deploy** | -| Purpose | Gates traffic to new containers | Detects runtime failures | -| Location | `deploy.readinessCheck` | `run.healthCheck` | -| Failure action | Container marked failed after timeout, replaced | Container restarted | - -### Readiness Check Mechanics - -1. Application starts via `start` command -2. Readiness check runs (httpGet or exec) -3. If **fails** -- wait `retryPeriod` seconds (default 5s), retry -4. If **succeeds** -- container marked active, receives traffic -5. 
If still failing after `failureTimeout` (default 300s / 5 min) -- container deleted, new one created - -**httpGet**: succeeds on HTTP `2xx`, follows `3xx` redirects, 5-second per-request timeout -**exec.command**: succeeds on exit code 0, 5-second per-command timeout +`temporaryShutdown` (in the `deploy` block) controls cutover order: `false` (default) starts new containers before removing old ones (zero-downtime); `true` stops old before new (downtime, use only when two versions cannot coexist — exclusive DB migrations, singleton locks). The **readiness check** gates traffic to the new container during a deploy; the **health check** monitors the live app continuously. For check behavior, params, the httpGet/exec timeouts, and the production pattern, see `zerops://guides/readiness-health-checks`. --- @@ -123,7 +96,7 @@ Typical pipeline events in chronological order: 1. **`stack.build` process RUNNING** -- build container created, pipeline started 2. **`stack.build` process FINISHED** -- build complete, artifact uploaded -3. **`appVersion` build event ACTIVE** -- deploy started, containers launching +3. **`appVersion` build event ACTIVE** -- the new version is deployed and running (this is the terminal success state, NOT "launching") 4. **Service status returns to RUNNING** -- all containers active, deploy complete **Terminal states:** @@ -152,18 +125,15 @@ Zerops keeps **10 most recent versions**. Older auto-deleted. Any archived versi ## Gotchas -1. **Build and run are SEPARATE containers** -- build output does not automatically appear in runtime. You must specify `deployFiles` -2. **initCommands run on EVERY container start** -- including restarts and horizontal scaling, not just deploys -3. **initCommands failures do NOT cancel deploy** -- app starts regardless of init exit code -4. **prepareCommands in build vs run** -- `build.prepareCommands` customizes build env, `run.prepareCommands` creates custom runtime image. Different containers, different purposes -5. 
**deployFiles land in `/var/www`** -- tilde syntax (`dist/~`) extracts contents directly to `/var/www/` (strips directory). Without tilde, `dist` → `/var/www/dist/` (preserved). **CRITICAL**: `run.start` path must match — `dist/~` + `start: bun dist/index.js` BREAKS because the file is at `/var/www/index.js`, not `/var/www/dist/index.js` +1. **initCommands run on EVERY container start** -- including restarts and horizontal scaling, not just deploys +2. **initCommands failures DO fail the deploy** -- `run.initCommands` run during runtime-prepare, BEFORE the start command, and are deploy-gating: a non-zero exit aborts the deploy. The platform emits `RUN.INIT COMMANDS FINISHED WITH ERROR` in the runtime log and surfaces the failed command + exit code on the `stack.build` process (`commandExec` / "init command failed"). The new appVersion goes to `DEPLOY_FAILED` and is **never activated**; the start command never runs and the previously-active version keeps serving. **Diagnose via `appVersion.status` (`DEPLOY_FAILED`) + `activationDate` (`null`), NOT the service status** — the service stays `ACTIVE` on the old version, so service-status alone reads as "fine". Keep init commands idempotent and exit 0. ## SSHFS Mount and Deploy Interaction When using SSHFS (`zerops_mount`) for dev workflows, deploy replaces the container. This has important consequences: 1. **After deploy, run container only has `deployFiles` content.** All other files (including zerops.yml if not in deployFiles) are gone. Use `deployFiles: [.]` for dev services to ensure zerops.yml and source files survive the deploy cycle. -2. **SSHFS mount auto-reconnects after deploy.** No explicit remount is needed — the SSHFS reconnect mechanism handles the container replacement transparently. The mount only becomes truly stale during stop (container not running); after start it auto-reconnects again. +2. 
**SSHFS mount auto-reconnects only while the service is running.** Usually no remount is needed, but if the mount goes stale after a deploy (stat/ls returns empty, writes hang), recover it explicitly with `zerops_mount action="mount"`. A stopped service has no live mount until it starts again. 3. **zerops.yml must be in deployFiles** for dev self-deploy lifecycle. Without it, subsequent deploys from the container fail because zerops.yml is missing. **Two kinds of "mount" (disambiguation):** diff --git a/apps/docs/content/guides/environment-variables.mdx b/apps/docs/content/guides/environment-variables.mdx index c3aaff61..a317a079 100644 --- a/apps/docs/content/guides/environment-variables.mdx +++ b/apps/docs/content/guides/environment-variables.mdx @@ -30,10 +30,10 @@ Total order for the bare key (highest wins): **system/platform > yaml-baked `run Build and runtime run in **separate containers**. Variables from one phase are not visible in the other unless explicitly referenced with prefixes: -| Want to access | From | Use prefix | -|---------------|------|-----------| -| Runtime var `API_KEY` | Build container | `${RUNTIME_API_KEY}` | -| Build var `BUILD_ID` | Runtime container | `${BUILD_BUILD_ID}` | +| Want to access | From | How | +|---------------|------|-----| +| Runtime var `API_KEY` | Build container | `${RUNTIME_API_KEY}` — runtime `run.envVariables` are known at build time, so the build can read them | +| Build var `BUILD_ID` | Runtime container | **Not available.** The build container is destroyed after build and its vars are not carried into the runtime env store — `${BUILD_BUILD_ID}` reaches the runtime process as the literal string `${BUILD_BUILD_ID}`. Persist the value into a deployed file, or recompute it at runtime. 
| ```yaml zerops: @@ -59,14 +59,9 @@ run: CACHE_URL: ${cache_connectionString} ``` -```javascript -// App reads the names you mapped above: -const host = process.env.DB_HOST; -``` - - The reference **resolves at container start**, independent of isolation mode — the referenced var does not need to exist at definition time. - An **unresolved ref stays literal** (`${db_hostname}` reaches the process verbatim) — no error, no blank. A wrong hostname/var on the right-hand side becomes a literal string and the app fails at connect time. -- **Hostname transformation**: dashes become underscores. Service `my-db` variable `port` is `${my_db_port}`. +- **Hostname charset**: service hostnames are lowercase alphanumeric only (`[a-z0-9]`) — the platform rejects dashes, underscores, and uppercase with `serviceStackNameInvalid`. So a ref is simply `${hostname_varname}` with the literal hostname (service `cache` → `${cache_port}`); there is no dash-to-underscore rewrite to reason about, because a dashed hostname cannot exist. Only legacy `envIsolation=none` auto-injects every sibling's vars as bare `_KEY` OS env vars without a ref — see Isolation Modes. New projects are `service`; rely on explicit refs. @@ -161,7 +156,7 @@ The frontend consumes `API_URL` via plain `${API_URL}` in `build.envVariables` ( - Defined via GUI, import.yml `envSecrets`, or `dotEnvSecrets` - **Read is privilege-gated** -- masked in GUI; via API an admin/write token returns the value verbatim, a read-only token returns `REDACTED` (keyed on `sensitive=true`). In-container the value is plaintext (the app needs it). Project-level `sensitive=true` does NOT persist — only service-level is a true secret surface. - Can be updated without redeploy, but the service **must be restarted** to pick it up. -- Overridden by yaml-baked `run.envVariables` with the same key (yaml owns the key). 
+- A yaml-baked `run.envVariables` key and a secret on the **same key cannot coexist** — the platform rejects the secret with `userDataDuplicateKey`. The yaml owns the key; edit the yaml and redeploy to change it. ### dotEnvSecrets @@ -209,13 +204,4 @@ An env-store change (secret or project) propagates to the container in ~5–10s ## System-Generated Variables -Zerops auto-generates variables per service (e.g., `hostname`, `PATH`, DB connection strings). Cannot be deleted. Some read-only (`hostname`), others editable (`PATH`). Reference them from another service with an explicit `${hostname_varname}`. - -## Common Mistakes - -- **DO NOT** expect a sibling's vars to appear automatically under default `service` isolation — reference them explicitly as `${hostname_varname}` in `run.envVariables` (the bare `_KEY` injected form is `none`-only legacy). -- **DO NOT** re-reference a var under its SAME name -- self-shadow loop. Project vars auto-inherit (read directly); cross-service uses a DIFFERENT left-hand name (`DB_HOST: ${db_hostname}`). -- **DO NOT** set a secret/service var on a key already in `run.envVariables` -- rejected (`userDataDuplicateKey`); the yaml owns the key, edit yaml + redeploy. -- **DO NOT** assume secret values are unreadable -- API read is privilege-gated (admin verbatim, read-only `REDACTED`), not unconditionally write-only. -- **DO NOT** forget restart after GUI/API env changes -- the running process won't see new values. -- **DO NOT** expect `envReplace` to recurse subdirectories -- it does not. +Zerops auto-generates variables per service (e.g., `hostname`, `PATH`, DB connection strings). Some are **hard-reserved** and rejected if you try to set them — `PATH` (uppercase) returns `userDataUseOfSystemKey` in any `envVariables` block. Overridable platform vars include `envIsolation` / `sshIsolation` / `zeropsSubdomainHost` (and the CDN URLs) — but never `PATH`. Reference any of them from another service with an explicit `${hostname_varname}`. 
diff --git a/apps/docs/content/guides/firewall.mdx b/apps/docs/content/guides/firewall.mdx
index 8ab4916b..93a3a711 100644
--- a/apps/docs/content/guides/firewall.mdx
+++ b/apps/docs/content/guides/firewall.mdx
@@ -5,36 +5,5 @@ description: "Zerops uses nftables with restricted TCP ports 1-1024 (only 22, 53
 
 Zerops uses nftables with restricted TCP ports 1-1024 (only 22, 53, 80, 123, 443, 587 allowed); UDP and ports 1025-65535 are unrestricted.
 
-## TCP Ports 1-1024 (Restricted)
-
-| Port | Protocol | Status |
-|------|----------|--------|
-| 22 | SSH | Allowed |
-| 25 | SMTP | **Blocked** (spam prevention) |
-| 53 | DNS | Allowed |
-| 80 | HTTP | Allowed |
-| 123 | NTP | Allowed |
-| 443 | HTTPS | Allowed |
-| 465 | SMTPS | **Blocked** (deprecated) |
-| 587 | SMTP/STARTTLS | Allowed |
-| All others | — | **Blocked** |
-
-## UDP Ports
-No restrictions on any UDP port.
-
-## TCP Ports 1025-65535
-No restrictions.
-
-## Direct Port Access Firewall
-For services with direct port access enabled:
-- Configure **blacklist** or **whitelist** rules per port
-- Available on ports 10-65435
-- Protocols: TCP, UDP
-
 ## Port Modification
 Contact `support@zerops.io` with Project ID + Organization ID to request changes to restricted ports.
-
-## Gotchas
-1. **Port 25 is permanently blocked**: Use port 587 with STARTTLS for email sending
-2. **Port 465 is blocked**: Legacy SMTPS — use 587 instead
-3. **Cannot self-service unblock**: Must contact Zerops support for port exceptions
diff --git a/apps/docs/content/guides/local-development.mdx b/apps/docs/content/guides/local-development.mdx
new file mode 100644
index 00000000..457a053b
--- /dev/null
+++ b/apps/docs/content/guides/local-development.mdx
@@ -0,0 +1,114 @@
+---
+title: Local Development with Zerops
+description: "Guide: Local Development with Zerops"
+---
+
+Develop locally with hot reload while connecting to Zerops managed services (DB, cache, storage) via VPN. ZCP generates `.env` with real credentials.
Deploy to Zerops with `zerops_deploy` which uses `zcli push` under the hood.
+
+---
+
+## Setup
+
+### Prerequisites
+- **zcli** installed: `npm i -g @zerops/zcli` or [docs.zerops.io/references/cli](https://docs.zerops.io/references/cli)
+- **VPN**: WireGuard (installed by zcli automatically on first `zcli vpn up`)
+- **Project-scoped token**: Create in Zerops GUI → Settings → Access Tokens → Custom access per project
+
+### Configuration
+```json
+// .mcp.json (in project root)
+{
+  "mcpServers": {
+    "zcp": {
+      "command": "zcp",
+      "env": { "ZCP_API_KEY": "" }
+    }
+  }
+}
+```
+
+---
+
+## Workflow
+
+### 1. Connect to Zerops services
+```bash
+zcli vpn up
+```
+- All services accessible by hostname (e.g., `db`, `cache`)
+- One project at a time — switching disconnects the current
+- **Env vars NOT available via VPN** — use `.env` file instead
+
+### 2. Load credentials
+ZCP writes `.env` via `zerops_env action="generate-dotenv"` (it merges three input channels — project `envVariables`, zerops.yaml `run.envVariables`, and `.env.local` — into one resolved file):
+```
+db_hostname=db
+db_port=5432
+db_password=
+db_connectionString=postgresql://db:@db:5432
+```
+> Don't hand-edit `.env` directly — the next `generate-dotenv` refuses with a diff if it finds keys it didn't produce. Put manual overrides in `.env.local` (a no-touch input channel that survives regeneration), or pass `force=true`.
+
+### 3. Develop locally
+Start your dev server as usual — hot reload works against Zerops managed services over VPN.
+
+### 4. Deploy to Zerops
+```
+zerops_deploy targetService="appstage"
+```
+Uses `zcli push` under the hood. Blocks until build completes.
+
+---
+
+## zerops.yml for Local Mode
+
+The same `zerops.yml` works for both local push and container deploy:
+
+```yaml
+zerops:
+  - setup: appstage
+    build:
+      base: nodejs@22
+      buildCommands:
+        - npm ci
+        - npm run build
+      deployFiles: ./dist
+    run:
+      start: node dist/server.js
+      ports:
+        - port: 3000
+          httpSupport: true
+      envVariables:
+        DB_URL: ${db_connectionString}
+```
+
+`${hostname_varName}` references are resolved by Zerops at container runtime — they work regardless of push source (local or container).
+
+---
+
+## Connection Troubleshooting
+
+| Symptom | Diagnosis | Fix |
+|---------|-----------|-----|
+| `nc -zv db 5432` times out | VPN not connected | `zcli vpn up ` |
+| VPN connected, still timeout | Wrong project | `zcli vpn up ` |
+| Connected but auth fails | Stale .env | Regenerate: `zerops_env action="generate-dotenv"` |
+| Service unreachable | Service stopped | `zerops_manage action="start" serviceHostname="db"` |
+
+### Diagnostic sequence
+1. `zerops_discover service="db"` — is service RUNNING?
+2. `nc -zv db 5432 -w 3` — network reachable?
+3. Compare `.env` vs `zerops_env action="generate-dotenv" preview=true` (or `zerops_discover includeEnvValues=true` for stored values — `includeEnvs` returns key templates, not resolved values) — credentials current?
+
+---
+
+## Multi-Project
+
+Each project directory has its own `.mcp.json` + `.zcp/state/`. VPN is one per machine — switch manually.
+
+---
+
+## Gotchas
+
+1. **`.env` contains secrets**: Add to `.gitignore` immediately — never commit
+2. **Object storage (S3)**: Uses HTTPS apiUrl — may work without VPN but not fully verified.
Include VPN as fallback diff --git a/apps/docs/content/guides/logging.mdx b/apps/docs/content/guides/logging.mdx index 80d28e18..2c54eca9 100644 --- a/apps/docs/content/guides/logging.mdx +++ b/apps/docs/content/guides/logging.mdx @@ -12,15 +12,12 @@ Zerops captures stdout/stderr as logs; use syslog output format for severity fil ## Access Methods -### GUI -- Project detail → service → Logs section -- Filter by severity, time range, container - ### CLI ```bash -zcli service log # Runtime logs -zcli service log --showBuildLogs # Build logs +zcli service log -S # Runtime logs (select via -S/--service-id, NOT a positional name) +zcli service log -S --show-build-logs # Build logs ``` +Agents in this ecosystem read runtime logs via the `zerops_logs` MCP tool — it fetches **runtime logs only** (no build-log flag); build logs surface by auto-attachment on a deploy-failure response. ## Severity Filtering Logs must output to **syslog format** for severity filtering to work. Plain stdout/stderr logs appear as "info" level. @@ -56,8 +53,4 @@ Certificate paths: - Custom certs: `ca-file("/etc/syslog-ng/user.crt")` ## Gotchas -1. **Syslog format required**: Without syslog formatting, all logs appear as same severity — no filtering possible -2. **Build logs separate**: Use `--showBuildLogs` flag in CLI — not shown by default -3. **Source name must be `s_src`**: Using `s_sys` (common default) will not capture Zerops logs -4. **UDP for Logstash**: Zerops forwards logs via UDP syslog — ensure Logstash listens on UDP -5. **Custom certs path**: Place custom CA certs in `/etc/syslog-ng/user.crt` +1. 
**UDP for Logstash**: Zerops forwards logs via UDP syslog — ensure Logstash listens on UDP diff --git a/apps/docs/content/guides/metrics.mdx b/apps/docs/content/guides/metrics.mdx index 851d73f9..0fa94a23 100644 --- a/apps/docs/content/guides/metrics.mdx +++ b/apps/docs/content/guides/metrics.mdx @@ -18,12 +18,13 @@ Zerops supports ELK (APM + logs) and Prometheus/Grafana stacks; expose `/metrics | `logstash` | Log collection | ### APM Configuration -```yaml -envVariables: - ELASTIC_APM_ACTIVE: "true" - ELASTIC_APM_SERVICE_NAME: my-app - ELASTIC_APM_SERVER_URL: https://apmserver.zerops.app - ELASTIC_APM_SECRET_TOKEN: + +Set these on your app as service env vars (GUI or `run.envVariables`). Copy the real APM server URL from the `apmserver` service's subdomain in the GUI — it's a generated subdomain (`apmserver--..zerops.app`), **not** a fixed `apmserver.zerops.app` host: +``` +ELASTIC_APM_ACTIVE=true +ELASTIC_APM_SERVICE_NAME=my-app +ELASTIC_APM_SERVER_URL=https:// +ELASTIC_APM_SECRET_TOKEN= ``` ## Prometheus + Grafana Stack Services @@ -37,16 +38,11 @@ envVariables: ### Custom Metrics 1. Expose HTTP `/metrics` endpoint in your app -2. Set env var: `ZEROPS_PROMETHEUS_PORT=8080` (comma-separated for multiple ports) +2. Set env var: `ZEROPS_PROMETHEUS_PORT=` (e.g. `9090`; comma-separated for multiple ports) 3. Prometheus auto-discovers and scrapes ## Built-in Metrics - Service scaling & resource usage -- PostgreSQL (with `pg_stat_statements` extension) +- PostgreSQL (some metrics require the `pg_stat_statements` extension — superuser `CREATE EXTENSION` + restart) - MariaDB - Valkey - -## Gotchas -1. **`ZEROPS_PROMETHEUS_PORT` is required**: Without it, Prometheus won't discover your custom metrics endpoint -2. **APM server must be public**: Use Zerops subdomain to expose apmserver for trace collection -3. 
**Cross-project needs forwarder**: Use `prometheuslight` service in source project to forward to global Prometheus diff --git a/apps/docs/content/guides/networking.mdx b/apps/docs/content/guides/networking.mdx index 331ee8d8..a478345e 100644 --- a/apps/docs/content/guides/networking.mdx +++ b/apps/docs/content/guides/networking.mdx @@ -9,14 +9,6 @@ Zerops networking has two layers: a private VXLAN network per project (service-t ## Architecture Overview -``` -Internet - │ - ├─ HTTP/HTTPS ──→ L7 Balancer (SSL termination, nginx) ──→ container VXLAN IP:port - │ - └─ Direct port ──→ L3/Core Balancer ──→ container VXLAN IP:port -``` - **Per-project infrastructure:** - **Private VXLAN network** — isolated overlay network shared by all services - **L7 HTTP Balancer** — 2 HA containers, auto-scales, domain routing + SSL @@ -96,16 +88,7 @@ Work through these steps **in order**: 6. **Service status** — Is the service ACTIVE? (check `zerops_discover`) 7. **Timeout settings** — For slow responses, increase `send_timeout` (default 2s) -**Common framework fixes:** -```bash -app.listen(3000, '0.0.0.0') - -flask run --host=0.0.0.0 - -http.ListenAndServe(":8080", handler) // implicit 0.0.0.0 - -server.address=0.0.0.0 -``` +**Fix:** bind the listen address to `0.0.0.0` (e.g. `app.listen(3000, '0.0.0.0')`), never `127.0.0.1`/`localhost`. --- @@ -136,9 +119,4 @@ server.address=0.0.0.0 --- ## Gotchas -1. **Binding localhost = 502**: The L7 balancer connects via VXLAN IP, not localhost — always bind `0.0.0.0` -2. **Internal HTTPS breaks things**: Service-to-service must use `http://` — the VXLAN network is already isolated -3. **Subdomain 50MB cap**: zerops.app subdomains have a hard 50MB upload limit — use custom domain for larger files -4. **send_timeout default is 2s**: Slow API responses may be cut off — increase for long-running endpoints -5. **Cross-project networking impossible**: Each project is an isolated VXLAN — use public access to bridge projects -6. 
**Shared IPv4 needs AAAA**: Missing AAAA record = silent routing failure on shared IPv4 +1. **Cross-project networking impossible**: Each project is an isolated VXLAN — to bridge projects, use public access (L7/public endpoint), not private hostnames diff --git a/apps/docs/content/guides/object-storage-integration.mdx b/apps/docs/content/guides/object-storage-integration.mdx index 95bea61d..ea2fba36 100644 --- a/apps/docs/content/guides/object-storage-integration.mdx +++ b/apps/docs/content/guides/object-storage-integration.mdx @@ -43,67 +43,9 @@ https://endpoint.com/bucket-name/object-key https://bucket-name.endpoint.com/object-key ``` -**Every S3 client must be configured for path-style access.** - -## Framework Integration - -### PHP (Laravel — Flysystem) -```php -// config/filesystems.php -'s3' => [ - 'driver' => 's3', - 'endpoint' => env('S3_ENDPOINT'), - 'use_path_style_endpoint' => true, // REQUIRED - 'key' => env('S3_ACCESS_KEY'), - 'secret' => env('S3_SECRET_KEY'), - 'region' => env('S3_REGION', 'us-east-1'), - 'bucket' => env('S3_BUCKET'), -], -``` -Package: `league/flysystem-aws-s3-v3` - -### Node.js (AWS SDK v3) -```javascript -import { S3Client } from '@aws-sdk/client-s3'; -const s3 = new S3Client({ - endpoint: process.env.S3_ENDPOINT, - forcePathStyle: true, // REQUIRED - credentials: { - accessKeyId: process.env.S3_ACCESS_KEY, - secretAccessKey: process.env.S3_SECRET_KEY, - }, - region: process.env.S3_REGION || 'us-east-1', -}); -``` -Package: `@aws-sdk/client-s3` - -### Python (boto3) -```python -import boto3 -s3 = boto3.client('s3', - endpoint_url=os.environ['S3_ENDPOINT'], - aws_access_key_id=os.environ['S3_ACCESS_KEY'], - aws_secret_access_key=os.environ['S3_SECRET_KEY'], - region_name='us-east-1', - config=boto3.session.Config(s3={'addressing_style': 'path'}), # REQUIRED -) -``` -Package: `boto3` - -### Java (AWS SDK) -```java -S3Client s3 = S3Client.builder() - .endpointOverride(URI.create(System.getenv("S3_ENDPOINT"))) - 
.serviceConfiguration(S3Configuration.builder() - .pathStyleAccessEnabled(true) // REQUIRED - .build()) - .credentialsProvider(StaticCredentialsProvider.create( - AwsBasicCredentials.create( - System.getenv("S3_ACCESS_KEY"), - System.getenv("S3_SECRET_KEY")))) - .region(Region.US_EAST_1) - .build(); -``` +**Every S3 client must be configured for path-style access** (the SDK-specific +flag: `forcePathStyle`/`use_path_style_endpoint`/`addressing_style: path`/ +`pathStyleAccessEnabled`). Framework wiring lives in the recipe for that stack. ## import.yaml Definition diff --git a/apps/docs/content/guides/php-tuning.mdx b/apps/docs/content/guides/php-tuning.mdx index 8f6973ba..fd250766 100644 --- a/apps/docs/content/guides/php-tuning.mdx +++ b/apps/docs/content/guides/php-tuning.mdx @@ -3,13 +3,13 @@ title: "PHP Runtime Tuning on Zerops" description: "Override php.ini via `PHP_INI_*` env vars, FPM via `PHP_FPM_*`. Both require **restart** (not reload). Zerops defaults: upload/post = 1024M, FPM dynamic 20/2/1/3. Upload bottleneck is L7 balancer (50MB subdomain), not PHP." --- -Override php.ini via `PHP_INI_*` env vars, FPM via `PHP_FPM_*`. Both require **restart** (not reload). Zerops defaults: upload/post = 1024M, FPM dynamic 20/2/1/3. Upload bottleneck is L7 balancer (50MB subdomain), not PHP. +Override php.ini via `PHP_INI_*` env vars, FPM via `PHP_FPM_*`. **Applying a change depends on the channel**: values in `run.envVariables` are baked into the app version → changing one requires a **redeploy**; values set via `zerops_env`/GUI service env → changing one requires a **restart** (never reload — see Gotchas). Zerops defaults: upload/post = 1024M, FPM dynamic 20/2/1/3. Upload bottleneck is L7 balancer (50MB subdomain), not PHP. ## PHP Configuration (`PHP_INI_*`) Override any php.ini directive via `PHP_INI_{directive}` env vars in `run.envVariables` or via `zerops_env` API. -**Requires restart** to take effect. 
Reload writes config files (`/etc/php*/conf.d/overwrite.ini`) but FPM master does not re-read INI on reload. +To apply a CHANGE: if the value lives in `run.envVariables` (baked into the app version), **redeploy**; if set via `zerops_env`/GUI service env, **restart**. A plain reload rewrites the config files (`/etc/php*/conf.d/overwrite.ini`) but the FPM master does not re-read INI on reload, so reload alone never applies the change. ### Zerops Platform Defaults @@ -43,7 +43,7 @@ zerops: ## PHP-FPM (`PHP_FPM_*`) -Configure FPM process management via `PHP_FPM_*` env vars. **Requires restart** — same as PHP_INI. +Configure FPM process management via `PHP_FPM_*` env vars — same change semantics as `PHP_INI_*`: redeploy for `run.envVariables`, restart for `zerops_env`/GUI service env, never reload. Config files are written to `/etc/php*/php-fpm.d/www.conf` by `zerops-zenv` at container startup. @@ -64,12 +64,13 @@ Pre-forks a pool of workers. Good for consistent traffic. High-traffic example: ```yaml -envVariables: - PHP_FPM_PM_MAX_CHILDREN: 50 - PHP_FPM_PM_START_SERVERS: 10 - PHP_FPM_PM_MIN_SPARE_SERVERS: 5 - PHP_FPM_PM_MAX_SPARE_SERVERS: 15 - PHP_FPM_PM_MAX_REQUESTS: 1000 +run: + envVariables: + PHP_FPM_PM_MAX_CHILDREN: 50 + PHP_FPM_PM_START_SERVERS: 10 + PHP_FPM_PM_MIN_SPARE_SERVERS: 5 + PHP_FPM_PM_MAX_SPARE_SERVERS: 15 + PHP_FPM_PM_MAX_REQUESTS: 1000 ``` ### Ondemand Mode @@ -77,11 +78,12 @@ envVariables: Spawns workers only when requests arrive. Saves memory for low-traffic sites. ```yaml -envVariables: - PHP_FPM_PM: ondemand - PHP_FPM_PM_MAX_CHILDREN: 20 - PHP_FPM_PM_PROCESS_IDLE_TIMEOUT: 60s - PHP_FPM_PM_MAX_REQUESTS: 500 +run: + envVariables: + PHP_FPM_PM: ondemand + PHP_FPM_PM_MAX_CHILDREN: 20 + PHP_FPM_PM_PROCESS_IDLE_TIMEOUT: 60s + PHP_FPM_PM_MAX_REQUESTS: 500 ``` Available parameters for ondemand: @@ -129,6 +131,6 @@ run: ## Gotchas -- **Reload does NOT apply changes** -- `PHP_INI_*` and `PHP_FPM_*` both require restart. 
Zerops reload rewrites config files via `zerops-zenv` but does not signal FPM to re-read them. +- **Reload never applies the change** -- a value in `run.envVariables` needs a **redeploy** (it's baked into the app version); a value in `zerops_env`/GUI service env needs a **restart**. Reload rewrites config files via `zerops-zenv` but does not signal FPM to re-read them. - **Upload fails at 50MB on subdomain** -- this is the L7 balancer limit, not PHP. Use a custom domain for larger uploads. - **`post_max_size` must be >= `upload_max_filesize`** -- PHP silently drops the POST body if it exceeds `post_max_size`, even if the file itself is under `upload_max_filesize`. diff --git a/apps/docs/content/guides/production-checklist.mdx b/apps/docs/content/guides/production-checklist.mdx index 152270ed..f935ea20 100644 --- a/apps/docs/content/guides/production-checklist.mdx +++ b/apps/docs/content/guides/production-checklist.mdx @@ -34,16 +34,26 @@ Before going to production: (1) databases to HA mode, (2) minContainers: 2 on ap ## Dev Services to Remove ### Mailpit → Production SMTP + +Mailpit (the dev mail catcher) is defined as: +```yaml +services: + - hostname: mailpit + type: alpine@3.20 + buildFromGit: https://github.com/zeropsio/recipe-mailpit +``` +For production, point your app at a real provider — non-secret settings in the app's `run.envVariables` (zerops.yaml), the key in `envSecrets` (import.yaml). They are different files; a bare top-level `envVariables:` block is schema-invalid. 
+```yaml +run: + envVariables: + SMTP_HOST: smtp.sendgrid.net + SMTP_PORT: "587" +``` ```yaml -- hostname: mailpit - type: go@1 - buildFromGit: https://github.com/zeropsio/recipe-mailpit - -envVariables: - SMTP_HOST: smtp.sendgrid.net - SMTP_PORT: "587" -envSecrets: - SMTP_PASSWORD: your-production-key +services: + - hostname: app + envSecrets: + SMTP_PASSWORD: your-production-key ``` ### Adminer → Remove or Restrict @@ -127,29 +137,11 @@ Remove entirely or disable `enableSubdomainAccess`. Use VPN + pgAdmin/DBeaver lo | Environment separation | Separate projects for dev/staging/prod | | Stateless design | Sessions in Valkey, uploads in Object Storage — no local state | | Database mode | `mode: HA` for all managed services (immutable — plan before creation) | -| Min containers | `minContainers: 2` on all app services for zero-downtime deploys | +| Min containers | `minContainers: 2+` on app services for throughput + crash-tolerance (rolling deploys are already zero-downtime at any count via the default `temporaryShutdown: false` — don't conflate the two) | ## Health Check Pattern -Combined readiness + runtime health check for production services: - -```yaml -zerops: - - setup: app - deploy: - readinessCheck: - httpGet: - port: 3000 - path: /health - run: - healthCheck: - httpGet: - port: 3000 - path: /health - start: node server.js -``` - -Readiness check gates traffic during deploy. Health check runs continuously — unhealthy containers are restarted after 5-minute retry window. +Production services should pair a `deploy.readinessCheck` (gates traffic during deploy) with a `run.healthCheck` (continuous — the LB routes around an unhealthy container). The combined pattern, params, and behavior are in `zerops://guides/readiness-health-checks`. ## Gotchas 1. 
**HA is immutable**: Must delete and recreate service to switch modes diff --git a/apps/docs/content/guides/public-access.mdx b/apps/docs/content/guides/public-access.mdx index 991b7f8e..ad76e561 100644 --- a/apps/docs/content/guides/public-access.mdx +++ b/apps/docs/content/guides/public-access.mdx @@ -12,8 +12,8 @@ Zerops offers three public access methods: zerops.app subdomains (dev only, 50MB - Max upload: **50 MB** - **Not for production** — use for development/testing only - Auto-provisioned SSL -- Pre-configure via import YAML: `enableSubdomainAccess: true` (works for all runtime/web types) -- **Activate routing:** `zerops_deploy` **auto-enables** the subdomain on the first deploy for eligible service modes (dev/stage/simple/standard/local-stage) and waits HTTP-ready — the deploy response carries `subdomainAccessEnabled: true` and the URL. Use `zerops_subdomain enable` only as an explicit recovery/ops command if auto-enable was skipped (a worker / non-HTTP service, or launch-production which deliberately opts out in favor of a custom domain). Import's `enableSubdomainAccess: true` pre-configures intent; deploy activates the L7 balancer. Re-deploys do NOT deactivate it. Use `zerops_discover` to check current status and get the URL (`subdomainEnabled` + `subdomainUrl` fields). +- **Enable it:** set `enableSubdomainAccess: true` in the import YAML (works for all runtime/web types) to pre-configure intent; the first deploy (or the GUI toggle) activates the L7 subdomain route, and re-deploys never deactivate it. +- *In ZCP:* `zerops_deploy` auto-enables the subdomain on the first deploy for eligible service modes (dev/stage/simple/standard/local-stage) and waits HTTP-ready — the deploy response carries `subdomainAccessEnabled: true` and the URL. `zerops_subdomain enable` is the explicit recovery/ops command if auto-enable was skipped (a worker / non-HTTP service, or launch-production which deliberately opts out in favor of a custom domain). 
`zerops_discover` shows current status (`subdomainEnabled` + `subdomainUrl`). - **Port-specific subdomains**: If HTTP ports are defined in zerops.yml, each port gets its own subdomain: `{hostname}-{subdomainHost_prefix}-{port}.{subdomainHost_rest}`. Example: hostname `appdev`, subdomainHost `1df2.prg1.zerops.app`, port 3000 → actual URL `https://appdev-1df2-3000.prg1.zerops.app`. Port 80 omits the port suffix: `https://appdev-1df2.prg1.zerops.app` - **Internal network fallback**: Every service is accessible internally via `http://{hostname}:{port}` (e.g., `http://appdev:3000`). Use this to verify the app is running when subdomain access is uncertain — `curl http://appdev:3000/health` from the ZCP container or any other service in the project - Works for: nodejs, static, nginx, go, python, php, java, rust, dotnet, and all other runtime types @@ -21,14 +21,8 @@ Zerops offers three public access methods: zerops.app subdomains (dev only, 50MB ### 2. Custom Domains (Production) - Per-project HTTPS balancer (2 containers, HA) - Round-robin load balancing + health checks -- Full upload limit: 512 MB -- Requires IP address assignment: - -| IP Type | Cost | Protocol | Notes | -|---------|------|----------|-------| -| Shared IPv4 | Free | HTTP/HTTPS only | Limited connections, shorter timeouts | -| Dedicated IPv4 | $3/30 days | All protocols | Non-refundable, auto-renews | -| IPv6 | Free | All protocols | Dedicated per project | +- Upload limit: 512 MB default (`client_max_body_size`, configurable up to 2048m on a custom domain) — not a hard cap +- Requires IP address assignment ### 3. 
Direct Port Access
 - Available for: Runtime services, PostgreSQL
@@ -36,14 +30,5 @@ Zerops offers three public access methods: zerops.app subdomains (dev only, 50MB
 - Protocols: TCP, UDP
 - Configurable firewall: blacklist or whitelist per port
 
-## DNS Setup (Custom Domain)
-Point your domain to the project's IP:
-- `A` record → Dedicated IPv4
-- `AAAA` record → IPv6
-- Shared IPv4: Requires **both A and AAAA** records (AAAA needed for SNI routing)
-
 ## Gotchas
-1. **Shared IPv4 needs AAAA record**: Without AAAA, SNI routing fails — always add both A and AAAA
-2. **zerops.app 50MB limit**: File uploads over 50MB fail on subdomains — use custom domain
-3. **Dedicated IPv4 is non-refundable**: $3/30 days, auto-renews — cannot get refund if removed early
-4. **Ports 80/443 reserved**: Your app cannot bind to these — Zerops uses them for SSL termination
+1. **Dedicated IPv4 is non-refundable**: $3/30 days, auto-renews — the fee isn't refunded if removed early, but the address can be reused in another project until the subscription ends
diff --git a/apps/docs/content/guides/readiness-health-checks.mdx b/apps/docs/content/guides/readiness-health-checks.mdx
new file mode 100644
index 00000000..7431d5b0
--- /dev/null
+++ b/apps/docs/content/guides/readiness-health-checks.mdx
@@ -0,0 +1,89 @@
+---
+title: Readiness & Health Checks on Zerops
+description: "Guide: Readiness & Health Checks on Zerops"
+---
+
+Two distinct mechanisms, often confused. **Readiness check** (`deploy.readinessCheck`) runs ONLY during a deploy — it gates when the new container starts receiving traffic; if it never passes, the deploy fails and the old version keeps serving. **Health check** (`run.healthCheck`) runs CONTINUOUSLY on the live app — it disconnects an unhealthy container from the load balancer, restarts it, and reconnects it on recovery. Both support `httpGet` or `exec` (mutually exclusive within one block). The field shape lives in the zerops.yml schema; this guide owns the behavior.
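
The `exec` form named in the intro has no example in this guide, so here is a minimal sketch. It assumes the `exec.command` key shape from the zerops.yml schema and a hypothetical marker file `/tmp/app-ready` that the app would create once it can serve. Neither the path nor the timing values come from this guide.

```yaml
zerops:
  - setup: app
    run:
      start: node server.js
      healthCheck:
        # exec and httpGet are mutually exclusive within one check block
        exec:
          # Hypothetical check: exits 0 only while the marker file exists
          command: |
            test -f /tmp/app-ready
        failureTimeout: 30   # placeholder value
        execPeriod: 10       # placeholder value
```

The same `exec` block would slot into `deploy.readinessCheck` in place of `httpGet`.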
+
+---
+
+## The distinction (the #1 confusion)
+
+| | Readiness check | Health check |
+|---|---|---|
+| Location | `deploy.readinessCheck` | `run.healthCheck` |
+| When it runs | **During a deploy only** | **Continuously, after startup** |
+| Purpose | Gate traffic to a NEW container | Detect runtime failure of a LIVE container |
+| On failure | Deploy fails; new appVersion not activated; old version keeps serving | Container removed from LB → restarted → reconnected on recovery |
+
+A readiness check makes a deploy wait for the app to actually answer before cutting traffic over. A health check keeps a degraded container out of rotation while it's live. Use both on production services.
+
+## Readiness check (`deploy.readinessCheck`)
+
+Checks the **new** container at `localhost`. Until it passes, traffic stays on the old container.
+
+```yaml
+deploy:
+  readinessCheck:
+    httpGet: { port: 3000, path: /health }
+    failureTimeout: 60  # seconds until the container is marked failed
+    retryPeriod: 10  # seconds between attempts
+```
+
+Mechanics: `start` runs → readiness check runs → on fail, wait `retryPeriod` and retry → on success, the container is marked active and receives traffic → if still failing after `failureTimeout`, the container is deleted and the deploy fails (the previous appVersion stays active). Set `failureTimeout`/`retryPeriod` explicitly — there is no fixed schema default to rely on.
+
+## Health check (`run.healthCheck`)
+
+Runs on every container continuously after startup.
+
+```yaml
+run:
+  healthCheck:
+    httpGet: { port: 3000, path: /health }
+    failureTimeout: 30  # consecutive-failure seconds before restart (reset by a success)
+    disconnectTimeout: 30  # seconds before a failing container is pulled from the LB
+    recoveryTimeout: 30  # seconds of success before a restarted container takes traffic again
+    execPeriod: 10  # seconds between attempts
+```
+
+**Failure sequence**: repeated failures → `disconnectTimeout` removes the container from the load balancer → `failureTimeout` triggers a restart → `recoveryTimeout` gates traffic reconnection once it's healthy again.
+
+## httpGet vs exec (both checks)
+
+- **`httpGet`** — GET to `localhost:{port}{path}`, triggered **inside** the container. Success = HTTP `2xx` (follows `3xx` redirects), 5-second per-request timeout. `host` sets a custom Host header; `scheme: https` only if the app demands TLS internally (default is plain HTTP — the L7 balancer terminates SSL upstream).
+- **`exec`** — a local shell command, success = exit `0`, 5-second per-command timeout. Has access to all env vars. Use a YAML `|` block for multi-step scripts.
+
+**DO NOT** put both `httpGet` and `exec` in the same check block — they are mutually exclusive.
+
+## temporaryShutdown (deploy container ordering)
+
+Readiness gating only buys zero-downtime when the old container stays up during cutover — that is `temporaryShutdown` (in the `deploy` block):
+
+| Value | Behavior | Downtime |
+|---|---|---|
+| `false` (default) | New containers start and pass readiness BEFORE old ones are removed | None (zero-downtime rolling deploy) |
+| `true` | Old containers stop BEFORE new ones start | Yes |
+
+Use `true` only when you cannot run two versions simultaneously (exclusive DB-migration access, singleton locks). Rolling cutover is zero-downtime at any `minContainers` value — don't conflate replica count with the deploy strategy.
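
The `true` row of the table can be sketched as follows, for a service that cannot run two versions at once. This is a minimal, hedged example: the `setup` name, port, and path mirror the other examples in this guide, and the timing values are placeholders rather than recommendations.

```yaml
zerops:
  - setup: app
    deploy:
      # Old containers stop BEFORE new ones start, so expect downtime during the deploy
      temporaryShutdown: true
      readinessCheck:
        httpGet: { port: 3000, path: /health }
        failureTimeout: 60   # placeholder value
        retryPeriod: 10      # placeholder value
    run:
      start: node server.js
```

Leaving `temporaryShutdown` out keeps the default `false`, which is the rolling cutover described in the table.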
+
+## Dev/stage placement
+
+In dev+stage pairs, `healthCheck` and `readinessCheck` belong ONLY on the **stage** entry. Dynamic-runtime dev services run `start: zsc noop --silent` (a no-op keepalive that idles the container while the agent drives the real dev server's lifecycle) — adding a `healthCheck` to a dev service causes unwanted container restarts during iteration.
+
+## Production pattern
+
+Combine both on a production service so deploys wait for readiness and the LB routes around runtime failures:
+
+```yaml
+zerops:
+  - setup: app
+    deploy:
+      readinessCheck:
+        httpGet: { port: 3000, path: /health }
+    run:
+      healthCheck:
+        httpGet: { port: 3000, path: /health }
+      start: node server.js
+```
+
+Without health checks, the load balancer cannot route around an unhealthy container — it keeps sending traffic to a degraded instance.
diff --git a/apps/docs/content/guides/scaling.mdx b/apps/docs/content/guides/scaling.mdx
index d612324a..49c8c792 100644
--- a/apps/docs/content/guides/scaling.mdx
+++ b/apps/docs/content/guides/scaling.mdx
@@ -22,9 +22,9 @@ Zerops autoscales vertically (CPU/RAM/disk) and horizontally (container count).
 | **Linux containers** (Alpine, Ubuntu) | Yes | Yes (1-10 containers) | Same as runtimes |
 | **Managed DB** (PostgreSQL, MariaDB) | Yes | No (fixed: NON_HA=1, HA=3) | Mode immutable after creation |
 | **Managed cache** (KeyDB/Valkey) | Yes | No (fixed: NON_HA=1, HA=3) | Mode immutable after creation |
-| **Shared storage** | No (automatic, not configurable) | No (fixed: NON_HA=1, HA=3) | DO NOT set verticalAutoscaling in import.yml |
+| **Shared storage** | Yes (cpu/ram/disk configurable) | No (fixed: NON_HA=1, HA=3) | Accepts verticalAutoscaling in import.yml |
 | **Object storage** | No | No | Fixed size at creation, no verticalAutoscaling |
-| **Docker** | No (manual, triggers VM restart) | Yes (VM count changeable, triggers restart) | No autoscaling at all |
+| **Docker** | No (manual, triggers VM restart) | Manual only (change VM count, triggers restart) | No automatic autoscaling |
 
 ## Vertical Autoscaling
 
@@ -122,15 +122,9 @@ HA recovery: failed container is disconnected, new one created on different hard
 
 PostgreSQL HA exposes read replica port **5433** for distributing SELECT queries.
 
-## Configuring Thresholds via zerops_scale
+## Autoscaling Thresholds
 
-Threshold parameters can be set via the `zerops_scale` MCP tool, not just import.yml:
-
-```
-zerops_scale serviceHostname="api" minFreeRamGB=0.5 minFreeRamPercent=5 minFreeCpuCores=0.2
-```
-
-All four threshold parameters (`minFreeRamGB`, `minFreeRamPercent`, `minFreeCpuCores`, `minFreeCpuPercent`) are optional and can be combined with any other scaling parameters in a single call.
+The dual-threshold trigger controls WHEN vertical scaling fires. All four fields (`minFreeRamGB`, `minFreeRamPercent`, `minFreeCpuCores`, `minFreeCpuPercent`) are optional and live in the `verticalAutoscaling` block (see import.yml Syntax below).
(*In ZCP they can also be set live: `zerops_scale serviceHostname="api" minFreeRamGB=0.5 minFreeRamPercent=5 minFreeCpuCores=0.2`, combinable with any other scaling parameter in one call.*)
 
 ## Docker Services
 - Run in **VMs**, not containers. **No autoscaling** -- resources fixed at creation
@@ -174,28 +168,30 @@ services:
 
 ## Strategy Presets
 
-**Development** — SHARED CPU, min resources, 1 container. Cost-effective for dev/staging:
-```
-zerops_scale serviceHostname="api" cpuMode="SHARED" minCpu=1 maxCpu=2 minRam=0.25 maxRam=1 minContainers=1 maxContainers=1
+**Development** — SHARED CPU, min resources, single container (cost-effective for dev/staging):
+```yaml
+minContainers: 1
+maxContainers: 1
+verticalAutoscaling: { cpuMode: SHARED, minCpu: 1, maxCpu: 2, minRam: 0.25, maxRam: 1 }
 ```
 
 **Production** — DEDICATED CPU, higher minimums, multiple containers for HA:
-```
-zerops_scale serviceHostname="api" cpuMode="DEDICATED" minCpu=2 maxCpu=8 minRam=2 maxRam=8 minContainers=2 maxContainers=6
+```yaml
+minContainers: 2
+maxContainers: 6
+verticalAutoscaling: { cpuMode: DEDICATED, minCpu: 2, maxCpu: 8, minRam: 2, maxRam: 8 }
 ```
 
-**Burst workloads** — Wide autoscaling range, SHARED CPU:
-```
-zerops_scale serviceHostname="worker" cpuMode="SHARED" minCpu=1 maxCpu=8 minRam=1 maxRam=16 minContainers=1 maxContainers=10
+**Burst workloads** — wide autoscaling range, SHARED CPU:
+```yaml
+minContainers: 1
+maxContainers: 10
+verticalAutoscaling: { cpuMode: SHARED, minCpu: 1, maxCpu: 8, minRam: 1, maxRam: 16 }
 ```
 
-## Common Mistakes
+(*In ZCP, the same presets apply live via `zerops_scale serviceHostname=... cpuMode=... minCpu=... ...`.*)
 
-**DO NOT** add `verticalAutoscaling` to **object-storage** or **shared-storage** services in import.yml -- causes import failure. Object storage has a fixed `objectStorageSize` only. Shared storage is managed automatically.
-
-**DO NOT** set `minContainers` or `maxContainers` for managed services (DB, cache, shared-storage) -- container count is fixed by `mode` (NON_HA=1, HA=3). Setting these causes import failure.
-
-**DO NOT** use `DEDICATED` CPU for low-traffic or dev services -- wastes resources. Use `SHARED` and switch to `DEDICATED` only when consistent performance matters.
+## Common Mistakes
 
 **DO NOT** set `minFreeRamGB: 0` and `minFreeRamPercent: 0` simultaneously -- the API rejects this with "Invalid custom autoscaling value". Always keep at least the default absolute threshold (0.0625 GB).
diff --git a/apps/docs/content/guides/shared-storage-integration.mdx b/apps/docs/content/guides/shared-storage-integration.mdx
new file mode 100644
index 00000000..93196e16
--- /dev/null
+++ b/apps/docs/content/guides/shared-storage-integration.mdx
@@ -0,0 +1,48 @@
+---
+title: Shared Storage Integration on Zerops
+description: "Guide: Shared Storage Integration on Zerops"
+---
+
+Shared storage is a managed SeaweedFS volume mounted as a POSIX filesystem at `/mnt/` into one or more runtime services — for files that must be shared *between containers/services* (shared config, plugin directories, a common working set). It is mounted via the import.yaml `mount:` field; there is **no `zerops.yaml` mount**. For high-write workloads or user uploads, prefer Object Storage (S3) instead — shared storage is POSIX/NFS-style and not built for high-throughput churn.
+
+## Mounting — import.yaml `mount:` is the only config-file mechanism
+
+Declare the storage service, then list it under the runtime's service-level `mount:`.
This auto-connects the storage at provision — import alone is sufficient, no second step:
+
+```yaml
+services:
+  - hostname: storage
+    type: shared-storage
+  - hostname: app
+    type: nodejs@22
+    buildFromGit: https://github.com/myorg/myapp   # mount: requires buildFromGit
+    mount:
+      - storage   # list one or more shared-storage hostnames
+```
+
+After deploy, the runtime has `/mnt/storage` (SeaweedFS FUSE, writable). Multiple volumes can be mounted to one service (`/mnt/files1`, `/mnt/files2`, …).
+
+**There is NO `zerops.yaml` mount field.** A `mount:` under `run:` is silently stripped by the platform — it even passes yaml validation (validation-passing ≠ honored), but produces no mount and no connection. Mounting is import.yaml-only (or `connect-storage`, below).
+
+## Connecting a storage to a runtime that missed the import mount
+
+A runtime that was READY_TO_DEPLOY at import time (e.g. a stage service created but not yet deployed) does NOT pick up the import `mount:`. Once it's ACTIVE, connect explicitly:
+
+```
+zerops_manage action="connect-storage" serviceHostname="app" storageHostname="storage"
+```
+
+This registers the connection, but the FUSE mount materializes **only on the next fresh deploy (new container creation)** — a plain restart does NOT bring it up. Redeploy the service after connecting.
+
+## Constraints & behavior
+
+- **Mount path**: always `/mnt/`. Runtime containers only — NOT available during build or `run.prepareCommands` phases.
+- **Mounting overwrites** any existing content in the mount directory.
+- **Capacity**: max 60 GB total (raise via support request); file size is unbounded within the 60 GB. `verticalAutoscaling` floors: RAM 0.5 GB, disk 5 GB.
+- **HA**: 1:1 replication with auto-failover; during a master failover the mount is briefly unavailable (~30s).
+- **POSIX**: standard filesystem ops (with minor permission-setting limits). Filesystem operations are logged to runtime logs tagged `zerops-mount-`.
`df` can report misleading numbers — use the Zerops GUI for accurate storage metrics.
+- **No env vars**: shared storage exposes no connection variables — it's a filesystem, not a networked service.
+
+## Shared storage vs object storage
+
+Use **shared storage** when you need a POSIX filesystem shared across services (shared config, plugin/extension directories, a common scratch area). Use **object storage** (S3/MinIO) for user uploads, media, and any high-throughput or write-heavy file operations — and for anything that must survive independent of any single service. Don't reach for shared storage as a generic uploads bucket.
diff --git a/apps/docs/content/guides/smtp.mdx b/apps/docs/content/guides/smtp.mdx
index dac7476a..6e1f947d 100644
--- a/apps/docs/content/guides/smtp.mdx
+++ b/apps/docs/content/guides/smtp.mdx
@@ -24,16 +24,28 @@ Only port **587** (STARTTLS) is allowed for outbound email — ports 25 and 465
 | Amazon SES | `email-smtp.{region}.amazonaws.com` | 587 | Access key | Secret key |
 
 ## Configuration Example
+
+Non-secret SMTP settings belong in `run.envVariables` (zerops.yaml); the password is a secret in `envSecrets` (import.yaml, service level). These live in **different files** — a bare top-level `envVariables:`/`envSecrets:` block is rejected (`envVariables` is valid only under `build`/`run`).
+
+```yaml
+zerops:
+  - setup: app
+    run:
+      envVariables:
+        SMTP_HOST: smtp.sendgrid.net
+        SMTP_PORT: "587"
+        SMTP_USER: apikey
+```
+
 ```yaml
-envVariables:
-  SMTP_HOST: smtp.sendgrid.net
-  SMTP_PORT: "587"
-  SMTP_USER: apikey
-envSecrets:
-  SMTP_PASSWORD:
+services:
+  - hostname: app
+    type: nodejs@22
+    envSecrets:
+      SMTP_PASSWORD:
 ```
 
+A change to `envSecrets` requires a **service restart** to take effect.
+
 ## Gotchas
-1. **Port 25 is permanently blocked**: Cannot be unblocked — use 587 with STARTTLS
-2. **Port 465 is also blocked**: Legacy SMTPS is deprecated — use 587
-3.
**Gmail needs App Password**: Regular Gmail passwords won't work — generate an App Password in Google Account settings
+- **Gmail SMTP**: a regular Gmail password fails auth — generate an App Password in Google Account settings and use it as `SMTP_PASSWORD`.
diff --git a/apps/docs/content/guides/vpn.mdx b/apps/docs/content/guides/vpn.mdx
index 4799a7b4..d2fc771c 100644
--- a/apps/docs/content/guides/vpn.mdx
+++ b/apps/docs/content/guides/vpn.mdx
@@ -8,7 +8,7 @@ Zerops VPN uses WireGuard via `zcli vpn up ` — connects to one pro
 ## Commands
 ```bash
 zcli vpn up # Connect
-zcli vpn up --auto-disconnect # Auto-disconnect on terminal close
+zcli vpn up --auto-disconnect # First disconnect an already-active VPN, then connect
 zcli vpn up --mtu 1350 # Custom MTU (default 1420)
 zcli vpn down # Disconnect
 ```
@@ -31,11 +31,6 @@ zcli vpn down # Disconnect
 |---------|----------|
 | Interface already exists | `zcli vpn down` then `zcli vpn up` |
 | Hostname not resolving | Try `db.zerops` suffix. On Windows, add `zerops` to DNS suffix list. Note: `dig`/`nslookup` bypass system resolver — use `nc -zv db 5432` to test |
-| WSL2 not working | Enable systemd in `/etc/wsl.conf` under `[boot]` |
+| WSL2 not working | Set `systemd=true` in `/etc/wsl.conf` under `[boot]`, then `wsl --shutdown` |
 | Conflicting VPN | Use `--mtu 1350` |
 | Ubuntu 25.* issues | Install AppArmor utilities |
-
-## Gotchas
-1. **No env vars via VPN**: Must read env vars from GUI or API — VPN only provides network access
-2. **One project at a time**: Cannot connect to multiple projects simultaneously
-3. **Hostname resolution**: Both `hostname` and `hostname.zerops` work (VPN sets up DNS search domain). Use plain hostname for simplicity. If resolution fails on Windows, add `zerops` to DNS suffix list in Advanced TCP/IP Settings.
diff --git a/apps/docs/content/guides/zerops-yaml-advanced.mdx b/apps/docs/content/guides/zerops-yaml-advanced.mdx
index 8fb7980a..b0c29580 100644
--- a/apps/docs/content/guides/zerops-yaml-advanced.mdx
+++ b/apps/docs/content/guides/zerops-yaml-advanced.mdx
@@ -7,54 +7,9 @@ Behavioral semantics for advanced zerops.yml features: health/readiness checks,
 
 ---
 
-## Health Check Behavior
+## Health / Readiness Checks & temporaryShutdown
 
-Health checks run **continuously** on every container after startup. Two types (mutually exclusive):
-
-- **`httpGet`**: GET to `localhost:{port}{path}`. Success = 2xx. Runs **inside** the container. Use `host` for custom Host header, `scheme: https` only if app requires TLS.
-- **`exec`**: Shell command, success = exit 0. Has access to all env vars. Use YAML `|` for multi-command scripts.
-
-| Parameter | Purpose |
-|-----------|---------|
-| `failureTimeout` | Seconds of consecutive failures before container restart |
-| `disconnectTimeout` | Seconds before failing container is removed from load balancer |
-| `recoveryTimeout` | Seconds of success before restarted container receives traffic again |
-| `execPeriod` | Interval in seconds between check attempts |
-
-**Failure sequence**: repeated failures -> `disconnectTimeout` removes from LB -> `failureTimeout` triggers restart -> `recoveryTimeout` gates traffic reconnection.
-
-**DO NOT** configure both `httpGet` and `exec` in the same block.
-
----
-
-## Readiness Check Behavior
-
-Runs **only during deployments** to gate traffic switch to a new container.
-
-```yaml
-deploy:
-  readinessCheck:
-    httpGet: { port: 3000, path: /health }
-    failureTimeout: 60
-    retryPeriod: 10
-```
-
-**How it works**: Checks the **new** container at `localhost`. Until it passes, traffic stays on the old container. After `failureTimeout`, deploy fails and the old container remains active.
-
-**DO NOT** confuse with healthCheck -- readiness gates a deploy; healthCheck monitors continuously after.
-
-> **Dev/stage distinction**: In dev+stage pairs, healthCheck and readinessCheck belong ONLY on the stage entry. Dev services use `start: zsc noop --silent` — the agent controls server lifecycle via SSH. Adding healthCheck to dev causes unwanted container restarts during iteration.
-
----
-
-## temporaryShutdown
-
-| Value | Behavior | Downtime |
-|-------|----------|----------|
-| `false` (default) | New containers start first, old removed after readiness | None (zero-downtime) |
-| `true` | All old containers stop, then new ones start | Yes |
-
-Use `true` when: exclusive DB migration access needed, or brief downtime acceptable. Use `false` for: production web services, APIs, user-facing apps.
+Health checks (`run.healthCheck`, continuous), readiness checks (`deploy.readinessCheck`, deploy-time traffic gate), the httpGet/exec shape + params + failure sequence, the dev/stage placement rule (dev uses `start: zsc noop --silent`, no healthCheck), and `temporaryShutdown` deploy ordering are all owned by `zerops://guides/readiness-health-checks`.
 
 ---
 
@@ -69,7 +24,7 @@ run:
 allContainers: false
 ```
 
-Parameters: `command` (required), `timing` (required, 5-field cron: `min hour dom mon dow`), `workingDir` (default `/var/www`), `allContainers` (`false` = one container, `true` = all containers).
+Parameters: `command` (required), `timing` (required, 5-field cron: `min hour dom mon dow`), `allContainers` (**required** by the schema — `false` = one container, `true` = all containers), `workingDir` (optional, default `/var/www`).
 
 Cron runs inside the runtime container with full env var access. When `allContainers: false`, Zerops picks **one** container (good for DB jobs). Use `true` for cache clearing or log rotation everywhere. Minimum granularity is 1 minute.
 
@@ -90,7 +45,7 @@ run:
 - litestream restore -if-replica-exists -if-db-not-exists $DB_NAME
 ```
 
-Each entry: `command` (required), `name` (required), `workingDir` (optional), `initCommands` (optional, per-process init).
**DO NOT** use both `start` and `startCommands`.
+Each entry: `command` (**required** — the only field the schema requires), `name` (optional, distinguishes processes in logs), `workingDir` (optional), `initCommands` (optional, per-process init). **DO NOT** use both `start` and `startCommands`.
 
 ---
 
@@ -178,7 +133,7 @@ Configuration is **merged at the section level** -- child values override parent
 Available runtimes and versions are listed in **Service Stacks (live)** -- injected by `zerops_knowledge` and workflow responses. Some key rules:
 
 - PHP: build `php@X`, run `php-nginx@X` or `php-apache@X` (different bases)
-- Deno, Gleam: REQUIRES `os: ubuntu` (not available on Alpine)
+- Deno: REQUIRES `os: ubuntu` (no Alpine build exists). Gleam runs on both Alpine and Ubuntu.
 - Static sites: build `nodejs@latest`, run `static`
 - `@latest` = newest stable version
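The key rules in the list above might translate into `zerops.yml` roughly as follows. This is a minimal sketch, not a verified configuration: the setup names, the PHP version `8.3`, the use of `deno@latest`, and the placement of `os` under both `build` and `run` are assumptions for illustration. Build commands, deploy files and start commands are omitted.

```yaml
zerops:
  # PHP: build and run use different bases (version 8.3 is illustrative)
  - setup: phpapp                   # hypothetical setup name
    build:
      base: php@8.3
    run:
      base: php-nginx@8.3           # or php-apache@8.3

  # Deno: requires Ubuntu; setting `os` in both sections is an assumption
  - setup: denoapp                  # hypothetical setup name
    build:
      base: deno@latest
      os: ubuntu
    run:
      base: deno@latest
      os: ubuntu

  # Static site: built with Node.js, served by the static runtime
  - setup: site                     # hypothetical setup name
    build:
      base: nodejs@latest
    run:
      base: static
```

Pinning an explicit version instead of `@latest` is usually the safer choice for production services, since `@latest` moves with each new stable release.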