diff --git a/README.md b/README.md
index d1b628f..d484418 100644
--- a/README.md
+++ b/README.md
@@ -130,6 +130,15 @@ Screen". Launches fullscreen.
 `NO_SUDO=1` is the right setting if your user is in the `docker`
 group (the default for this repo). Drop it if you need `sudo docker`.
 
+## Deploying
+
+- **Local dev** — `NO_SUDO=1 scripts/dev.sh up` (described in Quick
+  start above).
+- **Portainer** — push images with `scripts/publish.sh`, then deploy
+  the [`docker-compose.portainer.yml`](docker-compose.portainer.yml)
+  stack via the Portainer UI. Full walk-through:
+  [`docs/deploy-portainer.md`](docs/deploy-portainer.md).
+
 ## Manual test runbook
 
 End-to-end checks that unit tests can't cover (live Baileys,
diff --git a/docker-compose.portainer.yml b/docker-compose.portainer.yml
new file mode 100644
index 0000000..3ba4630
--- /dev/null
+++ b/docker-compose.portainer.yml
@@ -0,0 +1,111 @@
+# Portainer-ready stack. Pulls cm-whatsapp-{web,bot} from
+# gitea.04080616.xyz/yiekheng instead of building from source — drop
+# this file into a Portainer "Stack" (Repository or Web editor) and
+# fill the env vars in the Portainer UI.
+#
+# Differences vs docker-compose.base.yml:
+#   - No `build:` blocks (Portainer pulls only).
+#   - Named volumes (cmbot-sessions, cmbot-media) instead
+#     of host bind-mounts so the operator doesn't need shell access
+#     to manage persistent state.
+#   - Ports section on `web` so the operator can route a reverse
+#     proxy / Cloudflare Tunnel directly at the container.
+#   - `restart: unless-stopped` on both services.
+#
+# Required env vars (set in Portainer → Stack → Environment variables):
+#   DATABASE_URL              postgres://USER:PASS@HOST:5432/wabot
+#   AUTH_SECRET               32-byte random hex (use scripts/gen_auth_secret.sh
+#                             on any machine and copy the output)
+#   WEB_PORT                  host port for the web container (default 9000)
+#
+# Optional:
+#   DOCKER_IMAGE_TAG          registry tag to deploy (default: latest)
+#   OPERATOR_TOKEN_VERSION    session-cookie kill switch (default: 1)
+#   BOT_FIRE_CONCURRENCY      pg-boss workers (default: 8)
+#   BOT_GROUP_CONCURRENCY     per-account parallel sends (default: 3)
+#   BOT_MAX_SEND_PER_MINUTE   per-account token-bucket rate (default: 40)
+#   BOT_LOG_LEVEL             pino log level (default: info)
+#
+# Registry auth: Portainer needs a pull credential for
+# gitea.04080616.xyz before you start the stack:
+#   Portainer → Registries → Add registry
+#     Name:     gitea.04080616.xyz
+#     URL:      gitea.04080616.xyz
+#     Username:
+#     Token:
+# After adding, edit each service in the stack and set "Registry" to
+# the one you just added so the pull resolves.
+
+services:
+  bot:
+    image: gitea.04080616.xyz/yiekheng/cm-whatsapp-bot:${DOCKER_IMAGE_TAG:-latest}
+    container_name: cmbot-bot
+    restart: unless-stopped
+    environment:
+      NODE_ENV: production
+      DATABASE_URL: ${DATABASE_URL}
+      DATA_DIR: /data
+      SESSIONS_DIR: /data/sessions
+      MEDIA_DIR: /data/media
+      BOT_HEALTH_PORT: 8081
+      BOT_LOG_LEVEL: ${BOT_LOG_LEVEL:-info}
+      BOT_FIRE_CONCURRENCY: ${BOT_FIRE_CONCURRENCY:-8}
+      BOT_GROUP_CONCURRENCY: ${BOT_GROUP_CONCURRENCY:-3}
+      BOT_MAX_SEND_PER_MINUTE: ${BOT_MAX_SEND_PER_MINUTE:-40}
+    volumes:
+      - cmbot-sessions:/data/sessions
+      - cmbot-media:/data/media
+    healthcheck:
+      test:
+        - "CMD-SHELL"
+        - "wget -qO- --timeout=2 http://127.0.0.1:8081/health >/dev/null || exit 1"
+      interval: 30s
+      timeout: 5s
+      retries: 3
+      start_period: 20s
+    networks:
+      - cmbot
+
+  web:
+    image: gitea.04080616.xyz/yiekheng/cm-whatsapp-web:${DOCKER_IMAGE_TAG:-latest}
+    container_name: cmbot-web
+    restart: unless-stopped
+    depends_on:
+      - bot
+    environment:
+      NODE_ENV: production
+      DATABASE_URL: ${DATABASE_URL}
+      DATA_DIR: /data
+      MEDIA_DIR: /data/media
+      WEB_PORT: 3000
+      AUTH_SECRET: ${AUTH_SECRET}
+      OPERATOR_TOKEN_VERSION: ${OPERATOR_TOKEN_VERSION:-1}
+    volumes:
+      # Web reads media from the same persistent volume the bot wrote.
+      - cmbot-media:/data/media:ro
+    ports:
+      # Maps the Next.js port (3000 inside the container) to whatever
+      # WEB_PORT the operator set. The reverse proxy / Cloudflare Tunnel
+      # in front of this host points at :${WEB_PORT}.
+      - "${WEB_PORT:-9000}:3000"
+    healthcheck:
+      test:
+        - "CMD-SHELL"
+        - "wget -qO- --timeout=2 http://127.0.0.1:3000/api/health >/dev/null || exit 1"
+      interval: 30s
+      timeout: 5s
+      retries: 3
+      start_period: 30s
+    networks:
+      - cmbot
+
+volumes:
+  cmbot-sessions:
+    name: cmbot-sessions
+  cmbot-media:
+    name: cmbot-media
+
+networks:
+  cmbot:
+    driver: bridge
+    name: cmbot
diff --git a/docker/bot.Dockerfile b/docker/bot.Dockerfile
index 6e0073e..bedc509 100644
--- a/docker/bot.Dockerfile
+++ b/docker/bot.Dockerfile
@@ -26,11 +26,13 @@ COPY --from=build /app/node_modules /app/node_modules
 COPY --from=build /app/apps/bot /app/apps/bot
 COPY --from=build /app/packages/db /app/packages/db
 COPY --from=build /app/packages/shared /app/packages/shared
-RUN addgroup -g 1000 app && \
-    adduser -D -u 1000 -G app -s /sbin/nologin app && \
-    mkdir -p /data/sessions /data/media /app && \
-    chown -R app:app /app /data && \
+# Reuse the `node` user (UID/GID 1000) that node:alpine ships with —
+# `addgroup -g 1000 app` failed in CI because gid 1000 was already
+# taken by the node group. Still non-root with one less moving part,
+# though `node` keeps /bin/sh as its shell (`app` had /sbin/nologin).
+RUN mkdir -p /data/sessions /data/media /app && \
+    chown -R node:node /app /data && \
     chmod 700 /data/sessions
-USER app
+USER node
 EXPOSE 8081
 CMD ["node", "apps/bot/dist/index.js"]
diff --git a/docker/web.Dockerfile b/docker/web.Dockerfile
index 5432f24..8f87a51 100644
--- a/docker/web.Dockerfile
+++ b/docker/web.Dockerfile
@@ -29,9 +29,9 @@ ENV HOSTNAME=0.0.0.0
 COPY --from=build /app/apps/web/.next/standalone ./
 COPY --from=build /app/apps/web/.next/static ./apps/web/.next/static
 COPY --from=build /app/apps/web/public ./apps/web/public
-RUN addgroup -g 1000 app && \
-    adduser -D -u 1000 -G app -s /sbin/nologin app && \
-    chown -R app:app /app
-USER app
+# Reuse the `node` user (UID/GID 1000) that node:alpine ships with —
+# `addgroup -g 1000 app` collided with the pre-existing node group.
+RUN chown -R node:node /app
+USER node
 EXPOSE 3000
 CMD ["node", "apps/web/server.js"]
diff --git a/docs/deploy-portainer.md b/docs/deploy-portainer.md
new file mode 100644
index 0000000..9dd3a68
--- /dev/null
+++ b/docs/deploy-portainer.md
@@ -0,0 +1,172 @@
+# Deploying via Portainer
+
+End-to-end deploy steps for a fresh Portainer-managed host. Targets
+the standard cm-whatsapp-bot pair of images published by
+`scripts/publish.sh`.
+
+## 0. Prerequisites
+
+- Portainer 2.x running on the target host (CE and EE both work).
+- A Postgres reachable from that host (the `wabot` database with the
+  pgcrypto / pg_trgm extensions enabled — run migrations from any
+  machine that can reach the DB before the stack is brought up).
+- A pull credential for `gitea.04080616.xyz` — a Gitea personal
+  access token with the `read:packages` scope. Generate one in
+  Gitea → User Settings → Applications.
+- A reverse proxy / Cloudflare Tunnel pointing at
+  `http://:` if the deploy needs to be
+  reachable on the public domain (e.g. `wabot.04080616.xyz`).
+
+## 1. Add the registry to Portainer
+
+Portainer → **Registries** → **+ Add registry** → Custom registry.
+
+| Field          | Value                       |
+|----------------|-----------------------------|
+| Name           | `gitea.04080616.xyz`        |
+| Registry URL   | `gitea.04080616.xyz`        |
+| Authentication | enabled                     |
+| Username       | your Gitea username         |
+| Password       | the read:packages PAT       |
+
+Save. The registry must show as connected before continuing — if the
+test pull fails, the stack will hang on `pull` later.
+
+## 2. Push the images (on your dev machine)
+
+```bash
+# Log in once as the user that pushes (no sudo, to match NO_SUDO=1 below)
+docker login gitea.04080616.xyz
+
+# Push :latest. Tag explicitly with DOCKER_IMAGE_TAG=v1.x.y if you
+# want pinned-tag deploys (recommended for prod — never deploy
+# `latest` if you can avoid it; tag versions per release).
+NO_SUDO=1 ./scripts/publish.sh latest
+```
+
+`publish.sh` builds and pushes both images:
+- `gitea.04080616.xyz/yiekheng/cm-whatsapp-bot:`
+- `gitea.04080616.xyz/yiekheng/cm-whatsapp-web:`
+
+## 3. Create the Portainer stack
+
+Portainer → **Stacks** → **+ Add stack**.
+
+**Name:** `cm-whatsapp-bot`
+
+**Build method:** "Web editor" or "Repository". Either is fine —
+"Repository" pointing at this repo's `master` and the file
+`docker-compose.portainer.yml` is the cleanest path because future
+deploys are just "Pull and redeploy" inside Portainer.
+
+**Web editor path:** copy the contents of
+[`docker-compose.portainer.yml`](../docker-compose.portainer.yml)
+into the editor verbatim.
+
+**Repository path:**
+
+| Field            | Value                                                       |
+|------------------|-------------------------------------------------------------|
+| Repository URL   | http://192.168.0.215:3000/yiekheng/cm_whatsapp_bot_v1.git   |
+| Reference        | refs/heads/master                                           |
+| Compose path     | docker-compose.portainer.yml                                |
+| Authentication   | enabled (same Gitea PAT as step 1)                          |
+| Auto-update      | optional — enabled lets Portainer redeploy on every push    |
+
+## 4. Set environment variables
+
+In the same stack form, scroll to **Environment variables** and add:
+
+| Key                       | Value                                            |
+|---------------------------|--------------------------------------------------|
+| `DATABASE_URL`            | `postgres://wabot:PASS@192.168.0.210:5432/wabot` |
+| `AUTH_SECRET`             | output of `scripts/gen_auth_secret.sh`           |
+| `WEB_PORT`                | host port (e.g. `9000`)                          |
+| `DOCKER_IMAGE_TAG`        | `latest` (or a pinned `v1.x.y`)                  |
+| `OPERATOR_TOKEN_VERSION`  | `1` (bump only when you want to invalidate every existing session) |
+| `BOT_LOG_LEVEL`           | `info`                                           |
+
+Optional tuning (defaults are fine for most installs):
+
+| Key                       | Default | When to bump |
+|---------------------------|---------|--------------|
+| `BOT_FIRE_CONCURRENCY`    | `8`     | More accounts firing in parallel |
+| `BOT_GROUP_CONCURRENCY`   | `3`     | More groups per fire; be careful with WhatsApp rate caps |
+| `BOT_MAX_SEND_PER_MINUTE` | `40`    | Aged accounts can push toward 60 |
+
+## 5. Run database migrations
+
+The stack does NOT auto-migrate on boot. Apply migrations from any
+machine that can reach the same Postgres:
+
+```bash
+DATABASE_URL='postgres://...' \
+  ./scripts/db.sh migrate
+```
+
+If the journal is non-monotonic, the migrate runner refuses to run,
+prints a clear error, and names the `_journal.json` entry to bump (the
+guard added in commit 47d7c53, plus the CI test in
+`apps/web/src/test/drizzle-journal-monotonic.test.ts`).
+
+Then seed the bootstrap operator and set its password:
+
+```bash
+DATABASE_URL='postgres://...' SEED_OPERATOR_USERNAME=admin \
+  ./scripts/db.sh seed
+DATABASE_URL='postgres://...' \
+  ./scripts/set-password.sh admin   # reads the password from stdin
+```
+
+## 6. Deploy the stack
+
+In Portainer, click **Deploy the stack**. Then watch the container
+list in **Containers**:
+
+- `cmbot-bot` should show *running, healthy* within ~20 s.
+- `cmbot-web` should show *running, healthy* within ~30 s (Next.js
+  cold boot is the bottleneck).
+
+If a container shows *unhealthy*, check **Logs**:
+
+| Symptom                                       | Likely cause |
+|-----------------------------------------------|--------------|
+| `column "email" does not exist`               | Migrations weren't applied. Run step 5. |
+| `Server is not configured for sign-in`        | `AUTH_SECRET` blank or missing. Set it in stack env. |
+| `pg-boss: queue policy ...standard`           | Harmless first-boot log; the bot force-flips it. |
+| `Stream Errored (restart required)` (Baileys) | Upstream noise; ignore unless pairing fails. |
+
+## 7. First sign-in
+
+Visit `https:///login`, sign in as `admin` with the
+password set in step 5, and walk the
+[`docs/runbook.md`](runbook.md) smoke checklist before declaring
+the deploy good.
+
+## 8. Future redeploys
+
+Two paths depending on how you set up step 3:
+
+**Web editor flow:**
+1. Run `scripts/publish.sh ` on your dev machine.
+2. In Portainer → Stack → "Update the stack" → "Re-pull image and
+   redeploy".
+
+**Repository flow:**
+1. Run `scripts/publish.sh `.
+2. Commit any compose / env changes to master.
+3. Portainer → Stack → "Pull and redeploy". (If auto-update is on,
+   skip this — Portainer redeploys on every push.)
+
+Always pin a tag (`v1.4.2`) instead of `latest` for production;
+that makes rollback a one-field stack edit instead of a republish.
+
+## Rolling back
+
+In Portainer → Stack → set `DOCKER_IMAGE_TAG=v1.4.1` (or whatever
+the previous good tag was) → Re-pull and redeploy. The cmbot-* data
+volumes (sessions, media) are preserved across image swaps, so a
+rollback doesn't lose pairings or uploaded media.
+
+If the rolled-back release also changed the schema, run the matching
+`down` SQL by hand: drizzle's migrator only goes forward, by design.
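
The rollback described above can be spot-checked from a shell on the Docker host. The following is a minimal sketch, not something this change ships: it assumes shell access to the host and the container names `cmbot-bot` and `cmbot-web` from the compose file, and the function names `check_line` and `verify_deploy` are hypothetical. It relies only on `docker inspect --format` and the `.Config.Image` and `.State.Health.Status` fields.

```shell
#!/bin/sh
# Hypothetical post-deploy check; not part of the repo's scripts/.

# check_line NAME IMAGE HEALTH EXPECTED_TAG
# Succeeds only when IMAGE ends in ":EXPECTED_TAG" and HEALTH is "healthy".
check_line() {
  name=$1
  image=$2
  health=$3
  expected_tag=$4
  case "$image" in
    *":$expected_tag") ;;
    *) echo "$name: image is '$image', expected tag '$expected_tag'"; return 1 ;;
  esac
  if [ "$health" != "healthy" ]; then
    echo "$name: health is '$health'"
    return 1
  fi
  echo "$name: ok"
}

# verify_deploy EXPECTED_TAG
# Inspects both containers and returns non-zero if either check fails.
verify_deploy() {
  want=$1
  rc=0
  for c in cmbot-bot cmbot-web; do
    # Word-splitting is intentional: the format prints "<image> <health>".
    # If the container does not exist, both fields come back empty and
    # check_line reports the mismatch.
    set -- $(docker inspect --format '{{.Config.Image}} {{.State.Health.Status}}' "$c" 2>/dev/null)
    check_line "$c" "${1:-}" "${2:-}" "$want" || rc=1
  done
  return "$rc"
}
```

After sourcing the file, `verify_deploy v1.4.1` reports each container and exits non-zero if one of them is still on another tag or is not yet healthy. Keeping `check_line` free of Docker calls means the comparison logic can be exercised without a running stack.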