Old behaviour: HEIC/AVIF photos, .mov / .webm / .mkv videos, and niche
audio (FLAC, etc.) got rejected outright at upload with "Images are
not supported" / "Videos are not supported" errors. Strict but
unfriendly — recipients could still receive these as a downloadable
file via WhatsApp's document path; we just weren't using it.
New behaviour: anything not playable inline gets routed through the
document path automatically. The recipient downloads the file and
opens it in their default app. The 100 MB document cap applies
instead of the inline 5 / 16 / 16 MB caps. Only oversized uploads
get rejected.
Where the policy lives
----------------------
The classifier moved into a new `@cmbot/shared/whatsapp-media`
module so the web upload validator AND the bot's fire-reminder send
path share one source of truth:
- resolveDeliveryKind(mime, bytes?) → "image" | "video" | "audio"
| "document". Native types stay as-is; HEIF / AVIF / QuickTime /
WebM / Matroska / non-MP3-or-M4A audio all collapse to "document".
- Bytes argument is optional but recommended — sniffing the first
12 bytes of the file catches iOS Safari's habit of labelling
a HEIC as image/jpeg or a .mov as video/mp4. Bytes win when they
disagree with the mime.
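
The classification rule above can be sketched as follows. This is a minimal
illustration, not the code in `@cmbot/shared/whatsapp-media`: the brand list,
the helper names `sniffsAsDocument` and `ascii`, and the exact native mime
table are assumptions. In particular, OGG is left out of the native table
here because the rule above names only MP3 and M4A as native audio.

```typescript
type DeliveryKind = "image" | "video" | "audio" | "document";

// Assumed native mime table; anything missing falls through to "document".
const NATIVE_KINDS: { [mime: string]: DeliveryKind | undefined } = {
  "image/jpeg": "image",
  "image/png": "image",
  "video/mp4": "video",
  "video/3gpp": "video",
  "audio/mpeg": "audio",
  "audio/mp4": "audio",
};

// ISO-BMFF major brands treated as non-native (HEIF/HEIC, AVIF, QuickTime).
// Illustrative list, not exhaustive.
const DEMOTED_BRANDS = new Set([
  "heic", "heix", "hevc", "hevx", "mif1", "msf1", "avif", "avis", "qt  ",
]);

// Decode a byte range as ASCII without relying on iterator spread.
function ascii(bytes: Uint8Array, start: number, end: number): string {
  let out = "";
  for (let i = start; i < end; i++) out += String.fromCharCode(bytes[i]);
  return out;
}

// True when the leading bytes identify a container that must go out as a document.
function sniffsAsDocument(head: Uint8Array): boolean {
  // EBML magic 1A 45 DF A3: shared by Matroska and WebM.
  if (head.length >= 4 && head[0] === 0x1a && head[1] === 0x45 &&
      head[2] === 0xdf && head[3] === 0xa3) return true;
  // FLAC streams start with the ASCII marker "fLaC".
  if (head.length >= 4 && ascii(head, 0, 4) === "fLaC") return true;
  // ISO-BMFF: bytes 4..8 are "ftyp", bytes 8..12 are the major brand.
  if (head.length >= 12 && ascii(head, 4, 8) === "ftyp") {
    return DEMOTED_BRANDS.has(ascii(head, 8, 12));
  }
  return false;
}

function resolveDeliveryKind(mime: string, bytes?: Uint8Array): DeliveryKind {
  // Bytes win when they disagree with the declared mime.
  if (bytes && sniffsAsDocument(bytes)) return "document";
  return NATIVE_KINDS[mime.toLowerCase()] ?? "document";
}
```

Twelve bytes are enough here because the `ftyp` box carries its major brand
at offsets 8 to 11; telling WebM apart from Matroska would need a deeper
read, but both demote to "document" anyway.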
Web side
--------
- `lib/whatsapp-media.ts` re-exports the shared helpers and keeps
only the validator + byte-formatter. `validateForWhatsApp` calls
resolveDeliveryKind internally; the size cap it returns is for the
RESOLVED kind (so a HEIC routes to document and gets the 100 MB
cap). The "Images are not supported" / "Videos are not supported"
rejection messages are gone — there's no format rejection anymore.
- `actions/media.ts` collapses the previous explicit-mime + byte-sniff
pair into a single `validateForWhatsApp(mime, size, bytes)` call.
- Compose-step upload-zone hint updated to spell out the per-kind
caps: "JPEG/PNG up to 5 MB · MP4/3GP up to 16 MB · MP3/M4A/OGG
up to 16 MB · documents up to 100 MB".
Bot side
--------
- `fire-reminder.ts` reads the first 12 bytes of the file before
dispatching and calls `resolveDeliveryKind(mimeType, head)` to
pick the senderKind. So a HEIC on disk (whose mime claims
image/jpeg) gets sent via Baileys' document path — no failed
thumbnail extraction, message arrives as a downloadable .heic.
- New `readHeadBytes(filePath, n)` helper opens, reads N bytes,
closes — no full-file slurp.
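
One way such a helper could look, using Node's `fs/promises` file handles.
This is a sketch under the assumption that the helper returns a
`Uint8Array`; the real return type and error handling in `fire-reminder.ts`
may differ.

```typescript
import { open } from "node:fs/promises";

// Read at most `n` bytes from the start of the file without loading the rest.
async function readHeadBytes(filePath: string, n: number): Promise<Uint8Array> {
  const handle = await open(filePath, "r");
  try {
    const buffer = new Uint8Array(n);
    // Read into buffer[0..n) starting at file position 0.
    const { bytesRead } = await handle.read(buffer, 0, n, 0);
    // Files shorter than `n` yield a shorter slice rather than zero padding.
    return buffer.subarray(0, bytesRead);
  } finally {
    // Close the descriptor even when the read throws.
    await handle.close();
  }
}
```

The `finally` block keeps a failed read from leaking the file descriptor,
which matters for a long-running bot process that dispatches many reminders.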
Tests
-----
249 web + 31 shared + 26 bot = 306 passing total.
Web (`lib/whatsapp-media.test.ts`):
- "HEIC at 30 MB allowed: routes to document (100 MB cap)"
- "HEIC at 110 MB rejects: exceeds the document cap"
- "MOV at 50 MB allowed (would be 16 MB cap as video, 100 MB as
document)"
- "MOV pretending to be mp4 demotes to document (50 MB allowed)"
- "FLAC audio routes to document path"
- "genuine MP4 byte-sniff path keeps it as video"
Shared (`packages/shared/src/whatsapp-media.test.ts`, new):
- The cross-package contract: 11 tests covering size limits,
classifyMediaKind, resolveDeliveryKind for native + demoted +
byte-sniff cases, plus the underlying helpers.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cm WhatsApp Reminder Bot
Self-hosted WhatsApp reminder bot. Pairs multiple WhatsApp accounts via Telegram-delivered QR codes and sends scheduled reminders to groups.
Status
Plan 1 complete. Foundation, DB schema, and Telegram-driven WhatsApp pairing are working end-to-end. Reminder scheduling, the web dashboard, and production deploy are upcoming plans (docs/superpowers/plans/).
What's working today:
- Single-operator Telegram bot with a whitelist + audit log of every command.
- BotFather-style menu navigation: /menu opens a single message that edits in place as you navigate.
- Pair a new WhatsApp account with /menu → 📡 Pair New → reply with a label. QR is delivered to Telegram and refreshed in place as it expires.
- Browse paired accounts with 📒 Accounts. Tap an account → see groups, send a test text message, or unpair.
- Group sync runs at pairing and on every Baileys groups.upsert / groups.update event, plus a manual 🔄 Refresh button. Removed groups are pruned automatically.
- Auto-reconnect on transient drops; restart-survival via Baileys useMultiFileAuthState (no QR rescan needed across container restarts as long as WhatsApp hasn't logged the device out).
Host requirements
Only Docker. No host Node, pnpm, or any other language toolchain — everything runs in containers via the long-lived tools service.
Architecture in one paragraph
Two app containers and one external dependency. bot (Node.js) holds the live Baileys WhatsApp sessions, the grammy Telegram bot, and (in plan 2) a pg-boss scheduler. web (Next.js, plan 3) is stateless UI + API. tools is a long-running Node 22 + pnpm sidecar used for installs/tests/typechecks/migrations so the host doesn't need a Node toolchain. Postgres lives external at 192.168.0.210 in a wabot database. All cross-service communication goes through Postgres (LISTEN/NOTIFY for events, table writes for state).
Full design spec: docs/superpowers/specs/2026-05-03-whatsapp-bot-design.md
Quick start (dev)
Prerequisites: Docker, the wabot database + waBot role on 192.168.0.210 (with a pg_hba.conf line permitting 192.168.0.0/24), and a Telegram bot token from @BotFather.
# 1. Configure env
cp envs/.env.example .env.development
# edit .env.development: real DATABASE_URL, TELEGRAM_BOT_TOKEN, your TG user ID
scripts/gen_auth_secret.sh --write
# 2. Bring up the tools container, install deps
NO_SUDO=1 scripts/dev.sh up
NO_SUDO=1 scripts/dev.sh pnpm install
# 3. Apply migrations and seed your operator row
NO_SUDO=1 scripts/db.sh migrate
NO_SUDO=1 scripts/db.sh seed
# 4. Watch the bot service
NO_SUDO=1 scripts/dev.sh logs bot
In Telegram, message your dev bot /menu, tap 📡 Pair New, reply with a label, scan the QR.
NO_SUDO=1 is the right setting if your user is in the docker group (the default for this repo). Drop it if you need sudo docker.
Layout
- apps/bot/: Node service with Baileys WhatsApp + grammy Telegram + (later) pg-boss scheduler
- apps/web/: Next.js dashboard (plan 3)
- packages/db/: Drizzle schema and migrations
- packages/shared/: cross-app helpers (rrule, media paths, timezones)
- docs/superpowers/specs/: design specs and manual test runbooks
- docs/superpowers/plans/: implementation plans
- docker/: Dockerfiles (tools.Dockerfile, bot.Dockerfile, web.Dockerfile placeholder)
- scripts/: dev.sh, db.sh, gen_auth_secret.sh, plus stubs for plans 2/4
Scripts
All pnpm/tsx/drizzle-kit invocations run inside the tools container, so no host Node is needed.
| Script | Purpose |
|---|---|
| scripts/dev.sh up\|down\|logs\|status\|build\|exec\|pnpm\|shell\|restart-bot | Stack lifecycle and tools-container shell |
| scripts/db.sh migrate\|generate\|studio\|seed\|reset | Drizzle migration helper |
| scripts/gen_auth_secret.sh [--write] | Generate AUTH_SECRET (host-only, no Node needed) |
| scripts/publish.sh | Push to Gitea registry (implemented in plan 4) |
| scripts/link-account.sh | CLI pairing without Telegram (implemented in plan 2) |
Set NO_SUDO=1 if your user is in the docker group (recommended).
Next plan
docs/superpowers/plans/<next-date>-reminder-scheduling.md — pg-boss, reminder CRUD via Telegram, fire-reminder handler, sender (text/image/video), retry policy, run history.