yiekheng cfd3308477 feat(media): unsupported image/video/audio formats fall back to document delivery
Old behaviour: HEIC/AVIF photos, .mov / .webm / .mkv videos, and niche
audio (FLAC, etc.) got rejected outright at upload with "Images are
not supported" / "Videos are not supported" errors. Strict but
unfriendly — recipients could still receive these as a downloadable
file via WhatsApp's document path; we just weren't using it.

New behaviour: anything not playable inline gets routed through the
document path automatically. The recipient downloads the file and
opens it in their default app. The 100 MB document cap applies
instead of the inline 5 / 16 / 16 MB caps. Only oversized uploads
get rejected.

Where the policy lives
----------------------
The classifier moved into a new `@cmbot/shared/whatsapp-media`
module so the web upload validator AND the bot's fire-reminder send
path share one source of truth:

  - resolveDeliveryKind(mime, bytes?) → "image" | "video" | "audio"
    | "document". Native types stay as-is; HEIF / AVIF / QuickTime /
    WebM / Matroska / non-MP3-or-M4A audio all collapse to "document".
  - Bytes argument is optional but recommended — sniffing the first
    12 bytes of the file catches iOS Safari's habit of labelling
    a HEIC as image/jpeg or a .mov as video/mp4. Bytes win when they
    disagree with the mime.

Web side
--------
- `lib/whatsapp-media.ts` re-exports the shared helpers and keeps
  only the validator + byte-formatter. `validateForWhatsApp` calls
  resolveDeliveryKind internally; the size cap it returns is for the
  RESOLVED kind (so a HEIC routes to document and gets the 100 MB
  cap). The "Images are not supported" / "Videos are not supported"
  rejection messages are gone — there's no format rejection anymore.
- `actions/media.ts` collapses the previous explicit-mime + byte-sniff
  pair into a single `validateForWhatsApp(mime, size, bytes)` call.
- Compose-step upload-zone hint updated to spell out the per-kind
  caps: "JPEG/PNG up to 5 MB · MP4/3GP up to 16 MB · MP3/M4A/OGG
  up to 16 MB · documents up to 100 MB".

Bot side
--------
- `fire-reminder.ts` reads the first 12 bytes of the file before
  dispatching and calls `resolveDeliveryKind(mimeType, head)` to
  pick the senderKind. So a HEIC on disk (whose mime claims
  image/jpeg) gets sent via Baileys' document path — no failed
  thumbnail extraction, message arrives as a downloadable .heic.
- New `readHeadBytes(filePath, n)` helper opens, reads N bytes,
  closes — no full-file slurp.

Tests
-----
249 web + 31 shared + 26 bot = 306 passing total.

Web (`lib/whatsapp-media.test.ts`):
- "HEIC at 30 MB allowed: routes to document (100 MB cap)"
- "HEIC at 110 MB rejects: exceeds the document cap"
- "MOV at 50 MB allowed (would be 16 MB cap as video, 100 MB as
  document)"
- "MOV pretending to be mp4 demotes to document (50 MB allowed)"
- "FLAC audio routes to document path"
- "genuine MP4 byte-sniff path keeps it as video"

Shared (`packages/shared/src/whatsapp-media.test.ts`, new):
- The cross-package contract: 11 tests covering size limits,
  classifyMediaKind, resolveDeliveryKind for native + demoted +
  byte-sniff cases, plus the underlying helpers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 13:07:54 +08:00
2026-05-09 17:10:48 +08:00

cm WhatsApp Reminder Bot

Self-hosted WhatsApp reminder bot. Pairs multiple WhatsApp accounts via Telegram-delivered QR codes and sends scheduled reminders to groups.

Status

Plan 1 complete. Foundation, DB schema, and Telegram-driven WhatsApp pairing are working end-to-end. Reminder scheduling, the web dashboard, and production deploy are upcoming plans (docs/superpowers/plans/).

What's working today:

  • Single-operator Telegram bot with a whitelist + audit log of every command.
  • BotFather-style menu navigation: /menu opens a single message that edits in place as you navigate.
  • Pair a new WhatsApp account with /menu📡 Pair New → reply with a label. QR is delivered to Telegram and refreshed in place as it expires.
  • Browse paired accounts with 📒 Accounts. Tap an account → see groups, send a test text message, or unpair.
  • Group sync runs at pairing and on every Baileys groups.upsert / groups.update event, plus a manual 🔄 Refresh button. Removed groups are pruned automatically.
  • Auto-reconnect on transient drops; restart-survival via Baileys useMultiFileAuthState (no QR rescan needed across container restarts as long as WhatsApp hasn't logged the device out).

Host requirements

Only Docker. No host Node, pnpm, or any other language toolchain — everything runs in containers via the long-lived tools service.

Architecture in one paragraph

Two app containers and one external dependency. bot (Node.js) holds the live Baileys WhatsApp sessions, the grammy Telegram bot, and (in plan 2) a pg-boss scheduler. web (Next.js, plan 3) is stateless UI + API. tools is a long-running Node 22 + pnpm sidecar used for installs/tests/typechecks/migrations so the host doesn't need a Node toolchain. Postgres lives external at 192.168.0.210 in a wabot database. All cross-service communication goes through Postgres (LISTEN/NOTIFY for events, table writes for state).

Full design spec: docs/superpowers/specs/2026-05-03-whatsapp-bot-design.md

Quick start (dev)

Prerequisites: Docker, the wabot database + waBot role on 192.168.0.210 (with a pg_hba.conf line permitting 192.168.0.0/24), and a Telegram bot token from @BotFather.

# 1. Configure env
cp envs/.env.example .env.development
# edit .env.development: real DATABASE_URL, TELEGRAM_BOT_TOKEN, your TG user ID
scripts/gen_auth_secret.sh --write

# 2. Bring up the tools container, install deps
NO_SUDO=1 scripts/dev.sh up
NO_SUDO=1 scripts/dev.sh pnpm install

# 3. Apply migrations and seed your operator row
NO_SUDO=1 scripts/db.sh migrate
NO_SUDO=1 scripts/db.sh seed

# 4. Watch the bot service
NO_SUDO=1 scripts/dev.sh logs bot

In Telegram, message your dev bot /menu, tap 📡 Pair New, reply with a label, scan the QR.

NO_SUDO=1 is the right setting if your user is in the docker group (the default for this repo). Drop it if you need sudo docker.

Layout

  • apps/bot/ — Node service: Baileys WhatsApp + grammy Telegram + (later) pg-boss scheduler
  • apps/web/ — Next.js dashboard (plan 3)
  • packages/db/ — Drizzle schema and migrations
  • packages/shared/ — cross-app helpers (rrule, media paths, timezones)
  • docs/superpowers/specs/ — design specs and manual test runbooks
  • docs/superpowers/plans/ — implementation plans
  • docker/ — Dockerfiles (tools.Dockerfile, bot.Dockerfile, web.Dockerfile placeholder)
  • scripts/dev.sh, db.sh, gen_auth_secret.sh, plus stubs for plans 2/4

Scripts

All pnpm/tsx/drizzle-kit invocations run inside the tools container, so no host Node is needed.

Script Purpose
scripts/dev.sh up|down|logs|status|build|exec|pnpm|shell|restart-bot Stack lifecycle and tools-container shell
scripts/db.sh migrate|generate|studio|seed|reset Drizzle migration helper
scripts/gen_auth_secret.sh [--write] Generate AUTH_SECRET (host-only, no Node needed)
scripts/publish.sh Push to Gitea registry — implemented in plan 4
scripts/link-account.sh CLI pairing without Telegram — implemented in plan 2

Set NO_SUDO=1 if your user is in the docker group (recommended).

Next plan

docs/superpowers/plans/<next-date>-reminder-scheduling.md — pg-boss, reminder CRUD via Telegram, fire-reminder handler, sender (text/image/video), retry policy, run history.

Description
No description provided
Readme 1.5 MiB
Languages
TypeScript 97.8%
Shell 1.2%
Dockerfile 0.5%
CSS 0.5%