cm_whatsapp_bot_v1/docs/superpowers/specs/2026-05-09-web-app-design.md
yiekheng 3e2bc8c7ee docs: web app design (Telegram-free pivot, plan 3 spec)
After live-testing the Telegram bot we hit limits that don't go away with
more menu polish (Markdown fragility, callback_data limits, no native
date pickers, awkward media UX). Pivot to a Next.js PWA installable on
the operator's phone; remove Telegram entirely.

Spec covers: service topology with bot codebase shrunk, no-auth access
stance with rate limiting + reverse-proxy gating, Server Actions
replacing public REST mutation endpoints, SSE for live updates, the new
web-side pair flow with live QR display, multi-step reminder wizard
backed by URL state, mobile-first shadcn/ui visual layer, PWA service
worker via @serwist/next, and a step-by-step plan to delete the existing
Telegram code first.

Inherits all confirmed values from the 2026-05-03 master spec.
2026-05-09 22:15:51 +08:00

# Web App Design — Telegram-Free Pivot
**Status:** Draft
**Date:** 2026-05-09
**Supersedes:** Sections of `2026-05-03-whatsapp-bot-design.md` that describe Telegram as the primary control surface.
## 1. Why this exists
After live-testing the Telegram bot we hit limits that don't go away with more menu polish:
- Markdown parsing is fragile; user content breaks rendering.
- callback_data is capped at 64 bytes; complex flows need stateful workarounds.
- No native date/time picker; we rebuilt a year/month/day grid by hand.
- Media UX (uploading photos, previewing video) is awkward in chat.
- Keyboard navigation through deep menus is slow for daily use.
The operator wants to install the controls **as a Progressive Web App on his phone** so they look and feel native. This document describes the migration: the Telegram bot is removed entirely and replaced by a Next.js PWA.
## 2. Stakeholders & access
- **Operator (brother):** sole end-user. Uses the PWA daily.
- **Developer (you):** builds, deploys, occasionally debugs.
- **No login.** The web app is reachable only at `https://wabot.04080616.xyz`. Anyone who can resolve that hostname and reach port 443 has full control. A single seeded `operators` row in Postgres represents the brother for audit purposes; the app does not authenticate requests and instead trusts the network perimeter (aaPanel reverse proxy + HTTPS).
This is an explicit trade-off. Risk: a leaked URL means full access. Mitigation: rotate by changing the subdomain. Defense in depth comes from rate limiting and the Origin/Referer checks described in §5.
## 3. Tech stack
- **Next.js 16 (App Router)**, TypeScript end-to-end.
- **Tailwind CSS v4** + **shadcn/ui** components (latest registry).
- **Geist** font via `next/font`.
- **react-hook-form** + **zod** for forms (same zod schemas validated client-side and re-validated in server actions).
- **`@serwist/next`** for PWA service worker.
- **Drizzle ORM** (already in `packages/db`).
- **`pg`** for `LISTEN` in the SSE endpoint.
No new database tables; all reads/writes hit the existing schema from `2026-05-03-whatsapp-bot-design.md` §9.
## 4. Service topology
```
┌─────────────────────────────────────────────────────┐
│                 Home Docker server                  │
│                                                     │
│  ┌──────────────┐         ┌──────────────┐          │
│  │ web          │         │ bot          │          │
│  │ Next.js 16   │◄───────►│ Node.js      │          │
│  │ PWA + UI     │   via   │ Baileys      │          │
│  │ Server       │ Postgres│ pg-boss      │          │
│  │ Components   │ (LISTEN/│ sender       │          │
│  │ + Server     │  NOTIFY)│ ipc          │          │
│  │ Actions      │         │              │          │
│  └──────┬───────┘         └──────┬───────┘          │
│         │                        │                  │
│         │  shared volume:        │ /data/sessions   │
│         │  /data/media           │                  │
│         │                        │                  │
│         └────────────┬───────────┘                  │
│                      │                              │
│                      ▼                              │
│     aaPanel reverse proxy ─► wabot.04080616.xyz     │
└─────────────────────────────────────────────────────┘
              Postgres at 192.168.0.210
```
### Container responsibilities (post-pivot)
| Container | Role |
|---|---|
| `web` | All UI (Server Components for reads; Server Actions for mutations); SSE endpoint for live events; PWA service worker; QR PNG rendering for pair flow. |
| `bot` | Baileys WhatsApp sessions; pg-boss scheduler; fire-reminder; sender; group sync; IPC consumer that listens to `bot.command` Postgres notifications and dispatches to the right module. **No more Telegram code.** |
### Web ↔ bot channel
Same as before — Postgres `LISTEN/NOTIFY`. Web writes a row + `pgNotify('bot.command', ...)`; bot consumes; bot writes results back + `pgNotify('web.event', ...)`; web's SSE endpoint relays to the open browser.
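A minimal sketch of how either side might build a notification before sending it. The `encodeNotify` name and the envelope shape are hypothetical; `pg_notify(channel, payload)` is the standard PostgreSQL function form of `NOTIFY`, and PostgreSQL's default configuration requires the payload to be shorter than 8000 bytes. Larger values (a base64 QR PNG could get close) may need to be written to a row and referenced by ID instead.

```typescript
// Hypothetical envelope for messages on the two channels named in this spec.
type ChannelName = "bot.command" | "web.event";

interface NotifyEnvelope {
  type: string; // e.g. "account.start_pairing" or "session.qr"
  [key: string]: unknown;
}

// PostgreSQL's default configuration requires NOTIFY payloads to be shorter than 8000 bytes.
const MAX_NOTIFY_BYTES = 7999;

// Builds the SQL text and bind parameters; the caller runs them with its own PG client.
function encodeNotify(
  channel: ChannelName,
  envelope: NotifyEnvelope,
): { sql: string; params: [string, string] } {
  const json = JSON.stringify(envelope);
  const bytes = new TextEncoder().encode(json).length;
  if (bytes > MAX_NOTIFY_BYTES) {
    throw new Error(`NOTIFY payload is ${bytes} bytes; limit is ${MAX_NOTIFY_BYTES}`);
  }
  // pg_notify() accepts bind parameters, unlike the bare NOTIFY statement.
  return { sql: "SELECT pg_notify($1, $2)", params: [channel, json] };
}
```

Keeping the size check next to serialization means an oversized payload fails in the sender, where it can be logged, rather than as a Postgres error.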
## 5. Access stance (no auth)
- No login screen, no session cookies, no CSRF tokens (server actions handle their own).
- aaPanel is the only ingress. Bot's `:8081` health port is unreachable from outside.
- **Rate limit at Next.js middleware**: 30 requests / 10 sec per source IP. Anything more = 429.
- **Origin/Referer check on Server Actions**: Next.js 16 enforces this by default for actions; we leave it on.
- **Rotate the subdomain** if the URL ever leaks.
- **Audit log** is still written for every action, with `operator_id` set to the single seeded operator row.
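The rate limit above could be sketched as a fixed-window counter keyed by source IP. This is one possible shape, not the chosen implementation: the `FixedWindowLimiter` class is hypothetical, it keeps state in process memory (so it assumes a single long-lived `web` process), and it never evicts idle IPs.

```typescript
interface WindowState {
  windowStart: number; // ms timestamp when the current window opened
  count: number;       // requests seen in the current window
}

class FixedWindowLimiter {
  private readonly hits = new Map<string, WindowState>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit = 30, windowMs = 10_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request may proceed, false if the caller should answer 429.
  allow(ip: string, now: number = Date.now()): boolean {
    const state = this.hits.get(ip);
    if (state === undefined || now - state.windowStart >= this.windowMs) {
      // First request from this IP, or the previous window has expired.
      this.hits.set(ip, { windowStart: now, count: 1 });
      return true;
    }
    state.count += 1;
    return state.count <= this.limit;
  }
}
```

A fixed window can admit up to twice the limit across a window boundary; a sliding window would be stricter if that matters for this deployment.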
## 6. Routes
```
/                        Dashboard (overview cards)
/accounts                Accounts list
/accounts/new            Pair new account (live QR)
/accounts/[id]           Account detail
/accounts/[id]/groups    Groups list (paginated/searchable)
/accounts/[id]/pairing   Live pair flow page (QR + status)
/groups/[id]             Group detail + send test
/reminders               Reminders list
/reminders/new           Reminder wizard (?step=1..5)
/reminders/[id]          Reminder detail (history, edit, delete)
/settings                Profile (display name, default timezone)
# Server-side only — no public REST API for mutations:
GET /api/events Single SSE stream (read-only, public-safe)
```
All other `/api/*` paths return 404 (configured at aaPanel and as middleware match-all).
### Why no REST API for mutations
- Server Actions in Next.js 16 are first-class. They post to the page's own URL with an encrypted action ID, run on the server, return a serializable result, and integrate with `revalidatePath` / `revalidateTag` automatically.
- The browser never sees `/api/reminders` etc. as discoverable URLs.
- Type-safe end-to-end: the server action's return type flows through to the calling component without manual fetch boilerplate.
## 7. Live updates (SSE)
`GET /api/events` streams Server-Sent Events. The handler:
1. Connects to Postgres with a dedicated client (`LISTEN web.event`).
2. Forwards each notification's payload to the client as an SSE message:
```
event: session.qr
data: {"accountId":"...", "qrPng":"<base64>"}

event: session.connected
data: {"accountId":"...", "phoneNumber":"+60..."}

event: groups.synced
data: {"accountId":"...", "count":12}

event: reminder.fired
data: {"reminderId":"...", "runId":"...", "status":"success"}

event: reminder.failed
data: {"reminderId":"...", "error":"..."}

event: session.disconnected
data: {"accountId":"..."}

```
3. On disconnect, releases the PG client.
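Step 2 above can be sketched as a pure function from a `NOTIFY` payload to an SSE frame. It assumes the payload is a JSON object whose `type` field becomes the SSE event name, as in the pair-flow notifications; both function names are hypothetical.

```typescript
// Serializes one SSE message. A blank line terminates the message on the wire.
function formatSseFrame(event: string, data: Record<string, unknown>): string {
  // JSON.stringify without an indent argument never emits raw newlines,
  // so a single "data:" line is always enough.
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Converts a raw 'web.event' notification payload into an SSE frame.
function notificationToFrame(payload: string): string {
  const parsed = JSON.parse(payload) as { type: string } & Record<string, unknown>;
  const { type, ...rest } = parsed;
  return formatSseFrame(type, rest);
}
```

The route handler would write each returned string to the response stream as notifications arrive.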
Client side: a single `useEvents()` hook opens the stream once at app mount. Each event triggers `queryClient.invalidateQueries` for the relevant key — the React Query cache stays fresh without polling.
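One way `useEvents()` might decide what to invalidate is a plain mapping from event type to query keys. The key names below are hypothetical, since this spec does not fix a query-key scheme; each returned key would be passed to `queryClient.invalidateQueries`.

```typescript
// Hypothetical query-key scheme: ["accounts"], ["accounts", id, "groups"], ["reminders"], ...
function keysToInvalidate(
  eventType: string,
  data: { accountId?: string; reminderId?: string },
): string[][] {
  const keys: string[][] = [];
  switch (eventType) {
    case "session.connected":
    case "session.disconnected":
      keys.push(["accounts"]);
      break;
    case "groups.synced":
      if (data.accountId) keys.push(["accounts", data.accountId, "groups"]);
      break;
    case "reminder.fired":
    case "reminder.failed":
      keys.push(["reminders"]);
      if (data.reminderId) keys.push(["reminders", data.reminderId]);
      break;
    // "session.qr" is rendered directly by the pairing page; nothing to refetch.
  }
  return keys;
}
```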
## 8. Pair flow (replaces Telegram QR delivery)
```
Operator on /accounts → tap "Pair New Account" → /accounts/new
  Form: { label: string }
  Submit → server action: pairAccountAction(label)
    ├─ Insert whatsapp_accounts row { status:'pending', label }
    └─ pgNotify('bot.command', { type:'account.start_pairing', accountId })
  Server action returns { accountId } and the page redirects to /accounts/[id]/pairing

/accounts/[id]/pairing (server component renders shell + client island for SSE)
  Shows label, account ID, "Waiting for QR…" shimmer
  Client component subscribes to SSE

Bot's IPC consumer picks up notification:
  sessionManager.start(accountId)
  Listens for Baileys events:
    qr → render PNG (base64) → pgNotify('web.event', { type:'session.qr', qrPng })
    open → update DB + sync groups → pgNotify('session.connected') + 'groups.synced'
    close (loggedOut) → pgNotify('session.timeout')

Browser:
  On 'session.qr': replace shimmer with <img src="data:image/png;base64,..."> + 30s countdown ring
  On 'session.connected': show ✅ Connected as +60xxx + auto-redirect to /accounts/[id] after 3s
  On 'session.timeout' or 5-min server timer: show "Pairing timed out" + "Try again" button
```
The 5-min server-side timeout from plan 2 stays (in `bot`). On timeout the bot deletes the pending row and pgNotifies `session.timeout`.
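The browser side of the flow can be modeled as a small reducer over the three pairing events. This is a sketch under assumptions: the `PairState` phases and the `pairReducer` name are hypothetical, and treating `connected` and `timed_out` as terminal is a design choice, not something this spec mandates.

```typescript
type PairEvent =
  | { type: "session.qr"; qrPng: string }
  | { type: "session.connected"; phoneNumber: string }
  | { type: "session.timeout" };

type PairState =
  | { phase: "waiting" }                       // "Waiting for QR…" shimmer
  | { phase: "qr"; qrSrc: string }             // QR image + countdown ring
  | { phase: "connected"; phoneNumber: string } // success, redirect follows
  | { phase: "timed_out" };                    // "Try again" view

function pairReducer(state: PairState, event: PairEvent): PairState {
  // Terminal phases ignore late events (e.g. a stale QR arriving after connect).
  if (state.phase === "connected" || state.phase === "timed_out") return state;
  switch (event.type) {
    case "session.qr":
      // Each new QR replaces the previous one.
      return { phase: "qr", qrSrc: `data:image/png;base64,${event.qrPng}` };
    case "session.connected":
      return { phase: "connected", phoneNumber: event.phoneNumber };
    case "session.timeout":
      return { phase: "timed_out" };
  }
}
```

In the client island this would typically sit behind `useReducer`, with the SSE subscription dispatching events whose `accountId` matches the page.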
## 9. Reminder wizard (replaces Telegram menu)
`/reminders/new` is one page that uses URL search params for state (`?step=N&...`). Five steps, each rendered server-side, with a server action per step that validates and redirects to the next step's URL.
| Step | Inputs | Notes |
|---|---|---|
| 1 — Account | radio list of paired accounts | shown as cards: label, phone, last connected status |
| 2 — Groups | checkbox list with search | **multi-target** — gain over plan 2's single-group constraint |
| 3 — Compose | textarea + file upload | drag-drop on desktop, native picker on mobile; file uploads go to `/data/media` via server action `uploadMediaAction` |
| 4 — When | `<input type="datetime-local">` + quick-pick chips | native iOS/Android datetime picker; chips for Now / Tomorrow 9 AM / Next Mon 9 AM |
| 5 — Review | rendered summary + [Schedule] | server action `createReminderAction` writes DB + schedules pg-boss job |
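
The quick-pick chips in step 4 could compute their `datetime-local` values as sketched below. Assumptions: all chips are in the default timezone `Asia/Kuala_Lumpur`, which is UTC+8 year-round, so a fixed offset stands in for a timezone library; "Next Mon" on a Monday means one week out; the function names are hypothetical.

```typescript
// Asia/Kuala_Lumpur is UTC+8 with no DST, so a fixed offset is enough here.
const KL_OFFSET_MS = 8 * 60 * 60 * 1000;

// Formats a Date whose UTC fields hold KL wall-clock time as a datetime-local value.
function toDatetimeLocal(wallClock: Date): string {
  const pad = (n: number): string => String(n).padStart(2, "0");
  return (
    `${wallClock.getUTCFullYear()}-${pad(wallClock.getUTCMonth() + 1)}-${pad(wallClock.getUTCDate())}` +
    `T${pad(wallClock.getUTCHours())}:${pad(wallClock.getUTCMinutes())}`
  );
}

function quickPicks(now: Date): { now: string; tomorrow9: string; nextMon9: string } {
  // Shift the instant so that the UTC getters read KL wall-clock fields.
  const kl = new Date(now.getTime() + KL_OFFSET_MS);
  const y = kl.getUTCFullYear();
  const m = kl.getUTCMonth();
  const d = kl.getUTCDate();
  // getUTCDay(): 0 = Sunday ... 6 = Saturday. A Monday maps to 7 days ahead.
  const daysToMonday = (8 - kl.getUTCDay()) % 7 || 7;
  return {
    now: toDatetimeLocal(kl),
    // Date.UTC normalizes day overflow (e.g. day 32 rolls into the next month).
    tomorrow9: toDatetimeLocal(new Date(Date.UTC(y, m, d + 1, 9, 0))),
    nextMon9: toDatetimeLocal(new Date(Date.UTC(y, m, d + daysToMonday, 9, 0))),
  };
}
```
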
Edit-on-the-fly: each step has a "← Edit account / groups / body / time" link that navigates back to that step with the data preserved in URL.
URL-state is sufficient for v1 — small enough to fit in a query string. If we ever need to support multi-MB body content (drafts), we move to a `reminder_drafts` table.
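A minimal sketch of the URL-state round trip using the standard `URLSearchParams` API. The parameter names (`account`, `group`, `body`, `at`) and the `WizardState` shape are assumptions; the spec only fixes `step`. In the real wizard the decoded values would still go through the shared zod schemas.

```typescript
interface WizardState {
  step: number;             // 1..5
  accountId: string | null;
  groupIds: string[];       // multi-target: one "group" param per selected group
  body: string | null;
  sendAt: string | null;    // raw datetime-local value
}

function toQuery(state: WizardState): string {
  const params = new URLSearchParams();
  params.set("step", String(state.step));
  if (state.accountId !== null) params.set("account", state.accountId);
  for (const id of state.groupIds) params.append("group", id);
  if (state.body !== null) params.set("body", state.body);
  if (state.sendAt !== null) params.set("at", state.sendAt);
  return params.toString();
}

function fromQuery(query: string): WizardState {
  const params = new URLSearchParams(query);
  const rawStep = Number(params.get("step") ?? "1");
  // Anything outside 1..5 (or non-numeric) falls back to the first step.
  const step = Number.isInteger(rawStep) && rawStep >= 1 && rawStep <= 5 ? rawStep : 1;
  return {
    step,
    accountId: params.get("account"),
    groupIds: params.getAll("group"),
    body: params.get("body"),
    sendAt: params.get("at"),
  };
}
```

Each step's server action would decode with `fromQuery`, merge its own validated input, and redirect to `/reminders/new?` plus `toQuery(next)`.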
## 10. Visual & layout
- **Mobile-first**. Tailwind breakpoints: `sm:` and up = "desktop layout"; below = single-column with comfortable tap targets (≥44px).
- **shadcn/ui** components throughout. Latest registry: Sidebar, Dialog, Form, DataTable, Sonner (toast), Sheet (mobile drawer), Tabs, Skeleton, Card.
- **Light + dark mode** auto-follows system; manual toggle in `/settings`.
- **Spacing rhythm**: 4 / 8 / 16 / 24 / 32 px.
- **Typography**: Geist (default).
- **Status colors**: green (connected/success), amber (pending/disconnected), red (banned/failed), neutral (ended).
- **Production-grade visual layer is delegated to the `frontend-design:frontend-design` skill during implementation** — it handles spacing, hierarchy, and feel.
### Layout shape
| Viewport | Shell |
|---|---|
| Mobile (<640px) | Top app bar (title + back) + bottom nav (Dashboard / Accounts / Reminders / Settings). Sheets for filters, dialogs for confirms. |
| Desktop (≥640px) | Left sidebar (collapsible) with same nav items + secondary nav for "New Account" / "New Reminder". Main content area with breadcrumbs at top. |
## 11. PWA
- `app/manifest.webmanifest`: name, short name, theme color, 192px + 512px icons, `display: standalone`, `start_url: /`, `background_color`.
- Service worker via **`@serwist/next`** (Workbox successor designed for App Router):
  - Cache app shell (HTML for navigation routes, CSS, JS, fonts) for instant launch after first visit.
  - Network-first for data routes (so live data still wins).
  - Static assets cache-first.
  - Offline fallback page rendered if no network.
- iOS install via `apple-mobile-web-app-capable` + `apple-touch-icon` meta tags.
- "Install on home screen" prompt rendered on the dashboard if `beforeinstallprompt` fires.
## 12. Telegram removal (must happen first in implementation)
The plan-3 implementation **starts** with deleting Telegram-related code so the bot container builds clean afterward.
### Files / modules deleted
- `apps/bot/src/telegram/`: entire directory (bot.ts, callbacks.ts, menus.ts, state.ts, commands/*, middleware/*)
- `apps/bot/src/media/ingest.ts`: Telegram-side download (replaced by web upload action)
- Telegram-specific tests in `apps/bot/src/telegram/**/*.test.ts`
### Files modified
- `apps/bot/src/index.ts`: drop createTelegramBot / tg.start / shutdown.tg.stop. Replace with `startCommandConsumer(boss)` from a new `apps/bot/src/ipc/command-consumer.ts`.
- `apps/bot/package.json`: remove `grammy`, keep `qrcode` (still needed for QR PNG rendering, but **its usage moves**; see below).
- `apps/bot/src/whatsapp/qr-renderer.ts` stays (called from the new IPC consumer's pair-handler).
### New modules
- `apps/bot/src/ipc/command-consumer.ts`: subscribes to Postgres `LISTEN bot.command`, dispatches:
  - `account.start_pairing`: starts Baileys session, wires QR/open/close events to `pgNotify('web.event', …)`
  - `account.unpair`: existing unpair logic
  - `account.sync_groups`: group sync
  - `group.send_test`: existing send-test
- `apps/bot/src/ipc/notify.ts`: typed `pgNotify(event, payload)` helper.
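The dispatch step in the consumer might look like the sketch below. Only the four command type strings come from this spec; the payload fields (`accountId`, `groupId`), the `Handlers` interface, and both function names are hypothetical. Malformed or unknown payloads are dropped rather than thrown, so one bad notification cannot crash the listener.

```typescript
type BotCommand =
  | { type: "account.start_pairing" | "account.unpair" | "account.sync_groups"; accountId: string }
  | { type: "group.send_test"; groupId: string };

// Hypothetical seams onto the existing bot modules.
interface Handlers {
  startPairing(accountId: string): Promise<void>;
  unpair(accountId: string): Promise<void>;
  syncGroups(accountId: string): Promise<void>;
  sendTest(groupId: string): Promise<void>;
}

// Validates a raw NOTIFY payload; returns null for anything unrecognized.
function parseCommand(payload: string): BotCommand | null {
  let raw: unknown;
  try {
    raw = JSON.parse(payload);
  } catch {
    return null;
  }
  if (typeof raw !== "object" || raw === null) return null;
  const obj = raw as Record<string, unknown>;
  const type = obj["type"];
  const accountId = obj["accountId"];
  const groupId = obj["groupId"];
  if (
    (type === "account.start_pairing" || type === "account.unpair" || type === "account.sync_groups") &&
    typeof accountId === "string"
  ) {
    return { type, accountId };
  }
  if (type === "group.send_test" && typeof groupId === "string") {
    return { type, groupId };
  }
  return null;
}

// Returns false when the payload was dropped, true once a handler has finished.
async function dispatch(payload: string, handlers: Handlers): Promise<boolean> {
  const cmd = parseCommand(payload);
  if (cmd === null) return false;
  switch (cmd.type) {
    case "account.start_pairing": await handlers.startPairing(cmd.accountId); break;
    case "account.unpair": await handlers.unpair(cmd.accountId); break;
    case "account.sync_groups": await handlers.syncGroups(cmd.accountId); break;
    case "group.send_test": await handlers.sendTest(cmd.groupId); break;
  }
  return true;
}
```

Passing handlers in as an object keeps the dispatcher testable with a mocked PG client and stub handlers.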
### Env keys removed
- `TELEGRAM_BOT_TOKEN`
- `TELEGRAM_OPERATOR_WHITELIST`
- `TELEGRAM_QR_CHAT_ID`
`SEED_OPERATOR_TELEGRAM_ID` still exists for backwards-compat with the seed script but the value loses its meaning; we keep the seeded operators row for audit log foreign keys.
### Cleanup tests
- All vitest tests under `apps/bot/src/telegram/` deleted along with the source.
- New tests: IPC consumer dispatch tests (mocked PG client), web's pair-flow server action tests (against a real test DB).
## 13. Error handling
| Failure | Detection | Response |
|---|---|---|
| WA send transient | sender throws | pg-boss retries 3× with backoff (already in plan 2). On final failure, the `reminder_run_targets` row gets `status='failed'` and SSE pushes `reminder.failed`, which shows a toast in the UI. |
| WA session lost | Baileys close event | Account row set to `disconnected`. SSE pushes `session.disconnected`, and the status badge in /accounts goes amber. Auto-reconnect after 5 sec. |
| Pair timeout | bot's 5-min timer | Account row deleted. SSE pushes `session.timeout`, and the page navigates to the "try again" view. |
| Server action validation | zod parse fails | Returns `{ ok:false, errors: { field: msg } }`. Form re-renders with field-level errors. |
| Postgres unavailable | drizzle throws | Both containers log error, restart via Docker. UI shows a banner "Reconnecting…" if the SSE channel drops. |
| Media upload exceeds limit (50MB) | server action rejects | Returns error; UI shows "File too large". |
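| SSE channel drops | EventSource fires `error` | Client reconnects automatically: `EventSource` retries on its own after its reconnection delay, which the server can tune with a `retry:` field. Any exponential backoff on top of that is browser-dependent rather than guaranteed. |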
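
The validation row above implies a result type along these lines. This is a sketch: `ActionResult` and `validatePairForm` are hypothetical names, the 64-character cap is an illustrative assumption, and the hand-written checks stand in for the shared zod schema the real server action would call.

```typescript
// Discriminated union returned by server actions: callers branch on `ok`.
type ActionResult<T> =
  | { ok: true; data: T }
  | { ok: false; errors: Record<string, string> };

// Stand-in for a zod schema: validates the pair form's single `label` field.
function validatePairForm(input: { label?: unknown }): ActionResult<{ label: string }> {
  const errors: Record<string, string> = {};
  const label = typeof input.label === "string" ? input.label.trim() : "";
  if (label.length === 0) {
    errors["label"] = "Label is required";
  } else if (label.length > 64) {
    // Hypothetical cap, chosen only for illustration.
    errors["label"] = "Label must be 64 characters or fewer";
  }
  return Object.keys(errors).length > 0
    ? { ok: false, errors }
    : { ok: true, data: { label } };
}
```

The form component can map `errors` straight onto react-hook-form's field errors.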
## 14. Observability
- **Logs**: pino JSON to stdout, captured by Docker (unchanged).
- **Health endpoints**:
  - Web: `GET /api/health` returns DB ping + commit SHA + uptime.
  - Bot: internal `:8081/health` returns DB ping + per-WA-session counts.
- **Per-reminder audit trail** stays in DB.
- **Sentry hookup** deferred (out of scope for this design).
## 15. Build, deploy, and dev experience
- New Dockerfile: `docker/web.Dockerfile` (currently a placeholder). Multi-stage: deps → build (`pnpm --filter @cmbot/web build` → `.next/standalone`) → runtime (`node apps/web/.next/standalone/server.js`).
- New service in compose: `web` (replaces the existing placeholder).
- `apps/web/` package: `@cmbot/web`, depends on `@cmbot/db` and `@cmbot/shared` workspaces.
- aaPanel reverse proxy: existing config block updated to forward to `web:3000` and pass through SSE headers; deny `/api/*` except `/api/events`.
Local dev:
- `scripts/dev.sh up` brings web alongside tools + bot.
- Hot reload: web mounts `apps/web/src` (and dependent packages) into the container; `next dev` watches.
## 16. Out of scope (for this plan)
- Recurring reminders (RRULE): same plan-2 deferral; web wizard supports one-off only for now.
- Standalone media library page: media is attached to reminders, not browseable separately yet.
- E2E browser tests (Playwright): manual test runbook in plan 3 covers verification.
- Sentry / external error tracking.
- WebPush notifications (the operator already gets WhatsApp messages on his phone; PWA badging is enough).
- Multi-operator (still single-tenant).
- Passkeys / WebAuthn (only relevant if we add auth later).
## 17. Confirmed values
Inherits from the master spec:
- Subdomain: `wabot.04080616.xyz`
- Default timezone: `Asia/Kuala_Lumpur`
- Postgres: `192.168.0.210` / `wabot`
- Media retention: 90 days
New for this design:
- Component library: shadcn/ui
- Visual style: clean utility / admin dashboard
- Auth: none (URL is the secret)
- Real-time: SSE
- Service worker: `@serwist/next`