# Windowed, Pacing-Safe Reminder Fan-Out

> Design spec for a faster, ban-safe, multi-account-friendly reminder
> delivery loop. Written 2026-05-10. Implementation tracked in a
> follow-up plan doc.

## Goal

Deliver a reminder to many groups (target: 1000+) safely within a per-reminder delivery window. If we cannot finish in the window, stop, mark the run `partial`, and tell the operator the account is at capacity for this fan-out.

## Constraints

- WhatsApp's anti-spam is the dominant ceiling. For an established account (years of legit history), 30–60 sends/minute is the sustainable safe band; tighter for newer accounts.
- The system runs on a single bot process talking to multiple paired WhatsApp accounts. Each account's Baileys socket is independent.
- Two simultaneous fan-outs on the **same** WhatsApp account would double its effective send rate and risk a ban.
- The operator dropped multi-account fan-out (one reminder splitting across N accounts) earlier this week. We respect that decision: this design does not automatically split work across accounts.

## Approach (selected: B)

**A. Minimal pacing fix.** Drop the rigid 1.5s sleep, add a token-bucket rate limiter, add window-end check, cache DB lookups. Wins ≈30% on text-only reminders; very little on media-heavy ones.

**B. Pacing + media-upload cache + bounded concurrency.** Everything in A, plus upload each unique media file ONCE per run via Baileys' `prepareWAMessageMedia` and reuse the resulting `WAMediaUpload` payload for every group send. Run up to N groups in parallel within one account (parts within a group stay serial so order is preserved). Wins are massive on text + picture: 1000 groups × 5 MB = 5 GB of upload turns into 5 MB. **Recommended.**

**C. Multi-account fan-out.** Dropped per operator decision.

## Per-account isolation (cross-account parallelism)

Today `boss.work()` is called with default `teamSize=1`, so a single fan-out monopolises the whole bot.
Two reminders on **different** accounts queue serially, which surprises the operator.

The new model is **per-account serialization, cross-account parallelism**:

- `teamSize` raised so multiple reminders on different accounts run simultaneously.
- A per-key async mutex keyed by `accountId` wraps the inner work, so two reminders on the **same** account take turns.
- The token-bucket rate limiter is per-account too, so one account's pacing budget never throttles another.

```
pg-boss worker pool (teamSize = BOT_FIRE_CONCURRENCY, default 8)
├─ R1 (account A) ──┐
│                   ├─ per-account-A mutex ──→ serialised within A
├─ R3 (account A) ──┘
│
├─ R2 (account B) ──── per-account-B mutex ──→ parallel with A's
│
└─ R4 (account C) ──── per-account-C mutex ──→ parallel with A and B
```

## Delivery window

Each reminder gets a window in its operator's timezone. If the run cannot finish inside the window, send what we can and stop.

- New columns on `reminders`:
  - `delivery_window_start_hour int default 6`
  - `delivery_window_end_hour int default 18`
  - Both interpreted in the row's existing `timezone` column.
- Validation: `0 ≤ start < end ≤ 24`. Cross-midnight windows (e.g. 22 → 06) are rejected in v1 to keep the math obvious; can be added later if anyone needs them.
- UI uses two number inputs in the When step (and edit-when page).
- `delivery-window.ts` exports a pure helper: `windowEndAt(timezone, endHour, fireAt) → Date`. Returns the end-of-window timestamp for the calendar day `fireAt` falls on, in the given timezone. If `fireAt` is already past that day's end-hour, the returned timestamp is in the past, so the run loop's first iteration sees `now() >= windowEndAt`, marks every target `skipped`, and the run resolves to `failed` (zero sent). That's the right behaviour: "we can't send after window close, even one message".
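
  A minimal sketch of such a helper, assuming only the built-in `Intl.DateTimeFormat` API and no date library. The internal names `zonedParts` and `offsetMs` are hypothetical and not taken from the actual `delivery-window.ts`; only the `windowEndAt(timezone, endHour, fireAt)` shape comes from the spec.

  ```typescript
  // Hypothetical sketch of a pure window-end helper; not the real delivery-window.ts.

  // Wall-clock fields of `instant` as seen in `timeZone`.
  function zonedParts(instant: Date, timeZone: string) {
    const parts = new Intl.DateTimeFormat("en-US", {
      timeZone,
      hourCycle: "h23",
      year: "numeric",
      month: "2-digit",
      day: "2-digit",
      hour: "2-digit",
      minute: "2-digit",
      second: "2-digit",
    }).formatToParts(instant);
    const get = (type: string): number =>
      Number(parts.find((p) => p.type === type)!.value);
    return {
      year: get("year"),
      month: get("month"),
      day: get("day"),
      hour: get("hour") % 24, // guard against engines that print midnight as 24
      minute: get("minute"),
      second: get("second"),
    };
  }

  // Offset of `timeZone` from UTC at `instant`, in milliseconds (east is positive).
  function offsetMs(instant: Date, timeZone: string): number {
    const p = zonedParts(instant, timeZone);
    const wallAsUtc = Date.UTC(p.year, p.month - 1, p.day, p.hour, p.minute, p.second);
    return wallAsUtc - Math.floor(instant.getTime() / 1000) * 1000;
  }

  export function windowEndAt(timezone: string, endHour: number, fireAt: Date): Date {
    // Calendar day that fireAt falls on, in the reminder's timezone.
    const { year, month, day } = zonedParts(fireAt, timezone);
    // Treat the local wall time as if it were UTC; endHour = 24 rolls to the next midnight.
    const wallAsUtc = Date.UTC(year, month - 1, day, endHour, 0, 0);
    // Subtract the zone offset; the second pass settles the result across DST changes.
    let utc = wallAsUtc - offsetMs(new Date(wallAsUtc), timezone);
    utc = wallAsUtc - offsetMs(new Date(utc), timezone);
    return new Date(utc);
  }
  ```

  For example, `windowEndAt("Asia/Kuala_Lumpur", 18, new Date("2026-05-10T01:00:00Z"))` corresponds to 18:00 local time, which is `2026-05-10T10:00:00.000Z` given Kuala Lumpur's fixed UTC+8 offset. A `fireAt` of 20:00 local on the same day yields the same instant, which is then already in the past.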
- **Only the end hour is enforced at runtime in v1.** The start hour is documented on the row but not gated: operators schedule fire times that fall in their band naturally (cron + the picker's default 09:00 time fields land inside 06–18). Enforcing the start too would mean holding messages from a 4am cron misfire until 6am, which is a v2 conversation.

## Run loop changes (`fire-reminder.ts`)

Up-front, once per run:

1. Load all `reminder.targets`, `reminder.messages`, and referenced `media_files` rows into in-memory Maps. Drops ~3000 round-trips to ~3 round-trips for a 1000-group run.
2. Pre-create every `reminder_run_targets` row with `status = "pending"` so progress is observable from the Activity tab while the fan-out is mid-flight.
3. **Pre-upload each unique media** via Baileys' `prepareWAMessageMedia`. Cache the resulting `WAMediaUpload` payload keyed by `mediaId` for the duration of the run.
4. Compute `windowEndAt` and stash it.

Per-target (limited to `BOT_GROUP_CONCURRENCY` parallel groups, default 3):

1. **Window-end gate:** if `Date.now() >= windowEndAt`, mark the target `skipped` with `error="delivery window closed"` and skip it.
2. **Already-sent gate:** if the run-target row is already `sent` (i.e. a retry is replaying), skip.
3. Acquire a token from the per-account rate limiter (default 40 msg/min, configurable via `BOT_MAX_SEND_PER_MINUTE`).
4. `assertSessions(group)`: call once per group, cache for the run.
5. For each part in `reminder.messages`:
   - text → `socket.sendMessage(jid, { text })`
   - media → `socket.sendMessage(jid, uploadedMediaCache[mediaId])`
   - sleep `jitter(200..500 ms)` between parts (replaces the rigid 1.5 s wait; preserves per-chat ordering at WA's natural pace).
6. Update the run-target row to `sent` with latency.

Final status:

- **success**: every target sent.
- **partial**: at least one sent, at least one not (window-close, failed, missing group). `error_summary` reads: `"Delivery window closed at 18:00 (Asia/Kuala_Lumpur).
  412 of 1000 groups delivered. This account is at capacity for this fan-out; consider sending the remainder from another paired account."`
- **failed**: zero sent.

## Notification body

The existing `reminder.fired` SSE event already carries `{ status }`. The web app's notification mapper already handles `partial` with a "see activity" hint. The body is extended to mention "X of Y delivered" when `status === "partial"`.

## Components

| File | Role | LOC est. |
| --- | --- | --- |
| `migrations/0008_*.sql` | add 2 int columns to `reminders` | <20 |
| `packages/db/src/schema.ts` | drizzle alignment | <10 |
| `apps/bot/src/scheduler/per-key-mutex.ts` (new) | accountId-keyed async mutex | ~40 |
| `apps/bot/src/scheduler/rate-limiter.ts` (new) | per-account token bucket | ~60 |
| `apps/bot/src/scheduler/media-upload-cache.ts` (new) | `prepareWAMessageMedia` results, keyed by mediaId | ~50 |
| `apps/bot/src/scheduler/delivery-window.ts` (new) | pure window-end calculator | ~30 |
| `apps/bot/src/scheduler/fire-reminder.ts` (rewrite) | new loop using all of the above | ~200 |
| `apps/bot/src/scheduler/reminder-jobs.ts` | `teamSize` config | <10 |
| `apps/bot/src/env.ts` | `BOT_FIRE_CONCURRENCY`, `BOT_MAX_SEND_PER_MINUTE`, `BOT_GROUP_CONCURRENCY` | <20 |
| `apps/web/src/actions/reminders.ts` | accept the two new fields | <30 |
| `apps/web/src/components/reminder-wizard/when-form-client.tsx` | "Delivery hours" inputs | <40 |
| `apps/web/src/components/reminder-edit/edit-when-form.tsx` | same | <30 |
| `apps/web/src/lib/notifications.ts` | partial-status body extension | <15 |

## Tests

- `delivery-window.test.ts`: pure function. `fireAt` past that day's end hour → returned timestamp is in the past (no roll-over to the next day); cross-midnight windows (start > end) are explicitly rejected by schema validation; timezone offsets handled correctly.
- `rate-limiter.test.ts`: fake-clock token bucket. N tokens drained, then refill rate; backpressure via `acquire()` returning a Promise.
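
  For orientation, a token bucket with an injectable clock, which is what makes a fake-clock test like this practical, could be sketched as follows. The `Clock` shape, the class name, and the constructor parameters are assumptions for illustration; only `acquire()` returning a Promise comes from the spec.

  ```typescript
  // Hypothetical sketch; not the real rate-limiter.ts.
  export interface Clock {
    now(): number; // current time in milliseconds
    sleep(ms: number): Promise<void>;
  }

  export class TokenBucket {
    private tokens: number;
    private lastRefill: number;
    private tail: Promise<void> = Promise.resolve();
    private readonly perMinute: number;
    private readonly capacity: number;
    private readonly clock: Clock;

    constructor(perMinute: number, capacity: number, clock: Clock) {
      this.perMinute = perMinute;
      this.capacity = capacity;
      this.clock = clock;
      this.tokens = capacity; // the bucket starts full
      this.lastRefill = clock.now();
    }

    // Credit tokens earned since the last refill, capped at capacity.
    private refill(): void {
      const now = this.clock.now();
      const earned = ((now - this.lastRefill) * this.perMinute) / 60_000;
      this.tokens = Math.min(this.capacity, this.tokens + earned);
      this.lastRefill = now;
    }

    // Resolves once a token has been taken; callers are served in FIFO order.
    acquire(): Promise<void> {
      const turn = this.tail.then(async () => {
        this.refill();
        if (this.tokens < 1) {
          // Sleep just long enough for the missing fraction of a token to accrue.
          await this.clock.sleep(((1 - this.tokens) * 60_000) / this.perMinute);
          this.refill();
        }
        this.tokens -= 1;
      });
      this.tail = turn;
      return turn;
    }
  }
  ```

  With the default of 40 sends per minute, a drained bucket releases one token every 1.5 s; the capacity bounds how large the initial burst can be.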
- `per-key-mutex.test.ts`: different keys do NOT block each other (parallelism); same key DOES (serialisation); a throwing handler releases the lock; cleanup removes empty entries.
- `media-upload-cache.test.ts`: with a mock socket, `prepare` is called once per unique mediaId regardless of how many groups consume it.
- `fire-reminder.test.ts` (extend): window-end gate marks remaining targets `skipped`; partial-status error_summary includes account / delivered / total context.

## Tuning knobs (env)

| Var | Default | Effect |
| --- | --- | --- |
| `BOT_FIRE_CONCURRENCY` | 8 | pg-boss worker pool size; max accounts running simultaneously |
| `BOT_GROUP_CONCURRENCY` | 3 | per-account parallel group sends |
| `BOT_MAX_SEND_PER_MINUTE` | 40 | per-account token-bucket rate; loosen to 60 if no flags after weeks of running, tighten to 20 if any rate-limit response |

Per-reminder `delivery_window_start_hour` / `delivery_window_end_hour` default to 6/18 and can be widened (e.g. 0/24) for a specific big run.

## Out of scope (v2 candidates)

- **Crash resumability across bot restarts.** Today, if the bot dies mid-fan-out, pg-boss will retry the job; the loop will skip any rows already marked `sent`, but the in-memory rate-limiter and upload-cache state are gone, meaning the retry uploads media again and starts pacing from a full bucket. Acceptable for v1.
- **Pause / resume mid-run** controls.
- **Cross-day window resume** (current design hard-stops at window end and reports partial; doesn't queue the remainder for tomorrow).
- **Multi-account auto-split** of a single reminder.
- **Adaptive rate limiting** (auto-back-off on WA rate-limit response codes; today the operator tunes the env var).

## Acceptance

- 1000-group reminder with one image, established account: completes in roughly 30–50 minutes, comfortably inside a 6am–6pm window.
- Two reminders on different accounts firing within seconds of each other: both progress simultaneously, neither blocks the other.
- A run that hits the window end: stops cleanly, marks remaining as skipped, surfaces the partial-status message in the Activity tab and via the browser notification.
- 355 existing tests still pass; ≈25 new tests cover the new helpers.
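
The per-account serialisation that the second acceptance criterion relies on can be illustrated with a small promise-chaining mutex. This is a sketch under assumptions: the names `PerKeyMutex` and `runExclusive` are hypothetical and not taken from the actual `per-key-mutex.ts`.

```typescript
// Hypothetical sketch; not the real per-key-mutex.ts.
export class PerKeyMutex {
  // For each key, a promise that settles when the most recently queued task finishes.
  private tails = new Map<string, Promise<void>>();

  async runExclusive<T>(key: string, task: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(key) ?? Promise.resolve();
    let release!: () => void;
    const current = new Promise<void>((resolve) => {
      release = resolve;
    });
    // Later callers on the same key wait on this tail; other keys are unaffected.
    const tail = previous.then(() => current);
    this.tails.set(key, tail);

    await previous;
    try {
      return await task();
    } finally {
      // Runs on success and on throw, so a failing task never holds the lock.
      release();
      // Drop the entry if nobody queued behind us, so the map does not grow.
      if (this.tails.get(key) === tail) this.tails.delete(key);
    }
  }

  get size(): number {
    return this.tails.size;
  }
}
```

In the worker, the job handler would wrap its body roughly as `mutex.runExclusive(accountId, () => fireReminder(job))`, where `fireReminder` stands in for whatever the real handler is called.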