docs: consolidate windowed-fanout spec/plan with ETA + paused/resume

Folds in three rounds of requirement evolution:

* Pause/resume on window close (was stop-and-report-partial).
* ETA preview pill at compose / edit time so the operator sees
  whether their chosen window will fit before scheduling.
* Interactive paused-run banner with Resume / Cancel buttons on the
  detail page; pause notification deep-links to it.

Helper relocations:

* windowEndAt() moves to packages/shared so both bot fire-reminder
  and the web ETA pill can import the same calculator.

Plan grows from 8 to 10 tasks: adds Task 9 (run-eta + RunEtaPill,
TDD) and Task 10 (resume/cancel actions + PausedRunBanner).
Acceptance gains two paused-flow smoke tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
yiekheng 2026-05-10 14:33:51 +08:00
parent c4d4f1dda7
commit 082a70db06
2 changed files with 877 additions and 93 deletions

File diff suppressed because it is too large

@ -100,6 +100,34 @@ cannot finish inside the window, send what we can and stop.
too would mean holding messages from a 4am cron miss-fire until
6am, which is a v2 conversation.
## Estimated finish time (ETA preview)
The operator sets the delivery window, but they need a feel for
whether the window is *enough*. We surface an ETA at compose and
edit time so they can widen the window (or shrink the run) before
hitting Schedule.
- A pure helper `estimateRunDuration({ targetCount, ratePerMinute })`
returns `{ durationMinutes, estimatedFinishAt }` given a fire time.
Calculation: `ceil(targetCount / ratePerMinute)` minutes plus a
15% buffer for per-group setup latency, with a 1-minute floor.
- Default `ratePerMinute` mirrors `BOT_MAX_SEND_PER_MINUTE` (40). The
number is hard-coded into the web bundle as a constant, because the
web side does NOT read bot env directly; operators who tune the bot
env are responsible for redeploying web with the matching value.
- Displayed in two places:
- **Wizard Review step**, between the recipients summary and the
Schedule button:
`"~28 minutes · finishes ~10:32 (Asia/Kuala_Lumpur)"`
- **Edit Groups** and **Edit When** pages, near the save button.
- Style:
- Green pill `"Fits in window"` when `estimatedFinishAt <= windowEndAt`.
- Amber pill `"Likely to pause"` when `estimatedFinishAt > windowEndAt`,
with a one-line suggestion: *"Widen the window or split into smaller runs."*
- The ETA is advisory, not a hard gate — the operator can still
schedule a run that's likely to pause; pause-and-resume covers
that case. The ETA just removes the surprise.
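One way to read the calculation and pill rule above is sketched below. The `fireAt` parameter, the `pillState` helper, and the rounding order (round the base up, apply the 15% buffer, round up again) are assumptions made for illustration; the spec only says the helper works "given a fire time". Returning 0 for an empty run follows the test plan rather than the 1-minute floor.

```typescript
// Sketch only: `fireAt`, `pillState`, and the rounding order are assumptions.
const DEFAULT_RATE_PER_MINUTE = 40; // mirrors BOT_MAX_SEND_PER_MINUTE, hard-coded for web

interface EtaInput {
  targetCount: number;
  ratePerMinute?: number;
  fireAt: Date; // assumed parameter carrying the fire time
}

interface EtaResult {
  durationMinutes: number;
  estimatedFinishAt: Date;
}

function estimateRunDuration(input: EtaInput): EtaResult {
  const ratePerMinute = input.ratePerMinute ?? DEFAULT_RATE_PER_MINUTE;
  if (!(ratePerMinute > 0)) {
    throw new RangeError("ratePerMinute must be greater than 0");
  }
  if (input.targetCount <= 0) {
    // Empty run: nothing to send, so no floor is applied (assumption).
    return { durationMinutes: 0, estimatedFinishAt: new Date(input.fireAt.getTime()) };
  }
  const baseMinutes = Math.ceil(input.targetCount / ratePerMinute);
  // 15% buffer, computed on integers to avoid floating-point drift.
  const buffered = Math.ceil((baseMinutes * 115) / 100);
  const durationMinutes = Math.max(1, buffered); // 1-minute floor
  const finishMs = input.fireAt.getTime() + durationMinutes * 60 * 1000;
  return { durationMinutes, estimatedFinishAt: new Date(finishMs) };
}

type PillState = "fits" | "likely-to-pause";

// Green when the estimate lands inside the window, amber otherwise.
function pillState(estimatedFinishAt: Date, windowEndAt: Date): PillState {
  return estimatedFinishAt.getTime() <= windowEndAt.getTime() ? "fits" : "likely-to-pause";
}
```

Under this reading, 1000 groups at 40 per minute gives 25 base minutes and 29 minutes after the buffer, which matches the figure used in the test plan.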
## Run loop changes (`fire-reminder.ts`)
Up-front, once per run:
@ -182,22 +210,36 @@ UI surfaces of paused runs:
Success/Partial/Failed/Skipped/Archived filters. Resume button
inline on each paused row.
- Reminder detail page's run history shows the same Resume button on
paused rows.
paused rows. A prominent banner at the top of the detail page
surfaces the latest paused run with two buttons side-by-side:
**Resume** (re-enqueues the run via the action above) and
**Cancel run** (marks the run `partial` so it stops appearing as
paused; pending targets flip to `skipped` with `error="canceled by
operator"`). The banner is the operator's "interactive"
resume/cancel choice referenced from the pause notification.
- The `reminder.fired` SSE event for status=paused triggers a
notification with title "Reminder paused" and body
`"X of Y groups delivered. Resume from the Activity tab."`
`"X of Y groups delivered. Tap to resume or cancel."` Clicking the
notification deep-links to the detail page where the banner lives.
Note on the Notifications API: page-side `new Notification()` does
not support inline action buttons (only notifications shown through
a service worker registration do). The "interactive" choice is
therefore one click into the detail page: fewer surfaces to keep in
sync, no service worker required.
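The Resume and Cancel run semantics described for the banner can be sketched as two pure functions over an in-memory run. The `Run` and `RunTarget` shapes, the `sent` status name, and both function names are hypothetical; the real actions would persist these changes and re-enqueue the job.

```typescript
// Sketch only: shapes and names are illustrative, not the project's schema.
type TargetStatus = "pending" | "sent" | "failed" | "skipped";
type RunStatus = "paused" | "partial" | "success" | "failed";

interface RunTarget {
  groupId: string;
  status: TargetStatus;
  error?: string;
}

interface Run {
  status: RunStatus;
  targets: RunTarget[];
}

// Cancel run: the run resolves `partial`; pending targets flip to `skipped`.
function cancelPausedRun(run: Run): Run {
  if (run.status !== "paused") {
    throw new Error("only a paused run can be canceled");
  }
  return {
    status: "partial",
    targets: run.targets.map((t): RunTarget =>
      t.status === "pending"
        ? { ...t, status: "skipped", error: "canceled by operator" }
        : t
    ),
  };
}

// Resume: only `pending` rows are re-attempted; `failed` rows stay failed.
function targetsToResume(run: Run): RunTarget[] {
  if (run.status !== "paused") return [];
  return run.targets.filter((t) => t.status === "pending");
}
```

Both functions return new values instead of mutating the input, which keeps them easy to unit-test alongside the banner component.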
## Notification body
The existing `reminder.fired` SSE event already carries `{ status }`.
The notification mapper extends:
We extend it to carry `sent` and `total` counts so the notification
can be specific. The notification mapper:
- `success` → unchanged.
- `partial` → body mentions delivered/total counts when present.
- `paused` → headline `"Reminder paused"`, body
`"X of Y groups delivered. Resume from the Activity tab."` Click
takes the operator to the reminder's detail page where the Resume
button lives.
`"X of Y groups delivered. Tap to resume or cancel."` Click
takes the operator to the reminder's detail page where the
Resume / Cancel banner lives.
- `failed` → unchanged.
- `skipped` → still filtered (bookkeeping noise).
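The mapping above could look roughly like the sketch below. The `paused` title and body come from the spec; the titles and bodies for `success`, `partial`, and `failed` are placeholders, since the spec only says those stay unchanged or mention the counts. The event and result type names are hypothetical.

```typescript
// Sketch only: type names and the non-paused strings are placeholders.
type FiredStatus = "success" | "partial" | "paused" | "failed" | "skipped";

interface ReminderFiredEvent {
  status: FiredStatus;
  sent?: number;
  total?: number;
}

interface NotificationContent {
  title: string;
  body: string;
}

function mapFiredEvent(e: ReminderFiredEvent): NotificationContent | null {
  const hasCounts = typeof e.sent === "number" && typeof e.total === "number";
  const counts = hasCounts ? `${e.sent} of ${e.total} groups delivered.` : "";
  switch (e.status) {
    case "skipped":
      return null; // bookkeeping noise, never shown
    case "paused":
      return {
        title: "Reminder paused",
        body: `${counts} Tap to resume or cancel.`.trim(),
      };
    case "partial":
      return {
        title: "Reminder partially delivered", // placeholder title
        body: hasCounts ? counts : "Some groups were not delivered.", // placeholder fallback
      };
    case "success":
      return { title: "Reminder sent", body: "All groups delivered." }; // placeholder
    case "failed":
      return { title: "Reminder failed", body: "No groups were delivered." }; // placeholder
  }
}
```

Returning `null` for `skipped` keeps the filtering decision inside the mapper, so the caller only has to check for a non-null result before showing a notification.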
@ -211,13 +253,16 @@ The notification mapper extends:
| `apps/bot/src/scheduler/rate-limiter.ts` (new) | per-account token bucket | ~60 |
| `apps/bot/src/scheduler/media-upload-cache.ts` (new) | `prepareWAMessageMedia` results, keyed by mediaId | ~50 |
| `apps/bot/src/scheduler/delivery-window.ts` (new) | pure window-end calculator | ~30 |
| `apps/bot/src/scheduler/fire-reminder.ts` (rewrite) | new loop using all of the above | ~200 |
| `apps/bot/src/scheduler/fire-reminder.ts` (rewrite) | new loop using all of the above | ~220 |
| `apps/bot/src/scheduler/reminder-jobs.ts` | `teamSize` config | <10 |
| `apps/bot/src/env.ts` | `BOT_FIRE_CONCURRENCY`, `BOT_MAX_SEND_PER_MINUTE`, `BOT_GROUP_CONCURRENCY` | <20 |
| `apps/web/src/actions/reminders.ts` | accept the two new fields | <30 |
| `apps/web/src/actions/reminders.ts` | accept the two new fields + `resumeReminderRunAction` + `cancelReminderRunAction` | ~80 |
| `apps/web/src/components/reminder-wizard/when-form-client.tsx` | "Delivery hours" inputs | <40 |
| `apps/web/src/components/reminder-edit/edit-when-form.tsx` | same | <30 |
| `apps/web/src/lib/notifications.ts` | partial-status body extension | <15 |
| `apps/web/src/lib/run-eta.ts` (new) | pure ETA calculator | ~40 |
| `apps/web/src/components/reminder-wizard/run-eta-pill.tsx` (new) | shared green/amber pill component | ~50 |
| `apps/web/src/components/reminder-detail/paused-run-banner.tsx` (new) | "Resume / Cancel run" banner | ~70 |
| `apps/web/src/lib/notifications.ts` | paused + partial body extension | <30 |
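For the per-account token bucket listed for `rate-limiter.ts`, a minimal sketch might look like the following. The class name, the constructor parameters, and the choice to pass the clock in as a millisecond argument are assumptions made to keep the example deterministic; the real module may be shaped differently.

```typescript
// Sketch only: a per-account token bucket with an injected clock.
interface BucketState {
  tokens: number;
  lastRefillMs: number;
}

class AccountRateLimiter {
  private readonly buckets = new Map<string, BucketState>();
  private readonly capacity: number;
  private readonly refillPerMinute: number;

  constructor(capacity: number, refillPerMinute: number) {
    this.capacity = capacity;
    this.refillPerMinute = refillPerMinute;
  }

  // Returns true and consumes one token if the account may send now.
  tryTake(accountId: string, nowMs: number): boolean {
    let bucket = this.buckets.get(accountId);
    if (!bucket) {
      // Each account starts with a full bucket of its own.
      bucket = { tokens: this.capacity, lastRefillMs: nowMs };
      this.buckets.set(accountId, bucket);
    }
    const elapsedMs = Math.max(0, nowMs - bucket.lastRefillMs);
    const refill = (elapsedMs / 60000) * this.refillPerMinute;
    bucket.tokens = Math.min(this.capacity, bucket.tokens + refill);
    bucket.lastRefillMs = nowMs;
    if (bucket.tokens >= 1) {
      bucket.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

Because each account has its own bucket, two reminders on different accounts do not consume each other's budget, which is the behavior the acceptance list expects.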
## Tests
@ -232,8 +277,18 @@ The notification mapper extends:
- `media-upload-cache.test.ts` — mock socket: `prepare` called once
per unique mediaId regardless of how many groups consume it.
- `fire-reminder.test.ts` (extend) — window-end gate marks remaining
targets `skipped`; partial-status error_summary includes account /
delivered / total context.
targets `skipped` (failed-from-the-start path) or leaves them
`pending` and resolves the run `paused` when at least one send
succeeded; resume re-attaches and only re-attempts `pending` rows.
- `run-eta.test.ts` — pure ETA helper: 1000 groups @ 40/min returns
~29 minutes (with the 15% buffer), edge cases (0 groups → 0,
rate=0 → throws, fractional minutes → rounded up).
- `notifications.test.ts` (extend) — `paused` body reads
`"X of Y groups delivered. Tap to resume or cancel."`; `partial`
body uses sent/total when present.
- `paused-run-banner.test.tsx` — banner only renders when the latest
run's status is `paused`; Resume click triggers the action;
Cancel click triggers the cancel action.
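The "prepare called once per unique mediaId" property from the `media-upload-cache.test.ts` item can be illustrated with a small promise cache. The `prepare` callback is injected here so the sketch does not assume anything about the signature of `prepareWAMessageMedia`; the class name and generic parameter are hypothetical.

```typescript
// Sketch only: memoizes an injected async `prepare` function by mediaId.
class MediaUploadCache<T> {
  private readonly entries = new Map<string, Promise<T>>();
  private readonly prepare: (mediaId: string) => Promise<T>;

  constructor(prepare: (mediaId: string) => Promise<T>) {
    this.prepare = prepare;
  }

  // Concurrent callers for the same mediaId share one in-flight promise.
  get(mediaId: string): Promise<T> {
    let entry = this.entries.get(mediaId);
    if (!entry) {
      entry = this.prepare(mediaId).catch((err: unknown) => {
        // Drop failed uploads so a later call can retry.
        this.entries.delete(mediaId);
        throw err;
      });
      this.entries.set(mediaId, entry);
    }
    return entry;
  }
}
```

Caching the promise rather than the resolved value means groups that ask for the same media while the first upload is still running also reuse it.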
## Tuning knobs (env)
@ -262,6 +317,12 @@ default to 6/18 and can be widened (e.g. 0/24) for a specific big run.
pause mid-fan-out isn't wired).
- **Retry-failed-targets** action (paused-resume only re-attempts
`pending` rows; `failed` rows stay failed).
- **Native push action buttons** (would require a service worker +
push endpoint; v1 keeps the resume/cancel choice on the detail
page, one click away from the notification).
- **Adaptive ETA from observed rate** (today the ETA uses the
configured `BOT_MAX_SEND_PER_MINUTE`; a v2 could feed back the
actual sustained rate from prior runs).
- **Multi-account auto-split** of a single reminder.
- **Adaptive rate limiting** (auto-back-off on WA rate-limit response
codes; today the operator tunes the env var).
@ -270,16 +331,20 @@ default to 6/18 and can be widened (e.g. 0/24) for a specific big run.
- 1000-group reminder with one image, established account: completes
in roughly 30–50 minutes, comfortably inside a 6am–6pm window.
- Wizard Review shows ETA pill before submit. Setting an end hour
that won't fit flips the pill amber and surfaces the "Likely to
pause" hint; the operator can still proceed.
- Two reminders on different accounts firing within seconds of each
other: both progress simultaneously, neither blocks the other.
- A run that hits the window end mid-fan-out: stops cleanly, marks
the run `paused`, leaves un-started targets as `pending`, surfaces
the paused-status notification with delivered/total counts.
- The operator clicks **Resume** on a paused run — fan-out continues
from the unsent targets, respecting the same per-account rate
limit + window. If it again can't finish, it pauses again with an
updated count.
- The detail page surfaces a Resume / Cancel banner for the paused
run. **Resume** re-enqueues; if it pauses again, the banner
re-appears with an updated count. **Cancel run** flips remaining
targets to `skipped` and resolves the run `partial`; banner
disappears.
- A run that hits the window end BEFORE any send (fired too late):
resolves `failed`, no resume offered.
- 355 existing tests still pass; ≈30 new tests cover the new helpers
and the paused/resume flow.
- 355 existing tests still pass; ≈40 new tests cover the new helpers,
the paused/resume flow, the ETA preview, and the banner.
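The two window-end outcomes in the list above (pause when at least one send succeeded, fail when none did) can be sketched as a single pure decision. The type names, the `sent` status name, and the `"delivery window closed"` error text are hypothetical.

```typescript
// Sketch only: decides how a run resolves when the window closes mid-run.
type WindowTargetStatus = "pending" | "sent" | "failed" | "skipped";

interface WindowTarget {
  groupId: string;
  status: WindowTargetStatus;
  error?: string;
}

interface WindowEndResult {
  status: "paused" | "failed";
  targets: WindowTarget[];
}

function resolveAtWindowEnd(targets: WindowTarget[]): WindowEndResult {
  const anySent = targets.some((t) => t.status === "sent");
  if (anySent) {
    // Resumable: un-started targets stay `pending` for a later Resume.
    return { status: "paused", targets };
  }
  // Fired too late: nothing went out, so no resume is offered.
  return {
    status: "failed",
    targets: targets.map((t): WindowTarget =>
      t.status === "pending"
        ? { ...t, status: "skipped", error: "delivery window closed" } // placeholder text
        : t
    ),
  };
}
```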