172 Commits

Author SHA1 Message Date
adaf087a5f feat(web): drop '· last admin' label from user row
The Demote/Delete buttons are already disabled with proper tooltips
implied by their disabled state; the extra inline label was visual
clutter on the only-admin's own row.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 20:44:41 +08:00
f69652d43b feat(web): AES-GCM cookies + per-username/global rate limit + origin check
Three layers of login hardening pulled together — addresses the
"don't let middleman / robot easily log in by mimicking headers"
follow-up.

1. AES-256-GCM session cookie (apps/web/src/lib/auth-cookie.ts)

   The old format was base64-encoded JSON + HMAC-SHA256 signature, so
   anyone with the cookie could read userId/role straight off the
   bytes. Switched to AES-GCM authenticated encryption: the payload
   is encrypted with a 256-bit key derived from AUTH_SECRET via
   SHA-256, a fresh 12-byte nonce is drawn per encryption (never
   reused — locked in by test), and tampering with either the IV or
   ciphertext fails the GCM auth tag → decrypt throws → null.

   Cookie format: <base64url(iv)>.<base64url(ciphertext+tag)>

   Existing cookies become invalid on deploy because the IV portion
   doesn't decode to 12 bytes — middleware bounces them to /login.
   No env bump needed; users just sign in once with the new secret.

2. Three-layer rate limit on loginAction

   Old: per-IP only. An attacker with a residential-proxy pool or
   spoofed X-Forwarded-For could hop IPs and brute one account.
   New: Promise.all of three checkRateLimit calls
     - per-IP        login:<ip>          10 / 5 min
     - per-username  login-user:<lower>  5 / 15 min
     - global        login-global        100 / min (backstop)
   First-hit wins; logger captures which limit tripped (ip / username
   / global) without telling the attacker which one.

3. Action-level Origin/Host check

   serverActions.allowedOrigins already does this at the framework
   layer; running it inside loginAction lets us log the mismatch and
   reject before bcrypt + DB. Missing Origin treated as same-origin
   (RFC: same-origin POSTs may omit it). Malformed Origin → reject.

Tests:
  - auth-cookie.test.ts updated to AES-GCM (15 tests, +4 vs HMAC):
    fresh IV per encryption, ciphertext doesn't leak userId/role,
    IV-swap rejected, ciphertext-tamper rejected, wrong-length IV
    rejected, malformed b64 doesn't throw.
  - auth.test.ts adds 7 new cases: three-layer key shape, per-username
    limit alone trips, global limit alone trips, cross-origin rejected,
    same-origin accepted, missing-Origin treated as same-origin,
    malformed-Origin rejected.

Web suite 453 → 463 tests, all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 20:41:49 +08:00
6942745085 test(bot): cover the reschedule corner case in scheduleReminderFire
Lock down the pre-send cancel that fixed the dropped 8:20 PM fire:

  - cancel UPDATE always runs BEFORE boss.send (regression: stately
    dedupe silently rejected the new send when a stale created job
    existed; now we tombstone the stale row first)
  - cancel scopes to state='created' only (active and completed jobs
    must survive — they're in-flight or historical)
  - cancel filters by THIS reminder's singletonKey (no cross-reminder
    cancellation)
  - boss.send still receives singletonKey + startAfter + retryLimit
  - first-time schedule (zero stale rows) still calls send
  - cancel UPDATE error degrades to "send anyway" — the handler-level
    recent-run dedupe will catch any duplicate that lands
  - boss.send returning null is surfaced (so the caller's logger
    captures jobId: null instead of silently treating it as success)

77 bot tests now (was 70).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 20:33:29 +08:00
2e6fbfa7a5 fix(bot): reschedule was silently dropped under stately policy
Scheduled reminder for May 10 8:20 PM never fired. Bot logs showed
"reminder.fire: scheduled" with jobId: null at 12:18 UTC — pg-boss
returned null because the queue was on policy=stately, which dedupes
sends across the (created/active/retry) state cone by singletonKey.
A previous schedule for the same reminder (next recurring fire,
created earlier) was still in 'created' state, so the new send for
today 8:20 PM hit the dedupe and was silently rejected.

Two fixes:

  1. Switch the queue policy back to 'standard' (the default) and
     force-flip any existing 'stately' queue row on boot. Standard
     lets us enqueue across reschedules.
  2. scheduleReminderFire now does a pre-send cancel: any 'created'
     job for this singletonKey is moved to 'cancelled' before the new
     boss.send. The new schedule wins; old stale jobs are tombstoned
     so the recurring/edit path produces exactly-one upcoming fire.

Duplicate-fire safety (the 'qwerd msg three times' bug) is already
covered at the handler level by the inner-mutex recent-run check
inside fireReminderInner — that's what stately was guarding against,
and the inner check works under standard too.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 20:27:52 +08:00
991b7ae0ab fix(web): swallow click after a swipe so dragging a row does not navigate
Repro: on the reminders list, click-and-drag a card to swipe — the
shelf opened AND the wrapped Link fired its click, so the operator
landed on the reminder detail page mid-swipe.

Track a dragMoved ref in SwipeableRow that flips true when the
pointer travels past the standard 6 px tap threshold. On pointerup,
if dragMoved is set, register a one-shot capture-phase click handler
on the row container that preventDefault + stopPropagation. The
synthetic click the browser fires on pointerup is intercepted before
it reaches the anchor's onClick, so the row stays put after a swipe
and a real tap (under 6 px movement) still navigates as before.

A 350ms safety timeout strips the listener if no click materialises
(pointerup landed outside the element) so a later legitimate click
isn't accidentally swallowed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:54:31 +08:00
b293bbf142 fix(web): suppress native drag on SwipeableRow so anchors do not eat the swipe
Reminders and activity rows wrap their card in Link, and anchors are
natively draggable. As soon as the operator moves horizontally the
browser kicks into drag-link mode and the pointer events never reach
SwipeableRow handlers — left/right swipe-to-Pause/Delete silently
broke on the reminders list.

Add onDragStart preventDefault + draggable=false to the row body once
and every SwipeableRow consumer is fixed in place. The existing pan-y
touch-action stays — together they give us pointer control on both
desktop and mobile.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:52:11 +08:00
a789b61e1f fix(bot): triple-fire reminder bug — force pg-boss policy + close TOCTOU dedupe
Repro: fire a reminder, message lands 2-3 times in WhatsApp (logs
showed three 'fire-reminder: done' entries within 1.5 s for the same
reminderId).

Two interlocking root causes:

  1. The queue was created at 'standard' policy (pre-dating the
     stately rollout). pg-boss's createQueue is idempotent and DOES
     NOT update the policy on an existing queue row, so re-deploying
     the code that requested policy=stately silently kept the
     standard policy. Standard accepts duplicate enqueues with the
     same singletonKey — three reminder.fire jobs for the same
     reminderId could all land at once.

  2. The handler-level recent-run dedupe was TOCTOU. The check ran
     OUTSIDE the per-account mutex, so three concurrent invocations
     all read 'no recent run', then queued up on the mutex one at a
     time and each INSERTed a fresh run + sent the message.

Fixes:

  - registerReminderJobs now forces the queue policy via direct SQL
    (UPDATE pgboss.queue SET policy = 'stately' WHERE name = ...
    AND policy <> 'stately') on every boot. Idempotent + survives
    pre-existing standard-policy rows.
  - fireReminderInner re-checks for a recent run AFTER the mutex is
    held but BEFORE the INSERT. By that point any concurrent winner
    has already inserted, so the duplicate sees the row and bails
    cleanly.

New test in fire-reminder.test.ts (the TOCTOU repro): outer check
returns no recent run, inner check returns a freshly-inserted one,
asserts the mutex was acquired but the second findFirst was hit
(i.e. we got past the outer check and the inner check stopped us).

Verified live: pgboss.queue.policy is now 'stately' for reminder.fire.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:43:41 +08:00
e800882d15 fix: 'Pause sending by' is off by default everywhere
The optional 'Pause sending by' deadline was defaulting to 18 (= 6 PM)
in three places:
  - reminders.delivery_window_end_hour schema default (NOT NULL DEFAULT 18)
  - createReminderAction / editScheduleAction fallback when the field
    is missing on the input
  - the Zod refine validator's secondary fallback

Net effect: any reminder created before this change has 18 in the DB,
so the edit form's checkbox flips ON automatically (the wizard treats
'value !== undefined && value !== 24' as 'opted in'). The wizard's
own create flow always sends 24 explicitly when the box is unchecked
— but legacy / direct API payloads + the schema default for older rows
don't carry that intent through.

Switch every default to 24 (the off-sentinel the wizard already uses)
so the optional toggle stays off until the operator ticks it. New
migration 0012 also backfills existing rows from 18 → 24 so editing
old reminders no longer auto-checks 'Pause sending by'.

Tests in when-form-deadline.test.tsx already lock in the UI contract
(off when initialDeliveryEndHour is undefined or 24, on for any other
value). No assertion changes needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:30:09 +08:00
5c48e0e85f fix(web): wire Refresh Groups button to syncGroupsAction with live SSE refresh
The button was a placeholder that submitted to a no-op server action,
so clicking did nothing. Replace with a small client component that:

  1. Calls syncGroupsAction(accountId) to pgNotify the bot.
  2. Listens for the bot's groups.synced event over SSE and
     router.refresh()es when it arrives so the new rows appear without
     a manual reload.
  3. Disables the button + shows a Syncing… label while the sync is
     in flight, with a 15s safety timeout if the bot or SSE channel
     drops so the spinner doesn't strand.

Drop the in-place <form action={async() => 'use server'}> placeholder.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:13:24 +08:00
40d788302c test(bot): cover post-pair-restart re-warming sequence
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:10:46 +08:00
d0db248460 fix(bot): re-warm pairing flag on post-pair-restart close
After a successful QR scan Baileys closes with status 515 and the
session-manager schedules a reconnect via setTimeout(stop().then(start)).
That cleanup stop emits a SECOND close event which arrived at our
pair-handler listener with warmingUp already cleared (the first qr
cleared it). The decision then resolved to 'treat-as-timeout',
detaching the listener and pushing session.timeout to the UI right at
the moment the user actually paired successfully — pairing then
silently completed in the DB but the UI never got session.connected.

Fix: re-arm pairingWarmingUp inside the post-pair-restart branch so
the cleanup-stop's close is swallowed too. Cleared again by the
following qr/open from the freshly-reopened socket, which then emits
session.connected to the UI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:09:08 +08:00
7af7aa35d0 test(bot): cover the back→re-pair close-leak regression
Extract the pair-handler's close-event decision into a pure helper
decidePairListenerOnClose(warmingUp, restartRequired) returning one of
ignore-leaked-close / post-pair-restart / treat-as-timeout. Refactor
pair-handler to call the helper instead of the inline if-chain.

New tests in pair-state.test.ts:
- warmingUp=true → ignore-leaked-close (regression: prior session's
  close racing the new listener)
- warmingUp=true + restartRequired=true → still ignore (defense in
  depth — a stale 515 must not hand control to the reconnect path)
- warmingUp=false + restartRequired=true → post-pair-restart
- warmingUp=false → treat-as-timeout

Bot suite goes from 60 → 64 tests, all green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:06:39 +08:00
68668ef2cd feat(web): footer reads 'Signed in as <username>' with italic name
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:04:39 +08:00
fe8e14b7a0 fix(bot): swallow leaked close from previous pairing attempt
Repro: scan QR window once → click Back → click Pair again → instantly
see 'Pairing timed out' (sometimes for several attempts in a row).

Root cause: when handleStartPairing hits a still-running session it
calls await sessionManager.stop(accountId) and immediately attaches a
fresh listener. session.close() resolves before sessionManager broadcasts
the close event to listeners (handleEvent has several awaits between
close arriving and the listener fan-out). The new listener was already
attached by then and saw the OLD session's close as if it were the new
session timing out — flipped the row to unpaired and pushed
session.timeout to the UI.

Fix: track a per-account 'pairingWarmingUp' Set. The new attempt enters
warming-up the moment its listener attaches; clears on the first qr or
open (those events can only come from the freshly-started session). A
close that arrives while still warming is logged and ignored. abandonPair
also clears the flag for safety.

Also drop the redundant Admin card from /settings — the Admin nav entry
on the sidebar/drawer already routes admins to /settings/users, the
extra card was duplicate UI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:02:10 +08:00
dbdb156a09 fix(web): drop redundant Close button from account dialogs
DialogFooter showCloseButton was rendering a third button (Close) next
to the Cancel + 'Yes, delete' / 'Yes, unpair' pair. The corner X icon
already closes the dialog, so the extra button was just visual noise.
Drop the prop on both account-card dialogs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 18:57:06 +08:00
6759ca8131 fix(web): client-component delete/unpair cards on accounts/[id]
The DialogTrigger asChild + transparent button overlay pattern wasn't
emitting a clickable button in the rendered DOM under radix-ui 1.4 +
Next 16 (server component context), so Delete and Unpair both became
no-ops. Replace each with a small client component that:
  - holds open-state for the confirm Dialog
  - drives the Card itself as the click target via role='button',
    tabIndex, onClick, and Enter/Space keydown handlers
  - calls the server action through useTransition

The Card stays a div (no <button> wrapping a Card → satisfies the
existing static-guard test). Removed the unused inline Dialog imports
and unpair/delete icons from the page.

Also trim the forgot-password dialog body to one sentence per request
('don't write too detail').

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 18:55:29 +08:00
5d583d9194 fix(web): forgot-password dialog, settings tagline, account dialog triggers
- Login page: replace static 'Forget Password? Contact IT' line with a
  proper dialog button. Clicking opens an explanatory dialog (self-
  service reset is intentionally disabled; admins can reset from
  /settings/users or run scripts/set-password.sh).
- /settings: drop the 'cm WhatsApp Bot · self-hosted' tagline.
- /accounts/[id]: Unpair + Delete cards weren't responding to clicks.
  Restructure so the transparent <button> overlay is a sibling of
  <Card> inside a <div className='relative'> wrapper (mirrors the
  working Pair/Re-pair pattern). The previous layout placed the
  DialogTrigger inside the Card, which produced no clickable button
  in the rendered DOM under radix-ui 1.4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 18:50:28 +08:00
c493101b60 feat(web): password policy, sign-out, dashboard isolation, activity tweaks
Multi-fix batch from a rapid feedback round:

- Password policy mirrors Facebook's documented rule (≥6 chars + mix of
  letters with numbers/symbols). Centralised in
  apps/web/src/lib/password-policy.ts; createUserAction,
  resetUserPasswordAction, the AddUser form, and the row Reset-password
  flow all use it. CLI scripts/set-password.ts inlines the same check
  so the bootstrap path stays consistent.
- App shell adds a Sign-out button in both the desktop sidebar footer
  and the mobile drawer footer, with the signed-in username next to it.
  Layout passes username down alongside role. Theme toggle was removed
  from the shell per request — operators don't need it in the chrome.
- Dashboard stats: getDashboardStats was running findMany on reminders
  with NO operator filter, so a brand-new user saw global counts from
  every tenant. Switched to an INNER JOIN on whatsapp_accounts so the
  card on / only counts this user's reminders. (Counts had been showing
  '1 / 1 / 3 / 5' to a fresh user — the cross-tenant leak the user
  flagged.)
- /activity drops the All tab and the Clear-history button. Default
  filter is now Success when no ?filter= is set; Partial keeps fanning
  into Paused + Failed; Skipped still merges into Archived.
- /settings drops the Display name row entirely and only shows the Role
  row to admins. Layout receives username so the shell can also surface
  it next to the Sign-out button.
- Tests: password-policy.test.ts (11 cases), updated users.test.ts to
  use policy-compliant passwords + cover letters-only / digits-only
  rejection, sidebar-footer assertion swapped from theme-toggle to the
  new Sign-out + username markup. 453 tests green; typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 18:46:29 +08:00
b92ead3a97 feat(web): add-user form + delete confirmation in user management
- New AddUserFormClient on /settings/users (admin-only): username +
  password + role select. Wraps createUserAction.
- UserRowClient gains an isLastAdmin prop and a confirm-dialog before
  delete. Demote and Delete are both disabled on the last remaining
  admin so an admin can't lock everyone out via the UI (server-side
  guards in users.ts already cover the API).
- Page passes isLastAdmin per row and computes adminCount once.
- Role badge uses emerald for admin / slate for user; explicit Promote
  / Demote arrows replace the bidirectional icon.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 18:36:03 +08:00
4ddf5c094e feat(web): admin nav entry + role-aware AppShell
- Add an Admin nav item (key 'admin', href /settings/users) with
  visibleTo=['admin'] so signed-in users with role='user' don't see it.
- nav-config exposes navItemsForRole(role) helper that filters NAV_ITEMS
  by visibleTo.
- Root layout fetches getCurrentUser() and forwards role into AppShell.
  AppShell narrows the role gate to the rendered nav (sidebar + drawer);
  /login still short-circuits to the bare header. Unknown role falls
  back to 'user' visibility (defense-in-depth).
- Settings page renders an admin-only card linking to Users so admins
  have a discoverable in-app entry point too.

Tests:
- nav-config: navItemsForRole admin/user matrix + admin entry shape.
- app-shell: admin link visible for admin, hidden for user, hidden for
  null/unauthenticated, /login bare header strips nav entirely.
- actions/auth: cookie payload encodes role=user, unknown role rejected,
  AUTH_SECRET-unset path, whitespace-only username rejected, rate-limit
  key contains client IP, unknown-user path still hits DB+bcrypt.

440 tests now (was 423).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 18:30:58 +08:00
797326e062 feat(web): collapse Skipped→Archived, Partial→Paused+Failed; full-width filter rows
- Activity filter tabs drop Partial and Skipped; Partial runs now appear
  under both Paused and Failed (anything that didn't fully succeed),
  Skipped runs surface under Archived (history the operator chose not
  to send). Five tabs left: All / Success / Paused / Failed / Archived.
- listActivityRuns flips skipped runs out of the default list and into
  the archived view at the SQL layer so pagination stays correct.
- Tabs row spans the full width and wraps onto a second row when the
  viewport can't fit them. Account-filter select also span full width
  on every breakpoint instead of capping at sm:max-w-xs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 18:26:34 +08:00
ebbbdbdfb8 fix(web): make session cookie secure flag conditional on production
Setting Secure on http://localhost cookies works in Chrome (localhost
exception) but Firefox/Safari silently drop them, so dev users hit
'redirect to /login on every click' after a 'successful' login. Switch
to secure: NODE_ENV === 'production'. Public deploy still gets
Secure-only.

Also swap the login footer copy from a CLI hint to 'Forget Password?
Contact IT' — operator-friendly, doesn't leak the bootstrap
mechanism on the public sign-in screen.

Test updated to assert secure=true under prod NODE_ENV and a new test
locks in secure=false in dev.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 18:19:59 +08:00
050292a282 feat(web): bare login header — only centred brand mark
The login page lived inside the authenticated AppShell, so the desktop
sidebar (with all nav items) and the mobile menu drawer were rendering
on the sign-in screen. AppShell now branches on pathname=/login and
renders a single centred header (cm + WhatsApp Bot) with no nav, plus
the form. Drops the redundant in-card title since the header carries
the brand.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 18:14:03 +08:00
b29d137c84 feat: production hardening — robots, allowedOrigins, container non-root, rate limits, CLI bootstrap
robots.ts + metadata.robots blocks indexing.
serverActions.allowedOrigins gates cross-origin Server Action posts.
Bot + web Dockerfiles add a non-root 'app' user (uid 1000) with
chmod 700 on /data/sessions.
sendTestAction grows a per-group rate limit (3/60s).
resumeReminderRunAction + cancelReminderRunAction get a per-IP
rate limit (30/10s).
.env.example documents every required key.
packages/db/src/scripts/{set-password,create-user}.ts + thin shell
wrappers in scripts/ — first admin sets their password via
./scripts/set-password.sh admin before signing in.
2026-05-10 18:05:34 +08:00
67091c294a feat(web): user-management surface (admin only)
createUserAction, setUserRoleAction, resetUserPasswordAction,
deleteUserAction — all gated by requireAdmin(). Self-demote and
last-admin guards prevent the operator from accidentally locking
themselves out. /settings/users page lists every operator with
inline Demote/Promote, Reset password, and Delete buttons. 10 unit
tests.
2026-05-10 18:01:09 +08:00
b77a9d106d feat(web): middleware gates non-allowlisted paths on session cookie
Edge-runtime check via auth-cookie.verifySession. /api/* paths get a
401 (no body) when unauthenticated; pages get a 307 to /login with
the original path encoded into ?next=. Allowlist explicitly excludes
/api/events and /api/qr — both were unauthenticated in v1.1.0 and
let an unauthenticated client snoop the entire SSE event stream and
enumerate paired account QR codes.
2026-05-10 17:57:07 +08:00
5b4787d10e fix(web): typed-routes + redirect-mock signatures in auth.ts
Next.js 16 typed-routes (experimental.typedRoutes in next.config.ts)
narrows redirect()'s parameter to RouteImpl<T>, which a runtime
string from the form can't satisfy. Cast to any with a comment for
the two redirect call sites in auth.ts.

The auth.test.ts redirectMock used `() =>` zero-arg signature, which
typescript rejected once the action started passing the path through.
Change to `(_path: string) =>` so the signature matches and the test
still passes (vitest's esbuild-transpiled run was fine; tsc caught it).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 17:54:59 +08:00
4f1056cdcd feat(web): /login page with username + password form
Server-rendered card-style login. Form posts to loginAction; on
failure the client renders the generic 'Invalid username or
password' error. Centred, mobile-first, autocomplete-friendly so
phone PWAs autofill from the keychain on subsequent logins.
2026-05-10 17:52:35 +08:00
cedd623466 feat(web): loginAction + logoutAction (with TDD)
Username + password verified against the operators row, bcrypt
compare regardless of user-found state for timing equivalence,
DUMMY_HASH precomputed and committed. 10/5min IP rate limit, no
password ever logged. Issues a 30-day HttpOnly+Secure+SameSite=Lax
cookie on success, redirects via safeRedirect(next). 12 unit tests
covering correct creds, wrong username, wrong password, missing
password_hash, empty/long inputs, case-insensitive match, rate-limit
trigger, no-password-leak, safe redirect, unsafe redirect, logout.
2026-05-10 17:50:41 +08:00
d236196476 feat(web): getCurrentUser / requireUser / requireAdmin helpers
Reads the session cookie from next/headers, verifies via auth-cookie,
loads the operators row, returns the shape every existing call site
expects (.id, .defaultTimezone, etc) plus the new .role and
.username. getSeededOperator stays as a thin compat shim that
delegates to getCurrentUser, so the ~12 tests that mock
@/lib/operator keep working without churn.
2026-05-10 17:46:16 +08:00
e1ba1da2de feat(web): safeRedirect helper for the login \?next= param
Falls back to / for anything that isn't a single-slash-prefixed
relative path. Locks out protocol-relative (//evil.com), absolute
(https://evil.com), and javascript: redirects. 7 tests cover the
full attacker matrix.
2026-05-10 17:44:10 +08:00
27b7a3df1f feat(web): edge-safe HMAC-signed session cookie
signSession + verifySession run on Edge runtime (Web Crypto only).
Verifier checks signature (constant-time compare), expiry, clock-skew
on iat (60s tolerance), token version vs OPERATOR_TOKEN_VERSION env,
and role-shape sanity. 11 unit tests cover round-trip plus every
rejection path attackers could probe.
2026-05-10 17:43:01 +08:00
838e129f37 chore: add bcryptjs to web + db packages
Pure-JS bcrypt for password hashing. Avoids the native-build pain
of node-bcrypt in our Alpine Docker images. Login is a rare event
so the perf gap is irrelevant for our scale.
2026-05-10 17:41:06 +08:00
46c0315559 refactor(db): drop operators.telegram_user_id (not used since v1.0)
The Telegram bot phase ended in Plan 3 — the operator now signs in
via username + password. Migration 0011 drops the legacy column +
its unique index. seed.ts no longer reads SEED_OPERATOR_TELEGRAM_ID;
docker-compose.base.yml swaps the env to SEED_OPERATOR_USERNAME
(default 'admin'); .env.development follows. Settings page shows
'Username' instead of 'Operator ID'. Auth-and-prod-hardening plan
doc updated to drop the synthetic telegram_user_id from the
create-user CLI script and createUserAction insert.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 17:39:46 +08:00
4cb4015666 fix(bot): dedupe duplicate reminder.fire jobs (msg sent twice)
Observed: reminder fired twice within ~2s. The bot logs showed two
distinct pg-boss jobIds for the same reminder enqueued at the same
scheduledAt — both ran fire-reminder, both sent the message.

Root cause: pg-boss's `singletonKey` only deduplicates on queues with
a 'singleton' / 'stately' / 'short' policy. Our queue was created
without specifying a policy, defaulting to 'standard', which IGNORES
the singletonKey. Two sends with the same key produced two jobs.

Fix lives at two layers:

* Layer 1 — queue policy. createQueue(REMINDER_FIRE_QUEUE) now
  passes `{ policy: 'stately' }`. With this, future fresh deploys
  fold a duplicate send (same singletonKey) into the existing
  'created' job rather than producing a second one. This doesn't
  retroactively change an existing queue's policy (pg-boss doesn't
  support that), but new queues are correct from creation.

* Layer 2 — defense-in-depth check inside fireReminder. Before
  acquiring the per-account mutex, query reminderRuns for any row
  with the same reminderId fired in the last 30s. If found, log
  + bail. This guards against:
    - Existing queues stuck on policy='standard'.
    - Race windows even within 'stately' policy.
    - The operator double-clicking Save in the wizard.
    - A jittery pg_notify('bot.command') replay.
  Resume jobs (payload.runId set) skip this check — they're meant
  to attach to an existing run.

Tests:
* New "BAILS OUT when a fresh fire collides with a recent run" case
  in fire-reminder.test.ts.
* beforeEach now resets findExistingRunMock too, since both the
  resume and dedupe paths share that mock.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 16:41:11 +08:00
be3f28a1e6 refactor(web,bot,db): rename reminder status 'ended' → 'inactive'
The 'ended' label read like a terminal failure state ("the reminder
gave up") when in practice it just means "this reminder isn't going
to fire on its own — restart it if you want it back". 'inactive' is
the more accurate read.

* SQL migration 0009 backfills existing rows.
* Bot fire-reminder writes 'inactive' on one-off completion / no
  further occurrences.
* Web actions, queries, filters, and reminder lifecycle gates updated.
* Dashboard counter card label "Active / Paused / Ended / Total"
  becomes "Active / Paused / Inactive / Total".
* Reminders list filter tab "Ended" becomes "Inactive".
* Status pill style key renamed to match.
* Tests updated alongside the runtime changes.

Also: the "Pause sending by" deadline opt-in now renders as a
visible card-shaped row with hover state + Set/Off label on the
right, so the toggle is discoverable instead of a tiny native
checkbox tucked next to the label.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 16:32:53 +08:00
2e1defaef6 feat(web): "Pause sending by" deadline is opt-in via a checkbox
Wizard When-step and the per-section Edit-when page now gate the
HourSelect behind a checkbox. The control reads "[ ] Pause sending
by (optional)" by default — checking it reveals the hour picker;
unchecking hides it again.

The off-state is encoded as deliveryWindowEndHour=24 (next-day
midnight) so the bot's existing windowEndAt helper produces an end
that's always in the future for any reminder fired the same day,
making the gate effectively never trip. This avoids a NULL-allowing
schema migration while still giving the operator a clean "no
deadline" mode.

Existing reminders:
  • Stored 24 → checkbox starts UNCHECKED, picker hidden.
  • Stored anything else → checkbox starts CHECKED, picker shows
    the saved value.
  • Unsupplied (legacy rows) → checkbox starts UNCHECKED.

RunEtaPill picks up an optional `windowEndAt` prop. When omitted —
the no-deadline path — it renders a neutral grey pill with just the
ETA, skipping the green "Fits before deadline" / amber "Likely to
pause" comparison that wouldn't be meaningful without a deadline.

Tests:
* when-form-deadline.test.tsx (4) — fresh / 24 / real-hour /
  optional-hint paths.
* run-eta-pill.test.tsx (+1) — neutral pill when windowEndAt is
  undefined.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 16:25:44 +08:00
309020fa5d feat(bot): sweep stale 'pending' runs on startup
Corner case observed: fire-reminder writes the run row with
status='pending' UP FRONT (so the Activity tab shows progress
mid-run), then flips to a terminal status once it's done. If the
bot is killed between those two writes — e.g. a redeploy or crash —
the row sits at 'pending' forever. pg-boss already marked the job
'completed', so it won't retry. Activity surfaces and the dashboard
counters then show a "stuck" run that never moves.

sweepStalePendingRuns runs at bot startup, finds any 'pending' run
older than 5 minutes, and:
  • Flips the run to 'failed' with a clear error_summary so the UI
    stops treating it as in-flight.
  • Flips its still-'pending' run_target rows to 'skipped' with the
    same reason so per-group counts remain coherent.

The 5-minute floor is generous enough that an actual mid-run worker
rebalance isn't accidentally killed.

Tests:
* 4 sweep tests covering: no-stale path skips the second UPDATE;
  with-stale path fires both UPDATEs; counts are forwarded; the
  edge case where a stale run has zero pending targets.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 16:05:18 +08:00
bb8d28a594 feat(web): paused-run banner + Activity Paused filter / Resume button
Reminder detail page:

* Surfaces a PausedRunBanner above the rest of the surface when the
  most recent run is in 'paused' state. The banner shows the
  delivered/total counts, the deadline that closed the window, and
  Resume / Cancel run buttons that call the matching server actions.
* getReminderWithRuns now LEFT JOIN-aggregates run_target counts so
  the banner has sent/total per run without an N+1 fan-out.

Activity tab:

* New Paused filter tab between Success and Partial.
* Paused rows in the desktop table get an inline ResumeRunButton
  (emerald play icon, useTransition + error surfacing).
* RunStatusBadge picks up a Paused entry — amber, PauseCircle icon.

Tests:
* PausedRunBanner — 4 SSR cases (resume/cancel CTA rendered, X-of-Y
  copy, generic fallback, amber styling).
* ResumeRunButton — 4 SSR cases (aria, emerald accent, compact /
  default size variants).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 15:58:06 +08:00
376bbe595b feat(web,bot): resumeReminderRunAction + cancelReminderRunAction
Web actions:

* resumeReminderRunAction({ runId }) → validates ownership and that
  the run is in 'paused' state, then publishes a reminder.resume
  command via pg_notify('bot.command'). The bot's command-consumer
  picks it up and enqueues a fresh pg-boss job at REMINDER_FIRE_QUEUE
  carrying { reminderId, runId }; fire-reminder's existing resume
  branch attaches to the row.
* cancelReminderRunAction({ runId }) → flips remaining 'pending'
  targets to 'skipped' with error="canceled by operator", marks the
  run 'partial' with a clear errorSummary, and lifts the parent
  reminder out of 'paused' (recurring → active so the next
  occurrence fires; one-off → ended).

Bot:

* New BotCommand variant { type: "reminder.resume"; reminderId; runId }
* command-consumer registers handleResumeReminder which calls
  enqueueReminderResume(boss, reminderId, runId) — a sibling of
  scheduleReminderFire that posts the job at REMINDER_FIRE_QUEUE
  with { reminderId, runId } and singletonKey "reminder:resume:<runId>"
  so the resume doesn't conflict with a future-occurrence schedule.

Tests:
* reminders.run-actions.test.ts (11 tests) — every guard rail
  (invalid uuid, missing run, missing reminder, foreign operator,
  wrong status) and the recurring/one-off lifecycle branches.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 15:54:21 +08:00
57786f9d09 feat(bot,web): window-end gate + paused/resume run lifecycle
fire-reminder.ts now:

* Computes windowEnd via @cmbot/shared/windowEndAt(timezone, endHour,
  now). Per-target loop trips the gate before sending; pending rows
  are LEFT pending (not flipped to skipped) so the run is resumable.
* Accepts an optional runId on the FireReminderPayload. When set,
  the handler ATTACHES to that existing run instead of creating a
  new one and only re-tries pending targets. Resume is allowed even
  when the reminder.status is 'paused' (otherwise we couldn't drag
  it back into delivery).
* Final-status logic adds a 'paused' branch (window closed mid-run
  with at least one row still pending AND something delivered);
  failed when window closed before any send; partial / success
  otherwise.
* Lifecycle: a paused run flips the reminder row to status='paused'
  and skips the recurring re-arm. Resuming or completing later
  flips it back to 'active'.
* SSE event payload gains optional sent/total counts.

reminderFiredToNotification picks up:
* New 'paused' headline + 'X of Y groups delivered. Tap to resume
  or cancel.' body.
* 'partial' body uses sent/total when present.

WebEventMap and the bot's WebEvent union match the new shape.

Tests:
* fire-reminder.test.ts gains a "resume against paused reminder
  acquires mutex" case.
* notifications.test.ts gains 3 paused/partial-sent body cases.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 15:48:52 +08:00
670eaf493c feat(web): swipeable account rows, editable label, disabled-account guard
Accounts list (mobile):
* Each row is a SwipeableRow. Swipe right reveals Pair (or Unpair if
  the account is connected). Swipe left reveals Groups + Delete.
* The right shelf widens to 176px so two buttons fit comfortably; a
  new \`leftShelfWidth\`/\`rightShelfWidth\` prop on SwipeableRow drives
  the override (default 88 stays for single-button shelves).

Accounts list (desktop): unchanged grid of clickable cards.

Account detail:
* New "Name" card at the top opens /accounts/[id]/edit/label, the
  dedicated rename surface (mirrors the reminder edit-name pattern).
* Paired-at row now shows the full timestamp ("10 May 2026, 3:33:04 pm")
  via toLocaleString instead of toLocaleDateString.

Reminder wizard:
* Disconnected accounts on the "Account" step are no longer
  clickable. They render as a non-link with aria-disabled, dimmed
  to opacity-50 with cursor-not-allowed and a "Pair this account
  before scheduling a reminder from it" tooltip. The bot has no
  live session for those accounts, so this prevents broken submits.

renameAccountAction validates the label, rejects duplicates within
the same operator, and revalidates /accounts and the detail page.

Tests added:
* AccountSwipeableRow — 6 SSR tests (shelves rendered, conditional
  Pair/Unpair, Groups + Delete shelf, 176px shelf width, hidden
  accountId field).
* EditAccountLabelForm — 5 SSR + payload tests (prefill, Save
  button, required + maxLength=60, action call shape).
* StepAccount — 4 SSR tests (connected → Link, disconnected → no
  Link + aria-disabled, opacity/cursor styles, "Not connected"
  copy).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 15:42:10 +08:00
c166a09fdb fix(web): client bundle no longer pulls in node-only rrule code
The wizard's RunEtaPill imported windowEndAt from the @cmbot/shared
barrel, which transitively re-exports rrule.ts. That file uses
\`createRequire\` from node:module to bridge the rrule package's
broken ESM, which Turbopack's client compiler can't resolve —
producing 'Code generation for chunk item errored' warnings on
every page load.

* Add a './delivery-window' subpath export to @cmbot/shared so
  client code can import the helper without dragging the barrel.
* Switch review-submit-client.tsx to that subpath.
* Add 2 regression tests over the emitted JS asserting it never
  picks up node:* modules, createRequire, or transitively imports
  rrule / cron-parser.

ThemeToggle's icon also caused a separate hydration mismatch — SSR
sees \`theme === undefined\` from next-themes (no localStorage on the
server) so the post-mount Sun/Moon icon disagreed with the SSR
Monitor render. Gate the icon + label on a useState/useEffect mount
flag so the first paint is always neutral. Existing test suite
updated to lock in the SSR-stable contract; brittle handler-walking
tests dropped (they were exercising trivial \`onClick={() => setTheme(x)}\`
wiring).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 15:33:47 +08:00
1d0d06d648 feat(web): floating CTA on mobile so Add/New buttons aren't a wasted row
Adds a `floatingAction` slot to PageShell. Desktop renders it inline
next to the H1 (same as before); mobile drops the entire header row
and floats the action as a fixed pill in the bottom-right corner —
the page now starts straight at content with no wasted vertical
space at the top when only an action exists.

Add Account / New Reminder buttons grow to size-12 circles on mobile
(easy thumb target) and keep the compact h-7 inline pill on desktop.
The action node is rendered twice in the tree — once inline, once
fixed — and switched via responsive utilities.

Bumps mobile bottom padding to pb-20 when a FAB is present so the
last card doesn't sit under the floating button.

Activity's "Clear history" still uses the regular `action` slot — it
keeps the inline header row on mobile because it isn't the page's
primary CTA.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 15:27:32 +08:00
9b13223966 fix(web): tighten the page-header row — smaller buttons, less padding
Action buttons drop from size=lg with px-6 + font-semibold to a
compact size=sm pill with a subtle shadow. PageShell trims mobile
top padding from 24px to 16px and the inter-section gap from 24px
to 16px on small screens (desktop unchanged) so the header row
doesn't dominate the top of the page when the H1 is hidden.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 15:25:08 +08:00
7697ea5fcb feat(web): PageShell narrow variant for dense single-column tabs
Adds a 'narrow' prop that wraps the body in 'max-w-2xl mx-auto'
while keeping the header chrome at the standard 5xl. Settings is
the first consumer — its rows are dense text and look adrift at
full width. The header still aligns with the other tabs so the
title position stays consistent.

Covered by 2 SSR tests (narrow path adds the inner wrapper, default
path doesn't).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 15:23:06 +08:00
3c3a3f57d3 refactor(web): extract EmptyState and reuse on every empty surface
One component now owns the icon / heading / helper / action stack
that the dashboard, accounts list, reminders list, and activity tab
were each rendering inline. The four duplicated 'flex-col items-center
py-12 text-center' Card blocks collapse to one shared surface so the
empty experience reads the same wherever the user lands.

Covered by 4 SSR tests (icon + title + description, omitted helper,
action slot pass-through, centring).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 15:21:51 +08:00
a5cb8cea46 refactor(web): extract PageShell and apply to every tab
Single component now owns the page chrome — wrapper width, padding,
vertical rhythm, and the page-header row (hidden-on-mobile H1 + an
optional right-aligned action slot). Dashboard, Accounts, Reminders,
Activity, and Settings all use it, replacing five copies of the same
\`<div className=\"max-w-5xl mx-auto px-4 ...\">\` markup.

Settings was previously \`max-w-2xl\` and \`container mx-auto\`; it
now matches the other tabs at 5xl so the chrome stays consistent.

Covered by 5 SSR tests (header order, responsive justify utilities,
wrapper class, action-optional path).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 15:19:17 +08:00
e38b9ac7b6 feat(web): RunEtaPill at the wizard review step
Renders an advisory ETA badge above the Schedule button:
* Green "Fits before deadline" when the projected finish lands
  before the chosen deadline hour.
* Amber "Likely to pause" with a "Push the deadline later or split
  into smaller runs" hint when it doesn't.

Pill is purely informational — the operator can still schedule a
run that's likely to pause; the pause/resume flow (Phase 3) covers
that case. The pill just removes the surprise.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 15:11:53 +08:00
e6521bd151 feat(web): run-eta helper + bigger pill-shaped add buttons
estimateRunDuration() computes a per-run ETA from a target count, a
fire time, and an assumed per-account send rate (40/min, mirroring
the bot env). Adds a 15% buffer with a 1-minute floor. Pure helper,
covered by 6 round-trip tests including the rate-defaults path.

Header CTA buttons on /accounts and /reminders are now size="lg"
rounded-full pills with a shadow that lifts on hover. Mobile shows
just the plus icon (label collapses) so the button doesn't dominate
narrow screens.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 15:10:09 +08:00