cm_whatsapp_bot_v1

Author	SHA1	Message	Date
yiekheng	31cf845030	feat(scripts): real publish.sh — buildx push of bot + web images Was a stub ('not yet implemented (see plan 4)'). Modeled directly on cm_bot_v2/scripts/publish.sh: - Same registry prefix gitea.04080616.xyz/yiekheng. - Same NO_SUDO toggle + docker info + buildx preflight diagnostics. - Same auth path notes (docker login on the same effective user that runs the build). - Same buildx --push flow with CM_IMAGE_PLATFORMS / BUILD_ARGS overrides and tag from $1 / DOCKER_IMAGE_TAG (default latest). This repo's services are bot + web (tools is dev-only and not published). Resulting tags: gitea.04080616.xyz/yiekheng/cm-whatsapp-bot:<tag> gitea.04080616.xyz/yiekheng/cm-whatsapp-web:<tag> Mark executable. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 22:02:13 +08:00
yiekheng	ea7d07b2c8	perf(db): composite index (account_id, name) + hide archived groups Two related follow-ups for the 3 000+ groups-per-account scale path: 1. New B-tree index on whatsapp_groups (account_id, name) (migration 0014). Covers the groups list page's `WHERE account_id=? ORDER BY name ASC LIMIT 200` query so PG streams pre-sorted from the index instead of pulling all rows then sorting. The unique (account_id, wa_group_jid) was the only prior B-tree on this table; it backed the WHERE prefix but not the ORDER BY. 2. listGroupsForAccount now filters `is_archived = false` in both the search and the no-search branch. Soft-archived groups (set when group-sync sees them disappear from the live participant list, or when an operator unpairs the account) used to leak into the wizard picker, letting operators pick a group the bot can no longer reach. Archived rows still exist in DB so reminders that target them keep working; a re-pair flips them back via the on-conflict upsert. README "Deferred" entry for the composite index removed (it's shipped). Search-as-you-type in the wizard picker stays deferred. 482 web + 88 bot tests still green; typecheck clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 21:57:17 +08:00
yiekheng	c906a9fa3a	docs: refresh README + add docs/runbook.md for v1 sign-off - README rewritten to reflect v1 reality: auth bootstrap, AES-GCM cookies, three-layer rate limit, duplicate-pair detection, logout-before-delete, journal-monotonic guard, the new test counts (482 web + 88 bot), and the right scripts (set-password, create-user). Drops the telegram-era 'Status' paragraph and the earlier 'Auth deferred' bullet. - docs/runbook.md is a new manual end-to-end smoke checklist organised by section: pre-flight, auth bootstrap, user management, account pairing (incl. back→re-pair + duplicate-phone regression checks), reminder lifecycle (incl. triple-fire + reschedule regression checks), account lifecycle, sign-out + token-version kill, cross-tenant isolation, log sweep, plus a troubleshooting cheatsheet. Closes P3/T23 + P3/T24. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 21:45:03 +08:00
yiekheng	47d7c53fda	feat(db): auto-guard against drizzle journal-skip regression Twice now we've shipped a deploy that 500'd in production because drizzle silently skipped freshly-generated migrations whose `when` timestamps were older than a prior manually-bumped entry (0010/0011 in 1b7f553, then 0012/0013 in 2731888). Both times pnpm migrate printed "Migrations applied." while the live DB schema lagged the code's expectations. Three layers of defence: 1. packages/db/src/journal-check.ts — pure helpers - assertJournalMonotonic(entries): walks idx-sorted entries and returns each one whose `when` <= the previous entry's `when`, plus a suggested `when` value to bump it to. - formatJournalViolations(result): renders an actionable multi-line message that points at the offending file path. 2. packages/db/src/migrate.ts — pre-flight Reads _journal.json BEFORE handing it to drizzle.migrate(). If the journal is non-monotonic, it prints the violations + bump instructions and exits with code 2. No more "Migrations applied." while silently skipping. 3. apps/web/src/test/drizzle-journal-monotonic.test.ts — CI guard Reads the committed _journal.json at test time. CI fails on the PR before the bad commit can ship. Imports the helper through a new "./journal-check" subpath export on @cmbot/db so the test doesn't rely on a deep path into the package. Together: a bad commit fails CI; if it somehow got through, migrate itself refuses to run; if migrate is bypassed, the previous deploy's schema stays intact (drizzle wouldn't have skipped anything in any case where the journal is monotonic). Web suite 480 → 482 tests, all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 21:40:11 +08:00
yiekheng	27318888bc	fix(db): bump 0012/0013 journal timestamps so drizzle applies them Same regression we hit with 0010/0011 in commit 1b7f553: drizzle's migrator skips entries whose 'when' is older than the latest applied migration's recorded created_at. 0012's when (1778412502601) and 0013's when (1778418181504) were generated BEFORE 0011's manually-bumped when of 1778464002000, so 'pnpm migrate' kept reporting 'Migrations applied.' while silently skipping both. Result: web 500'd on every authenticated request; getCurrentUser hit 'column "email" does not exist' because the operators schema in code expected the column 0013 was supposed to add. Bumped 0012 to 0011.when + 1s and 0013 to 0011.when + 2s, re-ran migrate. operators now has the email column, reminders.delivery_window_end_hour default is now 24 (the off-sentinel), and the web container is back up with no 500s. Note for future: the journal timestamps must be strictly monotonic across the entries[] order. The fix in commit 1b7f553 didn't future-proof us against the next batch. Keeping a long-term automated guard against this is a TODO. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 21:35:52 +08:00
yiekheng	b988d117a3	fix(db): restore whatsappGroups declaration that perf-notes comment ate A doc-comment refactor in 08f2c0f silently swallowed the 'export const whatsappGroups = pgTable(...)' line and its inner '{' opening brace, leaving the column properties at top level. Bot's typecheck happened to pass on a stale build, but the web container's startup pnpm --filter @cmbot/db build failed with 'Expression expected' / ';' expected at lines 71-77. Re-add the missing 4 lines. Web is back up. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 21:34:27 +08:00
yiekheng	d731390c9d	fix(web): unpair soft-archives groups instead of DELETE — same FK abort Web error log showed unpairAccountAction failing with the same FK violation as group-sync: deleting whatsapp_groups rows that had been used in reminders blew up reminder_targets_group_id_whatsapp_groups_id_fk and aborted the unpair. Switch to UPDATE … SET is_archived=true. The bot's group-sync upsert already flips is_archived back to false on a re-pair (added in the group-sync companion fix in the previous commit), so behaviour is end-to-end equivalent to the old delete + repopulate path without the FK fragility. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 21:33:03 +08:00
yiekheng	08f2c0fd27	fix(bot): group-sync soft-archives instead of DELETE (fixes FK abort) Web error log showed: update or delete on table "whatsapp_groups" violates foreign key constraint "reminder_targets_group_id_whatsapp_groups_id_fk" on table "reminder_targets" Repro: pair finishes, post-open syncGroupsForAccount runs and tries to DELETE rows for groups no longer in the live participant list. If any of those groups had been used in a reminder its row is FK-referenced from reminder_targets, so the DELETE aborts the whole transaction and the operator's pair completion appears to fail. With 3 000+ groups per account this hits anyone with even a small reminder history. Switch the sweep from DELETE to UPDATE … SET is_archived=true. Reminders that targeted the missing group keep working (operator can choose to remove them); a future re-pair where the group reappears flips is_archived back to false via the on-conflict upsert. Returns archived count instead of removed count. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 21:30:05 +08:00
yiekheng	2fe8459d25	feat: duplicate-pair detection + logout-before-delete + ordering tests Three connected bits of paired-account hygiene: 1. Duplicate-pair guard (apps/bot/src/ipc/pair-handler.ts) Operator scans the QR with a phone that's already linked to another account row → both rows would fight over the same WhatsApp device and sends become a coin flip. After Baileys' `open` event the bot now queries siblings of the same operator, passes them through findDuplicateExistingAccount() (a pure helper extracted to pair-state.ts), and on a hit: - stops the new session (intentional; keeps the original's session intact) - scrubs the partial auth blob from disk - resets the row's status to unpaired and clears phone_number - emits a new session.duplicate event with the existing row's label so PairLive can render a clear message New PairLive 'duplicate' phase: amber icon + "Phone already linked, unpair the existing account first or scan with a different phone". 2. Logout-before-delete (apps/bot/src/ipc/unpair-handler.ts + apps/bot/src/whatsapp/session-manager.ts) Delete used to call account.unpair which only closes the local socket, so the operator's phone kept showing a phantom "linked device" pointing at a row that no longer exists. Added: - new account.delete command type (web side and bot side) - sessionManager.logoutAndStop(): calls socket.logout() so WhatsApp drops the device on the server side, THEN closes the local socket. Best-effort; logout RPC failure doesn't strand the delete. - new handleDelete() handler that calls logoutAndStop, removes session files, audits, and notifies. - deleteAccountAction now sends account.delete instead of account.unpair. Unpair stays unchanged: re-pair-friendly, no logout. 3. Tests (bot 77 → 88, web 477 → 480) - findDuplicateExistingAccount: 6 cases covering match, no-match, self-exclusion, null/empty/whitespace handling, whitespace normalisation, deterministic-pick when (defensively) two siblings share a phone. - handleUnpair / handleDelete: handleDelete calls logoutAndStop BEFORE rm; handleUnpair never touches logoutAndStop (regression guard for a refactor that swaps them); audit log payload includes the row's label; audit lookup throwing doesn't strand the delete. - listAccounts ordering: static guard against the rename-reshuffles-list regression. Pins `asc(a.createdAt)` + `asc(a.id)` and rejects `asc(a.label)` in the function body. Bot restarted with the new flow. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 21:26:58 +08:00
yiekheng	f566e4683a	feat(web): sort accounts by created_at ascending (earliest first) Earlier accounts were ordered by label, so the list reshuffled every time an account was renamed. Switch to created_at ASC + id ASC as a deterministic tiebreaker. Earliest-added accounts now stay on top where the operator added them. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 21:19:01 +08:00
yiekheng	7df3ef9c31	fix(web): bump right-meta column cap a touch (max-w 34% / 9.5rem) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 21:17:20 +08:00
yiekheng	0fd581b365	fix(web): nudge right-meta column cap up (max-w 28% / 8rem) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 21:17:00 +08:00
yiekheng	f4da1dd510	fix(web): halve right-meta column cap (max-w 20% / 5.5rem) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 21:16:40 +08:00
yiekheng	50b7e61037	fix(web): tighter cap on right-meta column (max-w 40% / 11rem) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 21:15:18 +08:00
yiekheng	89c7b1a84d	fix(web): cap right-meta column on reminders list so name doesn't get starved The recurrence summary ("Every month on days 4, 6, 7, 11, 13, 14 +6 more at 11:32") rendered without truncation in the right meta column, which had `shrink-0` + no max-width — so the column expanded to fit the text and the reminder name on the left was forced to truncate aggressively or wrap. Cap the right column at max-w-[55%] on mobile / sm:max-w-[14rem] on desktop, add min-w-0 to each row inside, and truncate every meta span. Long recurrences now ellipsis with a hover title tooltip; the reminder name reclaims the breathing room it should have. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 21:13:49 +08:00
yiekheng	32f87e1a92	fix(web): truncate long recurrence description with ellipsis on detail card Switched the reminder detail recurrence line from wrap-on-overflow to single-line truncate (...) so card height stays consistent. The full text is exposed via the native title tooltip, and editing the schedule shows the canonical full description in the wizard. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 21:12:02 +08:00
yiekheng	e32f633e02	fix(web): wrap long recurrence description on reminder detail card A reminder set to fire on many days of the month renders a long description ("Every month on days 4, 6, 11, 13, 18, 20 +2 more at 11:32"). The recurrence <p> used flex items-center which kept the icon and the text on a single non-wrapping row — the text overflowed horizontally and the card grew wider instead of letting the text break. Switch to flex items-start, wrap the text in a <span min-w-0> so it becomes a shrinkable flex item that wraps internally, and bump the icon down by mt-0.5 to keep it baseline-aligned with the first line of text now that items-start no longer vertically centers it. The list-page card already used <span> for the same text and was unaffected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 21:11:09 +08:00
yiekheng	429ae0827f	fix(web): only ONE nav item highlighted at a time + drop redundant Close Two related bugs from the same review pass: 1. /settings/users lit up BOTH the Admin and Settings entries in the sidebar/drawer. The active-state check was naïve `pathname.startsWith(href)`, which matches every parent prefix. Replaced with a longest-match helper pickActiveNavKey() in nav-config.ts: the most-specific item wins, parents stay quiet, '/' only matches an exact pathname, and a strict-descendant check (`href + '/'`) prevents `/settingsfoo` from lighting up Settings. 2. <DialogFooter showCloseButton> on the user-row delete (and three other dialogs that I missed earlier) was rendering an extra outline "Close" button next to the operator's own Cancel + Radix's corner X. Stripped the prop from every remaining caller (login, dashboard clear-history, reminder actions-bar, settings/users delete) so each dialog footer shows just Cancel + the primary action. Tests: - nav-config.test.ts: 7 new cases covering the longest-match contract — /settings/users highlights ONLY Admin, /settings highlights ONLY Settings, '/' is exact-match only, sibling-prefix /settingsfoo matches nothing, and a defense-in-depth probe asserts at-most-one nav highlight across a representative pathname set. - test/no-dialog-footer-show-close-button.test.ts: static guard that grep-walks every production .tsx and fails if anything passes `showCloseButton` to <DialogFooter>. Mirrors the existing no-button-wrapping-card guard so the prop can't sneak back in. Also self-checks the regex (matches single-line + multi-line + other-prop combos; ignores clean DialogFooter and same-named props on unrelated components). 463 → 477 web tests, all green; typecheck clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 21:08:40 +08:00
yiekheng	496f882d9c	feat: split user row into 2 lines + reserve operators.email column Layout changes (apps/web/src/app/settings/users/user-row-client.tsx): - Row 1: username + 'you' chip on the LEFT (inline, alongside the username), role badge on the RIGHT. - Row 2: action buttons (Promote/Demote, Reset, Delete) right-aligned. - Earlier: identity stacked vertically with badge under username, and buttons crammed to the right of the same row. Schema (packages/db/src/schema.ts + migration 0013): - Added optional `email` column on operators (nullable, no NOT NULL). Reserved for future contact / recovery flows so today's operators don't need to backfill anything. - Partial unique index on lower(email) WHERE email IS NOT NULL keeps duplicates out without blocking NULLs. Migration applied to dev DB. 463 web tests still green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 21:04:15 +08:00
yiekheng	3af0dc7ca7	feat(web): loosen user-row layout — more breathing room - Card row: gap-2 -> gap-3, p-3 -> p-4 - Row inner gap: gap-2 -> gap-3 (between identity block and buttons) - Identity block: add space-y-1.5 + leading-none on username so the badge row has visible separation from the username - Badge / 'you' chip gap: 1.5 -> 2 - Button group gap: 1 -> 1.5 - CardContent space between rows: space-y-3 -> space-y-4 Pure layout — no behaviour change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 21:00:08 +08:00
yiekheng	adaf087a5f	feat(web): drop '· last admin' label from user row The Demote/Delete buttons are already disabled with proper tooltips implied by their disabled state; the extra inline label was visual clutter on the only-admin's own row. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 20:44:41 +08:00
yiekheng	f69652d43b	feat(web): AES-GCM cookies + per-username/global rate limit + origin check Three layers of login hardening pulled together; addresses the "don't let middleman / robot easily log in by mimicking headers" follow-up. 1. AES-256-GCM session cookie (apps/web/src/lib/auth-cookie.ts) The old format was base64-encoded JSON + HMAC-SHA256 signature, so anyone with the cookie could read userId/role straight off the bytes. Switched to AES-GCM authenticated encryption: the payload is encrypted with a 256-bit key derived from AUTH_SECRET via SHA-256, a fresh 12-byte nonce is drawn per encryption (never reused; locked in by test), and tampering with either the IV or ciphertext fails the GCM auth tag → decrypt throws → null. Cookie format: <base64url(iv)>.<base64url(ciphertext+tag)> Existing cookies become invalid on deploy because the IV portion doesn't decode to 12 bytes, so middleware bounces them to /login. No env bump needed; users just sign in once with the new secret. 2. Three-layer rate limit on loginAction Old: per-IP only. An attacker with a residential-proxy pool or spoofed X-Forwarded-For could hop IPs and brute-force one account. New: Promise.all of three checkRateLimit calls - per-IP login:<ip> 10 / 5 min - per-username login-user:<lower> 5 / 15 min - global login-global 100 / min (backstop) First-hit wins; logger captures which limit tripped (ip / username / global) without telling the attacker which one. 3. Action-level Origin/Host check serverActions.allowedOrigins already does this at the framework layer; running it inside loginAction lets us log the mismatch and reject before bcrypt + DB. Missing Origin treated as same-origin (RFC: same-origin POSTs may omit it). Malformed Origin → reject. Tests: - auth-cookie.test.ts updated to AES-GCM (15 tests, +4 vs HMAC): fresh IV per encryption, ciphertext doesn't leak userId/role, IV-swap rejected, ciphertext-tamper rejected, wrong-length IV rejected, malformed b64 doesn't throw. - auth.test.ts adds 7 new cases: three-layer key shape, per-username limit alone trips, global limit alone trips, cross-origin rejected, same-origin accepted, missing-Origin treated as same-origin, malformed-Origin rejected. Web suite 453 → 463 tests, all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 20:41:49 +08:00
yiekheng	6942745085	test(bot): cover the reschedule corner case in scheduleReminderFire Lock down the pre-send cancel that fixed the dropped 8:20 PM fire: - cancel UPDATE always runs BEFORE boss.send (regression: stately dedupe silently rejected the new send when a stale created job existed; now we tombstone the stale row first) - cancel scopes to state='created' only (active and completed jobs must survive — they're in-flight or historical) - cancel filters by THIS reminder's singletonKey (no cross-reminder cancellation) - boss.send still receives singletonKey + startAfter + retryLimit - first-time schedule (zero stale rows) still calls send - cancel UPDATE error degrades to "send anyway" — the handler-level recent-run dedupe will catch any duplicate that lands - boss.send returning null is surfaced (so the caller's logger captures jobId: null instead of silently treating it as success) 77 bot tests now (was 70). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 20:33:29 +08:00
yiekheng	2e6fbfa7a5	fix(bot): reschedule was silently dropped under stately policy Scheduled reminder for May 10 8:20 PM never fired. Bot logs showed "reminder.fire: scheduled" with jobId: null at 12:18 UTC — pg-boss returned null because the queue was on policy=stately, which dedupes sends across the (created/active/retry) state cone by singletonKey. A previous schedule for the same reminder (next recurring fire, created earlier) was still in 'created' state, so the new send for today 8:20 PM hit the dedupe and was silently rejected. Two fixes: 1. Switch the queue policy back to 'standard' (the default) and force-flip any existing 'stately' queue row on boot. Standard lets us enqueue across reschedules. 2. scheduleReminderFire now does a pre-send cancel: any 'created' job for this singletonKey is moved to 'cancelled' before the new boss.send. The new schedule wins; old stale jobs are tombstoned so the recurring/edit path produces exactly-one upcoming fire. Duplicate-fire safety (the 'qwerd msg three times' bug) is already covered at the handler level by the inner-mutex recent-run check inside fireReminderInner — that's what stately was guarding against, and the inner check works under standard too. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 20:27:52 +08:00
yiekheng	991b7ae0ab	fix(web): swallow click after a swipe so dragging a row does not navigate Repro: on the reminders list, click-and-drag a card to swipe — the shelf opened AND the wrapped Link fired its click, so the operator landed on the reminder detail page mid-swipe. Track a dragMoved ref in SwipeableRow that flips true when the pointer travels past the standard 6 px tap threshold. On pointerup, if dragMoved is set, register a one-shot capture-phase click handler on the row container that preventDefault + stopPropagation. The synthetic click the browser fires on pointerup is intercepted before it reaches the anchor's onClick, so the row stays put after a swipe and a real tap (under 6 px movement) still navigates as before. A 350ms safety timeout strips the listener if no click materialises (pointerup landed outside the element) so a later legitimate click isn't accidentally swallowed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:54:31 +08:00
yiekheng	b293bbf142	fix(web): suppress native drag on SwipeableRow so anchors do not eat the swipe Reminders and activity rows wrap their card in Link, and anchors are natively draggable. As soon as the operator moves horizontally the browser kicks into drag-link mode and the pointer events never reach SwipeableRow handlers — left/right swipe-to-Pause/Delete silently broke on the reminders list. Add onDragStart preventDefault + draggable=false to the row body once and every SwipeableRow consumer is fixed in place. The existing pan-y touch-action stays — together they give us pointer control on both desktop and mobile. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:52:11 +08:00
yiekheng	a789b61e1f	fix(bot): triple-fire reminder bug — force pg-boss policy + close TOCTOU dedupe Repro: fire a reminder, message lands 2-3 times in WhatsApp (logs showed three 'fire-reminder: done' entries within 1.5 s for the same reminderId). Two interlocking root causes: 1. The queue was created at 'standard' policy (pre-dating the stately rollout). pg-boss's createQueue is idempotent and DOES NOT update the policy on an existing queue row, so re-deploying the code that requested policy=stately silently kept the standard policy. Standard accepts duplicate enqueues with the same singletonKey — three reminder.fire jobs for the same reminderId could all land at once. 2. The handler-level recent-run dedupe was TOCTOU. The check ran OUTSIDE the per-account mutex, so three concurrent invocations all read 'no recent run', then queued up on the mutex one at a time and each INSERTed a fresh run + sent the message. Fixes: - registerReminderJobs now forces the queue policy via direct SQL (UPDATE pgboss.queue SET policy = 'stately' WHERE name = ... AND policy <> 'stately') on every boot. Idempotent + survives pre-existing standard-policy rows. - fireReminderInner re-checks for a recent run AFTER the mutex is held but BEFORE the INSERT. By that point any concurrent winner has already inserted, so the duplicate sees the row and bails cleanly. New test in fire-reminder.test.ts (the TOCTOU repro): outer check returns no recent run, inner check returns a freshly-inserted one, asserts the mutex was acquired but the second findFirst was hit (i.e. we got past the outer check and the inner check stopped us). Verified live: pgboss.queue.policy is now 'stately' for reminder.fire. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:43:41 +08:00
yiekheng	e800882d15	fix: 'Pause sending by' is off by default everywhere The optional 'Pause sending by' deadline was defaulting to 18 (= 6 PM) in three places: - reminders.delivery_window_end_hour schema default (NOT NULL DEFAULT 18) - createReminderAction / editScheduleAction fallback when the field is missing on the input - the Zod refine validator's secondary fallback Net effect: any reminder created before this change has 18 in the DB, so the edit form's checkbox flips ON automatically (the wizard treats 'value !== undefined && value !== 24' as 'opted in'). The wizard's own create flow always sends 24 explicitly when the box is unchecked — but legacy / direct API payloads + the schema default for older rows don't carry that intent through. Switch every default to 24 (the off-sentinel the wizard already uses) so the optional toggle stays off until the operator ticks it. New migration 0012 also backfills existing rows from 18 → 24 so editing old reminders no longer auto-checks 'Pause sending by'. Tests in when-form-deadline.test.tsx already lock in the UI contract (off when initialDeliveryEndHour is undefined or 24, on for any other value). No assertion changes needed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:30:09 +08:00
yiekheng	5c48e0e85f	fix(web): wire Refresh Groups button to syncGroupsAction with live SSE refresh The button was a placeholder that submitted to a no-op server action, so clicking did nothing. Replace with a small client component that: 1. Calls syncGroupsAction(accountId) to pgNotify the bot. 2. Listens for the bot's groups.synced event over SSE and router.refresh()es when it arrives so the new rows appear without a manual reload. 3. Disables the button + shows a Syncing… label while the sync is in flight, with a 15s safety timeout if the bot or SSE channel drops so the spinner doesn't strand. Drop the in-place <form action={async() => 'use server'}> placeholder. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:13:24 +08:00
yiekheng	40d788302c	test(bot): cover post-pair-restart re-warming sequence Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:10:46 +08:00
yiekheng	d0db248460	fix(bot): re-warm pairing flag on post-pair-restart close After a successful QR scan Baileys closes with status 515 and the session-manager schedules a reconnect via setTimeout(stop().then(start)). That cleanup stop emits a SECOND close event which arrived at our pair-handler listener with warmingUp already cleared (the first qr cleared it). The decision then resolved to 'treat-as-timeout', detaching the listener and pushing session.timeout to the UI right at the moment the user actually paired successfully — pairing then silently completed in the DB but the UI never got session.connected. Fix: re-arm pairingWarmingUp inside the post-pair-restart branch so the cleanup-stop's close is swallowed too. Cleared again by the following qr/open from the freshly-reopened socket, which then emits session.connected to the UI. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:09:08 +08:00
yiekheng	7af7aa35d0	test(bot): cover the back→re-pair close-leak regression Extract the pair-handler's close-event decision into a pure helper decidePairListenerOnClose(warmingUp, restartRequired) returning one of ignore-leaked-close / post-pair-restart / treat-as-timeout. Refactor pair-handler to call the helper instead of the inline if-chain. New tests in pair-state.test.ts: - warmingUp=true → ignore-leaked-close (regression: prior session's close racing the new listener) - warmingUp=true + restartRequired=true → still ignore (defense in depth — a stale 515 must not hand control to the reconnect path) - warmingUp=false + restartRequired=true → post-pair-restart - warmingUp=false → treat-as-timeout Bot suite goes from 60 → 64 tests, all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:06:39 +08:00
yiekheng	68668ef2cd	feat(web): footer reads 'Signed in as <username>' with italic name Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:04:39 +08:00
yiekheng	fe8e14b7a0	fix(bot): swallow leaked close from previous pairing attempt Repro: scan QR window once → click Back → click Pair again → instantly see 'Pairing timed out' (sometimes for several attempts in a row). Root cause: when handleStartPairing hits a still-running session it calls await sessionManager.stop(accountId) and immediately attaches a fresh listener. session.close() resolves before sessionManager broadcasts the close event to listeners (handleEvent has several awaits between close arriving and the listener fan-out). The new listener was already attached by then and saw the OLD session's close as if it were the new session timing out — flipped the row to unpaired and pushed session.timeout to the UI. Fix: track a per-account 'pairingWarmingUp' Set. The new attempt enters warming-up the moment its listener attaches; clears on the first qr or open (those events can only come from the freshly-started session). A close that arrives while still warming is logged and ignored. abandonPair also clears the flag for safety. Also drop the redundant Admin card from /settings — the Admin nav entry on the sidebar/drawer already routes admins to /settings/users, the extra card was duplicate UI. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:02:10 +08:00
yiekheng	dbdb156a09	fix(web): drop redundant Close button from account dialogs DialogFooter showCloseButton was rendering a third button (Close) next to the Cancel + 'Yes, delete' / 'Yes, unpair' pair. The corner X icon already closes the dialog, so the extra button was just visual noise. Drop the prop on both account-card dialogs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 18:57:06 +08:00
yiekheng	6759ca8131	fix(web): client-component delete/unpair cards on accounts/[id] The DialogTrigger asChild + transparent button overlay pattern wasn't emitting a clickable button in the rendered DOM under radix-ui 1.4 + Next 16 (server component context), so Delete and Unpair both became no-ops. Replace each with a small client component that: - holds open-state for the confirm Dialog - drives the Card itself as the click target via role='button', tabIndex, onClick, and Enter/Space keydown handlers - calls the server action through useTransition The Card stays a div (no <button> wrapping a Card → satisfies the existing static-guard test). Removed the unused inline Dialog imports and unpair/delete icons from the page. Also trim the forgot-password dialog body to one sentence per request ('don't write too detail'). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 18:55:29 +08:00
yiekheng	5d583d9194	fix(web): forgot-password dialog, settings tagline, account dialog triggers - Login page: replace static 'Forget Password? Contact IT' line with a proper dialog button. Clicking opens an explanatory dialog (self- service reset is intentionally disabled; admins can reset from /settings/users or run scripts/set-password.sh). - /settings: drop the 'cm WhatsApp Bot · self-hosted' tagline. - /accounts/[id]: Unpair + Delete cards weren't responding to clicks. Restructure so the transparent <button> overlay is a sibling of <Card> inside a <div className='relative'> wrapper (mirrors the working Pair/Re-pair pattern). The previous layout placed the DialogTrigger inside the Card, which produced no clickable button in the rendered DOM under radix-ui 1.4. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 18:50:28 +08:00
yiekheng	c493101b60	feat(web): password policy, sign-out, dashboard isolation, activity tweaks Multi-fix batch from a rapid feedback round: - Password policy mirrors Facebook's documented rule (≥6 chars + mix of letters with numbers/symbols). Centralised in apps/web/src/lib/password-policy.ts; createUserAction, resetUserPasswordAction, the AddUser form, and the row Reset-password flow all use it. CLI scripts/set-password.ts inlines the same check so the bootstrap path stays consistent. - App shell adds a Sign-out button in both the desktop sidebar footer and the mobile drawer footer, with the signed-in username next to it. Layout passes username down alongside role. Theme toggle was removed from the shell per request — operators don't need it in the chrome. - Dashboard stats: getDashboardStats was running findMany on reminders with NO operator filter, so a brand-new user saw global counts from every tenant. Switched to an INNER JOIN on whatsapp_accounts so the card on / only counts this user's reminders. (Counts had been showing '1 / 1 / 3 / 5' to a fresh user — the cross-tenant leak the user flagged.) - /activity drops the All tab and the Clear-history button. Default filter is now Success when no ?filter= is set; Partial keeps fanning into Paused + Failed; Skipped still merges into Archived. - /settings drops the Display name row entirely and only shows the Role row to admins. Layout receives username so the shell can also surface it next to the Sign-out button. - Tests: password-policy.test.ts (11 cases), updated users.test.ts to use policy-compliant passwords + cover letters-only / digits-only rejection, sidebar-footer assertion swapped from theme-toggle to the new Sign-out + username markup. 453 tests green; typecheck clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 18:46:29 +08:00
yiekheng	b92ead3a97	feat(web): add-user form + delete confirmation in user management - New AddUserFormClient on /settings/users (admin-only): username + password + role select. Wraps createUserAction. - UserRowClient gains an isLastAdmin prop and a confirm-dialog before delete. Demote and Delete are both disabled on the last remaining admin so an admin can't lock everyone out via the UI (server-side guards in users.ts already cover the API). - Page passes isLastAdmin per row and computes adminCount once. - Role badge uses emerald for admin / slate for user; explicit Promote / Demote arrows replace the bidirectional icon. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 18:36:03 +08:00
yiekheng	4ddf5c094e	feat(web): admin nav entry + role-aware AppShell - Add an Admin nav item (key 'admin', href /settings/users) with visibleTo=['admin'] so signed-in users with role='user' don't see it. - nav-config exposes navItemsForRole(role) helper that filters NAV_ITEMS by visibleTo. - Root layout fetches getCurrentUser() and forwards role into AppShell. AppShell narrows the role gate to the rendered nav (sidebar + drawer); /login still short-circuits to the bare header. Unknown role falls back to 'user' visibility (defense-in-depth). - Settings page renders an admin-only card linking to Users so admins have a discoverable in-app entry point too. Tests: - nav-config: navItemsForRole admin/user matrix + admin entry shape. - app-shell: admin link visible for admin, hidden for user, hidden for null/unauthenticated, /login bare header strips nav entirely. - actions/auth: cookie payload encodes role=user, unknown role rejected, AUTH_SECRET-unset path, whitespace-only username rejected, rate-limit key contains client IP, unknown-user path still hits DB+bcrypt. 440 tests now (was 423). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 18:30:58 +08:00
yiekheng	797326e062	feat(web): collapse Skipped→Archived, Partial→Paused+Failed; full-width filter rows - Activity filter tabs drop Partial and Skipped; Partial runs now appear under both Paused and Failed (anything that didn't fully succeed), Skipped runs surface under Archived (history the operator chose not to send). Five tabs left: All / Success / Paused / Failed / Archived. - listActivityRuns flips skipped runs out of the default list and into the archived view at the SQL layer so pagination stays correct. - Tabs row spans the full width and wraps onto a second row when the viewport can't fit them. Account-filter select also spans full width on every breakpoint instead of capping at sm:max-w-xs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 18:26:34 +08:00
yiekheng	ebbbdbdfb8	fix(web): make session cookie secure flag conditional on production Setting Secure on http://localhost cookies works in Chrome (localhost exception) but Firefox/Safari silently drop them, so dev users hit 'redirect to /login on every click' after a 'successful' login. Switch to secure: NODE_ENV === 'production'. Public deploy still gets Secure-only. Also swap the login footer copy from a CLI hint to 'Forget Password? Contact IT' — operator-friendly, doesn't leak the bootstrap mechanism on the public sign-in screen. Test updated to assert secure=true under prod NODE_ENV and a new test locks in secure=false in dev. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 18:19:59 +08:00
yiekheng	7ab51335a4	fix(compose): pass AUTH_SECRET + OPERATOR_TOKEN_VERSION to web container The web service container only inherited NODE_ENV/DATABASE_URL/DATA_DIR/ MEDIA_DIR/WEB_PORT, so AUTH_SECRET (set in .env.development) was never visible inside the container. Login bailed out with 'Server is not configured for sign-in.' loginAction needs both keys to issue cookies, and OPERATOR_TOKEN_VERSION defaults to 1 (the env-bump session invalidator). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 18:17:22 +08:00
yiekheng	050292a282	feat(web): bare login header — only centred brand mark The login page lived inside the authenticated AppShell, so the desktop sidebar (with all nav items) and the mobile menu drawer were rendering on the sign-in screen. AppShell now branches on pathname=/login and renders a single centred header (cm + WhatsApp Bot) with no nav, plus the form. Drops the redundant in-card title since the header carries the brand. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 18:14:03 +08:00
yiekheng	1b7f553e24	fix(db): bump 0010/0011 journal timestamps so drizzle applies them drizzle's migrator skips entries whose 'when' is older than the latest applied migration's recorded created_at. 0010 (1778405570914) and 0011 (1778405817706) were generated before 0009's manually-set when of 1778464000000, so 'pnpm migrate' reported success but never ran the auth + telegram-drop migrations against any DB whose 0009 had landed. Bumping 0010/0011 to 0009.when + 1s/+2s makes the timestamps strictly monotonic so future drizzle migrate runs apply them. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 18:13:55 +08:00
yiekheng	b29d137c84	feat: production hardening — robots, allowedOrigins, container non-root, rate limits, CLI bootstrap robots.ts + metadata.robots blocks indexing. serverActions.allowedOrigins gates cross-origin Server Action posts. Bot + web Dockerfiles add a non-root 'app' user (uid 1000) with chmod 700 on /data/sessions. sendTestAction grows a per-group rate limit (3/60s). resumeReminderRunAction + cancelReminderRunAction get a per-IP rate limit (30/10s). .env.example documents every required key. packages/db/src/scripts/{set-password,create-user}.ts + thin shell wrappers in scripts/ — first admin sets their password via ./scripts/set-password.sh admin before signing in.	2026-05-10 18:05:34 +08:00
yiekheng	67091c294a	feat(web): user-management surface (admin only) createUserAction, setUserRoleAction, resetUserPasswordAction, deleteUserAction — all gated by requireAdmin(). Self-demote and last-admin guards prevent the operator from accidentally locking themselves out. /settings/users page lists every operator with inline Demote/Promote, Reset password, and Delete buttons. 10 unit tests.	2026-05-10 18:01:09 +08:00
yiekheng	b77a9d106d	feat(web): middleware gates non-allowlisted paths on session cookie Edge-runtime check via auth-cookie.verifySession. /api/* paths get a 401 (no body) when unauthenticated; pages get a 307 to /login with the original path encoded into ?next=. Allowlist explicitly excludes /api/events and /api/qr — both were unauthenticated in v1.1.0 and let an unauthenticated client snoop the entire SSE event stream and enumerate paired account QR codes.	2026-05-10 17:57:07 +08:00
yiekheng	5b4787d10e	fix(web): typed-routes + redirect-mock signatures in auth.ts Next.js 16 typed-routes (experimental.typedRoutes in next.config.ts) narrows redirect()'s parameter to RouteImpl<T>, which a runtime string from the form can't satisfy. Cast to any with a comment for the two redirect call sites in auth.ts. The auth.test.ts redirectMock used `() =>` zero-arg signature, which typescript rejected once the action started passing the path through. Change to `(_path: string) =>` so the signature matches and the test still passes (vitest's esbuild-transpiled run was fine; tsc caught it). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 17:54:59 +08:00
yiekheng	4f1056cdcd	feat(web): /login page with username + password form Server-rendered card-style login. Form posts to loginAction; on failure the client renders the generic 'Invalid username or password' error. Centred, mobile-first, autocomplete-friendly so phone PWAs autofill from the keychain on subsequent logins.	2026-05-10 17:52:35 +08:00
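
Commit c493101b60 describes the password policy as at least 6 characters plus a mix of letters with numbers or symbols, with letters-only and digits-only passwords rejected. The sketch below is one possible reading of that rule, not the contents of apps/web/src/lib/password-policy.ts, which this log does not show. The names checkPassword and PolicyResult, the error strings, and the restriction to ASCII letters are all assumptions made for illustration.

```typescript
// Hypothetical sketch of the policy described in c493101b60.
// The function name, result type, and messages are illustrative only.
export type PolicyResult = { ok: true } | { ok: false; reason: string };

const MIN_LENGTH = 6;

export function checkPassword(password: string): PolicyResult {
  if (password.length < MIN_LENGTH) {
    return { ok: false, reason: "Password must be at least 6 characters." };
  }
  // Assumption: "letters" means ASCII letters; the real policy may differ.
  const hasLetter = /[A-Za-z]/.test(password);
  // Any non-letter, non-whitespace character counts as a number or symbol.
  const hasDigitOrSymbol = /[^A-Za-z\s]/.test(password);
  if (!hasLetter || !hasDigitOrSymbol) {
    return { ok: false, reason: "Password must mix letters with numbers or symbols." };
  }
  return { ok: true };
}
```

Under this reading, "abc123" and "abc!!!" pass, while "abcdef", "123456", and "ab1" are rejected.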


228 Commits
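
Commit 1b7f553e24 fixes migrations that were silently skipped because their journal timestamps were older than an earlier entry's. A small guard like the one below could catch that ordering problem before it ships. It is a sketch under assumptions: the JournalEntry shape lists only the idx, tag, and when fields, the function name findStaleEntries is hypothetical, and the tags "0009" to "0011" are shortened placeholders. The timestamps are the ones quoted in the commit message.

```typescript
// Only the journal fields this check needs; other fields are omitted.
interface JournalEntry {
  idx: number;
  tag: string;
  when: number; // epoch milliseconds
}

// Returns the tags of entries whose `when` is not strictly greater than
// the largest `when` seen among lower-indexed entries.
export function findStaleEntries(entries: JournalEntry[]): string[] {
  const ordered = entries.slice().sort((a, b) => a.idx - b.idx);
  const stale: string[] = [];
  let latest = Number.NEGATIVE_INFINITY;
  for (const entry of ordered) {
    if (entry.when <= latest) {
      stale.push(entry.tag);
    } else {
      latest = entry.when;
    }
  }
  return stale;
}

// Timestamps as described in 1b7f553e24 (tags are placeholders).
const beforeFix: JournalEntry[] = [
  { idx: 9, tag: "0009", when: 1778464000000 },
  { idx: 10, tag: "0010", when: 1778405570914 },
  { idx: 11, tag: "0011", when: 1778405817706 },
];

// After the fix: 0009.when + 1s and + 2s.
const afterFix: JournalEntry[] = [
  { idx: 9, tag: "0009", when: 1778464000000 },
  { idx: 10, tag: "0010", when: 1778464001000 },
  { idx: 11, tag: "0011", when: 1778464002000 },
];
```

With these inputs, findStaleEntries(beforeFix) flags 0010 and 0011, and findStaleEntries(afterFix) returns an empty list. Comparing against the running maximum rather than the immediately preceding entry matters here, because 0011 is newer than 0010 yet still older than 0009.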