Repro: fire a reminder, message lands 2-3 times in WhatsApp (logs
showed three 'fire-reminder: done' entries within 1.5 s for the same
reminderId).
Two interlocking root causes:
1. The queue was created at 'standard' policy (pre-dating the
stately rollout). pg-boss's createQueue is idempotent and DOES
NOT update the policy on an existing queue row, so re-deploying
the code that requested policy=stately silently kept the
standard policy. Standard accepts duplicate enqueues with the
same singletonKey — three reminder.fire jobs for the same
reminderId could all land at once.
2. The handler-level recent-run dedupe had a TOCTOU (time-of-check to
time-of-use) race. The check ran OUTSIDE the per-account mutex, so
three concurrent invocations all read 'no recent run', then took the
mutex one at a time and each INSERTed a fresh run + sent the message.
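For illustration, root cause 2 can be reproduced with a minimal in-memory sketch. Everything here (the runs array, withMutex, findRecentRun, fireReminderRacy) is a hypothetical stand-in, not the real handler or schema; it only shows why a check made before taking the mutex lets all three callers through.

```typescript
// Hypothetical in-memory stand-ins; none of these names come from the real codebase.
type Run = { reminderId: string; at: number };

const runs: Run[] = [];    // stands in for the runs table
const sent: string[] = []; // stands in for outbound messages

// Minimal promise-chain mutex: callers run one at a time, in arrival order.
let tail: Promise<void> = Promise.resolve();
function withMutex<T>(fn: () => Promise<T>): Promise<T> {
  const result = tail.then(fn);
  // Keep the chain alive whether fn resolved or rejected.
  tail = result.then(() => undefined, () => undefined);
  return result;
}

// Simulated async DB read: yields to the event loop before answering.
async function findRecentRun(reminderId: string): Promise<Run | undefined> {
  await Promise.resolve();
  return runs.find((r) => r.reminderId === reminderId);
}

// Racy handler: the dedupe check happens BEFORE the mutex is taken.
async function fireReminderRacy(reminderId: string): Promise<void> {
  if (await findRecentRun(reminderId)) return; // time of check
  await withMutex(async () => {
    runs.push({ reminderId, at: Date.now() }); // time of use
    sent.push(reminderId);
  });
}

// Three concurrent invocations for the same reminder, as in the logs.
async function demoRace(): Promise<number> {
  await Promise.all([1, 2, 3].map(() => fireReminderRacy("r1")));
  return sent.length;
}
```

In this sketch all three checks complete before the first insert, so the mutex only serializes three sends instead of preventing two of them.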
Fixes:
- registerReminderJobs now forces the queue policy via direct SQL
(UPDATE pgboss.queue SET policy = 'stately' WHERE name = ...
AND policy <> 'stately') on every boot. The UPDATE is idempotent and
also corrects pre-existing standard-policy rows.
- fireReminderInner re-checks for a recent run AFTER the mutex is
held but BEFORE the INSERT. By that point any concurrent winner
has already inserted, so the duplicate sees the row and bails
cleanly.
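The second fix is a double-checked pattern: keep the cheap outer check, then re-check once the mutex is held. A minimal sketch under the same assumptions as any in-memory model (runs, withMutex, findRecentRun and fireReminderFixed are hypothetical names, not the real fireReminderInner):

```typescript
// Hypothetical in-memory stand-ins; none of these names come from the real codebase.
type Run = { reminderId: string; at: number };

const runs: Run[] = [];    // stands in for the runs table
const sent: string[] = []; // stands in for outbound messages

// Minimal promise-chain mutex: callers run one at a time, in arrival order.
let tail: Promise<void> = Promise.resolve();
function withMutex<T>(fn: () => Promise<T>): Promise<T> {
  const result = tail.then(fn);
  tail = result.then(() => undefined, () => undefined);
  return result;
}

// Simulated async DB read: yields to the event loop before answering.
async function findRecentRun(reminderId: string): Promise<Run | undefined> {
  await Promise.resolve();
  return runs.find((r) => r.reminderId === reminderId);
}

async function fireReminderFixed(reminderId: string): Promise<void> {
  // Outer check: a fast path only, it can still race.
  if (await findRecentRun(reminderId)) return;
  await withMutex(async () => {
    // Inner check: runs with the mutex held, so any concurrent winner
    // has already inserted its run by the time we look.
    if (await findRecentRun(reminderId)) return;
    runs.push({ reminderId, at: Date.now() });
    sent.push(reminderId);
  });
}

// Three concurrent invocations for the same reminder.
async function demoFixed(): Promise<number> {
  await Promise.all([1, 2, 3].map(() => fireReminderFixed("r1")));
  return sent.length;
}
```

The outer check is kept because it lets the common non-racing duplicate bail without waiting on the mutex; correctness comes from the inner check alone.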
New test in fire-reminder.test.ts (the TOCTOU repro): the outer check
returns no recent run, the inner check returns a freshly-inserted one,
and the test asserts that the mutex was acquired and the second
findFirst was hit (i.e. we got past the outer check and the inner
check stopped us).
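The shape of that test can be sketched with hand-rolled stubs instead of a mocking framework. This is not the actual test: findFirst, withMutex, insertRunAndSend and fireReminderInnerSketch are hypothetical stand-ins, and the "no insert happened" counter is an extra check assumed here for illustration.

```typescript
type Run = { reminderId: string };
type Outcome = "sent" | "skipped";

// Stub: the first call (outer check) finds nothing; the second call
// (inner check) finds a run that a concurrent winner just inserted.
let findFirstCalls = 0;
async function findFirst(reminderId: string): Promise<Run | null> {
  findFirstCalls += 1;
  return findFirstCalls === 1 ? null : { reminderId };
}

// Spy mutex: counts acquisitions and runs the callback immediately.
let mutexAcquired = 0;
async function withMutex<T>(fn: () => Promise<T>): Promise<T> {
  mutexAcquired += 1;
  return fn();
}

// Spy for the side effect that must NOT happen in this scenario.
let inserts = 0;
async function insertRunAndSend(_reminderId: string): Promise<void> {
  inserts += 1;
}

// Stand-in for the handler under test, using the double-checked pattern.
async function fireReminderInnerSketch(reminderId: string): Promise<Outcome> {
  if (await findFirst(reminderId)) return "skipped"; // outer check
  return withMutex<Outcome>(async (): Promise<Outcome> => {
    if (await findFirst(reminderId)) return "skipped"; // inner check
    await insertRunAndSend(reminderId);
    return "sent";
  });
}

// Runs the scenario once and reports what the spies recorded.
async function runSketch() {
  const outcome = await fireReminderInnerSketch("r1");
  return { outcome, findFirstCalls, mutexAcquired, inserts };
}
```

Asserting on both the mutex acquisition and the second findFirst call distinguishes "stopped by the inner check" from "stopped by the outer check", which is the case the old code already handled.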
Verified live: pgboss.queue.policy is now 'stately' for reminder.fire.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>