Symptom
-------
Click "Unpair" on a connected account. The web action sets
`status='unpaired'`, but the account detail page often shows
"Disconnected" instead. On accounts that had been connected before,
the QR pair flow also restarts a few seconds later on its own.

Cause
-----
Two races inside the session manager:

1. The web's `unpairAccountAction` notifies the bot via `pg_notify`
   and then writes `status='unpaired'` to the row. The bot's
   `handleUnpair` calls `sessionManager.stop()` which closes the
   Baileys socket; Baileys eventually fires a `connection: close`
   event which the manager's `handleEvent` translates into a
   `status='disconnected'` UPDATE. Whichever write lands second wins.
   The user clicks Unpair and sees Disconnected.

2. The same close-event handler schedules a 5-second
   `stop().then(start())` reconnect for accounts whose
   `lastConnectedAt` is set. Five seconds after unpair, the bot
   silently re-opens the socket, the row flips to `pending`, and the
   QR carousel restarts.
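
To make the two races concrete, here is a minimal in-memory replay of one
bad interleaving before the fix. It is a sketch, not the real code:
`AccountRow`, `replayUnpairBeforeFix`, and the synchronous ordering are
assumptions made for illustration. In the bot these writes happen
asynchronously across the web action and the Baileys event handler.

```typescript
// Illustrative model only; none of these names exist in the codebase.
type Status = "connected" | "unpaired" | "disconnected" | "pending";

interface AccountRow {
  status: Status;
  lastConnectedAt: Date | null;
}

// Replays one possible ordering of writes and returns every status the
// row passed through, in order.
function replayUnpairBeforeFix(row: AccountRow): Status[] {
  const history: Status[] = [];
  const write = (next: Status): void => {
    row.status = next;
    history.push(next);
  };

  // Web action: notify the bot, then write the row.
  write("unpaired");

  // Race 1: the close event lands after the web write, and the handler
  // unconditionally writes 'disconnected' on top of it.
  write("disconnected");

  // Race 2: the same handler schedules a reconnect for accounts that
  // have connected before; the re-opened socket flips the row again.
  if (row.lastConnectedAt !== null) {
    write("pending");
  }

  return history;
}

const replay = replayUnpairBeforeFix({
  status: "connected",
  lastConnectedAt: new Date(0),
});
console.log(replay.join(" -> "));
```

With `lastConnectedAt` set, the replay ends on `pending`; without it, on
`disconnected`. Neither is the `unpaired` state the operator asked for.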

Fix
---
`stop(accountId, { intentional: true })` marks the account in a new
`intentionalStops` Set. When the close event lands, `handleEvent`
drains the flag (with `Set.delete()` returning whether the key was
present, so it's exactly-once and a stale flag can't bleed into a
later session) and skips both the DB UPDATE and the reconnect
schedule. The caller (only `handleUnpair` for now) is the one
choosing the row's next state, so we step out of its way.

The flag is set ONLY when callers ask for it. Internal recoveries
(restartRequired auto re-open, ephemeral-close back-off) keep the
default behaviour and continue to write `disconnected` + reschedule.
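
The flag handling described above can be sketched roughly as follows. This
is a simplified, synchronous model under stated assumptions:
`SketchSessionManager`, `handleClose`, `disconnectedWrites`, and
`scheduledReconnects` are hypothetical names standing in for the real
manager, the close branch of `handleEvent`, the DB UPDATE, and the
5-second reconnect timer.

```typescript
// Hypothetical stand-in for the session manager; it records what the
// real one would do instead of touching a socket or the database.
class SketchSessionManager {
  private readonly intentionalStops = new Set<string>();
  // Account IDs whose row would have been updated to 'disconnected'.
  readonly disconnectedWrites: string[] = [];
  // Account IDs for which a reconnect would have been scheduled.
  readonly scheduledReconnects: string[] = [];

  stop(accountId: string, opts: { intentional?: boolean } = {}): void {
    if (opts.intentional) {
      this.intentionalStops.add(accountId);
    }
    // The real stop() would close the socket here; the close event
    // arrives later and is modelled by handleClose().
  }

  handleClose(accountId: string, hasConnectedBefore: boolean): void {
    // Set.delete() returns true only when the key was present, so the
    // check and the clear are a single step and the flag is used once.
    if (this.intentionalStops.delete(accountId)) {
      return; // the caller owns the row's next state
    }
    this.disconnectedWrites.push(accountId);
    if (hasConnectedBefore) {
      this.scheduledReconnects.push(accountId);
    }
  }
}

const manager = new SketchSessionManager();
manager.stop("acct-1", { intentional: true });
manager.handleClose("acct-1", true); // flagged: no write, no reconnect
manager.handleClose("acct-1", true); // later close: default path again
console.log(manager.disconnectedWrites.length, manager.scheduledReconnects.length);
```

The second `handleClose` call takes the default path because the first one
already consumed the flag, which is the "stale flag can't bleed into a
later session" property.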

Drive-bys
---------
- Refresh the stale "the row is gone by the time we run" comment in
  unpair-handler: the row stays alive now (the operator can re-pair
  without retyping the label). Look up the account first so the
  audit log carries the real `operatorId` instead of `null`. The
  delete-account flow really does delete the row before notifying us;
  the lookup tolerates that and falls back to `null`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
42 lines
1.6 KiB
TypeScript
import { rm } from "node:fs/promises";
import { join } from "node:path";
import { db } from "../db.js";
import { env } from "../env.js";
import { sessionManager } from "../whatsapp/session-manager.js";
import { writeAuditLog } from "../audit.js";
import { pgNotifyWeb } from "./notify.js";
import { logger } from "../logger.js";

/**
 * Unpair handler: stop the live Baileys session and remove the on-disk
 * session files. The web action keeps the account row alive (status =
 * 'unpaired') so the operator can re-pair without retyping the label;
 * the {intentional: true} stop tells the session manager not to race
 * the web's status write with its own "disconnected" update or
 * schedule a reconnect for a session we just chose to tear down.
 *
 * For the delete-account flow the row IS gone by the time we run;
 * the audit log lookup tolerates that.
 */
export async function handleUnpair(accountId: string): Promise<void> {
  await sessionManager.stop(accountId, { intentional: true });
  await rm(join(env.SESSIONS_DIR, accountId), { recursive: true, force: true });
  try {
    const row = await db.query.whatsappAccounts.findFirst({
      where: (a, { eq }) => eq(a.id, accountId),
      columns: { operatorId: true },
    });
    await writeAuditLog(db, {
      operatorId: row?.operatorId ?? null,
      source: "web",
      action: "account.unpaired",
      targetType: "whatsapp_account",
      targetId: accountId,
      payload: {},
    });
  } catch (err) {
    logger.warn({ err, accountId }, "unpair: audit log failed (non-fatal)");
  }
  await pgNotifyWeb({ type: "session.disconnected", accountId });
}