# B-auth: Login + WebAuthn Passkeys Design

**Date:** 2026-05-02
**Status:** Approved (design)
**Sequel to:** [2026-05-02-b2-b3-ui-port-pwa-design.md](2026-05-02-b2-b3-ui-port-pwa-design.md)
**Followed by:** B4 cutover (delete `app/cm_web_view.py`, retire `cm-web` Flask service, rename `cm-web-next` → `cm-web`).

## Problem

The Next.js dashboard (`cm-web-next`) currently has zero auth. Anyone who can reach `https://heng.04080616.xyz/` (the public vhost) lands directly on the accounts table. The plan was for aaPanel basic auth (C3) to gate the URL, and that remains a fine outer defense, but the user wants:

1. **In-PWA Face ID / fingerprint sign-in.** Once the PWA is installed, opening it should hit a real WebAuthn flow, not an OS-mediated basic-auth dialog. Passkeys feel native; basic auth in a chromeless PWA feels jarring.
2. **A password fallback** for first-time login on a new device, or when biometrics aren't available.

The existing `CM_AGENT_ID` / `CM_AGENT_PASSWORD` env vars already define an operator identity per deployment (rex-cm has an agent, siong-cm has an agent). Reusing those as the dashboard credentials, instead of building a separate user table, keeps B-auth scope small and avoids duplicating identity state.

## Goal

Add an in-app login flow to `cm-web-next`:

- A `/login` page that shows two options side-by-side: a "Sign in with passkey" button (preferred when one is enrolled on this device), and a username + password form (fallback).
- Password sign-in compares against the existing `CM_AGENT_ID` and `CM_AGENT_PASSWORD` env vars using a constant-time compare.
- WebAuthn passkey enrollment (after first password sign-in, on a settings page) lets the operator add a Face ID / Touch ID / fingerprint credential bound to the device. Subsequent visits skip the password.
- Session state: a sealed (encrypted and signed) `httpOnly` cookie via `iron-session`. 30-day rolling expiry; refreshes on activity.
- All auth state lives in `cm-web-next`: no api-server changes, no mysql schema change. Passkeys are stored as JSON in a docker volume mounted into the container.
- Middleware gates every dashboard route except `/login`. The login Server Actions (`loginWithPassword`, `beginAuthentication`, `finishAuthentication`) stay callable while logged out.

## Non-Goals

- **No mysql schema change.** Passkeys live in a JSON file in a docker volume. For one operator with maybe 2-4 devices total, a real DB table is overkill.
- **No separate identity service** (Authelia, Keycloak, Cloudflare Access). All auth lives in `cm-web-next`. Authelia remains an out-of-scope upgrade path if multi-tenant or multi-deployment SSO ever becomes a need.
- **No multi-user support.** One operator per deployment, identified by `CM_AGENT_ID`. The passkey JSON is keyed by `CM_AGENT_ID` so that if a deployment ever swaps identity, the passkeys for the old identity stay scoped to the old identity.
- **No "forgot password" flow.** The password is the env var. If the operator can't remember it, they look it up in the deployment's `.env`. There is no recovery email and no reset token.
- **No api-server-side auth.** api-server stays internal-only (per C5), reached only from inside the docker network by web-view and web-next. Auth is a `cm-web-next` concern, not an api-server concern.
- **No public `/api/*` routes for the auth flow.** WebAuthn challenge/response goes through Server Actions, preserving the "no scrapable JSON surface" architecture.
- **B4 cutover is not in this scope.** Legacy Flask `cm_web_view.py` keeps running with no auth (gated only by aaPanel basic auth on its `https://...` vhost) until B4 retires it.

## Architecture
### Identity model

One operator per `cm-web-next` instance, identified by `CM_AGENT_ID`. The same env var the bots use to log into cm99.net is reused as the dashboard username. The "session" is a cookie that says "the holder has authenticated as `CM_AGENT_ID`." Nothing more granular.

When `CM_AGENT_ID` changes (rex-cm gets a new agent, say), all existing passkeys for the old `CM_AGENT_ID` become inaccessible. This is by design: the passkey JSON is keyed by username, so the new identity starts with no passkeys and enrolls from scratch.

### Login flow — password

1. Browser hits `/` → middleware sees no session cookie → redirect (`NextResponse.redirect` defaults to 307) to `/login?next=%2F`.
2. `/login` page is a Server Component (the form is a Client Component, for state).
3. User types `CM_AGENT_ID` and `CM_AGENT_PASSWORD`, submits.
4. Client calls the `loginWithPassword({ username, password })` Server Action.
5. Server Action:
   - Reads `CM_AGENT_ID` and `CM_AGENT_PASSWORD` from env.
   - **Constant-time compare** both fields using `crypto.timingSafeEqual` over equal-length buffers (it throws on a length mismatch, so inputs are brought to a fixed length first). Both comparisons always run; the results are combined only afterwards.
   - If both match: sets the session cookie with `{ username: CM_AGENT_ID, authenticatedAt: Date.now() }`.
   - If either doesn't: returns `{ ok: false, error: "invalid credentials" }` (no leakage about which one).
6. Browser redirects to `next` (default `/`).

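
The compare in step 5 could be sketched roughly as follows. This is a minimal illustration, not the final `loginWithPassword` body: `safeEqual`, `checkCredentials`, and the `MAX_FIELD_BYTES` cap are hypothetical names, and the zero-padding approach is one of several ways to satisfy the equal-length requirement of `crypto.timingSafeEqual` (hashing both sides first is another).

```typescript
import { Buffer } from "node:buffer";
import { timingSafeEqual } from "node:crypto";

// Hypothetical cap: both inputs are zero-padded to this many bytes.
const MAX_FIELD_BYTES = 256;

// Compare two strings without an early exit on the first differing byte.
function safeEqual(supplied: string, expected: string): boolean {
  const suppliedBytes = Buffer.from(supplied, "utf8");
  const expectedBytes = Buffer.from(expected, "utf8");
  const a = Buffer.alloc(MAX_FIELD_BYTES); // zero-filled
  const b = Buffer.alloc(MAX_FIELD_BYTES);
  suppliedBytes.copy(a); // copies at most MAX_FIELD_BYTES bytes
  expectedBytes.copy(b);
  const sameBytes = timingSafeEqual(a, b);
  // Padding alone would treat "abc" and "abc\0" as equal, so lengths must
  // match too. Values longer than the cap never compare equal.
  const sameLength =
    suppliedBytes.length === expectedBytes.length &&
    suppliedBytes.length <= MAX_FIELD_BYTES;
  return sameBytes && sameLength;
}

// Evaluate both fields before combining, so a wrong username costs the
// same work as a wrong password.
function checkCredentials(
  username: string,
  password: string,
  expectedId: string,
  expectedPassword: string,
): boolean {
  const userOk = safeEqual(username, expectedId);
  const passOk = safeEqual(password, expectedPassword);
  return userOk && passOk;
}
```

In the real action the expected values would come from `process.env.CM_AGENT_ID` and `process.env.CM_AGENT_PASSWORD`; they are passed as parameters here only to keep the sketch self-contained.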
### Login flow — passkey

1. `/login` page detects (client-side) whether `PublicKeyCredential.isUserVerifyingPlatformAuthenticatorAvailable()` resolves to `true` and whether at least one passkey is enrolled (server-supplied flag in the page payload).
2. If both are true: render a "Sign in with passkey" button as the primary CTA, password form below.
3. Click triggers the `beginAuthentication()` Server Action → returns `PublicKeyCredentialRequestOptionsJSON` with a fresh server-generated challenge.
4. Client invokes `@simplewebauthn/browser`'s `startAuthentication()`, which prompts Face ID / fingerprint.
5. Browser returns a signed assertion → client passes it to the `finishAuthentication({ response })` Server Action.
6. Server looks up the matching credential by ID, verifies the assertion via `@simplewebauthn/server`'s `verifyAuthenticationResponse`, stores the new signature counter reported by the authenticator, and sets the session cookie.
7. Browser redirects to `next`.

### Passkey enrollment flow

1. Once authenticated (via password), user visits `/settings/passkeys`.
2. "Add passkey" button → `beginRegistration()` Server Action returns `PublicKeyCredentialCreationOptionsJSON`.
3. Client invokes `@simplewebauthn/browser`'s `startRegistration()`; the Face ID / fingerprint prompt creates a new credential.
4. Client sends the attestation to the `finishRegistration({ response, deviceName })` Server Action.
5. Server verifies via `verifyRegistrationResponse`, persists `{ id, publicKey, counter, transports, name, createdAt }` to the JSON file.
6. Page revalidates, the new passkey appears in the list.

The settings page lists existing passkeys with their device names + a "Remove" button. Removing a passkey deletes its record from the JSON file.

### Session

| Concern | Choice |
|---|---|
| Library | `iron-session` (single small dep, hooks into Next.js cleanly via App Router cookies API) |
| Cookie name | `cm_auth` |
| Cookie attrs | `httpOnly`, `secure` (when `NODE_ENV=production`), `sameSite=lax`, `path=/` |
| Expiry | 30-day rolling: refresh on every request that touches a page |
| Secret | `CM_AUTH_SECRET` env var. ≥32 chars random. Operator generates with `openssl rand -hex 32`. |
| Body | `{ username: string, authenticatedAt: number }`, kept minimal so a stale session doesn't carry stale state. |

### Passkey storage

JSON file at `/data/auth/passkeys.json` inside the container. Mounted from a named volume `${CM_DEPLOY_NAME:-cm}-web-next-auth-data` so it persists across container restarts and image rebuilds.

Schema:

```json
{
  "<CM_AGENT_ID>": [
    {
      "id": "base64url-credential-id",
      "publicKey": "base64url-public-key",
      "counter": 42,
      "transports": ["internal", "hybrid"],
      "name": "iPhone 15 Pro",
      "createdAt": "2026-05-02T12:34:56Z"
    }
  ]
}
```

Top-level keys are `CM_AGENT_ID` values; values are arrays of credential records. The JSON file is read on every WebAuthn flow (small file, no caching needed) and written atomically (write to `passkeys.json.tmp`, fsync, rename).

A small wrapper module `web/lib/auth-store.ts` owns the read/write and locks via a single in-process mutex to prevent concurrent writes from racing.

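
The write-to-temp, fsync, rename sequence described above might look roughly like this. It is a sketch under assumptions: `writeJsonAtomic` and `readJsonOrEmpty` are hypothetical helper names, the `0o600` file mode is an illustrative choice, and treating a missing file as an empty store is one possible convention.

```typescript
import { promises as fs } from "node:fs";
import * as path from "node:path";

// Write JSON so readers see either the old file or the new one, never a
// partially written file.
async function writeJsonAtomic(filePath: string, data: unknown): Promise<void> {
  const tmpPath = `${filePath}.tmp`;
  await fs.mkdir(path.dirname(filePath), { recursive: true });
  const handle = await fs.open(tmpPath, "w", 0o600);
  try {
    await handle.writeFile(JSON.stringify(data, null, 2), "utf8");
    await handle.sync(); // fsync: flush file contents to disk before the rename
  } finally {
    await handle.close();
  }
  // rename() replaces the target atomically on the same filesystem.
  await fs.rename(tmpPath, filePath);
}

// Read the store, treating a missing file as "no passkeys enrolled yet".
async function readJsonOrEmpty(filePath: string): Promise<Record<string, unknown[]>> {
  try {
    return JSON.parse(await fs.readFile(filePath, "utf8"));
  } catch (err) {
    if ((err as NodeJS.ErrnoException).code === "ENOENT") return {};
    throw err;
  }
}
```

The temp file sits next to the target on purpose: a rename across filesystems is not atomic, and the named volume guarantees both paths share one filesystem.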
### Server Actions inventory

All in `web/app/auth-actions.ts` with `"use server"`:

| Action | Purpose |
|---|---|
| `loginWithPassword({ username, password })` | Constant-time compare → set cookie → return `{ ok }` |
| `logout()` | Clear cookie → return `{ ok: true }` |
| `beginRegistration()` | Generate registration options, store challenge in session, return options. Requires authenticated session. |
| `finishRegistration({ response, deviceName })` | Verify attestation, persist credential to JSON. Requires authenticated session. |
| `beginAuthentication()` | Generate authentication options, store challenge in session, return options. NO auth required (this IS the login). |
| `finishAuthentication({ response })` | Verify assertion, set cookie, return `{ ok }`. NO auth required. |
| `removePasskey({ credentialId })` | Delete from JSON. Requires authenticated session. |

The challenge for register/authenticate is stored in the session cookie (small, sealed, transient). On the next call (`finishRegistration` / `finishAuthentication`) the server retrieves it from the cookie and clears it, so each challenge is usable once.

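
The retrieve-and-clear step could be sketched as below. The names `consumeChallenge` and `isAuthenticated` are hypothetical, and the sketch makes one assumption worth flagging: because `beginAuthentication()` runs before login, the cookie can hold a pending challenge with no `username`, so the identity fields are optional here and a challenge-only cookie must not count as a logged-in session.

```typescript
type PendingChallenge = {
  kind: "register" | "authenticate";
  challenge: string;
  expiresAt: number; // epoch milliseconds
};

// Assumption: identity fields are optional so a pre-login cookie can carry
// only the pending challenge.
type SessionData = {
  username?: string;
  authenticatedAt?: number;
  pendingChallenge?: PendingChallenge;
};

// Return the stored challenge if it matches the flow and has not expired.
// The challenge is cleared in every case, so it can never be replayed.
function consumeChallenge(
  session: SessionData,
  kind: PendingChallenge["kind"],
  now: number = Date.now(),
): string | null {
  const pending = session.pendingChallenge;
  delete session.pendingChallenge;
  if (!pending) return null;
  if (pending.kind !== kind) return null;
  if (pending.expiresAt <= now) return null;
  return pending.challenge;
}

// A cookie that only carries a pending challenge is not a login.
function isAuthenticated(session: SessionData | null): boolean {
  return (
    session !== null &&
    typeof session.username === "string" &&
    session.username.length > 0 &&
    typeof session.authenticatedAt === "number"
  );
}
```

The caller would re-seal the mutated session after `consumeChallenge` so the cleared state actually reaches the browser.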
### Middleware

`web/middleware.ts` runs on every request that the matcher selects:

```typescript
import { NextRequest, NextResponse } from "next/server";
import { getSessionFromCookie } from "@/lib/auth";

// Paths reachable without a session.
const PUBLIC_PATHS = new Set(["/login"]);

export async function middleware(req: NextRequest) {
  const path = req.nextUrl.pathname;
  if (PUBLIC_PATHS.has(path)) return NextResponse.next();

  const session = await getSessionFromCookie(req.cookies);
  if (!session) {
    // Not signed in: send the browser to /login and remember where it was going.
    const url = req.nextUrl.clone();
    url.pathname = "/login";
    url.searchParams.set("next", path);
    return NextResponse.redirect(url);
  }
  return NextResponse.next();
}

export const config = {
  // Skip _next, static, favicon, manifest, icon endpoints, etc.
  matcher: ["/((?!_next|icon|apple-icon|manifest.webmanifest|favicon.ico).*)"],
};
```

Server Actions are not a separate route: Next.js sends each invocation as a POST to the URL of the page that triggered it, with the action ID and an encoded payload. The matcher therefore covers them like any other request, and actions invoked from `/login` pass because `/login` is public. The same property means middleware alone cannot protect an action, since a caller can POST an action to a public path. Auth-required actions check the session manually inside the action body.

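
The middleware above puts the original path into `next`, and the login flows later redirect to it. Since `next` arrives from the query string, the redirect target probably wants a sanity check first. The helper below is a hypothetical sketch (`safeNextPath` is not a name from this design) that accepts only same-origin absolute paths and falls back to `/` otherwise.

```typescript
// Accept only local absolute paths such as "/users/"; anything that could
// leave the origin, or loop back to the login page, falls back.
function safeNextPath(next: string | null | undefined, fallback: string = "/"): string {
  if (!next) return fallback;
  // Must start with exactly one "/"; "//host" and "/\host" are treated by
  // browsers as protocol-relative URLs.
  if (!next.startsWith("/") || next.startsWith("//") || next.startsWith("/\\")) {
    return fallback;
  }
  // Reject control characters (tabs, newlines, etc.).
  if (/[\u0000-\u001f\u007f]/.test(next)) return fallback;
  // Do not bounce back to the login page itself.
  if (next === "/login" || next.startsWith("/login?") || next.startsWith("/login/")) {
    return fallback;
  }
  return next;
}
```

For example, `safeNextPath("/settings/passkeys")` returns the path unchanged, while `safeNextPath("https://example.com")` and `safeNextPath("//example.com")` both return `/`.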
### Files Created / Modified

| File | Operation | Purpose |
|---|---|---|
| `web/middleware.ts` | Create | Route gate |
| `web/lib/auth.ts` | Create | Session create/read/destroy helpers (iron-session wrapper) |
| `web/lib/auth-store.ts` | Create | JSON-file CRUD for passkeys with in-process write lock |
| `web/app/auth-actions.ts` | Create | All Server Actions listed above |
| `web/app/login/page.tsx` | Create | Login UI (Server Component shell) |
| `web/app/login/login-form.tsx` | Create | Client Component for the form + passkey button |
| `web/app/settings/passkeys/page.tsx` | Create | Passkey list + add/remove (Server Component) |
| `web/app/settings/passkeys/passkey-list.tsx` | Create | Client Component handling enrollment + removal |
| `web/components/nav.tsx` | Modify | Add Settings link + Sign-out button (account menu) |
| `web/package.json` | Modify | Add `iron-session`, `@simplewebauthn/server`, `@simplewebauthn/browser` |
| `docker-compose.yml` | Modify | Add `web-next-auth-data` named volume + mount in `web-next` service |
| `docker-compose.override.yml` | Modify | Same volume mount in dev override |
| `envs/dev/.env.example` | Modify | Add `CM_AUTH_SECRET=devsecret-32-bytes-or-more-please-rotate` |
| `envs/rex/.env.example` | Modify | Same with placeholder, operator generates real value |
| `envs/siong/.env.example` | Modify | Same |
| `AGENTS.md` | Modify | Add an "Auth" subsection documenting `CM_AUTH_SECRET` and the passkey JSON volume |

No file deletions. No changes outside `web/`, the two compose files, the per-deployment env templates, and `AGENTS.md`.

### `web/lib/auth.ts` shape

```typescript
import "server-only";
import { cookies } from "next/headers";
import type { NextRequest } from "next/server";
import { sealData, unsealData } from "iron-session";

const COOKIE_NAME = "cm_auth";
const COOKIE_TTL_SECONDS = 30 * 24 * 60 * 60; // 30 days

type Session = {
  username: string;
  authenticatedAt: number;
  // Transient WebAuthn state (challenge, type) lives here too while a flow is in progress.
  pendingChallenge?: { kind: "register" | "authenticate"; challenge: string; expiresAt: number };
};

// Fail fast on a missing or weak secret rather than sealing with it.
function secret(): string {
  const s = process.env.CM_AUTH_SECRET;
  if (!s || s.length < 32) {
    throw new Error("CM_AUTH_SECRET missing or shorter than 32 chars");
  }
  return s;
}

export async function getSession(): Promise<Session | null> { /* read cookie, unseal */ }
// Variant used by middleware, which reads from the request's cookie jar.
export async function getSessionFromCookie(cookies: NextRequest["cookies"]): Promise<Session | null> { /* read cookie, unseal */ }
export async function setSession(s: Session): Promise<void> { /* seal, write cookie */ }
export async function clearSession(): Promise<void> { /* delete cookie */ }
export async function requireSession(): Promise<Session> { /* throws if no session */ }
```

`server-only` ensures this never bundles into client code (poison import: it fails the build if imported from a client component).

### `web/lib/auth-store.ts` shape

```typescript
import "server-only";
import { promises as fs } from "node:fs";
import path from "node:path";
// Assumes a @simplewebauthn/server version that re-exports this type;
// older releases ship it in @simplewebauthn/types instead.
import type { AuthenticatorTransportFuture } from "@simplewebauthn/server";

const FILE_PATH = process.env.CM_AUTH_STORE_PATH ?? "/data/auth/passkeys.json";

export type PasskeyRecord = {
  id: string;
  publicKey: string;
  counter: number;
  transports: AuthenticatorTransportFuture[];
  name: string;
  createdAt: string;
};

// Tail of the write queue; each write chains onto it.
let writeLock: Promise<void> = Promise.resolve();

export async function readPasskeys(username: string): Promise<PasskeyRecord[]> { /* ... */ }
export async function appendPasskey(username: string, rec: PasskeyRecord): Promise<void> { /* lock, read, append, atomic-write */ }
export async function removePasskey(username: string, credentialId: string): Promise<boolean> { /* lock, read, filter, atomic-write */ }
export async function bumpCounter(username: string, credentialId: string, counter: number): Promise<void> { /* same */ }
```

The `writeLock` chain serializes writes within a single Node process. With one container (no clustering) this is sufficient. If we ever scale `cm-web-next` horizontally, switch to a real lock file or move to mysql.

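
One way the `writeLock` promise chain could serialize writers is sketched below. `withWriteLock` is a hypothetical helper name; the store functions above would wrap their read-modify-write sections in it.

```typescript
// Tail of the queue. It never rejects, so one failed write cannot wedge
// every later writer.
let writeLock: Promise<void> = Promise.resolve();

// Run `task` only after all previously queued tasks have settled.
function withWriteLock<T>(task: () => Promise<T>): Promise<T> {
  const run = writeLock.then(task);
  // Swallow the outcome on the chain itself; the caller still receives
  // the real result or rejection through `run`.
  writeLock = run.then(
    () => undefined,
    () => undefined,
  );
  return run;
}
```

A caller would write something like `await withWriteLock(async () => { /* read, modify, atomic-write */ })`. The lock only covers one Node process, which matches the single-container assumption stated above.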
### Login page UI brief

frontend-design generates the `login/page.tsx` shell + `login-form.tsx` client component matching the SaaS aesthetic of the rest of the dashboard. Concrete requirements:

- Centered card on the workbench backdrop, white with `ring-1 ring-zinc-200/60`, rounded-2xl.
- Brand mark (small "CM" tile) + "Sign in" heading.
- **Primary CTA:** "Sign in with passkey" button (large, dark zinc-900), only rendered if the page payload says a passkey is enrolled AND `isUserVerifyingPlatformAuthenticatorAvailable()` resolves to `true` in the browser.
- **Below it:** "or username + password" divider, then two inputs (username, password) with a smaller "Sign in" button.
- Error state: inline red below the form if `loginWithPassword` returns `{ ok: false }`.
- All inputs use `text-base sm:text-[13px]` (the existing iOS auto-zoom fix).
- No "remember me": the cookie is rolling 30 days by default.
- "Forgot your password? Check the deployment's `.env` file" as a small zinc-500 footer (matter-of-fact, internal-tool tone).

### Settings/passkeys page UI brief

- Standard dashboard layout (Nav, page heading "Passkeys").
- List of enrolled passkeys: name, created date, "Remove" button. Empty state: "No passkeys enrolled yet."
- "Add passkey" button at the top: opens a modal with a single text input ("Device name", e.g., "iPhone 15"), then triggers `startRegistration`.
- After successful enrollment: row appears, success toast fires (matches existing toast pattern).

### Nav modification

Add a small account menu on the right side (next to the existing Accounts/Users tab pills):

- A subtle button showing `CM_AGENT_ID` (truncated if long).
- On click: dropdown with "Passkey settings" → `/settings/passkeys`, and "Sign out" → calls `logout()` Server Action → redirect to `/login`.

The dropdown reuses the styling of the existing modal/sheet primitive; no new component primitive is introduced.

## Verification

1. **Cold start.** `bash scripts/dev.sh up`. Open `http://localhost:8010/`. Redirected to `/login?next=%2F`.
2. **Password sign-in.** Type `CM_AGENT_ID` and `CM_AGENT_PASSWORD` from the dev `.env`. Submit. Redirect to `/`. Accounts table renders.
3. **Cookie set.** DevTools → Application → Cookies → `cm_auth` present, `httpOnly`, `sameSite=lax`, expires in ~30 days. `secure` is set in prod and absent in dev (because `NODE_ENV=development`).
4. **Wrong password.** Type wrong password. Form shows red "invalid credentials". No success toast. No cookie set.
5. **Sign out.** Click the user menu → Sign out. Redirected to `/login`. Cookie cleared.
6. **Passkey enrollment** (Chrome desktop with Touch ID, or iPhone). Sign in with password → settings/passkeys → Add passkey → name "MacBook" → Touch ID prompt → success toast → row appears in list.
7. **Passkey login.** Sign out. `/login` now shows "Sign in with passkey" as primary CTA. Click → Touch ID → redirect to `/`.
8. **Passkey persistence.** `bash scripts/dev.sh down && bash scripts/dev.sh up`. Sign-in flow still recognizes the previously enrolled passkey (volume persisted).
9. **Passkey removal.** Sign in → settings/passkeys → Remove. Row disappears, JSON file no longer contains it.
10. **Middleware coverage.** While signed out: `/`, `/users/`, `/settings/passkeys` all redirect to `/login`. `/login` itself does not redirect.
11. **Server Actions auth.** Calling `removePasskey` from a client without a valid session returns an error (auth-action body checks `getSession()` and throws/returns 401-equivalent).
12. **Constant-time compare.** Manually inspect `loginWithPassword` source: it uses `crypto.timingSafeEqual` over zero-padded buffers of equal length. (No timing-channel leak about which field is wrong.)
13. **Volume preserved across rebuild.** `sudo docker compose -f docker-compose.yml -f docker-compose.override.yml build --no-cache web-next` then `up`. Passkey JSON survives.

## Risk

Medium.

- **JSON-file write durability.** A crash mid-write could corrupt the file. Mitigation: atomic write (`tmp` + `rename`), single in-process mutex. For one operator with low write frequency (passkey adds/removes are rare), this is sufficient. If we ever need multi-writer guarantees, switch to mysql.
- **`CM_AUTH_SECRET` rotation invalidates all sessions.** Expected behavior: operators understand a secret rotation logs everyone out. Document this.
- **Passkeys aren't multi-user.** If two operators ever need to share a deployment, they'd share the same `CM_AGENT_ID` identity and the same passkey list. Fine for now, but a hard scaling cliff. Captured as out-of-scope.
- **Browser support.** WebAuthn is supported in all modern browsers (iOS 16+, Chrome, Edge, Firefox, Safari). On unsupported browsers the password flow is the only path; we feature-detect and hide the passkey CTA.
- **iOS PWA standalone WebAuthn.** Apple has had platform bugs in earlier iOS versions where standalone PWAs couldn't trigger WebAuthn. iOS 17+ is reliable. Document the minimum version.
- **Server Action surface.** Server Actions ARE network-callable (Next.js exposes them as POST requests). They aren't "private functions": anyone who reverse-engineers the Next.js wire format can call them. Mitigation: every action that requires auth checks the session inside the action body. Reverse-engineering Next.js's encoding costs more than calling an open `/api/foo` endpoint, but that obscurity is not a control; with the in-body checks, the practical attack surface is similar to a per-route auth-required `/api/*` proxy.

## Out-of-Scope Follow-Ups

- **B4 cutover**: separate cycle. Delete `app/cm_web_view.py`, retire `cm-web` (Flask) service, rename `cm-web-next` → `cm-web`. After B4, the legacy Flask UI (which has no auth) goes away entirely.
- **Authelia / SSO**: if multi-deployment SSO ever becomes a need, swap the in-app auth for an Authelia container. No timeline; revisit if/when.
- **Session listing / revocation**: show "active sessions" on settings, allow remote logout. Useful for "I lost a phone" recovery if you want something stricter than "rotate `CM_AUTH_SECRET`". YAGNI for now.
- **CSRF review of Server Actions**: Next.js accepts Server Actions only via POST and compares the request's `Origin` header against the host, but reviewing the framework's CSRF posture for our specific deployment (behind the aaPanel vhost) is an exercise we can do separately.
- **Failed-login lockout**: a small per-IP counter that returns 429 after N bad password attempts. Defense-in-depth; aaPanel C4 rate-limit also helps.