cm_bot_v2/docs/aapanel-hardening.md
yiekheng ebccad2094 B4 cutover: retire Flask cm-web, rename cm-web-next → cm-web
End-state: a single web service (Next.js dashboard) per deployment, no
side-by-side Flask UI. The image name 'cm-web' now points at the Next.js
build; the legacy 'cm-web-next' tag is no longer published.

Changes:
- Delete app/cm_web_view.py and the Flask docker/web/Dockerfile.
- Rename docker/web-next/ → docker/web/ (Next.js Dockerfile takes the
  cm-web slot).
- docker-compose.yml: drop the web-view service. Rename web-next → web,
  container ${CM_DEPLOY_NAME}-web-next → ${CM_DEPLOY_NAME}-web, image
  cm-web-next → cm-web, named volume web-next-auth-data → web-auth-data.
  transfer-bot's depends_on no longer references web-view (vestigial
  startup ordering, never a runtime dependency).
- docker-compose.override.yml: same rename, dockerfile path updated.
- envs: drop CM_WEB_NEXT_HOST_PORT. Repurpose CM_WEB_HOST_PORT for the
  Next.js port (8010 dev, 8011 rex, 8012 siong) — same numeric values
  formerly held by CM_WEB_NEXT_HOST_PORT, so aaPanel routes don't move.
- scripts/dev.sh: drops web-view + web-next from up/reset-db/logs;
  --remove-orphans still cleans up legacy containers from before cutover.
- scripts/publish.sh: drop the cm-web-next build target.
- tests/test_debug_enabled.py: drop app.cm_web_view from the helper
  matrix (cm_api is now the only Flask entrypoint with _debug_enabled).
- AGENTS.md / README.md / docs/aapanel-hardening.md: rewrite Flask-era
  references; add migration steps for existing stacks; update aaPanel
  port references (8000/8001/8005 → 8010/8011/8012).
- .gitignore: add .env, .venv/, .playwright-mcp/, node_modules/, .next/
  so 'git add -A' can't accidentally stage secrets or build artifacts.

Operator action required to upgrade an existing deployment:
  1. .env: drop CM_WEB_NEXT_HOST_PORT line. Set CM_WEB_HOST_PORT to
     what CM_WEB_NEXT_HOST_PORT was. Make sure CM_AUTH_SECRET is set.
  2. aaPanel: if proxy_pass pointed at the legacy Flask port
     (8000/8001/8005), switch it to the new one (8010/8011/8012).
  3. Pull the new cm-web image (Next.js) and redeploy the stack. The
     old ${CM_DEPLOY_NAME}-web-view and ${CM_DEPLOY_NAME}-web-next
     containers will be replaced by a single ${CM_DEPLOY_NAME}-web.

Verified locally: docker-compose YAML parses; transfer-bot runtime is
unchanged (only depends_on tidied); 38-test python suite passes.
2026-05-03 10:12:20 +08:00


aaPanel Hardening Guide (Operator)

This is the hand-over guide for the C3 (auth), C4 (rate-limit + scanner deflection), and C7 (host firewall) slices of the prod hardening cycle. None of this is implemented in the repo; it lives in your aaPanel configuration and on your web host(s).

Companion spec: superpowers/specs/2026-05-02-prod-hardening-c1-c5-c6-design.md.

Threat model

aaPanel terminates TLS for https://<rex-domain>, https://<siong-domain>, and https://heng.04080616.xyz (the dev tier; see "Dev vhost" below) and proxies to LAN-reachable Next.js dashboard ports on each host (8011 rex, 8012 siong, 8010 dev). The request path is therefore scanner on the public internet → aaPanel → app. Without these mitigations, every probe for /.env, /.git/config, /.aws/config, /.htpasswd, or /php.php round-trips through the proxy. With them, aaPanel returns 444 immediately and the app never sees the request.

Post-B4 update. The dashboard now has built-in /cm-auth (password + WebAuthn passkey) that gates every route via Next.js middleware. C3 (basic auth at the proxy) is no longer the primary defense — it's optional belt-and-braces. Keep it only if you want a second factor at the edge before the Next.js middleware sees a request. The C4 (scanner deflection + rate limit) and C7 (host firewall) sections still apply unchanged in spirit; only the port numbers moved.

C3 — (Optional) Basic auth on the rex/siong/dev vhosts

Goal: an extra password challenge at the edge before requests reach /cm-auth. Skip this if /cm-auth is enough for your threat model.

Generate an htpasswd file (one per deployment is cleaner):

# On the aaPanel host, as root:
htpasswd -c /www/server/panel/data/htpasswd-rex   rex-operator
htpasswd -c /www/server/panel/data/htpasswd-siong siong-operator
htpasswd -c /www/server/panel/data/htpasswd-dev   dev-operator
chmod 640 /www/server/panel/data/htpasswd-*
chown www:www /www/server/panel/data/htpasswd-*

Add to the rex vhost's server { ... } block (aaPanel: site → settings → "Configuration File"):

auth_basic "rex restricted";
auth_basic_user_file /www/server/panel/data/htpasswd-rex;

Same shape for siong (htpasswd-siong) and dev (htpasswd-dev). Use a different password per deployment — reusing the same one means a leaked dev credential exposes prod. Reload nginx (aaPanel does this automatically on save).
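One way to get a distinct password per deployment is to generate them rather than pick them by hand. The sketch below is illustrative only: gen_password is a hypothetical helper, and the 24-character length is an arbitrary choice, not something the repo prescribes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one random 24-character password.
# 18 random bytes encode to exactly 24 base64 characters (no padding).
gen_password() {
  head -c 18 /dev/urandom | base64
}

# One independent password per deployment, so a leaked dev credential
# does not unlock rex or siong.
rex_pw="$(gen_password)"
siong_pw="$(gen_password)"
dev_pw="$(gen_password)"

# Store each value in your password manager before using it.
# htpasswd can read the password from stdin with -i, for example
# (as root on the aaPanel host):
#   printf '%s\n' "$rex_pw" | htpasswd -i -c /www/server/panel/data/htpasswd-rex rex-operator
printf 'generated %d-character passwords for rex, siong, dev\n' "${#rex_pw}"
```

Base64 output can contain + and /, which are awkward to type on a phone; that matters less if the password is saved to the OS keychain on first login.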

Phone UX note

Basic auth + iOS/Android keychain + Face ID / Touch ID flow: on first login, save the password into the OS keychain when prompted ("Save password to iCloud Keychain" on iOS, "Save to Google Password Manager" on Android). Subsequent visits trigger Face ID / fingerprint to autofill the basic-auth dialog. Caveats:

  • Safari (iOS): integration is reliable. Face ID prompts almost every visit unless you tick "Remember me on this device" in Safari's password autofill settings.
  • Chrome (Android): Google Password Manager autofills basic-auth in newer Chrome versions; biometric prompt appears.
  • In-app browsers (Telegram, WhatsApp link previews): often don't autofill basic-auth and force you to type. If this matters, share https://... URLs and ask people to open in their default browser.

If autofill behavior is choppy, the upgrade path is Authelia + WebAuthn passkeys — its own future cycle, not in this one.

C4 — Rate limit + scanner deflection

Scanner deflection (444 on known probe paths)

In each vhost's server { ... }:

# Deflect generic web vulnerability scanners. Return 444 (no response,
# closes connection) instead of letting them reach the app.
location ~* "^/(\.env|\.env\..*|\.git/.*|\.aws/.*|\.dockerenv|\.htpasswd|\.npmrc|.+\.php|i\.php|test\.php|php\.php|wp-(login|admin|content)/)" {
    access_log off;
    return 444;
}

# Robots: tell well-behaved crawlers to leave us alone.
location = /robots.txt {
    default_type text/plain;
    return 200 "User-agent: *\nDisallow: /\n";
}
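Before pasting the deflection pattern into a vhost, it can help to check which paths it catches. The following is a rough offline check, not a substitute for testing against nginx itself: is_probe_path is a hypothetical helper, and it assumes grep -E behaves like nginx's PCRE for this particular pattern, which holds only because the pattern uses constructs common to both.

```shell
#!/usr/bin/env bash
# Same pattern as the location block above, copied verbatim.
PROBE_RE='^/(\.env|\.env\..*|\.git/.*|\.aws/.*|\.dockerenv|\.htpasswd|\.npmrc|.+\.php|i\.php|test\.php|php\.php|wp-(login|admin|content)/)'

# Hypothetical helper: succeeds if the path would hit the 444 block.
# -i mirrors the case-insensitive "~*" modifier.
is_probe_path() {
  printf '%s\n' "$1" | grep -Eiq -- "$PROBE_RE"
}

# Probe paths should print 444; dashboard paths should print pass.
for path in /.env /.git/config /php.php /wp-admin/ / /cm-auth /robots.txt; do
  if is_probe_path "$path"; then
    echo "444  $path"
  else
    echo "pass $path"
  fi
done
```

Note that nginx matches location patterns against the normalized URI without the query string, so this check is only meaningful for plain paths.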

Rate limit (per source IP)

In the http { ... } block (one level above server; in aaPanel this typically lives in the global nginx config or in a snippet):

# 10MB shared zone, 30 requests/sec per source IP.
limit_req_zone $binary_remote_addr zone=cm_general:10m rate=30r/s;

Then inside each vhost's server { ... }:

# Allow short bursts (up to 60 requests above the rate); anything beyond that gets 429.
limit_req zone=cm_general burst=60 nodelay;
limit_req_status 429;

30 r/s per source IP is generous for legitimate UI traffic and tight enough to slow a scanner down to nuisance levels.
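To get a feel for how rate=30r/s and burst=60 with nodelay interact, the arithmetic can be modelled offline. This is a simplified leaky-bucket sketch, not nginx's actual implementation: simulate_limit_req is a hypothetical helper, it tracks a single source IP, and it assumes a rejected request leaves the bucket state unchanged.

```shell
#!/usr/bin/env bash
# simulate_limit_req RATE BURST
# Reads request arrival times (milliseconds, ascending, one per line)
# for a single source IP and prints how many would pass.
simulate_limit_req() {
  awk -v rate="$1" -v burst="$2" '
    NR == 1 { last = $1; excess = 0; accepted = 1; next }
    {
      # Drain the bucket for the elapsed time, then add this request.
      e = excess - rate * ($1 - last) / 1000 + 1
      if (e < 0) e = 0
      # Over the burst allowance: rejected (429), state left unchanged.
      if (e > burst) { rejected++; next }
      excess = e; last = $1; accepted++
    }
    END { printf "accepted=%d rejected=%d\n", accepted, rejected }'
}

# 200 requests arriving in the same millisecond (a scanner flood).
seq 200 | sed 's/.*/0/' | simulate_limit_req 30 60

# 200 requests spaced 100 ms apart (10 r/s, an active UI session).
seq 0 100 19900 | simulate_limit_req 30 60
```

Under this model the flood prints accepted=61 rejected=139 (one request plus the burst of 60), while the 10 r/s session prints accepted=200 rejected=0. Real nginx behavior may differ in detail, so treat the numbers as an illustration of the shape, not a guarantee.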

Dev vhost — heng.04080616.xyz → dev PC

The dev tier (sub-project A) runs on a dev PC: bash scripts/dev.sh up → Next.js dashboard on 0.0.0.0:8010. Routing aaPanel to it adds public reach (with /cm-auth gating) so you can hand someone a URL to test against without giving them VPN access.

aaPanel vhost for heng.04080616.xyz (in addition to the C4 blocks above, plus C3 if you use it):

location / {
    proxy_pass http://<dev-pc-lan-ip>:8010;
    proxy_set_header Host              $host;
    proxy_set_header X-Real-IP         $remote_addr;
    proxy_set_header X-Forwarded-For   $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header X-Forwarded-Host  $host;
    proxy_read_timeout 60s;
}

X-Forwarded-Host and X-Forwarded-Proto are required so WebAuthn passkey enrollment uses the public hostname (heng.04080616.xyz) as the relying-party ID, not the LAN IP — passkeys enrolled at one rpID can't authenticate at another, so a misconfigured proxy will silently break passkey login.

Replace <dev-pc-lan-ip> with the dev PC's address on your LAN.

⚠️ Important: keep CM_DEBUG=false in the dev .env whenever aaPanel proxies the dev PC publicly. Setting CM_DEBUG=true does two things:

  1. The api-server (Flask) exposes the Werkzeug debugger — RCE if reachable.
  2. The Next.js dashboard drops the Secure flag on the session cookie so phone-on-LAN HTTP testing works.

Both are dev-only conveniences. With aaPanel proxying through HTTPS, leave CM_DEBUG=false and use the in-app /cm-auth flow.
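A small pre-flight check can catch a stray CM_DEBUG=true before the dev PC is exposed. The sketch below is deliberately strict and rests on assumptions: debug_is_off is a hypothetical helper, the env file is assumed to use plain KEY=value lines, and only a literal CM_DEBUG=false on the last matching line counts as off. How the app itself parses other values is not covered here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the last CM_DEBUG line in the
# given env file is exactly "CM_DEBUG=false". A missing key fails too,
# so the operator has to state the setting explicitly.
debug_is_off() {
  [ "$(grep -E '^CM_DEBUG=' "$1" | tail -n 1)" = "CM_DEBUG=false" ]
}

# Default path taken from this guide; pass another file as $1 if needed.
env_file="${1:-envs/dev/.env}"
if [ ! -f "$env_file" ]; then
  echo "skip: $env_file not found"
elif debug_is_off "$env_file"; then
  echo "ok: CM_DEBUG=false in $env_file"
else
  echo "STOP: set CM_DEBUG=false in $env_file before proxying publicly"
fi
```

Checking the last matching line matters because many env loaders let a later duplicate key override an earlier one.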

C7 — Host firewall on each web host

Restrict the LAN-reachable Next.js dashboard ports to only aaPanel's IP. Without this, anyone else on the LAN can hit the app directly and bypass everything in C3 and C4. Apply on each host that runs a stack: rex, siong, and the dev PC.

Replace <aapanel-host-ip> with the address of your aaPanel box.

On rex/siong hosts (ports 8011 / 8012):

sudo ufw allow from <aapanel-host-ip> to any port 8011 proto tcp comment 'rex web ← aaPanel only'
sudo ufw allow from <aapanel-host-ip> to any port 8012 proto tcp comment 'siong web ← aaPanel only'
sudo ufw deny 8011/tcp
sudo ufw deny 8012/tcp
sudo ufw reload
sudo ufw status numbered

On the dev PC (port 8010 — match CM_WEB_HOST_PORT from envs/dev/.env):

sudo ufw allow from <aapanel-host-ip> to any port 8010 proto tcp comment 'dev web ← aaPanel only'
sudo ufw allow from 127.0.0.1 to any port 8010 proto tcp comment 'dev web ← localhost'
sudo ufw deny 8010/tcp
sudo ufw reload

The localhost rule on the dev PC is so you can still load http://localhost:8010 directly while iterating, without going through aaPanel.
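The ufw commands above follow one pattern: every allow rule comes before the matching deny, because ufw evaluates rules in order and stops at the first match. A dry-run sketch of that pattern is shown here; print_ufw_rules is a hypothetical helper that only prints commands, and 192.0.2.10 is a documentation-only placeholder for <aapanel-host-ip>.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: prints (does not run) the ufw commands
# for one aaPanel address and any number of web ports.
print_ufw_rules() {
  aapanel_ip="$1"
  shift
  # Allow rules first, so aaPanel traffic matches before the deny.
  for port in "$@"; do
    echo "ufw allow from $aapanel_ip to any port $port proto tcp"
  done
  # Then deny everyone else on the same ports.
  for port in "$@"; do
    echo "ufw deny $port/tcp"
  done
}

# Placeholder address; substitute the real aaPanel host IP.
print_ufw_rules 192.0.2.10 8011 8012
```

Reviewing the printed commands before running them with sudo makes it harder to add a deny ahead of its allow, which would lock aaPanel out of the port.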

Verify from a third machine on the LAN:

nmap -p 8010,8011,8012 <web-host-ip>
# All three ports should show 'filtered' from anywhere except the aaPanel host
# (and except localhost on the dev PC).

If you don't run ufw and prefer iptables directly, the equivalent rules are:

iptables -A INPUT -p tcp --dport 8011 -s <aapanel-host-ip> -j ACCEPT
iptables -A INPUT -p tcp --dport 8012 -s <aapanel-host-ip> -j ACCEPT
iptables -A INPUT -p tcp --dport 8010 -s <aapanel-host-ip> -j ACCEPT
iptables -A INPUT -p tcp --dport 8010 -s 127.0.0.1        -j ACCEPT
iptables -A INPUT -p tcp --dport 8011 -j DROP
iptables -A INPUT -p tcp --dport 8012 -j DROP
iptables -A INPUT -p tcp --dport 8010 -j DROP

(Persist via iptables-save > /etc/iptables/rules.v4 or your distro's preferred mechanism.)

Verification (after all blocks applied)

  1. Hit any UI without a session: curl -sI https://<rex-domain>/ → 307 redirect to /cm-auth?next=/. Same shape for siong and https://heng.04080616.xyz/. (If C3 basic auth is also configured, you get 401 first.)
  2. After signing in via /cm-auth: subsequent requests return 200 OK. Use the browser; curl alone won't carry the cookie unless you -c/-b it.
  3. Scanner path: curl -i https://<rex-domain>/.env → connection closed (444 → curl shows "Empty reply from server"). The app logs show no entry for this request.
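  4. Hammer-test rate limit: for i in $(seq 1 200); do curl -s -o /dev/null -w "%{http_code}\n" https://<rex-domain>/; done | sort | uniq -c → mix of 307s up to the burst, then 429s. A sequential loop only trips the limit if it sustains more than 30 requests per second; if you see nothing but 307s, run the requests in parallel (for example with xargs -P) and check again.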
  5. From a non-aaPanel host on the LAN: nmap -p 8010,8011,8012 <web-host-ip> → all three ports filtered (localhost on dev PC still allowed).
  6. Dev-specific check. On the dev PC, bash scripts/dev.sh logs api-server | grep "Debugger PIN" should return nothing once CM_DEBUG=false. Sign in via the browser at https://heng.04080616.xyz/cm-auth and confirm the dashboard renders.