diff --git a/docs/superpowers/specs/2026-05-02-debug-mode-hotfix-design.md b/docs/superpowers/specs/2026-05-02-debug-mode-hotfix-design.md
new file mode 100644
index 0000000..ef7a7e3
--- /dev/null
+++ b/docs/superpowers/specs/2026-05-02-debug-mode-hotfix-design.md
@@ -0,0 +1,134 @@
+# Debug-Mode Hotfix: Env-Driven `CM_DEBUG`
+
+**Date:** 2026-05-02
+**Status:** Approved (design)
+**Scope:** Hotfix only. Larger security hardening (real WSGI server, reverse proxy, auth, scanner deflection) is tracked separately under the security-hardening sub-project.
+
+## Problem
+
+Both Flask entrypoints currently start with the Werkzeug debugger enabled:
+
+- `app/cm_web_view.py:748`: `app.run(host='0.0.0.0', port=8000, debug=True)`
+- `app/cm_api.py:160`: `def run(self, port=3000, debug=True)`, then `self.app.run(host='0.0.0.0', port=port, debug=debug)`
+
+Container logs confirm the debugger is active in deployed containers (`* Debug mode: on`, `Debugger PIN: 702-685-302`). The Werkzeug debugger gives remote code execution to anyone who can reach the port and supply the PIN, and the same containers are receiving scanner probes typical of public internet traffic (`/.env`, `/.git/config`, `/.aws/config`, `/.htpasswd`). This is the highest-priority issue in the codebase right now.
+
+The user wants to keep debug mode available locally (local = dev tier) while ensuring it is off in the rex and siong production deployments.
+
+## Goal
+
+Make debug mode opt-in via the `CM_DEBUG` environment variable. Default off. No other behavior changes.
+
+## Non-Goals
+
+- Switching from `app.run` to a production WSGI server (gunicorn or waitress). Belongs to security hardening.
+- Adding a reverse proxy, TLS, auth, or rate limiting.
+- Changing `app/cm_bot_hal.py` hardcoded credentials.
+- Touching `cm_telegram.py` or `cm_transfer_credit.py`; neither runs a Flask server.
+- Adding `robots.txt` or scanner deflection.
+
+## Design
+
+### `_debug_enabled()` helper
+
+Both Flask modules add the same small helper. It is defined locally in each file (no new shared module: there are only two call sites, and `app/__init__.py` is currently a near-empty package marker).
+
+```python
+def _debug_enabled() -> bool:
+    return os.getenv("CM_DEBUG", "false").strip().lower() in ("1", "true", "yes")
+```
+
+Accepts `1`, `true`, `yes` (case-insensitive, whitespace-trimmed) as truthy. Anything else, including unset, is false. This matches the lenient parsing pattern already used for env-driven config in the recent refactor (commit `45303d0`).
+
+### `app/cm_web_view.py`
+
+Replace the bottom `__main__` block:
+
+```python
+if __name__ == '__main__':
+    print("Starting CM Web View...")
+    print("Web interface will be available at: http://localhost:8000")
+    print("Make sure the API server is running on port 3000")
+    app.run(host='0.0.0.0', port=8000, debug=_debug_enabled())
+```
+
+`os` is already imported at the top of the file (line 10), so no new import is needed.
+
+### `app/cm_api.py`
+
+Three changes:
+
+0. Add `import os` at the top of the file (currently absent; only `threading`, Flask, and `.db` are imported).
+
+1. Change the `run` signature default so callers can still force-override, but unspecified means "read the env":
+
+   ```python
+   def run(self, port=3000, debug=None):
+       if debug is None:
+           debug = _debug_enabled()
+       ...
+       self.app.run(host='0.0.0.0', port=port, debug=debug)
+   ```
+
+2. Leave `run_in_thread(self, port=3000, debug=False)` alone. It is only used internally and its `debug=False` default is already safe; passing `debug=None` would break that contract.
+
+The `__main__` block stays as `api.run(port=3000)`. Because it passes no `debug` argument, it now picks up the env-driven default.
+
+### `docker-compose.yml`
+
+Add `CM_DEBUG: ${CM_DEBUG:-false}` to the `environment:` blocks of `api-server` and `web-view` (the only Flask services). The `${CM_DEBUG:-false}` form ensures the variable is *always* defined inside the container, even if the operator forgot to set it in their `.env`. Telegram and transfer services do not need it.
+
+`docker-compose.override.yml` does not need changes; it inherits `environment:` from the base file.
+
+### `.env.example`
+
+Add a new section near the top:
+
+```
+# === Runtime ===
+# Set to true ONLY in local dev. Werkzeug debugger = RCE if exposed.
+CM_DEBUG=false
+```
+
+### `envs/rex/.env` and `envs/siong/.env`
+
+These files are intentionally not in git (the directories are committed empty). The operator's existing prod env files do not set `CM_DEBUG`, which makes the default (`false`) apply automatically. No edit needed; the `AGENTS.md` update below documents the convention for any new deployment.
+
+### Documentation
+
+- `AGENTS.md`: add a one-line entry under "Build, Test, and Development Commands" or "Security & Configuration Tips" noting `CM_DEBUG=true` is the local-dev override and **must** stay unset in published env files.
+
+## Files Changed
+
+| File | Change |
+|---|---|
+| `app/cm_web_view.py` | Add `_debug_enabled()` helper; pass it to `app.run(debug=...)`. |
+| `app/cm_api.py` | Add `import os`; add `_debug_enabled()` helper; change `run()` default to `debug=None` and resolve from env when `None`. |
+| `docker-compose.yml` | Add `CM_DEBUG: ${CM_DEBUG:-false}` to `api-server` and `web-view` `environment:` blocks. |
+| `.env.example` | New `Runtime` section documenting `CM_DEBUG`. |
+| `AGENTS.md` | One-line note about `CM_DEBUG`. |
+
+No new dependencies. No version bumps.
+
+## Verification
+
+1. **Local, debug on.** Set `CM_DEBUG=true` in repo-root `.env`, run `bash scripts/local_build.sh`. Web-view log shows `* Debug mode: on` and a `Debugger PIN: ...` line. API log shows the same.
+2. **Local, debug off.** Set `CM_DEBUG=false` (or remove the line). Rebuild. Logs show `* Debug mode: off` and **no PIN line**. Hitting `/api/acc/` and `/api/user/` still returns 200 with valid JSON.
+3. **Prod parity check.** With `CM_DEBUG` unset in the deploy env (matches rex/siong today), confirm container logs show debug off. Confirm the existing `192.168.0.210` scanner probes for `/.env` and `/.git/config` still return 404 with no traceback or debugger response.
+4. **Override path.** From a Python REPL inside the api container, calling `CM_API().run(port=3001, debug=True)` still honors the explicit override (regression check on the `debug=None` sentinel).
+
+## Risk
+
+Minimal. The Werkzeug `debug=False` path is the framework default and is what every production Flask deployment uses. The only user-visible losses are the in-browser traceback page and the auto-reloader, both of which should never have been on in containers in the first place.
+
+The one edge case worth naming: the existing `cm_api.py:run()` signature lets a caller pass `debug=False` explicitly and still get debug-off behavior; changing the default to `None` preserves that. Nothing in the repo calls `run()` with a positional `debug` argument (verified via grep before implementation), so the signature change is safe.
+
+## Out-of-Scope Follow-Ups (for the security-hardening spec)
+
+Captured here so they aren't forgotten:
+
+- Replace `app.run` with gunicorn (or waitress) in both `cm_api` and `cm_web_view` Dockerfiles.
+- Put a reverse proxy (Caddy/Traefik/nginx) in front of `web-view` with TLS, basic auth or token auth, and rate limiting.
+- Add `robots.txt` returning `Disallow: /` and a 410/444 default for unknown paths to deflect noisy scanners.
+- Audit `app/cm_bot_hal.py` hardcoded credentials/PIN, already flagged in `AGENTS.md` "Security & Configuration Tips".
+- Confirm whether `192.168.0.210` is a NAT hop for public traffic (router/firewall question) and decide whether the host port should be bound only to a private interface.