For 3k+ row deployments, returning the full table in one shot is the
bottleneck — the JSON payload alone is hundreds of KB and the client
mounts thousands of EditableCell instances on every visit. Pagination
with auto-fetch on scroll shrinks both the wire payload and the initial
render to a single page (200 rows).
Server (app/cm_api.py):
- /acc/ and /user/ accept ?limit, ?offset, ?prefix, ?dir, plus ?sort on
/user/ (f_username | last_update_time). Defaults: limit=200 (capped at
1000), offset=0, dir=desc.
- ORDER BY done in SQL with prefix-priority: rows whose username starts
with the configured CM_PREFIX_PATTERN come first, then asc/desc by the
sort column. The 'dir' value is whitelisted to ASC|DESC before string
interpolation; everything else goes through parameterised binding.
- Schema verification (verify_tables_once) deferred to first request via
a Flask before_request hook — keeps create_app() free of MySQL touches
so unit tests + gunicorn preload still work without a live DB.
Web client:
- web/lib/api.ts: getAccountsPage / getUsersPage return { rows, hasMore }.
hasMore = (rows.length === PAGE_SIZE), so the client knows when to
stop fetching. Each page is its own Next.js cache entry (the URL is
the cache key) — caching from the previous commit still applies.
- web/app/actions.ts: loadMoreAccounts / loadMoreUsers Server Actions
for next-page requests; refreshAccounts / refreshUsers force-evict the
cache via revalidateTag before refetching page 1.
- web/app/page.tsx + users/page.tsx: only fetch the first page now.
- web/components/{accounts,users}-table.tsx: rewrote state model. Rows
accumulate as the user scrolls. An IntersectionObserver on a sentinel
div near the bottom triggers loadMore when it enters the viewport
(300px rootMargin so the next page starts loading before the user
reaches the end). useOptimistic wraps the accumulated rows for in-
flight edits; on success the row is committed locally so the change
survives even though we no longer router.refresh.
- Sort toggle now refetches from page 1 with the new dir/sort param.
Local sort over a partial set would be inconsistent.
- Mutations: delete filters from local state; create + refresh both
reset to page 1 so the row appears in its sorted position.
- Header count shows '<loaded>+' when more pages exist so the operator
knows what they're seeing isn't the full table.
Removed AutoRefresh:
- web/app/layout.tsx no longer mounts AutoRefresh.
- web/components/auto-refresh.tsx deleted.
- Reason: router.refresh every 30s would yank the user back to page 1
every time, losing scroll position and accumulated rows. Manual
Refresh button replaces it (now wired to refreshAccounts/refreshUsers
which evict cache + refetch).
Tests: deferred verify_tables_once() means tests.test_bot_cli's
CreateAppFactoryTests pass without DB env vars again. All 38 existing
tests pass.
The previous default of 8 was a regression risk: cm_transfer_credit.py
uses ThreadPoolExecutor with CM_TRANSFER_MAX_THREADS (default 20 in
prod compose), so up to 20 threads concurrently call self.db.query().
With pool_size=8, the 9th-20th threads would hit PoolError, which
gets caught by 'except Error' and silently returns []/False — making
transfers fail with no obvious cause.
Default bumped to 24 (covers the 20-thread default with 4 in reserve).
mysql.connector caps pool_size at 32; clamping with a clear log line
so a future operator who pushes CM_TRANSFER_MAX_THREADS too high gets
a readable message instead of a library traceback.
Operator note: if you raise CM_TRANSFER_MAX_THREADS, also raise
DB_POOL_SIZE to at least the same value (max 32). At 32 threads with
4 services × 32 = 128 conns total, still well under MySQL's default
max_connections=151.
Two wins, one root cause: every API request was opening TWO fresh MySQL
connections plus four wasted round-trips before the real query.
Old per-request shape (GET /acc/):
1. DB() constructor → open conn, SHOW TABLES LIKE 'acc',
SHOW TABLES LIKE 'user', close
2. db.query() → open conn, run SELECT, close
That's ~4 round-trips for ~10 ms of useful work. With the dashboard's
30 s auto-refresh and two open tabs (accounts + users), the api-server
churned through ~10 fresh MySQL connections every minute even when
nothing changed.
Changes:
- app/db.py: introduce a process-wide MySQLConnectionPool (size 8 by
default, override with DB_POOL_SIZE). DB() now just touches the cached
pool — no schema check, no fresh handshake. query()/execute() rent a
connection from the pool and return it via conn.close().
- app/db.py: extract the schema check into verify_tables_once() — runs
once at WSGI boot inside create_app() so a misconfigured DB still
fails fast at startup.
- app/cm_api.py: _close_database_connection() removed; the finally
blocks that wrapped every route are gone too. Pool reclamation lives
inside DB now.
- app/cm_api.py: create_app() and run() invoke verify_tables_once()
once at startup instead of CM_API.__init__ doing nothing useful.
Net: ~4× round-trip reduction per request, no MySQL handshake on the
hot path. With two gunicorn workers × pool_size 8 = 16 max in-flight
connections, well under MySQL's default max_connections=151.
(The user asked about 'batching the queries' — but the queries already
return the full row set in one shot. The bottleneck was connection
churn, not query shape. If row count grows past the comfortable single-
fetch range later, swap to LIMIT/OFFSET pagination at the API + table
component layer.)
End-state: a single web service (Next.js dashboard) per deployment, no
side-by-side Flask UI. The image name 'cm-web' now points at the Next.js
build; the legacy 'cm-web-next' tag is no longer published.
Changes:
- Delete app/cm_web_view.py and the Flask docker/web/Dockerfile.
- Rename docker/web-next/ → docker/web/ (Next.js Dockerfile takes the
cm-web slot).
- docker-compose.yml: drop the web-view service. Rename web-next → web,
container ${CM_DEPLOY_NAME}-web-next → ${CM_DEPLOY_NAME}-web, image
cm-web-next → cm-web, named volume web-next-auth-data → web-auth-data.
transfer-bot's depends_on no longer references web-view (vestigial
startup ordering, never a runtime dependency).
- docker-compose.override.yml: same rename, dockerfile path updated.
- envs: drop CM_WEB_NEXT_HOST_PORT. Repurpose CM_WEB_HOST_PORT for the
Next.js port (8010 dev, 8011 rex, 8012 siong) — same numeric values
formerly held by CM_WEB_NEXT_HOST_PORT, so aaPanel routes don't move.
- scripts/dev.sh: drops web-view + web-next from up/reset-db/logs;
--remove-orphans still cleans up legacy containers from before cutover.
- scripts/publish.sh: drop the cm-web-next build target.
- tests/test_debug_enabled.py: drop app.cm_web_view from the helper
matrix (cm_api is now the only Flask entrypoint with _debug_enabled).
- AGENTS.md / README.md / docs/aapanel-hardening.md: rewrite Flask-era
references; add migration steps for existing stacks; update aaPanel
port references (8000/8001/8005 → 8010/8011/8012).
- .gitignore: add .env, .venv/, .playwright-mcp/, node_modules/, .next/
so 'git add -A' can't accidentally stage secrets or build artifacts.
Operator action required to upgrade an existing deployment:
1. .env: drop CM_WEB_NEXT_HOST_PORT line. Set CM_WEB_HOST_PORT to
what CM_WEB_NEXT_HOST_PORT was. Make sure CM_AUTH_SECRET is set.
2. aaPanel: if proxy_pass pointed at the legacy Flask port
(8000/8001/8005), switch it to the new one (8010/8011/8012).
3. Pull the new cm-web image (Next.js) and redeploy the stack. The
old ${CM_DEPLOY_NAME}-web-view and ${CM_DEPLOY_NAME}-web-next
containers will be replaced by a single ${CM_DEPLOY_NAME}-web.
Verified locally: docker-compose YAML parses; transfer-bot runtime is
unchanged (only depends_on tidied); 38-test python suite passes.
api-server is internal-only after C5 (no host port in prod compose),
so the permissive 'CORS(app)' default never fires in normal operation.
Removing it eliminates a stale '*' Access-Control-Allow-Origin that
would become attack surface if a host port were ever accidentally
re-exposed.
Server-side fetches from web-view (legacy Flask) and web-next
(Next.js RSC) don't trigger CORS — that's a browser-only mechanism.
flask_cors stays in requirements.txt because cm_web_view.py still
imports it; both get removed in B4 when the legacy web-view retires.
api-server gets /create-acc-data and /create-user-data POST routes
that INSERT into the respective tables with required-field validation.
Frontend adds an 'Add' button next to Refresh in each table head;
opens a native <dialog> form with all fields. Inputs use 16px font on
phone (sm:text-[13px] desktop) so iOS doesn't auto-zoom.
A small form-dialog-shell helper centralizes the modal chrome,
field label, and input class so create-account-dialog and
create-user-dialog stay focused on their fields and validation.
- Remove all hardcoded credentials and config from Python source code:
- db.py: DB host/user/password/name/port → env vars with connection retry support
- cm_bot_hal.py: prefix, agent_id, agent_password, security_pin → env vars
- cm_bot.py: base_url → env var, fix register_user return values
- cm_web_view.py: hardcoded '13c' prefix → configurable CM_PREFIX_PATTERN
- cm_telegram.py: hardcoded 'Sky533535' pin → env var CM_SECURITY_PIN
- Parameterize docker-compose.yml for multi-deployment on same host:
- Container names use ${CM_DEPLOY_NAME} prefix (e.g. rex-cm-*, siong-cm-*)
- Network name uses ${CM_DEPLOY_NAME}-network
- Web view port configurable via ${CM_WEB_HOST_PORT}
- All service config passed as env vars (not baked into image)
- Add per-deployment env configs:
- envs/rex/.env (port 8001, prefix 13c, DB rex_cm)
- envs/siong/.env (port 8005, prefix 13sa, DB siong_cm)
- .env.example as template for new deployments
- Remove .env from .gitignore (local server, safe to commit)
- Improve telegram bot reliability:
- Add retry logic for polling with exponential backoff
- Add error handlers for Conflict, RetryAfter, NetworkError, TimedOut
- Add /9 command to show chat ID
- Add telegram_notifier.py for alert notifications
- Fix error handling in /2 and /3 command handlers
- Fix db.py cursor cleanup (close cursor before connection in finally blocks)
- Fix docker-compose.override.yml environment syntax (list → mapping)
- Update README with multi-deployment instructions
- Add AGENTS.md
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- move Python sources into app package and switch services to module entrypoints
- relocate Dockerfiles under docker/, add buildx publish script, override compose for local builds
- configure images to pull from gitea.04080616.xyz/yiekheng with env-driven tags and limits
- harden installs and transfer worker logging/concurrency for cleaner container output