- Single interactive script (arrow-key TUI via simple-term-menu) replaces download.py, upload.py, and export_cookies.py
- Add sync command: streams new chapters site -> R2 directly without saving locally (uses RAM as cache)
- Add R2/DB management submenu (status, delete specific, clear all)
- Multi-select chapter picker with already-downloaded chapters grayed out
- Chapter list fetched via /v2.0/apis/manga/chapterByPage with pagination
- Cover image captured from page network traffic (no extra fetch)
- Filter prefetched next-chapter images via DOM container count
- Chrome runs hidden via AppleScript on macOS (except setup mode)
- DB records only created after R2 upload succeeds (no orphan rows)
- Parallel R2 uploads (8 workers) with WebP method=6 quality=75
- Update CLAUDE.md to reflect new architecture
- Add requirements.txt

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
Manga downloader and uploader toolkit. Currently supports m.happymh.com, designed for future multi-site support.
`manga.py` — single interactive CLI. Download, upload, and sync manga. Launches real Chrome via subprocess, connects via CDP, bypasses Cloudflare. Uploads to R2 + PostgreSQL.
## Architecture
### Anti-bot Strategy
- Chrome launched via `subprocess.Popen` (not Playwright) to avoid automation detection
- Playwright connects via CDP (`connect_over_cdp`) for scripting only
- Persistent browser profile in `.browser-data/` preserves Cloudflare sessions
- All navigation uses JS (`window.location.href`) or `page.goto` with `wait_until="commit"`
- Images downloaded via `response.body()` from network interception (no base64)
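The launch-then-attach split above can be sketched as follows. `connect_over_cdp` is the real Playwright API; the Chrome path, port, and exact flag set are illustrative assumptions, not the script's actual values:

```python
import subprocess

def chrome_launch_cmd(chrome_path: str, profile_dir: str, port: int = 9222) -> list[str]:
    """Build argv for launching a real Chrome with a persistent profile and
    CDP enabled. All switches are standard Chromium flags."""
    return [
        chrome_path,
        f"--remote-debugging-port={port}",
        f"--user-data-dir={profile_dir}",  # persistent profile keeps CF cookies
        "--no-first-run",
        "--no-default-browser-check",
    ]

def attach_over_cdp(port: int = 9222):
    """Attach Playwright to the already-running Chrome for scripting only."""
    from playwright.sync_api import sync_playwright
    pw = sync_playwright().start()
    browser = pw.chromium.connect_over_cdp(f"http://127.0.0.1:{port}")
    return pw, browser

if __name__ == "__main__":
    # Hypothetical macOS path; launch Chrome ourselves, then attach.
    cmd = chrome_launch_cmd(
        "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
        ".browser-data",
    )
    proc = subprocess.Popen(cmd)
```

Because Chrome is started by `subprocess`, it carries none of the automation markers a Playwright-launched browser would.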
### Data Flow
- Input: `manga.json` — JSON array of manga URLs
- Download: Chrome navigates to manga page → API fetches chapter list → navigates to reader pages → intercepts image URLs from API → downloads via browser fetch
- Local storage: `manga-content/<slug>/` with cover.jpg, detail.json, and chapter folders
- Upload: Converts JPG→WebP → uploads to R2 → creates DB records
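The interception step can be sketched with a Playwright response listener. The extension-based filter is an assumption (the real script extracts image URLs from the reading-API response), and `watch_chapter_images` is a hypothetical helper name:

```python
from urllib.parse import urlparse

def is_chapter_image(url: str) -> bool:
    """Heuristic filter for image responses; extension check is an assumption."""
    path = urlparse(url).path.lower()
    return path.endswith((".jpg", ".jpeg", ".png", ".webp"))

def watch_chapter_images(page) -> list[str]:
    """Attach a response listener that records image URLs as the reader page
    loads; bodies are then fetched in-browser (no base64 round-trip)."""
    urls: list[str] = []

    def on_response(response):
        if is_chapter_image(response.url):
            urls.append(response.url)

    page.on("response", on_response)
    return urls  # populated as navigation proceeds
```

Collecting URLs in the handler and fetching bodies afterwards avoids doing blocking work inside the event callback.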
### Key APIs (happymh)
- Chapter list: `GET /v2.0/apis/manga/chapterByPage?code=<slug>&lang=cn&order=asc&page=<n>`
- Chapter images: `GET /v2.0/apis/manga/reading?code=<slug>&cid=<chapter_id>` (intercepted from reader page)
- Cover: Captured from page load traffic (`/mcover/` responses)
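The pagination loop implied by the chapter-list endpoint can be sketched like this. `fetch_json` stands in for however the script executes the request in-browser (e.g. a `page.evaluate` wrapper), and the `items` field name is an assumption about the response shape:

```python
def chapter_page_url(slug: str, page_no: int) -> str:
    # Mirrors the chapterByPage endpoint documented above.
    return (f"/v2.0/apis/manga/chapterByPage"
            f"?code={slug}&lang=cn&order=asc&page={page_no}")

def fetch_all_chapters(fetch_json, slug: str) -> list[dict]:
    """Page through chapterByPage until the API returns an empty batch.
    `fetch_json` is a callable taking a URL and returning a parsed dict."""
    chapters: list[dict] = []
    page_no = 1
    while True:
        batch = fetch_json(chapter_page_url(slug, page_no)).get("items", [])
        if not batch:
            break  # past the last page
        chapters.extend(batch)
        page_no += 1
    return chapters
```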
## Directory Convention
```
manga-content/
  <slug>/
    detail.json        # metadata (title, author, genres, description, cover URL)
    cover.jpg          # cover image captured from page traffic
    1 <chapter-name>/  # chapter folder (ordered by API sequence)
      1.jpg
      2.jpg
      ...
```
## R2 Storage Layout
```
manga/<slug>/cover.webp
manga/<slug>/chapters/<number>/<page>.webp
```
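A sketch of the convert-and-upload step that targets this layout. The WebP settings (`method=6`, `quality=75`) and the 8 workers come from the commit notes; `put_object` is a standard boto3 S3 call, and Pillow/boto3 availability is assumed:

```python
from concurrent.futures import ThreadPoolExecutor
from io import BytesIO

def r2_key(slug: str, chapter: int, page: int) -> str:
    # Matches the R2 layout shown above.
    return f"manga/{slug}/chapters/{chapter}/{page}.webp"

def to_webp(jpg_bytes: bytes) -> bytes:
    """Convert a JPG to WebP in memory (method=6 = slowest/best compression)."""
    from PIL import Image
    buf = BytesIO()
    Image.open(BytesIO(jpg_bytes)).save(buf, "WEBP", quality=75, method=6)
    return buf.getvalue()

def upload_all(client, bucket: str, jobs: list[tuple[str, bytes]]) -> None:
    """Upload (key, jpg_bytes) pairs with 8 parallel workers, converting on
    the fly. `client` is a boto3 S3 client pointed at R2."""
    def one(job):
        key, jpg = job
        client.put_object(Bucket=bucket, Key=key,
                          Body=to_webp(jpg), ContentType="image/webp")
    with ThreadPoolExecutor(max_workers=8) as pool:
        list(pool.map(one, jobs))  # list() surfaces any worker exception
```

Converting in memory fits the sync command's no-local-files design: the JPG bytes come straight from browser interception.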
## Environment Variables (.env)
```
R2_ACCOUNT_ID=
R2_ACCESS_KEY=
R2_SECRET_KEY=
R2_BUCKET=
R2_PUBLIC_URL=
DATABASE_URL=postgresql://...
```
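These variables are enough to construct an S3-compatible client for R2. A minimal sketch, assuming boto3 is installed; the endpoint pattern is Cloudflare's documented per-account R2 endpoint:

```python
import os

def r2_endpoint(account_id: str) -> str:
    # Cloudflare R2 exposes one S3-compatible endpoint per account.
    return f"https://{account_id}.r2.cloudflarestorage.com"

def r2_client():
    """Build a boto3 S3 client for R2 from the variables listed above."""
    import boto3
    return boto3.client(
        "s3",
        endpoint_url=r2_endpoint(os.environ["R2_ACCOUNT_ID"]),
        aws_access_key_id=os.environ["R2_ACCESS_KEY"],
        aws_secret_access_key=os.environ["R2_SECRET_KEY"],
    )
```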
## Future: Multi-site Support
Current code is specific to happymh.com. To add new sites:
- Extract site-specific logic (chapter fetching, image URL extraction, CF handling) into per-site modules
- Keep shared infrastructure (Chrome management, image download, upload) in common modules
- Each site module implements: `fetch_chapters(page, slug)`, `get_chapter_images(page, slug, chapter_id)`, `fetch_metadata(page)`
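One way to pin down that per-site interface is a `typing.Protocol`. This is a sketch only; the return types and the happymh class name are illustrative assumptions:

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class MangaSite(Protocol):
    """Structural interface each site module would satisfy, mirroring the
    three functions listed above."""
    def fetch_chapters(self, page, slug: str) -> list[dict]: ...
    def get_chapter_images(self, page, slug: str, chapter_id: str) -> list[str]: ...
    def fetch_metadata(self, page) -> dict: ...
```

A site class that defines all three methods passes `isinstance(site, MangaSite)` without inheriting from anything, which keeps per-site modules decoupled from the shared infrastructure.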