- Single interactive script (arrow-key TUI via simple-term-menu) replaces download.py, upload.py, and export_cookies.py
- Add sync command: streams new chapters site -> R2 directly without saving locally (uses RAM as cache)
- Add R2/DB management submenu (status, delete specific, clear all)
- Multi-select chapter picker with already-downloaded chapters marked and grayed out
- Chapter list fetched via /v2.0/apis/manga/chapterByPage with pagination
- Cover image captured from page network traffic (no extra fetch)
- Filter prefetched next-chapter images via DOM container count
- Chrome runs hidden via AppleScript on macOS (except setup mode)
- DB records only created after R2 upload succeeds (no orphan rows)
- Parallel R2 uploads (8 workers) with WebP method=6 quality=75
- Update CLAUDE.md to reflect new architecture
- Add requirements.txt

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

Manga downloader and uploader toolkit. It currently supports m.happymh.com and is designed for future multi-site support.

- `manga.py` — Single interactive CLI for downloading, uploading, and syncing manga. It launches real Chrome via subprocess, connects via CDP to bypass Cloudflare, and uploads to R2 + PostgreSQL.
## Architecture
### Anti-bot Strategy

- Chrome launched via `subprocess.Popen` (not Playwright) to avoid automation detection
- Playwright connects via CDP (`connect_over_cdp`) for scripting only
- Persistent browser profile in `.browser-data/` preserves Cloudflare sessions
- All navigation uses JS (`window.location.href`) or `page.goto` with `wait_until="commit"`
- Images downloaded via `response.body()` from network interception (no base64)
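A minimal sketch of this launch-then-attach pattern (the binary path, port, and helper names are illustrative assumptions, not the tool's exact values):

```python
import shutil
import subprocess


def chrome_launch_cmd(port: int, profile_dir: str = ".browser-data") -> list[str]:
    """Build the argv for launching a real Chrome with a CDP endpoint open.

    Launching the stock binary ourselves (rather than via Playwright)
    avoids the automation flags that trip bot detection; profile_dir
    keeps the Cloudflare session between runs.
    """
    binary = (
        shutil.which("google-chrome")
        # macOS fallback path (assumption)
        or "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"
    )
    return [
        binary,
        f"--remote-debugging-port={port}",
        f"--user-data-dir={profile_dir}",
        "--no-first-run",
    ]


def launch_and_connect(port: int = 9222):
    """Start Chrome via subprocess, then attach Playwright over CDP.

    A real version should poll http://127.0.0.1:<port>/json/version until
    Chrome's DevTools socket is ready instead of connecting immediately.
    """
    from playwright.sync_api import sync_playwright

    proc = subprocess.Popen(chrome_launch_cmd(port))
    pw = sync_playwright().start()
    browser = pw.chromium.connect_over_cdp(f"http://127.0.0.1:{port}")
    return proc, pw, browser
```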
### Data Flow

1. **Input**: `manga.json` — JSON array of manga URLs
2. **Download**: Chrome navigates to manga page → API fetches chapter list → navigates to reader pages → intercepts image URLs from API → downloads via browser fetch
3. **Local storage**: `manga-content/<slug>/` with cover.jpg, detail.json, and chapter folders
4. **Upload**: Converts JPG→WebP → uploads to R2 → creates DB records
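Step 4's JPG→WebP conversion can be sketched with Pillow using the settings the tool advertises (quality=75, method=6); the function name is a hypothetical:

```python
from io import BytesIO

from PIL import Image  # Pillow


def jpg_to_webp(jpg_bytes: bytes, quality: int = 75) -> bytes:
    """Re-encode one downloaded page as WebP before the R2 upload.

    method=6 is Pillow's slowest/smallest WebP encoder setting — a
    reasonable trade when storage and bandwidth matter more than CPU.
    """
    img = Image.open(BytesIO(jpg_bytes))
    buf = BytesIO()
    img.save(buf, format="WEBP", quality=quality, method=6)
    return buf.getvalue()
```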
### Key APIs (happymh)

- Chapter list: `GET /v2.0/apis/manga/chapterByPage?code=<slug>&lang=cn&order=asc&page=<n>`
- Chapter images: `GET /v2.0/apis/manga/reading?code=<slug>&cid=<chapter_id>` (intercepted from reader page)
- Cover: Captured from page load traffic (`/mcover/` responses)
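The paginated chapter-list walk can be sketched as below. The `data.items` response shape and the empty-page stop condition are assumptions; `fetch_json` is injected so the real caller can route the request through the browser (a `page.evaluate` fetch) and reuse the Cloudflare cookies:

```python
def chapter_page_url(slug: str, page: int) -> str:
    """Build one page of the happymh chapter-list endpoint."""
    return (
        "https://m.happymh.com/v2.0/apis/manga/chapterByPage"
        f"?code={slug}&lang=cn&order=asc&page={page}"
    )


def fetch_all_chapters(fetch_json, slug: str) -> list[dict]:
    """Walk chapterByPage until a page comes back empty.

    fetch_json(url) -> dict is supplied by the caller; the nested
    "data"/"items" keys below are an assumed response shape.
    """
    chapters, page = [], 1
    while True:
        data = fetch_json(chapter_page_url(slug, page))
        items = data.get("data", {}).get("items", [])
        if not items:
            return chapters
        chapters.extend(items)
        page += 1
```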
## Directory Convention

```
manga-content/
  <slug>/
    detail.json        # metadata (title, author, genres, description, cover URL)
    cover.jpg          # cover image captured from page traffic
    1 <chapter-name>/  # chapter folder (ordered by API sequence)
      1.jpg
      2.jpg
      ...
```
## R2 Storage Layout

```
manga/<slug>/cover.webp
manga/<slug>/chapters/<number>/<page>.webp
```
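A sketch of key construction plus the parallel upload (8 workers, as described); `put_object(key, body)` is injected — in the real tool it would wrap a boto3 S3 client pointed at the R2 endpoint, and the function names here are illustrative:

```python
from concurrent.futures import ThreadPoolExecutor


def cover_key(slug: str) -> str:
    """Object key for the cover, matching the layout above."""
    return f"manga/{slug}/cover.webp"


def page_key(slug: str, chapter: int, page: int) -> str:
    """Object key for one page, matching the layout above."""
    return f"manga/{slug}/chapters/{chapter}/{page}.webp"


def upload_pages(put_object, slug: str, chapter: int, pages: list[bytes],
                 workers: int = 8) -> None:
    """Upload a chapter's pages in parallel.

    Any failed upload re-raises here, so DB records are only created
    after every page has landed in R2 (no orphan rows).
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(put_object, page_key(slug, chapter, i + 1), body)
            for i, body in enumerate(pages)
        ]
        for f in futures:
            f.result()  # surface upload errors before the DB step
```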
## Environment Variables (.env)

```
R2_ACCOUNT_ID=
R2_ACCESS_KEY=
R2_SECRET_KEY=
R2_BUCKET=
R2_PUBLIC_URL=
DATABASE_URL=postgresql://...
```
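Since a missing variable otherwise only surfaces mid-upload, a fail-fast check along these lines is worth having (the helper is hypothetical, not part of manga.py):

```python
import os

# The six variables listed above; all must be non-empty.
REQUIRED = [
    "R2_ACCOUNT_ID", "R2_ACCESS_KEY", "R2_SECRET_KEY",
    "R2_BUCKET", "R2_PUBLIC_URL", "DATABASE_URL",
]


def check_env(env=os.environ) -> list[str]:
    """Return the names of missing or empty variables so the CLI can
    exit with a clear message before launching Chrome or touching R2."""
    return [k for k in REQUIRED if not env.get(k)]
```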
## Future: Multi-site Support

Current code is specific to happymh.com. To add new sites:

- Extract site-specific logic (chapter fetching, image URL extraction, CF handling) into per-site modules
- Keep shared infrastructure (Chrome management, image download, upload) in common modules
- Each site module implements: `fetch_chapters(page, slug)`, `get_chapter_images(page, slug, chapter_id)`, `fetch_metadata(page)`
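The per-site interface above could be expressed as a `Protocol`, with the happymh logic moved behind it; the stub below is a sketch, not existing code:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Site(Protocol):
    """Interface each site module implements; `page` is the Playwright
    page driving the shared Chrome instance."""

    def fetch_chapters(self, page, slug: str) -> list[dict]: ...
    def get_chapter_images(self, page, slug: str, chapter_id: str) -> list[str]: ...
    def fetch_metadata(self, page) -> dict: ...


class HappymhSite:
    """Illustrative stub: the existing happymh logic would move here."""

    def fetch_chapters(self, page, slug):
        return []  # would walk chapterByPage with pagination

    def get_chapter_images(self, page, slug, chapter_id):
        return []  # would intercept the /reading API response

    def fetch_metadata(self, page):
        return {}  # would capture title/author/genres/cover from the page
```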