yiekheng fab3b413b8 Merge download.py and upload.py into unified manga.py with TUI
- Single interactive script (arrow-key TUI via simple-term-menu) replaces
  download.py, upload.py, and export_cookies.py
- Add sync command: streams new chapters site -> R2 directly without
  saving locally (uses RAM as cache)
- Add R2/DB management submenu (status, delete specific, clear all)
- Multi-select chapter picker; already-downloaded chapters are grayed out
- Chapter list fetched via /v2.0/apis/manga/chapterByPage with pagination
- Cover image captured from page network traffic (no extra fetch)
- Filter prefetched next-chapter images via DOM container count
- Chrome runs hidden via AppleScript on macOS (except setup mode)
- DB records only created after R2 upload succeeds (no orphan rows)
- Parallel R2 uploads (8 workers) with WebP method=6 quality=75
- Update CLAUDE.md to reflect new architecture
- Add requirements.txt

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 08:56:05 +08:00

# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
Manga downloader and uploader toolkit. Currently supports m.happymh.com, designed for future multi-site support.
- `manga.py` — Single interactive CLI. Download, upload, and sync manga. Launches real Chrome via subprocess, connects via CDP, bypasses Cloudflare. Uploads to R2 + PostgreSQL.
## Architecture
### Anti-bot Strategy
- Chrome launched via `subprocess.Popen` (not Playwright) to avoid automation detection
- Playwright connects via CDP (`connect_over_cdp`) for scripting only
- Persistent browser profile in `.browser-data/` preserves Cloudflare sessions
- All navigation uses JS (`window.location.href`) or `page.goto` with `wait_until="commit"`
- Images downloaded via `response.body()` from network interception (no base64)
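The launch-then-attach split above can be sketched as follows. This is a minimal illustration, not the repo's actual code: the Chrome binary path, debugging port, and function names are assumptions; only the `.browser-data/` profile directory and the `connect_over_cdp` call come from the document.

```python
# Sketch of the anti-bot flow: launch a plain Chrome subprocess with a
# persistent profile, then let Playwright attach over CDP afterwards.
import subprocess

# Assumed macOS binary path -- adjust per platform.
CHROME_MAC = "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"

def chrome_launch_args(profile_dir=".browser-data", port=9222):
    """Build argv for a CDP-debuggable Chrome with a persistent profile."""
    return [
        CHROME_MAC,
        f"--remote-debugging-port={port}",
        f"--user-data-dir={profile_dir}",  # keeps Cloudflare session between runs
        "--no-first-run",
        "--no-default-browser-check",
    ]

def launch_and_connect(port=9222):
    """Start Chrome ourselves, then attach Playwright to it over CDP."""
    proc = subprocess.Popen(chrome_launch_args(port=port))
    from playwright.sync_api import sync_playwright  # deferred import
    pw = sync_playwright().start()
    # Playwright only attaches -- it never launches the browser itself,
    # so the usual automation fingerprints are absent.
    browser = pw.chromium.connect_over_cdp(f"http://127.0.0.1:{port}")
    return proc, pw, browser
```

Because Playwright attaches to an externally launched browser, detection signals tied to Playwright-launched Chromium (e.g. default automation flags) never appear.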
### Data Flow
1. **Input**: `manga.json` — JSON array of manga URLs
2. **Download**: Chrome navigates to manga page → API fetches chapter list → navigates to reader pages → intercepts image URLs from API → downloads via browser fetch
3. **Local storage**: `manga-content/<slug>/` with cover.jpg, detail.json, and chapter folders
4. **Upload**: Converts JPG→WebP → uploads to R2 → creates DB records
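The upload step's concurrency and ordering guarantee (8 parallel workers, DB rows written only after every R2 upload succeeds) can be sketched with injected stand-in functions. `upload_chapter`, `upload_fn`, and `record_fn` are hypothetical names for illustration, not the script's real API.

```python
# Sketch of the upload step: push pages to R2 from a thread pool and
# only record the chapter in the DB once every upload has succeeded,
# so no orphan rows are created.
from concurrent.futures import ThreadPoolExecutor

def upload_chapter(pages, upload_fn, record_fn, workers=8):
    """pages: list of (r2_key, payload) tuples.

    upload_fn(key, payload) -> bool and record_fn(pages) are injected,
    standing in for the real R2 client and DB insert."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda p: upload_fn(*p), pages))
    if all(results):
        record_fn(pages)  # DB record only after all uploads succeeded
        return True
    return False
```

Injecting the upload and record callables keeps the ordering rule (upload first, record second) testable without touching R2 or PostgreSQL.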
### Key APIs (happymh)
- Chapter list: `GET /v2.0/apis/manga/chapterByPage?code=<slug>&lang=cn&order=asc&page=<n>`
- Chapter images: `GET /v2.0/apis/manga/reading?code=<slug>&cid=<chapter_id>` (intercepted from reader page)
- Cover: Captured from page load traffic (`/mcover/` responses)
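The paginated chapter-list endpoint can be consumed with a loop like the one below. The URL and query parameters come from the endpoint above; the stop condition (an empty batch ends pagination) and the injected `fetch` callable are assumptions for illustration.

```python
# Sketch: page through chapterByPage until the API returns no more
# chapters. `fetch(url) -> list` is injected so the same loop works
# whether requests go through the browser context or plain HTTP.
def iter_chapters(slug, fetch):
    page = 1
    while True:
        url = (
            "https://m.happymh.com/v2.0/apis/manga/chapterByPage"
            f"?code={slug}&lang=cn&order=asc&page={page}"
        )
        batch = fetch(url)
        if not batch:          # assumed sentinel: empty page ends pagination
            return
        yield from batch
        page += 1
```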
## Directory Convention
```
manga-content/
  <slug>/
    detail.json        # metadata (title, author, genres, description, cover URL)
    cover.jpg          # cover image captured from page traffic
    1 <chapter-name>/  # chapter folder (ordered by API sequence)
      1.jpg
      2.jpg
      ...
```
## R2 Storage Layout
```
manga/<slug>/cover.webp
manga/<slug>/chapters/<number>/<page>.webp
```
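The layout above reduces to a small key-building helper. The function name is hypothetical; the key strings themselves are taken directly from the layout.

```python
# Build R2 object keys matching the documented storage layout.
def r2_key(slug, chapter=None, page=None):
    """Cover key when chapter is None, otherwise a chapter-page key."""
    if chapter is None:
        return f"manga/{slug}/cover.webp"
    return f"manga/{slug}/chapters/{chapter}/{page}.webp"
```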
## Environment Variables (.env)
```
R2_ACCOUNT_ID=
R2_ACCESS_KEY=
R2_SECRET_KEY=
R2_BUCKET=
R2_PUBLIC_URL=
DATABASE_URL=postgresql://...
```
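These variables typically wire up to an S3-compatible client as sketched below. The `r2.cloudflarestorage.com` endpoint format is Cloudflare's documented R2 S3 endpoint; the use of boto3 and the helper names are assumptions, not something this file specifies.

```python
# Sketch: turn the R2_* environment variables into an S3-compatible
# client for Cloudflare R2.
import os

def r2_endpoint(account_id):
    """Cloudflare R2's S3-compatible endpoint for a given account."""
    return f"https://{account_id}.r2.cloudflarestorage.com"

def make_client():
    import boto3  # deferred so the module imports without boto3 installed
    return boto3.client(
        "s3",
        endpoint_url=r2_endpoint(os.environ["R2_ACCOUNT_ID"]),
        aws_access_key_id=os.environ["R2_ACCESS_KEY"],
        aws_secret_access_key=os.environ["R2_SECRET_KEY"],
    )
```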
## Future: Multi-site Support
Current code is specific to happymh.com. To add new sites:
- Extract site-specific logic (chapter fetching, image URL extraction, CF handling) into per-site modules
- Keep shared infrastructure (Chrome management, image download, upload) in common modules
- Each site module implements: `fetch_chapters(page, slug)`, `get_chapter_images(page, slug, chapter_id)`, `fetch_metadata(page)`
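One way to pin down the per-site interface listed above is a `typing.Protocol`; this is a sketch of a possible shape, not something the repo has committed to, and the `page` parameter is typed loosely because it would be a Playwright page object.

```python
# Sketch: structural interface every site module would satisfy.
from typing import Any, Protocol, runtime_checkable

@runtime_checkable
class MangaSite(Protocol):
    def fetch_chapters(self, page: Any, slug: str) -> list: ...
    def get_chapter_images(self, page: Any, slug: str, chapter_id: str) -> list: ...
    def fetch_metadata(self, page: Any) -> dict: ...

class Happymh:
    """Example conforming module for the current site (stub bodies)."""
    def fetch_chapters(self, page, slug):
        return []
    def get_chapter_images(self, page, slug, chapter_id):
        return []
    def fetch_metadata(self, page):
        return {}
```

A `Protocol` keeps site modules decoupled from shared infrastructure: any module with these three methods plugs in, with no inheritance required.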