yiekheng cfd3308477 feat(media): unsupported image/video/audio formats fall back to document delivery
Old behaviour: HEIC/AVIF photos, .mov / .webm / .mkv videos, and niche
audio (FLAC, etc.) got rejected outright at upload with "Images are
not supported" / "Videos are not supported" errors. Strict but
unfriendly — recipients could still receive these as a downloadable
file via WhatsApp's document path; we just weren't using it.

New behaviour: anything not playable inline gets routed through the
document path automatically. The recipient downloads the file and
opens it in their default app. The 100 MB document cap applies
instead of the inline 5 / 16 / 16 MB caps. Only oversized uploads
get rejected.

Where the policy lives
----------------------
The classifier moved into a new `@cmbot/shared/whatsapp-media`
module so the web upload validator AND the bot's fire-reminder send
path share one source of truth:

  - resolveDeliveryKind(mime, bytes?) → "image" | "video" | "audio"
    | "document". Native types stay as-is; HEIF / AVIF / QuickTime /
    WebM / Matroska, plus any audio outside the supported set (MP3,
    M4A, AAC, OGG, AMR, WAV), all collapse to "document".
  - Bytes argument is optional but recommended: sniffing the first
    12 bytes of the file catches iOS Safari's habit of labelling
    a HEIC as image/jpeg or a .mov as video/mp4. The sniff can only
    demote, so a HEIF / AVIF or non-MP4 brand forces "document" even
    when the mime claims a native type.
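
To make the byte sniff concrete, here is a minimal sketch that builds a
synthetic 12-byte ISOBMFF header and reads the major brand back out.
`fakeFtypHeader` and `majorBrand` are hypothetical names used only for
this illustration; the shared module's own helper is `readFtypBrand`.

```typescript
// Hypothetical helpers for illustration; not part of the shared module.

/** Build a 12-byte ISOBMFF prefix: 4-byte box size, "ftyp", 4-byte major brand. */
function fakeFtypHeader(brand: string): Uint8Array {
  if (brand.length !== 4) throw new Error("brand must be exactly 4 characters");
  const bytes = new Uint8Array(12);
  bytes[3] = 0x18; // big-endian box size (24); the exact value is irrelevant here
  const tag = "ftyp" + brand;
  for (let i = 0; i < tag.length; i++) bytes[4 + i] = tag.charCodeAt(i);
  return bytes;
}

/** Return the lower-cased major brand, or null when bytes 4..7 are not "ftyp". */
function majorBrand(bytes: Uint8Array): string | null {
  if (bytes.length < 12) return null;
  const ascii = (from: number, to: number): string => {
    let out = "";
    for (let i = from; i < to; i++) out += String.fromCharCode(bytes[i] ?? 0);
    return out;
  };
  if (ascii(4, 8) !== "ftyp") return null;
  return ascii(8, 12).toLowerCase();
}

// A HEIC photo labelled image/jpeg by the browser still carries a HEIF brand.
const claimedMime = "image/jpeg";
const head = fakeFtypHeader("heic");
console.log(claimedMime, "->", majorBrand(head));

// A real JPEG starts with FF D8 FF and has no ftyp box, so no brand is found.
const jpegHead = new Uint8Array([0xff, 0xd8, 0xff, 0xe0, 0x00, 0x10, 0x4a, 0x46, 0x49, 0x46, 0x00, 0x01]);
console.log(majorBrand(jpegHead));
```

Only the brand found in the bytes decides whether the file is demoted;
a header without an `ftyp` box leaves the mime-based decision alone.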

Web side
--------
- `lib/whatsapp-media.ts` re-exports the shared helpers and keeps
  only the validator + byte-formatter. `validateForWhatsApp` calls
  resolveDeliveryKind internally; the size cap it returns is for the
  RESOLVED kind (so a HEIC routes to document and gets the 100 MB
  cap). The "Images are not supported" / "Videos are not supported"
  rejection messages are gone — there's no format rejection anymore.
- `actions/media.ts` collapses the previous explicit-mime + byte-sniff
  pair into a single `validateForWhatsApp(mime, size, bytes)` call.
- Compose-step upload-zone hint updated to spell out the per-kind
  caps: "JPEG/PNG up to 5 MB · MP4/3GP up to 16 MB · MP3/M4A/OGG
  up to 16 MB · documents up to 100 MB".
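
The size rule can be sketched as follows. This is not the code in
`lib/whatsapp-media.ts`: `checkSize`, `formatBytes`, and the result
shape are hypothetical stand-ins, and the sketch assumes a file exactly
at the cap is accepted.

```typescript
// Sketch only: names and result shape are assumptions, not the real validator.
const MB = 1024 * 1024;

const CAPS = { image: 5 * MB, video: 16 * MB, audio: 16 * MB, document: 100 * MB } as const;
type DeliveryKind = keyof typeof CAPS;

type SizeCheck =
  | { ok: true; kind: DeliveryKind; limit: number }
  | { ok: false; kind: DeliveryKind; limit: number; error: string };

/** Render a byte count in MB, with one decimal only when needed. */
function formatBytes(bytes: number): string {
  const mb = bytes / MB;
  return `${Number.isInteger(mb) ? mb : mb.toFixed(1)} MB`;
}

/** Apply the cap of the already-resolved delivery kind; format never rejects. */
function checkSize(kind: DeliveryKind, sizeBytes: number): SizeCheck {
  const limit = CAPS[kind];
  if (sizeBytes <= limit) return { ok: true, kind, limit };
  return {
    ok: false,
    kind,
    limit,
    error: `File is ${formatBytes(sizeBytes)}; the ${kind} limit is ${formatBytes(limit)}.`,
  };
}

// A 30 MB HEIC resolves to "document", so it is judged against 100 MB.
console.log(checkSize("document", 30 * MB));
// The same 30 MB judged as an inline image would exceed the 5 MB cap.
console.log(checkSize("image", 30 * MB));
```

Resolving the kind first and sizing second is what lets a demoted file
use the larger document cap.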

Bot side
--------
- `fire-reminder.ts` reads the first 12 bytes of the file before
  dispatching and calls `resolveDeliveryKind(mimeType, head)` to
  pick the senderKind. So a HEIC on disk (whose mime claims
  image/jpeg) gets sent via Baileys' document path — no failed
  thumbnail extraction, message arrives as a downloadable .heic.
- New `readHeadBytes(filePath, n)` helper opens, reads N bytes,
  closes — no full-file slurp.
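
A head-only read along these lines could look like the sketch below.
It assumes Node.js and `node:fs/promises`; the bot's actual
`readHeadBytes` may differ in return type and error handling.

```typescript
import { open } from "node:fs/promises";

/**
 * Sketch: open the file, read at most `n` bytes from offset 0, close.
 * The handle is closed even when the read throws.
 */
async function readHeadBytes(filePath: string, n: number): Promise<Uint8Array> {
  const handle = await open(filePath, "r");
  try {
    const buffer = new Uint8Array(n);
    const { bytesRead } = await handle.read(buffer, 0, n, 0);
    // Files shorter than `n` yield a shorter view instead of zero padding.
    return buffer.subarray(0, bytesRead);
  } finally {
    await handle.close();
  }
}
```

Returning a view trimmed to `bytesRead` means a file shorter than 12
bytes produces a short array, which the sniffers already treat as "no
brand found".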

Tests
-----
249 web + 31 shared + 26 bot = 306 passing total.

Web (`lib/whatsapp-media.test.ts`):
- "HEIC at 30 MB allowed: routes to document (100 MB cap)"
- "HEIC at 110 MB rejects: exceeds the document cap"
- "MOV at 50 MB allowed (would be 16 MB cap as video, 100 MB as
  document)"
- "MOV pretending to be mp4 demotes to document (50 MB allowed)"
- "FLAC audio routes to document path"
- "genuine MP4 byte-sniff path keeps it as video"

Shared (`packages/shared/src/whatsapp-media.test.ts`, new):
- The cross-package contract: 11 tests covering size limits,
  classifyMediaKind, resolveDeliveryKind for native + demoted +
  byte-sniff cases, plus the underlying helpers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 13:07:54 +08:00

/**
 * WhatsApp media classification + size limits.
 *
 * Lives in @cmbot/shared because both the web upload validator AND
 * the bot's fire-reminder loop need to agree on:
 *
 *   1. Per-kind size caps (image 5 MB, video 16 MB, audio 16 MB,
 *      document 100 MB).
 *   2. Which mime types WhatsApp Web reliably plays inline. If the
 *      uploaded mime is outside the inline-playable set, we fall
 *      back to delivering the file as a document (a downloadable
 *      attachment) instead of rejecting the upload; recipients
 *      can still get the file, they just open it in their default
 *      app.
 *   3. ISOBMFF magic-byte sniffs that catch the common case of iOS
 *      Safari uploading a HEIC photo (or .mov video) with a lying
 *      Content-Type like image/jpeg / video/mp4. The bytes win;
 *      mime is treated as a hint.
 *
 * The bot calls `resolveDeliveryKind(mime, bytes)` against the first
 * 12 bytes of the file on disk to pick the correct Baileys sender
 * path (image / video / audio / document). The web calls the same
 * function with the buffered upload to pick the size limit AND
 * decide whether to reject (only on size, never on format).
 */

const MB = 1024 * 1024;
const KB = 1024;

export const WA_LIMITS = {
  image: 5 * MB,
  video: 16 * MB,
  audio: 16 * MB,
  document: 100 * MB,
  sticker: 100 * KB,
} as const;

export type WaMediaKind = keyof typeof WA_LIMITS;

/** The largest single-file upload WA will accept across all kinds. */
export const WA_MAX_BYTES = WA_LIMITS.document;

// ---------------------------------------------------------------------------
// Mime classification
// ---------------------------------------------------------------------------

/**
 * Map a MIME type to a coarse delivery kind based on its top-level
 * category. Anything not image / video / audio falls through to
 * "document". Matching is case-insensitive, like the helpers below.
 */
export function classifyMediaKind(mimeType: string): WaMediaKind {
  // MIME types are case-insensitive, so normalise before matching.
  const mime = mimeType.toLowerCase();
  if (mime.startsWith("image/")) return "image";
  if (mime.startsWith("video/")) return "video";
  if (mime.startsWith("audio/")) return "audio";
  return "document";
}

const UNSUPPORTED_IMAGE_MIMES: ReadonlySet<string> = new Set([
  "image/heic",
  "image/heif",
  "image/heic-sequence",
  "image/heif-sequence",
  "image/avif",
]);

export function isUnsupportedImageMime(mimeType: string): boolean {
  return UNSUPPORTED_IMAGE_MIMES.has(mimeType.toLowerCase());
}

const SUPPORTED_VIDEO_MIMES: ReadonlySet<string> = new Set([
  "video/mp4",
  "video/3gpp",
  "video/3gpp2",
]);

export function isSupportedVideoMime(mimeType: string): boolean {
  return SUPPORTED_VIDEO_MIMES.has(mimeType.toLowerCase());
}

const SUPPORTED_AUDIO_MIMES: ReadonlySet<string> = new Set([
  "audio/mpeg",
  "audio/mp4",
  "audio/aac",
  "audio/ogg",
  "audio/amr",
  "audio/wav",
  "audio/x-wav",
]);

export function isSupportedAudioMime(mimeType: string): boolean {
  return SUPPORTED_AUDIO_MIMES.has(mimeType.toLowerCase());
}

// ---------------------------------------------------------------------------
// Magic-byte sniffs (HEIF / AVIF / QuickTime)
// ---------------------------------------------------------------------------

const UNSUPPORTED_IMAGE_BRANDS: ReadonlySet<string> = new Set([
  "heic", "heix", "hevc", "heim", "heis", "mif1", "msf1", "avif", "avis",
]);

const MP4_COMPATIBLE_BRANDS: ReadonlySet<string> = new Set([
  "mp41", "mp42", "isom", "iso2", "iso3", "iso4", "iso5", "iso6",
  "m4v ", "f4v ", "3gp4", "3gp5", "3gp6",
]);

/**
 * Read the 4-character major brand from a leading ISOBMFF `ftyp` box.
 * Layout of the first 12 bytes: 0..3 box size, 4..7 "ftyp",
 * 8..11 major brand. Returns null when fewer than 12 bytes are
 * available or bytes 4..7 are not "ftyp". The brand is lower-cased.
 */
function readFtypBrand(bytes: Uint8Array): string | null {
  if (bytes.length < 12) return null;
  if (
    bytes[4] !== 0x66 || // 'f'
    bytes[5] !== 0x74 || // 't'
    bytes[6] !== 0x79 || // 'y'
    bytes[7] !== 0x70 // 'p'
  ) {
    return null;
  }
  return String.fromCharCode(
    bytes[8]!,
    bytes[9]!,
    bytes[10]!,
    bytes[11]!,
  ).toLowerCase();
}

/**
 * True when the bytes are an ISOBMFF container with an
 * image brand the bot's Sharp can't decode (HEIF / AVIF).
 */
export function sniffUnsupportedImage(bytes: Uint8Array): boolean {
  const brand = readFtypBrand(bytes);
  return brand !== null && UNSUPPORTED_IMAGE_BRANDS.has(brand);
}

/**
 * True when the bytes are an ISOBMFF container with a brand that
 * ISN'T MP4-compatible (typically QuickTime "qt  " from .mov files;
 * the brand is "qt" padded with two spaces to four characters).
 * Returns false when no `ftyp` box is found.
 */
export function sniffUnsupportedVideo(bytes: Uint8Array): boolean {
  const brand = readFtypBrand(bytes);
  if (brand === null) return false;
  return !MP4_COMPATIBLE_BRANDS.has(brand);
}

// ---------------------------------------------------------------------------
// resolveDeliveryKind: the cross-package contract
// ---------------------------------------------------------------------------

/**
 * Resolve the actual Baileys sender path the bot should take, given
 * the stored mime AND (optionally) the first 12 bytes of the file.
 *
 *   - JPEG / PNG / WebP / GIF            → "image"
 *   - HEIC / HEIF / AVIF                 → "document" (no inline preview)
 *   - MP4 / 3GP                          → "video"
 *   - .mov / WebM / MKV / AVI            → "document"
 *   - MP3 / M4A / OGG / AAC / AMR / WAV  → "audio"
 *   - other audio                        → "document"
 *   - everything else                    → "document"
 *
 * Images are checked against a denylist (any image/* mime not listed
 * as unsupported stays "image"); video and audio are checked against
 * allowlists. Audio is decided by mime alone.
 *
 * Bytes are optional but recommended: they catch the case where
 * iOS Safari uploads a HEIC photo with mime `image/jpeg` (or a
 * QuickTime .mov with mime `video/mp4`), which mime alone misses.
 */
export function resolveDeliveryKind(
  mimeType: string,
  bytes?: Uint8Array,
): WaMediaKind {
  const native = classifyMediaKind(mimeType);
  if (native === "image") {
    if (isUnsupportedImageMime(mimeType)) return "document";
    if (bytes && sniffUnsupportedImage(bytes)) return "document";
    return "image";
  }
  if (native === "video") {
    if (!isSupportedVideoMime(mimeType)) return "document";
    if (bytes && sniffUnsupportedVideo(bytes)) return "document";
    return "video";
  }
  if (native === "audio") {
    if (!isSupportedAudioMime(mimeType)) return "document";
    return "audio";
  }
  return "document";
}
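
On the bot side, the resolved kind then has to be turned into a send
payload. The sketch below shows one way that mapping might look.
`OutgoingFile`, `MediaContent`, and `buildContent` are hypothetical
names; the object keys (`image`, `video`, `audio`, `document`,
`mimetype`, `fileName`, `caption`) are believed to mirror the content
objects Baileys' `sendMessage` accepts, but that is an assumption to
verify against the Baileys version in use.

```typescript
// Hypothetical glue code: not part of @cmbot/shared or Baileys.
type DeliveryKind = "image" | "video" | "audio" | "document";

interface OutgoingFile {
  data: Uint8Array; // Baileys itself expects a Node Buffer (a Uint8Array subclass)
  mimeType: string;
  fileName: string;
  caption: string;
}

type MediaContent =
  | { image: Uint8Array; caption: string }
  | { video: Uint8Array; caption: string }
  | { audio: Uint8Array; mimetype: string }
  | { document: Uint8Array; mimetype: string; fileName: string; caption: string };

/** Pick the payload shape that matches the resolved delivery kind. */
function buildContent(kind: DeliveryKind, file: OutgoingFile): MediaContent {
  switch (kind) {
    case "image":
      return { image: file.data, caption: file.caption };
    case "video":
      return { video: file.data, caption: file.caption };
    case "audio":
      return { audio: file.data, mimetype: file.mimeType };
    case "document":
      // Demoted files keep their file name so the recipient gets e.g. a .heic.
      return {
        document: file.data,
        mimetype: file.mimeType,
        fileName: file.fileName,
        caption: file.caption,
      };
  }
}

// Example: a HEIC that resolved to "document" keeps its original file name.
const example: OutgoingFile = {
  data: new Uint8Array([0]),
  mimeType: "image/heic",
  fileName: "IMG_0001.heic",
  caption: "Reminder attachment",
};
console.log(Object.keys(buildContent("document", example)));
```

Keeping this mapping in one function means the sender never branches on
the mime again; it only follows what `resolveDeliveryKind` decided.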