A patch series adding subtitle format conversion to FFmpeg, maintained against master and the 8.1 release branch. Currently in its sixth iteration.
Prior art. Two earlier attempts stalled. Coza (2022) put libass in libavcodec, a dependency violation. softworkz (2021–2025) proposed full subtitle filter infrastructure across 25 patches and 89 files; its design questions remain unresolved.
This series takes the minimal path. libass (rendering) and Tesseract (OCR) already live in libavfilter. Small utility APIs there, called from fftools in the same pattern as sub2video, handle both directions: text to bitmap (rendering) and bitmap to text (OCR). No filter infrastructure. No AVFrame changes.
Stable: pgs5-8.1 (23 patches). Latest: v5 (23 patches). Series A/B are independent. C depends on B. D and E are independent additions. F builds on A and B.
av_quantize_* API
vf_paletteuse
ff_sub_render_* API in libavfilter
libavutil/sub_util.{h,c}
ffmpeg_enc_sub.c (render, quantize, animate, coalesce)
av_quantize_* and ff_palette_map_apply
ff_sub_ocr_* API in libavfilter
ffmpeg_dec_sub.c (OCR, grayscale, dedup, positioning)
ffmpeg_enc.c encoding pipeline with clear Display Sets via AV_CODEC_PROP_EXPLICIT_END
-sub_ocr_lang, -sub_ocr_datapath CLI options
doc/encoders.texi (PGS), doc/ffmpeg.texi (OCR options), Changelog

Palette quantization PSNR at 256 colours (PGS maximum). NeuQuant is the default; Median Cut and ELBG available via -quantize_method. Full comparison.
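For readers unfamiliar with the metric, PSNR compares the quantized image with the original through the mean squared error over all channel samples. The sketch below uses the standard definition with an 8-bit peak of 255. The function name `psnr` and the choice to average over every channel of every pixel are assumptions made for illustration; the series may weight or select channels differently.

```python
import math

def psnr(original, quantized, peak=255):
    """PSNR in dB between two equally sized lists of channel tuples.

    Hypothetical helper: averages squared error over every channel of
    every pixel, then applies 10 * log10(peak^2 / MSE).
    """
    if len(original) != len(quantized):
        raise ValueError("images must have the same number of pixels")
    squared_error = 0
    samples = 0
    for p, q in zip(original, quantized):
        if len(p) != len(q):
            raise ValueError("pixels must have the same channel count")
        for a, b in zip(p, q):
            squared_error += (a - b) ** 2
            samples += 1
    if samples == 0:
        raise ValueError("images must not be empty")
    if squared_error == 0:
        return math.inf  # identical images: no noise at all
    mse = squared_error / samples
    return 10 * math.log10(peak ** 2 / mse)

# An error of exactly 1 in every channel gives MSE = 1.
print(round(psnr([(10, 20, 30)], [(11, 21, 31)]), 2))
```

Higher is better: a quantizer that maps pixels to closer palette entries lowers the MSE and raises the PSNR.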
Tested across 114 languages with UDHR Article 1 roundtrip (text → PGS → text). 105 pass, 9 fail (Tesseract training data limitations). Bitmap deduplication skips OCR on palette-only changes (PGS fade sequences). PSM fallback handles RTL and complex scripts. Per-language preserve_interword_spaces tuning prevents word merging in Arabic, Hebrew, and Persian.
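The deduplication idea can be sketched as a cache keyed on the indexed bitmap alone, so a frame whose pixels are unchanged but whose palette differs (a fade step) reuses the previous OCR result. This is a minimal illustration, not the code in ffmpeg_dec_sub.c: `OcrCache`, `recognise` and `fake_ocr` are hypothetical names, and hashing with SHA-256 is an assumption chosen for brevity.

```python
import hashlib

class OcrCache:
    """Run OCR only when the indexed bitmap itself changes (hypothetical)."""

    def __init__(self, ocr):
        self._ocr = ocr          # callable doing the expensive recognition
        self._last_key = None
        self._last_text = None
        self.calls = 0           # number of real OCR invocations

    def recognise(self, width, height, indices, palette):
        # The palette is deliberately left out of the key, so
        # palette-only updates hit the cache.
        key = (width, height, hashlib.sha256(indices).digest())
        if key != self._last_key:
            self._last_text = self._ocr(width, height, indices, palette)
            self._last_key = key
            self.calls += 1
        return self._last_text

def fake_ocr(width, height, indices, palette):
    # Stand-in for a Tesseract call.
    return "hello"

cache = OcrCache(fake_ocr)
bitmap = bytes([0, 1, 1, 0])
for alpha in (64, 128, 255):  # fade-in: same pixels, new palette each step
    palette = [(0, 0, 0, 0), (255, 255, 255, alpha)]
    text = cache.recognise(2, 2, bitmap, palette)
print(cache.calls)  # 1
```

The sketch recognises the first frame of a fade, which may be the faintest one; a real implementation has to choose which palette state to feed to the OCR engine.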
13 FATE tests run on every push via CI. 12 are self-contained; sub-pgs uses fate-suite samples.
PGS encoder
Animation & pipeline
Quantization & GIF
OCR
Full OCR language coverage (114 languages, 92% pass rate) is listed on the OCR languages page.
Known upstream failures (not our patches)
Fails on clean FFmpeg 8.1 without our patches. Pre-existing libswscale issue.
Decoder model grounded in the Panasonic and Sony HDMV patents (US20090185789A1, US8638861B2, US7620297B2). Buffer sizes, transfer rates, and palette limits taken from the patents, not reverse-engineered from player behaviour.
SUPer by cubicibo used as a hardware-validated reference.
Each iteration informed by testing, review, and upstream FFmpeg conventions. History preserved as git tags.
docs/pgs-specification.md)
ff_sub_render_* / ff_sub_ocr_* in libavfilter, pipeline in fftools
#include .c pattern
AV_CODEC_PROP_EXPLICIT_END, avpriv_ cross-library prefixes
libavutil/sub_util.{h,c}, internal structs to palettemap_internal.h
doc/encoders.texi (PGS), doc/ffmpeg.texi (OCR options), Changelog
AV_CODEC_PROP_EXPLICIT_END and av_subtitle_needs_clear()

Measured against real PGS output.
Overlapping events: A 1s-5s, B 3s-7s
DS 1  T=1.0s  Epoch Start  A alone
DS 2  T=3.0s  Epoch Start  A+B composite
DS 3  T=5.0s  Epoch Start  B alone (A expired)
DS 4  T=7.0s  Normal       clear (B expired)
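The overlapping-events timeline above can be reproduced with a small boundary sweep: collect every start and end time, and at each one emit a Display Set describing what is visible from that instant. This is a sketch under assumptions, not the encoder's logic. `Event` and `display_sets` are hypothetical names, and the rule "anything visible means Epoch Start, nothing visible means a Normal clear" is inferred only from this one example; a real encoder may use other composition states.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    name: str
    start: float  # seconds
    end: float    # seconds, exclusive

def display_sets(events):
    """Return (time, state, visible_names) for every event boundary."""
    boundaries = sorted({t for e in events for t in (e.start, e.end)})
    result = []
    for t in boundaries:
        # An event is visible from its start up to, but not including, its end.
        visible = [e.name for e in events if e.start <= t < e.end]
        # Assumption: a changed on-screen composition starts a new epoch,
        # an empty screen is signalled as a Normal clear.
        state = "Epoch Start" if visible else "Normal"
        result.append((t, state, visible))
    return result

timeline = display_sets([Event("A", 1.0, 5.0), Event("B", 3.0, 7.0)])
for number, (t, state, visible) in enumerate(timeline, 1):
    label = "+".join(visible) if visible else "clear"
    print(f"DS {number}  T={t:.1f}s  {state:<11}  {label}")
```

For events A (1s-5s) and B (3s-7s) this yields four Display Sets at 1.0s, 3.0s, 5.0s and 7.0s, matching the listing.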