A patch series adding subtitle format conversion to FFmpeg, maintained against master and the 8.1 release branch. Currently in its seventh iteration.
Prior art. Two attempts stalled. Coza (2022) put libass in libavcodec — a dependency violation. softworkz (2021–2025) proposed full subtitle filter infrastructure across 25 patches and 89 files; design questions remain unresolved.
This series takes the minimal path. libass (rendering) and Tesseract (OCR) already live in libavfilter. Small utility APIs there, called from fftools in the same pattern as sub2video, handle both directions. No filter infrastructure. No AVFrame changes.
Stable: pgs7-8.1 (29 patches). Latest: pgs7 (29 patches on master). Plus a standalone DVB fix (1 patch). Structured as independent submission groups for upstream review.
- force_all option for forced subtitles; max_cdb_usage CDB rate control
- AV_DISPOSITION_FORCED ↔ per-rect flag propagation
- -forced_subs_filter CLI option to split streams by forced flag
- av_quantize_* API
- vf_paletteuse
- ffmpeg_enc_sub.c (render, quantize, animate, coalesce)
- forced_style option: ASS style names → forced flag (comma-separated, opt-in)
- ffmpeg_dec_sub.c)
- -sub_ocr_lang, -sub_ocr_datapath CLI options

Palette quantization PSNR at 256 colours (PGS maximum). NeuQuant is the default; Median Cut and ELBG available via -quantize_method. Full comparison.
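For readers unfamiliar with the metric, PSNR between an original bitmap and its palette-quantized version can be computed as in the sketch below. This is an illustration only, not the measurement code used by the series: the function name `psnr_rgba` and the flat list-of-tuples pixel layout are assumptions.

```python
import math

def psnr_rgba(original, quantized, peak=255):
    """PSNR in dB between two equally sized lists of (r, g, b, a) tuples.

    Hypothetical helper for illustration; the pixel layout is an assumption.
    """
    if len(original) != len(quantized) or not original:
        raise ValueError("images must be non-empty and the same size")
    # Sum of squared differences over every channel of every pixel.
    sq_err = sum((a - b) ** 2
                 for po, pq in zip(original, quantized)
                 for a, b in zip(po, pq))
    mse = sq_err / (len(original) * 4)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)
```

A higher value means the quantized bitmap is closer to the original; identical images give infinity.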
Tested across 114 languages with UDHR Article 1 roundtrip (text → PGS → text). 105 pass, 9 fail (Tesseract training data limitations). Bitmap deduplication skips OCR on palette-only changes (PGS fade sequences). PSM fallback handles RTL and complex scripts. Per-language preserve_interword_spaces tuning prevents word merging in Arabic, Hebrew, and Persian.
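The bitmap deduplication idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in the patches: the class name `OcrDeduplicator` and its interface are hypothetical, and the `ocr` callable stands in for the Tesseract call. The cache key covers only the dimensions and the palette indices, so a fade that changes nothing but the palette reuses the previous text.

```python
import hashlib

class OcrDeduplicator:
    """Skip OCR when only the palette changed (hypothetical sketch)."""

    def __init__(self, ocr):
        self._ocr = ocr          # callable(width, height, indices, palette) -> str
        self._last_key = None
        self._last_text = None
        self.ocr_calls = 0       # counts how often the real OCR ran

    def recognize(self, width, height, indices, palette):
        # The palette is deliberately excluded from the key: two bitmaps with
        # the same indexed pixels but different colours count as the same text.
        key = hashlib.sha256(
            width.to_bytes(4, "big") + height.to_bytes(4, "big") + bytes(indices)
        ).digest()
        if key != self._last_key:
            self._last_text = self._ocr(width, height, indices, palette)
            self._last_key = key
            self.ocr_calls += 1
        return self._last_text
```

Comparing only against the previous bitmap is enough for fade sequences, where consecutive display sets share the same pixels.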
18 FATE tests run on every push via CI. 17 are self-contained; sub-pgs uses fate-suite samples.
PGS encoder
Forced subtitles & rate control
Animation & pipeline
Quantization & GIF
OCR
Full OCR language coverage (114 languages, 92% pass rate) is listed on the OCR languages page.
Known upstream failures (not our patches)
Fails on clean FFmpeg 8.1 without our patches. Pre-existing libswscale issue.
Decoder model grounded in the Panasonic and Sony HDMV patents (US20090185789A1, US8638861B2, US7620297B2). Buffer sizes, transfer rates, and palette limits taken from the patents, not reverse-engineered from player behaviour.
SUPer by cubicibo used as a hardware-validated reference.
Each iteration informed by testing, review, and upstream FFmpeg conventions. History preserved as git tags.
- docs/pgs-specification.md)
- ff_sub_render_* / ff_sub_ocr_* in libavfilter, pipeline in fftools
- #include .c pattern
- AV_CODEC_PROP_EXPLICIT_END, avpriv_ cross-library prefixes
- libavutil/sub_util.{h,c}, internal structs to palettemap_internal.h
- doc/encoders.texi (PGS), doc/ffmpeg.texi (OCR options), Changelog
- AV_CODEC_PROP_EXPLICIT_END and av_subtitle_needs_clear()
- force_all option, disposition bridge (AV_DISPOSITION_FORCED ↔ per-rect flags)
- -forced_subs_filter CLI option: split streams by forced/non-forced
- forced_style: ASS style names → forced flag (comma-separated, opt-in)
- max_cdb_usage drops events exceeding HDMV buffer model
- pgs-test-util.h

Measured against real PGS output.
Overlapping events: A 1s-5s, B 3s-7s

- DS 1  T=1.0s  Epoch Start  A alone
- DS 2  T=3.0s  Epoch Start  A+B composite
- DS 3  T=5.0s  Epoch Start  B alone (A expired)
- DS 4  T=7.0s  Normal  clear (B expired)
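The timeline above can be reproduced with a short sketch. The rule used here (a new display set at every start or end time, "Epoch Start" while anything is visible, "Normal" for the final clear) is inferred from this one example; the encoder may pick composition states differently in other cases. The function name `display_sets` and the tuple layout are assumptions for illustration.

```python
def display_sets(events):
    """Derive display sets from (name, start, end) events, times in seconds.

    Hypothetical sketch: one display set per start/end boundary, showing the
    composite of every event active at that instant.
    """
    # Every start or end time changes what is on screen.
    boundaries = sorted({t for _, start, end in events for t in (start, end)})
    result = []
    for t in boundaries:
        # An event is visible from its start up to, but not including, its end.
        active = [name for name, start, end in events if start <= t < end]
        if active:
            result.append((t, "Epoch Start", "+".join(active)))
        else:
            result.append((t, "Normal", "clear"))
    return result

for ds in display_sets([("A", 1.0, 5.0), ("B", 3.0, 7.0)]):
    print(ds)
```

For the two events in the example this yields four display sets at 1.0s, 3.0s, 5.0s and 7.0s, with A, A+B, B and a clear respectively.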