
Colour Quantization: Three Algorithms, One API

Three colour quantization algorithms integrated into a single av_quantize_* API. Quality and performance data on representative subtitle images.

All measurements use 128×64 test images with quality=10. PSNR is computed per pixel in sRGB against the original and averaged over the RGB channels.
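The PSNR definition above can be sketched as follows. This is a minimal illustration, not the benchmark's own code: the function name `psnr_rgb` is hypothetical, an 8-bit peak value of 255 is assumed, and the 99.99 dB ceiling for identical images is an assumption based on the values reported in the tables.

```python
import math

def psnr_rgb(original, quantized, cap=99.99):
    """PSNR in dB between two equal-length lists of (r, g, b) tuples (0-255).

    The squared error is averaged over every pixel and over the three
    colour channels. `cap` is an assumed ceiling for identical images,
    where the mean squared error is zero and PSNR would be infinite.
    """
    if len(original) != len(quantized) or not original:
        raise ValueError("images must be non-empty and the same size")
    sq_err = 0
    for (r1, g1, b1), (r2, g2, b2) in zip(original, quantized):
        sq_err += (r1 - r2) ** 2 + (g1 - g2) ** 2 + (b1 - b2) ** 2
    mse = sq_err / (3 * len(original))
    if mse == 0:
        return cap
    return min(cap, 10 * math.log10(255 ** 2 / mse))

# A 128x64 image in which every channel is off by exactly one level
orig = [(100, 150, 200)] * (128 * 64)
quant = [(101, 149, 201)] * (128 * 64)
print(round(psnr_rgb(orig, quant), 2))  # MSE is 1, so about 48.13 dB
```

An error of a single level in every channel already gives about 48 dB, which puts the 40 dB threshold used below into perspective.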

The Algorithms

ELBG

Patanè & Russo, 2001

Enhanced LBG vector quantizer. Iteratively refines codebook entries by splitting high-distortion cells and merging low-utility ones. General N-dimensional; wrapped here for 4D RGBA.
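To make the codebook refinement concrete, here is a sketch of one iteration of plain LBG (the generalised Lloyd algorithm that ELBG builds on), applied to 4D RGBA points. It is deliberately not ELBG: the step that relocates low-utility codewords into high-distortion cells is omitted, and the function names are hypothetical.

```python
def nearest(codebook, p):
    """Index of the codeword closest to point p (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(codebook[i], p)))

def lbg_step(points, codebook):
    """One plain LBG (generalised Lloyd) iteration.

    Returns the updated codebook and the total distortion measured
    against the codebook that was passed in. Empty cells keep their old
    codeword; relocating such codewords is the part ELBG adds and is
    NOT implemented here.
    """
    dim = len(codebook[0])
    sums = [[0.0] * dim for _ in codebook]
    counts = [0] * len(codebook)
    distortion = 0.0
    for p in points:
        i = nearest(codebook, p)
        counts[i] += 1
        for d in range(dim):
            sums[i][d] += p[d]
        distortion += sum((a - b) ** 2 for a, b in zip(codebook[i], p))
    new_codebook = [
        tuple(s / counts[i] for s in sums[i]) if counts[i] else tuple(codebook[i])
        for i in range(len(codebook))
    ]
    return new_codebook, distortion

# Two clusters of RGBA pixels and a deliberately poor initial codebook
pixels = [(250, 250, 250, 255), (254, 254, 254, 255),
          (10, 10, 10, 255), (14, 14, 14, 255)]
codebook = [(200, 200, 200, 255), (100, 100, 100, 255)]
codebook, d0 = lbg_step(pixels, codebook)
codebook, d1 = lbg_step(pixels, codebook)
print(d0, d1)  # distortion drops once the codewords reach the cluster centres
```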

Median Cut

Heckbert, 1982

Recursively splits colour space along the axis of greatest variance. Deterministic and predictable. Extracted from FFmpeg's vf_palettegen filter. Uses OkLab perceptual space internally.
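The variance-driven splitting can be illustrated with a small sketch. It is a simplification under stated assumptions: it works in plain RGB rather than OkLab, it is not the code extracted from vf_palettegen, and `median_cut` and `channel_variance` are hypothetical names.

```python
def channel_variance(box, ch):
    """Population variance of one colour channel within a box of pixels."""
    mean = sum(p[ch] for p in box) / len(box)
    return sum((p[ch] - mean) ** 2 for p in box) / len(box)

def median_cut(pixels, n_colours):
    """Split boxes of pixels until n_colours boxes exist or none can be split.

    Each round picks the box and channel with the largest variance, sorts
    that box along the channel and cuts it at the median. Plain RGB only.
    """
    boxes = [list(pixels)]
    while len(boxes) < n_colours:
        best = None  # (variance, box index, channel)
        for i, box in enumerate(boxes):
            if len(box) < 2:
                continue
            for ch in range(3):
                v = channel_variance(box, ch)
                if v > 0 and (best is None or v > best[0]):
                    best = (v, i, ch)
        if best is None:  # every remaining box holds a single colour
            break
        _, i, ch = best
        box = sorted(boxes.pop(i), key=lambda p: p[ch])
        mid = len(box) // 2
        boxes += [box[:mid], box[mid:]]
    # Each palette entry is the mean colour of one box
    return [tuple(round(sum(p[c] for p in box) / len(box)) for c in range(3))
            for box in boxes]

two_tone = [(255, 255, 255)] * 6 + [(0, 0, 0)] * 6
gradient = [(i, 0, 0) for i in range(8)]
print(median_cut(two_tone, 4))       # only two distinct colours exist
print(len(median_cut(gradient, 4)))  # four boxes along the red axis
```

Because there is no random element, the same input always yields the same palette, which is the determinism noted above.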

NeuQuant

Dekker, 1994

Kohonen self-organising map in colour space. Neurons compete to represent input colours via competitive learning. Fast, but non-deterministic, and its quality is sensitive to the learning rate.
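The competitive learning described above can be sketched with a one-dimensional self-organising map. This is a simplified illustration, not Dekker's full NeuQuant: it has no bias or frequency terms, the linear decay schedule and the initial learning rate of 0.5 are assumptions, and the fixed seed exists only to make the sketch repeatable.

```python
import random

def som_palette(pixels, n_colours, epochs=20, seed=0):
    """Train a 1-D self-organising map of n_colours neurons on RGB pixels.

    Simplified: linearly decaying learning rate and neighbourhood radius,
    without the bias/frequency terms of the full NeuQuant algorithm.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs match
    # Initialise the neurons along the grey diagonal
    neurons = [[255.0 * i / max(1, n_colours - 1)] * 3 for i in range(n_colours)]
    total = epochs * len(pixels)
    step = 0
    for _ in range(epochs):
        order = list(pixels)
        rng.shuffle(order)
        for p in order:
            frac = 1.0 - step / total              # decays from 1 towards 0
            alpha = 0.5 * frac                     # learning rate
            radius = int((n_colours // 8) * frac)  # neighbourhood radius
            # Competition: the neuron nearest to the pixel wins
            win = min(range(n_colours),
                      key=lambda i: sum((neurons[i][c] - p[c]) ** 2 for c in range(3)))
            # Cooperation: the winner and its neighbours move towards the pixel
            for j in range(max(0, win - radius), min(n_colours, win + radius + 1)):
                a = alpha / (1 + abs(j - win))
                for c in range(3):
                    neurons[j][c] += a * (p[c] - neurons[j][c])
            step += 1
    return [tuple(round(v) for v in n) for n in neurons]

red_blue = [(255, 0, 0)] * 50 + [(0, 0, 255)] * 50
palette = som_palette(red_blue, 4)
print(palette)
```

Changing `alpha` or the decay schedule changes where the neurons settle, which is one way to see the sensitivity to the learning rate.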

Quality — PSNR (dB)

Higher is better. ≥40 dB is visually lossless for subtitles. Measured on five test images at three palette sizes.

16-colour palette

| Image | ELBG | Median Cut | NeuQuant |
|---|---|---|---|
| Simple white text | 99.99 | 99.99 | 99.99 |
| Multicolour (4 regions) | 42.40 | 33.95 | 32.83 |
| Karaoke highlight | 99.99 | 99.99 | 99.99 |
| RGB gradient | 23.38 | 21.77 | 20.72 |
| HSV colour sweep | 20.73 | 19.14 | 15.32 |

64-colour palette

| Image | ELBG | Median Cut | NeuQuant |
|---|---|---|---|
| Simple white text | 99.99 | 99.99 | 99.99 |
| Multicolour (4 regions) | 53.73 | 54.51 | 42.00 |
| Karaoke highlight | 99.99 | 99.99 | 99.99 |
| RGB gradient | 29.38 | 27.36 | 24.97 |
| HSV colour sweep | 26.37 | 24.14 | 20.06 |

256-colour palette (PGS maximum)

| Image | ELBG | Median Cut | NeuQuant |
|---|---|---|---|
| Simple white text | 99.99 | 99.99 | 99.99 |
| Multicolour (4 regions) | 64.93 | 69.20 | 41.76 |
| Karaoke highlight | 99.99 | 99.99 | 99.99 |
| RGB gradient | 35.31 | 33.27 | 24.78 |
| HSV colour sweep | 32.31 | 29.46 | 20.84 |

Key finding: For typical subtitle images (white text, karaoke, limited colours), all three algorithms score 99.99 dB at every palette size. Differences appear only on the high-colour-count stress tests. ELBG leads on gradients at every palette size; Median Cut edges ahead on the structured multi-region image at 64 and 256 colours.

Quality — Visual Bars (256 colours)

PSNR on the hardest test cases. The "simple" cases are all 99.99 dB and omitted.

Multicolour (4 regions): ELBG 64.93 dB, Median Cut 69.20 dB, NeuQuant 41.76 dB

RGB gradient: ELBG 35.31 dB, Median Cut 33.27 dB, NeuQuant 24.78 dB

HSV colour sweep: ELBG 32.31 dB, Median Cut 29.46 dB, NeuQuant 20.84 dB

Performance — Time (ms)

Generate palette + map pixels for 8,192 pixels (128×64). Lower is better.
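A timing harness for this kind of measurement might look like the sketch below. The source does not say how the timings were collected, so the best-of-N strategy is an assumption, and `toy_quantize` is a placeholder workload rather than any of the three algorithms.

```python
import time

def time_ms(fn, *args, repeats=5):
    """Best-of-N wall-clock time of fn(*args) in milliseconds."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn(*args)
        best = min(best, (time.perf_counter() - t0) * 1000.0)
    return best

def toy_quantize(pixels, levels):
    """Placeholder workload: uniform quantization of each channel."""
    step = 256 // levels
    return [tuple((c // step) * step for c in p) for p in pixels]

# Synthetic 128x64 image, 8,192 pixels in total
pixels = [(x * 2 % 256, y * 4 % 256, (x + y) % 256)
          for y in range(64) for x in range(128)]
elapsed = time_ms(toy_quantize, pixels, 4)
print(f"{elapsed:.2f} ms for {len(pixels)} pixels")
```

Taking the minimum over several runs reduces the influence of scheduler noise, which matters when individual measurements are only a few milliseconds long.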

HSV colour sweep (worst case)

| Palette size | ELBG (ms) | Median Cut (ms) | NeuQuant (ms) |
|---|---|---|---|
| 16 colours | 2.54 | 8.41 | 2.16 |
| 64 colours | 7.30 | 10.90 | 2.86 |
| 256 colours | 24.59 | 17.91 | 5.06 |

Simple white text (best case)

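| Palette size | ELBG (ms) | Median Cut (ms) | NeuQuant (ms) |
|---|---|---|---|
| 16 colours | 1.51 | 0.74 | 0.49 |
| 64 colours | 4.33 | 0.72 | 0.67 |
| 256 colours | 15.98 | 0.76 | 1.40 |

Key finding: NeuQuant is the fastest in five of the six measurements; the exception is simple white text at 256 colours, where Median Cut is faster (0.76 ms vs 1.40 ms). ELBG scales poorly with palette size (O(n×steps×codebook)). Median Cut is fast on simple images but slows down on high-entropy inputs due to histogram construction in OkLab space. On the simple white text image, which is representative of PGS subtitles (typically <100 unique colours), Median Cut and NeuQuant finish in under 1.5 ms at every palette size; ELBG stays under 5 ms at 16 and 64 colours but takes about 16 ms at 256.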
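The O(n×steps×codebook) term can be compared with the measured ELBG timings on the HSV colour sweep. The sketch below assumes the number of steps is the same at every palette size, which the source does not state; `relative_cost` is a hypothetical helper.

```python
def relative_cost(n_pixels, codebook, steps=1):
    """Distance evaluations implied by an O(n x steps x codebook) model."""
    return n_pixels * steps * codebook

n = 128 * 64
base = relative_cost(n, 16)
# Measured ELBG times (ms) on the HSV colour sweep, from the table above
for size, measured_ms in [(16, 2.54), (64, 7.30), (256, 24.59)]:
    model = relative_cost(n, size) / base
    actual = measured_ms / 2.54
    print(f"{size:3d} colours: model x{model:.1f}, measured x{actual:.1f}")
```

Under this assumption the model predicts a 16-fold increase from 16 to 256 colours, while the measured increase is a little under 10-fold. One plausible reading is that fixed overheads or a varying step count account for the difference.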

Recommendation

For most subtitle work: use the default (NeuQuant). Subtitles typically have fewer than 100 unique colours, and all three algorithms score 99.99 dB on the simple test images at this complexity. NeuQuant is the fastest in nearly every measurement and produces excellent results on these images.

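For complex ASS with gradients or many colours: consider Median Cut or ELBG. At 256 colours they score roughly 8.5 to 11.5 dB higher than NeuQuant on the gradient stress tests and more than 23 dB higher on the multi-region image. ELBG is best on pure gradients; Median Cut is best on structured multi-region images at 64 and 256 colours and is faster than ELBG at 256 colours.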
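The guidance above could be encoded as a small selection helper. This is a hypothetical sketch: the returned strings are illustrative labels and not identifiers from the av_quantize_* API, the threshold of 100 mirrors the rule of thumb in the text, and gradient detection is left to the caller.

```python
def count_unique_colours(pixels):
    """Number of distinct colour tuples in the image."""
    return len(set(pixels))

def suggest_algorithm(pixels, has_gradients=False, threshold=100):
    """Return 'neuquant', 'median_cut' or 'elbg' following the guidance above.

    threshold=100 mirrors the "fewer than 100 unique colours" rule of thumb;
    has_gradients is supplied by the caller, this sketch does not detect it.
    """
    if count_unique_colours(pixels) < threshold:
        return "neuquant"
    return "elbg" if has_gradients else "median_cut"

white_text = [(255, 255, 255, 255)] * 500 + [(0, 0, 0, 0)] * 7692
sweep = [(x * 2 % 256, y * 4 % 256, (x + y) % 256, 255)
         for y in range(64) for x in range(128)]
print(suggest_algorithm(white_text))
print(suggest_algorithm(sweep, has_gradients=True))
```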
