Colour Quantization: Three Algorithms, One API

Three colour quantization algorithms are integrated behind a single av_quantize_* API. This page reports quality and performance data for each on representative subtitle images.

All measurements use 128×64 test images with quality=10. PSNR is computed per-pixel in sRGB against the original and averaged over the RGB channels.

The Algorithms

ELBG

Patane & Russo, 2001

Enhanced LBG vector quantizer. Iteratively refines codebook entries by splitting high-distortion cells and merging low-utility ones. General N-dimensional; wrapped here for 4D RGBA.
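As a rough illustration of the baseline that ELBG enhances, the sketch below performs one plain LBG (Lloyd) iteration over RGBA pixels: assign each pixel to its nearest codebook entry, then move each entry to the mean of its cell. It is not the code behind av_quantize_*; the names lbg_iterate and dist2_rgba are hypothetical, the 256-entry limit is an assumption, and the ELBG-specific step that relocates poorly used entries is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Squared Euclidean distance between two RGBA points (hypothetical helper). */
static uint32_t dist2_rgba(const uint8_t a[4], const uint8_t b[4])
{
    uint32_t d = 0;
    for (int c = 0; c < 4; c++) {
        int diff = (int)a[c] - (int)b[c];
        d += (uint32_t)(diff * diff);
    }
    return d;
}

/*
 * One plain LBG (Lloyd) iteration: assign every pixel to its nearest
 * codebook entry, then move each entry to the rounded mean of its cell.
 * Returns the total squared distortion measured during assignment.
 * Empty cells keep their old entry; ELBG would relocate them instead.
 */
static uint64_t lbg_iterate(const uint8_t (*px)[4], size_t n,
                            uint8_t (*book)[4], size_t k)
{
    uint64_t sum[256][4] = {{0}};
    size_t   cnt[256]    = {0};
    uint64_t distortion  = 0;

    assert(k > 0 && k <= 256); /* assumed upper bound for this sketch */

    for (size_t i = 0; i < n; i++) {
        size_t   best   = 0;
        uint32_t best_d = dist2_rgba(px[i], book[0]);
        for (size_t j = 1; j < k; j++) {
            uint32_t d = dist2_rgba(px[i], book[j]);
            if (d < best_d) {
                best_d = d;
                best   = j;
            }
        }
        distortion += best_d;
        cnt[best]++;
        for (int c = 0; c < 4; c++)
            sum[best][c] += px[i][c];
    }

    for (size_t j = 0; j < k; j++) {
        if (!cnt[j])
            continue;
        for (int c = 0; c < 4; c++)
            book[j][c] = (uint8_t)((sum[j][c] + cnt[j] / 2) / cnt[j]);
    }
    return distortion;
}
```

Calling lbg_iterate repeatedly until the returned distortion stops decreasing gives ordinary LBG; the "enhanced" part of ELBG targets the local minima this loop can get stuck in.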

Median Cut

Heckbert, 1982

Recursively splits colour space along the axis of greatest variance. Deterministic and predictable. Extracted from FFmpeg's vf_palettegen filter. Uses OkLab perceptual space internally.
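The axis choice described above can be sketched as follows. This is an illustrative helper, not the code extracted from vf_palettegen: the name widest_axis is hypothetical, and the colours are treated as generic three-component floats (for example OkLab L, a, b) without any per-colour weighting.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index (0..2) of the component with the largest variance
 * across n three-component colours. A median-cut step would then sort
 * the box along this axis and split it at the median.
 */
static int widest_axis(const float (*col)[3], size_t n)
{
    double mean[3] = {0}, spread[3] = {0};
    int best = 0;

    assert(n > 0);

    for (size_t i = 0; i < n; i++)
        for (int c = 0; c < 3; c++)
            mean[c] += col[i][c];
    for (int c = 0; c < 3; c++)
        mean[c] /= (double)n;

    /* Sum of squared deviations: n times the variance, so the
     * ordering between axes is the same as for the variance. */
    for (size_t i = 0; i < n; i++)
        for (int c = 0; c < 3; c++) {
            double d = col[i][c] - mean[c];
            spread[c] += d * d;
        }

    for (int c = 1; c < 3; c++)
        if (spread[c] > spread[best])
            best = c;
    return best;
}
```

Because the result depends only on the input colours, repeated runs on the same image pick the same axes, which is where the deterministic behaviour comes from.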

NeuQuant

Dekker, 1994

Kohonen self-organising map in colour space. Neurons compete to represent input colours via competitive learning. Fast, but non-deterministic and quality-sensitive to learning rate.
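To make the competitive-learning idea concrete, here is a simplified single training step. It is a sketch under stated assumptions rather than NeuQuant itself: the name som_train_step is hypothetical, colours are three floats, and the neighbourhood update and bias terms of the full algorithm are omitted.

```c
#include <assert.h>
#include <stddef.h>

/*
 * One simplified competitive-learning step: find the neuron closest to
 * the sample and pull it toward the sample by a fraction alpha (0..1).
 * Returns the index of the winning neuron. The full algorithm also
 * updates neighbouring neurons and biases the search; both are omitted.
 */
static size_t som_train_step(float (*net)[3], size_t k,
                             const float sample[3], float alpha)
{
    size_t win  = 0;
    float  best = -1.0f; /* negative means "no candidate yet" */

    assert(k > 0);

    for (size_t j = 0; j < k; j++) {
        float d = 0.0f;
        for (int c = 0; c < 3; c++) {
            float diff = net[j][c] - sample[c];
            d += diff * diff;
        }
        if (best < 0.0f || d < best) {
            best = d;
            win  = j;
        }
    }

    /* Move only the winner; alpha is the learning rate. */
    for (int c = 0; c < 3; c++)
        net[win][c] += alpha * (sample[c] - net[win][c]);
    return win;
}
```

The alpha parameter is the learning rate mentioned above: a large value lets late samples drag neurons far from earlier ones, while a small value may leave the network under-trained, which is one way quality becomes sensitive to it.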

Quality — PSNR (dB)

Higher is better. ≥40 dB is visually lossless for subtitles. Measured on five test images at three palette sizes.

16-colour palette

Image	ELBG	Median Cut	NeuQuant
Simple white text	99.99	99.99	99.99
Multicolour (4 regions)	42.40	33.95	32.83
Karaoke highlight	99.99	99.99	99.99
RGB gradient	23.38	21.77	20.72
HSV colour sweep	20.73	19.14	15.32

64-colour palette

Image	ELBG	Median Cut	NeuQuant
Simple white text	99.99	99.99	99.99
Multicolour (4 regions)	53.73	54.51	42.00
Karaoke highlight	99.99	99.99	99.99
RGB gradient	29.38	27.36	24.97
HSV colour sweep	26.37	24.14	20.06

256-colour palette (PGS maximum)

Image	ELBG	Median Cut	NeuQuant
Simple white text	99.99	99.99	99.99
Multicolour (4 regions)	64.93	69.20	41.76
Karaoke highlight	99.99	99.99	99.99
RGB gradient	35.31	33.27	24.78
HSV colour sweep	32.31	29.46	20.84

Key finding: For typical subtitle images (white text, karaoke, limited colours), all three algorithms produce perfect or near-perfect results. Differences only appear on high-colour-count stress tests. ELBG leads on gradients; Median Cut edges ahead on structured multi-region images at large palette sizes.

Quality — Visual Bars (256 colours)

PSNR on the hardest test cases. The "simple" cases are all 99.99 dB and omitted.

Image	ELBG	Median Cut	NeuQuant
Multicolour (4 regions)	64.93 dB	69.20 dB	41.76 dB
RGB gradient	35.31 dB	33.27 dB	24.78 dB
HSV colour sweep	32.31 dB	29.46 dB	20.84 dB

Performance — Time (ms)

Generate palette + map pixels for 8,192 pixels (128×64). Lower is better.

HSV colour sweep (worst case)

Palette size	ELBG	Median Cut	NeuQuant
16 colours	2.54	8.41	2.16
64 colours	7.30	10.90	2.86
256 colours	24.59	17.91	5.06

Simple white text (best case)

Palette size	ELBG	Median Cut	NeuQuant
16 colours	1.51	0.74	0.49
64 colours	4.33	0.72	0.67
256 colours	15.98	0.76	1.40

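Key finding: NeuQuant is the fastest in every measurement except simple white text at 256 colours, where Median Cut is faster (0.76 ms vs 1.40 ms). ELBG scales poorly with palette size (O(n×steps×codebook)). Median Cut is fast on simple images but slows down on high-entropy inputs due to histogram construction in OkLab space. For PGS subtitles (typically <100 unique colours), the simple white text timings are the closest reference: Median Cut and NeuQuant finish in under 1.5 ms at every palette size, while ELBG stays under 5 ms up to 64 colours but takes 15.98 ms at 256.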
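The n×codebook factor in that cost can be seen in the pixel-mapping half of the measurement. The sketch below is a brute-force nearest-colour search, not the mapping used by the measured implementation: the name map_to_palette is hypothetical, and plain squared RGBA distance is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Map each RGBA pixel to the index of its nearest palette entry by
 * exhaustive search: n * k distance evaluations for n pixels and a
 * k-entry palette. Ties go to the lowest palette index.
 */
static void map_to_palette(const uint8_t (*px)[4], size_t n,
                           const uint8_t (*pal)[4], size_t k,
                           uint8_t *index_out)
{
    assert(k > 0 && k <= 256); /* indices must fit in one byte */

    for (size_t i = 0; i < n; i++) {
        size_t   best   = 0;
        uint32_t best_d = UINT32_MAX;
        for (size_t j = 0; j < k; j++) {
            uint32_t d = 0;
            for (int c = 0; c < 4; c++) {
                int diff = (int)px[i][c] - (int)pal[j][c];
                d += (uint32_t)(diff * diff);
            }
            if (d < best_d) {
                best_d = d;
                best   = j;
            }
        }
        index_out[i] = (uint8_t)best;
    }
}
```

For the 8,192-pixel test image and a 256-entry palette this is 2,097,152 distance evaluations per pass; an iterative quantizer that repeats a similar assignment on every refinement step multiplies that by its step count.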

Recommendation

For most subtitle work: use the default (NeuQuant). Subtitles typically have fewer than 100 unique colours, and all three algorithms achieve perfect or near-perfect PSNR at this complexity. NeuQuant is the fastest in nearly every measurement, which makes it a sensible default.

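For complex ASS with gradients or many colours: consider Median Cut or ELBG. On the gradient stress tests at 256 colours, ELBG scores about 10.5 to 11.5 dB above NeuQuant and Median Cut about 8.5 dB above it; at 16 colours the gap shrinks to roughly 1 to 5.5 dB. ELBG is best on pure gradients; Median Cut is best on structured multi-region images at 64 and 256 colours and runs faster than ELBG at 256 colours.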

Methodology

Test images: 128×64 procedurally generated (no external dependencies)
Five scenarios: white text on black, 4-quadrant multicolour, karaoke highlight, RGB gradient, full HSV sweep
Three palette sizes: 16, 64, 256 (matching DVD, DVB, and PGS maximums)
Quality parameter: 10 (default) for all algorithms
PSNR: per-pixel squared error in sRGB, averaged over RGB channels, 10×log10(255²/MSE)
Timing: clock_gettime(CLOCK_MONOTONIC), includes both palette generation and pixel mapping
Platform: Linux x86_64, GCC, single-threaded


Measurements generated from libavutil/quantize.c test infrastructure. Developed with assistance from Claude (Anthropic).