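Lossless compression algorithm that assigns shorter codes to frequent values; the entropy-coding stage of JPEG, MP3, and many other codecs. In workflow: helps explain why re-encoding compressed formats causes generational loss (the loss comes from quantization, not from Huffman coding itself).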
When you work with compressed image formats, which you do daily, an algorithm published by David Huffman in 1952 is often running in the background. It works on a simple, ingenious principle: frequently occurring values get short bit sequences, rare ones get longer ones. This saves storage without losing any information: the decoder gets back exactly what the encoder put in.
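On set or in post, this usually isn't something you think about. You hit Export, choose JPEG or H.264, and you're done. But Huffman coding is a key part of why such files get so small. JPEG applies it after the DCT (Discrete Cosine Transform) and quantization to pack the quantized coefficients into as few bits as possible. MP3 does the same with quantized audio data. H.264 relies on related entropy coders (CAVLC and CABAC) rather than classic Huffman coding, but the idea is the same. The encoder counts how often each value occurs (or falls back on predefined standard tables), builds a code tree from those frequencies, and writes every value using its code, so the most frequent values cost the fewest bits. The code table is stored alongside so the decoder can reverse the process.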
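The encoder steps described above (count frequencies, build a code tree, write each value with its code) can be sketched in a few lines of Python. This is a minimal illustration, not how a real JPEG or MP3 encoder is implemented: the function names build_huffman_codes, encode, and decode are made up for this example, and the sample values only stand in for quantized coefficients.

```python
import heapq
from collections import Counter


def build_huffman_codes(data):
    """Return a dict mapping each distinct value in `data` to a bit string."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct value still needs a 1-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (frequency, tie_breaker, {value: code_so_far}).
    # The unique tie_breaker keeps Python from ever comparing the dicts.
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees; their codes grow by one bit.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (f1 + f2, tie, merged))
        tie += 1
    return heap[0][2]


def encode(data, codes):
    """Concatenate the code of every value into one bit string."""
    return "".join(codes[s] for s in data)


def decode(bits, codes):
    """Reverse `encode`; works because no code is a prefix of another."""
    lookup = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in lookup:
            out.append(lookup[current])
            current = ""
    return out


# Hypothetical sample: lots of zeros, a few small values, two rare outliers.
coefficients = [0] * 12 + [1] * 4 + [-1] * 2 + [5, -7]
codes = build_huffman_codes(coefficients)
bits = encode(coefficients, codes)
print(codes)
print(len(bits), "bits for", len(coefficients), "values")
print(decode(bits, codes) == coefficients)
```

In this sample the very frequent 0 ends up with a 1-bit code and the two rare outliers with 4-bit codes, so the 20 values take 34 bits instead of the 60 a fixed 3 bits per value would need. Decoding returns the exact input, which is the lossless part.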
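Where it becomes critical for you: generational loss. When you open a JPEG, you decompress it. The Huffman coding is reversed exactly, but the information discarded during DCT quantization is gone for good. If you save the image again as JPEG, the data is quantized again and the Huffman coding is recalculated. Each recompression can add further quality loss, especially if the image was edited or the quality setting changed in between. That's why you keep intermediate footage in high-quality intermediate codecs (ProRes, DNxHD), which are still lossy but hold up far better across generations, or in truly lossless or uncompressed formats. With uncompressed footage, no entropy coding is involved at all.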
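To see where the loss actually happens, here is a toy model of the quantization step in Python. It is a sketch under simplifying assumptions: a single step size stands in for a full JPEG quantization table, the coefficient values are invented, and the helper names (quantize, dequantize, save_generation, max_error) are hypothetical. The Huffman stage is left out on purpose because it hands the quantized values back unchanged.

```python
def quantize(coeffs, step):
    """Lossy step: divide by the step size and round to the nearest integer."""
    return [round(c / step) for c in coeffs]


def dequantize(levels, step):
    """Decoder side: scale the integer levels back up."""
    return [q * step for q in levels]


def save_generation(coeffs, step):
    """One simulated save/open cycle: quantize, then dequantize."""
    return dequantize(quantize(coeffs, step), step)


def max_error(a, b):
    """Largest absolute difference between two coefficient lists."""
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical DCT coefficients of one block (values invented for illustration).
original = [52, -13, 7, 3, -2, 1]

gen1 = save_generation(original, 10)  # first save, coarse step size
gen2 = save_generation(gen1, 7)       # re-save at a different quality setting

print(gen1, "max error:", max_error(original, gen1))
print(gen2, "max error:", max_error(original, gen2))
```

The first save already rounds the small coefficients to zero, and nothing downstream can bring them back. Re-saving with a different step size moves the values further from the original. In this idealized model, re-saving with the exact same step size changes nothing; real pipelines also round to 8-bit pixels and usually involve edits or changed settings between saves, which is where the errors pile up.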
A practical tip: if you need to store large RAW sequences or high-res proxies, look into lossless codecs. Entropy coding such as Huffman is fully reversible there because no quantization step precedes it, but the files take up more storage than lossy ones. This is relevant in DCP workflows or for archiving policies; your colorist will thank you for not having to work with artifact-ridden material. In short: Huffman coding is the invisible workhorse that makes your files small. It is always lossless by itself, but the file as a whole is lossy whenever quantization comes before it.