Due to patent issues with newer schemes [1], Huffman Coding remains to be the go-to compression algorithm of the day, being applied in ZIP, JPEG, PNG… and much more.
The core idea of Huffman Coding, is to use shorter codes to represent more frequent characters. As for why we always see a mysterious tree of zeros and ones, well, that’s essentially how computers store anything: with binary codes. Essentially the goal of constructing the Huffman binary tree is: to assign the most frequent characters to the shortest zero/one path, walking from tree root to end of a branch.
How to create such a tree?
Continue reading →