29
Jan
16

Huffman Coding – Compression results

Before the Christmas break I started writing a encoding/compression library based on a technique called Huffman coding. I’ve since completed and tested the encoder and decoder, today we’ll discuss the results of several compression tests in comparison to the RLE encoding technique. Here are links back to the posts for Huffman Coding and Run-Length Encoding.

So I’ve collected a few different types of data files with different characteristics. First up is a MS-DOS executable, these binary data files typically don’t contain runs of data and will tend to contain most if not all symbols. Next is a ASCII text file, which will typically only use alphabetic, numeric and punctuation characters. Lastly a small number of graphics files, in raw bitmap form, these particular images are fairly basic (only a handful of colours).

Here’s a table showing the results.

Compression ratio table

You’ll note that the Huffman encoder achieves better results in every example! The worst result for it being the MS-DOS executable, which is of course expected. I was pleasantly surprised that the small graphic files compressed quite well, this is likely because of how few symbols are in these files. If there had been a large variety of symbols the dictionary and the encoded data would be much larger.

Run-length encoding was obviously much worse, and in the case of ASCII text and executable data it actually increased the size of the data! Knowing how it works this is hardly surprising. I did expect RLE to perform better than Huffman coding for the graphic data, but I suspect the low colour count in these files is an influencing factor.

So I thought I’d create a graphic data file that had a higher colour count that would favour RLE to see if it can do better under specific conditions. The graphic data I created is a sprite 32×32 pixels with horizontal coloured stripes, each a unique colour. The size of the raw file came out at 1026 bytes.

The Huffman encoder produced a file of 766 bytes and the RLE produced one of 68 bytes! Whilst the Huffman encoder still managed to compress the data, it couldn’t match the best case scenario for RLE.

There are good reasons to use both compression techniques. Run-length encoding doesn’t require much CPU power to encode or decode, so it can be done on very weak machines (even old 8-bit machines), but it only compresses data with many runs in it well. This turns out to be useful for graphics and level data. Huffman coding will pretty much always reduce the size of the data encoded, sometimes significantly, but it is much more complex. It requires more RAM and CPU power to achieve this result, so it can’t be used on every machine, although most modern processors would have no trouble at all.

I’ve made the code for the Huffman encoder/decoder available in pascal for download. be aware the encoder still has some debug code in it that will write to the screen.

Advertisements

0 Responses to “Huffman Coding – Compression results”



  1. Leave a Comment

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


Blogs I Follow

Enter your email address to follow this blog and receive notifications of new posts by email.


Mister G Kids

A daily comic about real stuff little kids say in school. By Matt Gajdoš

Random Battles: my life long level grind

completing every RPG, ever.

Gough's Tech Zone

Reversing the mindless enslavement of humans by technology.

Retrocosm's Vintage Computing, Tech & Scale RC Blog

Random mutterings on retro computing, old technology, some new, plus radio controlled scale modelling.

ancientelectronics

retro computing and gaming plus a little more

Retrocomputing with 90's SPARC

21st-Century computing, the hard way

lazygamereviews

MS-DOS game reviews, retro ramblings and more...

%d bloggers like this: