Sunday, 21 July 2013

Huffman Data Compression

Computers store text in several ways. One of the most common encodings is ASCII, where each character is usually stored in an 8-bit byte. For example, the character 'e' is the bit sequence '01100101'. As a byte can take one of 256 distinct values (0 to 255), there is plenty of room for the Latin alphabet, the digits zero to nine and lots of punctuation characters such as periods, commas, newline indicators and so on. Actually, if we stick to English text, we can get by with the first 128 values (0 to 127). Other European languages require some more letters and use the upper range 128-255 as well.
Still other languages, like Russian or Chinese, require more elaborate coding systems such as Unicode where multiple bytes are used per character.
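
To make the byte counts concrete, here is a small Python sketch that prints the bit pattern of a few characters. It assumes UTF-8 as the Unicode encoding (the post does not name one), and the helper name utf8_bits is made up for this illustration.

```python
def utf8_bits(ch):
    """Return the UTF-8 bytes of ch as space-separated 8-bit groups."""
    return " ".join(format(byte, "08b") for byte in ch.encode("utf-8"))


# 'e' fits in one byte; the other characters need two or three bytes in UTF-8.
for ch in ["e", "é", "Я", "中"]:
    print(ch, len(ch.encode("utf-8")), "byte(s):", utf8_bits(ch))
```

Plain English letters still take a single byte here, while the Russian and Chinese characters take more than one.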

The secret of compressing data is that frequently used letters are coded with fewer than the normal 8 bits, while rare letters may take more, so the text as a whole needs less memory.
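
The idea can be sketched in a few lines of Python. This is a minimal illustration of Huffman coding, not the code from the repo: the function names huffman_codes and encode and the sample sentence are assumptions chosen for the example. It counts how often each character occurs, repeatedly merges the two least frequent groups, and prefixes a 0 or 1 to every code in the merged groups.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Return a dict mapping each character in text to its Huffman bit string."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # Only one distinct character: give it a 1-bit code.
        return {ch: "0" for ch in freq}
    # Heap entries are (weight, tie_breaker, {char: code_so_far}).
    # The unique tie_breaker keeps the dicts from ever being compared.
    heap = [(w, i, {ch: ""}) for i, (ch, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Take the two lightest groups and merge them into one.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in left.items()}
        merged.update({ch: "1" + code for ch, code in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(text, codes):
    """Concatenate the code of every character in text."""
    return "".join(codes[ch] for ch in text)


sample = "this is an example of huffman compression"
codes = huffman_codes(sample)
bits = encode(sample, codes)
print(len(sample) * 8, "bits with 8 bits per character")
print(len(bits), "bits after Huffman coding")
```

The saving comes from the frequent characters getting the short codes. A real compressor would also have to store the code table (or the character counts) alongside the bits so the text can be decoded again.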

Using Racket, JavaScript and Python, I created implementations of Huffman compression.
To see the source code, go to the GitHub repo.
