TECH PACK: Huffman Data Compression

Computers store text in several ways. The most common encoding is ASCII, where each character is stored in an 8-bit byte. For example, the character 'e' is the bit sequence '01100101'. As a byte can take one of 256 distinct values (0 to 255), there is plenty of room for the Latin alphabet, the digits zero to nine and lots of punctuation characters such as periods, commas, newline indicators and so on. Actually, if we stick to English text, we can get by with the first 128 values (0 to 127). Other European languages require some more letters and use the upper range 128-255 as well.
Still other languages, like Russian or Chinese, require more elaborate coding systems such as Unicode where multiple bytes are used per character.
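
These byte counts are easy to check from Python. The snippet below is only a small sketch: the helper name bits is made up for illustration, and it assumes Latin-1 as one example of an encoding that uses the upper range 128-255, and UTF-8 as one example of a multi-byte Unicode encoding (neither is named above).

```python
def bits(ch):
    """Return the 8-bit pattern of a character whose code fits in one byte."""
    return format(ord(ch), "08b")

print(bits("e"))                     # 01100101 (decimal 101)

# Plain English text stays within the values 0 to 127.
print(max(ord(c) for c in "Hello, world!") < 128)   # True

# Assumption: Latin-1 as an example of a one-byte encoding that uses 128-255.
print("é".encode("latin-1")[0])      # 233

# Assumption: UTF-8 as an example of a multi-byte Unicode encoding.
for ch in ["e", "é", "я", "中"]:
    print(ch, len(ch.encode("utf-8")), "byte(s)")
```

With UTF-8, 'e' takes one byte, 'é' and the Cyrillic 'я' take two, and the Chinese character takes three.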

The secret of compressing data is that frequently used letters are coded with fewer than the normal 8 bits, while rarely used letters may take more. Because the short codes occur most often, the total memory usage goes down.
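
To make the idea concrete, here is a minimal Python sketch of Huffman coding. It is not the code from the repo, just one common way to do it under a few assumptions: the function names huffman_codes, encode and decode are invented for this example, the tree is built with the standard heapq module, and the result is kept as a string of '0' and '1' characters instead of being packed into real bytes.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Return a dict mapping each character in text to its bit string."""
    freq = Counter(text)
    if not freq:
        return {}
    # Heap entries are (frequency, tie_breaker, tree). A tree is either a
    # single character (leaf) or a (left, right) tuple (internal node).
    # The unique tie_breaker keeps Python from ever comparing two trees.
    heap = [(f, i, ch) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    # Repeatedly merge the two least frequent trees into one.
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next_id, (t1, t2)))
        next_id += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")   # left branch adds a 0
            walk(tree[1], prefix + "1")   # right branch adds a 1
        else:
            codes[tree] = prefix or "0"   # lone symbol still needs one bit
    walk(heap[0][2], "")
    return codes

def encode(text, codes):
    """Concatenate the code of every character."""
    return "".join(codes[ch] for ch in text)

def decode(bitstring, codes):
    """Read bits until they match a code; no code is a prefix of another."""
    inverse = {code: ch for ch, code in codes.items()}
    out, current = [], ""
    for b in bitstring:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return "".join(out)

text = "this is an example of a huffman tree"
codes = huffman_codes(text)
packed = encode(text, codes)
print(len(text) * 8)   # 288 bits at 8 bits per character
print(len(packed))     # 135 bits with Huffman codes
print(decode(packed, codes) == text)   # True
```

In this sample the space and the letters 'a' and 'e' occur most often, so they end up with the shortest codes, while letters that appear once get the longest ones. A real compressor would also have to store the code table (or the frequencies) next to the bits so the text can be decoded later.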

Using Racket, JavaScript and Python, I created implementations of Huffman compression.
To see the source code, go to the GitHub repo.

Sunday, 21 July 2013