Compression Models Flashcards
(9 cards)
Entropy compression model (Huffman coding)
Generates codewords based on the distribution of inputs.
Optimal in terms of compression rate.
A preliminary pass over the data is required to learn its statistics.
Static compression model (Morse, ASCII)
The same compression model is used for every input, and the codewords all have the same length regardless of the input symbol.
Fast but not optimal, because symbols occur with different probabilities that a fixed-length code cannot exploit.
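As a minimal sketch (Python; the ascii_encode name is just illustrative), a static model can be as simple as plain 8-bit ASCII, where every symbol costs the same number of bits no matter how frequent it is:

```python
# Static model sketch: every symbol gets a codeword of the same
# length (here, plain 8-bit ASCII), regardless of its probability.
def ascii_encode(text):
    return "".join(format(ord(ch), "08b") for ch in text)

print(ascii_encode("ab"))  # '0110000101100010', 8 bits per symbol
```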
Adaptive compression model (LZW)
Progressively learns and updates the model of incoming text.
Highly efficient even though no statistics of the incoming data are known in advance.
Huffman coding
- Sort symbols by probability.
- Take the two symbols with the smallest probabilities and join them into a new symbol whose probability is their sum.
- Assign 0 to one of the two joined symbols and 1 to the other.
- Repeat with the updated set of symbols; terminate when the merged probability reaches 1 (100%), as in the sketch below.
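A minimal Python sketch of this procedure, assuming the input is a dict mapping symbols to probabilities (huffman_code is an illustrative name, not from the cards):

```python
import heapq

def huffman_code(probs):
    # Heap entries: (probability, tiebreaker, {symbol: partial codeword}).
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)  # smallest probability
        p1, _, c1 = heapq.heappop(heap)  # second smallest
        # Join the two nodes: prepend the branch bit (0 or 1) to every
        # codeword in each subtree, then push the merged node back.
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, count, merged))
        count += 1
    return heap[0][2]

print(huffman_code({"a": 0.5, "b": 0.25, "c": 0.15, "d": 0.10}))
# {'a': '0', 'b': '10', 'd': '110', 'c': '111'}
```

Joining the two least-probable symbols first pushes rare symbols deeper into the tree, which is what makes codeword lengths track the distribution.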
Huffman encoding (dynamic)
Advantages:
The code is uniquely decodable and prefix-free.
Used in many applications (e.g., JPEG compression).
Limitations:
Need to know the probabilities of the symbols in advance.
Error propagation: a single bit error can desynchronize the decoding of all subsequent codewords.
LZW (adaptive)
Encoding and decoding complexity: O(N log N).
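A minimal sketch of LZW encoding in Python, assuming byte/character input (lzw_encode is an illustrative name); note how the dictionary is learned on the fly, with no prior statistics:

```python
def lzw_encode(text):
    # Initial dictionary: one entry per possible byte/character.
    dictionary = {chr(i): i for i in range(256)}
    phrase, codes = "", []
    for ch in text:
        candidate = phrase + ch
        if candidate in dictionary:
            phrase = candidate  # keep extending the current match
        else:
            codes.append(dictionary[phrase])
            dictionary[candidate] = len(dictionary)  # learn a new phrase
            phrase = ch
    if phrase:
        codes.append(dictionary[phrase])
    return codes

print(lzw_encode("abababab"))  # [97, 98, 256, 258, 98]
```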
Why not use entropy coding for symbols with similar probabilities?
Because the associated codewords would all have nearly the same length, so the compression rate would be low.
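A quick worked check, assuming four equally likely symbols: the entropy is log2(4) = 2 bits, so every Huffman codeword would be 2 bits long and a fixed-length code is already optimal:

```python
import math

uniform = [0.25, 0.25, 0.25, 0.25]
# Entropy of a uniform source over 4 symbols: log2(4) = 2 bits,
# the same length a fixed-length code would use anyway.
print(-sum(p * math.log2(p) for p in uniform))  # 2.0
```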
Entropy
The average number of bits per symbol of any uniquely decodable code must be greater than or equal to the entropy of the source.
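A minimal sketch for computing this bound in Python (the entropy function name is illustrative):

```python
import math

def entropy(probs):
    # Shannon entropy: H = -sum(p * log2(p)), in bits per symbol.
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A skewed source can beat a 2-bit fixed-length code:
print(entropy([0.5, 0.25, 0.15, 0.10]))  # ~1.74 bits per symbol
```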
Compression rate
Compression rate = length of uncompressed data / length of compressed data; compare the achieved bits per symbol against the entropy bound.
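A minimal sketch, assuming both lengths are measured in bits (compression_rate is an illustrative name):

```python
def compression_rate(uncompressed_bits, compressed_bits):
    # Ratio of original size to compressed size; the entropy of
    # the source bounds how large this ratio can get in theory.
    return uncompressed_bits / compressed_bits

# Hypothetical: 1000 8-bit symbols recoded at ~1.74 bits/symbol.
print(compression_rate(1000 * 8, 1000 * 1.74))  # ~4.6x
```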