Lossless compression (1.3.1 b) Flashcards by Millie Hardy

What is dictionary encoding?

replaces common phrases with a reference to the original data
e.g Romeo_Romeo_where_art_tho_Romeo = Romeo_ _where_art_tho

How well did you know this?

Not at all

Perfectly

What is run length encoding?

turns runs of identical data into a single value and the number of repetitions
e.g AAABBBBCCCCDDBBBBAAA =3A4B4C2D4B3A

How well did you know this?

Not at all

Perfectly

Run length encoding

simplest compression method
also works with images

How well did you know this?

Not at all

Perfectly

Run length encoding limits

does not work with files of natural origin
englishrarely features words with lots of repeated character
images often don’t have regions of identical colour (especially photographs)

How well did you know this?

Not at all

Perfectly

Dictionary encoding

replace repeated values with pointers to the original data
I am Sam Sam I am
token | data
———|———- 231123
1 | Sam
2 | I
3 | am
aren’t limited to individual words or characters

How well did you know this?

Not at all

Perfectly

Dictionary limitations

dictionary and compressed file can have a larger file size than the uncompressed file
need for a dictionary and pointers is a major overhead
if the file doesn’t have sections that repeat the efficiency reduces

How well did you know this?

Not at all

Perfectly

Real life lossless

often combine multiple compression approaches
e.g ZIP files use both dictionary encoding and variable length encoding (Huffman encoding)

How well did you know this?

Not at all

Perfectly

Lossless compression (1.3.1 b) Flashcards

(7 cards)