In-Memory Storage Flashcards
(15 cards)
What is Buffer Pool Overhead?
- accessing a page through the page table (lookup cost)
- calculating the memory pointer to the tuple
3 Advantages of Column Stores for OLAP
- higher read efficiency if only few columns need to be accessed
- better compression schemes
- enables vector processing
Advantage of Heavyweight compression
Achieves higher compression ratios than lightweight schemes
Advantage of lightweight compression
- queries can often directly operate on compressed data
Basic idea of dictionary compression
Encode the values of a column (of any type) as integer codes
What property should a dictionary encoding have?
- Order-preserving (codes sort in the same order as the values, so comparisons can run on the codes)
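A minimal sketch of an order-preserving dictionary, assuming the code of a value is simply its rank in the sorted list of distinct values. The names `build_dictionary` and `range_filter` and the sample column are hypothetical; the point is that a range predicate can be answered on the integer codes alone.

```python
from bisect import bisect_left, bisect_right

def build_dictionary(values):
    """Return (sorted dictionary, codes); a value's code is its rank."""
    dictionary = sorted(set(values))
    code_of = {v: i for i, v in enumerate(dictionary)}
    return dictionary, [code_of[v] for v in values]

def range_filter(dictionary, codes, low, high):
    """Row positions with low <= value <= high, evaluated on codes only."""
    # Translate the value bounds into code bounds once.
    lo_code = bisect_left(dictionary, low)
    hi_code = bisect_right(dictionary, high) - 1
    # The scan compares integers and never decodes a value.
    return [row for row, c in enumerate(codes) if lo_code <= c <= hi_code]

cities = ["Oslo", "Bern", "Rome", "Bern", "Wien", "Oslo"]  # sample data
dictionary, codes = build_dictionary(cities)
matches = range_filter(dictionary, codes, "C", "P")
```

Because the dictionary is sorted, the two bounds are translated once and every row comparison is an integer comparison.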
Main problem with order-preserving dictionaries?
Updates (a new value may fall between two existing codes and force re-encoding)
2 solutions for updates on order-preserving dictionaries
- store new values in a separate partition and merge at regular intervals
- leave gaps between codes
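The "leave gaps" option can be sketched as follows, assuming codes are handed out with a fixed stride and a new value takes the midpoint between its neighbours' codes. The class `GappedDictionary`, the stride of 10, and the midpoint rule are illustrative assumptions, not a prescribed scheme.

```python
from bisect import bisect_left

class GappedDictionary:
    """Order-preserving dictionary that leaves unused codes between entries."""

    def __init__(self, values, stride=10):
        self.stride = stride
        self.values = sorted(set(values))
        # Codes stride, 2*stride, ... leave stride-1 free codes per gap.
        self.codes = [(i + 1) * stride for i in range(len(self.values))]

    def insert(self, value):
        pos = bisect_left(self.values, value)
        if pos < len(self.values) and self.values[pos] == value:
            return self.codes[pos]  # already present
        lower = self.codes[pos - 1] if pos > 0 else 0
        upper = self.codes[pos] if pos < len(self.codes) else lower + 2 * self.stride
        code = (lower + upper) // 2
        if code == lower:
            # Gap exhausted: existing codes would have to be reassigned.
            raise OverflowError("no free code left; dictionary must be re-encoded")
        self.values.insert(pos, value)
        self.codes.insert(pos, code)
        return code

fruit = GappedDictionary(["apple", "cherry"])
banana_code = fruit.insert("banana")  # lands between the codes of apple and cherry
```

Existing codes stay valid after the insert, so already-encoded columns need no rewrite until a gap runs out.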
Basic idea of bit-packing
- map a wide, sparse domain into a dense domain
- pack multiple codes into one processor word
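A small sketch of the packing step, assuming the codes are already dense (for example dictionary codes) and that a processor word has 64 bits. `WORD_BITS`, `pack`, and `unpack` are hypothetical names; codes do not straddle word boundaries in this simplified layout.

```python
WORD_BITS = 64  # assumed processor word size

def pack(codes, bits):
    """Pack non-negative codes of `bits` bits each into 64-bit words."""
    per_word = WORD_BITS // bits
    words = []
    for start in range(0, len(codes), per_word):
        word = 0
        for slot, code in enumerate(codes[start:start + per_word]):
            if code < 0 or code >> bits:
                raise ValueError(f"code {code} does not fit in {bits} bits")
            word |= code << (slot * bits)
        words.append(word)
    return words

def unpack(words, bits, count):
    """Recover the first `count` codes from the packed words."""
    per_word = WORD_BITS // bits
    mask = (1 << bits) - 1
    return [(words[i // per_word] >> ((i % per_word) * bits)) & mask
            for i in range(count)]

codes = [3, 0, 7, 5, 1]        # each value fits in 3 bits
packed = pack(codes, bits=3)   # 21 codes fit in one 64-bit word
restored = unpack(packed, bits=3, count=len(codes))
```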
Run-Length Encoding
- runs of values are encoded as (value, start, length)
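As a sketch, the (value, start, length) triples can be produced and expanded like this; `rle_encode` and `rle_decode` are hypothetical helper names and the sample column is made up.

```python
def rle_encode(column):
    """Encode a column as a list of (value, start, length) runs."""
    runs = []
    for pos, value in enumerate(column):
        if runs and runs[-1][0] == value:
            v, start, length = runs[-1]
            runs[-1] = (v, start, length + 1)  # extend the current run
        else:
            runs.append((value, pos, 1))       # start a new run
    return runs

def rle_decode(runs):
    """Expand runs back into the original column."""
    column = []
    for value, _start, length in runs:
        column.extend([value] * length)
    return column

column = ["A", "A", "A", "B", "B", "A"]
runs = rle_encode(column)
```

Storing the start position is redundant for decoding, but it lets a lookup by row position search the runs without summing the lengths.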
Frame of reference encoding
- encode each value as an offset from a frame (reference value), using a fixed number of bits per offset
- an escape code indicates that the next value cannot be represented as an offset
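A minimal sketch, assuming the all-ones bit pattern is reserved as the escape code and the exceptional value is stored verbatim right after it. The function names, the frame of 1000, and the 4-bit offset width are illustrative assumptions.

```python
def for_encode(values, frame, bits):
    """Frame-of-reference encode; out-of-range values follow an escape code."""
    escape = (1 << bits) - 1          # all-ones offset is reserved
    stream = []
    for v in values:
        offset = v - frame
        if 0 <= offset < escape:
            stream.append(offset)     # fits in `bits` bits
        else:
            stream.extend([escape, v])  # raw value stored at full width
    return stream

def for_decode(stream, frame, bits):
    escape = (1 << bits) - 1
    values = []
    it = iter(stream)
    for item in it:
        # After an escape, the next stream entry is the raw value.
        values.append(next(it) if item == escape else frame + item)
    return values

values = [1003, 1000, 1010, 5000, 1007]
encoded = for_encode(values, frame=1000, bits=4)
```

Here 5000 is the only exception; every other value costs 4 bits instead of a full-width integer.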
Differential encoding
- encode each value as an offset from the previous value, using a fixed number of bits
- an escape code can be used to mark exceptions
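A sketch under these assumptions: the column is non-empty, the first value is stored verbatim, deltas are signed, and the most negative bit pattern is reserved as the escape code. `delta_encode` and `delta_decode` are hypothetical names.

```python
def delta_encode(values, bits):
    """Differential encoding with signed deltas of `bits` bits."""
    escape = -(1 << (bits - 1))       # most negative pattern is reserved
    limit = (1 << (bits - 1)) - 1     # usable deltas: -limit .. +limit
    stream = [values[0]]              # first value is stored verbatim
    for prev, cur in zip(values, values[1:]):
        delta = cur - prev
        if -limit <= delta <= limit:
            stream.append(delta)
        else:
            stream.extend([escape, cur])  # exception: raw value follows
    return stream

def delta_decode(stream, bits):
    escape = -(1 << (bits - 1))
    it = iter(stream)
    values = [next(it)]
    for item in it:
        values.append(next(it) if item == escape else values[-1] + item)
    return values

values = [100, 102, 101, 500, 503]
encoded = delta_encode(values, bits=4)
```

Unlike frame-of-reference encoding, each value depends on its predecessor, so decoding a single position requires walking the stream from the last verbatim value.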
Early materialization
Decompress columns and reconstruct tuples as part of the table scan
Late materialization
Delay decompression and tuple reconstruction for as long as possible
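The two strategies can be contrasted on a toy table, assuming a dictionary-encoded `city` column and a plain `amount` column. All names and data below are made up for illustration.

```python
city_dict = ["Bern", "Oslo", "Rome"]   # sorted dictionary
city_codes = [1, 0, 2, 0, 1]           # encoded city column
amount = [40, 15, 70, 90, 5]           # uncompressed column

def early_materialization(city):
    # Decode every row and build full tuples first, then filter.
    rows = [(city_dict[c], a) for c, a in zip(city_codes, amount)]
    return [row for row in rows if row[0] == city]

def late_materialization(city):
    # Filter on the codes, keep only row positions ...
    code = city_dict.index(city)
    positions = [i for i, c in enumerate(city_codes) if c == code]
    # ... and decode and stitch tuples for the qualifying rows only.
    return [(city_dict[city_codes[i]], amount[i]) for i in positions]

early = early_materialization("Bern")
late = late_materialization("Bern")
```

Both return the same tuples; the late variant touches `amount` and the dictionary only for the rows that survive the filter.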
Heavyweight compression use case
Accelerate the I/O path from SSD using a GPU