Exam Flashcards
(19 cards)
Name advantages of column-wise storage (DSM)
Store only one column per page
=> All data on a page is relevant to the query
- Fetch less data from memory
- Data compression might work better
- With luck, the full column fits into the cache
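A minimal C++ sketch of the contrast with row-wise storage (types and field names are illustrative): summing one attribute uses every fetched byte in the columnar layout, but drags whole tuples through the cache in the row layout.

    #include <cstdint>
    #include <vector>

    // Row-wise (NSM): whole 64-byte tuples are stored together
    struct Tuple { int32_t id; int32_t price; char name[56]; };

    int64_t sum_rows(const std::vector<Tuple>& rows) {
        int64_t sum = 0;
        for (const Tuple& t : rows)
            sum += t.price;          // fetches 60 irrelevant bytes per tuple
        return sum;
    }

    // Column-wise (DSM): the price column is a dense array of its own
    int64_t sum_column(const std::vector<int32_t>& price) {
        int64_t sum = 0;
        for (int32_t p : price)
            sum += p;                // every fetched byte is relevant
        return sum;
    }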
Explain Binary Association Tables (MonetDB)
Each column of a relation yields one binary association table (BAT)
A BAT is a table with two columns: head (oid) and tail (value)
Matching entries across BATs are identified via object identifiers (oids)
With virtual oids the head never needs to be materialized in memory (the oid is just the position, i.e. an index)
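A minimal C++ sketch of a BAT with a virtual (dense) head, assuming the simplest case of an int column; names are illustrative:

    #include <cstdint>
    #include <vector>

    // The head is virtual: the oid of position i is simply seqbase + i,
    // so only the tail column is materialized in memory.
    struct BAT {
        uint64_t seqbase;            // oid of the first entry
        std::vector<int32_t> tail;   // the actual column values
    };

    // With virtual oids a lookup is pure positional indexing, no search
    int32_t fetch(const BAT& b, uint64_t oid) {
        return b.tail[oid - b.seqbase];
    }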
Describe Hardware Trends and memory wall, and their implications
Processor performance improves faster than memory performance, yielding a growing gap between CPU and memory speed called the “memory wall”.
As a consequence, CPUs spend much of their time waiting for memory.
Do modern processors implement write back or write through?
Write back
Explain Direct-Mapped Cache
A block can appear at only one place in the cache
- simpler to implement
- faster
- increased chance of conflict misses
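A minimal C++ sketch of the address decomposition, assuming illustrative parameters (64-byte blocks, 512 lines, i.e. a 32 KiB cache):

    #include <cstdint>

    constexpr uint64_t BLOCK_BITS = 6;   // log2(64-byte blocks)
    constexpr uint64_t LINE_BITS  = 9;   // log2(512 lines)

    // The line index is fixed by the address: one possible place only
    uint64_t line_index(uint64_t addr) {
        return (addr >> BLOCK_BITS) & ((1u << LINE_BITS) - 1);
    }

    // The remaining high bits form the tag that identifies the block
    uint64_t tag(uint64_t addr) {
        return addr >> (BLOCK_BITS + LINE_BITS);
    }

    // Addresses 32 KiB apart share a line index => conflict misses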
Explain Fully Associative Cache
A block can be loaded into any cache line
- does not scale (a lookup must compare against every tag)
- used for small caches such as Translation Lookaside Buffers (TLBs)
Explain Principle of Locality
Pareto Principle (80-20)
Spatial Locality:
- Code contains loops
- Data is spatially close
Temporal Locality:
- Reuse functions
- Reuse data
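A minimal C++ sketch showing both kinds of locality in one loop (illustrative):

    #include <cstddef>

    // Spatial locality: a[i] walks adjacent elements, so every byte of a
    // fetched cache line is used before the next line is needed.
    // Temporal locality: the loop body is re-executed each iteration,
    // so its instructions and `sum` stay hot (registers / i-cache).
    long sum_array(const int* a, std::size_t n) {
        long sum = 0;
        for (std::size_t i = 0; i < n; ++i)
            sum += a[i];
        return sum;
    }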
Explain Set-Associative Cache
- Group cache lines into sets
- map blocks to sets
- place block anywhere within a set
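A minimal C++ sketch of a lookup, assuming an illustrative geometry (64-byte blocks, 64 sets, 8 ways):

    #include <array>
    #include <cstdint>

    constexpr int WAYS = 8;
    constexpr uint64_t BLOCK_BITS = 6, SET_BITS = 6;

    struct Line { bool valid = false; uint64_t tag = 0; };
    using Set = std::array<Line, WAYS>;  // a block may sit in any way of its set

    uint64_t set_index(uint64_t addr) {
        return (addr >> BLOCK_BITS) & ((1u << SET_BITS) - 1);
    }

    // A lookup scans only the ways of one set, not the whole cache
    bool hit(const std::array<Set, 64>& cache, uint64_t addr) {
        uint64_t t = addr >> (BLOCK_BITS + SET_BITS);
        for (const Line& l : cache[set_index(addr)])
            if (l.valid && l.tag == t) return true;
        return false;
    }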
Explain the consequences of operator-at-a-time processing
Consume and produce full columns
Materialize every (sub)result in memory
Run each operator exactly once
Extremely tight loops
- fit into instruction caches
- can be optimized by compilers:
  - Loop unrolling
  - Vectorization (SIMD)
  - Hardware prefetching
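A minimal C++ sketch of one such operator, a selection over a full column (names are illustrative); note that the result is fully materialized:

    #include <cstdint>
    #include <vector>

    // One operator = one tight loop over the whole column.
    // The (sub)result is materialized as a new vector in memory.
    std::vector<uint32_t> select_lt(const std::vector<int32_t>& col, int32_t v) {
        std::vector<uint32_t> out;        // materialized selection vector
        for (uint32_t i = 0; i < col.size(); ++i)
            if (col[i] < v)
                out.push_back(i);         // loop is simple enough for the
        return out;                       // compiler to unroll/vectorize
    }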
Explain the consequences of tuple-at-a-time processing
Code of all operators is tightly interleaved
=> Combined instruction footprint large
=> Instruction Cache Misses
Large function call overhead
=> Millions of calls
Very low instructions-per-cycle (IPC) ratio
Combined states too large for cache
=> Data Cache Misses
Hard to optimize by compiler
Explain Volcano Iterator Model (tuple-at-a-time)
- operators request tuples from their input using next()
- “Pipelining”
- operators keep their states
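A minimal C++ sketch of the iterator interface with a scan and a filter (illustrative operators); each operator keeps its own state and pulls from its input:

    #include <cstddef>
    #include <cstdint>
    #include <optional>
    #include <vector>

    // Every operator exposes next() and yields one tuple at a time
    struct Operator {
        virtual std::optional<int32_t> next() = 0;
        virtual ~Operator() = default;
    };

    struct Scan : Operator {              // state: a cursor into the table
        const std::vector<int32_t>& data;
        std::size_t pos = 0;
        explicit Scan(const std::vector<int32_t>& d) : data(d) {}
        std::optional<int32_t> next() override {
            if (pos == data.size()) return std::nullopt;
            return data[pos++];
        }
    };

    struct Filter : Operator {            // pulls from its input: pipelining
        Operator& in;
        int32_t bound;
        Filter(Operator& i, int32_t b) : in(i), bound(b) {}
        std::optional<int32_t> next() override {
            while (auto t = in.next())    // one virtual call per input tuple:
                if (*t < bound) return t; // millions of calls, low IPC
            return std::nullopt;
        }
    };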
Explain Word Size
Word size refers to the size of the natural unit of data (a “word”) that a processor operates on.
Today the word size is typically 32 or 64 bits.
One word or multiple words are used to store integers/floats/addresses.
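A minimal C++ sketch; the printed sizes depend on the platform and ABI (the comment assumes a typical 64-bit system):

    #include <cstdio>

    int main() {
        // Typical 64-bit platform: int uses 4 bytes, while double and
        // pointers each occupy one full 8-byte (64-bit) word.
        std::printf("int: %zu, double: %zu, pointer: %zu\n",
                    sizeof(int), sizeof(double), sizeof(void*));
    }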
Explain Write Back
Data is only written into cache
- Status field acts as modified marker
Write on eviction of dirty cache lines
Explain Write Through
Data is directly written to lower-level memory (and cache)
- Writes stall the CPU
- Simplifies Data Coherency
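A minimal C++ sketch contrasting the two write policies from the cards above, using a toy one-line cache over a plain vector as “lower-level memory” (illustrative model):

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    struct Line {
        std::size_t addr = 0;
        uint64_t data = 0;
        bool valid = false, dirty = false;   // status field: modified marker
    };

    // Write through: the store goes to lower-level memory immediately
    void store_wt(Line& l, std::vector<uint64_t>& memory,
                  std::size_t a, uint64_t v) {
        if (l.valid && l.addr == a) l.data = v;  // keep cached copy coherent
        memory[a] = v;                           // the CPU stalls on this write
    }

    // Write back: only the cache is updated; memory sees the value lazily
    void store_wb(Line& l, std::vector<uint64_t>& memory,
                  std::size_t a, uint64_t v) {
        if (!(l.valid && l.addr == a)) {         // miss: evict current line
            if (l.valid && l.dirty)
                memory[l.addr] = l.data;         // write on eviction if dirty
            l = {a, memory[a], true, false};     // fetch the new line
        }
        l.data = v;
        l.dirty = true;                          // mark line as modified
    }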
How do systems compensate for slow memory? (name 6)
Systems use caches:
- DRAM with high capacity, but long latency
- SRAM with better latency, but low capacity
- Memory Hierarchy
- Cache Lines
- Set associativity
- Locality of data and code
How to identify a block?
Derive a tag from the memory address and compare it with the tag stored in the cache line
Implications of Cache Parameters (Size, n-way associative)
Larger caches reduce capacity misses; higher associativity reduces conflict misses (at the cost of more complex, slower lookups)
Name a disadvantage of row-based storage (NSM)
Tuples are stored sequentially on a database page
- Queries that touch only a few columns load irrelevant data into the cache
Name three different strategies of block replacement
Least Recently Used (LRU)
First In First Out (FIFO)
Random
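A minimal C++ sketch of LRU victim selection for one 4-way set, using access timestamps (illustrative; FIFO would stamp a way only on insertion, Random would just pick any way):

    #include <array>
    #include <cstdint>

    struct Way {
        uint64_t tag = 0;
        uint64_t last_used = 0;   // stamped on every access (LRU bookkeeping)
        bool valid = false;
    };

    // Victim = an invalid way if any, else the least recently used way
    int lru_victim(const std::array<Way, 4>& set) {
        int victim = 0;
        for (int i = 1; i < 4; ++i) {
            if (!set[i].valid) return i;
            if (set[i].last_used < set[victim].last_used) victim = i;
        }
        return victim;
    }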