Memory and Cache Flashcards
(22 cards)
What is cache?
A small amount of fast memory which holds data fetched from and written to main memory
Why is the memory hierarchy useful?
It provides a balance between capacity, cost, and access time.
What is spatial locality?
Data located near recently accessed data in memory is likely to be used soon.
What is temporal locality?
Data that has been accessed recently is likely to be re-used within a relatively short amount of time.
How is spatial locality exploited?
Data is transferred from main memory to cache in fixed-size blocks called cache lines. When adjacent memory locations are addressed, they are likely already in cache.
What is a cache hit?
When the data being read is already in cache and doesn’t have to be loaded from main memory.
What is a cache miss?
When the data being read is not in cache and has to be loaded from main memory.
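The cards on cache lines, hits, and misses can be illustrated with a toy model. The sketch below is only an assumption-laden illustration: `simulate_cache` is a hypothetical helper, the cache is direct-mapped, and the 64-byte line size and 8-line capacity are made-up values rather than those of any real processor.

```python
def simulate_cache(addresses, line_size=64, num_lines=8):
    """Count hits and misses for a toy direct-mapped cache.

    line_size is in bytes and num_lines is the number of cache lines.
    Both defaults are illustrative assumptions, not real hardware values.
    """
    cache = [None] * num_lines          # which memory line each slot holds
    hits = misses = 0
    for addr in addresses:
        line = addr // line_size        # memory line containing this byte
        slot = line % num_lines         # direct-mapped: only one possible slot
        if cache[slot] == line:
            hits += 1
        else:
            misses += 1
            cache[slot] = line          # load the whole line into cache
    return hits, misses

# 64 eight-byte values read one after another: 8 values share each 64-byte line.
sequential = [8 * k for k in range(64)]
# 64 reads that each jump a full line ahead: every read lands on a new line.
strided = [64 * k for k in range(64)]

seq_hits, seq_misses = simulate_cache(sequential)
str_hits, str_misses = simulate_cache(strided)
```

In this model the sequential reads miss only on the first access to each line, while the strided reads miss every time.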
Why is cache useful for speeding up finite difference calculations?
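Finite difference stencils read neighbouring array elements. In the j direction (assuming j is the contiguous index) these accesses are sequential in memory, resulting in spatial locality, so most reads hit cache lines that are already loaded.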
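Why the j direction is the cache-friendly one can be seen from how a 2D index maps to a memory offset. This is a minimal sketch assuming a row-major layout (as in C) where j is the last index; `flat_offset` and the array width `nj` are hypothetical names chosen for illustration.

```python
def flat_offset(i, j, nj):
    """Element offset of a[i][j] in a row-major array with nj columns."""
    return i * nj + j

nj = 1000                      # hypothetical number of columns
i, j = 5, 7                    # arbitrary interior point

# Distance in elements between the two neighbours used by a central difference.
j_distance = flat_offset(i, j + 1, nj) - flat_offset(i, j - 1, nj)
i_distance = flat_offset(i + 1, j, nj) - flat_offset(i - 1, j, nj)
```

The j neighbours sit 2 elements apart, so they usually share a cache line, whereas the i neighbours are 2 * nj elements apart and fall on different lines.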
What is cache blocking?
Splitting the work into cache-sized blocks and working on one block at a time before moving on to the next. This allows all the data required for a block to be held in cache, speeding up access.
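The loop structure of cache blocking can be sketched on a matrix transpose. This is an illustrative example only: `transpose_blocked` and the tile size `block` are assumptions, and pure Python lists will not show the speed-up itself. The point is the tiling pattern, which carries over to compiled languages.

```python
def transpose_blocked(a, block=2):
    """Transpose a square list-of-lists matrix one block x block tile at a time."""
    n = len(a)
    out = [[0] * n for _ in range(n)]
    for ii in range(0, n, block):                     # loop over tiles
        for jj in range(0, n, block):
            for i in range(ii, min(ii + block, n)):   # work inside one tile
                for j in range(jj, min(jj + block, n)):
                    out[j][i] = a[i][j]
    return out

a = [[4 * r + c for c in range(4)] for r in range(4)]
t = transpose_blocked(a, block=2)
```

The `min(...)` bounds handle the case where the matrix size is not a multiple of the tile size.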
How does test case size affect testing?
Usually, working with smaller test cases allows running in a manageable time. However, reducing the problem size can sometimes change performance characteristics, for example when a significant data structure fits in cache at the reduced size but not at the full size.
What is compiler optimisation?
The process of the compiler making adjustments to the code to produce equivalent results faster. This includes techniques such as loop interchange and cache blocking.
How are compiler optimisations turned on?
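By passing an optimisation flag on the command line. For gcc, -O3 enables a high level of optimisation (for example, gcc -O3 prog.c -o prog).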
What is arithmetic intensity?
The ratio of floating point operations to data movement, usually expressed in FLOPs per byte moved between memory and the processor.
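As a worked example, arithmetic intensity can be estimated by counting operations and bytes for a simple kernel. The sketch below assumes a loop of the form a[i] = b[i] + s * c[i] on 8-byte values, and counts only the two reads and one write per element. It ignores any extra traffic a real cache might generate, and `arithmetic_intensity` is a hypothetical helper.

```python
def arithmetic_intensity(flops, bytes_moved):
    """FLOPs performed per byte moved to or from main memory."""
    return flops / bytes_moved

n = 1_000_000                  # hypothetical array length
flops = 2 * n                  # one multiply and one add per element
bytes_moved = 3 * 8 * n        # read b[i], read c[i], write a[i], 8 bytes each

ai = arithmetic_intensity(flops, bytes_moved)   # 2 / 24 FLOPs per byte
```

The result is independent of n, since both the work and the data movement grow linearly with the array length.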
What does the roofline model tell us?
The roofline model gives an upper bound on attainable floating point performance as a function of arithmetic intensity: the lower of peak floating point performance and memory bandwidth multiplied by arithmetic intensity.
What are the two types of performance bound?
Memory Bound and Compute Bound
What is a memory bound algorithm?
An algorithm with lower arithmetic intensity that is limited by memory bandwidth.
What is a compute bound algorithm?
An algorithm with higher arithmetic intensity that can make more efficient use of floating point hardware and is limited by floating point performance.
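The roofline bound and the two types of performance bound can be written down in a few lines. This is a sketch under stated assumptions: the peak rate `PEAK` and bandwidth `BW` are made-up machine numbers, and `roofline` and `bound` are hypothetical helper names.

```python
def roofline(ai, peak_flops, bandwidth):
    """Attainable FLOP/s: the lower of the compute roof and the memory roof."""
    return min(peak_flops, bandwidth * ai)

def bound(ai, peak_flops, bandwidth):
    """Classify a kernel by comparing its intensity with the ridge point."""
    ridge = peak_flops / bandwidth      # intensity where the two roofs meet
    return "memory bound" if ai < ridge else "compute bound"

PEAK = 1.0e12    # hypothetical peak: 1 TFLOP/s
BW = 1.0e11      # hypothetical memory bandwidth: 100 GB/s

low = roofline(0.125, PEAK, BW)    # limited by bandwidth * intensity
high = roofline(50.0, PEAK, BW)    # limited by peak floating point rate
```

With these numbers the ridge point is 10 FLOPs per byte, so a kernel at 0.125 is memory bound and one at 50 is compute bound.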
What is NUMA?
NUMA (non-uniform memory access) is the phenomenon that memory at various points in the address space of a processor has different performance characteristics.
How does NUMA affect compute nodes?
Compute nodes are commonly made up of two sockets, each with its own memory controller. Accessing memory attached to the other socket's controller is slower than accessing memory on the local controller.
What is first touch memory allocation?
Physical memory is allocated the first time it is accessed (touched) rather than when the allocation is requested. Each memory page is placed on the memory controller local to the core that first touches it.
What effect does NUMA have on OpenMP?
If an array is first touched in a serial loop, all of its pages are placed on one memory controller. Threads running on the other socket then have to access remote memory, slowing down later parallel loops.
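The effect of first touch on a later parallel loop can be illustrated with a toy model. This is not how an operating system or OpenMP is actually driven. It only mimics the placement rule, and every name here (`place_pages`, `local_fraction`, `static_owner`, the 4-thread, 2-socket layout) is a hypothetical assumption.

```python
def place_pages(num_pages, toucher_of_page, node_of_thread):
    """Assign each page to the NUMA node of the thread that touches it first."""
    return [node_of_thread[toucher_of_page(p)] for p in range(num_pages)]

def local_fraction(placement, worker_of_page, node_of_thread):
    """Fraction of pages that a later parallel loop finds on its own node."""
    local = sum(
        1 for p, node in enumerate(placement)
        if node_of_thread[worker_of_page(p)] == node
    )
    return local / len(placement)

NUM_PAGES = 8
NODE_OF_THREAD = [0, 0, 1, 1]            # hypothetical: 4 threads on 2 sockets

def static_owner(p):
    """Thread that owns page p under an even, contiguous split of the pages."""
    return p * len(NODE_OF_THREAD) // NUM_PAGES

# Serial initialisation: thread 0 touches every page first.
serial_init = place_pages(NUM_PAGES, lambda p: 0, NODE_OF_THREAD)
# Parallel initialisation: each thread first touches the pages it will use later.
parallel_init = place_pages(NUM_PAGES, static_owner, NODE_OF_THREAD)

serial_local = local_fraction(serial_init, static_owner, NODE_OF_THREAD)
parallel_local = local_fraction(parallel_init, static_owner, NODE_OF_THREAD)
```

In this model, serial initialisation leaves half of the pages remote for the later parallel loop, while initialising with the same split as the later loop makes every page local.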