Lecture 4: Shared Memory Multiprocessors, Lecture 6: MESI and MOESI Cache Coherence, Lecture 7: Directory-Based Cache Coherence Flashcards

1
Q

How does the structure of multiprocessors differ for shared and distributed memory?

A

Shared memory: all cpus have access to the same memory

Distributed memory: each cpu has its own local memory

2
Q

Which is more prevalent in multiprocessors, shared or distributed memory, and why? Are there any drawbacks?

A

Shared memory, because it is easier to program. The drawback is that the hardware is more complex.

3
Q

What problem does hardware complexity of shared memory multiprocessors lead to?

A

bus-based cache coherence systems that do not scale well

4
Q

What are caches?

A

Fast, small local memories holding recently used data and instructions. They can have several levels: L1 and L2 (local) and L3 (shared).

They are needed because main memory cannot keep up with processor speed.

5
Q

What are two cache functions?

A
  1. Fetch data from RAM on cache misses
  2. Write modified data back to RAM
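
A minimal Python sketch of these two functions. The `TinyCache` class, its one-line capacity, and the dict standing in for RAM are all assumptions made for illustration, not something from the lecture.

```python
class TinyCache:
    """Hypothetical single-line write-back cache over a dict that stands in for RAM."""

    def __init__(self, ram):
        self.ram = ram
        self.addr = None      # address currently cached (None means empty)
        self.value = None
        self.dirty = False    # True if the cached value differs from RAM

    def _load(self, addr):
        if self.addr == addr:
            return            # cache hit, nothing to fetch
        if self.dirty:
            # Function 2: write modified data back to RAM before eviction.
            self.ram[self.addr] = self.value
        # Function 1: fetch data from RAM on a cache miss.
        self.addr, self.value, self.dirty = addr, self.ram[addr], False

    def read(self, addr):
        self._load(addr)
        return self.value

    def write(self, addr, value):
        self._load(addr)
        self.value, self.dirty = value, True


ram = {0: 10, 1: 20}
cache = TinyCache(ram)
cache.write(0, 99)        # miss: fetch address 0, then modify it in the cache only
before_eviction = ram[0]  # RAM still holds the old value here
cache.read(1)             # miss on address 1 evicts the dirty line and writes it back
after_eviction = ram[0]
```

In this sketch RAM is only updated when the dirty line is evicted, which is the write-back behaviour the card describes.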
6
Q

What is the cache coherency problem?

A

Inconsistency of shared data across multiple caches in multi-core systems. Can occur when one core updates a value in its cache, leaving outdated copies in other caches.
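
The problem can be reproduced in a few lines of Python. This is only an illustration: the two dicts stand in for private caches, and the variable names are made up for the example.

```python
memory = {"x": 0}

# Each core's private cache starts by loading x from memory.
cache_core1 = {"x": memory["x"]}
cache_core2 = {"x": memory["x"]}

# Core 1 updates x in its own cache only (no coherence protocol is modelled).
cache_core1["x"] = 42

# Core 2 still reads its old copy, so the two caches now disagree.
stale_value = cache_core2["x"]
coherent = cache_core1["x"] == cache_core2["x"]
```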

7
Q

What is the solution to the cache coherency problem?

A

Cache-to-cache communication with bus snooping.

Keeping the exchange between caches avoids involving the slow main memory, which helps performance.

8
Q

How does bus snooping work in maintaining cache coherency in multiprocessor systems?

A

Snooping hardware is attached to each core's cache, and all caches share one bus. It observes every transaction on the bus and can modify the cache independently of the core, taking action whenever it sees a pertinent transaction.

9
Q

True or false

A cache has 2 control bits for each line it contains, indicating the line's state

A

True

10
Q

How many cache line states are there in MSI?

A

Three:
1. Modified
2. Invalid
3. Shared

11
Q

Match each state to its description.

  1. Modified
  2. Invalid
  3. Shared

A. Implicit. A valid cache entry exists and the line has the same values as main memory. Several caches can hold the same line in this state.

B. There may be an address match on this line, but the data is not valid. We must fetch it from memory or get it from another cache.

C. The cache line is valid and has been written to, but the latest values have not yet been written to memory. A line can be in the modified state in at most one core.

A
  1. C
  2. B
  3. A
12
Q

Which of the following two-core state combinations are legal?

a. modified invalid
b. modified modified
c. shared shared
d. invalid shared
e. modified shared
f. invalid invalid

A

a, c, d, f
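
The rule behind this answer can be sketched in Python: at most one core may hold a line in M, and a line in M cannot coexist with a copy in S. The function name `is_legal` and the single-letter state codes are assumptions made for the example.

```python
from itertools import product

STATES = ("M", "S", "I")


def is_legal(states):
    """Return True if a combination of per-core MSI states for one line is legal."""
    modified = states.count("M")
    shared = states.count("S")
    # Either no core is in M, or exactly one is and no other core holds it in S.
    return modified == 0 or (modified == 1 and shared == 0)


# Enumerate every two-core combination and keep the legal ones.
legal_pairs = [pair for pair in product(STATES, repeat=2) if is_legal(pair)]
```

Of the nine ordered two-core combinations, only modified/modified and the two orderings of modified/shared are rejected.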

13
Q

What are the aspects of state transitions?

A
  1. messages
  2. accesses made to main memory
  3. state changes
14
Q

What are the messages sent between caches?

A
  1. Read messages: one core requests a cache line from another
  2. Invalidate messages: one core asks another to invalidate one of its cache lines
15
Q

Describe the state transitions from the following state

modified invalid

A

Read on core 1: cache hit, served from cache
Write on core 1: cache hit, served from cache
Read on core 2: overall state changes to shared/shared
Write on core 2: Overall state changes to (a’): invalid/modified

16
Q

Describe the state transitions from the following state

invalid invalid

A

Read on core 1: goes to state (c’): shared/invalid
Read on core 2: symmetry - goes to state (c): invalid/shared
Write on core 1: State goes to (a) modified/invalid
Write on core 2: symmetry - goes to state (a’): invalid/modified

17
Q

Describe the state transitions from the following state

invalid shared

A

Read on core 1: Overall state goes to (d): shared/shared
Read on core 2: Cache hit, stays in (c): invalid/shared
Write on core 1: Overall state goes to (a): modified/invalid
Write on core 2: Overall state goes to (a’): invalid/modified

18
Q

Describe the state transitions from the following state

shared shared

A

Read on core 1 or 2: cache hit, served from the cache
Write on core 1: Overall state goes to: modified/invalid
Write on core 2: symmetry, state goes to (a’): invalid/modified
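
The two-core MSI transitions on the last four cards can be collected into one small Python function. This is a sketch only: the name `step`, the 0-based core indices, and the tuple encoding of the state pair are assumptions made for illustration.

```python
def step(state, core, op):
    """Apply a read or write by `core` (0 or 1) to a two-core MSI state pair."""
    mine, other = state[core], state[1 - core]
    if op == "write":
        # The writer ends in M; the other copy is invalidated
        # (after a write-back if it was in M).
        mine, other = "M", "I"
    elif mine == "I":
        # Read miss: the reader becomes S, and a modified copy elsewhere
        # is written back and downgraded to S.
        mine = "S"
        if other == "M":
            other = "S"
    # A read while in M or S is a cache hit and changes nothing.
    new_state = [None, None]
    new_state[core], new_state[1 - core] = mine, other
    return tuple(new_state)
```

For example, `step(("M", "I"), 1, "read")` gives `("S", "S")`, matching the "Read on core 2" transition from modified/invalid.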

19
Q

Describe state transitions beyond 2 cores

A

Snoopy bus messages are broadcast to all cores
1. Any core with a valid value can respond to a read request
2. Upon receiving an invalidate request:
   Any core in S invalidates without writeback
   A core in M writes back, then invalidates
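
A rough Python sketch of the invalidate rule for any number of cores. It assumes the invalidate is triggered by a write from the core at index `writer`; the function name and the write-back counter are hypothetical.

```python
def snoop_invalidate(states, writer):
    """Broadcast an invalidate from `writer`; return (new_states, writebacks)."""
    new_states, writebacks = [], 0
    for core, state in enumerate(states):
        if core == writer:
            new_states.append("M")   # the writer now holds the only valid copy
            continue
        if state == "M":
            writebacks += 1          # a core in M writes back before invalidating
        new_states.append("I")       # cores in S (or already I) just end up invalid
    return new_states, writebacks
```

Every other core ends in I regardless of where it started; only a core that was in M costs a write-back.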

20
Q

What are the snooping protocols?

A

Write-Invalidate:
When a core updates a cache line, other copies of that line in other caches are invalidated
Future accesses on the other copies will require fetching the updated line from memory/other caches
Most widespread protocol, used in MSI (this lecture), MESI, MOESI (next video)

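Write-update:
When a core updates a cache line, the new value is broadcast to the other caches, which update their copies instead of invalidating them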

21
Q

What is a major implication of cache coherence?

A

All cores must always see exactly the same state of a location in memory
If one core writes and broadcasts an invalidate, every other core must be prevented from reading or writing that location
All cores must see the invalidate at the same time, i.e. within the same bus cycle
The coherence protocol is a major limitation to the number of cores that can be supported

22
Q

True or false

Multiple cores can use the bus at a time

A

False

23
Q

True or false

An invalidate always needs to be broadcast with a write

A

False

Sometimes the line is already invalid in the other caches

24
Q

Describe MESI

A

Split the S state into:
E: exclusive
Switch to E after a read causing a fetch from memory
S: (truly) shared
Switch to S after a read that gets value from another cache
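
The E/S split can be sketched as a read-miss rule in Python. The function name `mesi_read_miss` and the list-of-letters encoding are assumptions, and the sketch follows the usual MESI convention that every valid copy in another cache ends up in S after supplying the line.

```python
def mesi_read_miss(other_states):
    """Return (requester_state, new_other_states) after a read miss in MESI."""
    if all(state == "I" for state in other_states):
        # No other cache holds the line: fetch from memory and take it exclusively.
        return "E", list(other_states)
    # Another cache supplies the line, so every valid copy becomes shared.
    return "S", ["I" if state == "I" else "S" for state in other_states]
```

A later write to a line held in E does not need an invalidate broadcast, because no other cache holds a copy.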

25
Q

Describe MOESI

A

Split the M state into two:
Modified: not in sync with memory, and the only valid copy
Owned: not in sync with memory, other valid copies in S

The owner has exclusive rights to make changes
The owner broadcasts changes to the shared copies, so no memory writeback is needed
Writeback happens only when data in O or M is evicted
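
A small Python sketch of two MOESI rules: what happens to the holder's state when another core reads the line, and which states require a write-back on eviction. The function names are hypothetical, and the table follows the common textbook description of MOESI rather than any specific processor.

```python
def moesi_remote_read(holder_state):
    """State of the current holder after another core reads the line."""
    transitions = {
        "M": "O",  # dirty data is supplied cache-to-cache, memory is not updated
        "O": "O",  # the owner keeps supplying the line
        "E": "S",
        "S": "S",
        "I": "I",
    }
    return transitions[holder_state]


def needs_writeback_on_eviction(state):
    """Only M and O hold data that memory does not have yet."""
    return state in ("M", "O")
```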

26
Q

True or false

Even with optimisations such as MESI and MOESI, bus-based systems can’t scale to large multiprocessor counts

A

True

27
Q

Describe the directory structure

A

Each directory entry has:
1. Present bitmap: which cores have a copy
2. Dirty bit: when set, there is only one owner

Each cache line also has valid and dirty bits
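
A minimal Python sketch of one directory entry, assuming an integer used as the present bitmap. The class name, method names, and the simplification that a read clears the dirty bit (any dirty owner is assumed to write back first) are all hypothetical.

```python
class DirectoryEntry:
    """Hypothetical directory entry for one memory line."""

    def __init__(self, num_cores):
        self.num_cores = num_cores
        self.present = 0      # bitmap: bit i set means core i holds a copy
        self.dirty = False    # True means exactly one core owns a modified copy

    def sharers(self):
        return [c for c in range(self.num_cores) if (self.present >> c) & 1]

    def read(self, core):
        # Assumption: a dirty owner writes back before the line is shared.
        self.dirty = False
        self.present |= 1 << core

    def write(self, core):
        # Only the cores recorded in the bitmap need an invalidate message.
        to_invalidate = [c for c in self.sharers() if c != core]
        self.present = 1 << core
        self.dirty = True
        return to_invalidate


entry = DirectoryEntry(num_cores=4)
entry.read(0)
entry.read(2)
invalidated = entry.write(2)   # core 2 writes, so only core 0 must be invalidated
```

The point of the bitmap is that invalidates go only to the listed sharers instead of being broadcast to every core on a bus.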

28
Q

What is the bottleneck of directory based coherence? How can it be solved?

A

The central directory.
Distribute the directory and cache it, e.g. NUMA (Non-Uniform Memory Access): more than one directory, each with its own address space

29
Q

NUMA drawbacks

A
  1. Slower communications
  2. Long/variable delays require handshakes
30
Q

True or false

Directory-based coherency improves on MESI and MOESI and is an optimal solution.

A

False

Scaling to large numbers of cores is still a problem