P4L3: Distributed Shared Memory Flashcards

1
Q

What are four granularities at which we can share state?

A
  1. Cache line
  2. Variable
  3. Page
  4. Object
2
Q

What are the tradeoffs associated with sharing at the granularity of the cache line?

A
  • Sharing at the granularity of the cache line is too fine-grained.
  • The amount of coherence traffic it generates outweighs any consistency benefits.
3
Q

What are the tradeoffs associated with sharing at the granularity of the variable?

A

PROS
This level of granularity makes it possible for programmers to specify sharing semantics on a per-variable basis

CONS
This level of sharing is still too fine-grained, and the level of network overhead will still be too high.

4
Q

What are the tradeoffs associated with sharing at the granularity of the page?

A

PROS

  • It’s a viable option because it does not generate the coherence traffic of cache line or variable-level sharing
  • It also makes sense to the OS! So it’s readily generalizable

CONS
  • Like any larger granularity, page-level sharing is prone to false sharing. This occurs when two processes concurrently access different portions of the same page, so the coherence traffic that gets generated is unnecessary.

5
Q

What are the tradeoffs associated with sharing at the granularity of the object?

A

PROS

  • We avoid the coherence traffic of cache line or variable-level sharing
  • The OS doesn’t need to be modified

CONS

  • The OS does not understand objects, so this requires a specific language runtime.
  • This makes object granularity a less generalizable solution than page granularity.
6
Q

For a distributed state management system (think distributed shared memory) to maintain consistency, what abilities must it have?

A
  • When a node requests data, it must get a relatively recent copy of that data.
  • When state changes, the change must be broadcast to the nodes that hold that state.
7
Q

Why do we differentiate between a global index structure to find the home nodes and local index structure about the portion of state they are responsible for?

A
  • The global index structure helps nodes to always find the home node for an address/page, which can ensure that a node can immediately get the most recent value for an object
  • The local index structures maintained by a home node are necessary to drive coherence mechanisms that are directed only at affected nodes
8
Q

Do you have some ideas about how you would go about implementing a distributed shared memory system?

A

The OS should be involved when we try to access shared memory, but not when we try to access local memory.

  • We can use the memory management unit (MMU) for this. When we try to access a remote address locally, it will be an invalid memory reference. This will generate a fault and trap to the OS.
  • The OS will then detect that the memory address is remote and use the global map to look up the home node for the requested address.
  • The OS will message that node via IPC and request the data at the address.
  • When the data is received, the OS can cache it on the CPU and return it to the process that requested it.

We also need the OS involved when a process tries to write a piece of shared data, but not local data.

  • To accomplish this we write-protect virtual addresses that correspond to shared state.
  • Writing to these addresses will cause a fault, which traps to the OS.
  • The OS will see that the access points to shared data
  • If the requesting node is not the home node, the OS will send a message to the home node asking for that state in order to update it
  • If the requesting node is the home node, the OS will update the data and broadcast coherence messages to nodes that also store that data.
  • It can determine which nodes hold the changing data by maintaining per page data structures that contain a list of nodes that have accessed that page.
  • This means that when a node requests a page or an address, it should send its node ID as part of the request.
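
The read and write paths above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` class, its method names, and the in-process `cluster` dictionary are hypothetical, plain method calls stand in for network messages, and explicit checks stand in for the MMU faults that would trap to the OS. It also assumes an invalidation-based coherence message and routes every write to the home node, which is one of several possible designs.

```python
class Node:
    """One DSM node. Method calls stand in for network messages, and the
    explicit checks stand in for MMU faults that would trap to the OS."""

    def __init__(self, node_id, cluster, global_map):
        self.node_id = node_id
        self.cluster = cluster        # node ID -> Node (hypothetical "network")
        self.global_map = global_map  # page -> home node ID, same on every node
        self.home_pages = {}          # pages this node is home to: page -> value
        self.copyset = {}             # per-page set of node IDs holding a copy
        self.cache = {}               # cached copies of remote pages
        cluster[node_id] = self

    def read(self, page):
        home_id = self.global_map[page]
        if home_id == self.node_id:
            return self.home_pages[page]      # local access, no remote message
        if page not in self.cache:            # "fault": invalid local reference
            home = self.cluster[home_id]
            self.cache[page] = home.handle_read(page, self.node_id)
        return self.cache[page]

    def write(self, page, value):
        # "Write fault": shared pages are write-protected, so the write is
        # routed to the home node, which applies the update.
        home = self.cluster[self.global_map[page]]
        home.handle_write(page, value)

    def handle_read(self, page, requester_id):
        # The requester sends its node ID so the home node can track copies.
        self.copyset.setdefault(page, set()).add(requester_id)
        return self.home_pages[page]

    def handle_write(self, page, value):
        self.home_pages[page] = value
        # Coherence messages go only to nodes recorded as holding the page.
        for holder_id in self.copyset.pop(page, set()):
            self.cluster[holder_id].invalidate(page)

    def invalidate(self, page):
        self.cache.pop(page, None)
```

After an invalidation, the next read on a remote node misses in its cache and fetches the fresh value from the home node, which re-adds that node to the page's copy set.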
9
Q

What’s a consistency model?

A

A consistency model is an agreement between state (memory for example) and upper software layers.

It guarantees that state changes will be made visible to upper-level applications if those applications follow certain behaviors.

10
Q

What is a strict consistency model?

A

A strict consistency model guarantees:

  1. All updates are made available everywhere, immediately.
  2. Every node in the system will see all writes in the same order.

This strategy is not possible in practice. SMPs do not even offer this guarantee on a single node. In a distributed system, the added latency and message reordering/loss make this even harder.

11
Q

What is a sequential consistency model?

A

This is next best to the strict consistency model.

  • Updates from different processors can be arbitrarily interleaved (ordered) so long as the ordering would be possible on a single processor system.
  • All processes see the same interleaving (ordering)!

EXTRAS

  • Updates are not required to be immediately visible.
  • Updates from the same process must maintain their ordering.
  • Concurrent reads will see the same value
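
To make "any interleaving that keeps each process's own order" concrete, here is a minimal sketch that enumerates the orderings a sequentially consistent system could pick for two processes. The function name `interleavings` and the string labels for the writes are made up for this example; a real system picks one such ordering and shows it to every process.

```python
def interleavings(a, b):
    """Yield every merge of sequences a and b that preserves the internal
    order of a and the internal order of b (program order per process)."""
    if not a or not b:
        # One side is exhausted: the rest can only be appended as-is.
        yield list(a) + list(b)
        return
    # Either the next operation comes from a ...
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    # ... or it comes from b.
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest
```

For example, with P1 issuing `W(x,1)` then `W(x,2)` and P2 issuing `W(y,5)`, there are three legal orderings, and in every one of them P1's two writes appear in their original order.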
12
Q

What is a causal consistency model?

A

This is a little less strict than sequential consistency.

  • Causal consistency detects causally related writes and ensures that they maintain their order.
  • Loosens the sequential requirement that all observed orderings are the same.

EXTRA:

  • Updates from the same node cannot be arbitrarily interleaved, just like sequential model
  • But NO guarantee about concurrent writes.
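
A small sketch of what this permits: given the causally related pairs of writes, any order a process observes is acceptable as long as each cause appears before its effect. The helper `respects_causality` and the write labels are hypothetical, and the sketch assumes the causal pairs are already known; how a real system detects them is outside its scope.

```python
def respects_causality(observed, causal_pairs):
    """Return True if every (before, after) pair of causally related writes
    appears in that order within the observed sequence of writes."""
    position = {write: index for index, write in enumerate(observed)}
    return all(position[before] < position[after]
               for before, after in causal_pairs)
```

If `W(y)` causally depends on `W(x)` while `W(z)` is concurrent with both, one process may observe `W(x), W(y), W(z)` and another `W(z), W(x), W(y)`. Both are allowed, but no process may observe `W(y)` before `W(x)`.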
13
Q

What is a weak consistency model?

A

Instead of inferring causal relationships on its own, a weak consistency model makes a new operation available to the upper software layers: synchronization points.

14
Q

What are the four consistency models discussed in this course?

A
  • strict
  • sequential
  • causal
  • weak
15
Q

What is Distributed Shared Memory?

A

Distributed shared memory is a service that manages memory across multiple nodes so that applications will have the illusion that they are running on a single shared-memory machine.

16
Q

Why is Distributed Shared Memory important?

A

Distributed shared memory mechanisms are important because they permit scaling beyond the limitations of how much memory we can include in a single machine.

Single machines with large amounts of memory can cost hundreds of thousands of dollars per machine. In order to scale up memory affordably, it’s imperative to understand DSM concepts and semantics so many cheap machines can be connected to give the illusion of a high memory “single” machine.

17
Q

Describe hardware DSM

A

Hardware-supported DSM relies on some physical interconnect. The OS running on each physical node is under the impression that it has access to much larger physical memory, and is allowed to establish virtual to physical mappings that point to physical addresses on other nodes.

Memory accesses that reference remote memory locations are passed to the network interconnect card (NIC), which translates remote memory accesses into interconnect messages that are then passed to the correct remote node.

The NICs are involved in all aspects of the memory management, access and consistency and even support some atomics.

While it’s very convenient to rely on hardware to do everything, this type of hardware is typically very expensive and as a result is reserved for very high-end machines.

18
Q

What four things does a software DSM system have to do?

A

DSM is often realized in software. The software will have to

  • detect local vs remote memory accesses
  • create and send messages to the appropriate node
  • accept messages from other nodes and perform the encoded memory operations
  • be involved in memory sharing and consistency support
19
Q

What is a cache line?

A

The smallest unit of memory that can be transferred between the main memory and the cache.

Rather than reading a single word or byte from main memory at a time, each cache entry usually holds a certain number of words, known as a “cache line” or “cache block”, and a whole line is read and cached at once. This takes advantage of the principle of locality of reference.

20
Q

What is false sharing?

A

False sharing occurs when a process accesses data that no other process is altering, but that data lives in the same unit of sharing (such as a cache block, a page, or an object) as data that another process is modifying. This triggers coherence mechanisms even though they are unnecessary.

Consider a page that internally has two variables, x and y. A process on one node is exclusively accessing and modifying x. Similarly, a process on another node is exclusively accessing and modifying y. When x and y are on the same page, the DSM system will interpret the two write accesses as an indication of concurrent access to a shared page. This will trigger coherence mechanisms which, while logically viable, are functionally superfluous.
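
The x/y scenario can be simulated with a rough counting model. This sketch assumes a 4096-byte page and counts one coherence action each time a node writes to a page that was last written by a different node; `PAGE_SIZE`, `page_of`, and `count_invalidations` are names invented for the example, not part of any real DSM system.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def page_of(address):
    """Map a byte address to the number of the page that contains it."""
    return address // PAGE_SIZE


def count_invalidations(writes):
    """writes is a list of (node, address) pairs in the order they occur.
    Count one coherence action whenever a node writes to a page whose
    previous writer was a different node."""
    last_writer = {}
    invalidations = 0
    for node, address in writes:
        page = page_of(address)
        if page in last_writer and last_writer[page] != node:
            invalidations += 1
        last_writer[page] = node
    return invalidations
```

With x at address 0 and y at address 8, the alternating writes `[("A", 0), ("B", 8), ("A", 0), ("B", 8)]` land on the same page and trigger coherence traffic even though the nodes never touch each other's variable. Moving y to address 4096 puts it on its own page, and the same write pattern triggers none.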

21
Q

How do the basic cloud computing service models differ?

A

The offerings differ primarily along the axis of ownership, with cloud providers owning different portions of an application stack for different models.

22
Q

How do we find a particular page in a DSM system?

A

First we check the global map for the manager node, then check the manager node for page (object) metadata.

GLOBAL MAP (replicated across nodes)

  • Each page (object) has an address = node ID + page frame number.
  • Node ID = ID for home/manager node. This node knows everything about this page.
  • This is captured in a Global Map (maps page address to manager node ID)
  • This map must be available on every node.

LOCAL PAGE METADATA (partitioned across nodes)
  • Each manager node has all info for the page (object) that it manages
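
A minimal sketch of this two-step lookup, assuming the node ID sits in the upper bits of the page address and the page frame number in the lower 20 bits. The bit width, the `ManagerNode` class, and the function names are assumptions made for illustration only.

```python
FRAME_BITS = 20  # assumed width of the page frame number field


def make_page_address(node_id, frame):
    """Page address = manager node ID (upper bits) + page frame number."""
    return (node_id << FRAME_BITS) | frame


def manager_of(page_address):
    """Global map: every node can compute the manager from the address."""
    return page_address >> FRAME_BITS


class ManagerNode:
    def __init__(self, node_id):
        self.node_id = node_id
        self.page_metadata = {}  # frame -> metadata, only for pages managed here

    def register_page(self, frame):
        self.page_metadata[frame] = {"copyset": set()}
        return make_page_address(self.node_id, frame)


def find_page(nodes, page_address):
    """Step 1: use the global map to find the manager node.
    Step 2: ask that manager for its local metadata about the page."""
    manager = nodes[manager_of(page_address)]
    frame = page_address & ((1 << FRAME_BITS) - 1)
    return manager.node_id, manager.page_metadata[frame]
```

Because the manager is derived from the address itself, the "map" needs no per-page storage on each node, but it also ties a page to its manager for as long as the address stays the same.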

23
Q

What do we do if we want more flexibility from our global map?

A

The global map uses the page address to find the manager node. If we want to change the manager, we have to change the page address!

Instead, we can use a Global Mapping Table. This uses the object (page) ID to index into a table that returns the manager node. So instead of changing the object address, we can just edit the table.
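
As a rough sketch of that indirection, the table below maps an object ID to its current manager node, so changing the manager is a table edit rather than an address change. The class and method names are hypothetical, and the sketch ignores how the table would be replicated and kept in sync across nodes.

```python
class GlobalMappingTable:
    """Maps object (page) IDs to manager node IDs."""

    def __init__(self):
        self._manager_for = {}

    def assign(self, object_id, node_id):
        self._manager_for[object_id] = node_id

    def lookup(self, object_id):
        return self._manager_for[object_id]

    def migrate(self, object_id, new_node_id):
        # The object keeps its ID; only the table entry changes.
        if object_id not in self._manager_for:
            raise KeyError(object_id)
        self._manager_for[object_id] = new_node_id
```

A caller that holds object ID 42 keeps using the same ID before and after `migrate(42, 3)`; only the result of `lookup(42)` changes.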

24
Q

How does the sequential consistency model treat operations from the same process?

A

They must maintain their original ordering

25
Q

Under the sequential consistency model, are updates required to be immediately visible?

A

No

26
Q

How does the sequential consistency model treat concurrent reads?

A

Concurrent reads will see the same value

27
Q

How does the causal consistency model treat operations from the same process?

A

They must maintain their original ordering

28
Q

How does the causal consistency model treat concurrent writes?

A

It makes no guarantees! Writes that are not causally related may be observed in different orders by different processes.

29
Q

How do we make sure that when a node requests data, it gets a relatively recent copy of that data?

A

Every node must maintain a map that connects an address/page to a specific home/manager node in the system. The requesting node will contact the home node in order to request the data that that node manages. The requesting node is then free to cache that data until a coherence request is sent out.

30
Q

How do we make sure that we broadcast a change in state?

A

The home node responsible for the changed state must maintain a per-page index of all of the nodes that have requested that page in the past.

This allows it to contact all nodes that have cached that subset of state (like a page).

31
Q

What is a global index structure (global map)?

A

The global index structure helps nodes to always find the home node for an address/page, which can ensure that a node can immediately get the most recent value for an object.

32
Q

What is a local index structure necessary for?

A

The local index structures maintained by a home node are necessary to drive coherence mechanisms that are directed only at affected nodes.

33
Q

What operations are available under the strict, sequential, and causal consistency models? What new operation is made available under weak consistency?

A
  • Reads and writes are available in all of them.
  • Under weak consistency, the memory system also makes synchronization points available.

34
Q

What does a synchronization point do?

A
  • When P1 synchronizes, it makes all updates from other processes available to P1
  • It also makes all updates from P1 available to other processes
  • BUT, updates are not immediately seen by other processes. They have to sync first to see them.
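
The behavior of a synchronization point can be sketched as a push followed by a pull. This is a simplified, single-threaded model with invented names (`SharedMemory`, `Process`, `sync`); it corresponds to the basic variant in which one sync operation covers the entire shared state.

```python
class SharedMemory:
    def __init__(self):
        self.committed = {}  # updates that have been published by some sync


class Process:
    def __init__(self, shared):
        self.shared = shared
        self.local = {}      # this process's current view of shared state
        self.pending = {}    # local writes not yet published

    def write(self, key, value):
        self.local[key] = value
        self.pending[key] = value

    def read(self, key):
        return self.local.get(key)

    def sync(self):
        # Push: make this process's updates available to others.
        self.shared.committed.update(self.pending)
        self.pending.clear()
        # Pull: make updates published by other processes visible here.
        self.local = dict(self.shared.committed)
```

If P1 writes x and then syncs, P2 still reads its old view of x until P2 itself calls `sync()`.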
35
Q

What are the three variations on weak consistency?

A
  • Basic: Single sync operation syncs everything in entire shared memory
  • Separate sync per subset of state (like page)
  • Separate entry/acquire and exit/release operations (so two ops instead of one)
36
Q

What are the pros/cons of offering separate sync operations (entry/acquire and exit/release)?

A

PROS

  • Limit data movement and number of coherence operations

CONS

  • The shared memory layer must maintain additional state to enable these operations

37
Q

What are the responsibilities of each node in a shared memory system?

A
  • Each node owns some portion of the physical memory, and provides the operations (reads/writes) on that memory.
  • Each node needs to be involved in some consistency protocols to ensure that shared accesses to the state have meaningful semantics.
38
Q

What does a software DSM system need to be able to do?

A
  • detect local vs remote memory accesses
  • create and send messages to the appropriate node
  • accept messages from other nodes and perform the encoded memory operations
  • be involved in memory sharing and consistency support
39
Q

At what level(s) can a software DSM system operate?

A
  • operating system
  • programming language runtime

40
Q

What are “sharing semantics”?

A

Sharing semantics define when changes to state made by one process are made available to other processes.