7 - Communication & Synchronisation Flashcards

1
Q

How do work items/threads communicate?

A

Through memory

2
Q

What is the ideal case for memory?

A

One type that is large, cheap and fast

3
Q

What are the trade-offs of large, cheap and fast memory?

A

Large = slow/expensive

Cheap = small/slow

Fast = small/expensive

4
Q

What are the 4 types of GPU memory?

A

Private memory, local memory, global memory, constant memory

5
Q

What are the attributes of private memory?

A

Very fast, accessible only by a single work item, held in registers, tens to hundreds of bytes

6
Q

What are the attributes of local memory?

A

Fast, accessible by all work items within a single work group, user-accessible cache, KB/MB

7
Q

What are the attributes of global memory?

A

Slow, accessible by threads from all work groups, DRAM, GB

8
Q

What are the attributes of constant memory?

A

Fast, accessible by all threads, part of global memory but cached, not writable, relatively small, KB

9
Q

What should you minimise time spent on?

A

Memory operations

10
Q

How do you minimise time spent on memory operations?

A

Move frequently accessed data to a faster memory

11
Q

What is the order of the memory types by access time?

A

host » global » local » private (slowest to fastest)

12
Q

What doesn’t benefit from moving frequently accessed data to a faster memory?

A

Single or sporadic accesses

13
Q

When does data end up in global memory?

A

When it is transferred from host to device

14
Q

What is local memory used for?

A

Making a local copy of the input so that repeated accesses are faster
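
Worked example: the saving from a local copy can be estimated by counting global memory reads. The sketch below is plain Python arithmetic, not GPU code. The 1D neighbourhood access pattern, `GROUP_SIZE` and `RADIUS` are assumptions chosen for illustration, and boundary effects are ignored.

```python
# Simulation only: counts global memory reads, it does not run on a GPU.
# GROUP_SIZE, RADIUS and the 1D neighbourhood pattern are hypothetical.

GROUP_SIZE = 8   # work items per work group (assumed)
RADIUS = 1       # each work item reads neighbours up to this distance away


def global_reads_without_local(n):
    """Every work item reads its whole neighbourhood from global memory."""
    return n * (2 * RADIUS + 1)


def global_reads_with_local(n):
    """Each work group copies its tile plus a halo to local memory once."""
    groups = n // GROUP_SIZE
    return groups * (GROUP_SIZE + 2 * RADIUS)


n = 64
print(global_reads_without_local(n))  # 192
print(global_reads_with_local(n))     # 80
```

The gain comes from reuse: each element is fetched from global memory once per work group instead of once per work item that needs it.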

15
Q

Why do you need synchronisation?

A

Accesses to shared locations need to be correctly synchronised/coordinated to avoid race conditions
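
Worked example: a race condition on a shared counter, as a minimal sketch. This is plain Python, not GPU code, and the interleaving of the two hypothetical work items is written out by hand so the outcome is repeatable. On real hardware the interleaving is not under the programmer's control.

```python
# Simulation only: two "work items" each try to add 1 to a shared counter.
# The order of reads and writes is fixed by hand to show one bad interleaving.

def racy_increment_twice():
    shared = {"counter": 0}
    # Both work items read the old value before either writes back.
    read_a = shared["counter"]
    read_b = shared["counter"]
    shared["counter"] = read_a + 1
    shared["counter"] = read_b + 1   # overwrites the update made by A
    return shared["counter"]


def coordinated_increment_twice():
    shared = {"counter": 0}
    # Each read-modify-write finishes before the next one starts.
    for _ in range(2):
        value = shared["counter"]
        shared["counter"] = value + 1
    return shared["counter"]


print(racy_increment_twice())         # 1, one update is lost
print(coordinated_increment_twice())  # 2
```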

16
Q

What are 3 types of synchronisation mechanisms?

A

Barriers/memory fences
Atomic operations
Separate kernel launches

17
Q

What do barriers do?

A

Ensure that all work items within the same work group reach the same point
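
Worked example: a barrier separating a write phase from a read phase. This is a CPU-side sketch using Python threads to stand in for the work items of one work group. `GROUP_SIZE`, `run_work_group` and the neighbour-read pattern are assumptions for illustration, not GPU code.

```python
import threading

GROUP_SIZE = 4  # work items in the simulated work group (assumed)


def run_work_group(values):
    local = [None] * GROUP_SIZE    # stands in for local memory
    result = [None] * GROUP_SIZE
    barrier = threading.Barrier(GROUP_SIZE)

    def work_item(lid):
        local[lid] = values[lid] * 2                  # phase 1: write own slot
        barrier.wait()                                # wait until every slot is written
        result[lid] = local[(lid + 1) % GROUP_SIZE]   # phase 2: read a neighbour's slot

    threads = [threading.Thread(target=work_item, args=(i,))
               for i in range(GROUP_SIZE)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result


print(run_work_group([1, 2, 3, 4]))  # [4, 6, 8, 2]
```

Without the `barrier.wait()` call, a work item could read its neighbour's slot before the neighbour has written it.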

18
Q

Which has lower overhead, global or local memory barriers?

19
Q

Where should you avoid putting barriers?

A

In conditional statements. A barrier must be reached by all work items in the group, otherwise the kernel deadlocks

20
Q

What is impossible in modern GPU/CPU hardware?

A

Synchronising different work groups

21
Q

How do you synchronise different work groups?

A

By writing and launching separate kernels
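
Worked example: a sum split into two launches, as a minimal sketch. The two Python functions stand in for kernels and are hypothetical names. The point is that the second one starts only after every work group of the first has finished, which acts as the synchronisation point between groups.

```python
# Simulation only: plain Python functions stand in for two kernel launches.

GROUP_SIZE = 4  # elements handled by each simulated work group (assumed)


def partial_sums_kernel(data):
    """First launch: each work group sums its own chunk independently."""
    return [sum(data[i:i + GROUP_SIZE])
            for i in range(0, len(data), GROUP_SIZE)]


def final_sum_kernel(partials):
    """Second launch: runs after all groups of the first launch are done."""
    return sum(partials)


data = list(range(16))
partials = partial_sums_kernel(data)
total = final_sum_kernel(partials)
print(partials)  # [6, 22, 38, 54]
print(total)     # 120
```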

22
Q

What do Atomic functions do?

A

Provide a mechanism for atomic (without interruption) memory operations

23
Q

What do Atomic functions guarantee?

A

Race-free execution

24
Q

How are Atomic updates performed?

A

Serially, so they carry a performance penalty

25
Q

In what order are atomic operations performed?

A

The order is unspecified, so they can only be used with associative and commutative operators
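
Worked example: why an unspecified order is only safe for some operators. The sketch below is plain Python, not GPU code. It applies the same hypothetical updates in every possible order and collects the distinct final values.

```python
from functools import reduce
from itertools import permutations
import operator


def results_over_all_orders(op, start, updates):
    """Apply the updates in every possible order, collect the final values."""
    return {reduce(op, order, start) for order in permutations(updates)}


updates = [1, 2, 3]

# Addition and max give the same final value whatever the order.
add_results = results_over_all_orders(operator.add, 0, updates)
max_results = results_over_all_orders(max, 0, updates)

# Overwriting ("last write wins") depends on which update arrives last.
overwrite_results = results_over_all_orders(lambda acc, x: x, 0, updates)

print(add_results)        # {6}
print(max_results)        # {3}
print(overwrite_results)  # {1, 2, 3}
```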

26
Q

What are the limitations of atomic functions?

A

Atomics are slower than normal accesses. Performance degrades when many work items attempt atomic operations on the same data simultaneously

27
Q

What are atomic functions used for?

A

For infrequent, sparse and/or unpredictable global communication. Where possible, use shared memory and structure algorithms to avoid synchronisation

28
Q

What do global memory reads by the GPU involve?

A

Reading entire blocks of data

29
Q

What is memory coalescing?

A

Accessing data sequentially for better performance. When another requested value lies in the same block, no additional memory access is required

30
Q

What are the effects of the stride in strided memory access?

A

The stride determines the access pattern. If the stride is larger than the block size, the benefits of block reads are lost
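
Worked example: counting how many memory blocks a strided access touches. This is plain Python arithmetic, not GPU code. `BLOCK_SIZE`, the 32 work items and the rule that work item `i` reads element `i * stride` are assumptions for illustration.

```python
# Simulation only: counts distinct memory blocks, it does not run on a GPU.

BLOCK_SIZE = 8  # elements fetched per memory block (assumed)


def blocks_touched(num_work_items, stride):
    """Distinct blocks read when work item i accesses element i * stride."""
    return len({(i * stride) // BLOCK_SIZE for i in range(num_work_items)})


for stride in (1, 2, 8, 16):
    print(stride, blocks_touched(32, stride))
# 1 4
# 2 8
# 8 32
# 16 32
```

With a stride of 1 the 32 reads share 4 blocks. Once the stride reaches the block size, every read needs a block of its own.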