Parallel Patterns Flashcards
(15 cards)
What are the different types of dependencies?
- Read after write (RAW), also called true or flow dependency
- Write after read (WAR), also called anti-dependency
- Write after write (WAW), also called output dependency
What are loop carried dependencies?
A loop-carried dependency exists when iteration i reads or writes data produced by an earlier iteration j, preventing the loop from being fully parallelized
Define the map, reduce, stencil, and scan patterns
Map - apply an elemental function independently to each element of a collection
Reduce - combine all elements into one result using an associative operator
Scan - compute all partial reductions (every prefix of the collection reduced), e.g. a prefix sum
Stencil - a map where each element's computation also reads data from its neighbors
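A minimal sequential sketch of the four patterns in Python (the variable names and the clamped-boundary choice for the stencil are illustrative, not from the source):

```python
from functools import reduce as _reduce
from itertools import accumulate

data = [1, 2, 3, 4]

# Map: apply an elemental function independently to each element.
mapped = [x * x for x in data]

# Reduce: combine all elements with an associative operator.
total = _reduce(lambda a, b: a + b, data)

# Scan: all partial reductions (here an inclusive prefix sum).
prefix = list(accumulate(data))

# Stencil: each output reads a neighborhood of the input
# (a 3-point sum; boundaries are clamped for simplicity).
stenciled = [
    data[max(i - 1, 0)] + data[i] + data[min(i + 1, len(data) - 1)]
    for i in range(len(data))
]
```

Each comprehension body is independent across iterations, which is what makes map, scan, and stencil candidates for parallel execution.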
What is the complexity of reduce? What about scan? What if we did not parallelize them?
Without parallelization, reduce is linear in the input size: O(n). Parallelized, it can be done in O(log n) steps using a tree of pairwise combines (given enough processors).
Scan has the same bounds: O(n) sequentially, O(log n) parallel steps.
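The O(log n) step count can be seen in a tree reduction sketch (the function name and the returned step counter are illustrative; each round's pairwise combines are independent and could run concurrently):

```python
def tree_reduce(values, op):
    """Pairwise (tree) reduction: O(log n) rounds, since every pair
    in a round is independent and could be combined in parallel."""
    vals = list(values)
    steps = 0
    while len(vals) > 1:
        # One parallel step: combine adjacent pairs.
        nxt = [op(vals[i], vals[i + 1]) for i in range(0, len(vals) - 1, 2)]
        if len(vals) % 2:          # an odd leftover passes through unchanged
            nxt.append(vals[-1])
        vals = nxt
        steps += 1
    return vals[0], steps
```

For 8 elements the reduction finishes in 3 rounds (log2 of 8), versus 7 sequential combines.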
Define the pack/unpack, pipeline, gather/scatter patterns
Pack - remove the elements of a collection whose mask bit is 0, keeping the survivors contiguous
Unpack - the inverse of pack: spread the packed elements back to their original positions
Pipeline - divide the computation into stages that run concurrently; data items flow through the stages
Gather - read values from a collection at the positions given by an index collection (random reads)
Scatter - write values into a collection at the positions given by an index collection (random writes)
Define the gather, shift, Zip, and unzip patterns.
Gather - random reads: build a new collection by reading the input at the positions given by an index collection
Shift - a gather where every element reads from the same fixed offset, moving the whole collection left or right
Zip - combine two collections element-wise into one collection of pairs
Unzip - the inverse of zip: separate a collection of pairs into two collections
Define the scatter pattern and the collision rules to handle concurrent writes
The scatter pattern takes the values of a collection and places them in a new collection at the positions given by an index collection
Beyond the thread-safety assumption, we must also define how scatter handles concurrent writes to the same index:
- Atomic: each write happens atomically, so the final value is one of the written values (which one is nondeterministic)
- Merge: combine colliding values with a commutative and associative function
- Permutation: all indexes are guaranteed distinct, so collisions cannot occur
- Priority: each value has a priority; the highest-priority write wins
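A sequential sketch of scatter under the merge rule (the function name and parameters are illustrative; a parallel version would need the combine to be atomic or otherwise synchronized):

```python
def merge_scatter(values, indexes, size, combine, identity):
    """Scatter with the 'merge' collision rule: writes that land on the
    same slot are combined with an associative, commutative function."""
    out = [identity] * size
    for v, i in zip(values, indexes):
        out[i] = combine(out[i], v)   # collision -> merge instead of overwrite
    return out
```

Because the combine function is commutative and associative, the result is the same no matter which order the concurrent writes arrive in.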
Describe the pack and unpack patterns
How could they be implemented?
Pack can be implemented with an exclusive prefix sum (scan) over the mask, which counts the number of 1s strictly BEFORE each element.
Then, combining the scan result with the mask: wherever the mask holds a 1, we write the value to the output index given by the scan.
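A sketch of this scan-based pack (the function name is illustrative; the scan loop is written sequentially here, but it is exactly the part that a parallel scan would replace):

```python
def pack(values, mask):
    """Pack via exclusive scan: scan[i] counts the 1s strictly before i,
    which is precisely the output index of a surviving element."""
    scan, total = [], 0
    for m in mask:                 # exclusive prefix sum of the mask
        scan.append(total)
        total += m
    out = [None] * total           # 'total' survivors in the output
    for i, m in enumerate(mask):
        if m:                      # masked scatter: independent random writes
            out[scan[i]] = values[i]
    return out
```

The writes in the second loop all target distinct indexes (a permutation scatter), so they can safely run in parallel.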
Describe the Split and Bin patterns
How can we implement the pack and unpack patterns?
Pack: Exclusive Scan (prefix sum) + Map (check the mask, match with original values) + Random Writes
Unpack: Map (match values with the mask) + Random Reads
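The unpack side can be sketched the same way (function name and the `None` placeholder for gap positions are illustrative choices, not from the source):

```python
def unpack(packed, mask):
    """Unpack: invert pack with the same exclusive scan; every position
    holding a 1 in the mask gathers (random read) its value from the
    packed collection, other positions receive a placeholder."""
    scan, total = [], 0
    for m in mask:                 # exclusive prefix sum of the mask
        scan.append(total)
        total += m
    return [packed[scan[i]] if m else None for i, m in enumerate(mask)]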
How could we implement the bin and split patterns?
We could implement the split pattern with an exclusive scan over the zeros of the mask (an indicator that is 1 wherever the mask is 0).
The total zero count (the last scan value plus the last indicator value) tells us where the ones start.
For the zeros: looping through the array, when we find a zero at position i we write its value at index scan[i].
For the ones: a one at position i goes to index zeroCount + (i - scan[i]), since i - scan[i] is the number of ones seen before position i.
Example:
[1,0,0,1,1,0,0,0]
[A,B,C,D,E,F,G,H]
Exclusive Scan: [0,0,1,2,2,2,3,4]
Ones start at 5
For the ones we have:
- first: 5 + (0 - 0) = 5
- second: 5 + (3 - 2) = 6
- third: 5 + (4 - 2) = 7
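The worked example above can be turned into a sketch of split (the function name is illustrative; as with pack, every output index is distinct, so the writes form a permutation scatter):

```python
def split(values, mask):
    """Split: stable partition driven by an exclusive scan over the zeros.
    A zero at position i goes to index zscan[i]; a one goes to
    zero_count + (i - zscan[i]), where i - zscan[i] counts the ones
    seen strictly before position i."""
    zscan, zeros = [], 0
    for m in mask:                 # exclusive prefix sum of the zero indicator
        zscan.append(zeros)
        zeros += (m == 0)
    out = [None] * len(values)
    for i, (v, m) in enumerate(zip(values, mask)):
        if m == 0:
            out[zscan[i]] = v
        else:
            out[zeros + (i - zscan[i])] = v
    return out
```

Running it on the card's example mask reproduces the indexes computed by hand: the zeros B, C, F, G, H land in positions 0-4 and the ones A, D, E in positions 5-7.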
Define the stencil pattern. What are its characteristics? What are its assumptions?
The stencil pattern is a map in which each element's computation also reads a fixed-shape neighborhood of the input (for example the elements at offsets -1, 0, +1). Neighborhoods overlap, so inputs are shared between outputs, and the boundaries of the collection need special handling (clamping, wrapping, or padding).
We assume that the function applied is thread safe (has no side effects) and that outputs go to a separate collection, so the input stays read-only.
How could we implement the stencil pattern with shifts? What are the advantages and disadvantages of this approach?
We can build one shifted copy of the input per neighbor offset and then apply a plain map over the zipped shifted copies. Advantage: neighbor accesses become regular element-wise operations, which vectorize well. Disadvantage: the shifted copies cost extra memory and memory traffic compared to reading neighbors in place.
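A sketch of a 3-point stencil built from shifts (the function name, the weight triple, and the clamped boundaries are illustrative assumptions):

```python
def stencil_via_shifts(data, weights):
    """3-point stencil as shifts + map: one shifted copy per neighbor
    offset, then a pure element-wise map over the zipped copies.
    Boundaries are clamped (the edge value is repeated)."""
    left = [data[0]] + data[:-1]    # neighbor at i-1 for every i
    right = data[1:] + [data[-1]]   # neighbor at i+1 for every i
    wl, wc, wr = weights
    # The map below touches only position i of each copy: no neighbor
    # indexing remains, so it is a plain data-parallel map.
    return [wl * l + wc * c + wr * r for l, c, r in zip(left, data, right)]
```

The shifts pay for themselves when the target hardware prefers regular, contiguous element-wise operations (e.g. SIMD lanes).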
Define the pipeline parallel pattern
You divide the computation into different stages, each executed by a different thread; data items flow through the stages, so several items are processed concurrently (each in a different stage)
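A thread-based pipeline sketch using queues between stages (the function name and queue wiring are illustrative, not a prescribed implementation):

```python
import queue
import threading

def pipeline(stages, items):
    """Pipeline: one thread per stage, connected by FIFO queues, so
    different items can occupy different stages at the same time."""
    qs = [queue.Queue() for _ in range(len(stages) + 1)]
    done = object()                 # sentinel marking end of the stream

    def worker(fn, q_in, q_out):
        while True:
            item = q_in.get()
            if item is done:        # propagate shutdown downstream
                q_out.put(done)
                return
            q_out.put(fn(item))

    threads = [
        threading.Thread(target=worker, args=(fn, qs[i], qs[i + 1]))
        for i, fn in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for item in items:              # feed the first stage
        qs[0].put(item)
    qs[0].put(done)
    out = []
    while True:                     # drain the last stage
        item = qs[-1].get()
        if item is done:
            break
        out.append(item)
    for t in threads:
        t.join()
    return out
```

With one thread per stage and FIFO queues, output order matches input order, and the throughput is limited by the slowest stage.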
What are the assumptions of the map parallel pattern? How could we optimize the application of multiple maps?
We assume that the elemental function has no side effects (it is thread safe)
We could fuse the elemental functions of consecutive maps, applying their composition in a single pass; this avoids all intermediate writes to and reads from memory
We could also do cache fusion: process the collection in cache-sized blocks, running every map over a block while it still resides in cache
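Map fusion can be sketched as follows (the function names `f`, `g`, and `fused_maps` are illustrative; the point is one pass instead of two):

```python
def f(x):
    return x + 1

def g(x):
    return x * x

def unfused_maps(data):
    """Two maps: writes a full intermediate collection, then re-reads it."""
    tmp = [f(x) for x in data]      # intermediate results hit memory
    return [g(t) for t in tmp]

def fused_maps(data):
    """Fused maps: apply g(f(x)) in a single pass; each intermediate
    value stays in a register instead of being written out and re-read."""
    return [g(f(x)) for x in data]
```

Both versions compute the same result; fusion only removes the intermediate memory traffic, which is also what cache fusion approximates when full fusion is not possible.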