Factors affecting parallel performance Flashcards
(33 cards)
What are the SLOW factors affecting parallel performance?
- Starvation
- Latency
- Overhead
- Waiting
How does Starvation affect parallel performance?
There is not enough parallel work to keep all the processors busy, often because the work is distributed unevenly between them.
How does Latency affect parallel performance?
The time taken for information to travel from one part of the system to another. Processors that are waiting for data cannot do useful work, which slows down the entire system.
How does Overhead affect parallel performance?
The work required in addition to the computation itself, such as starting and stopping OpenMP parallel regions.
How does Waiting affect parallel performance?
When multiple threads/processes access a shared resource, they are in contention for memory or network bandwidth and have to wait for one another.
What is parallel speedup?
How much faster the parallel program is than the serial version.
How is parallel speedup calculated?
SN = T0/TN
Where T0 is the run time of the serial program and TN is the run time of the parallel program on N processors.
How is parallel efficiency calculated?
EN = SN/N
Where SN is the speedup on N processors
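The two formulas above can be checked with a short sketch. The function names and the timings are made up for illustration only; they are not measurements from any real program.

```python
def speedup(t_serial, t_parallel):
    """SN = T0 / TN: how many times faster the parallel run is."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_procs):
    """EN = SN / N: the fraction of ideal (linear) speedup achieved."""
    return speedup(t_serial, t_parallel) / n_procs


# Hypothetical timings: 120 s serial, 20 s on 8 processors.
t0, tn, n = 120.0, 20.0, 8
print(speedup(t0, tn))        # 6.0
print(efficiency(t0, tn, n))  # 0.75
```

An efficiency of 1.0 would mean perfect linear speedup; here a quarter of the available processor time is lost to the SLOW factors.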
What is strong scaling?
When the total problem size is fixed and the number of processors is increased to reduce run time.
What is Amdahl’s law?
The idea that parallel speedup is limited by the fraction of the program that can be parallelised: the serial part takes the same time however many processors are used.
How is parallel speedup calculated using Amdahl’s Law?
SN = 1/(s + p/N)
Where SN is the parallel speedup on N processors, s is the fraction that cannot be parallelised, and p is the fraction that can be parallelised (s + p = 1).
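A minimal sketch of the Amdahl formula, assuming s + p = 1 so that only the serial fraction needs to be passed in. The function name and the 10% serial fraction are illustrative choices.

```python
def amdahl_speedup(s, n_procs):
    """Amdahl's law: SN = 1 / (s + p / N), with p = 1 - s."""
    p = 1.0 - s
    return 1.0 / (s + p / n_procs)


# With a 10% serial fraction the speedup can never reach 1 / s = 10,
# however many processors are added.
for n in (1, 4, 16, 1024):
    print(n, round(amdahl_speedup(0.1, n), 2))
```

As N grows, the p/N term vanishes and the speedup approaches 1/s, which is why a small serial fraction dominates strong scaling at large processor counts.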
What did Gustafson observe?
He observed that in practice the problem size scales with the number of processors.
What is weak scaling?
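When the problem size is increased in proportion to the number of processors, so the work per processor is fixed. Ideally the run time remains constant and the speedup increases linearly with the number of processors, so it is easier to make use of a large number of processors.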
What is the equation for weak scaling?
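SN = s + pN
Where SN is the scaled speedup on N processors, s is the serial fraction, and p is the parallel fraction.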
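The weak-scaling formula on this card can be sketched the same way, again assuming s + p = 1. The function name and the 10% serial fraction are illustrative.

```python
def gustafson_speedup(s, n_procs):
    """Scaled speedup for weak scaling: SN = s + p * N, with p = 1 - s."""
    p = 1.0 - s
    return s + p * n_procs


# With a 10% serial fraction the scaled speedup keeps growing with N.
for n in (1, 4, 16, 1024):
    print(n, round(gustafson_speedup(0.1, n), 2))
```

Because the parallel part of the work grows with N, the speedup rises linearly with slope p instead of levelling off at a fixed limit.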
How are parallel regions kept synchronised?
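There is an implicit barrier at the end of parallel regions and worksharing constructs, which acts as a synchronisation point.
How is the implicit synchronisation point removed?
Using the nowait clause on a worksharing construct. The barrier at the end of a parallel region cannot be removed.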
How does nowait improve performance?
It can improve performance by avoiding unnecessary waiting, but you need to be careful that the program still works correctly: no thread may go on to use results that another thread has not finished computing.
How is synchronisation added?
Using the barrier directive.
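OpenMP is used from C, C++ or Fortran, so the sketch below is only an analogy: it uses Python's standard `threading.Barrier` to show what a barrier guarantees. The two-phase worker and the variable names are hypothetical.

```python
import threading

N_THREADS = 4
partial = [0] * N_THREADS   # one slot per thread, written in phase 1
totals = [0] * N_THREADS    # each thread's view of the sum, from phase 2
barrier = threading.Barrier(N_THREADS)


def worker(tid):
    # Phase 1: each thread writes only its own slot.
    partial[tid] = (tid + 1) * 10
    # Synchronisation point: nobody continues until every slot is written.
    barrier.wait()
    # Phase 2: now it is safe to read the other threads' results.
    totals[tid] = sum(partial)


threads = [threading.Thread(target=worker, args=(t,)) for t in range(N_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(totals)  # [100, 100, 100, 100]
```

Removing the `barrier.wait()` call corresponds to the situation nowait creates: a fast thread could reach phase 2 and sum slots that are still zero.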
What are the three loop scheduling options?
- Static
- Dynamic
- Guided
How does static loop scheduling work?
Iterations are divided into pieces of size chunk and distributed round-robin between threads.
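The round-robin distribution described above can be sketched as a small function that returns which iterations each thread would run. This only models the assignment, it does not run anything in parallel, and the function name is made up.

```python
def static_schedule(n_iters, n_threads, chunk):
    """Return, for each thread, the list of iteration indices it is given."""
    assignment = [[] for _ in range(n_threads)]
    for chunk_index, start in enumerate(range(0, n_iters, chunk)):
        thread = chunk_index % n_threads          # round-robin over threads
        stop = min(start + chunk, n_iters)        # last chunk may be short
        assignment[thread].extend(range(start, stop))
    return assignment


# 10 iterations, 3 threads, chunk size 2.
print(static_schedule(10, 3, 2))
# [[0, 1, 6, 7], [2, 3, 8, 9], [4, 5]]
```

The assignment is fixed before the loop runs, so it has very little overhead, but a thread that receives expensive iterations cannot hand any of them to an idle thread.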
How does dynamic loop scheduling work?
Iterations are divided into pieces of size chunk. When a thread finishes its chunk, it is given another to work on.
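A minimal simulation of this hand-out-on-demand behaviour, assuming each iteration has a known cost and that the next chunk always goes to whichever thread becomes free first. The function name and the cost list are hypothetical.

```python
import heapq


def dynamic_schedule(costs, n_threads, chunk):
    """Simulate dynamic scheduling.

    Returns (assignment, finish) where assignment[t] lists the iterations
    run by thread t and finish[t] is the time at which thread t ends.
    """
    # Heap of (time at which the thread becomes free, thread id).
    free_at = [(0.0, t) for t in range(n_threads)]
    heapq.heapify(free_at)
    assignment = [[] for _ in range(n_threads)]
    for start in range(0, len(costs), chunk):
        time_free, thread = heapq.heappop(free_at)   # first thread to go idle
        indices = range(start, min(start + chunk, len(costs)))
        assignment[thread].extend(indices)
        busy = sum(costs[i] for i in indices)
        heapq.heappush(free_at, (time_free + busy, thread))
    finish = [0.0] * n_threads
    for time_free, thread in free_at:
        finish[thread] = time_free
    return assignment, finish


# One expensive iteration followed by seven cheap ones, 2 threads, chunk=1.
assignment, finish = dynamic_schedule([8, 1, 1, 1, 1, 1, 1, 1], 2, 1)
print(assignment)  # [[0], [1, 2, 3, 4, 5, 6, 7]]
print(finish)      # [8.0, 7.0]
```

While thread 0 is busy with the expensive iteration, thread 1 picks up all the cheap ones, so both finish at about the same time. A static round-robin split of the same loop would leave thread 0 with 11 units of work and thread 1 with 4.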
How does guided loop scheduling work?
Dynamic scheduling with decreasing chunk size. For chunk=1, the size of each chunk is proportional to the number of unassigned iterations divided by the number of threads in the team. chunk=k sets the minimum chunk size.
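The shrinking chunk sizes can be sketched as follows. The exact sizes are left to the OpenMP implementation, so this assumes the simplest rule consistent with the card: each chunk is the number of unassigned iterations divided by the number of threads, rounded up, and never smaller than the minimum. The function name is made up.

```python
import math


def guided_chunk_sizes(n_iters, n_threads, min_chunk=1):
    """List the chunk sizes handed out, in order, under the assumed rule."""
    sizes = []
    remaining = n_iters
    while remaining > 0:
        size = max(math.ceil(remaining / n_threads), min_chunk)
        size = min(size, remaining)   # the final chunk may be smaller
        sizes.append(size)
        remaining -= size
    return sizes


print(guided_chunk_sizes(20, 4))               # [5, 4, 3, 2, 2, 1, 1, 1, 1]
print(guided_chunk_sizes(20, 4, min_chunk=3))  # [5, 4, 3, 3, 3, 2]
```

Large early chunks keep the scheduling overhead low, while the small chunks at the end let threads even out any remaining load imbalance.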
What is an interconnect?
The network that links compute nodes and carries the messages passed between them. The time spent communicating needs to be minimised to achieve high parallel efficiency and scale to large numbers of processors.
What are two types of interconnect?
- Gigabit Ethernet
- InfiniBand