High Performance Computing Flashcards
(199 cards)
Week 1
What new science is a major user of high performance computing?
Life sciences for applications such as genome processing
How can we determine the performance of a high performance computer using floating point mathematics?
- Linpack is a performance benchmark which measures floating point operations per second (flops) using a dense linear algebra workload
- A widely used performance benchmark for HPC systems is a parallel version of Linpack called HPL (High-Performance Linpack)
Why is parallelization important in HPC?
Computationally demanding floating point calculations can be run in parallel to make use of multiple cores
What is Dennard scaling?
Dennard scaling is a recipe for keeping power per unit area (power density) constant as transistors were scaled to smaller sizes
- As transistors became smaller they also became faster (delay reduction) and more energy efficient (reduced threshold voltage)
- With very small features, limits associated with the physics of the device (e.g. leakage current) are reached
- Dennard scaling has broken down and processor clock speeds are no longer increasing
What is the current most common supercomputer architecture?
- Current systems are all based on integrating many multi-core processors
- The dominant architecture is now the “commodity cluster”
- Commodity clusters integrate off-the-shelf (OTS) components to make an HPC system (cluster)
Give the proper definition for a commodity cluster
A commodity cluster is a cluster in which both the network and the compute nodes are commercial products available for procurement and independent application by organisations (end users or separate vendors) other than the original equipment manufacturer.
Give four components of a cluster
* Compute nodes: provide the processor cores and memory required to run the workload
* Interconnect: cluster internal network enabling compute nodes to communicate and access storage
* Mass storage: disk arrays and storage nodes which provide user filesystems
* Login nodes: provide access (e.g. ssh) for users and administrators via external network
Why do high performance computers use compiled languages?
Maximizes performance.
Compilers parse code and generate executables with optimizations.
Optimizations at compile-time are less costly than at runtime.
What are common languages for high performance computers?
C, C++ and Fortran
Why must parallelisation be done manually?
Parallelization is too complex for compilers to handle automatically.
Programmers add parallel features.
Week 2
What can we use OpenMP for?
OpenMP provides extensions to C, C++ and Fortran
* These extensions enable the programmer to specify where parallelism should be added and how to add it
* The extensions provided by OpenMP are:
- Compiler directives
- Environment variables
- Runtime library routines
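A minimal sketch showing one of each extension type, assuming a compiler with OpenMP enabled (e.g. gcc -fopenmp):

#include <stdio.h>
#include <omp.h>                      // runtime library routines

int main(void) {
    // Environment variable (set before running): export OMP_NUM_THREADS=4
    #pragma omp parallel              // compiler directive: start a parallel region
    {
        // Runtime library routine: ask which thread is executing
        printf("Hello from thread %d\n", omp_get_thread_num());
    }
    return 0;
}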
What does it mean to say that OpenMP uses a fork join execution model?
Execution starts with a single thread (master thread)
- Worker threads start (fork) on entry to a parallel region
- Worker threads exit (join) at the end of the parallel region
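A small sketch of the fork-join structure (a directive only, no runtime library calls needed):

#include <stdio.h>

int main(void) {
    printf("Before the region: master thread only\n");

    #pragma omp parallel          // fork: worker threads start here
    {
        printf("Inside the region: printed once per thread\n");
    }                             // join: worker threads exit here

    printf("After the region: master thread only again\n");
    return 0;
}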
What can we use the OMP_NUM_THREADS environment variable for?
We can use the OMP_NUM_THREADS environment variable to control the number of threads forked in a parallel region e.g.
- export OMP_NUM_THREADS=4
- OMP_NUM_THREADS is one of the environment variables defined in the standard
- If you don’t specify the number of threads the default value is implementation defined
(i.e. the standard doesn’t say what it has to be)
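A hedged sketch of checking this from inside a program; omp_get_max_threads() reports how many threads a parallel region would use:

#include <stdio.h>
#include <omp.h>

int main(void) {
    // Assuming the program was launched after e.g. "export OMP_NUM_THREADS=4"
    printf("A parallel region would use up to %d threads\n", omp_get_max_threads());
    return 0;
}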
What does openMP provide that we can call directly from our functions?
OpenMP provides compiler directives, environment variables and a runtime library with functions we can call directly from our programs
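For example (a sketch, not an exhaustive list), omp_get_num_threads() and omp_get_wtime() are runtime library routines we can call directly:

#include <stdio.h>
#include <omp.h>

int main(void) {
    double start = omp_get_wtime();    // wall-clock timer from the runtime library

    #pragma omp parallel
    {
        if (omp_get_thread_num() == 0) // let one thread report the team size
            printf("Team size: %d threads\n", omp_get_num_threads());
    }

    printf("Elapsed: %f seconds\n", omp_get_wtime() - start);
    return 0;
}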
What header file must be included to use the OpenMP runtime library?
<omp.h>
Why is conditional compilation useful in programs that use OpenMP?
It ensures that the program can compile and run as a serial version when OpenMP is not enabled, avoiding errors caused by missing OpenMP compiler flags.
What is the role of the C pre-processor in the compilation process?
The C pre-processor processes source code before it is passed to the compiler, handling directives such as #include and #ifdef.
What does the _OPENMP macro indicate when it is defined?
It indicates that OpenMP is enabled and supported by the compiler.
What is the syntax of the #ifdef directive used for conditional compilation?
#ifdef MACRO
// Code included if MACRO is defined
#else
// Code included if MACRO is not defined
#endif
What is the main benefit of using conditional compilation with OpenMP programs?
It allows the same source code to support both serial and parallel execution by enabling or disabling OpenMP-related code.
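A minimal sketch of this pattern, assuming the serial fallback simply reports a single thread:

#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#endif

int main(void) {
#ifdef _OPENMP
    // Compiled with OpenMP enabled: report the real thread numbers
    #pragma omp parallel
    printf("Thread %d of %d\n", omp_get_thread_num(), omp_get_num_threads());
#else
    // Compiled without OpenMP: fall back to a serial message
    printf("Thread 0 of 1 (serial build)\n");
#endif
    return 0;
}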
What is one good way to distribute the workload when working in parallel?
One way to do this in OpenMP is to parallelise loops
- Different threads carry out different iterations of the loop
- We can parallelise a for loop from inside a parallel region:
#pragma omp for
- We can start a parallel region and parallelise a for loop:
#pragma omp parallel for
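A short sketch of the combined directive; the array size and contents here are arbitrary:

#include <stdio.h>

#define N 1000

int main(void) {
    double a[N], b[N], c[N];

    for (int i = 0; i < N; i++) { a[i] = i; b[i] = 2.0 * i; }

    // Iterations of this loop are divided among the threads
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        c[i] = a[i] + b[i];

    printf("c[N-1] = %f\n", c[N - 1]);
    return 0;
}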
What changes about the order of loop iterations when they are executed in parallel?
When the loop is parallelised the iterations will not take place in the order specified by the loop iterator
* We can’t rely on the loop iterations taking place in any particular order
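A small sketch that makes this visible: the printed iteration indices typically appear in a different order on each run (assuming more than one thread):

#include <stdio.h>
#include <omp.h>

int main(void) {
    #pragma omp parallel for
    for (int i = 0; i < 8; i++)
        printf("iteration %d done by thread %d\n", i, omp_get_thread_num());
    return 0;
}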