Final Exam Questions Flashcards
In the context of computational science, describe verification
- Process of checking that a computational model is solving the equations it was designed to solve.
- Checks if the model is correctly implemented.
- Involves checking the consistency + correctness of code with respect to the underlying mathematical + physical models.
- Does not guarantee model is accurate, only that it is implemented correctly.
In the context of computational science, describe validation
- Process of checking that a computational model accurately represents the real-world phenomenon it is intended to model.
- Involves assessing the uncertainties associated with the model predictions + the experimental data used for validation.
In the context of computational science, distinguish carefully between verification and validation.
Verification - ensures code is implemented correctly.
Validation - ensures that the model accurately represents reality.
Write down a finite difference approximation to the Laplacian operator.
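One common answer, given here as a sketch for a uniform 2D grid, is the second-order central-difference (five-point) stencil. The index notation u_{i,j} for the value at grid point (i, j) is an assumption introduced for illustration; Δx and Δy are the grid spacings.

```latex
\nabla^2 u \,\Big|_{i,j} \approx
\frac{u_{i+1,j} - 2u_{i,j} + u_{i-1,j}}{\Delta x^2}
+ \frac{u_{i,j+1} - 2u_{i,j} + u_{i,j-1}}{\Delta y^2},
\qquad \text{truncation error } O(\Delta x^2) + O(\Delta y^2)
```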
Explain how the accuracy of a finite difference approximation to the Laplacian operator can be characterised
- Grid spacing: Δx and Δy. As they become smaller, approximation becomes more accurate
- Approximation is exact in the limit as Δx and Δy approach 0.
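- Order of the finite difference scheme: for a scheme of order p the error scales as O(Δx^p), so a higher-order scheme converges faster as the grid is refined, but it needs more function evaluations (a wider stencil) and is computationally more expensive.
- Truncation error: error from the Taylor-series terms neglected in the approximation; O(Δx^2) + O(Δy^2) for the standard central-difference stencil.
- Round-off error: error introduced by the finite precision of computer arithmetic; it becomes significant when the grid spacing is very small.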
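The order of accuracy can be measured directly by halving the grid spacing and watching how the error falls. The sketch below assumes equal spacing h in both directions and uses u = sin(x) sin(y) as a test function; both choices, and all function names, are illustrative assumptions rather than part of the original answer.

```python
import math

def laplacian_5pt(u, x, y, h):
    """Five-point central-difference Laplacian of u at (x, y) with spacing h."""
    return (u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h)
            - 4.0 * u(x, y)) / h**2

def u_test(x, y):
    # Smooth test function (assumed for illustration)
    return math.sin(x) * math.sin(y)

def laplacian_exact(x, y):
    # Analytic Laplacian of sin(x) sin(y)
    return -2.0 * math.sin(x) * math.sin(y)

def stencil_error(h, x=0.7, y=0.4):
    """Absolute error of the stencil at one sample point."""
    return abs(laplacian_5pt(u_test, x, y, h) - laplacian_exact(x, y))

spacings = [0.1, 0.05, 0.025]                 # each spacing is half the previous
errors = [stencil_error(h) for h in spacings]
# Observed order p from error ~ C h^p: p = log2(e(h) / e(h/2))
orders = [math.log2(errors[i] / errors[i + 1]) for i in range(len(errors) - 1)]

for h, e in zip(spacings, errors):
    print(f"h = {h:<6} error = {e:.3e}")
print("observed orders:", [round(p, 2) for p in orders])
```

For a second-order scheme the observed order should be close to 2, i.e. the error drops by roughly a factor of 4 each time h is halved.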
Suppose an exact solution is known to a problem involving the Laplacian operator of the previous question. Explain how to demonstrate that a computed solution converges to this exact solution.
- Convergence to an exact solution at the expected rate is the strongest possible test of a computer code.
- Method of manufactured solutions:
- Exact solution is known + is used to derive the RHS of differential equation.
- RHS is then used as input to the numerical method being tested + the resulting solution is compared to the known exact solution.
a. write down model equation
b. write down a manufactured solution (any function satisfying boundary conditions)
c. Construct the manufactured source term M
d. Show that the computer code under test converges as expected (with M included) to the manufactured solution. Then the code is deemed verified.
e. Set M to 0
f. Proceed to calculations of practical interest.
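Steps a to d can be sketched on a small stand-in problem. The example below assumes a 1D model equation u'' = s(x) on [0, 1] with u(0) = u(1) = 0, a manufactured solution u_M = sin(πx), and a second-order central-difference solver; the model problem, the solver and all names are assumptions chosen for illustration, not the code the flashcard has in mind.

```python
import math

def solve_poisson_1d(source, n):
    """Solve u'' = source(x) on [0, 1], u(0) = u(1) = 0, with second-order
    central differences on n interior points (Thomas algorithm)."""
    h = 1.0 / (n + 1)
    x = [(i + 1) * h for i in range(n)]
    # Tridiagonal system: u[i-1] - 2 u[i] + u[i+1] = h^2 source(x[i])
    diag = [-2.0] * n
    rhs = [h * h * source(xi) for xi in x]
    # Forward elimination (sub- and super-diagonal entries are all 1)
    for i in range(1, n):
        w = 1.0 / diag[i - 1]
        diag[i] -= w
        rhs[i] -= w * rhs[i - 1]
    # Back substitution
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        u[i] = (rhs[i] - u[i + 1]) / diag[i]
    return x, u, h

def u_manufactured(x):
    # Step b: any smooth function satisfying the boundary conditions
    return math.sin(math.pi * x)

def m_source(x):
    # Step c: M is obtained by applying the operator d^2/dx^2 to u_manufactured
    return -math.pi**2 * math.sin(math.pi * x)

def l2_error(n):
    """Discrete L2 norm of (numerical solution - manufactured solution)."""
    x, u, h = solve_poisson_1d(m_source, n)
    return math.sqrt(h * sum((ui - u_manufactured(xi))**2 for xi, ui in zip(x, u)))

# Step d: refine the grid (h = 1/16, 1/32, 1/64) and check the convergence rate
grid_sizes = [15, 31, 63]
l2_errors = [l2_error(n) for n in grid_sizes]
mms_orders = [math.log2(l2_errors[i] / l2_errors[i + 1]) for i in range(2)]
print("L2 errors:", [f"{e:.3e}" for e in l2_errors])
print("observed orders:", [round(p, 2) for p in mms_orders])
```

If the observed order matches the design order of the scheme (2 here), the solver passes the test; the manufactured source is then replaced by the physical source term for calculations of practical interest.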
Explain the role of uncertainty quantification in a validation procedure:
- Helps to estimate the level of confidence in computational results.
- Helps to compare the computed results with experimental data / other models to determine the accuracy.
- Helps to identify the sources of uncertainty in the computational model + assess their impact on the results.
- Helps to determine the range of values that the computed results can take due to uncertainty in the input data / model parameters.
- Helps to improve the accuracy of the computational model by identifying the areas that require more accurate data / parameter values.
In the context of verifying a computer simulation program, outline the concept of a manufactured solution
- Manufactured solution = analytic function used to test the accuracy of a computer simulation program.
- Need not satisfy the original differential equation: it is chosen to be smooth and to satisfy any necessary boundary conditions, and a source term is added to the equation so that it becomes an exact solution of the modified problem.
- Substituted into the differential equation to compute the RHS (the manufactured source term), which is then used as input to the simulation program.
- Simulation program is run with this source term & the resulting numerical solution is compared to the manufactured solution to compute the error.
- Error can be calculated at each point or using the L2 error norm.
In the context of verifying a computer simulation program, explain how a manufactured solution can be used as part of a correctness test.
- Manufactured solution can be used to test the accuracy of the simulation under different conditions (different boundary conditions, nonlinearities, other sources of complexity)
- can also be used to verify the order of accuracy of the simulation program (rate at which the error decreases as the grid spacing is decreased)
Consider a mass, m, connected to a spring with force constant, k, with a nonlinear perturbation characterised by a parameter, α, such that:
F(x) = -kx(1+αx^3)
Write down the equation(s) of motion for this mass, and suggest a suitable numerical procedure for finding a solution.
Equation of motion - derived from Newton's second law.
F(x) = -kx(1 + αx^3)
m d^2x/dt^2 = -kx - kαx^4
m d^2x/dt^2 + kx + kαx^4 = 0
Can be solved using the 4th-order Runge-Kutta method (RK4), after rewriting as two first-order equations: dx/dt = v, dv/dt = -(k/m) x (1 + αx^3).
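A minimal sketch of the RK4 approach, treating the state as the pair [x, v]. The parameter values, step size and function names are assumptions for illustration. Conservation of the total energy, with potential kx^2/2 + kαx^5/5 obtained by integrating -F(x), is used as a quick sanity check.

```python
import math

def rk4_step(f, t, y, dt):
    """Advance the state list y by one classical RK4 step of size dt."""
    k1 = f(t, y)
    k2 = f(t + dt / 2, [yi + dt / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + dt / 2, [yi + dt / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + dt, [yi + dt * ki for yi, ki in zip(y, k3)])
    return [yi + dt / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def spring_rhs(m, k, alpha):
    """Return f(t, [x, v]) for m x'' = -k x (1 + alpha x^3)."""
    def f(t, y):
        x, v = y
        return [v, -(k / m) * x * (1.0 + alpha * x**3)]
    return f

def integrate(f, y0, dt, n_steps):
    """Take n_steps RK4 steps from t = 0 and return the final state."""
    t, y = 0.0, list(y0)
    for _ in range(n_steps):
        y = rk4_step(f, t, y, dt)
        t += dt
    return y

def energy(m, k, alpha, x, v):
    # Kinetic energy plus the potential obtained by integrating -F(x)
    return 0.5 * m * v**2 + 0.5 * k * x**2 + k * alpha * x**5 / 5.0

# Illustrative parameter values (assumptions, not from the question)
m, k, alpha = 1.0, 1.0, 0.1
x0, v0 = 0.5, 0.0
x_end, v_end = integrate(spring_rhs(m, k, alpha), [x0, v0], dt=0.01, n_steps=1000)
drift = abs(energy(m, k, alpha, x_end, v_end) - energy(m, k, alpha, x0, v0))
print(f"x = {x_end:.6f}, v = {v_end:.6f}, energy drift = {drift:.2e}")
```

Setting alpha to 0 recovers the simple harmonic oscillator, whose solution x = x0 cos(t√(k/m)) gives a second check on the integrator.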
Suggest a suitable manufactured solution for this nonlinear problem, and derive the required manufactured source term. Hence write down the modified equations of motion to be solved in a verification exercise using the method of manufactured solutions.
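One possible answer, offered as a sketch: take a sinusoid as the manufactured solution. The amplitude A and angular frequency ω are free parameters introduced here as assumptions; any smooth function of time would serve. Substituting x_M into the left-hand side of the equation of motion gives the source term M(t).

```latex
\begin{align*}
x_M(t) &= A \sin(\omega t) \\
M(t) &= m \ddot{x}_M + k x_M + k \alpha x_M^4
      = (k - m\omega^2)\, A \sin(\omega t) + k \alpha A^4 \sin^4(\omega t) \\
m \ddot{x} + k x + k \alpha x^4 &= M(t),
\qquad x(0) = x_M(0) = 0, \quad \dot{x}(0) = \dot{x}_M(0) = A\omega
\end{align*}
```

In first-order form the modified equations are dx/dt = v, dv/dt = -(k/m) x (1 + αx^3) + M(t)/m, and the numerical solution should converge to x_M(t) at the expected rate of the integrator.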
This question deals with concepts relating to High Performance Computing.
Describe the strong scaling used to assess parallel performance. State what it reveals about an application.
Strong scaling
1. Run time for a given problem size vs number of processors.
2. Size of problem is fixed, number of processors increased.
3. Reveals how well the parallel algorithm can solve a problem faster as the number of processors is increased
4. Ideally, computation time should decrease as the number of processors increases.
5. Reveals the degree of parallelisation efficiency
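Strong-scaling results are usually summarised as speedup S(p) = T(1)/T(p) and parallel efficiency E(p) = S(p)/p. The sketch below computes both; the timings are hypothetical numbers made up for illustration, not measurements, and the function name is an assumption.

```python
def strong_scaling_metrics(run_times):
    """run_times maps processor count -> run time for a FIXED problem size."""
    t1 = run_times[1]                      # single-processor reference time
    metrics = {}
    for p in sorted(run_times):
        speedup = t1 / run_times[p]        # ideal value: p
        efficiency = speedup / p           # ideal value: 1.0
        metrics[p] = (speedup, efficiency)
    return metrics

# Hypothetical timings in seconds (illustrative only, not measured data)
strong_times = {1: 100.0, 2: 52.0, 4: 28.0, 8: 16.0}
strong = strong_scaling_metrics(strong_times)
for p, (s, e) in strong.items():
    print(f"p = {p}: speedup = {s:.2f}, efficiency = {e:.2f}")
```

An efficiency that falls as p grows indicates that serial sections or communication overhead are starting to dominate.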
This question deals with concepts relating to High Performance Computing.
Describe the weak scaling used to assess parallel performance. State what it reveals about an application.
Weak scaling
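1. Problem size is increased proportionally with the number of processors (fixed amount of work per processor).
2. Ideally, computation time should remain constant as the number of processors increases.
3. Reveals how well the application can handle larger problems on more processors, i.e. whether communication and synchronisation overheads grow with the processor count.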
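Weak-scaling efficiency is commonly reported as E(p) = T(1)/T(p), where the run on p processors uses a problem p times larger. A minimal sketch follows; the timings are hypothetical values for illustration, not measurements, and the function name is an assumption.

```python
def weak_scaling_efficiency(run_times):
    """run_times maps processor count -> run time, with the problem size
    grown in proportion to the processor count (fixed work per processor)."""
    t1 = run_times[1]
    # Ideal value is 1.0: the run time does not change as p grows
    return {p: t1 / run_times[p] for p in sorted(run_times)}

# Hypothetical timings in seconds (illustrative only, not measured data)
weak_times = {1: 100.0, 2: 104.0, 4: 110.0, 8: 125.0}
weak = weak_scaling_efficiency(weak_times)
for p, e in weak.items():
    print(f"p = {p}: weak-scaling efficiency = {e:.2f}")
```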
What function call must be made before a call to MPI_Bcast?
Before MPI_Bcast, MPI_Init must be called to initialise the MPI environment.
Describe the operation of MPI_Bcast ensuring that the function definition is referred to in the answer.
MPI_Bcast - collective communication routine in MPI library.
- broadcasts a message from the process with rank ‘root’ to all other processes in the communicator ‘comm’
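Function definition: int MPI_Bcast(void *buffer, int count, MPI_Datatype datatype, int root, MPI_Comm comm)
Function arguments:
- buffer - pointer to the buffer holding the data to be broadcast (on the root) or receiving it (on all other processes)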
- count - number of data items to broadcast
- datatype - datatype of each data item
- root - rank of the process sending the message
- comm - communicator that defines the group of processes involved in the broadcast.
MPI_Bcast operation works as follows:
1. Every process in the communicator ‘comm’ calls MPI_Bcast with the same ‘root’.
2. The process with rank ‘root’ supplies the data to be sent in its buffer.
3. The data is sent from the root process to all other processes in ‘comm’.
4. Each receiving process stores the broadcast data in its own buffer.
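Why might this be used in place of MPI_Send and MPI_Recv operations?
Can be used in place of MPI_Send and MPI_Recv when the same data needs to be sent from one process to all other processes in the communicator. A single collective call replaces a loop of sends and matching receives, reducing the amount of code needed, and MPI implementations typically use optimised (e.g. tree-based) algorithms that are faster than sending to each process in turn.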
Explain the concept of granularity.
- Computation to Communication ratio
- Level of detail in a simulation / computation.
- Determines the accuracy and computational cost of a simulation.
- Can also refer to level of parallelism in computation (size of work units assigned to different processors in a parallel algorithm)
Discuss fine grained applications
- Frequent communication
- Opportunity for load balancing
- Limited potential for optimisation
- Have small units of work that can be executed independently and concurrently by multiple processing elements.
- Suitable for problems with a high degree of parallelism, such as particle simulations or numerical integration.
- Can achieve high levels of concurrency and potentially good load balancing.
- Require frequent communication between processing elements, which can limit scalability and efficiency.
- May have limited potential for optimisation due to the overhead of communication and synchronisation.
- May require more low-level programming and synchronisation
Discuss coarse-grained applications
- High computation-to-communication ratio
- Load balancing difficult
- Optimisation more efficient
- Have larger units of work that are assigned to fewer processing elements.
- Are suitable for problems with a lower degree of parallelism, such as large-scale simulations or data analytics.
- Have a higher computation-to-communication ratio, which can improve scalability and efficiency.
- May have more challenging load balancing requirements, especially if the workload is not evenly distributed.
- May have more opportunities for optimisation, such as cache optimisation or vectorisation.
- May be more amenable to high-level programming models such as OpenMP or MPI.
Describe the 2 types of communication available in the Message Passing Interface and give an example of each with a brief explanation of its use. POINT-TO-POINT COMMUNICATION
- Communication between specific pairs of processes, known as the source and destination processes.
- Can be used for sending / receiving data from one process to another.
- Examples MPI_Send and MPI_Recv functions
- Usage: Process A can use MPI_Send to send data to process B, and process B can use MPI_Recv to receive the data from process A
- e.g. In a molecular dynamics simulation - each process may represent a different portion of the system. Processes need to exchange information about positions and velocities of the particles at regular time steps to update the simulation.
Describe the 2 types of communication available in the Message Passing Interface and give an example of each with a brief explanation of its use. COLLECTIVE COMMUNICATION
- Involves communication among a group of processes collectively.
- Can be used for broadcasting data to all processes, reducing data across processes, or gathering data from all processes.
- Examples - MPI_Bcast, MPI_Reduce, MPI_Gather
- Usage - All processes in the group call MPI_Bcast to distribute data from the root process to every other process, or MPI_Reduce to combine (e.g. sum) data from all processes onto the root.
- e.g. In a large scale parallel simulation of fluid dynamics - each process may compute local properties of the fluid in its subdomain. At certain time steps, the processes combine their local results (e.g. with MPI_Reduce) to compute global properties of the fluid, such as the mean pressure or maximum velocity.
Distinguish, with examples, between parametric uncertainty and parametric variability. PARAMETRIC UNCERTAINTY
- Parametric uncertainty: a parameter has a definite value, but we don’t know what that value is (e.g. a rate constant).
- Refers to the lack of knowledge or precision in determining the values of input parameters
- Can arise from measurement errors, limited data, or assumptions in a model
- Often described using probability distributions or ranges of possible values for a parameter
- Requires probabilistic methods to quantify and propagate the uncertainty through the simulation
EXAMPLE - In a computational fluid dynamics simulation, the viscosity of a fluid is an input parameter that affects the behaviour of the flow. The viscosity may be measured with some uncertainty, and this uncertainty can propagate through the simulation, leading to uncertainty in the results.
Distinguish, with examples, between parametric uncertainty and parametric variability. PARAMETRIC VARIABILITY
- Parametric variability: we are simulating an experimental situation that is not exactly reproducible (the normal situation), so some parameters do not have well-defined values.
- Refers to the inherent variability or randomness in the values of input parameters
- Can arise from physical or biological variability in a system or process
- Often described using statistical distributions that reflect the variability in the parameter values
- Requires stochastic methods to account for the variability in the simulation
EXAMPLE - In a population dynamics model, the birth rate of a species is an input parameter that can vary across individuals or populations. The birth rate may be modeled using a statistical distribution that reflects the variability in the parameter values, and the simulation can be run multiple times with different values of the parameter to account for the variability.
Discuss how a Monte Carlo procedure could be used to estimate the influence of parametric uncertainty on the outcome of a computer simulation procedure?
A Monte Carlo procedure estimates the influence of parametric uncertainty by randomly sampling values from the input parameter distributions and running the simulation multiple times. The results from the simulation runs can then be analyzed to estimate the uncertainty and variability in the simulation output due to the uncertainty in the input parameters.
- Define the input parameter distributions: For each uncertain input parameter, specify the probability distribution or range of possible values based on available information, such as experimental data or expert knowledge.
- Generate random samples: Use a random number generator to generate a large number of random samples from each input parameter distribution. The number of samples should be large enough to achieve a desired level of statistical accuracy.
- Run simulations: For each set of input parameter values, run the computer simulation to generate the corresponding output.
- Analyze results: Calculate statistics such as mean, standard deviation, and percentiles for the simulation output over all the runs. These statistics can be used to estimate the uncertainty and variability in the simulation output due to the uncertainty in the input parameters.
- Sensitivity analysis: Conduct a sensitivity analysis to determine which input parameters have the greatest influence on the simulation output. This can be done by calculating the correlation coefficients between each input parameter and the simulation output.
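The steps above can be sketched with a stand-in simulation. The example assumes a first-order decay model c(t) = c0 exp(-kt) with normally distributed uncertainty on c0 and the rate constant k; the model, the distributions, their parameters and all names are hypothetical choices for illustration only.

```python
import math
import random
import statistics

def simulation(c0, k, t=1.0):
    """Stand-in 'simulation' (hypothetical): first-order decay c(t) = c0 exp(-k t)."""
    return c0 * math.exp(-k * t)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(12345)       # fixed seed so the run is repeatable
n_samples = 10_000

# Define input distributions and draw samples (assumed normal, illustrative values)
c0_samples = [rng.gauss(1.0, 0.05) for _ in range(n_samples)]
k_samples = [rng.gauss(1.0, 0.10) for _ in range(n_samples)]

# Run the simulation once per sampled parameter set
outputs = [simulation(c0, k) for c0, k in zip(c0_samples, k_samples)]

# Analyse the results: mean, standard deviation, 5th and 95th percentiles
mean_out = statistics.fmean(outputs)
std_out = statistics.stdev(outputs)
cuts = statistics.quantiles(outputs, n=20)   # 19 cut points: 5%, 10%, ..., 95%
p05, p95 = cuts[0], cuts[-1]

# Sensitivity: correlation of each input with the output
corr_c0 = pearson(c0_samples, outputs)
corr_k = pearson(k_samples, outputs)

print(f"mean = {mean_out:.4f}, std = {std_out:.4f}")
print(f"5th to 95th percentile: {p05:.4f} to {p95:.4f}")
print(f"correlation with c0 = {corr_c0:.2f}, with k = {corr_k:.2f}")
```

The input with the larger absolute correlation contributes more to the output uncertainty, which suggests where better data would be most useful. The statistical error of the estimates falls only as 1/sqrt(N), so the sample count should be chosen with the required accuracy in mind.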