Chapter 7 Computational Linear Algebra Flashcards
(22 cards)
What is the computational cost of the standard matrix multiplication algorithm?
O(n^3), where n is the matrix dimension.
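As a concrete illustration, here is a minimal pure-Python sketch of the schoolbook algorithm (the function name `matmul` is just illustrative): n multiplications for each of the n^2 output entries gives O(n^3).

```python
# Schoolbook O(n^3) matrix product on plain lists of lists.
def matmul(A, B):
    n, m, p = len(A), len(B), len(B[0])
    C = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for k in range(m):        # i,k,j loop order reuses A[i][k]
            aik = A[i][k]
            for j in range(p):
                C[i][j] += aik * B[k][j]
    return C
```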
What does the product ŷ = Xb represent in linear regression?
The predicted outputs for a dataset X given the linear weights b.
How does the cost of adding matrices compare to multiplying them?
Addition is O(n^2), while multiplication is O(n^3) using the schoolbook method.
What is Strassen’s algorithm and its complexity?
A divide-and-conquer matrix multiplication algorithm with complexity O(n^log2 7) ≈ O(n^2.81).
How many matrix multiplications does Strassen’s algorithm use?
Seven multiplications per recursive step.
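A minimal sketch of those seven products, assuming square matrices whose size is a power of two (the helper names `add`, `sub`, and `strassen` are illustrative):

```python
# Strassen's recursion: seven half-size products M1..M7 per step
# instead of the eight used by the schoolbook block method.
def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def strassen(A, B):
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    A11 = [r[:h] for r in A[:h]]; A12 = [r[h:] for r in A[:h]]
    A21 = [r[:h] for r in A[h:]]; A22 = [r[h:] for r in A[h:]]
    B11 = [r[:h] for r in B[:h]]; B12 = [r[h:] for r in B[:h]]
    B21 = [r[:h] for r in B[h:]]; B22 = [r[h:] for r in B[h:]]
    M1 = strassen(add(A11, A22), add(B11, B22))
    M2 = strassen(add(A21, A22), B11)
    M3 = strassen(A11, sub(B12, B22))
    M4 = strassen(A22, sub(B21, B11))
    M5 = strassen(add(A11, A12), B22)
    M6 = strassen(sub(A21, A11), add(B11, B12))
    M7 = strassen(sub(A12, A22), add(B21, B22))
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(sub(add(M1, M3), M2), M6)
    top = [r1 + r2 for r1, r2 in zip(C11, C12)]
    bot = [r1 + r2 for r1, r2 in zip(C21, C22)]
    return top + bot
```

The additions are O(n^2) per step, so the cost is dominated by the seven recursive products, which is where the log2 7 exponent comes from.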
Why is computing A⁻¹b to solve Ax = b a bad idea?
It is computationally inefficient and numerically unstable.
What is LU decomposition?
Decomposing matrix A into the product of a lower triangular matrix L and an upper triangular matrix U (A = LU).
What is the total computational cost of solving Ax = b using LU decomposition?
O(n^3) overall: O(n^3) for the LU decomposition plus O(n^2) for forward and back substitution.
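A minimal pure-Python sketch of the full pipeline: a Doolittle factorisation without pivoting, then forward and back substitution (the name `lu_solve` is illustrative):

```python
# Solve Ax = b via A = LU, then Ly = b (forward) and Ux = y (back).
def lu_solve(A, b):
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    U = [[0.0] * n for _ in range(n)]
    for i in range(n):
        L[i][i] = 1.0
        for j in range(i, n):       # row i of U
            U[i][j] = A[i][j] - sum(L[i][k] * U[k][j] for k in range(i))
        for j in range(i + 1, n):   # column i of L (divides by the pivot U[i][i])
            L[j][i] = (A[j][i] - sum(L[j][k] * U[k][i] for k in range(i))) / U[i][i]
    y = [0.0] * n                   # forward substitution: Ly = b
    for i in range(n):
        y[i] = b[i] - sum(L[i][k] * y[k] for k in range(i))
    x = [0.0] * n                   # back substitution: Ux = y
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(U[i][k] * x[k] for k in range(i + 1, n))) / U[i][i]
    return x
```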
Why is LU decomposition without pivoting potentially unstable?
Small pivot values can lead to large errors due to division instability.
What does the ‘P’ in LUP decomposition stand for?
Permutation, used to improve numerical stability by row swapping.
What is machine epsilon?
The smallest number that, when added to 1 in floating point, gives a result different from 1.
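A quick sketch that locates machine epsilon for IEEE 754 doubles by repeated halving:

```python
# Halve eps until 1 + eps/2 rounds back to 1; the last eps that
# still changes the sum is machine epsilon (2^-52 for doubles).
eps = 1.0
while 1.0 + eps / 2 != 1.0:
    eps /= 2
print(eps)  # 2.220446049250313e-16
```

The same value is exposed by the standard library as `sys.float_info.epsilon`.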
What is a well-conditioned problem?
A small change of the input from a to a′ results in only a small change of the output.
What defines a numerically stable algorithm?
An algorithm is numerically stable if it doesn’t amplify small input errors into large output errors.
Formally, f̃(A, b) = f(Ã, b̃) for some slightly perturbed Ã, b̃ with small relative errors.
What are eigenvectors and eigenvalues?
Vectors e such that Ae = λe, where λ is a scalar (eigenvalue).
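A direct check of the defining relation Ae = λe on a small example (the matrix and vector are chosen purely for illustration):

```python
# A = [[2, 1], [1, 2]] has eigenvector e = [1, 1] with eigenvalue λ = 3.
A = [[2, 1], [1, 2]]
e = [1, 1]
Ae = [sum(a * x for a, x in zip(row, e)) for row in A]  # compute A e
print(Ae)  # [3, 3], i.e. 3 * e, so λ = 3
```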
Why are eigenvalues important in data science?
They enable dimensionality reduction by identifying dominant directions in data.
What is the power method used for?
Finding the dominant eigenvector of a matrix.
When does the power method converge to an eigenvector?
When the matrix has a unique eigenvalue of largest magnitude and the starting vector has a nonzero component in the direction of the corresponding eigenvector.
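A minimal sketch of the iteration, normalising with the infinity norm at each step (`power_method` is an illustrative name):

```python
# Repeatedly apply A and renormalise; the iterate aligns with the
# dominant eigenvector, and the norm approximates |λ_max|.
def power_method(A, v, iters=100):
    for _ in range(iters):
        w = [sum(a * x for a, x in zip(row, v)) for row in A]  # w = A v
        norm = max(abs(x) for x in w)                          # infinity norm
        v = [x / norm for x in w]
    return v, norm
```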
What does the equilibrium distribution π of a Markov chain satisfy?
Pπ = π, where P is the transition matrix.
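A small sketch under that convention (the 2-state transition matrix is illustrative; its columns sum to 1 so that Pπ = π): π is a fixed point, so repeatedly applying P to any starting distribution converges to it for an ergodic chain.

```python
# Column-stochastic transition matrix: P[i][j] = Prob(next = i | current = j).
P = [[0.9, 0.5],
     [0.1, 0.5]]          # each column sums to 1
pi = [0.5, 0.5]           # any starting distribution
for _ in range(200):      # iterate pi <- P pi until it stops changing
    pi = [sum(P[i][j] * pi[j] for j in range(2)) for i in range(2)]
print(pi)                 # approaches the equilibrium [5/6, 1/6]
```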
Ill-conditioned problem
Small changes in the input can cause large changes in the output (e.g. taking inverses).
Approach to bounding errors in conditioning problems, e.g. |a − ā| < ϵ
Here ā is a perturbation of a, and we want to quantify the resulting error. Define the errors δa = |a − ā| and δb = |b − b̃|.
The combined error is then bounded using the triangle inequality: |δa + δb| ≤ |δa| + |δb|.
Why is matrix inversion problematic?
Solving Ax = b via x = A⁻¹b is numerically unstable because:
- Inversion involves many divisions and subtractions
- These operations are vulnerable to rounding errors
- If A is ill-conditioned, small errors can cause huge output errors
When is LU decomposition unstable?
If some pivots are small (near machine epsilon or zero), the divisions by them become numerically unstable. The fix is LUP decomposition, which uses a permutation matrix to swap larger pivots into place.
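A sketch of Gaussian elimination with partial pivoting, which is the idea behind the P in LUP (the function name is illustrative); note how it copes with a tiny leading pivot by swapping rows first:

```python
# Before eliminating column k, swap in the row with the largest
# entry in that column so we never divide by a tiny pivot.
def solve_pivoted(A, b):
    n = len(A)
    A = [row[:] for row in A]; b = b[:]        # work on copies
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(A[i][k]))  # pivot row
        A[k], A[p] = A[p], A[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):              # eliminate below the pivot
            m = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= m * A[k][j]
            b[i] -= m * b[k]
    x = [0.0] * n                              # back substitution
    for i in reversed(range(n)):
        x[i] = (b[i] - sum(A[i][j] * x[j] for j in range(i + 1, n))) / A[i][i]
    return x
```

Without the swap, a system whose first pivot is around machine epsilon (e.g. A[0][0] = 1e-20) would divide by that tiny value and destroy the answer; with pivoting it is solved accurately.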