Chapter 1 Flashcards
(23 cards)
If there are p variables, how many pairwise scatterplots can be produced? What implications does this have?(1)
p(p-1)/2
This means scatterplot matrices aren't practical for large p.
Heat maps can be used as a replacement if the data is not categorical, with yellows and whites indicating higher values and reds and oranges indicating lower values.
What does covariance indicate?(1)
A positive covariance indicates that when one variable takes a value larger (or smaller) than its mean, the other variable generally does the same. Conversely, a negative covariance indicates that values larger than the mean in one variable are generally paired with values smaller than the mean in the other (and vice versa).
Note the 1/(n-1) in the formula gives an unbiased estimate.
What is the sample mean vector of the data matrix X?(1)
xbar = (1/n)X^T 1n
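A quick numpy check (numpy is my addition, not part of the cards) that (1/n)X^T 1n reproduces the column-wise means:

```python
import numpy as np

# Small 4x3 data matrix: n = 4 observations, p = 3 variables.
X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0],
              [2.0, 0.0, 4.0]])
n = X.shape[0]
ones = np.ones(n)

# Sample mean vector: (1/n) X^T 1n
xbar = (1.0 / n) * (X.T @ ones)

# Matches the usual column-wise mean.
assert np.allclose(xbar, X.mean(axis=0))
```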
What does pre or post multiplying by a vector of ones do?(1)
Pre-multiplying X by 1n^T calculates the column sums; post-multiplying by 1p calculates the row sums.
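A small numpy illustration (my addition) of the column-sum / row-sum effect:

```python
import numpy as np

X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
n, p = X.shape

col_sums = np.ones(n) @ X   # pre-multiplying by 1n^T: column sums
row_sums = X @ np.ones(p)   # post-multiplying by 1p: row sums

assert np.allclose(col_sums, X.sum(axis=0))
assert np.allclose(row_sums, X.sum(axis=1))
```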
What is the identity matrix?(1)
Ones on the diagonal and zeros everywhere else.
What is the centering matrix (Hn)? What does this do?(2)
Hn = In - (1/n)1n1n^T.
Pre-multiplying X by the centering matrix Hn has the effect of subtracting the appropriate sample mean from each element. Therefore the centred data matrix has a sample mean vector of 0.
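A numpy sketch (my addition) confirming that pre-multiplying by Hn centres each column:

```python
import numpy as np

X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 9.0]])
n = X.shape[0]

# Centering matrix Hn = In - (1/n) 1n 1n^T
H = np.eye(n) - np.ones((n, n)) / n

Xc = H @ X  # centred data matrix

# Each column of Xc now has sample mean 0.
assert np.allclose(Xc.mean(axis=0), 0.0)
assert np.allclose(Xc, X - X.mean(axis=0))
```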
What is the sample covariance matrix of the data matrix X?(1)
S = (1/(n-1))X^T Hn X
NEED TO LEARN PROOF!!
Note that S is symmetric and positive semi-definite - learn proof.
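A numpy check (my addition) that (1/(n-1))X^T Hn X agrees with the standard sample covariance, and is symmetric and positive semi-definite:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))
n = X.shape[0]

H = np.eye(n) - np.ones((n, n)) / n
S = X.T @ H @ X / (n - 1)   # (1/(n-1)) X^T Hn X

# Agrees with numpy's covariance (rowvar=False: columns are variables).
assert np.allclose(S, np.cov(X, rowvar=False))
# Symmetric and positive semi-definite (eigenvalues >= 0 up to rounding).
assert np.allclose(S, S.T)
assert np.all(np.linalg.eigvalsh(S) >= -1e-12)
```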
What are 2 properties of the centering matrix?(2)
LEARN PROOFS
Symmetric Hn^T=Hn
and idempotent: Hn^2 = Hn. (Idempotence means that applying an operation more than once does not change the result. In other words, if we try to center the centering matrix we are left with the centering matrix.)
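Both properties are easy to verify numerically (numpy is my addition):

```python
import numpy as np

n = 5
H = np.eye(n) - np.ones((n, n)) / n

assert np.allclose(H, H.T)      # symmetric: Hn^T = Hn
assert np.allclose(H @ H, H)    # idempotent: centering twice changes nothing
```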
Calculate rij for the correlation matrix.(1)
rij = sij/(si*sj)
where si = √sii is the sample standard deviation of the ith variable.
What does R = Ip mean?(1)
The variables are uncorrelated, since the correlation matrix shows each variable correlates only with itself.
How would you calculate the sample correlation matrix R from sample covariance matrix S?(1)
R = D^-1 S D^-1
Where D is the diagonal matrix with the standard deviations of the variables on its diagonal.
Thus R is positive semi-definite, following from S also being psd.
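A numpy check (my addition) that D^-1 S D^-1 matches the usual correlation matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 3))

S = np.cov(X, rowvar=False)
D_inv = np.diag(1.0 / np.sqrt(np.diag(S)))  # D^-1: reciprocal standard deviations

R = D_inv @ S @ D_inv

assert np.allclose(R, np.corrcoef(X, rowvar=False))
assert np.allclose(np.diag(R), 1.0)   # each variable correlates 1 with itself
```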
What are 2 single measures of multivariate scatter?(2)
Generalised variance = det(S)
Total variation = tr(S) (the sum of the diagonal elements of S)
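A tiny worked example of both measures (numbers are my own illustration):

```python
import numpy as np

S = np.array([[4.0, 1.0],
              [1.0, 9.0]])

gen_var = np.linalg.det(S)   # generalised variance: 4*9 - 1*1 = 35
total_var = np.trace(S)      # total variation: 4 + 9 = 13

assert np.isclose(gen_var, 35.0)
assert np.isclose(total_var, 13.0)
```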
What is a linear functional?(1)
The q=1 case: the transformation f(x) = a^T x, where a is a vector of length p, mapping Rp -> R.
What is an affine transformation?(1)
A linear transformation combined with a shift in location:
f(x) = Ax + b, mapping Rp -> Rq, for A a q x p matrix and b a vector of length q.
What is a 2d projection?(1)
A linear transformation of x: y = Ax, where A = (ej, ek)^T and ej is the length-p vector with 1 in element j and 0 everywhere else.
- Equivalent to selecting a pair of variables for one panel of a scatterplot matrix.
What is a 2d rotation?(1)
Rotation of the point x = (x1, x2)^T anticlockwise through the angle theta.
A linear transformation y = Ax, y = (y1, y2)^T, with the 2x2 matrix
A = (cos theta, -sin theta;
     sin theta,  cos theta)
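A quick numpy sanity check (my addition) of the rotation matrix:

```python
import numpy as np

theta = np.pi / 2  # 90 degrees anticlockwise

A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

x = np.array([1.0, 0.0])
y = A @ x

# Rotating (1, 0) by 90 degrees anticlockwise gives (0, 1).
assert np.allclose(y, [0.0, 1.0])
```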
What is the matrix formulation of affine transformations? What are the mean and covariance matrix?(1,2)
Y=XA^T + 1nb^T
ybar=Axbar+b, Sy=ASA^T
Learn proof for mean and cov!!
Note Sy is symmetric and positive semi-definite, because S is symmetric and positive semi-definite.
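The mean and covariance results can be verified numerically (numpy is my addition):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 3))        # n = 50, p = 3
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, -1.0]])    # q x p with q = 2
b = np.array([5.0, -3.0])
n = X.shape[0]

Y = X @ A.T + np.outer(np.ones(n), b)   # Y = X A^T + 1n b^T

xbar = X.mean(axis=0)
S = np.cov(X, rowvar=False)

assert np.allclose(Y.mean(axis=0), A @ xbar + b)          # ybar = A xbar + b
assert np.allclose(np.cov(Y, rowvar=False), A @ S @ A.T)  # Sy = A S A^T
```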
What is the spectral (eigen)decomposition?(1)
A = L Lambda L^T
Lambda = diag(lambda1, lambda2, ..., lambdap), the diagonal matrix of eigenvalues of A (lambda1 >= lambda2 >= ... >= lambdap)
L = matrix of normalised eigenvectors (i.e. lj^T lj = 1, with column lj corresponding to lambdaj)
What is the orthogonal matrix?(1)
LL^T = Ip = L^TL, i.e. li^T li = 1 and li^T lj = 0 for i != j
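A numpy check (my addition) of the spectral decomposition and the orthogonality of L:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])  # symmetric

lam, L = np.linalg.eigh(A)   # eigenvalues (ascending); columns of L are eigenvectors

# Reconstruction A = L Lambda L^T
assert np.allclose(L @ np.diag(lam) @ L.T, A)
# L is orthogonal: L L^T = I = L^T L
assert np.allclose(L @ L.T, np.eye(2))
assert np.allclose(L.T @ L, np.eye(2))
```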
How do we calculate the roots lambda? What about eigenvectors?(1)
|A - lambda*I| = 0 (the characteristic equation)
Av = lambda*v
How to calculate matrix square root? Under what conditions does this apply? Can this be generalised?(2)
If A is a symmetric positive semi-definite matrix with spectral decomposition L Lambda L^T, then the matrix square root of A is:
A^(1/2) = L Lambda^(1/2) L^T
Where Lambda^(1/2) is the diagonal matrix of the square roots of the eigenvalues.
Yes: for A^alpha, replace the square root with the power alpha in the above, where alpha is any real number.
Note: the Cholesky decomposition is usually cheaper to compute if all that is required is a factor G with GG^T = A, where G is a lower triangular matrix.
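A numpy sketch (my addition) of the matrix square root via the spectral decomposition, alongside the Cholesky factor:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])  # symmetric positive definite

lam, L = np.linalg.eigh(A)
A_half = L @ np.diag(np.sqrt(lam)) @ L.T   # A^(1/2) = L Lambda^(1/2) L^T

assert np.allclose(A_half @ A_half, A)

# Cholesky gives a different (lower triangular) factor G with G G^T = A.
G = np.linalg.cholesky(A)
assert np.allclose(G @ G.T, A)
```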
What is the scaling (standardisation) transformation?(1)
Y = (X - 1n xbar^T) D^-1
This pulls all variables onto a common scale but maintains the correlations between them.
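A numpy check (my addition) that scaling standardises each variable while leaving the correlations intact:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(30, 3)) * np.array([1.0, 10.0, 100.0])  # very different scales
n = X.shape[0]

xbar = X.mean(axis=0)
S = np.cov(X, rowvar=False)
D_inv = np.diag(1.0 / np.sqrt(np.diag(S)))

Y = (X - np.outer(np.ones(n), xbar)) @ D_inv   # Y = (X - 1n xbar^T) D^-1

# Zero means and unit variances, but correlations unchanged.
assert np.allclose(Y.mean(axis=0), 0.0)
assert np.allclose(np.diag(np.cov(Y, rowvar=False)), 1.0)
assert np.allclose(np.corrcoef(Y, rowvar=False), np.corrcoef(X, rowvar=False))
```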
What is the Mahalanobis transformation? What is its purpose?(1)
Aims to place the variables on a common scale AND remove the correlation between them.
yr = S^(-1/2)(xr - xbar), where S^(-1/2) = L Lambda^(-1/2) L^T.
The transformation is a p-dimensional affine transformation yr = A xr + b with transformation matrix A = S^(-1/2) and b = -S^(-1/2)xbar. Each transformed variable has zero mean and unit variance, and the transformed variables are uncorrelated.
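A numpy sketch (my addition) of the Mahalanobis transformation, verifying zero means and an identity covariance matrix afterwards:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(40, 3)) @ np.array([[1.0, 0.5, 0.0],
                                         [0.0, 1.0, 0.3],
                                         [0.0, 0.0, 1.0]])  # correlated variables
xbar = X.mean(axis=0)
S = np.cov(X, rowvar=False)

# S^(-1/2) = L Lambda^(-1/2) L^T via the spectral decomposition.
lam, L = np.linalg.eigh(S)
S_inv_half = L @ np.diag(lam ** -0.5) @ L.T

Y = (X - xbar) @ S_inv_half   # yr = S^(-1/2)(xr - xbar) for each row

# Zero mean, unit variance, and no correlation between transformed variables.
assert np.allclose(Y.mean(axis=0), 0.0)
assert np.allclose(np.cov(Y, rowvar=False), np.eye(3))
```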