Chapter 1 Flashcards
(23 cards)
If there are p variables, how many pairwise scatterplots can be produced? What implications does this have?(1)
p(p-1)/2
This means scatterplot matrices aren't practical for large p.
Heat maps can be used as a replacement if the data is not categorical, with yellows and whites indicating higher values and reds and oranges indicating lower values.
What does covariance indicate?(1)
A positive covariance indicates that when one variable takes a value larger (or smaller) than its mean, the other variable generally does the same. Conversely, a negative covariance indicates that values larger than the mean in one variable are generally paired with values smaller than the mean in the other (and vice versa).
Note the 1/(n-1) in the formula gives an unbiased estimate.
What is the sample mean vector of the data matrix X?(1)
xbar = (1/n)X^T 1n
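A quick numpy check (numpy is my addition, not part of the cards) that (1/n)X^T 1n reproduces the column-wise means:

```python
import numpy as np

# Small 4x3 data matrix: n = 4 observations, p = 3 variables.
X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0],
              [2.0, 0.0, 4.0]])
n = X.shape[0]
ones = np.ones(n)

# Sample mean vector: (1/n) X^T 1n
xbar = (1.0 / n) * (X.T @ ones)

# Matches the usual column-wise mean.
assert np.allclose(xbar, X.mean(axis=0))
```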
What does pre or post multiplying by a vector of ones do?(1)
Pre-multiplying X by 1n^T calculates the column sums; post-multiplying by 1p calculates the row sums.
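A small numpy illustration (my addition) of the column-sum / row-sum effect:

```python
import numpy as np

X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
n, p = X.shape

col_sums = np.ones(n) @ X   # pre-multiplying by 1n^T: column sums
row_sums = X @ np.ones(p)   # post-multiplying by 1p: row sums

assert np.allclose(col_sums, X.sum(axis=0))
assert np.allclose(row_sums, X.sum(axis=1))
```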
What is the identity matrix?(1)
Ones on the diagonal and zeros everywhere else.
What is the centering matrix (Hn)? What does this do?(2)
Hn = In - (1/n)1n1n^T.
Pre-multiplying X by the centering matrix Hn has the effect of subtracting the appropriate sample mean from each element. Therefore the centred data matrix has a sample mean vector of 0.
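A numpy sketch (my addition) confirming that pre-multiplying by Hn centres each column:

```python
import numpy as np

X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 9.0]])
n = X.shape[0]

# Centering matrix Hn = In - (1/n) 1n 1n^T
H = np.eye(n) - np.ones((n, n)) / n

Xc = H @ X  # centred data matrix

# Each column of Xc now has sample mean 0.
assert np.allclose(Xc.mean(axis=0), 0.0)
assert np.allclose(Xc, X - X.mean(axis=0))
```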
What is the sample covariance matrix of the data matrix X?(1)
S = (1/(n-1))X^T Hn X
NEED TO LEARN PROOF!!
Note that S is symmetric and positive semi-definite - learn proof.
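A numpy check (my addition) that (1/(n-1))X^T Hn X agrees with the standard sample covariance, and is symmetric and positive semi-definite:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))
n = X.shape[0]

H = np.eye(n) - np.ones((n, n)) / n
S = X.T @ H @ X / (n - 1)   # (1/(n-1)) X^T Hn X

# Agrees with numpy's covariance (rowvar=False: columns are variables).
assert np.allclose(S, np.cov(X, rowvar=False))
# Symmetric and positive semi-definite (eigenvalues >= 0 up to rounding).
assert np.allclose(S, S.T)
assert np.all(np.linalg.eigvalsh(S) >= -1e-12)
```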
What are 2 properties of the centering matrix?(2)
LEARN PROOFS
Symmetric Hn^T=Hn
and idempotent: Hn^2 = Hn. (Idempotence means that applying an operation more than once does not change the result. In other words, if we try to center the centering matrix we are left with the centering matrix.)
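Both properties are easy to verify numerically (numpy is my addition):

```python
import numpy as np

n = 5
H = np.eye(n) - np.ones((n, n)) / n

assert np.allclose(H, H.T)      # symmetric: Hn^T = Hn
assert np.allclose(H @ H, H)    # idempotent: centering twice changes nothing
```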
Calculate rij for the correlation matrix.(1)
rij = sij/(si*sj)
where si = √sii is the sample standard deviation of the ith variable.
What does R = Ip mean?(1)
The variables are uncorrelated, since the correlation matrix shows each variable correlates only with itself.
How would you calculate the sample correlation matrix R from sample covariance matrix S?(1)
R = D^-1 S D^-1
Where D is the diagonal matrix with the standard deviations of the variables on its diagonal.
Thus R is positive semi-definite, following from S also being psd.
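A numpy check (my addition) that D^-1 S D^-1 matches the usual correlation matrix:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 3))

S = np.cov(X, rowvar=False)
D_inv = np.diag(1.0 / np.sqrt(np.diag(S)))  # D^-1: reciprocal standard deviations

R = D_inv @ S @ D_inv

assert np.allclose(R, np.corrcoef(X, rowvar=False))
assert np.allclose(np.diag(R), 1.0)   # each variable correlates 1 with itself
```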
What are 2 single measures of multivariate scatter?(2)
Generalised variance = det(S)
Total variation = tr(S) (the sum of the diagonal elements of S)
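A tiny worked example of both measures (numbers are my own illustration):

```python
import numpy as np

S = np.array([[4.0, 1.0],
              [1.0, 9.0]])

gen_var = np.linalg.det(S)   # generalised variance: 4*9 - 1*1 = 35
total_var = np.trace(S)      # total variation: 4 + 9 = 13

assert np.isclose(gen_var, 35.0)
assert np.isclose(total_var, 13.0)
```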
What is a linear functional?(1)
The q=1 case: the transformation f(x) = a^T x, where a is a vector of length p, mapping Rp -> R.
What is an affine transformation?(1)
A linear transformation combined with a shift in location:
f(x) = Ax + b, mapping Rp -> Rq, for A a q x p matrix and b a vector of length q.
What is a 2d projection?(1)
A linear transformation of x: y = Ax, where A = (ej, ek)^T and ej is the length-p vector with 1 in element j and 0 everywhere else.
- Equivalent to selecting a pair of variables for one panel of a scatterplot matrix.
What is a 2d rotation?(1)
Rotation of the point x = (x1, x2)^T anticlockwise through the angle theta.
A linear transformation y = Ax, y = (y1, y2)^T, with the 2x2 matrix
A = (cos theta, -sin theta;
     sin theta,  cos theta)
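A quick numpy sanity check (my addition) of the rotation matrix:

```python
import numpy as np

theta = np.pi / 2  # 90 degrees anticlockwise

A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

x = np.array([1.0, 0.0])
y = A @ x

# Rotating (1, 0) by 90 degrees anticlockwise gives (0, 1).
assert np.allclose(y, [0.0, 1.0])
```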
What is the matrix formulation of affine transformations? What are the mean and covariance matrix?(1,2)
Y=XA^T + 1nb^T
ybar=Axbar+b, Sy=ASA^T
Learn proof for mean and cov!!
Note Sy is symmetric and positive semi-definite, because S is symmetric and positive semi-definite.
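The mean and covariance results can be verified numerically (numpy is my addition):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 3))        # n = 50, p = 3
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, -1.0]])    # q x p with q = 2
b = np.array([5.0, -3.0])
n = X.shape[0]

Y = X @ A.T + np.outer(np.ones(n), b)   # Y = X A^T + 1n b^T

xbar = X.mean(axis=0)
S = np.cov(X, rowvar=False)

assert np.allclose(Y.mean(axis=0), A @ xbar + b)          # ybar = A xbar + b
assert np.allclose(np.cov(Y, rowvar=False), A @ S @ A.T)  # Sy = A S A^T
```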
What is the spectral (eigen)decomposition?(1)
A = L Lambda L^T
Lambda = diag(lambda1, lambda2, ..., lambdap), the diagonal matrix of eigenvalues of A (lambda1 >= lambda2 >= ... >= lambdap)
L = matrix of normalised eigenvectors (i.e. lj^T lj = 1, with column lj corresponding to lambdaj)
What is the orthogonal matrix?(1)
LL^T = Ip = L^TL, i.e. li^T li = 1 and li^T lj = 0 for i != j
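A numpy check (my addition) of the spectral decomposition and the orthogonality of L:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])  # symmetric

lam, L = np.linalg.eigh(A)   # eigenvalues (ascending); columns of L are eigenvectors

# Reconstruction A = L Lambda L^T
assert np.allclose(L @ np.diag(lam) @ L.T, A)
# L is orthogonal: L L^T = I = L^T L
assert np.allclose(L @ L.T, np.eye(2))
assert np.allclose(L.T @ L, np.eye(2))
```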
How do we calculate the roots lambda? What about eigenvectors?(1)
|A - lambda*I| = 0 (the characteristic equation)
Av = lambda*v
How to calculate matrix square root? Under what conditions does this apply? Can this be generalised?(2)
If A is a symmetric positive semi-definite matrix with spectral decomposition L Lambda L^T, then the matrix square root of A is:
A^(1/2) = L Lambda^(1/2) L^T
Where Lambda^(1/2) is the diagonal matrix of the square roots of the eigenvalues.
Yes: for A^alpha, replace the square root with the power alpha in the above, where alpha is any real number.
Note: the Cholesky decomposition is usually cheaper to compute if all that is required is a factor G with GG^T = A, where G is a lower triangular matrix.
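A numpy sketch (my addition) of the matrix square root via the spectral decomposition, alongside the Cholesky factor:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])  # symmetric positive definite

lam, L = np.linalg.eigh(A)
A_half = L @ np.diag(np.sqrt(lam)) @ L.T   # A^(1/2) = L Lambda^(1/2) L^T

assert np.allclose(A_half @ A_half, A)

# Cholesky gives a different (lower triangular) factor G with G G^T = A.
G = np.linalg.cholesky(A)
assert np.allclose(G @ G.T, A)
```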
What is the scaling (standardisation) transformation?(1)
Y = (X - 1n xbar^T) D^-1
This pulls all variables onto a common scale but maintains the correlations between them.
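A numpy check (my addition) that scaling standardises each variable while leaving the correlations intact:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(30, 3)) * np.array([1.0, 10.0, 100.0])  # very different scales
n = X.shape[0]

xbar = X.mean(axis=0)
S = np.cov(X, rowvar=False)
D_inv = np.diag(1.0 / np.sqrt(np.diag(S)))

Y = (X - np.outer(np.ones(n), xbar)) @ D_inv   # Y = (X - 1n xbar^T) D^-1

# Zero means and unit variances, but correlations unchanged.
assert np.allclose(Y.mean(axis=0), 0.0)
assert np.allclose(np.diag(np.cov(Y, rowvar=False)), 1.0)
assert np.allclose(np.corrcoef(Y, rowvar=False), np.corrcoef(X, rowvar=False))
```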
What is the Mahalanobis transformation? What is its purpose?(1)
Aims to place the variables on a common scale AND remove the correlation between them.
yr = S^(-1/2)(xr - xbar), where S^(-1/2) = L Lambda^(-1/2) L^T.
The transformation is a p-dimensional affine transformation yr = A xr + b with transformation matrix A = S^(-1/2) and b = -S^(-1/2)xbar. Each transformed variable has zero mean and unit variance, and the transformed variables are uncorrelated.
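A numpy sketch (my addition) of the Mahalanobis transformation, verifying zero means and an identity covariance matrix afterwards:

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(40, 3)) @ np.array([[1.0, 0.5, 0.0],
                                         [0.0, 1.0, 0.3],
                                         [0.0, 0.0, 1.0]])  # correlated variables
xbar = X.mean(axis=0)
S = np.cov(X, rowvar=False)

# S^(-1/2) = L Lambda^(-1/2) L^T via the spectral decomposition.
lam, L = np.linalg.eigh(S)
S_inv_half = L @ np.diag(lam ** -0.5) @ L.T

Y = (X - xbar) @ S_inv_half   # yr = S^(-1/2)(xr - xbar) for each row

# Zero mean, unit variance, and no correlation between transformed variables.
assert np.allclose(Y.mean(axis=0), 0.0)
assert np.allclose(np.cov(Y, rowvar=False), np.eye(3))
```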