Week 2: GLM part 1 Flashcards

1
Q

Once we have collected the data we want to know which areas are active. How do we do this?

A

Through model-based techniques (e.g., linear regression) and model-free approaches (e.g., PCA)

2
Q

What general steps do we follow to perform model-based analysis of fMRI data?

A

1) Model the predictors; 2) fit the resulting models (one per predictor) to the data (if we have more than one predictor, sum the models and fit this sum to the actual data); 3) assess how well the models fit (e.g., through a t-statistic)
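These steps can be sketched with NumPy; everything here (stimulus timings, true betas, noise level) is made up for illustration, and the HRF convolution of step 1 is skipped for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100                                   # number of time points

# Step 1: model the predictors (boxcar stimulus vectors, made-up timings)
x1 = np.zeros(n)
x1[10:20] = 1
x2 = np.zeros(n)
x2[50:60] = 1

# Step 2: sum the models and fit the sum to the (simulated) data via OLS
X = np.column_stack([np.ones(n), x1, x2])         # design matrix
y = 2.0 * x1 + 0.5 * x2 + rng.normal(0, 0.1, n)   # simulated voxel signal
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Step 3: assess the fit (residuals; a t-statistic would build on these)
residuals = y - X @ beta_hat
print(beta_hat.round(2))   # close to the true values [0, 2, 0.5]
```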

3
Q

Univariate analysis

A

We treat each voxel’s time course independently

4
Q

Hemodynamic response function

A

A function that represents the change in blood-oxygen levels in response to neural activity

5
Q

T-statistic definition and formula

A

Signal-to-noise measure. Formula: t = (beta_hat - beta0) / SE(beta_hat)
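As a sketch (the numbers are made up; `beta0` is the value under the null hypothesis, usually 0):

```python
def t_statistic(beta_hat, se_beta_hat, beta0=0.0):
    """t = (beta_hat - beta0) / SE(beta_hat): how many standard
    errors the estimate lies away from the null value."""
    return (beta_hat - beta0) / se_beta_hat

print(t_statistic(1.0, 0.25))  # 4.0: the estimate is 4 SEs away from 0
```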

6
Q

Contrast testing

A

Conducting hypothesis tests about our betas (for which we have one model each). Example: we have b1 and b2; we create one model for each, sum these models, and fit the sum to the actual data. Now we want to know whether the amplitude of the b1 model is the same as the amplitude of the b2 model. Here the H0 would be b1 = b2 and the HA would be b1 != b2. Bring everything to the left of the “=” sign (giving b1 - b2 = 0 in this case) and calculate the contrast we need to answer the hypothesis. In this case, this means solving […] * [b0, b1, b2, b3] = b1 - b2, which gives [0, 1, -1, 0].
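A minimal numeric sketch of this contrast (the beta values are made up):

```python
import numpy as np

# Hypothetical fitted betas, ordered [b0, b1, b2, b3]
betas = np.array([0.3, 2.0, 1.5, 0.8])

# H0: b1 = b2  ->  b1 - b2 = 0  ->  the contrast vector picks out b1 - b2
c = np.array([0, 1, -1, 0])

contrast_value = c @ betas   # equals b1 - b2
print(contrast_value)        # 0.5
```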

7
Q

In which of the following steps do we get the regressors that we will fit to the actual data?

  1. creating stimulus vector for each stimulus/task
  2. convolving the vector with the HRF
  3. fitting the model to the data
A

At step 2. Convolving the initial vector with the HRF gives us the regressors (Xs), which we then fit to the actual data
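A sketch of step 2 with a toy HRF kernel (the onsets and the kernel shape are made up; in practice the canonical double-gamma HRF would be used):

```python
import numpy as np

# Step 1: stimulus vector with 1s at (made-up) stimulus onsets
stim = np.zeros(60)
stim[[5, 25, 45]] = 1

# Toy HRF kernel: rises, peaks, then decays (not the canonical shape)
t = np.arange(15)
hrf = t * np.exp(-t / 3.0)
hrf /= hrf.max()

# Step 2: convolution turns the spikes into BOLD-shaped regressors (Xs)
regressor = np.convolve(stim, hrf)[:len(stim)]   # trim to original length
print(regressor.shape)   # (60,)
```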

8
Q

Probability

A

Expected relative frequency of a particular outcome

9
Q

Random variable

A

A variable whose value is determined by the outcome of a random experiment

10
Q

Expected value (mean)

A

Mean of random variable

11
Q

Variance (of a random variable)

A

How the values of the random variable are dispersed around the mean

12
Q

Covariance

A

How much two random variables vary together

13
Q

Bias, variance, estimator

A

Bias: how much, on average, the estimate deviates from the true parameter value; variance: the reliability (spread) of the estimate; estimator: a statistic that estimates a parameter

14
Q

Regression equals…

A

…association (not causality!!)

15
Q

Simple linear regression model

A

Yi = beta0 + beta1*Xi + ei

16
Q

The error in linear regression is assumed to have a mean of ….

A

0 (this means that if we summed all our error terms e1, e2, e3, …, eN, the result would be 0, and dividing by N would again give 0)

17
Q

The variance of Yi (Var(Yi)) equals…

A

sigma^2.

18
Q

The formula for sigma^2 is …

A

sum(e^2) / ( # of independent pieces of information - # of parameters in the model, including b0 )
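A sketch, assuming e is the vector of residuals and the parameter count p includes b0:

```python
import numpy as np

def sigma2_hat(residuals, n_params):
    """Noise-variance estimate: sum of squared errors divided by
    (N independent observations - p parameters, b0 included)."""
    n = len(residuals)
    return np.sum(residuals ** 2) / (n - n_params)

e = np.array([1.0, -1.0, 2.0, -2.0])   # made-up residuals
print(sigma2_hat(e, 2))                # 10 / (4 - 2) = 5.0
```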

19
Q

The most used loss function for linear regression

A

The least-squares loss (sum of squared errors)

20
Q

Gauss Markov theorem states that…

A

…assuming the following assumptions of the GLM (linear regression model) hold:
1. Linearity: the relationship between X and the mean of Y is linear.
2. Homoscedasticity: the variance of the residuals is the same for any value of X.
3. Independence: observations are independent of each other (hence randomly sampled).
4. The errors have a mean of 0.

…then the OLS estimators b0 and b1 are the Best Linear Unbiased Estimators (BLUE) of the true b0 and b1. The OLS method is therefore used to estimate the parameters of a linear regression model (e.g., the GLM).

Note: “best linear unbiased” means 1) the estimator is unbiased (the results will, on average, hit the bull’s eye) and 2) it has the lowest variance among all linear unbiased estimators

21
Q

Diagonal matrix

A

A matrix whose only nonzero entries appear on the main diagonal (all off-diagonal entries are 0)

22
Q

Identity matrix

A

A diagonal matrix with only 1s on the diagonal and 0s everywhere else

23
Q

Matrix inverse

A

The inverse of a matrix A is the matrix A^(-1) which, if multiplied by A, yields the identity matrix I

24
Q

A matrix is invertible only if…

A

1) it is a square matrix and 2) it has full-rank

25
Q

A rectangular matrix (B) is invertible only if…

A

Strictly speaking, a rectangular matrix has no true inverse. However, if its columns are linearly independent, then B.T * B is a square, full-rank matrix that can be inverted; this is the trick the OLS formula for beta_hat relies on

26
Q

Formula for estimating beta_hat

A

beta_hat = (X.T * X)^(-1) * X.T * Y
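The formula translates directly to NumPy (the design matrix and data here are simulated):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50

# Made-up design matrix (intercept + one predictor) and response
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 3.0]) + rng.normal(0, 0.1, n)

# beta_hat = (X.T X)^(-1) X.T Y
beta_hat = np.linalg.inv(X.T @ X) @ X.T @ y
print(beta_hat.round(2))   # close to the true betas [1.0, 3.0]
```

In practice `np.linalg.lstsq` or `np.linalg.pinv` is preferred over an explicit inverse for numerical stability.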

27
Q

X (the design matrix) should have [more/less] … rows than columns

A

“more”. This means that we should try to have more subjects/observations than parameters

28
Q

T/F: X and Y must have the same first dimension. Also explain your decision.

A

True; we need a Yi for each subject/observation, and since the subjects are stored in the rows of X (the first dimension of X), X and Y must have the same first dimension.

29
Q

Relation between GLM, OLS and BLUE

A

The GLM states that Y = X*b + e. We can estimate the betas using OLS (ordinary least squares). OLS gives us the best linear unbiased estimate (BLUE) IF the conditions specified by the Gauss-Markov theorem are met (namely, that X and Y have a linear connection and that the unexplained signal is well-behaved noise: mean 0, constant variance, uncorrelated)

30
Q

In the context of fMRI data analysis, what would a TWO-sided hypothesis test be interested in?

A

Whether beta is different from 0

31
Q

In the context of fMRI data analysis, what would a ONE-sided hypothesis test be interested in?

A

Whether beta is greater than 0 (or, for the other direction, smaller than 0); i.e., the test looks at one specific direction of the effect

32
Q

What is the meaning behind the p-value?

A

Assuming our null hypothesis is true, how likely are we to obtain a value more extreme than our statistic?

33
Q

Type I and Type II errors

A

Type I: incorrectly rejecting the H0 (false positive); Type II: incorrectly failing to reject the H0 when the HA is true (false negative)

34
Q

Definition of “contrast”

A

The difference between two (groups of) betas

35
Q

Example: What contrast-vector would you need to test whether beta2 differs significantly from 0 (assuming we have beta0, beta1, beta2 and beta3)?

A

1) Start with:

H0: b2 = 0

HA: b2 !=0

2) Re-arrange with 0 on the right; in this case, it is already like that, so we have:

[…]* [betas] = beta2

Since we care about beta2, we set all the other entries in the contrast vector to 0 and only the one for beta2 to 1, so that we pick out only its value > the final answer is [0, 0, 1, 0]

36
Q

What do the betas (beta1, … , betaN) represent?

A

The average change in the dependent variable for a 1-unit increase in the corresponding predictor (X1, … , XN)

37
Q

The intercept (b0) represents (without mean centering)…

A

…the value of Yi (the response) when all the Xs are 0

38
Q

Some info about mean centering the predictors:

A

1) it shifts the scale, but retains the unit
2) the slope remains the same, but the interpretation of the intercept changes to being the mean of the dependent variable Yi
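Both points can be checked numerically (the predictor and betas below are made up):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
x = rng.normal(10, 2, n)                 # made-up predictor (mean ~10)
y = 5 + 0.7 * x + rng.normal(0, 0.5, n)

def fit(x, y):
    X = np.column_stack([np.ones(len(x)), x])
    return np.linalg.lstsq(X, y, rcond=None)[0]

b_raw = fit(x, y)                  # intercept = value of y at x = 0
b_centered = fit(x - x.mean(), y)  # intercept = mean of y; slope unchanged

print(np.isclose(b_centered[0], y.mean()))  # True
print(np.isclose(b_raw[1], b_centered[1]))  # True (same slope)
```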

39
Q

We’ve established that the meaning of b0 changes if we mean center the predictors; does the meaning of b1, … , bN change?

A

No; simply, instead of saying “b1 is how Y changes for a one-unit change in the predictor” we say “b1 is how Y changes for a one-unit change in the mean centered predictor”

40
Q

The canonical HRF (definition)

A

A model of the change in blood-oxygenation-level-dependent (BOLD) signal through time in response to neural activation. It represents how we think a voxel is going to respond to a stimulus.

41
Q

In terms of voxel activity, what does Yi represent?

A

It represents the time series activity of a single voxel (i): we get one activity value (Yi) per time point, which determines the shape of Y.

42
Q

In the context of fMRI data-analysis, we use the t-statistic to…

A

…measure how many standard errors the estimated parameter (beta_hat) lies away from beta0 (the value under the null hypothesis). We want a high t-statistic, because this means that the numerator (beta_hat - beta0) is large relative to SE(beta_hat).

43
Q

LTI theorem of BOLD signal

A

Linearity + Time invariance

1) Linearity: if one stimulus gives response x, then two stimuli close together give the sum of their individual responses; likewise, a neuronal signal with twice the magnitude results in a BOLD signal twice as large
2) Time invariance: it doesn’t matter when the stimulus happens; the response is simply shifted in time
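Both properties can be verified by treating the BOLD response as a convolution with a toy HRF kernel (the kernel values are made up):

```python
import numpy as np

hrf = np.array([0.0, 0.5, 1.0, 0.6, 0.2, -0.1])   # toy HRF kernel

def bold(stim):
    """Model the BOLD response as the stimulus convolved with the HRF."""
    return np.convolve(stim, hrf)[:len(stim)]

stim = np.zeros(30)
stim[4] = 1

# Linearity: doubling the input doubles the response
assert np.allclose(bold(2 * stim), 2 * bold(stim))

# Time invariance: shifting the input only shifts the response
shifted = np.roll(stim, 7)
assert np.allclose(bold(shifted)[7:], bold(stim)[:-7])
```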

44
Q

Example of canonical HRF function

A

double gamma model: the first gamma is about the peak, the second gamma is about the undershoot
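A sketch of a double-gamma HRF using `scipy.stats.gamma` (the delay and ratio parameters are illustrative, not the official SPM defaults):

```python
import numpy as np
from scipy.stats import gamma

def double_gamma_hrf(t, peak_delay=6, undershoot_delay=16, ratio=1/6):
    """First gamma models the peak, second gamma the undershoot;
    the parameter values here are illustrative."""
    peak = gamma.pdf(t, peak_delay)
    undershoot = gamma.pdf(t, undershoot_delay)
    hrf = peak - ratio * undershoot
    return hrf / hrf.max()                 # normalize the peak to 1

t = np.arange(0, 32, 0.1)
hrf = double_gamma_hrf(t)
print(t[np.argmax(hrf)], hrf.min() < 0)    # peaks around 5 s, dips below 0
```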

45
Q

Why do we wish to convolve a signal with the HRF at a high temporal resolution?

A

To avoid lumping several stimuli (each lasting less than our sampling interval) together under one estimate

46
Q

fMRI is a [continuous/discrete] sampled signal

A

discrete

47
Q

Why do we care about resampling the predictor X prior to any analyses

A

The predictor X and the signal Y must be on the same timescale. If that is not the case, we downsample the predictor to the time scale of the signal
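A minimal downsampling sketch, assuming a predictor built at 0.1 s resolution and a TR of 2 s (made-up values and a stand-in signal):

```python
import numpy as np

dt_high, tr = 0.1, 2.0                 # high-res step and scan TR (made up)
t_high = np.arange(1000) * dt_high     # 100 s at 0.1 s resolution
x_high = np.sin(t_high)                # stand-in for a convolved predictor

# Downsample to the scan's timescale: keep one sample per TR
step = round(tr / dt_high)             # 20 high-res samples per TR
x_scan = x_high[::step]

print(len(x_high), len(x_scan))        # 1000 50
```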

48
Q

Which arguments matter most for the canonical HRF?

A

Repetition time (TR); oversampling factor (we first want the HRF on the same ORIGINAL, high-resolution time scale as the predictor; then, once we have the convolved signal, we can resample it once more to the scale of the measured signal); length of the HRF

49
Q

T/F: The HRF is a lot smoother when defined on a less precise time scale

A

False; The HRF is a lot smoother when defined on a more precise time scale

50
Q

MSE formula

A

sum(e^2) / N ; we want a low MSE

51
Q

R^2 formula (coefficient of determination)

A

1- [ sum((y - y_hat) ^2) / sum((y - y.mean)^2) ] ; we want a high R^2. It provides information about the goodness of fit of the model.
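The formula in NumPy (the y and y_hat values below are made up):

```python
import numpy as np

def r_squared(y, y_hat):
    """R^2 = 1 - SS_res / SS_tot: the fraction of variance explained."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1 - ss_res / ss_tot

y = np.array([1.0, 2.0, 3.0, 4.0])
y_hat = np.array([1.1, 1.9, 3.2, 3.8])
print(r_squared(y, y_hat))   # close to 0.98
```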

52
Q

Temporal basis functions

A

They model the BOLD response as a combination of functions; we convolve the predictors with multiple basis functions. For example, when using the canonical HRF plus its derivatives as a basis set, we convolve the predictor with both the original HRF AND with the temporal derivative of the HRF. The advantage is that the temporal derivative can correct lag/onset with more precision than the canonical HRF alone, while the second (dispersion) derivative, if used, adds precision to the width of the BOLD response.
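A sketch of a two-function basis set, using a toy HRF and its temporal derivative (kernel shape and onsets are made up):

```python
import numpy as np

# Toy HRF and its temporal derivative (illustrative, not the canonical HRF)
t = np.arange(0, 20, 0.5)
hrf = t * np.exp(-t / 3.0)
hrf /= hrf.max()
hrf_deriv = np.gradient(hrf, 0.5)      # temporal-derivative basis function

stim = np.zeros(100)
stim[[10, 50]] = 1                     # made-up stimulus onsets

# One regressor per basis function; both enter the design matrix
reg_main = np.convolve(stim, hrf)[:len(stim)]
reg_deriv = np.convolve(stim, hrf_deriv)[:len(stim)]
X = np.column_stack([np.ones(len(stim)), reg_main, reg_deriv])
print(X.shape)   # (100, 3)
```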