W7: Intro LMM Flashcards

1
Q

What is the assumption that can be relaxed when using linear mixed models instead of linear regression models?

A

The assumption of independent observations and residual errors

2
Q

What are 2 examples of non-independent observations used by LMMs?

A
  1. Repeated measures (e.g. longitudinal studies)
  2. Individuals within clusters / groups (e.g. people within families / schools = clusters within a higher-order unit)
3
Q

In an intercept only model, what does the intercept represent?

A

Unconditional (not conditioned on predictors) expectation of y
Same as the mean of y
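
A quick check of this in R, as a minimal sketch using the built-in mtcars data (choosing hp as the outcome is only for illustration):

```r
# Intercept-only model: the only coefficient is the intercept
m0 <- lm(hp ~ 1, data = mtcars)

intercept <- coef(m0)[["(Intercept)"]]
raw_mean  <- mean(mtcars$hp)

# The two values agree (up to floating point error)
print(c(intercept = intercept, mean = raw_mean))
```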

4
Q

What are the predictors in this model:
lm(hp ~ 1 + mpg, data = mtcars)

A

1 (the intercept) and mpg (explanatory variable)
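
One way to see that the intercept is a predictor like any other is to look at the design matrix R builds from the formula. A minimal sketch with the built-in mtcars data:

```r
# Design matrix for hp ~ 1 + mpg: one column per predictor
X <- model.matrix(hp ~ 1 + mpg, data = mtcars)

# The "1" in the formula becomes a column of ones named "(Intercept)"
print(colnames(X))
print(head(X, 3))
```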

5
Q

What does inclusion of fixed intercept assume for the mean of residuals?

A

Mean of residuals will be 0

6
Q

What are 2 reasons we fit a constant (fixed intercept) in models?

A
  1. Residuals will be unbiased (mean of 0)
  2. The regression line is free to find its own intercept in a way that minimizes the mean squared error (i.e. the squared vertical distances between the regression line and the data points)
7
Q

What happens if we make the constant 0?
lm(hp ~ 0 + mpg, data = mtcars)

A

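There would be no intercept: the regression line is forced through the origin, which biases the estimates (the residuals generally no longer have a mean of 0)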
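
The effect on the residuals can be checked directly. This is a small sketch using the built-in mtcars data, comparing the mean residual with and without the intercept:

```r
# Same predictor, with and without a fixed intercept
with_int    <- lm(hp ~ 1 + mpg, data = mtcars)
without_int <- lm(hp ~ 0 + mpg, data = mtcars)

# With an intercept the residuals average to (numerically) zero;
# without one, the line is forced through the origin and they do not
mean_res_with    <- mean(resid(with_int))
mean_res_without <- mean(resid(without_int))

print(c(with = mean_res_with, without = mean_res_without))
```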

8
Q

What format do we want our LMM data to be?

A

Long format
One ID can have multiple rows
IDs can have different numbers of rows
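
As an illustration, wide data can be reshaped to long format with base R. The data frame and the column names (stress_1 to stress_3, day) below are made up for this sketch:

```r
# Hypothetical wide data: one row per ID, one column per day
wide <- data.frame(
  ID       = 1:3,
  stress_1 = c(2, 5, 4),
  stress_2 = c(3, NA, 6),
  stress_3 = c(4, NA, NA)
)

# Long format: one row per ID per day
long <- reshape(
  wide,
  direction = "long",
  varying   = c("stress_1", "stress_2", "stress_3"),
  v.names   = "stress",
  timevar   = "day",
  times     = 1:3,
  idvar     = "ID"
)

# Drop days that were never observed, so IDs end up with different numbers of rows
long <- long[!is.na(long$stress), ]
long <- long[order(long$ID, long$day), ]

print(long)
print(table(long$ID))
```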

9
Q

What are 3 conditions of RM data to be met in order to use RM ANOVA to analyze them?

A
  1. Discrete time points (E.g T1, T2, T3)
  2. Everyone has the same number of time points
  3. Outcome is continuous, normally distributed data
10
Q

What are 4 things RM ANOVA can’t handle?

A
  1. Continuous time (e.g. one person completes days 0, 13, 22 and another days 0, 1, 20)
  2. Continuous predictors (e.g. age in years)
  3. Missing data on any time point (people with any missing time point are excluded completely unless data are imputed)
  4. Non-linear outcomes
11
Q

What are 2 variations linear regression can’t capture for non-independent data?

A
  1. Different intercepts (means) by ID (between person variation)
  2. Different slopes (relationship between predictor and outcome) by ID (within person variation)
12
Q

Linear regression has 1 fixed intercept and 1 fixed slope which violates what assumption if it’s used to analyze RM data? This also means it can’t capture what kind of effect?

A

Violates assumption of independence
Can’t capture random effects (regression coefficients that differ across people)
Coefficients include both the intercept and the slope

13
Q

What are 2 other names for LMMs?

A
  1. Multilevel models (MLMs)
  2. Hierarchical linear models (HLMs)
14
Q

Why are LMMs called mixed?

A

Includes both
Fixed effects (reg coeffs identical for everyone) +
Random effects (reg coeffs vary randomly for each ppt)

15
Q

When do you use H (hierarchical) LMs?

A

When you have multiple hierarchical levels (different levels of nesting, e.g. kids nested within classrooms / observations nested within people)

16
Q

All HLMs are LMMs.
Are all LMMs HLMs?

A

No

17
Q

In an intercept only linear regression model, what is the intercept (mean) assumed to be for all IDs?

A

Identical (fixed)

18
Q

What is the equation for linear regression, intercept only model?

A

yi = b0 * 1 + ei

19
Q

What is the value of M and SD for fixed effects?

A

M = estimated mean
SD = 0 (no variation in everybody’s intercept, identical)

20
Q

What is the value of M and SD for random effects?

A

M = estimated mean
SD = estimated SD (SD is free to vary, can be > 0, individual variations in intercept/mean)

21
Q

With mixed models, the total variance is composed of 2 sources of variability:
Between (intercept) +
Within (residual) person variations.
The ratio of between variance to total variance is captured by what?

A

Intraclass correlation coefficient (ICC)
Varies from 0 to 1
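
Written as a formula (the symbols for the two variance components are notation chosen for this sketch, not taken from the cards):

```latex
\mathrm{ICC} = \frac{\sigma^2_{\text{between}}}{\sigma^2_{\text{between}} + \sigma^2_{\text{within}}}
```

Because both variances are non-negative, the ratio is bounded by 0 (no between variance) and 1 (no within variance).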

22
Q

What does ICC = 0 indicate?

A

All variability occurs within individuals
individual means are identical (no between variation)

23
Q

What does ICC = 1 indicate?

A

All variability occurs between individuals
Individual means differ
Within individuals, all values are the same

24
Q

What does ICC of 0.40 tell us?

A

40% of total variance occurs between people
60% of total variance occurs within people

25
Q

What function do you use in R to calculate ICC?

A

iccMixed("dStress", id = "ID", data = d)

26
Q

The residual output from iccMixed indicates what?

A

Within person variance
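
The same two variance components can be pulled out of a random intercept model by hand. This is a sketch on simulated data, assuming the lme4 package is installed; the variable names and the simulated values are made up, and this is not necessarily how iccMixed computes its output internally:

```r
library(lme4)

# Simulate 50 IDs with 10 observations each
set.seed(1)
n_id  <- 50
n_obs <- 10
id <- rep(seq_len(n_id), each = n_obs)
u  <- rnorm(n_id, mean = 0, sd = 2)          # between-person deviations
d  <- data.frame(
  ID     = factor(id),
  stress = 5 + u[id] + rnorm(length(id), sd = 3)  # plus within-person noise
)

m <- lmer(stress ~ 1 + (1 | ID), data = d)

# Variance components: random intercept (between) and residual (within)
vc <- as.data.frame(VarCorr(m))
var_between <- vc$vcov[vc$grp == "ID"]
var_within  <- vc$vcov[vc$grp == "Residual"]

icc <- var_between / (var_between + var_within)
print(icc)
```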

27
Q

Is lower ICC more or less stable from day to day than higher ICC?

A

Less stable (more within variation)

28
Q

How do you interpret the mean/intercept for between effects /variation?

A

Mean/intercept is constant/the same for single person across days
Each person has a different mean/intercept

29
Q

How do you interpret the mean/intercept for within effects /variation?

A

Mean/intercept has daily fluctuations (changes across days)
i.e. deviations from the individual’s own mean (aka residual variance)

30
Q

What is the equation for linear mixed, random intercept only model?

A

yij = b0j * 1 + eij

31
Q

Explain each component of the linear mixed, intercept only model equation:
yij = b0j * 1 + eij

A

yij = outcome observed for a specific unit (j) at a specific time point (i)
- assumed to follow a normal distribution
b0j = estimated intercept (fixed + random) for each unit (j)
eij = residual error for unit (j) at time point (i)

32
Q

What is the equation for b0j (random intercept)?
b0j = ___ + ___

A
  • b0j = γ00 (mean (fixed) intercept) + u0j (individual unit deviations (random) from γ00)
33
Q

How many parameters does intercept only linear mixed model have:
yij = b0j * 1 + eij?

A

3 (fixed intercept, SD of individual intercepts, residual errors)
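
Putting the pieces together, the model and its three parameters can be written out as below. The symbols \sigma_u and \sigma_e are notation assumed for this sketch (the cards only describe them in words), and \gamma_{00} is the mean (fixed) intercept:

```latex
\begin{aligned}
y_{ij} &= b_{0j} + e_{ij}, & b_{0j} &= \gamma_{00} + u_{0j} \\
u_{0j} &\sim N(0, \sigma_u^2), & e_{ij} &\sim N(0, \sigma_e^2)
\end{aligned}
```

The three estimated parameters are the fixed intercept \gamma_{00}, the SD of the individual intercepts \sigma_u, and the residual SD \sigma_e.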

34
Q

Besides the relaxed assumption that observations don’t have to be independent, what is another new assumption of LMMs?

A

Random intercepts are assumed to follow a normal distribution (because we added a new parameter: the SD of individual intercepts)

35
Q

Linear regression models use lm(); what function do LMMs use instead?

A

lmer()

36
Q

What is the equation for stress predicted by fixed and random intercept using lmer()?

A

lmer(stress ~ 1 + (1 | ID), data = d)
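
For a version that runs without the course data, here is a sketch using the sleepstudy data that ships with lme4 (assuming lme4 is installed). Reaction and Subject stand in for stress and ID:

```r
library(lme4)

# Random intercept only model: a fixed intercept plus one random intercept per Subject
m <- lmer(Reaction ~ 1 + (1 | Subject), data = sleepstudy)

# The summary reports the fixed intercept, the SD of the random intercepts,
# and the residual SD
print(summary(m))
```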

37
Q

What does this function show:
fixef(x)

A

Fixed effects coefficients only
1 intercept value

38
Q

What does this function show:
coef(x)

A

Coefficients for each ID (fixed effect + that ID’s random deviation); ranef(x) returns the random deviations only
Many intercept values (one per ID)
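
How the extractor functions relate can be checked on a fitted model. This sketch uses lme4's bundled sleepstudy data (assuming lme4 is installed), and also calls ranef(), which returns each ID's deviation from the fixed intercept:

```r
library(lme4)

m <- lmer(Reaction ~ 1 + (1 | Subject), data = sleepstudy)

fixed      <- fixef(m)[["(Intercept)"]]           # one value for everyone
deviations <- ranef(m)$Subject[["(Intercept)"]]   # one deviation per Subject
per_id     <- coef(m)$Subject[["(Intercept)"]]    # one intercept per Subject

# Each per-ID intercept is the fixed intercept plus that ID's deviation
print(head(data.frame(fixed, deviations, per_id)))
```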

39
Q

What is shrinkage and what does it do to individual estimates?

A

Difference between the model-estimated intercepts (BLUPs) and the actual (raw) means for each ID
Random intercepts tend to shrink individual estimates towards the overall fixed effect estimate

40
Q

What are best linear unbiased predictors (BLUPs) an estimation of?

A

random effects including shrinkage

41
Q

Under which 2 conditions is the degree of shrinkage (difference between BLUPs and raw means) largest?

A
  1. More extreme intercepts (people whose intercept is further away from the fixed intercept)
  2. People with fewer data points within ID
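
Shrinkage can be inspected by putting raw means and BLUPs side by side. The sketch below simulates unbalanced data (IDs with either 2 or 20 observations) and assumes lme4 is installed; all names and simulated values are made up for illustration:

```r
library(lme4)

# 30 IDs, alternating between few (2) and many (20) observations
set.seed(42)
n_per_id <- rep(c(2, 20), times = 15)
id <- rep(seq_along(n_per_id), times = n_per_id)
u  <- rnorm(length(n_per_id), sd = 2)
d  <- data.frame(
  ID     = factor(id),
  stress = 5 + u[id] + rnorm(length(id), sd = 3)
)

m <- lmer(stress ~ 1 + (1 | ID), data = d)

fixed <- fixef(m)[["(Intercept)"]]
res <- data.frame(
  n    = n_per_id,
  raw  = as.vector(tapply(d$stress, d$ID, mean)),  # observed mean per ID
  blup = coef(m)$ID[["(Intercept)"]]               # model-estimated intercept per ID
)
res$shrinkage <- res$raw - res$blup

# Average absolute shrinkage for IDs with few vs many observations
print(tapply(abs(res$shrinkage), res$n, mean))
```

Every BLUP lies between the ID's raw mean and the fixed intercept, so shrinkage always pulls towards the overall estimate.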
42
Q

What function do you use to check diagnostics of LMMs?

A

plot(modelDiagnostics(x, ev.perc = .001))

43
Q

What are the 3 plots from model diagnostics for model with random intercept?

A
  1. Density plot of residuals (assumption of normally distributed residuals + identify outliers)
  2. Plot of residuals against fitted values (assumption of homoscedasticity/equal variance)
  3. Density plot of random effects - titled “ID: (intercept)”
    Assumption that random effects (intercept coefficients) are normally distributed
44
Q

What function do you use to calculate each person’s mean (between) and deviations from their mean (within variables)?

A

dd[!is.na(ID), c("Bstress", "Wstress") := meanDeviations(dStress), by = "ID"]
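
The decomposition itself is simple enough to sketch in base R. This is an illustration of the idea on a tiny made-up data frame, not the packaged function:

```r
# Two IDs with 3 and 2 observations
d <- data.frame(
  ID     = c(1, 1, 1, 2, 2),
  stress = c(2, 4, 6, 1, 3)
)

# Between: each person's own mean, repeated on every row for that person
d$Bstress <- ave(d$stress, d$ID, FUN = mean)

# Within: each observation's deviation from that person's mean
d$Wstress <- d$stress - d$Bstress

print(d)
```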

45
Q

When cleaning for extreme values for stress, should we start with between or within variables first?

A

Within
An extreme stress value at the within level will affect the between level stress value

46
Q

What is step 1 of cleaning extreme values, starting with examining within level stress data?

A

Plot/examine distribution of Wstress data
plot(testDistribution(dd[!is.na(ID)]$Wstress,
  extremevalues = "theoretical", ev.perc = .005))

47
Q

What is step 2 of cleaning EVs?

A

Subset extreme values (pick only rows with EVs)
testDistribution(dd2$Wstress,
  extremevalues = "theoretical", ev.perc = .005)$Data[isEV == "Yes"]
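
To get a feel for what a "theoretical" cut-off means, here is a rough base R sketch. It assumes a normal distribution described by the sample mean and SD, and flags values beyond its 0.5% and 99.5% quantiles; the packaged function may differ in its details, and the data are made up:

```r
# 99 ordinary values plus one obviously extreme value
x <- c(rep(c(-1, 0, 1), 33), 30)
ev_perc <- .005

# Cut-offs from a normal distribution with the sample mean and SD (an assumption)
lower <- qnorm(ev_perc,     mean = mean(x), sd = sd(x))
upper <- qnorm(1 - ev_perc, mean = mean(x), sd = sd(x))

# Flag values outside the cut-offs
is_ev <- x < lower | x > upper
print(x[is_ev])
```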

48
Q

After removing EVs for within level data, what do we do?

A

Recreate between and within person data using meanDeviations because removing EVs on within level changes between level (average) values

49
Q

When examining between level data after cleaning within level data, which rows should we use?

A

Doesn’t matter (the between level value is the same on every row of an ID)
Just keep one row per ID by removing rows where ID is duplicated:
dd.noev[!duplicated(ID)]
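
The same idea on a plain data frame, as a small base R sketch with made-up values:

```r
# Between-level values repeat on every row of an ID
d <- data.frame(
  ID      = c(1, 1, 1, 2, 2),
  Bstress = c(4, 4, 4, 2, 2)
)

# Keep only the first row of each ID: one between-level value per person
between <- d[!duplicated(d$ID), ]
print(between)
```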

50
Q

If there are EVs for within level data, do we exclude entire IDs or specific days?

A

Specific days

51
Q

If there are EVs for between level data, do we exclude entire IDs or specific days?

A

Entire IDs (and all rows associated with that ID)

52
Q

b0j = γ00 + u0j.
u0j follows what distribution with mean and SD of what?

A
  • u0j is assumed to follow a normal distribution (mean = 0, SD = SD of the individual deviations)
53
Q

Will people with more data have a BLUP that is closer / further to the observed mean of their own data / average mean of all people in an intercept only model?

A

Closer to observed mean of own data

54
Q

Will people with less data (say only 1-2 observations) have a BLUP that is closer / further to the observed mean of their own data / average mean of all people?

A

Closer to average mean of all people
(assumed that the mean of their 1-2 data points is likely very noisy/inaccurate due to small sample size)
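
For an intercept only model, the pull towards the overall mean can be written as a weight that grows with the number of observations. The sketch below treats the variance components as known, which is a simplification (a fitted model estimates them), and the function name and all numbers are made up:

```r
# BLUP for one person's intercept in a random intercept only model,
# treating the between and within variances as known (an assumption)
blup_intercept <- function(own_mean, n, grand_mean, var_between, var_within) {
  # Weight on the person's own mean: approaches 1 as n grows
  weight <- var_between / (var_between + var_within / n)
  grand_mean + weight * (own_mean - grand_mean)
}

# Same observed mean (8) and overall mean (5), different amounts of data
few  <- blup_intercept(own_mean = 8, n = 2,  grand_mean = 5,
                       var_between = 1, var_within = 4)
many <- blup_intercept(own_mean = 8, n = 20, grand_mean = 5,
                       var_between = 1, var_within = 4)

print(c(few = few, many = many))  # few = 6, many = 7.5
```

With 2 observations the estimate sits a third of the way from the overall mean to the person's own mean; with 20 observations it sits most of the way there.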