2nd Stats Exam MCQ Flashcards
(137 cards)
What is the violation of the assumption of independence? (Multi Level Modelling)
When one data point is dependent on another data point (one data point gives info to another data point)
Linear regressions assume independence, so when this assumption is violated you use a multilevel model (which, like logistic regression, falls under the generalized linear model)
What does the General Linear Model change to once you start using multi-level modelling?
It becomes part of the GENERALIZED Linear Model
What is the difference in calculation between independent and paired samples t test?
The way SE/variance is calculated
Independent samples - when the SE is calculated, the variance of both groups is incorporated into the calculation = pooled SE
Paired samples - whatever variance occurs at time 1 also occurs at time 2, e.g. individual differences (hunger, mood, time…)
Therefore you do not need to count the variance at both time points. If you did, it would be double counting and would give drastically incorrect p values
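For reference (standard textbook formulas, not taken from the course materials), the two standard errors are built differently:

$$ SE_{\text{pooled}} = \sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}, \qquad s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} $$

$$ SE_{\text{paired}} = \frac{s_D}{\sqrt{n}}, \quad \text{where } D_i = \text{score}_{i,\,\text{time 2}} - \text{score}_{i,\,\text{time 1}} $$

The paired version works on difference scores, so stable individual differences cancel out instead of being counted twice.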
Just like with Binary logistic regression, we assess mixed effects models using what?
Hint: similar to SSE reduced
-2 Log Likelihood
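For reference, this is just the model's log-likelihood multiplied by -2; like SSE, smaller values mean the model sits closer to the data:

$$ -2LL = -2\ln\big(\mathcal{L}(\hat{\theta})\big) $$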
What do we find out from the Estimates of Covariance Parameter box? Specifically, the intercept variance box?
A p value either under .05 (signif) or over .05 (nonsignif)
If the p value is signif it tells us there is SIGNIF VARIANCE in intercepts and hence we were right to conduct a MLM
What is the new value introduced in Binary Logistic R, used in MLM? and what is it equivalent to?
Wald statistic - equivalent to a t score
Calculation:
1) estimate (b value) / SE (found in the Estimates of Covariance Parameters box)
2) then SQUARE the result
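A quick worked sketch with made-up numbers (not from the course materials): if b = 0.50 and SE = 0.20,

$$ \text{Wald} = \left(\frac{b}{SE}\right)^2 = \left(\frac{0.50}{0.20}\right)^2 = 2.5^2 = 6.25 $$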
What does the ‘Empirical Best Linear Unbiased Predictions’ box show us?
Each ppt's u0 estimate - the difference between their most ideal 'intercept' and b0 (the deviation of their ideal intercept from b0)
(Later on, in a model with random intercepts AND random slopes, it will also include each ppt's u1 estimate.)
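In symbols (just restating the card): participant j's 'ideal' intercept is b0 plus their own deviation, and the box reports that deviation:

$$ \text{intercept}_j = b_0 + u_{0j} \quad\Rightarrow\quad u_{0j} = \text{intercept}_j - b_0 $$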
Dependency between data points can also come in the form of what?
Clustering (a general problem) - when a subset of ppts are more connected with each other
Whether clustering exaggerates in favour of our hypothesis, or against it, dependency will always distort our analysis from reality
If 2 related ppts are both in the post-experiment group, what happens? (CLUSTERING)
The 1st ppt's score increases after the experiment, which causes the 2nd ppt's score to increase further through talking to the 1st ppt
So, the experimental group scores are exaggerated
If there are 2 related ppts in study and one is in the control group and one is in the experimental group, what happens? (CLUSTERING)
The 1st ppt's score will increase after the experiment > this leads to an increase in the 2nd ppt's score in the control group through talking to the 1st ppt
This makes the exp group look less effective than it is, because scores in both groups look similar (not much difference to compare)
If 2 related ppts are both in the control group, what happens? (CLUSTERING)
An event outside study could bring the 1st ppts mood up or down, thus affecting the 2nd ppts mood too
This brings both control scores up (exp looks worse) or down (exp looks better)
Either way, the scores are not based on the exp itself!
What is hierarchical clustering? An example?
When data is naturally grouped, at multiple levels
- In schools for example, children within classes talk to each other and children within schools too (but less so)
How do we deal with (hierarchical) clustering?
Modelling the dependencies - include the dependencies (both class and school) in our model as variables (a rough sketch of this follows below)
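A minimal sketch of what this can look like outside SPSS, using simulated data and statsmodels (illustrative only; the column names and numbers are invented, not from the course): a random intercept per school plus a variance component for classes nested within schools.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated, purely illustrative data: 20 schools x 3 classes x 10 children.
rng = np.random.default_rng(0)
rows = []
for school in range(20):
    school_effect = rng.normal(0, 2)        # schools differ from each other
    for classroom in range(3):
        class_effect = rng.normal(0, 1)     # classes within a school differ too (but less so)
        for child in range(10):
            treatment = int(rng.integers(0, 2))
            score = 50 + 3 * treatment + school_effect + class_effect + rng.normal(0, 5)
            rows.append({"score": score, "treatment": treatment,
                         "school": school, "classroom": classroom})
df = pd.DataFrame(rows)

# Random intercept for each school, plus a variance component for classrooms
# nested within schools - i.e. the dependencies are modelled rather than ignored.
model = smf.mixedlm("score ~ treatment", data=df, groups="school",
                    vc_formula={"classroom": "0 + C(classroom)"})
result = model.fit(reml=False)
print(result.summary())
```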
When looking at if there is an effect of cosmetic surgery on quality of life, where patients are within clinics, how does clustering occur? (L7)
- Different clinics - one clinic in one area may be better than the others, so ppts from that clinic start with a higher baseline quality of life
- Different surgeons within clinics - good surgeons boost quality of life a lot more post-surgery in the exp group, compared to bad surgeons. A bad surgeon could do the surgery wrong and worsen quality of life!
Based on the cosmetic surgery example, what are the variables we need to consider? (L7)
- Quality of life
- Surgery
- Clinic
For the cosmetic surgery example, what is the formula for the first model where we do not specify random effects yet? (MLM L7)
NOTE: you can run an MLM the same way as a linear regression
Quality of life = b0 + b1*Surgery
one way of getting -2LL for Linear R
Also called a No random effects / fixed effects model
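A minimal sketch of this step outside SPSS, using simulated stand-in data (the numbers and the column names QoL, Surgery, Clinic are invented for illustration): an ordinary regression gives the baseline -2LL.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated stand-in for the cosmetic surgery data: 10 clinics x 20 patients.
rng = np.random.default_rng(1)
rows = []
for clinic in range(10):
    clinic_effect = rng.normal(0, 3)        # clinics differ in baseline QoL
    for patient in range(20):
        surgery = int(rng.integers(0, 2))
        qol = 60 + 2 * surgery + clinic_effect + rng.normal(0, 5)
        rows.append({"QoL": qol, "Surgery": surgery, "Clinic": clinic})
surgery_df = pd.DataFrame(rows)

# No-random-effects (fixed effects) model: QoL = b0 + b1*Surgery.
fixed_model = smf.ols("QoL ~ Surgery", data=surgery_df).fit()
baseline_minus2ll = -2 * fixed_model.llf    # the baseline -2LL
print(round(baseline_minus2ll, 1))
```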
When does the MLM depart from Linear R?
When we start adding random intercepts
What variable is added to the original GLM formula when we add random intercepts?
u0 variable
the intercept b0 becomes (b0 + u0[variable the varying intercept is based on]), so the model is e.g. (b0 + u0) + b1*Time
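Written out in standard random-intercept notation (with i indexing people and j the grouping variable, e.g. clinic):

$$ Y_{ij} = (b_0 + u_{0j}) + b_1 X_{ij} + e_{ij} $$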
Based on the cosmetic surgery example, what variable is the varying intercept based on?
Clinic > each clinic now has its own intercept (u0) for quality of life
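Continuing the simulated sketch above (same surgery_df; a rough non-SPSS equivalent, not the course's exact SPSS steps), a random intercept per clinic is added by grouping on Clinic:

```python
import statsmodels.formula.api as smf

# Random intercept for each clinic: QoL = (b0 + u0_clinic) + b1*Surgery.
ri_model = smf.mixedlm("QoL ~ Surgery", data=surgery_df, groups="Clinic")
ri_result = ri_model.fit(reml=False)        # ML fit so the -2LL values are comparable
ri_minus2ll = -2 * ri_result.llf
print(ri_result.summary())                  # intercept variance appears as "Group Var"
print(ri_result.random_effects)             # each clinic's u0 deviation from b0
print(round(ri_minus2ll, 1))
```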
Why do we vary the intercepts based on a particular variable in MLM?
By changing the intercept, we are effectively allowing the DV (quality of life) to 'start' at a different point for each level of that variable (each clinic)
Aside from the p value given in the Estimates of Covariance Parameters box, how can we find whether DV (quality of life) varies between u0 variables (clinics)? (this can be a legitimate research q)
Once the variance is found what is the difference called?
Compare how much DV varies with random intercept to how much DV varies without random intercept (u0 variable)
-2LL of the previous model with no random intercepts (linear R model) minus -2LL of the current model with random intercepts
Difference = likelihood ratio (the change in -2LL)
How do you get a chi square p value in the chi square calculator?
The likelihood ratio and DF of 1
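The same calculation outside the online calculator, continuing the simulated sketch above (illustrative only, using the two -2LL values computed earlier): scipy turns the change in -2LL and df = 1 into a chi-square p value.

```python
from scipy.stats import chi2

# Likelihood ratio = baseline -2LL (no random intercepts) minus -2LL with random intercepts.
likelihood_ratio = baseline_minus2ll - ri_minus2ll
p_value = chi2.sf(likelihood_ratio, df=1)   # upper-tail chi-square probability, df = 1
print(round(likelihood_ratio, 2), p_value)
```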
Andy Field says that for testing random effects, the __1__ is much more accurate, so you should rely on this rather than __2__
- Likelihood ratio (diff in -2LL between the random intercepts model and the no random intercepts model (Lin R))
- Wald Z (in the covariance parameters box)
What in SPSS tells us whether there is in fact significant variance in intercepts between participants?
i.e. whether the 'best' intercepts for each participant are significantly different from each other (i.e. whether they needed to be 'random')?
'Estimates of Covariance Parameters' box
look at the p value