Lecture 11 Flashcards

1
Q

When is it appropriate to use multiple regression? What does it tell us?

A

MR examines the degree of linear association between a set of IVs and ONE DV.
MR allows us to test hypotheses about which specific IVs contribute to variation in the DV. It is appropriate only for analysing linear relationships.

2
Q

Give a clinical example of multiple regression. How is multiple regression clinically important?

A

Which factors (IVs) predict the amount of improvement (DV) after therapy? Examples of IVs include: age, number of sessions, gender, parental involvement, severity of impairment.

Do these factors together predict level of improvement, and which factors contribute significantly to the prediction?

This type of research is important in determining which variables predict the treatment outcome, so that therapy can be planned to maximise that outcome.


3
Q

What is the general model for simple linear regression?

A

Y = C + BX + Error
Y’ = C + BX
(where Y’ is the predicted value of the DV, i.e. the value Y would take if the linear model were perfect and contained no error).

Error is the residual term (i.e. the part of Y not explained by the regression model, i.e. Y - Y’).

4
Q

What is the aim of regression analysis?

A

To determine the linear equation that minimises the sum of squared residuals, where each residual is Y - Y’.
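
A minimal sketch of this aim for a single predictor, using the standard least-squares formulas. The data values and the names `fit_simple_regression`, `sum_squared_residuals`, `sessions` and `improvement` are made up for illustration and do not come from the lecture.

```python
def fit_simple_regression(x, y):
    """Return (C, B) for Y' = C + B*X using the least-squares formulas."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares of X around their means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx            # slope (B)
    c = mean_y - b * mean_x  # intercept (C)
    return c, b


def sum_squared_residuals(x, y, c, b):
    """Sum of (Y - Y')^2 for the line Y' = C + B*X."""
    return sum((yi - (c + b * xi)) ** 2 for xi, yi in zip(x, y))


# Hypothetical data: number of sessions (IV) and improvement score (DV)
sessions = [2, 4, 6, 8, 10]
improvement = [3, 7, 8, 12, 15]

c, b = fit_simple_regression(sessions, improvement)
best = sum_squared_residuals(sessions, improvement, c, b)

# Any other line gives a larger sum of squared residuals
worse = sum_squared_residuals(sessions, improvement, c + 0.1, b)
print(c, b, best, worse)
```

Shifting the intercept or slope away from the fitted values can only increase the sum of squared residuals, which is what "minimises" means here.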

5
Q

What does R represent?

A

The strength of the relationship between the set of predictors (Xs) and DV (Y).

It varies between 0 and 1: When R is 0 there is no linear relationship, and when R is 1 there is a perfect linear relationship.

6
Q

What does R squared represent?

A

The proportion of variance in Y that is explained by the set of IVs (often reported as a percentage).
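
As a rough sketch, R squared can be computed as 1 minus the ratio of residual variation to total variation in Y. The function name `r_squared` and the observed and predicted values below are hypothetical; the predicted values are assumed to come from a least-squares fit with an intercept, in which case R is the square root of R squared.

```python
def r_squared(observed, predicted):
    """Proportion of variance in Y explained by the model: 1 - SS_residual / SS_total."""
    mean_y = sum(observed) / len(observed)
    ss_total = sum((y - mean_y) ** 2 for y in observed)
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1 - ss_residual / ss_total


# Hypothetical DV scores and the Y' values from a fitted regression line
observed = [3, 7, 8, 12, 15]
predicted = [3.2, 6.1, 9.0, 11.9, 14.8]

r2 = r_squared(observed, predicted)
r = r2 ** 0.5  # multiple correlation R (assumes a least-squares fit with an intercept)
print(round(r2 * 100, 1), "% of the variance in Y is explained")
```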

7
Q

What are four requirements that need to be examined before multiple regression takes place?

A
  1. Is the relationship linear?
  2. Is the relationship positive or negative?
  3. Is the relationship weak or strong? (indicated by R-squared)
  4. Are there any outliers?
8
Q

How do you know if there are outliers?

A

The standardised residuals should be approximately normally distributed with a mean of 0 and a standard deviation of 1, with most values falling between about –2 and 2.

Standardised residuals well outside this range (e.g., > 3 or < -3) are more than 3 standard deviations from the mean and may be classed as outliers. They can skew the distribution and affect the regression coefficients.
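
A simplified sketch of this check. It assumes each residual is standardised by dividing it by the standard error of the estimate (the square root of the residual sum of squares over n - k - 1); statistical packages may use slightly different definitions. The function names, the cutoff parameter and the data are illustrative only.

```python
import math


def standardised_residuals(observed, predicted, n_predictors=1):
    """Divide each residual (Y - Y') by the standard error of the estimate."""
    residuals = [y - p for y, p in zip(observed, predicted)]
    df = len(residuals) - n_predictors - 1
    std_error = math.sqrt(sum(e ** 2 for e in residuals) / df)
    return [e / std_error for e in residuals]


def flag_outliers(z_residuals, cutoff=3.0):
    """Return the positions of cases whose standardised residual exceeds the cutoff."""
    return [i for i, z in enumerate(z_residuals) if abs(z) > cutoff]


# Hypothetical data: 20 cases, all predicted to score 10; the last case is far off
predicted = [10.0] * 20
observed = [10.5, 9.5] * 9 + [10.5, 16.0]

z = standardised_residuals(observed, predicted)
print(flag_outliers(z))  # positions of possible outliers
```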

9
Q

What is partial correlation?

A

Partial correlation is a way of examining the (linear) correlation between two variables, X1 and Y, while “controlling” for some other variable (e.g., X2).

It reflects the variance shared by those two variables that is unique to them (and is therefore not also shared with the control variables).

10
Q

Give a clinical example of partial correlation

A

Research question: Is age (IV) related to recovery from stroke (DV)?

  • There seems to be a significant linear relationship between the two variables. However, severity of stroke also increases with age, and therefore severity may be a confounding variable/covariate.
  • After controlling for severity, “age” is no longer significantly related to amount of recovery.
  • Therefore, any effect that “age” has on “recovery” is also shared/explained by “severity”.
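
The stroke example can be sketched with the standard first-order partial correlation formula, which needs only the three pairwise correlations. The correlation values below are invented purely to illustrate the pattern described above; the function name `partial_correlation` is likewise illustrative.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between X and Y after controlling for Z (first-order partial)."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical correlations (not real data)
r_age_recovery = -0.40       # older patients recover less
r_age_severity = 0.60        # severity increases with age
r_severity_recovery = -0.60  # more severe strokes recover less

r_partial = partial_correlation(r_age_recovery, r_age_severity, r_severity_recovery)
print(round(r_partial, 4))
```

With these made-up values the correlation between age and recovery shrinks from -0.40 to close to zero once severity is controlled, which matches the interpretation in the example.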
11
Q

What is the general linear model in multiple regression?

A
Y = C + B1X1 + B2X2 + ... + BkXk + Error
Y' = C + B1X1 + B2X2 + ... + BkXk
12
Q

What does multiple regression assume about residuals?

A

The analysis assumes the residuals have a normal distribution with mean = 0

13
Q

How does multiple regression minimise the error of prediction?

A

It determines the Y intercept and the optimal weights (regression coefficients) for each IV or predictor (X) so that the sum of squared residuals is at a minimum (i.e. so that the observed DV scores are clustered as closely as possible around the regression line).
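
One way to sketch how the intercept and weights can be found is to solve the least-squares "normal equations" directly. This is a plain-Python illustration under stated assumptions, not how any particular statistics package does it; the helper names and the data are hypothetical, and the DV values were generated from known weights (C = 1, B1 = 2, B2 = 0.5, no error) so the result can be checked.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest entry in this column
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate this column from every other row
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_multiple_regression(predictors, y):
    """predictors: one row [X1, ..., Xk] per case. Returns [C, B1, ..., Bk]."""
    design = [[1.0] + list(row) for row in predictors]  # leading 1 for the intercept
    k = len(design[0])
    # Normal equations: (X'X) * weights = X'y
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical cases: [X1, X2] and a DV built as Y = 1 + 2*X1 + 0.5*X2
predictors = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6]]
y = [4.0, 5.5, 9.0, 10.5, 14.0]

weights = fit_multiple_regression(predictors, y)
print([round(w, 3) for w in weights])
```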

14
Q

What is the null hypothesis in multiple regression analysis?

A

That all regression coefficients are 0 (i.e. that there is no linear relationship between any of the IVs and the DV).

15
Q

What can we conclude from a significant F ratio in multiple regression analysis?

A

That at least one of the IVs has a linear relationship with the DV. However, the F test is an omnibus test: it doesn’t tell us which IVs are significantly (and linearly) related to the DV.
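
For reference, the overall F ratio can be computed from R squared, the number of cases (n) and the number of IVs (k). The sketch below uses the standard formula; the function name and the example values are made up, and the p value is not computed because it requires the F distribution, which is not in the Python standard library.

```python
def f_ratio(r_squared, n_cases, n_predictors):
    """Overall F for a multiple regression: (R^2 / k) / ((1 - R^2) / (n - k - 1))."""
    df_regression = n_predictors
    df_residual = n_cases - n_predictors - 1
    return (r_squared / df_regression) / ((1 - r_squared) / df_residual)


# Hypothetical study: 24 cases, 3 IVs, R squared of 0.50
f = f_ratio(0.5, 24, 3)
print(round(f, 2), "on 3 and 20 degrees of freedom")
```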

16
Q

How do we know if each individual IV makes a unique contribution to variance in the DV? (Two ways)

A
  1. Each IV has a t score from SPSS. If the p value associated with the t score is <0.05, then we can conclude the IV is a significant predictor and does explain a non-zero proportion of variance in the DV.
  2. Another way of viewing this significant IV is that the B coefficient for that IV in the population is significantly different from zero.
17
Q

If more than one IV does contribute significantly to the prediction of the DV, how can you tell which IV explains most of the variance in the DV?

A

The beta weights reflect the impact on the DV after standardising all of the variables. The size of the beta weight will reflect the strength of the linear association with the DV regardless of differences in scale of measurement.
Beta weights usually lie between -1 and 1. The larger the absolute value, the more impact the IV has on the DV (a near-zero weight means the IV has little impact on the DV); the sign gives the direction of the relationship.
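
A small sketch of how an unstandardised B can be converted to a beta weight: multiply B by the ratio of the IV's standard deviation to the DV's standard deviation. The data and names are hypothetical, and the B value is assumed to come from a least-squares fit to these same data.

```python
import statistics


def beta_weight(b, x_values, y_values):
    """Standardised weight: B rescaled by SD(X) / SD(Y)."""
    return b * statistics.stdev(x_values) / statistics.stdev(y_values)


# Hypothetical data: number of sessions (IV) and improvement score (DV)
sessions = [2, 4, 6, 8, 10]
improvement = [3, 7, 8, 12, 15]
b_sessions = 1.45  # unstandardised B, assumed from a least-squares fit to these data

beta = beta_weight(b_sessions, sessions, improvement)
print(round(beta, 3))
```

Because beta is expressed in standard deviation units, IVs measured on different scales can be compared directly.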

18
Q

What is squared semipartial correlation?

A

Another way of evaluating the contribution of each IV. It is a value between 0 and 1 assigned to each IV, corresponding to the proportion of the total variance in the DV explained uniquely by that IV.
The larger the sr-squared, the more variance is explained by that factor (e.g. an sr-squared of 0.35 means that 35% of the total variance is explained uniquely by that one variable).
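
For the two-predictor case, sr-squared can be sketched from the pairwise correlations alone, and it equals the drop in R squared when that IV is removed from the model. The correlation values and function names below are hypothetical and chosen only for illustration.

```python
import math


def r_squared_two_predictors(r_y1, r_y2, r_12):
    """R squared for a model with two IVs, from the three pairwise correlations."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


def semipartial(r_y1, r_y2, r_12):
    """Semipartial correlation of X1 with Y, with X2 removed from X1 only."""
    return (r_y1 - r_y2 * r_12) / math.sqrt(1 - r_12 ** 2)


# Hypothetical correlations: Y = recovery, X1 = age, X2 = severity
r_y1, r_y2, r_12 = -0.40, -0.60, 0.60

sr2_age = semipartial(r_y1, r_y2, r_12) ** 2
# Same quantity as the loss in R squared when age is dropped from the model
r2_change = r_squared_two_predictors(r_y1, r_y2, r_12) - r_y2 ** 2
print(round(sr2_age, 4), round(r2_change, 4))
```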

19
Q

What is a practical issue with multiple regression?

A

While we can detect a significant linear relationship between an IV and a DV, we cannot infer causation.

20
Q

What is homoscedasticity?

A

The variance of the residuals (the scatter of DV scores around the regression line) is the same at every level of each IV. That is, there is an equal amount of variance around each section of the regression line.

21
Q

What is partial correlation?

A

Partial correlation is a way of examining the (linear) correlation between two variables, X1 and Y, while “controlling” for some other variable (e.g., X2), or variables (X2, X3, … Xk). The partial correlation reflects the association between the two target variables after taking into account any association those two variables have with the control variable/s.

22
Q

What is Standard MR?

A

All IVs are entered into the MR equation at once. Each IV is assessed for its contribution to explaining variance in the DV. Shared and unique variance is assessed.

23
Q

What is Hierarchical or Sequential MR?

A

The IVs are entered into the equation in an order where high priority or theoretically important variables are entered first. Each IV is assessed in terms of how much additional variance in the DV is explained by that IV when entered at that point in the equation. For example, does parental support predict improvement in language skills associated with speech therapy after controlling for the effects of number of sessions? Number of sessions would be entered first in the hierarchical MR, perhaps with other variables such as age and intelligence level of the child, followed by the measure of parental support. To answer “yes” to our question, we would expect the measure of parental support to be a significant predictor.

24
Q

What is stepwise MR?

A

The order in which the IVs are entered is determined by statistical criteria rather than by the researcher: the IV with the best prediction is entered first, the IV that adds the next best prediction is added second, and so on. In stepwise MR, the focus is on yielding the most accurate prediction possible, rather than a solution that focuses on the theoretical importance of a specific variable.