Multiple regression Flashcards

(63 cards)

1
Q

What is the purpose of multiple regression?

A

To estimate the value of an outcome variable (Y) based on multiple predictor variables (X)

It extends the principles of simple linear regression to more than one predictor.
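
As a rough illustration of what "estimating Y from several X variables" means, the sketch below fits Y = b0 + b1*X1 + b2*X2 by ordinary least squares. The helper name `fit_ols` and the data are hypothetical and not part of the deck; the marks are generated from an exact linear rule so that the fit should recover the coefficients 1, 3 and 2.

```python
def fit_ols(rows, y):
    """Least-squares coefficients [b0, b1, ..., bk] for y = b0 + b1*x1 + ... + bk*xk."""
    X = [[1.0] + list(r) for r in rows]  # prepend a column of 1s for the intercept
    p = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]

# Hypothetical data: (hours revising, hours of sleep) for six students
predictors = [(2, 6), (4, 7), (6, 5), (8, 8), (10, 6), (12, 9)]
# Marks built from an exact linear rule, purely for illustration
marks = [1 + 3 * x1 + 2 * x2 for x1, x2 in predictors]

b0, b1, b2 = fit_ols(predictors, marks)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

Real data will not lie exactly on a plane, so the fitted coefficients are then the values that minimise the squared residuals rather than an exact recovery.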

2
Q

What are predictor variables in multiple regression?

A

Variables that are used to predict the outcome variable

They can include continuous, ordinal, or binary data.

3
Q

What is the main difference between regression and ANOVA?

A

Regression focuses on relationships between predictor variables and one outcome variable, while ANOVA focuses on differences in mean scores on the dependent variable between the groups defined by one or more categorical independent variables.

4
Q

What is the Forced Entry method in multiple regression?

A

All predictor variables are entered into the model at the same time without a specified order

Known as the Enter method in SPSS.

5
Q

What is Hierarchical Regression?

A

Predictors are entered into the model in a specified order based on previous research

New predictors can be entered all at once, hierarchically, or stepwise.

6
Q

What is the Stepwise method in multiple regression?

A

A controversial method where the order of variable entry is based on statistical criteria rather than prior research.

7
Q

What does the R² value represent in multiple regression?

A

The proportion of variance in the outcome variable accounted for by the model.
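
A minimal sketch of how R² can be computed from observed and predicted values, assuming the predictions have already been produced by some fitted model. The function name `r_squared` and the numbers are hypothetical.

```python
def r_squared(observed, predicted):
    """Proportion of variance in the outcome accounted for by the predictions."""
    mean_y = sum(observed) / len(observed)
    ss_total = sum((y - mean_y) ** 2 for y in observed)                   # total variation
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))  # unexplained variation
    return 1 - ss_residual / ss_total

# Hypothetical exam marks and the marks a model predicted for the same students
observed = [52, 60, 65, 71, 80]
predicted = [54, 59, 66, 70, 79]

r2 = r_squared(observed, predicted)
print(round(r2, 3))
```

Multiplying the result by 100 gives the percentage of variance explained.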

8
Q

What are the assumptions of multiple regression?

A

Sample Size, Variable Types, Non-Zero Variance, Independence, Linearity, No Multicollinearity, Homoscedasticity, Independent Errors, Normally Distributed Errors.

9
Q

What is the rule of thumb for sample size in multiple regression?

A

10 participants for every one predictor variable.

10
Q

What is Multicollinearity?

A

Strong correlation between predictor variables that can make interpreting results difficult.

11
Q

What is the Durbin-Watson Test used for?

A

To test whether the residuals of adjacent observations are correlated, i.e. to check the assumption of independent errors.
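
The Durbin-Watson statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below applies that formula to two hypothetical residual series; the function name `durbin_watson` is illustrative and is not SPSS output.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: ranges from 0 to 4, with 2 meaning no autocorrelation."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals that flip sign every observation (negative autocorrelation)
alternating = [1, -1, 1, -1, 1, -1]
# Hypothetical residuals that drift smoothly (positive autocorrelation)
trending = [3, 2, 1, -1, -2, -3]

dw_alternating = durbin_watson(alternating)
dw_trending = durbin_watson(trending)
print(round(dw_alternating, 2), round(dw_trending, 2))
```

The alternating series gives a value above 2 and the drifting series a value below 2, which matches the usual reading that values below 2 point to positive autocorrelation and values above 2 to negative autocorrelation.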

12
Q

What is homoscedasticity?

A

The variance of the residuals should be constant at each level of the predictor variable.
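
Homoscedasticity is usually judged by eye from a residual plot. As a rough numerical companion, the sketch below compares the residual variance in the upper half of the predicted values with that in the lower half. This split-sample ratio is an illustrative assumption, not a formal test and not something SPSS reports; all names and numbers are hypothetical.

```python
from statistics import variance

def variance_ratio(predicted, residuals):
    """Residual variance for the highest predicted values divided by that for the lowest."""
    pairs = sorted(zip(predicted, residuals))  # order cases by predicted value
    half = len(pairs) // 2
    low = [e for _, e in pairs[:half]]
    high = [e for _, e in pairs[-half:]]
    return variance(high) / variance(low)

predicted = [50, 55, 60, 65, 70, 75, 80, 85]
even_residuals = [1, -1, 1, -1, 1, -1, 1, -1]     # same spread throughout
fanning_residuals = [1, -1, 2, -2, 4, -4, 6, -6]  # spread grows with the prediction

ratio_even = variance_ratio(predicted, even_residuals)
ratio_fanning = variance_ratio(predicted, fanning_residuals)
print(round(ratio_even, 2), round(ratio_fanning, 2))
```

A ratio near 1 is consistent with constant variance, while a ratio far from 1 suggests the funnel shape associated with heteroscedasticity.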

13
Q

What does the term ‘independent errors’ refer to?

A

Residuals for any two observations should not correlate.

14
Q

How should binary predictors be coded in SPSS?

A

Categories must be coded as 0 and 1.
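
A small sketch of 0/1 coding for a binary predictor, here done in Python rather than in the SPSS data editor. The labels and the choice of "no" as the 0 category are assumptions made for illustration.

```python
# Hypothetical raw responses: did the student consume caffeine?
caffeine_labels = ["yes", "no", "no", "yes", "no"]

# Map each category to 0 or 1; the category coded 0 acts as the reference group
coding = {"no": 0, "yes": 1}
caffeine_coded = [coding[label] for label in caffeine_labels]

print(caffeine_coded)  # [1, 0, 0, 1, 0]
```

With this coding, the slope for the predictor is the estimated difference in the outcome between the group coded 1 and the group coded 0, holding the other predictors constant.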

15
Q

What is a common way to check for the assumption of linearity?

A

Inspecting residual plots in SPSS, e.g. a scatterplot of standardised residuals against standardised predicted values.

16
Q

What type of data should predictor variables be in multiple regression?

A

Quantitative (continuous or ordinal) or categorical with two categories (binary).

17
Q

What happens if the assumptions of multiple regression are violated?

A

It can impact the validity of the results and confidence in the findings.

18
Q

What are residuals?

A

The vertical distances between the observed data points and the regression line, i.e. the differences between the observed and predicted values of the outcome.
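
A residual is the observed value minus the value the model predicts. The sketch below assumes a hypothetical fitted model, mark = 20 + 4*hours_revising + 2*hours_sleep; the coefficients and the data are invented for illustration.

```python
# Hypothetical fitted coefficients: intercept, slope for revising, slope for sleep
INTERCEPT, B_REVISING, B_SLEEP = 20, 4, 2

def predict(hours_revising, hours_sleep):
    """Predicted exam mark from the hypothetical regression equation."""
    return INTERCEPT + B_REVISING * hours_revising + B_SLEEP * hours_sleep

# Hypothetical students: (hours revising, hours of sleep, observed mark)
students = [(5, 7, 55), (8, 6, 62), (10, 8, 80)]

residuals = [mark - predict(revising, sleep) for revising, sleep, mark in students]
print(residuals)  # [1, -2, 4]
```

A positive residual means the student scored above the model's prediction and a negative residual means below it.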

19
Q

What does a VIF value greater than 10 indicate?

A

There is likely a multicollinearity problem.
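
For a given predictor, VIF = 1 / (1 - R²), where R² comes from regressing that predictor on all the other predictors. With only two predictors this R² is simply their squared correlation, which is the simplified case sketched below. The function names and data are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors (an assumption of this sketch)."""
    r = pearson_r(x1, x2)
    return 1 / (1 - r ** 2)

hours_revising = [2, 4, 6, 8, 10, 12]
hours_in_library = [3, 4, 7, 8, 11, 12]  # hypothetical near-duplicate of hours_revising
hours_sleep = [6, 7, 5, 8, 6, 9]

vif_overlapping = vif_two_predictors(hours_revising, hours_in_library)
vif_distinct = vif_two_predictors(hours_revising, hours_sleep)
print(round(vif_overlapping, 1), round(vif_distinct, 1))
```

The near-duplicate pair gives a VIF well above 10, while the revising and sleep pair stays close to 1.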

20
Q

What is the significance of checking assumptions in regression analysis?

A

To ensure the model produced is reliable and valid.

21
Q

What is the purpose of using power analysis in regression?

A

To determine an appropriate sample size based on the expected effect size.

22
Q

What does the term ‘heteroscedasticity’ refer to?

A

Unequal variability of a variable across the range of values of a second variable.

23
Q

True or False: The outcome variable in multiple regression must be continuous.

A

True.

24
Q

Fill in the blank: In multiple regression, the regression line is also known as the _______.

A

line of best fit.

25
What is homoscedasticity?
The assumption of constant variance of errors in regression analysis ## Footnote Homoscedasticity is important for ensuring the validity of regression results.
26
What are Independent Errors?
Errors that are not correlated with one another in regression analysis ## Footnote This assumption is crucial for the reliability of regression estimates.
27
What does it mean for errors to be Normally Distributed?
Errors should follow a normal distribution pattern in regression analysis ## Footnote This assumption helps in making accurate inferences from the regression model.
28
Why do assumptions matter in regression analysis?
Violating assumptions can impact the validity of results and confidence in outcomes ## Footnote It is essential to check assumptions to ensure reliable interpretations.
29
What type of predictor is caffeine consumption?
Binary (consumed or not) ## Footnote Binary categorical predictors must be coded as 0 and 1 in SPSS data sets.
30
What are the steps in calculating a multiple regression?
1. Calculate descriptive statistics
2. Produce a correlation matrix
3. Calculate the regression
4. Interpret model fit (R², ANOVA, beta values, intercept)
5. Check assumptions
## Footnote These steps are essential for a thorough analysis.
31
What is R² in regression analysis?
The proportion of variance in the outcome accounted for by the model ## Footnote A higher R² indicates a better model fit.
32
What does ANOVA tell us in regression?
Whether the regression model as a whole is significant, i.e. whether it predicts the outcome significantly better than a model with no predictors ## Footnote ANOVA results include the F statistic and p-value.
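
The F statistic in the ANOVA table is tied to R² by F = (R² / k) / ((1 - R²) / (n - k - 1)), with k predictors and n participants. The sketch below plugs in R² = 0.887 and three predictors from the deck's example; n = 76 is an assumption, since the deck gives 76 only as a target sample size. The function name is hypothetical.

```python
def f_from_r_squared(r2, n, k):
    """Overall-model F statistic for n observations and k predictors."""
    df_model = k
    df_residual = n - k - 1
    return (r2 / df_model) / ((1 - r2) / df_residual)

# R² and k follow the deck's example; n = 76 is assumed for illustration
f_value = f_from_r_squared(r2=0.887, n=76, k=3)
print(round(f_value, 2))
```

Under these assumptions F comes out at roughly 188 on 3 and 72 degrees of freedom; the p-value would then be read from the F distribution, which SPSS reports directly.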
33
What should be checked to ensure the assumptions of regression are met?
1. Sample size
2. Variable types
3. Non-zero variance
4. Independence
5. Linearity
6. No perfect multicollinearity
7. Homoscedasticity
8. Normally distributed errors
## Footnote Each assumption is crucial for valid regression analysis.
34
What is the significance of the Durbin-Watson test?
Tests for autocorrelation of residuals in regression ## Footnote The statistic ranges from 0 to 4. A value of 2 indicates no autocorrelation; values below 2 suggest positive autocorrelation and values above 2 suggest negative autocorrelation.
35
What does a significant ANOVA result indicate?
The regression model as a whole significantly predicts the outcome ## Footnote If the ANOVA is not significant, the model should not be pursued further.
36
What are unstandardised and standardised betas?
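Unstandardised betas give the change in Y, in its original units, for a one-unit change in X with the other predictors held constant; standardised betas express that change in standard deviation units, which allows comparison across predictors ## Footnote Standardised betas are useful for comparing the relative strength of relationships.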
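
An unstandardised slope b can be converted to a standardised beta with beta = b * SD(X) / SD(Y). The sketch below applies that conversion to a hypothetical slope of 2.0 marks per hour of revising; the function name and the data are invented for illustration.

```python
from statistics import stdev

def standardised_beta(b, x, y):
    """Rescale an unstandardised slope into standard deviation units."""
    return b * stdev(x) / stdev(y)

hours_revising = [2, 4, 6, 8, 10, 12]  # hypothetical predictor values
marks = [45, 52, 58, 61, 70, 74]       # hypothetical outcome values
b_revising = 2.0                       # hypothetical unstandardised slope

beta_revising = standardised_beta(b_revising, hours_revising, marks)
print(round(beta_revising, 2))
```

The result reads as the number of standard deviations the mark changes for a one standard deviation increase in revising time, which puts predictors measured in different units on a common scale.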
37
What does a VIF value indicate in regression?
Variance Inflation Factor assesses multicollinearity; a VIF above 10 indicates a problem ## Footnote High VIF values suggest that predictors are too highly correlated.
38
What is the hypothesis regarding caffeine intake in the study?
H1: Caffeine intake will be significantly associated with stats exam performance ## Footnote The null hypothesis (H0) states it will not be significantly associated.
39
What does the coefficient table provide in multiple regression?
Relationships between predictors and the outcome variable, including unstandardised and standardised betas, and t-tests ## Footnote This table is essential for interpreting individual predictor effects.
40
What is the significance of the model fit statistic R² = 0.887?
88.7% of the variance in exam performance is accounted for by the predictors ## Footnote This indicates a strong model fit.
41
What is the impact of violating the assumptions of regression?
It can lead to invalid results and decreased confidence in findings ## Footnote Ensuring assumptions are met is crucial for accurate analysis.
42
What is the purpose of regression analysis?
To examine the association between several predictors and one continuous outcome variable ## Footnote It helps in understanding how predictors influence the outcome.
43
What are the three core groups of statistics to look for in SPSS output?
* Assumption-related statistics
* Model fit (R²) statistics
* Slope-related statistics
## Footnote Assumption-related statistics include VIF/Tolerance for multicollinearity and Durbin-Watson for independent errors. Model fit includes ANOVA results with F and p-values. Slope-related statistics cover unstandardised and standardised betas, as well as t-tests for each predictor.
44
What is the purpose of multiple regression in the context of assessments?
To test the association between multiple predictor variables and one outcome variable ## Footnote In the context provided, three hypotheses are developed to explore the associations.
45
What types of predictor variables can be used in multiple regression?
* Continuous
* Binary
* Ordinal
## Footnote At least one continuous predictor is required.
46
What must the outcome variable be in multiple regression?
Continuous
47
What is the hypothesis structure for multiple regression?
One hypothesis per predictor variable ## Footnote In journal articles, only alternative hypotheses are typically stated.
48
What are the three predictor variables examined in the study regarding stats exam performance?
* Caffeine intake
* Sleep
* Number of hours spent revising
49
In the context of the study, what is the null hypothesis for caffeine intake?
Caffeine intake will not be significantly associated with stats exam performance.
50
What is the method of regression that will be covered this week?
Forced Entry (the Enter method in SPSS)
51
What is the target sample size based on power analysis for three predictors?
76 participants ## Footnote This is based on an alpha of 0.05 and an estimated medium effect size.
52
What are the assumptions of multiple regression?
* Sample Size
* Variable Types
* Non-Zero Variance
* Independence
* Linearity
* Lack of Multicollinearity
* Homoscedasticity
* Independent Errors
* Normally Distributed Errors
53
What is the first step in the analysis plan for multiple regression?
Describe the variables included
54
What type of regression technique is planned to be used in the analysis?
Enter method or hierarchical method
55
True or False: In the pre-registration, null hypotheses need to be included.
False
56
What is the effect of caffeine consumption on stats exam performance according to the hypothesis?
Significantly associated
57
Fill in the blank: The outcome variable in the study is marks on the stats exam, which is a _______ variable.
continuous
58
What is the significance of understanding how to interpret output in multiple regression?
It underpins future content and is important for exams.
59
What is the sample size for a small effect in power analysis with two predictors?
478
60
What is the sample size for a large effect in power analysis with five predictors?
643
61
What statistical analysis is used to determine the desired sample size?
Power analysis
62
What is one of the assumptions that need to be checked for multiple regression?
Lack of Multicollinearity
63
What is the significance of the Durbin-Watson statistic?
It checks for independent errors.