Module 5 (Lecture 5, Tutorial 2, Article) Flashcards
(24 cards)
When do you use a Chi-square test (χ²) and what does it measure?
Use a Chi-square test when both the IV and the DV are nominal (non-metric) and you have one group (sample).
It measures whether the observed frequencies differ significantly from the expected frequencies.
- Goal is to test for an association between two nominal variables = Chi-square test (χ², contingency analysis).
- Goal is to predict a nominal (yes/no) DV = logistic regression.
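A minimal Python sketch of a chi-square (contingency) test; the 2×2 table and the 0.05 threshold are made-up illustrations, not from the lecture:

```python
import numpy as np
from scipy import stats

# Hypothetical counts: rows = gender (m/f), columns = bought product (yes/no)
observed = np.array([[30, 20],
                     [25, 35]])

chi2, p, dof, expected = stats.chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}, dof = {dof}")
print("expected frequencies:\n", expected)
# If p < alpha (e.g. 0.05), observed and expected frequencies differ
# significantly, i.e. the two nominal variables are associated.
```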
When do you use a t-test and what does it measure?
Use a t-test when the DV is metric and the goal is to compare means.
It measures whether a mean difference is statistically significant.
- One group (one-sample t-test): tests whether the group mean differs from a known or expected value.
- Two groups (two-sample t-test): tests whether the means of the two groups differ significantly.
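A minimal Python sketch of both cases, using simulated (hypothetical) data:

```python
import numpy as np
from scipy import stats

np.random.seed(1)
group_a = np.random.normal(loc=5.2, scale=1.0, size=30)
group_b = np.random.normal(loc=4.8, scale=1.0, size=30)

# One group: does the mean of group_a differ from an expected value of 5?
t1, p1 = stats.ttest_1samp(group_a, popmean=5.0)

# Two groups: do the means of group_a and group_b differ?
t2, p2 = stats.ttest_ind(group_a, group_b)

print(f"one-sample:  t = {t1:.2f}, p = {p1:.3f}")
print(f"two-sample:  t = {t2:.2f}, p = {p2:.3f}")
# If p < alpha, the mean difference is statistically significant (reject H0).
```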
When do you use an F-test and what does it measure?
The F-test is used when the DV is metric.
- Use an F-test when you have 2 groups and want to test whether the variances of the two groups differ.
- Use an F-test when you have 3+ groups and want to test whether their means differ significantly (used in ANOVA).
- It is also used in regression to test whether the model explains a significant portion of the variance.
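A minimal sketch of the 3+ groups case (one-way ANOVA) with hypothetical data:

```python
import numpy as np
from scipy import stats

np.random.seed(2)
g1 = np.random.normal(5.0, 1.0, 25)
g2 = np.random.normal(5.5, 1.0, 25)
g3 = np.random.normal(6.0, 1.0, 25)

f_stat, p = stats.f_oneway(g1, g2, g3)
print(f"F = {f_stat:.2f}, p = {p:.3f}")
# If p < alpha, at least one group mean differs significantly from the others.
```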
Explain the purpose of a hypothesis test for mean differences
A hypothesis test checks whether an observed difference in means is likely due to random sampling error or reflects a real effect.
Null hypothesis (H0) = there is no difference between the group means.
With a t-test you compare group means.
If the difference is statistically significant, you reject H0.
What are the null and alternative hypothesis in a t-test?
- H0 (null hypothesis): μ1 − μ2 = 0.
- H1 (alternative hypothesis):
  Two-sided: μ1 − μ2 ≠ 0.
  One-sided: μ1 − μ2 < 0 or μ1 − μ2 > 0.
What does the significance level (α) mean?
The significance level is the threshold below which the p-value must fall to reject the null hypothesis.
It represents the probability of making a type I error: rejecting H0 when it is actually true.
Lower α = fewer false positives.
What does it mean if a test result falls in the shaded tail of the bell curve?
It means the result is statistically significant: it is unlikely to occur by random chance under H0, so you reject H0.
What is power (1 − β)?
The chance of correctly detecting a real effect, if it exists.
It’s the chance of correctly rejecting H0 when H1 is true.
Higher power = lower chance of missing a real effect.
Difference between type I error and type II error?
Type I = false positive, you reject H0 when it’s actually true.
Type II = false negative, you fail to reject H0 even though H0 is false.
What affects the α error (false positive)?
- Larger effect size -> lowers the α error.
- Larger sample size -> lowers the α error.
- More data dispersion -> increases the α error.
What is the objective of a regression analysis?
- Measures the slope of the regression line.
- Estimates influence of X on Y.
What is the regression formula?
Y = B0 + B1X
B1 is the slope of the line (change in Y / change in X).
What is the least squares method in regression?
It is the method for finding the best B0 and B1 for the regression line: the values that minimise the squared differences between the actual observations and the regression line.
Steps:
1. Start from the regression formula with the error term: Y = B0 + B1X + u.
2. Rearrange so the error term is isolated: u = Y − B0 − B1X.
3. Minimise the total squared errors (the sum of u² over all observations).
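A minimal sketch with hypothetical data: the closed-form least squares estimates for B0 and B1, checked against numpy's built-in fit.

```python
import numpy as np

np.random.seed(3)
x = np.random.uniform(0, 10, 50)
y = 2.0 + 0.5 * x + np.random.normal(0, 1, 50)   # "true" line plus noise

# Closed-form least squares solution for simple regression
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")

# Same result via numpy's built-in least squares fit
b1_np, b0_np = np.polyfit(x, y, deg=1)
print(f"np.polyfit: b0 = {b0_np:.2f}, b1 = {b1_np:.2f}")
```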
What is R^2 in regression?
R^2 is the goodness-of-fit statistic in regression.
It shows how much of the variance in the DV (Y) is explained by the IV (X).
Formula: R² = (regression coefficient / slope)² × (variance of X / variance of Y).
A higher R^2 means X explains more of the variation in Y.
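A minimal sketch (same kind of hypothetical data as above) showing that the explained-variance definition and the slope formula give the same R²:

```python
import numpy as np

np.random.seed(3)
x = np.random.uniform(0, 10, 50)
y = 2.0 + 0.5 * x + np.random.normal(0, 1, 50)

b1, b0 = np.polyfit(x, y, deg=1)
y_hat = b0 + b1 * x

# 1) Share of variance in Y explained by the model
r2_fit = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

# 2) Formula from the card: slope^2 * (variance of X / variance of Y)
r2_formula = b1 ** 2 * (np.var(x) / np.var(y))

print(f"R^2 (explained variance) = {r2_fit:.3f}")
print(f"R^2 (slope formula)      = {r2_formula:.3f}")
```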
What are the 3 limitations of R^2?
- There are no rules for how high R² needs to be.
- It offers no information about how well the model performs out of sample.
- It says nothing about practical importance (you can have a high R² but a very small slope).
Differences between correlation analysis (3) and regression analysis (3)?
Correlation analysis:
1. Correlation coefficient between -1 and +1.
2. Measures linear correlation between 2 variables.
3. No theory needed (just shows correlation) and not testable.
Regression analysis:
1. Regression coefficient (unconstrained).
2. Measures the linear relationship between one DV and one or more influencing variables.
3. Theoretical understanding necessary (you need to decide which variable influences which) and testable (you can also do causal models).
What is a multiple linear regression?
A statistical method used to examine the relationship between 1 DV (Y) and 2 or more IVs (X1, …).
No multicollinearity: multiple regression assumes that the IVs are not highly correlated with each other.
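A minimal sketch with hypothetical data: a multiple regression with two IVs, plus a variance inflation factor (VIF) check for multicollinearity (the ~5–10 rule of thumb is a common convention, not from the lecture):

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

np.random.seed(4)
df = pd.DataFrame({
    "x1": np.random.normal(0, 1, 100),
    "x2": np.random.normal(0, 1, 100),
})
df["y"] = 1.0 + 0.8 * df["x1"] - 0.3 * df["x2"] + np.random.normal(0, 1, 100)

X = sm.add_constant(df[["x1", "x2"]])   # adds the intercept (B0)
model = sm.OLS(df["y"], X).fit()
print(model.summary())

# VIF per IV; values well above ~5-10 are a common warning sign of multicollinearity
for i, col in enumerate(X.columns):
    if col != "const":
        print(col, round(variance_inflation_factor(X.values, i), 2))
```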
What are the 4 key assumptions of linear regression?
- Linear relationship between DV and IV.
- Error term is normally distributed.
- The model should show homoscedasticity (equal spread of errors across x values).
- Sample size of at least 20 cases per IV.
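One possible way (not from the lecture) to check two of these assumptions on the residuals of a fitted model, using simulated data:

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats
from statsmodels.stats.diagnostic import het_breuschpagan

np.random.seed(5)
x = np.random.uniform(0, 10, 100)
y = 2.0 + 0.5 * x + np.random.normal(0, 1, 100)

X = sm.add_constant(x)
fit = sm.OLS(y, X).fit()
resid = fit.resid

# Normality of the error term (Shapiro-Wilk): p > alpha -> no evidence against normality
w, p_norm = stats.shapiro(resid)
print("Shapiro-Wilk p =", round(p_norm, 3))

# Homoscedasticity (Breusch-Pagan): p > alpha -> no evidence of unequal error spread
bp_stat, bp_p, _, _ = het_breuschpagan(resid, X)
print("Breusch-Pagan p =", round(bp_p, 3))
```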
Which 4 variables increase the likelihood that media outlets report news about corporate social irresponsibility?
- Brand salience (how prominent a brand is in someone’s memory)
- Brand strength
- Level of negative word of mouth
- Domestic brand
On what scale is “gender” coded?
On a nominal scale.
Dummy variables can be used to describe it numerically.
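A minimal sketch with made-up data: coding a nominal variable like gender as a 0/1 dummy so it can be used in a regression.

```python
import pandas as pd

df = pd.DataFrame({"gender": ["female", "male", "female", "male"]})
dummies = pd.get_dummies(df["gender"], drop_first=True).astype(int)  # keeps "male" as a 0/1 dummy
df = pd.concat([df, dummies], axis=1)
print(df)
# "female" becomes the reference category; the dummy's coefficient in a regression
# would measure the difference of "male" relative to "female".
```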
What are 3 advantages of using a multi-item scale compared to single-item scale?
- Fewer variables in your regression formula.
- Higher reliability.
- Higher validity.
Suppose we want to identify the factors that drive willingness to pay, and the independent variable is metric. What kind of econometric analysis could we perform?
Regression analysis
Both IV and DV are metric.
Suppose that we ask our respondents whether they would join the festival (yes or no). The IV is metric.
What kind of analysis could we perform?
Logistic regression.
For non-metric DV (yes/no) and metric IV.
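A minimal sketch with simulated data: a logistic regression with a yes/no DV (join the festival or not) and a metric IV; "age" and the coefficients are hypothetical illustrations.

```python
import numpy as np
import statsmodels.api as sm

np.random.seed(6)
age = np.random.uniform(18, 60, 200)
# Hypothetical data-generating process: younger respondents more likely to join
p_join = 1 / (1 + np.exp(-(3.0 - 0.08 * age)))
join = np.random.binomial(1, p_join)

X = sm.add_constant(age)
logit = sm.Logit(join, X).fit()
print(logit.summary())
# The coefficient on age is on the log-odds scale; a negative sign means the
# probability of joining decreases with age in this simulated example.
```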
What is correlation? What is causality?
Correlation = when two or more events are related to each other and change together.
Causality = when one event contributes to the production of another event. The cause is partly responsible for the effect, and the effect is dependent on the cause.