Slides 17 ˖⁺‧₊˚♡˚₊‧⁺˖ Flashcards

1
Q

What is the dependent variable in regression analysis?

A

Usually graphed on the y-axis

Also called the outcome, response, criterion, or endogenous variable.

2
Q

What is the independent variable in regression analysis?

A

Usually graphed on the x-axis

Implied to affect the dependent variable; also called the predictor or exogenous variable.

3
Q

How many dependent variables are there in a linear regression model?

A

Only one

4
Q

What does regression analysis provide that correlation does not?

A

An equation of a line for prediction and the ability to include multiple independent variables

5
Q

What are the three major uses for regression analysis?

A
  • Determining the strength of predictors
  • Forecasting an effect
  • Trend forecasting
6
Q

What does a low p-value indicate in regression analysis?

A

Strong evidence to reject the null hypothesis

7
Q

What is the formula for a simple linear regression model?

A

Y = β0 + (β1 * X1) + ε

8
Q

What does β0 represent in a regression equation?

A

The y-intercept

9
Q

What does β1 represent in a regression equation?

A

The slope coefficient of the predictor X1

10
Q

What is the null hypothesis (H0) for linear regression?

A

β1 = 0

11
Q

What is the alternative hypothesis (HA) for linear regression?

A

β1 ≠ 0

12
Q

In regression analysis, what does the slope represent?

A

The relationship between each independent variable and the dependent variable

13
Q

Fill in the blank: Regression analysis takes correlation to the _______.

A

next level

14
Q

What does regression analysis help to untangle?

A

Intricate problems where variables are entangled

15
Q

What is Ordinary Least Squares Regression (OLS)?

A

A method for fitting a line to data in simple linear regression
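
A minimal sketch of what OLS computes for one predictor, using the standard closed-form least-squares formulas. The function name `ols_fit` and the data are hypothetical, chosen only for illustration; the deck's own demos use R's lm() instead.

```python
def ols_fit(xs, ys):
    """Return (b0, b1) that minimize the sum of squared residuals for y = b0 + b1*x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # sxy: how x and y vary together; sxx: how x varies around its own mean
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    b1 = sxy / sxx               # slope
    b0 = y_mean - b1 * x_mean    # intercept: the line passes through (x_mean, y_mean)
    return b0, b1


# Hypothetical data lying exactly on y = 2 + 3x
b0, b1 = ols_fit([0, 1, 2, 3], [2, 5, 8, 11])
print(b0, b1)  # 2.0 3.0
```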

16
Q

What does the p-value represent in the context of regression analysis?

A

The strength of evidence against the null hypothesis

17
Q

What does regression analysis allow you to control for?

A

The potential influence of other variables

18
Q

What is a typical question addressed by regression analysis regarding marketing?

A

How much additional sales income do I get for each additional $1000 spent on marketing?

19
Q

What is the purpose of calculating standard error (SE) in regression?

A

To quantify the range of uncertainty and build confidence intervals

20
Q

In multiple linear regression, how many predictors can be included?

A

Two or more

21
Q

What is the significance of the regression equation?

A

It characterizes the relationship between two variables more holistically

22
Q

True or False: Regression analysis can include more than one independent variable.

A

True

23
Q

What does the p-value test in regression analysis?

A

The null hypothesis that the variable has no correlation with the dependent variable.

24
Q

If the p-value is less than the significance level (alpha = 0.05), what can be concluded?

A

The sample data provide strong evidence to reject the null hypothesis for the entire population.

25
What does a p-value greater than the significance level indicate?
Insufficient evidence in the sample to conclude that a non-zero correlation exists.
26
What is the first step in the regression process?
Graph a scatterplot and check for linearity.
27
What is the purpose of checking residual plots in regression analysis?
To ensure that you have unbiased estimates.
28
What does the R-squared value indicate?
The percentage of the variance in the dependent variable that the independent variables explain collectively.
29
What does an R-squared value of 0% represent?
A model that does not explain any of the variation in the response variable around its mean.
30
What does an R-squared value of 100% represent?
A model that explains all of the variation in the response variable around its mean.
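The 0% and 100% cases can be checked numerically. This sketch uses the usual definition R-squared = 1 - SS_residual / SS_total; the helper name `r_squared` and the data are hypothetical.

```python
def r_squared(ys, predictions):
    """Share of the variation of ys around its mean that the predictions account for."""
    y_mean = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))  # unexplained variation
    ss_tot = sum((y - y_mean) ** 2 for y in ys)                  # total variation
    if ss_tot == 0:
        raise ValueError("ys has no variation, so R-squared is undefined")
    return 1 - ss_res / ss_tot


ys = [2, 5, 8, 11]
perfect = r_squared(ys, [2, 5, 8, 11])            # predictions match every observation
mean_only = r_squared(ys, [6.5, 6.5, 6.5, 6.5])   # always predicting the mean of ys
print(perfect, mean_only)  # 1.0 0.0
```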
31
What does the y-intercept (β0) represent in regression?
The predicted value of the dependent variable when the independent variable is 0.
32
What does the slope coefficient (β1) indicate?
The expected change in the outcome variable for a one-unit change in the predictor variable.
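A small sketch of how the two coefficients are read off a fitted line. The coefficient values below are made up for illustration and do not come from any dataset in the slides.

```python
def predict(b0, b1, x):
    """Predicted outcome for predictor value x on the line y = b0 + b1*x."""
    return b0 + b1 * x


b0, b1 = 10.0, 2.5  # hypothetical intercept and slope

at_zero = predict(b0, b1, 0)                            # equals the intercept b0
one_unit_change = predict(b0, b1, 5) - predict(b0, b1, 4)  # equals the slope b1
print(at_zero, one_unit_change)  # 10.0 2.5
```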
33
What does a positive slope indicate in regression analysis?
As the value of the predictor variable increases, the value of the outcome variable also tends to increase.
34
What does a negative slope suggest?
As the predictor variable increases, the outcome variable tends to decrease.
35
What is the significance of the p-value for the slope in regression?
It indicates whether the relationship between the predictor and the outcome variable is statistically significant.
36
What are the null and alternative hypotheses for the relationship between independent and dependent variables?
H0: β1 = 0 (no relationship); HA: β1 ≠ 0 (there is a relationship).
37
What is the purpose of calculating standard error in regression?
To quantify the uncertainty in the sample's point estimates.
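A sketch of how the slope's standard error can be computed in simple linear regression, assuming the usual formula SE(b1) = sqrt((SSR / (n - 2)) / Sxx). The function name and data are hypothetical. Dividing the slope by its standard error gives the t statistic used in the hypothesis test; turning that into a p-value needs a t distribution, which is left out here.

```python
import math


def slope_and_standard_error(xs, ys):
    """Return (b1, se_b1) for a least-squares line fitted to xs and ys."""
    n = len(xs)
    if n < 3:
        raise ValueError("need at least 3 points for n - 2 degrees of freedom")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    b1 = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sxx
    b0 = y_mean - b1 * x_mean
    ssr = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    residual_variance = ssr / (n - 2)   # estimated variance of the errors
    return b1, math.sqrt(residual_variance / sxx)


# Hypothetical data scattered closely around a line
b1, se = slope_and_standard_error([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8])
t_stat = b1 / se  # number of standard errors between the estimate and 0
print(round(b1, 2), round(se, 4))
```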
38
True or False: R-squared is the only statistic needed to evaluate a regression model.
False.
39
Fill in the blank: The slope tells us how much change in the outcome variable we can expect when the predictor variable changes by _______.
1 unit.
40
What must a regression model satisfy to obtain unbiased coefficient estimates?
The assumptions of OLS linear regression.
41
What does the model predict regarding bids when the starting price is $0.00?
On average, 15.7 bids.
42
What does the relationship between p-value and null hypothesis indicate?
If the p-value is small, we can reject the null hypothesis.
43
What does the coefficient β1 represent in a regression model?
The point estimate for the parameter of the larger population.
44
What is the primary goal of regression analysis?
To determine how well the predictor(s) explain changes in the outcome variable.
45
What is the overall statistical picture in regression analysis?
Each piece of information, such as R-squared and p-values, should be viewed together for a comprehensive understanding.
46
What does the row labeled 'Intercept' in regression output represent?
The hypothesis test information for the y-intercept (beta-naught).
47
What is the purpose of the test statistic for the slope (beta-one)?
To measure how far the estimated slope is from the null value of 0, in units of its standard error.
Footnote: It helps in inference to determine if results from linear regression are significant.
48
What does the row labeled 'unemp' include?
Point estimate and other hypothesis test information for the slope beta-one.
49
What does the row labeled '(Intercept)' represent?
Information for beta-naught, i.e., the y-intercept.
50
At the significance level of alpha = 0.05, what does the analysis suggest about unemployment as a predictor?
There does NOT appear to be strong evidence that unemployment is a good predictor of midterm election losses.
51
What are residuals?
The leftover variation in the data after accounting for the model fit.
52
If an observation is above the regression line, what is true about its residual?
The residual is positive.
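A sketch of the sign convention, taking residual = observed value minus predicted value. The line y = 1 + 2x and the three observations are hypothetical.

```python
def residuals(xs, ys, b0, b1):
    """Observed minus predicted value for each point, given the line y = b0 + b1*x."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


def position(residual):
    """Describe where an observation sits relative to the regression line."""
    if residual > 0:
        return "above the line"
    if residual < 0:
        return "below the line"
    return "on the line"


res = residuals([1, 2, 3], [4.0, 5.0, 6.5], 1.0, 2.0)  # predictions are 3, 5, 7
labels = [position(r) for r in res]
print(res)     # [1.0, 0.0, -0.5]
print(labels)  # ['above the line', 'on the line', 'below the line']
```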
53
What is the goal in selecting the right linear model regarding residuals?
For the residuals to be as small as possible.
54
What is a residuals plot used for?
To check the residuals after fitting the linear model.
55
What does a small negative residual indicate?
That the observation is slightly below the regression line.
56
What are the three conditions that need to be met for OLS regression?
  • Linearity
  • Nearly normal residuals
  • Constant variability
57
What is the first condition for OLS regression?
Linearity.
58
How do you check the linearity condition for OLS regression?
Using a scatterplot of the data before the regression analysis and a residual plot after it.
59
What should the points in a residual plot look like if the linearity condition is met?
Randomly scattered around the horizontal line.
60
What is the second condition for OLS regression?
Nearly normal residuals.
61
How is the condition of nearly normal residuals checked?
By plotting a histogram of the resultant residuals.
62
What is the third condition for OLS regression?
Constant variability.
63
What does constant variability imply in the context of OLS regression?
That the variability of residuals around the 0 line should be roughly constant.
64
What is homoscedasticity?
Constant variability in the residuals.
65
What is heteroscedasticity?
Non-constant variability in the residuals.
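A rough numeric stand-in for eyeballing a residual plot, not a formal test: compare the average squared residual for the larger x values with that for the smaller x values. The helper name `spread_ratio` and both residual sets are hypothetical.

```python
def spread_ratio(xs, residuals):
    """Mean squared residual in the upper half of x divided by that in the lower half."""
    pairs = sorted(zip(xs, residuals))   # order the residuals by their x value
    half = len(pairs) // 2
    if half == 0:
        raise ValueError("need at least 2 points")
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[-half:]]
    low_ms = sum(r * r for r in low) / half
    high_ms = sum(r * r for r in high) / half
    if low_ms == 0:
        raise ValueError("lower-half residuals are all zero")
    return high_ms / low_ms


xs = [1, 2, 3, 4, 5, 6, 7, 8]
steady = spread_ratio(xs, [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5])
fanning = spread_ratio(xs, [0.1, -0.2, 0.3, -0.4, 0.8, -1.0, 1.5, -2.0])
print(steady, fanning > 1)  # 1.0 True
```

A ratio near 1 is consistent with constant variability; a ratio far above 1 matches the cone or trumpet shape where the spread grows with x.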
66
What should you be cautious about when applying regression to time series data?
Whether the observations are independent; successive observations in a time series often are not.
67
What indicates a bad fit in terms of residual patterns?
Non-random residual patterns.
68
What is the implication of detecting outliers in OLS regression?
The linear model may not be appropriate.
69
What does a high R-squared value indicate?
That the model explains a large share of the variance, but it does not guarantee a good fit if the residuals are non-random.
70
What happens to the variability of y when x is larger?
The variability of y is larger when x is larger.
Footnote: This indicates non-constant variability (heteroscedasticity), which shows up as a cone or trumpet shape in the plot.
71
What should be considered when applying regression to time series data?
The underlying structure of the data should be considered.
Footnote: Time series data consists of sequential observations, such as daily stock prices.
72
What are residual plots?
Residual plots are plots of the residuals that make failed regression conditions visible.
Footnote: They help assess the appropriateness of a linear regression model.
73
What is the first step in examining correlation and performing a least squares regression in R?
Download/open the correct dataset.
Footnote: This is essential before any analysis can take place.
74
What is required before running the R demo scripts for regression analysis?
Install the statsr library.
Footnote: Use the command install.packages("statsr") in the script.
75
What basic statistics should be run before performing regression analysis?
Basic descriptive statistics.
Footnote: This includes checking variable names, dataset dimensions, and basic bivariate scatterplots.
76
What does 'eyeballing the best slope for the regression line' involve?
Manually minimizing SSR using plot_ss().
Footnote: This is a preliminary visual method to estimate the slope before using R for calculations.
77
How is the best slope for the regression line calculated in R?
Using the lm() function.
Footnote: This function fits a linear model to the data.
78
What is a key practice after establishing a regression line?
Using the regression line for prediction.
Footnote: This allows for forecasting based on the established model.
79
True or False: Successive observations in time series data are independent.
False.
Footnote: Time series data often exhibit dependencies between observations.
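One way to see such dependence is the lag-1 autocorrelation, which correlates each value with the one that follows it. This is a sketch with a hypothetical helper and made-up series, not a method taken from the slides.

```python
def lag1_autocorrelation(values):
    """Correlation between each value and its successor, around the series mean."""
    n = len(values)
    mean = sum(values) / n
    total = sum((v - mean) ** 2 for v in values)
    if total == 0:
        raise ValueError("the series is constant")
    lagged = sum((values[i] - mean) * (values[i + 1] - mean) for i in range(n - 1))
    return lagged / total


trending = lag1_autocorrelation([1, 2, 3, 4, 5, 6, 7, 8])          # steadily rising series
alternating = lag1_autocorrelation([1, -1, 1, -1, 1, -1, 1, -1])   # flips sign every step
print(trending, alternating)  # 0.625 -0.875
```

Values far from 0 in either direction suggest that neighbouring observations are related, so the independence assumption deserves a closer look.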
80
What does the term SSR stand for in the context of regression analysis?
Sum of Squared Residuals.
Footnote: It is a measure used to assess the fit of a regression model.
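A sketch of SSR as a way to compare candidate lines, which is the idea behind eyeballing a slope and then letting least squares find the best one. The data and the "guess" line are hypothetical; the second pair of coefficients is the least-squares fit for this particular data.

```python
def ssr(xs, ys, b0, b1):
    """Sum of squared residuals for the line y = b0 + b1*x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


xs = [1, 2, 3, 4]
ys = [2.1, 3.9, 6.2, 7.8]

guess = ssr(xs, ys, 0.0, 2.5)     # an eyeballed line
fitted = ssr(xs, ys, 0.15, 1.94)  # least-squares coefficients for this data
nudged = ssr(xs, ys, 0.15, 2.0)   # same intercept, slightly different slope
print(fitted < nudged < guess)    # True
```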
81
What are the two examples provided in the R demo for regression analysis?
Demo 1: mlbbat10, Demo 2: mariokart.
Footnote: These datasets are used to illustrate regression techniques.
82
Fill in the blank: The relationship in time series data often shows as ______.
non-linear.
Footnote: This indicates that traditional linear regression may not be appropriate without adjustments.
83
What is an important note when creating residual plots?
They are called residual plots.
Footnote: This terminology is crucial for understanding regression diagnostics.