READING 10 SIMPLE LINEAR REGRESSION Flashcards

(68 cards)

1
Q

What is the primary purpose of simple linear regression?

(A) To determine if two variables are related, without quantifying the relationship.
(B) To explain the variation in a dependent variable using the variation in a single independent variable.
(C) To predict future values of an independent variable based on a dependent variable.
(D) To analyze the correlation between multiple independent variables.

A

(B) To explain the variation in a dependent variable using the variation in a single independent variable.

Simple linear regression aims to model how changes in one variable (independent) are associated with changes in another (dependent), thus explaining the dependent variable’s variation.

2
Q

In a simple linear regression model, the variable whose variation is being explained is called the:

(A) Independent variable.
(B) Explanatory variable.
(C) Dependent variable.
(D) Predictor variable.

A

(C) Dependent variable.

The dependent variable is the outcome or response variable that we are trying to understand or predict. Its variation is what we are modeling.

3
Q

In a simple linear regression model, the variable used to explain the variation in the dependent variable is called the:

(A) Response variable.
(B) Dependent variable.
(C) Endogenous variable.
(D) Independent variable.

A

(D) Independent variable.

The independent variable is the variable that is believed to influence or explain the changes in the dependent variable. It’s the “cause” in our simple linear model.

4
Q

“Variation” in the context of linear regression refers to:

(A) The difference between the highest and lowest values of a variable.
(B) The standard deviation of a variable.
(C) The degree to which a variable differs from its mean value.
(D) The correlation between two variables.

A

(C) The degree to which a variable differs from its mean value.

Variation, often quantified by the sum of squared deviations from the mean, describes the spread or dispersion of the data points around the average.

5
Q

Suppose you are trying to predict a company’s stock price using its earnings per share (EPS). In this scenario, the dependent variable is:

(A) Earnings per share (EPS).
(B) The relationship between stock price and EPS.
(C) The prediction error.
(D) The company’s stock price.

A

(D) The company’s stock price.

We are trying to predict the stock price, so it is the variable being explained (dependent). EPS is the factor we are using for the prediction (independent).

6
Q

Another term often used to describe the independent variable in a regression analysis is:

(A) Residual.
(B) Intercept.
(C) Explanatory variable.
(D) Coefficient of determination.

A

(C) Explanatory variable.

The independent variable is used to explain the changes in the dependent variable, hence the term “explanatory variable.”

7
Q

Another term often used to describe the dependent variable in a regression analysis is:

(A) Slope.
(B) Regressor.
(C) Predicted variable.
(D) Error term.

A

(C) Predicted variable.

The dependent variable is the one we are trying to predict using the regression model.

8
Q

Understanding the difference between the dependent and independent variable is crucial because:

(A) It determines the scale of the regression coefficients.
(B) It dictates the direction of the hypothesized causal relationship being modeled.
(C) It affects the calculation of the correlation coefficient.
(D) It is only important for multiple regression, not simple linear regression.

A

(B) It dictates the direction of the hypothesized causal relationship being modeled.

The choice of which variable is dependent and which is independent reflects the assumed direction of influence. We are modeling how X affects Y, not the other way around.

9
Q

In the simple linear regression model Yi = b0 + b1(Xi) + εi, the term εi represents the:

(A) Predicted value of the dependent variable.
(B) Slope of the regression line.
(C) Residual or error term for the i-th observation.
(D) Intercept of the regression line.

A

(C) Residual or error term for the i-th observation.

10
Q

The primary goal of the Ordinary Least Squares (OLS) method in linear regression is to:

(A) Maximize the correlation between the independent and dependent variables.
(B) Minimize the sum of the absolute errors.
(C) Minimize the sum of the squared errors.
(D) Ensure that the residuals are normally distributed.

A

(C) Minimize the sum of the squared errors.

OLS estimates the regression coefficients by finding the line that minimizes the sum of the squared differences between the actual and predicted values of the dependent variable (the sum of squared errors, SSE).
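
A minimal sketch of this idea, assuming Python with NumPy; the arrays X and Y are made-up sample data and the helper sse() is hypothetical. It shows that the closed-form OLS coefficients give a smaller SSE than nearby candidate lines:

    import numpy as np

    # Hypothetical sample data
    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    Y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])

    # OLS closed-form estimates: b1 = Cov(X, Y) / Var(X), b0 = mean(Y) - b1 * mean(X)
    b1 = np.cov(X, Y, ddof=1)[0, 1] / np.var(X, ddof=1)
    b0 = Y.mean() - b1 * X.mean()

    def sse(intercept, slope):
        """Sum of squared errors for a candidate line Y_hat = intercept + slope * X."""
        return np.sum((Y - (intercept + slope * X)) ** 2)

    # The OLS line should have a lower SSE than slightly perturbed lines
    print(sse(b0, b1))          # smallest
    print(sse(b0 + 0.5, b1))    # larger
    print(sse(b0, b1 + 0.1))    # larger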

11
Q

The intercept (b0) in a simple linear regression represents the:

(A) Change in the dependent variable for a one-unit change in the independent variable.
(B) Predicted value of the dependent variable when the independent variable is zero.
(C) Average value of the dependent variable.
(D) Standard deviation of the dependent variable.

A

(B) Predicted value of the dependent variable when the independent variable is zero.

The intercept (b0) is the estimated value of the dependent variable (Y) when the independent variable (X) is equal to zero.

12
Q

The slope coefficient (b1) in a simple linear regression represents the:

(A) Predicted value of the dependent variable when the independent variable is one.
(B) Baseline value of the dependent variable when the independent variable is zero.
(C) Change in the dependent variable for a one-unit change in the independent variable.
(D) Average change in the independent variable.

A

(C) Change in the dependent variable for a one-unit change in the independent variable.

The slope coefficient (b1) quantifies the change in the dependent variable (Y) associated with a one-unit increase in the independent variable (X).

13
Q

The sum of the residuals (∑(Yi − Y^i)) in an OLS regression is typically:

(A) Minimized and always positive.
(B) Maximized.
(C) Equal to zero.
(D) Equal to the sum of squared errors.

A

(C) Equal to zero.

One of the properties of OLS is that the sum of the residuals (the differences between the actual and predicted values) is equal to zero.
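
A quick numerical check of this property, as a sketch assuming NumPy; any data fit with an intercept would do, and the X and Y arrays here are hypothetical:

    import numpy as np

    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    Y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])

    # OLS fit (np.polyfit returns [slope, intercept] for degree 1)
    b1, b0 = np.polyfit(X, Y, 1)
    residuals = Y - (b0 + b1 * X)

    print(residuals.sum())  # ~0, up to floating-point error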

14
Q

If the slope coefficient in a simple linear regression is positive, it indicates that:

(A) An increase in the independent variable is associated with a decrease in the dependent variable.
(B) There is no linear relationship between the two variables.
(C) An increase in the independent variable is associated with an increase in the dependent variable.
(D) The intercept of the regression line is also positive.

A

(C) An increase in the independent variable is associated with an increase in the dependent variable.

A positive slope coefficient signifies a direct, positive linear relationship: as the independent variable increases, the dependent variable tends to increase as well.

15
Q

The formula for the estimated slope coefficient is directly related to the:

(A) Variance of the dependent variable.
(B) Covariance between the independent and dependent variables and the variance of the independent variable.
(C) Correlation coefficient squared.
(D) Sum of squared errors.

A

(B) Covariance between the independent and dependent variables and the variance of the independent variable.
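
In formula form, b^1 = Cov(X, Y) / Var(X) and b^0 = mean(Y) − b^1 × mean(X). A minimal sketch (NumPy, hypothetical data) showing that the covariance-over-variance calculation agrees with a generic least-squares fit:

    import numpy as np

    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    Y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])

    # Slope from the definition: sample covariance over sample variance
    b1_manual = np.cov(X, Y, ddof=1)[0, 1] / np.var(X, ddof=1)

    # Same slope from a generic least-squares fit
    b1_polyfit, _ = np.polyfit(X, Y, 1)

    print(b1_manual, b1_polyfit)  # the two values match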

16
Q

The intercept term in a regression model should be interpreted with caution when:

(A) The slope coefficient is statistically significant.
(B) The independent variable never takes on a value close to zero within the observed data range.
(C) The correlation between the variables is high.
(D) The sample size is large.

A

(B) The independent variable never takes on a value close to zero within the observed data range.

If the value of zero for the independent variable is far outside the range of the observed data, the intercept may not have a meaningful real-world interpretation. The linear relationship might not hold true at such extreme values.

17
Q

The difference between the observed value of the dependent variable (Yi) and the predicted value (Y^i) is known as the:

(A) Explained variation.
(B) Total variation.
(C) Residual or error.
(D) Regression coefficient.

A

(C) Residual or error.

18
Q

In a simple linear regression, if there is a perfect positive linear relationship between the independent and dependent variables, the sum of squared errors (SSE) will be:

(A) Positive and large.
(B) Positive and small.
(C) Equal to zero.
(D) Equal to the total variation.

A

(C) Equal to zero.

A perfect linear relationship means all data points lie exactly on the regression line. In this ideal scenario, there are no deviations between the actual and predicted values, resulting in a sum of squared errors (SSE) of zero.

19
Q

In a simple linear regression where a stock’s return is regressed against the market return, a slope coefficient of 1.2 indicates that for every 1% increase in the market return, the stock’s return is expected to:

(A) Decrease by 1.2%.
(B) Increase by 1.2%.
(C) Remain unchanged.
(D) Increase by 0.2%.

A

(B) Increase by 1.2%.

The slope coefficient represents the change in the dependent variable (stock’s return) for a one-unit change in the independent variable (market return). A slope of 1.2 means a 1% increase in the market return is associated with a 1.2% expected increase in the stock’s return.

20
Q

In a regression of bond yield on the policy interest rate, a slope coefficient of 0.8 implies that a 100 basis point increase in the policy interest rate is expected to lead to a change in the bond yield of:

(A) An increase of 0.8 basis points.
(B) An increase of 80 basis points.
(C) A decrease of 0.8 basis points.
(D) A decrease of 80 basis points.

A

(B) An increase of 80 basis points.

The slope of 0.8 means for every one-unit (1 basis point in this case) increase in the policy interest rate, the bond yield is expected to increase by 0.8 units (0.8 basis points). Therefore, a 100 basis point increase would lead to an expected increase of 0.8 * 100 = 80 basis points.

21
Q

When interpreting a regression intercept, it is most important to consider:

(A) The magnitude of the slope coefficient.
(B) Whether the value of zero for the independent variable is within the relevant data range.
(C) The correlation coefficient between the variables.
(D) The statistical significance of the slope coefficient.

A

(B) Whether the value of zero for the independent variable is within the relevant data range.

If the independent variable rarely or never takes on a value near zero in the observed data, the intercept may not have a practical or meaningful interpretation within the context of the model.

22
Q

An intercept of -0.5% in a regression of a company’s sales on advertising expenditure suggests that if advertising expenditure is zero, the company’s sales are expected to be:

(A) 0.5% higher than average.
(B) Zero.
(C) 0.5% lower than average.
(D) -0.5%.

A

(D) -0.5%.

The intercept is the predicted value of the dependent variable (sales) when the independent variable (advertising expenditure) is zero. Therefore, expected sales would be -0.5%. Note that in a real-world scenario, negative sales might not be economically meaningful, highlighting the caution needed in interpreting intercepts.

23
Q

A slope coefficient of -0.3 in a regression of product demand on price indicates that a $1 increase in price is expected to lead to a:

(A) Decrease in demand of 0.3 units.
(B) Increase in demand of 0.3 units.
(C) No change in demand.
(D) Decrease in demand of 3 units.

A

(A) Decrease in demand of 0.3 units.

A negative slope indicates an inverse relationship. A slope of -0.3 means that for every $1 increase in price, the demand for the product is expected to decrease by 0.3 units.

24
Q

The magnitude of the slope coefficient in a simple linear regression directly indicates the:

(A) Strength of the linear relationship.
(B) Statistical significance of the relationship.
(C) Sensitivity of the dependent variable to a one-unit change in the independent variable.
(D) Proportion of the total variation in the dependent variable explained by the model.

A

(C) Sensitivity of the dependent variable to a one-unit change in the independent variable.

The slope coefficient’s magnitude quantifies how much the dependent variable is expected to change for each unit change in the independent variable. It reflects the sensitivity of Y to X.

25
Q

In the context of regressing a stock’s excess return on the market’s excess return, the slope coefficient is often referred to as:

(A) Alpha.
(B) Standard deviation.
(C) Beta.
(D) R-squared.

A

(C) Beta.

In the Capital Asset Pricing Model (CAPM) framework, the slope coefficient from regressing a stock’s excess return on the market’s excess return is the stock’s beta, a measure of its systematic risk.

26
Q

A regression analysis yields an intercept of 10 and a slope of 2. If the independent variable has a value of 5, the predicted value of the dependent variable is:

(A) 2.
(B) 10.
(C) 20.
(D) 30.

A

(C) 20.

Using the regression equation Y^ = b^0 + b^1(X), with b^0 = 10, b^1 = 2, and X = 5: Y^ = 10 + (2 × 5) = 10 + 10 = 20.

27
Q

When interpreting a slope coefficient, its sign indicates the:

(A) Strength of the relationship.
(B) Statistical significance of the relationship.
(C) Direction of the relationship (positive or negative).
(D) Amount of unexplained variation.

A

(C) Direction of the relationship (positive or negative).

The sign of the slope coefficient (+ or -) tells us whether the relationship between the independent and dependent variables is direct (positive) or inverse (negative).

28
Q

A researcher regresses a company's profit on its research and development (R&D) expenditure and finds a statistically significant slope coefficient of 0.8. This indicates that:

(A) An increase in profit causes an increase in R&D expenditure.
(B) For every $1 increase in R&D expenditure, profit is expected to increase by $0.80.
(C) 80% of the variation in profit is explained by R&D expenditure.
(D) Profit is always 80% of R&D expenditure.

A

(B) For every $1 increase in R&D expenditure, profit is expected to increase by $0.80.

The slope coefficient of 0.8 implies that for each $1 increase in the independent variable (R&D expenditure), the dependent variable (profit) is expected to increase by $0.80. The statistical significance indicates that this relationship is unlikely to be due to random chance.

29
Q

Which of the following is a key assumption of simple linear regression regarding the relationship between the dependent and independent variables?

(A) The relationship is non-linear and monotonic.
(B) The relationship is linear.
(C) The relationship is exponential.
(D) There is no relationship between the variables.

A

(B) The relationship is linear.

A fundamental assumption of simple linear regression is that the relationship between the dependent and independent variables can be adequately modeled by a straight line.

30
Q

A researcher fits a linear regression model to data where the true underlying relationship is U-shaped. What pattern would likely be observed in a plot of the residuals against the independent variable?

(A) A random scatter of points around zero.
(B) A linear trend, either positive or negative.
(C) A systematic, non-linear pattern (e.g., U-shaped or inverted U-shaped).
(D) No discernible pattern.

A

(C) A systematic, non-linear pattern (e.g., U-shaped or inverted U-shaped).

When a linear model is fit to non-linear data, the residuals will often exhibit a systematic pattern that mirrors the non-linearity in the underlying relationship. In this case, a U-shaped pattern in the residuals would be expected.

31
Q

A random scatter of residuals around zero in a plot of residuals against the independent variable suggests that the assumption of:

(A) Homoskedasticity is violated.
(B) Autocorrelation is present.
(C) Linearity of the relationship is likely met.
(D) Normality of residuals is violated.

A

(C) Linearity of the relationship is likely met.

A random scatter of residuals indicates that the errors are not systematically related to the independent variable, supporting the assumption that a linear model is appropriate for the relationship.

32
Q

Violation of the linearity assumption in a linear regression model can lead to:

(A) Unbiased but inefficient coefficient estimates.
(B) Biased coefficient estimates and unreliable predictions.
(C) Inflated standard errors of the coefficients.
(D) Difficulty in interpreting the intercept.

A

(B) Biased coefficient estimates and unreliable predictions.

If the true relationship is non-linear but a linear model is used, the model will not accurately capture the relationship, leading to biased estimates of the coefficients and unreliable predictions of the dependent variable.

33
Q

The assumption of homoskedasticity in linear regression requires that the:

(A) Dependent variable has a constant variance.
(B) Independent variable has a constant variance.
(C) Variance of the residual terms is constant across all levels of the independent variable(s).
(D) Residual terms are normally distributed.

A

(C) Variance of the residual terms is constant across all levels of the independent variable(s).

Homoskedasticity means "same scatter" and refers to the condition where the spread or variance of the prediction errors (residuals) is constant across all values of the independent variable(s).

34
Q

A violation of the homoskedasticity assumption is known as:

(A) Autocorrelation.
(B) Multicollinearity.
(C) Heteroskedasticity.
(D) Non-normality.

A

(C) Heteroskedasticity.

Heteroskedasticity occurs when the variance of the residual terms is not constant across the levels of the independent variable(s).

35
Q

In a plot of residuals against the independent variable, what pattern would suggest heteroskedasticity?

(A) A random scatter of points around zero with a consistent spread.
(B) A horizontal band of residuals centered around zero.
(C) A funnel shape, where the spread of residuals increases or decreases as the independent variable changes.
(D) A cyclical pattern in the residuals.

A

(C) A funnel shape, where the spread of residuals increases or decreases as the independent variable changes.

A funnel shape in the residual plot, where the variance of the residuals changes systematically with the independent variable, is a common visual indicator of heteroskedasticity.

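A sketch of the kind of funnel-shaped residual plot described here, assuming NumPy and Matplotlib; the data are simulated, with the error spread deliberately made to grow with X:

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(0)
    X = np.linspace(1, 100, 200)
    # Simulated heteroskedastic errors: standard deviation grows with X
    Y = 2.0 + 0.5 * X + rng.normal(0, 0.05 * X)

    b1, b0 = np.polyfit(X, Y, 1)
    residuals = Y - (b0 + b1 * X)

    plt.scatter(X, residuals, s=10)
    plt.axhline(0, color="black", linewidth=1)
    plt.xlabel("Independent variable (X)")
    plt.ylabel("Residual")
    plt.title("Funnel-shaped residuals suggest heteroskedasticity")
    plt.show()
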
36
Q

Heteroskedasticity in a linear regression model can lead to:

(A) Biased but efficient coefficient estimates.
(B) Unbiased but inefficient coefficient estimates and unreliable standard errors.
(C) Biased and inconsistent coefficient estimates.
(D) Reliable standard errors but inefficient coefficient estimates.

A

(B) Unbiased but inefficient coefficient estimates and unreliable standard errors.

While the OLS coefficient estimates remain unbiased in the presence of heteroskedasticity, they are no longer the most efficient, and the standard errors of the coefficients become unreliable, affecting hypothesis testing and confidence intervals.

37
Q

Which of the following is a consequence of heteroskedasticity that is most concerning for hypothesis testing?

(A) The regression line no longer fits the data well.
(B) The sum of squared errors is inflated.
(C) The standard errors of the regression coefficients are unreliable.
(D) The residuals are no longer normally distributed.

A

(C) The standard errors of the regression coefficients are unreliable.

Unreliable standard errors due to heteroskedasticity can lead to incorrect t-statistics and F-statistics, resulting in flawed conclusions from hypothesis tests about the significance of the regression coefficients.

38
Q

The assumption of independence of residuals in linear regression means that the error term for one observation should:

(A) Have a constant variance.
(B) Be normally distributed.
(C) Not be correlated with the error term for any other observation.
(D) Have a mean of zero.

A

(C) Not be correlated with the error term for any other observation.

The independence assumption requires that the residuals are not systematically related to each other. The error in predicting one data point should not influence the error in predicting another.

39
Q

Correlation between residual terms is known as:

(A) Heteroskedasticity.
(B) Autocorrelation or serial correlation.
(C) Multicollinearity.
(D) Non-linearity.

A

(B) Autocorrelation or serial correlation.

Autocorrelation and serial correlation are interchangeable terms for residuals that are correlated with one another; the problem arises most commonly in time-series data.

40
Q

In time series regression, a cyclical pattern in the plot of residuals against time suggests a violation of the assumption of:

(A) Homoskedasticity.
(B) Normality of residuals.
(C) Independence of residuals.
(D) Linearity.

A

(C) Independence of residuals.

A cyclical pattern indicates that the prediction errors are systematically related over time, violating the independence assumption.

41
Q

Violation of the independence of residuals (autocorrelation) can lead to:

(A) Biased coefficient estimates.
(B) Unreliable standard errors of the coefficients.
(C) Inefficient coefficient estimates but reliable standard errors.
(D) No impact on the validity of the regression model.

A

(B) Unreliable standard errors of the coefficients.

Autocorrelation typically leads to underestimated standard errors, making hypothesis tests unreliable.

42
Q

Which of the following is a method to detect autocorrelation in the residuals of a time series regression?

(A) Examining a scatter plot of residuals against the independent variable.
(B) Examining a histogram of the residuals.
(C) Using the Durbin-Watson test.
(D) Using the Breusch-Pagan test.

A

(C) Using the Durbin-Watson test.

The Durbin-Watson test is a formal statistical test specifically designed to detect first-order autocorrelation in the residuals of a time series regression.

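The Durbin-Watson statistic is DW = Σ(e_t − e_{t−1})² / Σe_t². A minimal sketch of the calculation, assuming NumPy; the residual series below is hypothetical, and the durbin_watson() helper is defined here rather than taken from a library (statsmodels also ships a function of the same name, if that package is available):

    import numpy as np

    def durbin_watson(residuals):
        """DW = sum of squared successive differences / sum of squared residuals.
        Values near 2 suggest no first-order autocorrelation; near 0, positive;
        near 4, negative."""
        e = np.asarray(residuals)
        return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

    # Hypothetical residual series with obvious positive autocorrelation
    residuals = np.array([1.0, 0.9, 0.8, 0.7, -0.2, -0.4, -0.6, -0.8, -0.9, -1.0])
    print(durbin_watson(residuals))  # well below 2, consistent with positive autocorrelation
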
43
Q

A Q-Q plot is used to assess the assumption of:

(A) Homoskedasticity.
(B) Independence of residuals.
(C) Normality of residuals.
(D) Linearity.

A

(C) Normality of residuals.

A Quantile-Quantile (Q-Q) plot compares the quantiles of the residuals to the quantiles of a normal distribution. If the residuals are normally distributed, the points on the Q-Q plot should fall roughly along a straight line.

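A sketch of such a plot, assuming SciPy and Matplotlib; the residuals here are simulated draws from a normal distribution, so the points should lie roughly on the reference line:

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy import stats

    rng = np.random.default_rng(1)
    residuals = rng.normal(0, 1, 200)  # simulated, roughly normal residuals

    stats.probplot(residuals, dist="norm", plot=plt)
    plt.title("Q-Q plot of residuals vs. normal quantiles")
    plt.show()
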
44
Q

The assumption of normality of residuals is most critical for:

(A) Obtaining unbiased coefficient estimates.
(B) Ensuring homoskedasticity.
(C) Conducting valid hypothesis tests and constructing confidence intervals, especially with small sample sizes.
(D) Ensuring a linear relationship between the variables.

A

(C) Conducting valid hypothesis tests and constructing confidence intervals, especially with small sample sizes.

While OLS estimators have desirable properties even without normal residuals (especially with large samples due to the Central Limit Theorem), the normality assumption is important for the validity of t-tests, F-tests, and confidence intervals.

45
Q

Observations with unusually large prediction errors or extreme values of the independent variable are called:

(A) Autocorrelated residuals.
(B) Heteroskedastic errors.
(C) Outliers.
(D) Influential points.

A

(C) Outliers.

Outliers are data points that lie far away from the general pattern of the data. They can have a disproportionate influence on the regression results.

46
Q

Outliers can have a significant impact on the:

(A) Sample size.
(B) Number of independent variables.
(C) Estimated regression line and parameter estimates.
(D) Correlation coefficient, but not the regression line.

A

(C) Estimated regression line and parameter estimates.

47
Q

A researcher observes a significant positive autocorrelation in the residuals of their regression model. What might be a potential consequence?

(A) The standard errors of the coefficients are likely overestimated.
(B) The model is correctly specified.
(C) The standard errors of the coefficients are likely underestimated, leading to inflated t-statistics.
(D) The residuals are randomly distributed.

A

(C) The standard errors of the coefficients are likely underestimated, leading to inflated t-statistics.

Positive autocorrelation typically leads to underestimated standard errors, which in turn can inflate the t-statistics and potentially lead to incorrect rejection of the null hypothesis.

48
Q

If the Durbin-Watson statistic is close to 0, it suggests:

(A) No autocorrelation.
(B) Positive autocorrelation.
(C) Negative autocorrelation.
(D) Heteroskedasticity.

A

(B) Positive autocorrelation.

The Durbin-Watson statistic ranges from 0 to 4. A value close to 2 suggests no autocorrelation, a value close to 0 suggests positive autocorrelation, and a value close to 4 suggests negative autocorrelation.

49
Q

Addressing autocorrelation in a time series regression might involve:

(A) Transforming the dependent variable to stabilize its variance.
(B) Including lagged values of the dependent variable as independent variables.
(C) Using robust standard errors that are consistent in the presence of heteroskedasticity.
(D) Removing outliers from the dataset.

A

(B) Including lagged values of the dependent variable as independent variables.

Including lagged dependent variables can help capture the time dependence in the data and reduce autocorrelation in the residuals.

50
Q

While a large sample size can mitigate the impact of non-normal residuals on hypothesis testing, it does NOT necessarily resolve issues related to:

(A) The unbiasedness of the coefficient estimates.
(B) The efficiency of the coefficient estimates.
(C) Autocorrelation or heteroskedasticity.
(D) The consistency of the coefficient estimates.

A

(C) Autocorrelation or heteroskedasticity.

The Central Limit Theorem helps with the normality assumption in large samples, but it does not address problems like autocorrelation (dependent errors) or heteroskedasticity (non-constant variance of errors). These issues require specific detection and correction methods.

51
Q

In the context of ANOVA for simple linear regression, the Total Sum of Squares (SST) measures the:

(A) Variation in the dependent variable explained by the regression model.
(B) Unexplained variation in the dependent variable.
(C) Total variation in the dependent variable around its mean.
(D) Variation of the predicted values around zero.

A

(C) Total variation in the dependent variable around its mean.

SST quantifies the total dispersion of the observed values of the dependent variable (Yi) around their average (Ȳ).

52
Q

The Sum of Squares Regression (SSR) in ANOVA for simple linear regression measures the:

(A) Total variation in the dependent variable.
(B) Unexplained variation in the dependent variable.
(C) Variation in the dependent variable explained by the independent variable.
(D) Variation of the residuals around zero.

A

(C) Variation in the dependent variable explained by the independent variable.

SSR represents the portion of the total variation in the dependent variable that is accounted for by the linear relationship with the independent variable (i.e., the variation of the predicted Y^i around Ȳ).

53
Q

The Sum of Squared Errors (SSE) in ANOVA for simple linear regression measures the:

(A) Total variation in the dependent variable.
(B) Unexplained variation in the dependent variable (the sum of squared residuals).
(C) Variation explained by the independent variable.
(D) Variation of the predicted values around their mean.

A

(B) Unexplained variation in the dependent variable (the sum of squared residuals).

SSE quantifies the amount of variation in the dependent variable that is not explained by the regression model (the sum of the squared differences between Yi and Y^i).

54
Q

The fundamental relationship between SST, SSR, and SSE is:

(A) SST = SSR - SSE
(B) SSR = SST + SSE
(C) SST = SSR + SSE
(D) SSE = SST + SSR

A

(C) SST = SSR + SSE

The total variation in the dependent variable (SST) is equal to the sum of the variation explained by the regression (SSR) and the variation that remains unexplained (SSE).

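A quick numerical check of the decomposition, as a sketch assuming NumPy and the same hypothetical X and Y data used in the earlier sketches:

    import numpy as np

    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    Y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])

    b1, b0 = np.polyfit(X, Y, 1)
    Y_hat = b0 + b1 * X

    sst = np.sum((Y - Y.mean()) ** 2)      # total variation
    ssr = np.sum((Y_hat - Y.mean()) ** 2)  # explained variation
    sse = np.sum((Y - Y_hat) ** 2)         # unexplained variation

    print(sst, ssr + sse)  # equal, up to floating-point error
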
55
Q

In simple linear regression, the degrees of freedom for the Mean Square Regression (MSR) are:

(A) n−2
(B) n−1
(C) 1
(D) n

A

(C) 1

In simple linear regression, there is one independent variable, so the degrees of freedom for the regression are k = 1. MSR = SSR/k = SSR/1 = SSR.

56
Q

In simple linear regression with n observations, the degrees of freedom for the Mean Square Error (MSE) are:

(A) 1
(B) n
(C) n−1
(D) n−2

A

(D) n−2

The degrees of freedom for the error term are n − (number of estimated parameters) = n−2 (one for the intercept and one for the slope). MSE = SSE/(n−2).

57
Q

Mean Square Regression (MSR) is calculated as:

(A) SSE/(n−2)
(B) SSR/(n−1)
(C) SSR/1
(D) SST/(n−1)

A

(C) SSR/1

MSR is the Sum of Squares Regression (SSR) divided by its degrees of freedom, which is 1 in simple linear regression.

58
Q

Mean Square Error (MSE) is calculated as:

(A) SSR/1
(B) SST/(n−1)
(C) SSE/(n−2)
(D) SSR/(n−2)

A

(C) SSE/(n−2)

MSE is the Sum of Squared Errors (SSE) divided by its degrees of freedom, which is n−2 in simple linear regression.

59
Q

ANOVA in simple linear regression helps to assess the:

(A) Significance of individual regression coefficients.
(B) Presence of autocorrelation in the residuals.
(C) Overall goodness of fit and significance of the regression relationship.
(D) Presence of heteroskedasticity.

A

(C) Overall goodness of fit and significance of the regression relationship.

ANOVA provides a framework to evaluate how well the regression model as a whole explains the variation in the dependent variable.

60
Q

A larger SSR relative to SST indicates a:

(A) Poor fit of the regression model.
(B) Stronger linear relationship and a better fit of the regression model.
(C) Higher unexplained variation in the dependent variable.
(D) Lower correlation between the variables.

A

(B) Stronger linear relationship and a better fit of the regression model.

If SSR is large compared to SST, it means a larger proportion of the total variation in Y is explained by the regression, indicating a better fit.

61
Q

The F-statistic in ANOVA for simple linear regression is calculated as:

(A) MSE/MSR
(B) SSE/SST
(C) MSR/MSE
(D) SSR/SST

A

(C) MSR/MSE

The F-statistic is the ratio of the Mean Square Regression (MSR) to the Mean Square Error (MSE). It is used to test the overall significance of the regression relationship.

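A self-contained sketch of the calculation (NumPy, hypothetical data), combining the sums of squares with their degrees of freedom:

    import numpy as np

    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    Y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])
    n = len(Y)

    b1, b0 = np.polyfit(X, Y, 1)
    Y_hat = b0 + b1 * X

    ssr = np.sum((Y_hat - Y.mean()) ** 2)
    sse = np.sum((Y - Y_hat) ** 2)

    msr = ssr / 1        # df for regression = 1 (one independent variable)
    mse = sse / (n - 2)  # df for error = n - 2
    print(msr / mse)     # F-statistic; compare to an F(1, n-2) critical value
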
62
Q

A smaller SSE relative to SST indicates a:

(A) Weaker linear relationship.
(B) Better fit of the regression model, as less variation is unexplained.
(C) Higher total variation in the dependent variable.
(D) A slope coefficient close to zero.

A

(B) Better fit of the regression model, as less variation is unexplained.

A smaller SSE means that the regression model has smaller prediction errors and therefore provides a better fit to the data.

63
Q

A statistically significant F-statistic in the ANOVA table of a simple linear regression indicates that:

(A) The intercept is significantly different from zero.
(B) The independent variable has no linear relationship with the dependent variable.
(C) At least one of the regression coefficients (in this case, the slope) is significantly different from zero, implying a significant linear relationship.
(D) The residuals are normally distributed.

A

(C) At least one of the regression coefficients (in this case, the slope) is significantly different from zero, implying a significant linear relationship.

A significant F-statistic suggests that the regression model as a whole explains a statistically significant portion of the variation in the dependent variable, meaning there is a significant linear relationship between X and Y (since there's only one independent variable in simple linear regression).

64
Q

The Standard Error of Estimate (SEE) is a measure of the:

(A) Proportion of the total variation in the dependent variable explained by the regression.
(B) Correlation between the independent and dependent variables.
(C) Standard deviation of the residuals.
(D) Overall significance of the regression model.

A

(C) Standard deviation of the residuals.

The SEE represents the standard deviation of the prediction errors (residuals), indicating the average dispersion of the actual data points around the fitted regression line.

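In formula form, SEE = sqrt(SSE / (n − 2)), i.e. the square root of the MSE. A minimal sketch with NumPy and the same hypothetical data as above:

    import numpy as np

    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    Y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])
    n = len(Y)

    b1, b0 = np.polyfit(X, Y, 1)
    residuals = Y - (b0 + b1 * X)

    see = np.sqrt(np.sum(residuals ** 2) / (n - 2))  # standard error of estimate
    print(see)
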
65
Q

A lower Standard Error of Estimate (SEE) indicates a:

(A) Weaker linear relationship between the variables.
(B) Poorer fit of the regression model.
(C) Better fit of the regression model and more precise predictions.
(D) Higher coefficient of determination (R²).

A

(C) Better fit of the regression model and more precise predictions.

A lower SEE signifies that the data points are closer to the regression line, implying smaller prediction errors and a better fit of the model to the data.

66
Q

The Coefficient of Determination (R²) is defined as the:

(A) Standard deviation of the residuals divided by the standard deviation of the dependent variable.
(B) Proportion of the total variation in the dependent variable explained by the independent variable(s).
(C) Square root of the correlation coefficient.
(D) Ratio of the explained variation to the unexplained variation.

A

(B) Proportion of the total variation in the dependent variable explained by the independent variable(s).

R² measures the percentage of the total variation in the dependent variable that is accounted for by the regression model. It indicates the goodness of fit.

67
Q

If the Sum of Squares Regression (SSR) is 75 and the Total Sum of Squares (SST) is 100, the Coefficient of Determination (R²) is:

(A) 0.25
(B) 0.75
(C) 1.33
(D) -0.25

A

(B) 0.75

R² = SSR / SST = 75 / 100 = 0.75. This means that 75% of the total variation in the dependent variable is explained by the regression model.

68
Q

In simple linear regression, the Coefficient of Determination (R²) is equal to:

(A) The square root of the correlation coefficient.
(B) The correlation coefficient.
(C) The square of the correlation coefficient.
(D) 1 minus the Standard Error of Estimate.

A

(C) The square of the correlation coefficient.

For simple linear regression (with one independent variable), R² is equal to the square of the Pearson correlation coefficient (r) between the independent and dependent variables.

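A quick numerical check of this equivalence, as a sketch assuming NumPy and the hypothetical data from the earlier sketches:

    import numpy as np

    X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    Y = np.array([2.1, 4.3, 5.9, 8.2, 9.8])

    b1, b0 = np.polyfit(X, Y, 1)
    Y_hat = b0 + b1 * X

    r_squared = np.sum((Y_hat - Y.mean()) ** 2) / np.sum((Y - Y.mean()) ** 2)  # SSR / SST
    r = np.corrcoef(X, Y)[0, 1]  # Pearson correlation

    print(r_squared, r ** 2)  # the two values match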