FRM Level 1 Part 2 Flashcards
Chapter 1: Probability
1. random events and probability
basic concept of probability
- outcome and sample space
2. relationships among events: mutually exclusive events, exhaustive events, independent events (the occurrence of B has no influence on the occurrence of A)
types of probability
Joint probability is the probability of two events occurring simultaneously.
Marginal probability is the probability of an event irrespective of the outcome of another variable.
Conditional probability is the probability of one event occurring in the presence of a second event.
unconditional probability
p(A)
conditional probability
p(A|B)
joint probability
p(AB)
two important rules
multiplication rule
p(AB) = p(A|B)xp(B)
if they are independent
p(AB) = p(A)xp(B)
addition rule
p(A+B) = p(A) + p(B) - p(AB)
if mutually exclusive
p(A+B) = p(A) + p(B)
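A minimal Python sketch of both rules, using made-up values for p(A), p(B), and p(A|B):

```python
# Illustrative check of the multiplication and addition rules.
# P(A), P(B), and P(A|B) below are made-up example values.
p_b = 0.4          # P(B)
p_a_given_b = 0.5  # P(A|B)
p_a = 0.3          # P(A)

# Multiplication rule: P(AB) = P(A|B) * P(B)
p_ab = p_a_given_b * p_b          # 0.2

# Addition rule: P(A + B) = P(A) + P(B) - P(AB)
p_a_or_b = p_a + p_b - p_ab       # 0.5

# Special cases: independence gives P(AB) = P(A) * P(B);
# mutually exclusive events give P(A + B) = P(A) + P(B).
print(p_ab, p_a_or_b)
```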
- discrete and continuous random variable
discrete random variable
number of possible outcomes can be counted
continuous random variable
it can take on any value within a given range, finite or infinite
P(X = x) = 0 even though the event X = x can occur
probability density function:
discrete random variable (the probability that a discrete random variable will take on the value x)
continuous random variable
the PDF is f(x), the function value corresponding to X
p(x1 <= X <= x2) is the area under the PDF over the interval [x1, x2]
cumulative distribution function
concept: the probability that a random variable will be less than or equal to a given value: F(x) = P(X <= x)
characteristics: monotonically increasing; bounded: F(x) approaches 0 as x approaches negative infinity and 1 as x approaches positive infinity; P(a < X <= b) = F(b) - F(a)
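A sketch showing that for a continuous variable the probability over an interval is the area under the PDF, i.e. F(x2) - F(x1); the standard normal is used as the example (scipy assumed):

```python
# Sketch: P(x1 <= X <= x2) equals the area under the PDF over [x1, x2],
# which also equals F(x2) - F(x1). Standard normal used as an example.
from scipy.stats import norm
from scipy.integrate import quad

x1, x2 = -1.0, 1.0
area, _ = quad(norm.pdf, x1, x2)          # integrate the PDF over [x1, x2]
via_cdf = norm.cdf(x2) - norm.cdf(x1)     # same probability via the CDF
print(area, via_cdf)                      # both ~0.6827
# Note: P(X = x) = 0 for any single point x of a continuous variable.
```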
Chapter 2: Bayesian Analysis
Total probability theorem
if A1,….,An are mutually exclusive and exhaustive
p(B) = the sum of p(Aj)p(B|Aj) from j = 1 to n
Bayes' Theorem
p(A|B) = p(B|A) x p(A) / p(B)
p(A|B): updated (posterior) probability
p(A): prior probability
p(B) is computed with the total probability theorem
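A numeric sketch of the total probability theorem and Bayes' theorem; the priors and likelihoods below are made-up illustrative values:

```python
# A1/A2 are mutually exclusive and exhaustive; B is the observed evidence.
priors = {"A1": 0.6, "A2": 0.4}          # P(Aj), hypothetical values
likelihoods = {"A1": 0.2, "A2": 0.7}     # P(B|Aj), hypothetical values

# Total probability theorem: P(B) = sum_j P(Aj) * P(B|Aj)
p_b = sum(priors[a] * likelihoods[a] for a in priors)

# Bayes' theorem: P(Aj|B) = P(B|Aj) * P(Aj) / P(B)
posteriors = {a: likelihoods[a] * priors[a] / p_b for a in priors}
print(p_b, posteriors)   # P(B) = 0.4; posteriors: A1 -> 0.3, A2 -> 0.7
```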
Chapter 3: Basic Statistics
arithmetic mean
population mean: mu = (the sum of Xi from i = 1 to N) / N
sample mean: x̄ = (the sum of Xi from i = 1 to n) / n
median
the middle item of a set of items sorted into ascending or descending order
odd n: item (n+1)/2; even n: the average of items n/2 and n/2 + 1
mode
most frequently occurring value of the distribution
expected value
definition: E(X) = x1*p(X=x1) + ... + xn*p(X=xn)
properties:
if c and a are constants, then E(cX + a) = cE(X) + a
E(X + Y) = E(X) + E(Y)
if X and Y are independent random variables, then E(XY) = E(X)*E(Y)
in general, E(X^2) != [E(X)]^2
- dispersion
variance for data:
population variance
sample variance
standard deviation
population standard deviation
sample standard deviation
variance for a random variable
formula: Var(X) = E[(X - mu)^2] = E(X^2) - [E(X)]^2
properties: if c is any constant
Var(X + c) = Var(X)
Var(cX) = c^2 * Var(X)
for any random variables X and Y:
Var(X+Y) = Var(X) + Var(Y) + 2Cov(X,Y)
Var(X-Y) = Var(X) + Var(Y) - 2Cov(X,Y)
if X and Y are independent, Cov(X,Y) = 0, so Var(X+Y) = Var(X-Y) = Var(X) + Var(Y)
Square-root rule (Baumol model): at the start of the period, bond holdings are (n-1)Y/n and cash holdings are Y/n; at the end of the period, bond holdings are 0 and cash holdings are Y/n. Average bond holdings are therefore (n-1)Y/2n. Consider the following maximization (maximize the interest earned on bonds, with the number of conversions n as the control variable): Max (n-1)Yr/2n - nb. The first-order condition Yr/2n^2 - b = 0 gives n = sqrt(Yr/2b). Average cash holdings are Y/2n; substituting gives Md = sqrt(Yb/2r). This is the mathematical statement of the square-root rule and the solution of the Baumol model.
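A small sketch of the rule with hypothetical inputs (Y = annual cash need, r = bond interest rate, b = cost per conversion):

```python
# Square-root rule with made-up inputs; checks that Md = Y / (2n).
from math import sqrt

Y, r, b = 100_000, 0.05, 10.0        # hypothetical values

n = sqrt(Y * r / (2 * b))            # optimal number of conversions ~15.8
Md = sqrt(Y * b / (2 * r))           # average cash holdings ~3162.3
print(n, Md, Y / (2 * n))            # last two values coincide
```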
covariance
definition:
the relationship between the deviation of two variables
Cov(X, Y) = E{[X - E(X)][Y - E(Y)]} = E(XY) - E(X)E(Y)
properties
1. Cov(X, Y) ranges from negative infinity to positive infinity
2. if X and Y are independent, then E(XY) = E(X)E(Y) and cov(X, Y) = 0
3. if X = Y, then Cov(X, X) = E{[X - E(X)][X - E(X)]} = Var(X)
4. Cov(a+bX, c+dY) = bdCov(X,Y)
5. Var(w1X1 + w2X2) = [w1^2]Var(X1) + [w2^2]Var(X2) + 2w1w2Cov(X1, X2)
where w1 and w2 are the weights of X1 and X2, respectively
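A simulation sketch (numpy assumed) verifying property 5 on made-up correlated data:

```python
# Verify Var(w1*X1 + w2*X2) against the covariance formula on simulated data.
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(0.0, 1.0, 100_000)
x2 = 0.5 * x1 + rng.normal(0.0, 1.0, 100_000)   # correlated with x1
w1, w2 = 0.6, 0.4                                # hypothetical weights

direct = np.var(w1 * x1 + w2 * x2)               # variance of the weighted sum
formula = (w1**2 * np.var(x1) + w2**2 * np.var(x2)
           + 2 * w1 * w2 * np.cov(x1, x2, ddof=0)[0, 1])
print(direct, formula)                           # identical up to rounding
```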
correlation
definition:
linear relationship between two variables: p(X,Y) = Cov(X,Y) / [std(X) * std(Y)]. p ranges from -1 to +1 and has no units
properties
p = 0 indicates the absence of any linear relationship, though a non-linear relationship may still exist
the larger the absolute value, the stronger the linear relationship
correlation coefficient with interpretation
p = +1 perfect positive linear correlation
0 < p < 1: positive linear correlation
p = 0: no linear correlation
-1 < p < 0: negative linear correlation
p = -1: perfect negative linear correlation
- Skewness
definition
how symmetrical the distribution is around the mean
skewness = E[(X - mu)^3]/std^3
properties
symmetrical distribution:
Skewness = 0
positively skewed distribution (right skew): Skewness>0
outliers in the right tail: mean > median > mode
negatively skewed distribution (left skew):
Skewness < 0
outliers in the left tail: many financial assets exhibit negative skew (more risky); mean < median < mode
- Kurtosis
definition: the degree of weight placed on extreme points in the tails
Kurtosis =E[(X - mu)^4]/std^4
leptokurtic: kurtosis > 3, excess kurtosis > 0
mesokurtic: kurtosis = 3, excess kurtosis = 0
platykurtic: kurtosis < 3, excess kurtosis < 0
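A sketch computing sample skewness and kurtosis with scipy; note that scipy's kurtosis() returns excess kurtosis by default:

```python
# Sample skewness and kurtosis of a fat-tailed (leptokurtic) sample.
import numpy as np
from scipy.stats import skew, kurtosis

rng = np.random.default_rng(1)
x = rng.standard_t(df=5, size=100_000)   # t(5): symmetric but fat-tailed

print(skew(x))                    # ~0 (symmetric)
print(kurtosis(x))                # excess kurtosis > 0
print(kurtosis(x, fisher=False))  # raw kurtosis > 3
```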
Chapter 4: Distributions
1. discrete probability distribution
Bernoulli distribution:
definition: a trial produces one of two outcomes (success or failure)
properties
E(X) = p*1 + (1-p)*0 = p
Var(X) = p*(1-p)
Binomial Distribution
definition: the distribution of a binomial random variable, defined as the number of successes in n Bernoulli trials
properties
the probability of success is constant across all trials
the trials are all independent
E(X) = np
Var(X) = np(1-p)
p(x) = P(X = x) = n!/[(n - x)!x!] * p^x(1-p)^(n-x)
as n increases and p approaches 0.5, the binomial distribution approximates the normal distribution
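A sketch of the binomial formulas via scipy, with example values n = 10 and p = 0.3:

```python
# Binomial pmf, mean, and variance; n and p are example values.
from scipy.stats import binom

n, p = 10, 0.3
print(binom.pmf(3, n, p))   # P(X = 3) = n!/[(n-x)!x!] * p^x (1-p)^(n-x) ~0.267
print(binom.mean(n, p))     # E(X) = np = 3.0
print(binom.var(n, p))      # Var(X) = np(1-p) = 2.1
```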
Poisson distribution
definition: used to model the occurrence of events over time
properties:
f(x) = P(X = x) = (v^x*e^(-v))/x!
v: the average or expected number of events in the interval
x: the number of events in the interval
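A sketch of Poisson probabilities with an example rate v = 2 events per interval (scipy assumed):

```python
# Poisson probabilities for a made-up rate v = 2 events per interval.
from scipy.stats import poisson

v = 2.0
print(poisson.pmf(0, v))                 # P(X = 0) = e^(-v) ~0.135
print(poisson.pmf(3, v))                 # P(X = 3) = v^3 e^(-v) / 3!
print(poisson.mean(v), poisson.var(v))   # mean and variance both equal v
```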
2. continuous probability distribution
uniform distribution
definition:
the probabilities for all possible outcomes are equal
graph:
probability density function: f(x) = 1/(b - a) for a <= x <= b, 0 otherwise
cumulative distribution function: F(x) = 0 for x <= a; (x - a)/(b - a) for a < x < b; 1 for x >= b
properties
E(X) = (a + b)/2
Var(X) = (b - a)^2/12
For a <= x1 < x2 <= b: P(x1 <= X <= x2) = (x2 - x1)/(b - a)
standard uniform distribution: a = 0, b = 1
normal distribution
properties:
completely described by mean and variance
X~N(mean, variance)
Skewness = 0, kurtosis = 3
1. linear combination of independent normal distributed random variables is also normally distributed
2. probabilities decrease further from the mean. But the tails go on forever
commonly used confidence intervals
68% confidence interval is [X - 1std, X + 1std]
90% confidence interval is [X - 1.65std, X + 1.65std]
95% confidence interval is [X - 1.96std, X + 1.96std]
98% confidence interval is [X - 2.33std, X + 2.33std]
99% confidence interval is [X - 2.58std, X + 2.58std]
a normal distribution with mean = 0 and std = 1 is the standard normal distribution (standardize with Z = (X - mean)/std)
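A sketch showing where the reliability factors above come from (the inverse standard normal CDF) and the standardizing step; mu, sigma, and x are made-up example values:

```python
# Two-sided reliability factors from the standard normal inverse CDF,
# plus the standardizing step Z = (X - mean)/std.
from scipy.stats import norm

for conf in (0.68, 0.90, 0.95, 0.98, 0.99):
    z = norm.ppf(0.5 + conf / 2)     # two-sided critical value
    print(f"{conf:.0%}: +/- {z:.2f} std")

mu, sigma, x = 10.0, 2.0, 13.0       # made-up example values
z_score = (x - mu) / sigma           # Z = 1.5
print(norm.cdf(z_score))             # P(X <= 13) via the standard normal
```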
lognormal distribution
definition:
if ln X is normal, then X is lognormal
equivalently, if Y is normal, then e^Y is lognormal
properties
chart
Right skewed
Bounded from below by zero
Sampling distribution
Student's t-distribution
definition:
if Z is a standard normal variable and U is a chi-square variable with K degrees of freedom that is independent of Z, then the random variable X below follows a t-distribution with K degrees of freedom
X = Z/sqrt(U/K)
Z: standard normal variable
U: chi-square variable
K: degree of freedom
tips:
chi-square variable could be the sum of squares
Y = S1^2+…+Sn^2
where S1,…, Sn are independent standard normal random variables
properties:
1. symmetrical (bell shaped), skewness = 0
2. defined by a single parameter: degrees of freedom (df), df = n - 1, where n is the sample size
3. comparison with normal distribution
fatter tails
as df increases, the t-distribution approaches the standard normal distribution
given a confidence level, the t-distribution gives a wider confidence interval
as df increases, the t-distribution becomes more peaked with thinner tails, meaning smaller probabilities for extreme values
Chi-Square (x2) distribution
definition:
if we have k independent standard normal variables Z1, ..., Zk, then the sum of their squares, S, has a chi-square distribution
S = Z1^2 + ... + Zk^2
k is the degrees of freedom (df = n - 1 when sampling)
properties
Asymmetrical, bounded below by zero
as df increases, it converges to the normal distribution
the sum of two independent chi-square variables with k1 and k2 degrees of freedom follows a chi-square distribution with k1 + k2 degrees of freedom
F-distribution
definition:
if U1 and U2 are two independent chi-square variables with K1 and K2 degrees of freedom, then X below follows an F-distribution
X = (U1/K1)/(U2/K2)
properties
as K1 and K2 approach infinity, the F-distribution approaches the normal distribution
if X follows t(k), then X^2 has an F-distribution: X^2 ~ F(1, k)
when sampling, the degrees of freedom are n1 - 1 and n2 - 1
degrees of freedom
df = N - 1
df = degrees of freedom
N = sample size
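A simulation sketch (numpy/scipy assumed) checking the three constructions above: Z/sqrt(U/k) is t(k), a sum of k squared standard normals is chi-square(k), and t(k)^2 ~ F(1, k):

```python
# Simulated tail probabilities vs. the exact t, chi-square, and F laws.
import numpy as np
from scipy.stats import t, chi2, f

rng = np.random.default_rng(2)
k, size = 5, 200_000

z = rng.standard_normal(size)
u = (rng.standard_normal((size, k)) ** 2).sum(axis=1)  # chi-square(k)
x = z / np.sqrt(u / k)                                  # t(k) by construction

print((x > 2.0).mean(), t.sf(2.0, k))           # t tail
print((u > 11.07).mean(), chi2.sf(11.07, k))    # chi-square tail
print((x ** 2 > 4.0).mean(), f.sf(4.0, 1, k))   # t(k)^2 ~ F(1, k)
```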
Chapter 5: Confidence Intervals and Hypothesis Testing
1. point estimation
statistical inference:
making forecasts, estimates or judgments about a population from the sample actually drawn from that population
draw a sample from the sampling population
a sample statistic uses the sample to estimate the population parameter
since the whole population is rarely observable directly, the parameter is estimated from the sample
sample mean & Sample variance
sample mean: x̄ = (the sum of Xi from i = 1 to n) / n
E(x̄) = mu
Var(x̄) = sigma^2 / n
sample variance: s^2 = [the sum of (Xi - x̄)^2 from i = 1 to n] / (n - 1)
central limit theorem:
assumptions: simple random sample (i.e. i.i.d.), finite nonzero variance, sample size n > 30
conclusion: the sample mean x̄ ~ N(mu, sigma^2/n) approximately
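A simulation sketch of the CLT: sample means of a skewed exponential population are approximately N(mu, sigma^2/n) once n > 30 (numpy assumed):

```python
# CLT demo: sample means of a skewed population are ~normal with var sigma^2/n.
import numpy as np

rng = np.random.default_rng(3)
n, trials = 50, 20_000               # sample size > 30, many repeated samples

# Exponential(1) population: mean 1, variance 1, heavily right-skewed.
means = rng.exponential(1.0, (trials, n)).mean(axis=1)

print(means.mean())                  # ~ population mean mu = 1.0
print(means.var())                   # ~ sigma^2 / n = 1/50 = 0.02
```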
properties of estimator:
unbiased:
the expected value of the estimate equals the parameter
efficient (best):
the variance of the estimator is the smallest among all unbiased estimators
consistent: the larger n is, the more accurate the parameter estimate
linearity
- confidence interval
point estimate ± (reliability factor × standard error)
known population variance
x̄ ± z(α/2) × σ/√n
unknown population variance
x̄ ± t(α/2) × s/√n
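A sketch of the unknown-variance case on a made-up sample, using the t reliability factor:

```python
# 95% confidence interval for the mean, population variance unknown.
import numpy as np
from scipy.stats import t

x = np.array([10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.0, 10.3])  # made-up sample
n, x_bar, s = len(x), x.mean(), x.std(ddof=1)

t_crit = t.ppf(0.975, df=n - 1)          # reliability factor for alpha = 5%
half_width = t_crit * s / np.sqrt(n)     # reliability factor x standard error
print(x_bar - half_width, x_bar + half_width)
```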
CI with known and unknown population variance
sampling from:                         reliability factor
distribution   variance    small sample (n < 30)    large sample (n >= 30)
normal         known       z-statistic              z-statistic
normal         unknown     t-statistic              t-statistic
non-normal     known       not available            z-statistic
non-normal     unknown     not available            t-statistic
Factors affecting the width of the confidence interval
change in factor    for z-distribution    for t-distribution
larger alpha        smaller               smaller
larger n            smaller               smaller
larger df           N/A                   smaller
larger s            larger                larger
z-distribution
Z = (x - mean) / standard deviation
Hypothesis test
Null hypothesis —–> Ho
Alternative hypothesis —–> Ha
we usually place the result we want to establish in the alternative hypothesis
one tail test vs. two tailed test
type I error vs. type II error
type I error
rejecting the null hypothesis when it is true
the probability of making a type I error is equal to alpha, also known as the significance level of the test
type II error
failing to reject the null hypothesis when it is false
the probability of making a type II error is equal to beta
Power of the test: the probability of rejecting the null hypothesis when it is false, equal to 1 - beta
test of population mean and variance
summary of hypothesis testing
1. mean hypothesis testing:
1.1 normally distributed population, known population variance
mean = mean0 (mean of null hypothesis)
Z = (sample of mean - mean0) / [std/sqrt(sample size)]
tip: std = sqrt(population variance)
Z ~ N(0, 1) standard normal distribution
1.2 normally distributed population, unknown population variance
mean = mean0 (mean of null hypothesis)
t = (sample of mean - mean0) / [s/sqrt(sample size)]
tip: s = sample standard deviation
t ~ t(n-1) distribution
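A sketch of case 1.2 on made-up data, using scipy's one-sample t-test:

```python
# One-sample mean test with unknown population variance (case 1.2).
import numpy as np
from scipy.stats import ttest_1samp

x = np.array([2.1, 1.8, 2.4, 2.0, 2.6, 1.9, 2.3, 2.2])  # made-up sample
t_stat, p_value = ttest_1samp(x, popmean=2.0)            # H0: mean = 2.0

alpha = 0.05
print(t_stat, p_value)
print("reject H0" if p_value <= alpha else "fail to reject H0")
```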
2. variance hypothesis testing
2.1 normally distributed population
variance = variance0 (variance of null hypothesis)
X^2 = (n-1) x sample variance /variance0 (variance of null hypothesis)
X^2 ~ chi-square(n-1) distribution
2.2 two independent normal distribution populations
variance of first normal distribution populations = variance of second normal distribution populations
F = sample variance of first one/ sample variance of second one
F ~ F(n1 - 1, n2 - 1); the F-distribution is the ratio of two independent chi-square variables, each divided by its degrees of freedom
decision rule
1. p-value
definition: the smallest significance level at which the null hypothesis can be rejected
decision rule: reject the null hypothesis if p-value <= alpha (the same rule applies to one- and two-tailed tests)
2. if the test statistic exceeds the critical value, reject the null hypothesis
Chapter 6: Linear Regression
1. regression equation
population
Yi = Beta0 + Beta1Xi +ui
Y: dependent (explained) variable, regressand
X: independent (explanatory) variable, regressor
Beta0: regression intercept term
Beta1: regression slope coefficient
ui: error term (residual term)
sample
- Ordinary Least Squares (OLS)
assumption
E(ui|xi) = 0
all (X, Y) observations are independent and identically distributed (i.i.d.)
large outliers are unlikely
principle
minimize the sum of squared residuals (error terms)
formula
beta1 = Cov(X, Y)/Var(X)
beta0 = Ȳ - beta1 * X̄
because the regression line always passes through (X̄, Ȳ)
- measure of fit
coefficient of determination (R^2)
R^2 = ESS/TSS = 1 - SSR/TSS
summing over i = 1 to n:
Total Sum of Squares (TSS):
TSS = sum[(Yi - Ȳ)^2]
Explained Sum of Squares (ESS):
ESS = sum[(Ŷi - Ȳ)^2]
Residual Sum of Squares (SSR):
SSR = sum[(Yi - Ŷi)^2]
characteristic:
R^2 ranges between 0 and 1; values near 1 indicate X is good at predicting Y
for one independent variable: R^2 = [p(X,Y)]^2
standard error of regression
identification:
an estimator of the standard deviation of the regression error ui
formula: SER = sqrt(SSR/(n - 2)) = sqrt[(the sum of squared residuals ûi^2)/(n - 2)]
judgement:
the smaller this measure, the better
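A sketch tying this chapter's formulas together on simulated data (the true beta0 = 1 and beta1 = 2 are assumptions of the simulation): OLS coefficients from Cov/Var, then R^2 and SER:

```python
# OLS slope/intercept from the formulas above, plus R^2 and SER.
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(0, 1, 200)
y = 1.0 + 2.0 * x + rng.normal(0, 0.5, 200)      # assumed true model

beta1 = np.cov(x, y, ddof=0)[0, 1] / np.var(x)   # Cov(X, Y)/Var(X)
beta0 = y.mean() - beta1 * x.mean()              # line passes through (x̄, ȳ)

y_hat = beta0 + beta1 * x
tss = ((y - y.mean()) ** 2).sum()
ssr = ((y - y_hat) ** 2).sum()
r2 = 1 - ssr / tss                               # R^2 = 1 - SSR/TSS
ser = np.sqrt(ssr / (len(x) - 2))                # SER = sqrt(SSR/(n-2))
print(beta1, beta0, r2, ser)
```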
Chapter 7: Hypothesis Tests and Confidence Intervals in Single Regression
- testing hypothesis about coefficient and confidence interval
null hypothesis and alternative hypothesis
H0: beta1 = beta1,0 (the hypothesized value)
if beta1,0 = 0, this is a significance test
t-statistic:
t = (estimated beta1 - beta1,0)/SE(estimated beta1), with n - 2 degrees of freedom
decision rule:
reject H0 if t-statistic > t critical or t-statistic < -t critical
p-value < alpha
the meaning of rejecting H0
the regression coefficient differs from beta1,0 at the significance level alpha
common format for regression result
test score = 698.9 - 2.28 ClassSize
R^2 = 0.051 SER =18.6
A low R^2 does not by itself mean the regression is good or bad, but it tells us that other important factors also influence the dependent variable
- Binary /Dummy/indicator variable
identification: it takes on only two values, 0 or 1
formula: Yi = beta0 + beta1*Di + ui, where Di = 0 or 1
beta0 indicates E(Y|Di = 0)
beta1 indicates E(Y|Di = 1) - E(Y|Di = 0)
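A sketch on made-up data showing that with a single dummy regressor, beta0 equals the mean of the D = 0 group and beta1 equals the difference in group means:

```python
# Dummy-variable regression: OLS reproduces the group means exactly.
import numpy as np

d = np.array([0, 0, 0, 1, 1, 1])                 # made-up dummy values
y = np.array([3.0, 2.5, 3.5, 5.0, 5.5, 4.5])     # made-up outcomes

beta1 = np.cov(d, y, ddof=0)[0, 1] / np.var(d)   # = 2.0
beta0 = y.mean() - beta1 * d.mean()              # = 3.0
print(beta0, beta1)
print(y[d == 0].mean(), y[d == 1].mean() - y[d == 0].mean())  # same values
```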
- Homoskedasticity and heteroskedasticity
homoskedasticity
Var (ui|X) = σ^2
This means that the variance of the error term ui is the same regardless of the predictor variable X.
- Homoskedasticity occurs when the variance of the error term in a regression model is constant.
- If the error variance is homoskedastic, the model is considered well-specified; if the variance varies too much, the model may be poorly specified.
- Adding additional predictor variables can help explain the performance of the dependent variable.
- Conversely, heteroskedasticity occurs when the variance of the error term is not constant.
heteroskedasticity
if homoskedasticity is violated,
e.g. if Var(ui|X) = σ^2(X), a function of X, then we say the error term is heteroskedastic
consequences
1. the OLS estimator is still unbiased, consistent, and asymptotically normal, but no longer efficient
2. it distorts the standard errors of the coefficients:
if the estimated standard error is too small, the t-statistic is too large, leading to Type I errors
if the estimated standard error is too large, the t-statistic is too small, leading to Type II errors
how to deal with it
calculate robust standard errors
use weighted least square (WLS)
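A sketch of the robust-standard-error remedy using statsmodels (assumed installed) on simulated heteroskedastic data; HC1 is one common robust covariance choice:

```python
# Classical vs. heteroskedasticity-robust (HC1) standard errors.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
x = rng.uniform(1, 10, 500)
y = 1.0 + 2.0 * x + rng.normal(0, x)       # error variance grows with X

X = sm.add_constant(x)
ols = sm.OLS(y, X).fit()                   # classical standard errors
robust = sm.OLS(y, X).fit(cov_type="HC1")  # robust standard errors
print(ols.bse, robust.bse)                 # robust SEs differ noticeably
```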