Part 5. Sampling & Estimation Flashcards

1
Q

Simple Random Sampling

A

A method of selecting a sample in such a way that each item or person in the population being studied has the same likelihood of being included in the sample.

e.g. picking random numbers out of a bag.

2
Q

Systematic sampling

A

A method of forming an approximately random sample by selecting every nth member of a population.
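
As a minimal sketch (the population and sampling interval here are hypothetical), systematic sampling can be written as:

```python
# Minimal sketch of systematic sampling: take every k-th member,
# starting from a chosen index. Population and k are hypothetical.
def systematic_sample(population, k, start=0):
    return population[start::k]

members = list(range(1, 101))            # hypothetical population of 100
sample = systematic_sample(members, 10)  # every 10th member
print(sample)  # [1, 11, 21, 31, 41, 51, 61, 71, 81, 91]
```

For an approximately random sample, the starting index would itself be chosen at random.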

3
Q

Sampling error

A

The difference between a sample statistic (the mean, variance, or standard deviation of the sample) and its corresponding population parameter (the true mean, variance or standard deviation of the population).

sampling error of the mean = sample mean (x̄) − population mean (μ)
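
A minimal numeric sketch of the formula, using hypothetical return figures:

```python
from statistics import mean

# Hypothetical values: an assumed true population mean return and a small sample.
population_mean = 0.08
sample = [0.05, 0.10, 0.12, 0.07, 0.11]

# sampling error of the mean = sample mean - population mean
sampling_error = mean(sample) - population_mean
print(round(sampling_error, 4))  # 0.01
```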

4
Q

Sampling distribution

A

(Of a sample statistic)

A probability distribution of all possible sample statistics computed from a set of equal-size samples that were randomly drawn from the same population.

5
Q

Sampling distribution of the mean

A

Suppose a random sample of 100 bonds is selected from the population of a major municipal bond index consisting of 1,000 bonds, and the mean return of the 100-bond sample is then calculated.

Repeating this process many times will result in many different estimates of the population mean return.

6
Q

Stratified random sampling

A

Uses a classification system to separate the population into smaller groups based on one or more distinguishing characteristics.

From each subgroup (stratum), a random sample is taken and the results are pooled. The size of the sample drawn from each stratum is based on the stratum's size relative to the population.
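
A sketch of proportional allocation across strata; the strata, their sizes, and the total sample size below are all hypothetical:

```python
import random

random.seed(1)

# Hypothetical population of 1,000 members split into two strata.
strata = {
    "short_duration": list(range(0, 600)),    # 60% of the population
    "long_duration": list(range(600, 1000)),  # 40% of the population
}
population_size = sum(len(members) for members in strata.values())
total_sample_size = 50

# Draw from each stratum in proportion to its share of the population.
sample = []
for name, members in strata.items():
    n_stratum = round(total_sample_size * len(members) / population_size)
    sample.extend(random.sample(members, n_stratum))

print(len(sample))  # 50 (30 short-duration, 20 long-duration)
```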

7
Q

Stratified Sampling Example

A

Used often in bond indexing, due to the difficulty and cost of completely replicating the entire population of bonds.

The bonds in a population are categorised (stratified) according to major bond risk factors such as duration, maturity, coupon rate, and the like.

The samples are drawn from each separate category and combined to form a final sample.

8
Q

Time series data

A

This consists of observations taken over a period of time at specific and equally spaced time intervals.

e.g. the set of monthly returns on Microsoft stock from January 1994 to January 2004.

9
Q

Cross-sectional data

A

A sample of observations taken at a single point in time.

e.g. the sample of reported earnings per share of all Nasdaq companies as of Dec 31, 2004.

10
Q

Longitudinal data

A

Observations over time of multiple characteristics of the same entity, such as unemployment, inflation and GDP growth rates for a country over 10 years.

11
Q

Panel data

A

This contains observations over time of the same characteristic for multiple entities, such as debt/equity ratios for 20 companies over the most recent 24 quarters.

12
Q

Central Limit Theorem

A

For simple random samples of size n from a population with mean μ and finite variance σ²:

The sampling distribution of the sample mean (x̄) approaches a normal probability distribution with mean μ and variance σ²/n as the sample size becomes large.

This is useful because the normal distribution is relatively easy to apply to hypothesis testing and the construction of confidence intervals.

Inferences about the population mean can be made from the sample mean, regardless of the population's distribution, as long as the sample size is "sufficiently large", which usually means n ≥ 30.
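
The theorem can be illustrated with a small simulation, here using a deliberately non-normal (uniform) population; the population size, sample size, and number of repetitions are all illustrative:

```python
import random
from statistics import mean, pstdev

random.seed(42)

# A deliberately non-normal (uniform) population.
population = [random.uniform(0, 100) for _ in range(10_000)]
mu, sigma = mean(population), pstdev(population)

# Repeatedly draw samples of size n and record each sample mean.
n = 50
sample_means = [mean(random.sample(population, n)) for _ in range(2_000)]

# The distribution of sample means centres on mu, with dispersion
# close to sigma / sqrt(n), as the theorem predicts.
print(abs(mean(sample_means) - mu) < 1)                  # True
print(abs(pstdev(sample_means) - sigma / n ** 0.5) < 1)  # True
```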

13
Q

Important properties of central limit theorem:

A
  1. If the sample size n is sufficiently large (n ≥ 30), the sampling distribution of the sample means will be approximately normal.
    - That is, random samples of size n are repeatedly taken from the overall population; each sample has its own mean, which is itself a random variable, and this set of sample means has a distribution that is approximately normal.
  2. The mean of the population (μ) and the mean of the distribution of all possible sample means are equal.
  3. The variance of the distribution of sample means is σ²/n, the population variance divided by the sample size.
14
Q

Standard deviation of the means of multiple samples:

A

This is less than the standard deviation of single observations.

If the standard deviation of monthly stock returns is 2%, the standard error (deviation) of the average monthly return over the next six months is 2% / √6 ≈ 0.82%.

The average of several observations of a random variable will be less widely dispersed (lower standard deviation) around the expected value than a single observation of the random variable.
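
The figure in the example above can be checked directly:

```python
import math

# The example above: monthly return standard deviation of 2%,
# averaged over six months.
monthly_stdev = 0.02
n_months = 6

standard_error = monthly_stdev / math.sqrt(n_months)
print(f"{standard_error:.2%}")  # 0.82%
```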

15
Q

Desirable properties of an estimator:

A
  1. Unbiasedness
  2. Efficiency
  3. Consistency
16
Q

Unbiasedness

A

An estimator for which the expected value is equal to the parameter you are trying to estimate.

E(x̄) = μ, i.e. the expected value of the sample mean equals the population mean.

17
Q

Efficiency

A

An unbiased estimator is efficient if the variance of its sampling distribution is smaller than that of all other unbiased estimators of the parameter you are trying to estimate.

e.g. the sample mean is an unbiased and efficient estimator of the population mean.

18
Q

Consistency

A

An estimator for which the accuracy of the parameter estimate increases as the sample size increases, meaning the standard error of the sample mean falls and the sampling distribution bunches more closely around the population mean.

As the sample size approaches infinity, the standard error approaches zero.

19
Q

Point estimates

A

These are single (sample) values used to estimate population parameters.

Estimator = the formula used to compute the point estimate.

20
Q

Confidence intervals

A

A range of values in which the population parameter is expected to lie.

21
Q

Student’s t-distribution

A

A bell-shaped probability distribution that is symmetrical about its mean.

Appropriate for:

  1. Constructing confidence intervals based on small samples (n < 30) from populations with unknown variance and a normal, or approximately normal, distribution.
  2. When the population variance is unknown and the sample size is large enough that the central limit theorem will assure the sampling distribution is approximately normal.
22
Q

Properties of Student's t-distribution:

A
  1. It is symmetrical.
  2. It is defined by a single parameter, the degrees of freedom (df), equal to the number of sample observations minus 1 (n − 1) for sample means.
  3. It has more probability in the tails (fatter tails) than the normal distribution.
  4. As the degrees of freedom (i.e. the sample size) increase, the shape of the t-distribution more closely approaches a standard normal distribution.
23
Q

Student t-distribution movement

A

As the number of observations increases (df increases), the t-distribution becomes more spiked and its tails become thinner.

As df increases without bound, the t-distribution converges to the standard normal distribution (z-distribution).

The thickness of the tails relative to those of the z-distribution matters in hypothesis testing, because thicker tails mean more observations away from the center of the distribution (more outliers).

Hypothesis testing using the t-distribution therefore makes it more difficult to reject the null hypothesis than hypothesis testing using the z-distribution.
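
A sketch of this convergence, using standard published two-tailed 5% critical values (hard-coded here, since the Python standard library has no t-distribution):

```python
from statistics import NormalDist

# Two-tailed 5% (alpha/2 = 0.025) critical values from a standard t-table.
t_crit = {5: 2.571, 10: 2.228, 30: 2.042, 120: 1.980}
z_crit = NormalDist().inv_cdf(0.975)  # standard normal, about 1.960

# The gap between the t and z critical values narrows as df grows.
for df in sorted(t_crit):
    print(df, t_crit[df], round(t_crit[df] - z_crit, 3))
```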

24
Q

Confidence interval

A

These estimates result in a range of values within which the actual value of a parameter will lie, given the probability 1 − α.

These are constructed by adding or subtracting an appropriate value from the point estimate.

25
Q

Alpha (α) vs. 1 − α

A

Alpha (α) = the level of significance for the confidence interval.

1 − α = the degree of confidence.

e.g. we might estimate that the population mean of a random variable will range from 15 to 25 with a 95% degree of confidence, or at a 5% level of significance.

26
Q

Formula for confidence intervals:

A

point estimate ± (reliability factor × standard error)

where:

point estimate = the value of a sample statistic of the population parameter.

reliability factor = a number that depends on the sampling distribution of the point estimate and the probability that the point estimate falls in the confidence interval (1 − α).

standard error = the standard error of the point estimate.
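
A sketch of this formula with hypothetical sample statistics, using the z reliability factor since the sample is large (n ≥ 30):

```python
import math
from statistics import NormalDist

# Hypothetical sample statistics.
sample_mean = 80.0
sample_stdev = 15.0
n = 36

standard_error = sample_stdev / math.sqrt(n)  # 15 / 6 = 2.5
reliability = NormalDist().inv_cdf(0.975)     # 95% two-tailed z, about 1.96

# point estimate +- (reliability factor x standard error)
lower = sample_mean - reliability * standard_error
upper = sample_mean + reliability * standard_error
print(round(lower, 2), round(upper, 2))  # 75.1 84.9
```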

27
Q

Perspectives of confidence intervals:

A
  1. Probabilistic interpretation
  2. Practical interpretation

28
Q

Probabilistic interpretation

A

After repeatedly taking samples of CFA candidates, administering the practice exam, and constructing confidence intervals for each sample's mean, 99% of the resulting confidence intervals will, in the long run, include the population mean.

29
Q

Practical interpretation

A

We are 99% confident that the population mean score is between 73.55 and 86.45 for candidates from this population.

30
Q

How to look up reliability factors in t-table:

A
  1. Compute df: n − 1.
  2. Find the appropriate level of alpha or significance, depending on whether the test concerns one tail (α) or two tails (α/2).

Confidence intervals are designed to be two-tailed, as they compute an upper and a lower limit.

  3. e.g. to find t29,2.5%, find the 29 df row and match it with the 0.025 column, resulting in t = 2.045.
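
The lookup can be checked against a small hard-coded slice of a published t-table (the 0.025 column around the example's row):

```python
# alpha/2 = 0.025 column of a standard t-table, near the example's row.
t_table_0p025 = {28: 2.048, 29: 2.045, 30: 2.042}

n = 30        # sample size
df = n - 1    # 29 degrees of freedom
print(t_table_0p025[df])  # 2.045
```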
31
Q

What to do if distribution is non normal?

A
  1. If the distribution is nonnormal but the population variance is known, the z-statistic can be used as long as the sample size is large (n ≥ 30). This is possible because the central limit theorem assures that the distribution of the sample mean is approximately normal when the sample is large.
  2. If the distribution is nonnormal and the population variance is unknown, the t-statistic can be used as long as the sample size is large (n ≥ 30); it is also acceptable to use the z-statistic, although the t-statistic is more conservative.

Overall:

  • When sampling from a nonnormal distribution, we cannot create a confidence interval if the sample size is less than 30. So, all else equal, make sure you have a sample of at least 30; the larger, the better.
32
Q

Limitations of ‘larger is better’, when selecting an appropriate sample size:

A
  1. Larger samples may contain observations from a different population distribution. If we include observations that come from a different population (with a different population parameter), we may not improve, and may even reduce, the precision of our population parameter estimates.
  2. The cost of using a larger sample must be weighed against the value of the increase in precision from the increase in sample size.
33
Q

Data mining

A

Occurs when analysts repeatedly use the same database to search for patterns or trading rules until one that works is discovered.

e.g. evidence that value stocks appear to outperform growth stocks has been argued to be a product of data mining, as the data set of historical stock returns is limited.

34
Q

Data mining bias

A

Results whose statistical significance is overestimated because the pattern was found through data mining.

35
Q

Warning signs of data mining:

A
  • evidence that many different variables were tested, most of which are unreported, until significant ones were found.
  • the lack of any economic theory that is consistent with the empirical results.

Solution:

  • to avoid data mining bias, test a potentially profitable trading rule on a data set different from the one used to develop the rule (i.e. use out-of-sample data).
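
A minimal sketch of an in-sample / out-of-sample split; the series length and the 70/30 split are hypothetical:

```python
# Stand-in for 120 months of return data (hypothetical).
returns = list(range(120))

split = len(returns) * 7 // 10   # 70% in-sample
in_sample = returns[:split]      # develop the trading rule here only
out_of_sample = returns[split:]  # evaluate the rule here only

print(len(in_sample), len(out_of_sample))  # 84 36
```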
36
Q

Sample selection bias

A
  • Occurs when some data is systematically excluded from the analysis, usually due to lack of availability.
  • This practice renders the observed sample nonrandom, and conclusions drawn from the sample cannot be applied to the population, because the observed sample and the portion of the population that was not observed are different.
37
Q

Survivorship bias

A
  • The most common form of sample selection bias.
  • A good example in investments is the study of mutual fund performance: mutual fund databases such as Morningstar's only include funds currently in existence, not funds that have ceased to exist through closure or merger.
  • Funds that drop out of the sample have lower returns than surviving funds, so the surviving sample is biased toward better funds (i.e. it is not random).
  • Such samples yield results that overestimate the average mutual fund return, because the database only includes the better-performing funds.
  • The solution to this bias is to use a sample of funds that all started at the same time and to keep funds in the sample even after they cease to exist.
38
Q

Look-ahead bias

A
  • Occurs when a study tests a relationship using sample data that was not available on the test date.
    e.g. consider the test of a trading rule based on the price-to-book ratio at the end of the fiscal year: stock prices are available for all companies at the same point in time, while end-of-year book values may not be available until 30 to 60 days after the fiscal year ends.
  • To account for this bias, a study that uses price-to-book ratios to test trading strategies might estimate book value as reported at fiscal year end and market value two months later.
39
Q

Time-period bias

A
  • Results if the time period over which the data is gathered is either too short or too long.
  • If too short, research results may reflect phenomena specific to that time period, or even data mining.
  • If too long, the fundamental economic relationships that underlie the results may have changed.
    e.g. a finding that small stocks outperformed large stocks during 1980-85 may suffer from time-period bias related to too short a period; it is unclear whether this was just an isolated occurrence.
    Alternatively, a study that quantifies the relationship between inflation and unemployment during 1940-2000 suffers from time-period bias because the period is too long and covers a fundamental change in both variables that occurred in the 1980s. The data should be divided into two subsamples that span the periods before and after the change.