Week Five Flashcards
(29 cards)
What do inductive statistics provide?
The means to estimate population parameters from sample statistics and test hypotheses.
What are the two golden rules of quants?
- Correlation does not imply causation.
- r is not the line of best fit.
At what level of measurement do we use parametric tests?
The continuous level - interval or ratio.
Parametric tests can only be used when?
The data is normally distributed.
What is the rule of thumb about big enough sample sizes?
The sample must be greater than 30 to be deemed large enough.
Define the sampling distribution of the sample mean.
This is the distribution of sample statistics that would be obtained if a large number of random samples of a given size were drawn from a given population.
It is a hypothetical distribution.
At what level of measurement do we use non-parametric tests?
They can be used with nominal or ordinal level data.
Non-parametric tests can be used when?
When we have a small sample and when we don't know anything about the underlying population.
What is the Central Limit Theorem (CLT)?
The theorem states that the sampling distribution of the sample mean approaches a normal distribution as the sample size grows.
What must happen for the CLT to apply?
There must be at least 10 cases of an event happening and of the event not happening.
What do we use the CLT for?
The CLT means that we are able to use sample statistics to estimate the value of population parameters, and to indicate how accurate those estimates are.
What are the four fundamental propositions of the CLT?
- Regardless of the distribution of the population, the sampling distribution of the sample mean will be approximately normal if the sample size is big enough.
- The mean of the sampling distribution of the sample means will be equal to the unknown population mean.
- The standard deviation of the sampling distribution indicates the range of possible error.
- The standard deviation of the sample is the best estimate of the population standard deviation.
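A small simulation can make the first three propositions concrete. This is a minimal sketch, not part of the original cards: the exponential population (mean 1, SD 1), the sample size of 40, and the helper name `sample_means` are all assumptions chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Assumed population: exponential with rate 1, so mean = 1 and SD = 1.
# It is strongly skewed, i.e. clearly not normal.
population_mean = 1.0
population_sd = 1.0
sample_size = 40


def sample_means(n, num_samples=5000):
    """Draw many random samples of size n and return the mean of each."""
    return [
        statistics.fmean([random.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]


means = sample_means(sample_size)

# The mean of the sample means should sit close to the population mean.
mean_of_means = statistics.fmean(means)

# The SD of the sample means should sit close to population SD / sqrt(n).
sd_of_means = statistics.stdev(means)
expected_se = population_sd / sample_size ** 0.5

print(round(mean_of_means, 2), round(sd_of_means, 3), round(expected_se, 3))
```

Plotting a histogram of `means` would show a roughly bell-shaped curve even though the population itself is skewed.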
What is the equation for standard error?
Standard error = population SD ÷ the square root of the number of cases.
How do we reduce standard error?
Increase the sample size.
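The effect of sample size on standard error can be checked with a few lines of arithmetic. This is an illustrative sketch: the SD of 15 and the sample sizes are made-up values, and `standard_error` is a hypothetical helper name.

```python
import math


def standard_error(sd, n):
    """Standard error of the mean: SD divided by the square root of n."""
    return sd / math.sqrt(n)


# Assumed population SD of 15, with the sample size quadrupling each step.
for n in (25, 100, 400):
    print(n, standard_error(15, n))
```

Because the sample size sits under a square root, quadrupling the sample only halves the standard error.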
Explain confidence intervals.
If we know the sample mean, we can fix a confidence interval around it to give us an interval estimate, which predicts the range within which the true population mean will fall.
What is the standard confidence interval? And what does it mean?
95% confidence interval.
A 95% confidence interval means that, across repeated samples, we expect the interval to include the population mean 95% of the time.
How can we increase confidence in interval estimates?
If we want to be more confident that the interval estimate includes the population mean, we can choose a wider confidence interval, at the cost of being less precise. E.g. 99% confidence interval.
How can we increase precision of interval estimates?
If we want to be more precise in the estimate of the population mean, we can choose a narrower confidence interval, at the cost of being less confident. E.g. 90% confidence interval.
What do we need to find the confidence interval?
The population SD and the z-value.
How do you calculate confidence intervals?
CI = mean ± z-score × standard error.
How do we calculate the alpha?
Alpha = 1 - the confidence level (e.g. 1 - 0.95 = 0.05).
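The confidence interval and alpha calculations can be sketched together. This is a rough illustration only: the sample mean of 50, SD of 10, and n of 100 are invented, `confidence_interval` is a hypothetical helper, and the interval is taken as the mean plus or minus the z-score times the standard error (SD divided by the square root of n).

```python
import math
from statistics import NormalDist


def confidence_interval(sample_mean, sd, n, confidence=0.95):
    """Return (low, high) for a z-based confidence interval around the mean."""
    alpha = 1 - confidence                    # e.g. 1 - 0.95 = 0.05
    z = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed z-value
    se = sd / math.sqrt(n)                    # standard error of the mean
    margin = z * se
    return sample_mean - margin, sample_mean + margin


# Assumed values: sample mean 50, SD 10, 100 cases.
for level in (0.90, 0.95, 0.99):
    low, high = confidence_interval(50, 10, 100, level)
    print(f"{level:.0%}: {low:.2f} to {high:.2f}")
```

The printed intervals widen as the confidence level rises from 90% to 99%, which is the confidence versus precision trade-off described on the earlier cards.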
When do we use the t-distribution?
When dealing with small samples and where we do not know the population SD.
For the same level of alpha the t-distribution uses?
A slightly wider confidence interval.
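To see why the t-based interval is wider, the critical values can be compared side by side. This is a sketch under stated assumptions: the t values below are typed in from a standard two-tailed t-table for alpha = 0.05 rather than computed, and the degrees of freedom shown are arbitrary examples.

```python
from statistics import NormalDist

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed z-value for alpha

# Two-tailed critical t values for alpha = 0.05, copied from a standard
# t-table (keys are degrees of freedom, i.e. sample size minus one).
t_crit_by_df = {5: 2.571, 10: 2.228, 20: 2.086, 30: 2.042}

for df, t_crit in t_crit_by_df.items():
    # A larger critical value means a wider interval for the same standard error.
    print(f"df={df}: t={t_crit}, z={z_crit:.3f}, ratio={t_crit / z_crit:.2f}")
```

The gap between t and z shrinks as the degrees of freedom grow, so the extra width matters most for small samples.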
What does the t-distribution look like?
It is symmetrical and bell-shaped.