Lecture 19 ARM Flashcards
Testing Ideas With Numbers - 6 / 7 (25 cards)
DISCLAIMER
We are not expected to know how to do any calculations
We need to know the themes and concepts - particularly those from the readings we are assigned.
Descriptive statistics vs inferential statistics
Descriptive: Describe what your SAMPLE looks like (mean, median, spread, etc.)
Inferential: What does the POPULATION look like based on the sample? (what does your sample tell you about the population?)
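A minimal sketch of the distinction (not from the lecture), assuming Python with numpy/scipy and hypothetical data: descriptive statistics summarise the sample itself, while inferential statistics use the sample to say something about the population.

```python
import numpy as np
from scipy import stats

sample = np.array([4.2, 5.1, 3.8, 6.0, 4.9, 5.5, 4.4, 5.2])  # hypothetical data

# Descriptive: what does the SAMPLE look like?
print("sample mean:", sample.mean())
print("sample std: ", sample.std(ddof=1))

# Inferential: what does the sample tell us about the POPULATION?
# 95% confidence interval for the population mean (t-based)
ci = stats.t.interval(0.95, df=len(sample) - 1,
                      loc=sample.mean(),
                      scale=stats.sem(sample))
print("95% CI for population mean:", ci)
```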
Inferential statistics assumptions
Problem: we usually do not know how the population is distributed, because we cannot study the entire population.
- We assume a NORMAL DISTRIBUTION and treat the sample accordingly (eg z-scores)
Central Limit Theorem
Even if the original population data are not normally distributed, the AVERAGES OF MANY SAMPLES will be distributed normally, as long as the sample size is large enough.
So, as the sample size gets large enough, the sampling distribution becomes almost normal regardless of the shape of the population (i.e. it looks like a bell curve).
Key to remember: it really depends on how you sample. You CANNOT run statistical inferences on a sample that was NOT drawn randomly or through probability sampling (quantitative sampling).
Theoretical justification for relying on the normal distribution: we can rely on it despite what the raw data might say, and make it bell-shaped through z-scores in order to make inferences.
Sample size gets larger - closer to normal distribution
This allows us to make claims about a population on the basis of just one sample, regardless of the distribution of the population.
EXPECT in exam! (A simulation sketch follows below.)
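A minimal simulation sketch of the Central Limit Theorem, assuming numpy is available (not part of the lecture): draw many samples from a clearly non-normal population and look at the distribution of the sample means.

```python
import numpy as np

rng = np.random.default_rng(42)

# Population that is NOT normal: a skewed exponential distribution
population = rng.exponential(scale=2.0, size=100_000)

# Draw many samples of size n >= 30 and record each sample's mean
n, n_samples = 30, 10_000
sample_means = np.array([rng.choice(population, size=n).mean()
                         for _ in range(n_samples)])

# The sample means cluster around the population mean and are
# approximately normally distributed (bell-shaped), despite the skew.
print("population mean:", population.mean())
print("mean of sample means:", sample_means.mean())
print("spread of sample means (~ standard error):", sample_means.std(ddof=1))
```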
Exceptional measures in normal distribution
The furthest values from the peak (mean-mode-median) of the bell curve
Another recap Central Limit Theorem
If you sample from a population, regardless of its distribution, and you take multiple samples from the same population over and over again, the sample means will be distributed normally.
This holds if the sample sizes are 30 or more (heuristic).
Thus, the theorem allows us to make claims about the population based on one sample!
However, we must be aware of coincidental errors - meaning that the sample could be a bit off from the population value. Therefore, we use STANDARD ERRORS and confidence intervals!
Statistical significance
How confident can we be that the findings from a sample can be generalised to the population as a whole?
A result is statistically significant when the data would be very unlikely if the null hypothesis were true.
Standard error
The standard error of the mean, or standard error, indicates how different the population mean is likely to be from the sample mean.
Decreases as the sample size (n) increases.
Directly related to the standard deviation of the population - more variability = a larger standard error (see the sketch below)
Standard error is used to construct the confidence interval
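As a hedged sketch of how these pieces fit together (hypothetical numbers, not from the lecture): the standard error of the mean is commonly computed as SE = s / sqrt(n), where s is the sample standard deviation and n the sample size.

```python
import numpy as np

sample = np.array([12, 15, 11, 14, 16, 13, 15, 12, 14, 13])  # hypothetical data
s = sample.std(ddof=1)          # sample standard deviation
n = len(sample)
se = s / np.sqrt(n)             # standard error of the mean

print("standard error:", se)    # shrinks as n grows, grows with variability
```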
Difference margin of error vs standard error
Standard error - how the population mean is likely to vary from the sample mean. Used in hypothesis testing to find statistical significance.
Margin of error - the amount you add to and subtract from the sample statistic to get a RANGE (the confidence interval) within which the true population value is likely to fall (sketch below)
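A minimal continuation of the sketch above, assuming scipy: for a 95% confidence interval, the margin of error is roughly the critical value times the standard error, and the interval is the sample mean plus/minus that margin (illustrative only).

```python
import numpy as np
from scipy import stats

sample = np.array([12, 15, 11, 14, 16, 13, 15, 12, 14, 13])     # hypothetical data
mean = sample.mean()
se = stats.sem(sample)                            # standard error
t_crit = stats.t.ppf(0.975, df=len(sample) - 1)   # two-sided 95% critical value

margin = t_crit * se                              # margin of error
print("95% CI:", (mean - margin, mean + margin))
```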
Testing procedure for statistical significance
1) Set up a null hypothesis
2) Decide on a level of statistical significance
3) Use a statistical test
4) Reject or fail to reject the null hypothesis
Hypothesis
Testable idea or prediction
We assume that there is no effect and then look at how surprising our data is under that assumption
-If data would be very UNLIKELY assuming no effect - we have evidence to reject the “no effect” idea and say that our alternative hypothesis is supported
Null hypothesis(H0)
We assume that there is no relationship between the variables we test. Nothing going on here! No effect, no difference, no association. We always assume H0 is true unless evidence strongly suggests otherwise.
Innocent until proven guilty!
Alternative hypothesis (H1)
The research hypothesis. What you want to find out, the reason you do the research in the first place. We assume that there is a relationship between the variables we are testing.
The steps of hypothesis testing
1) State the hypotheses: Both null hypothesis and alternative hypothesis
2) Choose significance level (alpha): commonly a = 0.05 or 5%, which is how much risk of error you accept
3) Collect data and compute test statistic: gather your sample data and use an appropriate statistical test (t-test, chi-square, etc.) to get a test value
4) Calculate the p-value: find the probability (p) of getting a result as extreme as yours if H0 were true
5) Make a decision: compare your p-value to alpha. If p < a, the data are unlikely under H0, hence reject H0 and accept H1. If p > a, the data are not that unlikely under H0, so you fail to reject H0 and do not accept H1. (A worked sketch follows below.)
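A minimal end-to-end sketch of these steps using a two-sample t-test in scipy (hypothetical data; the lecture does not require the calculation itself):

```python
import numpy as np
from scipy import stats

# 1) H0: the two groups have the same mean; H1: the means differ
group_a = np.array([5.1, 4.8, 5.6, 5.0, 4.9, 5.3])   # hypothetical data
group_b = np.array([5.9, 6.2, 5.7, 6.1, 5.8, 6.0])

# 2) Choose the significance level before looking at the data
alpha = 0.05

# 3) + 4) Compute the test statistic and p-value (independent two-sample t-test)
t_stat, p_value = stats.ttest_ind(group_a, group_b)

# 5) Decision: compare p to alpha
if p_value < alpha:
    print(f"p = {p_value:.3f} < {alpha}: reject H0")
else:
    print(f"p = {p_value:.3f} >= {alpha}: fail to reject H0")
```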
One-tailed hypothesis
H1 is in one direction (greater than or less than)
eg “Tutorial teacher presence has a positive impact on student learning” or
“Tutorial teacher presence has a negative impact on student learning”
Two-tailed hypothesis
H1 is in two directions - acknowledges a difference without signalling the direction.
eg “Tutorial teacher presence has an impact on learning for students”.
More common to stay open (two-tailed) - unless you have very good reasons to go against it. (Sketch below.)
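A small sketch of the same test run two-tailed vs one-tailed, assuming scipy's `alternative` parameter (available in scipy >= 1.6); the data are hypothetical.

```python
import numpy as np
from scipy import stats

with_tutor = np.array([7.2, 6.8, 7.5, 7.0, 7.3])   # hypothetical scores
without    = np.array([6.5, 6.9, 6.4, 6.7, 6.6])

# Two-tailed: "presence has SOME impact" (direction unspecified)
print(stats.ttest_ind(with_tutor, without, alternative="two-sided").pvalue)

# One-tailed: "presence has a POSITIVE impact" (direction specified)
print(stats.ttest_ind(with_tutor, without, alternative="greater").pvalue)
```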
Failing to reject H0..
does not mean that H0 is true!
We avoid saying “accept H0” - we may need more data, or something else.
Say rather: “we found no significant effect”
Significance level
= alpha.
the cutoff probability for “rare enough to reject H0”.
commonly 0.05
chosen BEFORE analysis
a = 0.05 means we accept a 5% chance of mistakenly seeing an effect that is not there (Type 1 error).
A smaller a, eg 0.01, makes it harder to declare statistical significance: lower false-alarm risk, but a higher chance of missing a real effect (Type 2 error)
P-value
p = probability
the probability of observing data at least as extreme as ours if H0 is true.
A small p-value = the data would be very unlikely if H0 were true, casting doubt on H0
Eg p = 0.03 means that there is a 3% chance of seeing data like ours if the null were true.
Decision rule: if the p-value is < a, we call the result statistically significant and thereby reject H0. It does not necessarily prove H1, but counts in its favour.
p < 0.05 = significant (*)
p < 0.01 = highly significant (**)
P-VALUE does not mean the chance of H0 being true - it means that IF H0 WERE TRUE, a result as extreme as ours would occur only that often (eg 3% of the time, not a 3% chance of H0 being true). See the sketch below.
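A small illustrative sketch of that interpretation: the p-value is computed from the test statistic under H0, here from a hypothetical z-statistic and the standard normal distribution.

```python
from scipy import stats

z = 2.17  # hypothetical test statistic computed from a sample

# Two-sided p-value: probability of a result at least this extreme IF H0 is true
p = 2 * (1 - stats.norm.cdf(abs(z)))
print(p)  # ~0.03: data this extreme would occur ~3% of the time under H0,
          # NOT a 3% chance that H0 is true
```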
Visualisation of the alpha
One-sided test: the whole 0.05 rejection area sits in one tail of the bell curve (left tail for “less than”, right tail for “greater than”). You disregard the possibility of a relationship in the other direction, but gain more power to detect an effect in the chosen direction.
Two-sided test: the rejection area is split, 0.025 in each tail of the bell curve.
One sided vs two sided testing
Undirected: There is an association, but we do not know the direction (two directions)
Directed: There is an association and we do know the direction (one direction)
Type 1 error
Also called a “false positive”.
This occurs when you reject H0, but H0 is actually true.
- we think we found a significant effect, but in reality it is not there
- the probability of a Type 1 error is alpha.
eg concluding that a ritual impacts health, when it actually does not
Type 2 errors
Also called “false negative”
Failing to reject H0 when H0 is false - we fail to detect an effect that is there.
- We think that there is no significant result, but we just missed it
- The probability of a Type 2 error is beta - depends on sample size, effect size and variability.
Trade off type 1 and 2 errors
If you lower the alpha value, eg to 0.01, you reduce the Type 1 error risk - but you also increase the Type 2 error risk, as it becomes harder to find a real effect. Balancing act.
a = 0.05 is a compromise to balance Type 1 and Type 2 errors.
Tip - it should be possible to replicate the study to ensure the result is not a one-time fluke.
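A minimal simulation sketch (not from the lecture, assuming numpy/scipy) showing why alpha is the Type 1 error rate: when H0 is actually true, about 5% of tests at alpha = 0.05 still come out “significant” purely by chance.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, n_experiments = 0.05, 10_000
false_positives = 0

for _ in range(n_experiments):
    # H0 is TRUE here: both groups come from the same population
    a = rng.normal(loc=0, scale=1, size=30)
    b = rng.normal(loc=0, scale=1, size=30)
    if stats.ttest_ind(a, b).pvalue < alpha:
        false_positives += 1

# Roughly 5% of the experiments reject H0 even though it is true (Type 1 errors)
print("false positive rate:", false_positives / n_experiments)
```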