Lecture 3 Flashcards
(45 cards)
What drives our efforts to calculate estimates like means and SDs?
Our efforts are usually driven by a desire to make comparisons amongst these estimates or quantify the strength of an association between variables.
What do we typically do before collecting data?
We make hypotheses about what we expect, usually driven by existing research theory.
What is hypothesis testing?
Hypothesis testing involves making one or more assumptions about a set of variables and then testing those assumptions statistically.
What are examples of hypotheses?
Examples include: Women use social media more than men; Women are more sexually attracted to intelligence than men.
What are null hypotheses?
Null hypotheses posit a neutral position, such as: Men and women use social media the same amount; Women and men are equally sexually attracted to intelligence.
What are popular methods for making statistical decisions?
Popular methods include significance testing (p values), effect sizes, and confidence intervals.
What does statistical decision making often rely on?
Statistical decisions often rely on probability, calculating the likelihood that findings are meaningful.
What is significance testing?
Significance testing tests our assumptions against a null hypothesis, using probability to ascertain how likely our observed effect would be if the null were true.
If that probability is small, we reject the null hypothesis.
Example of a P value
You’re testing if a new teaching method improves test scores.
Null Hypothesis: The new method does not improve test scores.
Alternative Hypothesis: The new method does improve test scores.
After the test, you get a p-value of 0.03.
What does this mean?
p-value = 0.03: Since this is less than 0.05, you reject the null hypothesis and conclude there is evidence that the new teaching method improves test scores.
If the p-value were higher than 0.05, like 0.08, you would not reject the null hypothesis (there is no significant evidence that the new method has an effect).
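A minimal sketch of how such a p value could be computed, assuming an independent-samples t test in Python; the scores and group sizes below are invented for illustration, not taken from the lecture:

```python
# Sketch: independent-samples t test on hypothetical test scores.
from scipy import stats

old_method = [62, 70, 65, 68, 71, 66, 64, 69]   # invented scores
new_method = [72, 75, 70, 78, 74, 71, 76, 73]   # invented scores

t_stat, p_value = stats.ttest_ind(new_method, old_method)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")

# Decision rule from the lecture: reject the null if p < .05
if p_value < 0.05:
    print("Reject the null: evidence the new method improves scores.")
else:
    print("Retain the null: no significant evidence of improvement.")
```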
What happens if the p value is less than .05?
If a p value is less than .05, we refer to it as a meaningful (statistically significant) effect, indicating it is very unlikely we would observe it if the null hypothesis were true.
We would then assume our effect reflects the truth to some degree.
What happens if the p value is more than .05?
- If a p value is more than .05, the result is not significant and there is no meaningful effect (we retain the null hypothesis)
How do we get p values?
- There are several statistical tests that can yield p values
- Z test
- T test
- F test
- For these tests, p values are derived from a test distribution, which can change depending on the degrees of freedom, or df (see the sketch below)
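A minimal sketch of that idea, assuming Python with scipy; the t statistic and df values are invented, but they show how the same statistic yields different p values as the df (and therefore the shape of the test distribution) changes:

```python
# Sketch: p values come from a test distribution whose shape depends on df.
from scipy import stats

t_stat = 2.10                             # invented t statistic
for df in (5, 20, 100):
    p = 2 * stats.t.sf(abs(t_stat), df)   # two-tailed p from the t distribution
    print(f"df = {df:3d}: p = {p:.3f}")
# The same t value gives a different p value at each df.
```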
What does the size of a p-value tell us about the chance of making a mistake or the meaningfulness of an effect?
- A p value not only represents the chance of observing our effect if the null were true, it also represents the margin of error we allow ourselves
- i.e., the smaller the p value, the less chance we have made a mistake (rejected the null when we shouldn’t have)
- The larger the p value, the greater chance that there is no effect or it is just chance (it is not meaningful)
What is the standard threshold for statistical significance?
The standard threshold for statistical significance is 5% (or 0.05).
What are degrees of freedom (df)?
- Df represents the number of scores within a sample that are free to vary
Example:
- Suppose 4 people's ages must sum to 100.
- The first 3 people can be any age, but the last person's age is fixed by that total, so it is not free to vary (df = 4 - 1 = 3).
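A tiny sketch of that example in Python; the first three ages are invented, and only the constraint matters:

```python
# Sketch of the df example: 4 ages constrained to sum to 100.
total = 100
free_ages = [30, 25, 20]            # these three are free to vary (invented values)
last_age = total - sum(free_ages)   # this one is fixed by the total, not free to vary
print(free_ages + [last_age])       # [30, 25, 20, 25] -> df = 4 - 1 = 3
```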
What are test statistics?
- Test statistics also come from distributions
- The difference is that the thresholds for significance change depending on the sample size (represented as df)
- Larger sample sizes have a lower threshold for significance
- The t distribution is designed to change as the df changes (the z distribution does not; it is a fixed distribution based on a very large sample)
- When the df is smaller, the threshold has to get larger (sketched below)
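A minimal sketch of how the significance threshold (the critical value) shifts with df, assuming a two-tailed test at alpha = .05 in Python:

```python
# Sketch: critical t values shrink as df grows, approaching the fixed critical z.
from scipy import stats

for df in (5, 10, 30, 1000):
    t_crit = stats.t.ppf(0.975, df)        # two-tailed critical t at alpha = .05
    print(f"df = {df:4d}: critical t = {t_crit:.3f}")

print(f"z (fixed)  : critical z = {stats.norm.ppf(0.975):.3f}")
# Smaller df -> larger threshold; large df -> threshold approaches z = 1.96.
```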
What is F distribution?
- The principle is similar for the F distribution
- The F distribution is asymmetrical
- Thresholds for significance are determined by df as well
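A small sketch, assuming Python with scipy; the F distribution's threshold depends on a pair of df values (between-groups and within-groups), which are invented here for illustration:

```python
# Sketch: critical F values are set by a pair of degrees of freedom.
from scipy import stats

for df_between, df_within in [(2, 10), (2, 30), (4, 30)]:
    f_crit = stats.f.ppf(0.95, df_between, df_within)   # alpha = .05
    print(f"df = ({df_between}, {df_within}): critical F = {f_crit:.2f}")
```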
What are effect sizes?
Effect sizes quantify the difference between two means or the strength of an association, used as an alternative or complement to a p value.
- The coefficient r is the most common effect size for the strength of an association
- It can be positive or negative
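A minimal sketch of computing r, assuming Python with scipy; the two variables are invented purely to illustrate an association:

```python
# Sketch: Pearson's r as an effect size for the strength of an association.
from scipy import stats

hours_online   = [1, 2, 2, 3, 4, 5, 6, 7]      # invented data
posts_per_week = [2, 3, 5, 4, 7, 8, 9, 12]     # invented data

r, p = stats.pearsonr(hours_online, posts_per_week)
print(f"r = {r:.2f} (can be positive or negative), p = {p:.3f}")
```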
What is Cohen’s d?
- Cohen's d is used to understand the difference between two groups
Cohen's d is the most common effect size for comparing means and can be positive or negative.
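A minimal sketch of Cohen's d using a pooled standard deviation, assuming Python with numpy; the two groups of scores are invented:

```python
# Sketch: Cohen's d for the difference between two group means (pooled SD).
import numpy as np

group_a = np.array([72, 75, 70, 78, 74, 71, 76, 73])   # invented scores
group_b = np.array([62, 70, 65, 68, 71, 66, 64, 69])   # invented scores

n_a, n_b = len(group_a), len(group_b)
pooled_sd = np.sqrt(((n_a - 1) * group_a.var(ddof=1) +
                     (n_b - 1) * group_b.var(ddof=1)) / (n_a + n_b - 2))
d = (group_a.mean() - group_b.mean()) / pooled_sd
print(f"Cohen's d = {d:.2f}  (sign depends on the direction of the difference)")
```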
What are the 4 outcomes when making a statistical decision?
- Hit and correct rejection: we have made the right decision
- False alarm: type 1 error (it is hard to know if you have made one)
- Miss: type 2 error (often caused by error such as a small sample size or poor measurement)
What are type 1 and type 2 errors?
Type 1 error occurs when we incorrectly reject the null hypothesis, while type 2 error occurs when we incorrectly retain the null hypothesis.
What is the purpose of confidence intervals?
Confidence intervals estimate a range that would capture 95% of sample means, leaving an error rate of 5%.
How is standard error related to confidence intervals?
Standard error is an estimate of how much error surrounds our mean estimate and is used to create confidence intervals.
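A minimal sketch of building a 95% confidence interval from the standard error, assuming Python with numpy and the usual 1.96 multiplier; the scores are invented:

```python
# Sketch: 95% confidence interval from the standard error of the mean.
import numpy as np

scores = np.array([3.1, 2.8, 3.5, 4.0, 2.9, 3.3, 3.8, 3.6])   # invented data
mean = scores.mean()
se = scores.std(ddof=1) / np.sqrt(len(scores))    # standard error of the mean
ci_low, ci_high = mean - 1.96 * se, mean + 1.96 * se
print(f"mean = {mean:.2f}, SE = {se:.2f}, 95% CI = [{ci_low:.2f}, {ci_high:.2f}]")
```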
How to make statistical decisions using Confidence Intervals?
- We can make statistical decisions using confidence intervals (similar to p values)
Example:
- If the female and male intervals overlap, we can retain the null hypothesis
- If the female and male intervals do not overlap, we can reject the null hypothesis
- The gap in the graph means it is unlikely that these two means were sampled from the same distribution/population
- It tells us males and females probably have different social media use behaviour
- (We would reject the null hypothesis as the intervals do not overlap; see the sketch below)
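A minimal sketch of that overlap rule, assuming hypothetical 95% confidence intervals for the two groups (the bounds are invented, not lecture data):

```python
# Sketch: decide by checking whether two 95% confidence intervals overlap.
female_ci = (4.2, 5.0)   # invented lower and upper bounds
male_ci   = (2.8, 3.6)   # invented lower and upper bounds

# Two intervals overlap if each one starts before the other ends
overlap = female_ci[0] <= male_ci[1] and male_ci[0] <= female_ci[1]
if overlap:
    print("Intervals overlap: retain the null hypothesis.")
else:
    print("Intervals do not overlap: reject the null hypothesis.")
```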