Module 6 - Inferential Statistics Flashcards
Inferential Statistics
- the methods we use to make inferences about a population based on our sample
- how we test hypotheses
- make inferences from sample to population
Population
- the larger group of all participants of interest to the researcher
Sample
- subset of the population
- never represents the population perfectly, due to sampling error
Sampling Error
- natural variability you expect from one sample to another
- not really an error (not a mistake by the researcher), just chance variation
Population Parameter
- a descriptive statistic (e.g., a measure of central tendency or a measure of variability)
- computed from everyone in the population
Sample Statistic
- descriptive statistic (e.g., the mean) computed from everyone in the sample
- not a true representation of the population parameter because of sampling error
- an approximation of the population parameter
- deviates from the parameter because of sampling error
- close to the parameter but not perfect
if we had an infinite number of samples
- the distribution of the means of an infinite number of samples would form a normal curve
- even if each sample itself is skewed, the plot of means would be approximately normally distributed (as long as each sample is reasonably large)
Central Limit theorem
- if we draw a large number of random samples of sufficiently large size from a population, the means of those samples will form an approximately normal distribution
- can never draw an infinite number of samples
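A minimal simulation sketch of this idea, assuming a right-skewed exponential population with mean 1.0, a sample size of 50, and 2000 repeated samples (all illustrative choices, not values from the module). If the sample means pile up symmetrically around the population mean, their mean and median should sit close together even though the population itself is skewed.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable


def sample_mean(n):
    """Mean of one random sample of size n from a right-skewed population."""
    # Assumption: exponential population with mean 1.0 (illustrative only).
    return statistics.fmean([rng.expovariate(1.0) for _ in range(n)])


# Draw many samples and keep only their means.
sample_means = [sample_mean(50) for _ in range(2000)]

center = statistics.fmean(sample_means)
middle = statistics.median(sample_means)
print(f"mean of sample means:   {center:.3f}")
print(f"median of sample means: {middle:.3f}")
```

A histogram of `sample_means` would show the bell shape more directly; comparing mean and median is only a rough symmetry check.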
Sampling Distribution of the Mean
- plot or distribution of means from different samples of the same population
- forms an approximately normal curve
Law of Large Numbers
- the larger the sample, the more closely the sample mean will approximate the population mean
- the larger the sample, the less the mean is impacted by outliers
- the larger the sample, the smaller the spread (SD) of the sample means, so the means of different samples will be similar in value
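A rough sketch of how sample means from larger samples cluster more tightly, assuming a normal population with mean 100 and SD 15 (made-up values for illustration). The helper name `spread_of_sample_means` is hypothetical.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the sketch is repeatable


def spread_of_sample_means(n, repeats=500):
    """SD of the means of `repeats` random samples, each of size n."""
    # Assumption: normal population with mean 100 and SD 15 (illustrative).
    means = [
        statistics.fmean([rng.gauss(100, 15) for _ in range(n)])
        for _ in range(repeats)
    ]
    return statistics.stdev(means)


small_n_spread = spread_of_sample_means(10)
large_n_spread = spread_of_sample_means(1000)
print(f"spread of means, n = 10:   {small_n_spread:.2f}")
print(f"spread of means, n = 1000: {large_n_spread:.2f}")
```

The second number should come out roughly ten times smaller than the first, since the spread of sample means shrinks with the square root of the sample size.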
Characteristics of Sampling Distributions
- its mean approximates the population mean
- approximately normal in shape
- can be used to answer probability questions about the population
Standard deviation of the Sampling Distribution of the Mean
- Standard Error of the Mean
Standard Error of the Mean
- describes how much sample means vary around the population mean (μ, “mu”)
- a known percentage of sample means will fall within 1, 2, or 3 standard error units of the mean
68% of the sample means fall within +/- 1 standard error unit of the mean of the sampling distribution
95% within +/- 2 standard error units
99.7% within +/- 3 standard error units - like the standard deviation
- difference due to chance
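The percentages on this card are rounded values taken from the normal curve. A short sketch that computes them with Python's standard library, assuming the sampling distribution is exactly normal:

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, SD 1; one unit = one standard error

# Share of the curve lying within +/- k standard error units of the mean.
coverage = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)
}

for k, share in coverage.items():
    print(f"within +/- {k} standard error units: {share:.1%}")
```

The exact 95% cutoff is about 1.96 standard error units; "2" is the usual rounded rule of thumb.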
Can never obtain a Sampling Distribution of the Mean because
- can never collect an infinite number of samples
- therefore, can’t calculate the standard error of the mean directly; have to estimate it from a sample
Confidence Intervals
- can estimate the standard error of the mean to calculate these
smaller the standard error
- the narrower our confidence intervals will be
- want our standard error to be as small as possible
- want the distribution to be tall and skinny
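A minimal sketch of a 95% confidence interval built from one sample. It assumes the standard error is estimated as the sample SD divided by the square root of n, and uses the large-sample cutoff of 1.96; with a sample this small a t-based cutoff would normally be used instead. The scores and the function name `confidence_interval_95` are made up for illustration.

```python
import math
import statistics


def confidence_interval_95(sample):
    """Approximate 95% confidence interval for the population mean."""
    n = len(sample)
    mean = statistics.fmean(sample)
    # Estimated standard error of the mean: sample SD / sqrt(n).
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    margin = 1.96 * standard_error  # assumption: large-sample z cutoff
    return mean - margin, mean + margin


scores = [72, 85, 90, 66, 78, 81, 94, 70, 88, 76]  # made-up data
low, high = confidence_interval_95(scores)
print(f"95% CI: {low:.2f} to {high:.2f}")
```

Because the margin is just a multiple of the standard error, anything that shrinks the standard error narrows the interval by the same proportion.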
Influences to the size of the standard error of the mean
- if variability of the variable is large within the population, then the standard error will be large
- if the variability of the variable is small within the population, then the standard error will be smaller and therefore the sampling distribution will be tall and skinny
- Law of Large Numbers: the larger the sample size, the smaller the standard error, due to less influence of outliers
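Both influences show up in the usual formula for the standard error of the mean, the population SD divided by the square root of the sample size. A small sketch with made-up numbers (the function name `standard_error` is hypothetical):

```python
import math


def standard_error(population_sd, n):
    """Standard error of the mean: population SD divided by sqrt(n)."""
    return population_sd / math.sqrt(n)


high_variability = standard_error(15, 25)   # large population SD
low_variability = standard_error(5, 25)     # small population SD, same n
bigger_sample = standard_error(15, 100)     # large population SD, larger n

print(high_variability, low_variability, bigger_sample)
```

Note that quadrupling the sample size only halves the standard error, because n sits under a square root.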
Null Hypothesis
- for hypothesis testing
- no difference between our sample mean and the population mean (they come from the same distribution)
- no difference between 2 group means because they come from the same population
Reject Null Hypothesis
- what we want
- says the 2 groups are from 2 different population distributions
- there is a difference between groups
Fail to reject null hypothesis
- comes from not having enough evidence
- cannot conclude that the 2 groups come from different population distributions
- no difference was detected (which is not proof that no difference exists)
- say “fail to reject” because we can never prove anything true
test the null hypothesis by
- Test statistic
- test stat = observed difference / difference due to chance (standard error of the mean)
big or small test stat?
- want a big test statistic, so it can fall in the critical region of rejection (and therefore we can reject the null hypothesis)
- observed difference would be a large number (numerator)
- want difference due to chance (denominator) to be very small
Z-distribution
- use to determine if our sample mean differs from the population mean
- if our observed z value falls into the extreme regions, which are defined by the alpha value, then we reject the null hypothesis
- want a large z value (in absolute value) so it falls in the rejection region
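A sketch of a two-tailed z test for one sample mean, assuming the population SD is known and the numbers below are made up. The z value is the observed difference divided by the standard error, and the cutoff for the rejection region comes from alpha. The function name `z_test` is hypothetical.

```python
import math
from statistics import NormalDist


def z_test(sample_mean, population_mean, population_sd, n, alpha=0.05):
    """Return (z, critical value, reject?) for a two-tailed z test."""
    standard_error = population_sd / math.sqrt(n)          # difference due to chance
    z = (sample_mean - population_mean) / standard_error   # observed / chance
    critical = NormalDist().inv_cdf(1 - alpha / 2)         # edge of rejection region
    return z, critical, abs(z) > critical


z, critical, reject = z_test(
    sample_mean=104, population_mean=100, population_sd=15, n=100
)
print(f"z = {z:.2f}, critical value = {critical:.2f}, reject null: {reject}")
```

With a sample mean of 101 instead of 104, the same call gives a z value well inside the cutoff, so the null hypothesis would not be rejected.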
what does α = 0.05 mean
- means that when the null hypothesis is true, we would make an error 5 out of 100 times
- 5% chance of incorrectly rejecting the null hypothesis when it is true; Type 1 Error
- as we lower the alpha value, we can be more confident that a rejection is not a Type 1 Error
- a result this extreme would occur by chance less than 5 out of 100 times if the null hypothesis were true
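A simulation sketch of what alpha means in practice. Every sample here is drawn from a population where the null hypothesis is true, so each rejection is a Type 1 Error. The population values (mean 100, SD 15), the sample size of 30, and the 4000 trials are assumptions chosen for illustration.

```python
import math
import random
import statistics
from statistics import NormalDist

rng = random.Random(42)  # fixed seed so the sketch is repeatable
ALPHA = 0.05
CRITICAL = NormalDist().inv_cdf(1 - ALPHA / 2)  # two-tailed cutoff


def null_is_rejected(n=30, mu=100, sigma=15):
    """Draw one sample from the null population and run a z test on it."""
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    z = (statistics.fmean(sample) - mu) / (sigma / math.sqrt(n))
    return abs(z) > CRITICAL


trials = 4000
false_alarms = sum(null_is_rejected() for _ in range(trials))
type_1_rate = false_alarms / trials
print(f"share of true nulls rejected: {type_1_rate:.3f}")
```

The printed share should land near 0.05; setting `ALPHA` lower would make these false alarms rarer.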