AP Stat Ch 7-8 Flashcards
Statistic vs parameter
Any quantity computed from the values in the sample is a statistic. X bar and s and p hat. These vary from sample to sample
The corresponding population characteristic is usually called a parameter. Mu and sigma and p. These are constant
Statistical inference
The process of drawing conclusions about a population based on data from a sample of that population
Sampling variability
Since the value of the statistic (estimate) depends on the sample selected, the value of the statistic will vary from sample to sample. This variability is called sampling variability.
NOTE:
Sampling variability is the variability of X bar, not X
For example, if X bar = the average weight of a sample of 50 students, then X bar also has variability. This is sampling variability.
Sampling distribution
The distribution of a statistic is called its sampling distribution. It is the distribution of all the values of the stat based on all possible samples of the same size.
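A minimal sketch of a sampling distribution, assuming a made-up population of four values so that every possible sample of size 2 can be listed. The population and the variable names are illustrative, not from the course.

```python
from itertools import combinations
from collections import Counter

# Hypothetical tiny population, chosen only so every sample can be listed.
population = [2, 4, 6, 8]
n = 2  # sample size

# Every possible sample of size n (without replacement) and its sample mean.
sample_means = [sum(s) / n for s in combinations(population, n)]

# The sampling distribution: each possible value of x bar and how often it occurs.
sampling_distribution = Counter(sample_means)

for value, count in sorted(sampling_distribution.items()):
    print(value, count / len(sample_means))
```

Each printed row is one possible value of x bar and the fraction of samples that produce it.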
What to do when describing distribution
Say the shape, center, and spread!
Bias
Sampling distributions allow us to describe bias more precisely by speaking of the bias of a statistic rather than bias in a sampling method. Bias concerns the center of the sampling distribution.
Unbiased estimator
A statistic used to estimate a parameter is unbiased if the mean of its sampling distribution is equal to the true value of the parameter being estimated. The statistic is called an unbiased estimator of the parameter.
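A small check of this definition, assuming the same kind of made-up four-value population. It compares the mean of the sampling distribution of x bar with mu, and does the same for the sample maximum as an estimate of the population maximum. The population and names are illustrative only.

```python
from itertools import combinations
from statistics import mean

population = [2, 4, 6, 8]     # hypothetical population
mu = mean(population)          # parameter: population mean
pop_max = max(population)      # parameter: population maximum

# All possible samples of size 2, without replacement.
samples = list(combinations(population, 2))

# Center of the sampling distribution of each statistic.
mean_of_xbars = mean([sum(s) / len(s) for s in samples])
mean_of_maxes = mean([max(s) for s in samples])

print("x bar:", mean_of_xbars, "vs mu =", mu)
print("sample max:", mean_of_maxes, "vs population max =", pop_max)
```

In this toy case the sample means center exactly on mu, while the sample maximum centers below the population maximum, so it is a biased estimator of that parameter.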
Variability of a statistic
The variability of a statistic is described by the spread of its sampling distribution. This spread is determined by the sampling design and the size of the sample. Larger samples give smaller spread.
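To see "larger samples give smaller spread" directly, here is a sketch that lists every possible sample from a made-up population of six values and compares the spread of x bar for n = 2 and n = 4. The population and the helper name are assumptions for illustration.

```python
from itertools import combinations
from statistics import pstdev

population = [2, 4, 6, 8, 10, 12]  # hypothetical population

def spread_of_sample_means(n):
    """Standard deviation of x bar over all possible samples of size n."""
    sample_means = [sum(s) / n for s in combinations(population, n)]
    return pstdev(sample_means)

spread_n2 = spread_of_sample_means(2)
spread_n4 = spread_of_sample_means(4)

print("n = 2:", spread_n2)
print("n = 4:", spread_n4)
```

The n = 4 spread comes out smaller than the n = 2 spread, matching the card.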
Population proportion vs sample proportion
Population is p and sample is p hat
P hat = X/n where X is the number of successes and n is the sample size.
X ~ B(n, p) (X is binomial with n trials and success probability p)
Mu p hat and sigma p hat
Mu p hat = p
Sigma p hat = SQRT (p(1-p)/n)
NOTE: the population must be at least ten times as large as the sample to use this formula for sigma p hat, and p hat is only approx normal if np and n(1-p) are BOTH >=10
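The two formulas and the two conditions can be bundled into one helper. This is only a sketch: the function name `p_hat_distribution` and its arguments are made up for illustration.

```python
from math import sqrt

def p_hat_distribution(p, n, population_size=None):
    """Return (mean, sd, normal_ok) for the sampling distribution of p hat.

    normal_ok is True only when np and n(1-p) are both at least 10.
    Raises ValueError if a population size is given and it is smaller
    than ten times the sample size.
    """
    if population_size is not None and population_size < 10 * n:
        raise ValueError("population must be at least 10 times the sample size")
    mean = p                        # mu p hat = p
    sd = sqrt(p * (1 - p) / n)      # sigma p hat = SQRT(p(1-p)/n)
    normal_ok = n * p >= 10 and n * (1 - p) >= 10
    return mean, sd, normal_ok

print(p_hat_distribution(0.43, 50, population_size=500))
print(p_hat_distribution(0.43, 10))
```

With p = .43, the normal check passes for n = 50 but fails for n = 10, since np is only 4.3.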
Ex: suppose the proportion of HM students with a graphing calculator is p=.43. If you were to take a sample of size 50, what is the probability that in the sample, 30% or less had a graphing calculator?
Let p hat = sample proportion of people with a GC
np = 21.5 and n(1-p) = 28.5 are both greater than or equal to 10, and assume the population is at least 500
So, mu p hat = .43 and sigma p hat = SQRT (.43(.57)/50) = .07
Z = (.3-.43)/.07 = -1.86. normalcdf(-100000000, -1.86, 0, 1) = .031 (approx)
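The same calculation can be checked without a calculator. This sketch uses Python's standard-library `statistics.NormalDist` in place of normalcdf; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p = 0.43   # population proportion from the example
n = 50     # sample size

sd = sqrt(p * (1 - p) / n)   # sigma p hat
z = (0.30 - p) / sd          # z-score for p hat = 0.30

# P(p hat <= 0.30) under the normal approximation.
prob = NormalDist(mu=p, sigma=sd).cdf(0.30)

print("sigma p hat:", round(sd, 4))
print("z:", round(z, 2))
print("P(p hat <= 0.30):", round(prob, 4))
```

The probability comes out at roughly 3%. It can differ slightly in the later decimals from the hand calculation because the code does not round sigma p hat or z.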
If the sample size is only ten, then np = 4.3 < 10, so the normal approx can't be used. Instead X ~ B(10, .43); use the binomial distribution directly
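For the n = 10 case, "30% or less" means X is 3 or fewer, so the probability is a binomial sum. A minimal sketch, assuming a hand-written helper `binom_cdf` (a made-up name) in place of the calculator's binomcdf:

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n = 10
p = 0.43
k = 3   # 30% of 10 students

prob = binom_cdf(k, n, p)
print("P(X <= 3):", round(prob, 4))
```

The result is about 0.31, the exact binomial probability for this small-sample version of the question.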
Sample mean rules:
- Mu sub x bar = mu
- Sigma sub x bar = sigma / SQRT (n)
- Pop needs to be at least ten times the sample
- n needs to be at least 30 for x bar to be approx normal (CLT)
Ex. Suppose that the population of residents of a local town has a mean annual income of 50000 and standard deviation of 35000. If a random sample of 50 residents is selected, what is the probability that their sample mean is within $5000 of the true mean?
n = 50 is >= 30 and assume at least 500 people in the town
So, mu sub x bar = 50000 and sigma sub x bar = 35000/SQRT(50) = 4950 (approx)
Z = (45000 - 50000) / 4950 = -1.01 and Z = (55000 - 50000) / 4950 = 1.01
Then use normalcdf(-1.01, 1.01, 0, 1) = .69 (approx)
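A sketch of the income example in code, again using the standard-library `statistics.NormalDist` as a stand-in for normalcdf. "Within $5000 of the true mean" is read as x bar falling between 45000 and 55000; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu = 50_000      # population mean income
sigma = 35_000   # population standard deviation
n = 50           # sample size

sd_xbar = sigma / sqrt(n)   # sigma sub x bar = sigma / SQRT(n)

# Approximate sampling distribution of x bar (CLT, since n >= 30).
xbar = NormalDist(mu=mu, sigma=sd_xbar)

# P(45000 < x bar < 55000)
prob = xbar.cdf(mu + 5_000) - xbar.cdf(mu - 5_000)

print("sigma sub x bar:", round(sd_xbar, 2))
print("P(within 5000 of mu):", round(prob, 4))
```

The answer is the area between the two z-scores, not just the area below the lower one.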
What influences variability?
SAMPLE SIZE, NOT POPULATION SIZE
Point estimate
A single number that represents a plausible value for the population characteristic. It is a single point on the number line. For example, we estimate that 61% of adults favor…
Interval estimate
A range of plausible values for that characteristic. Interval on a number line.