AP Stat Ch 7-8 Flashcards

0
Q

Statistic vs parameter

A

Any quantity computed from the values in a sample is a statistic. Examples: x bar, s, and p hat. Statistics are variable: they change from sample to sample.

The corresponding population characteristic is usually called a parameter. Examples: mu, sigma, and p. Parameters are constant.

1
Q

Statistical inference

A

The process of drawing conclusions about a population based on data from a sample of that population

2
Q

Sampling variability

A

Since the value of the statistic (estimate) depends on the sample selected, the value of the statistic will vary from sample to sample. This variability is called sampling variability.
NOTE:
Sampling variability is the variability of X bar, not X
For example, if X bar = the average weight of a sample of 50 students, then X bar also has variability. This is sampling variability.

3
Q

Sampling distribution

A

The distribution of a statistic is called its sampling distribution. It is the distribution of the values the statistic takes over all possible samples of the same size from the population.
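
A minimal sketch of "all possible samples of the same size", assuming a made-up population of four values that is small enough to list every sample. The population, the sample size, and the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Hypothetical tiny population, chosen only so every sample can be listed.
population = [2, 4, 6, 8]
n = 2  # sample size

# Mean of every possible sample of size n (drawn without replacement).
sample_means = [Fraction(sum(s), n) for s in combinations(population, n)]

# Sampling distribution of x bar: each possible value and its probability.
counts = Counter(sample_means)
sampling_distribution = {
    value: Fraction(count, len(sample_means))
    for value, count in sorted(counts.items())
}

for value, prob in sampling_distribution.items():
    print(f"x bar = {float(value)}  probability = {prob}")
```

Each of the six possible samples is equally likely, so the probabilities are just counts divided by six.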

4
Q

What to do when describing distribution

A

State the shape, center, and spread!

5
Q

Bias

A

Sampling distributions allow us to describe bias more precisely by speaking of the bias of a statistic rather than bias in a sampling method. Bias concerns the center of the sampling distribution.

6
Q

Unbiased estimator

A

A statistic used to estimate a parameter is unbiased if the mean of its sampling distribution is equal to the true value of the parameter being estimated. The statistic is called an unbiased estimator of the parameter.
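
A rough check of this definition, assuming a made-up population of three values and samples of size 2 drawn with replacement so that all nine samples are equally likely. It averages each statistic over every possible sample and compares the result with the parameter. The variance comparison (divisor n - 1 versus divisor n) is an added illustration, not something from the card.

```python
from itertools import product
from statistics import mean, pvariance, variance

population = [1, 2, 3]  # hypothetical population
n = 2                   # sample size

# All equally likely samples of size n, drawn with replacement.
samples = [list(s) for s in product(population, repeat=n)]

mu = mean(population)             # population mean
sigma_sq = pvariance(population)  # population variance

# Mean of each statistic's sampling distribution.
mean_of_x_bar = mean([mean(s) for s in samples])
mean_of_s_sq = mean([variance(s) for s in samples])    # divisor n - 1
mean_of_div_n = mean([pvariance(s) for s in samples])  # divisor n

print("x bar:", mean_of_x_bar, "vs mu:", mu)
print("s^2 (n - 1):", mean_of_s_sq, "vs sigma^2:", sigma_sq)
print("variance with divisor n:", mean_of_div_n, "vs sigma^2:", sigma_sq)
```

In this setup x bar and the n - 1 version of the variance are centered on their parameters, while the divisor-n version is centered too low, so it is biased.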

7
Q

Variability of a statistic

A

The variability of a statistic is described by the spread of its sampling distribution. This spread is determined by the sampling design and the size of the sample. Larger samples give smaller spread.

8
Q

Population proportion vs sample proportion

A

The population proportion is p and the sample proportion is p hat
P hat = X/n where X is the number of successes and n is the sample size.
X ~ B(n, p), that is, X has a binomial distribution

9
Q

Mu p hat and sigma p hat

A

Mu p hat = p
Sigma p hat = SQRT (p(1-p)/n)
NOTE: the population must be at least ten times as large as the sample to use this formula for sigma p hat, and p hat is only approximately normal if np and n(1-p) are BOTH >= 10
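
The two formulas and the two conditions can be bundled into one helper. This is a sketch only: the function name `p_hat_summary` and its return layout are made up for illustration.

```python
import math


def p_hat_summary(p, n, population_size):
    """Return (mean, sd, normal_ok, ten_percent_ok) for p hat.

    mean and sd follow mu p hat = p and sigma p hat = sqrt(p(1-p)/n).
    normal_ok checks np >= 10 and n(1-p) >= 10.
    ten_percent_ok checks that the population is at least 10 times n.
    """
    mean = p
    sd = math.sqrt(p * (1 - p) / n)
    normal_ok = n * p >= 10 and n * (1 - p) >= 10
    ten_percent_ok = population_size >= 10 * n
    return mean, sd, normal_ok, ten_percent_ok


# Example values: p = .43, n = 50, and an assumed population of 600.
print(p_hat_summary(0.43, 50, 600))
```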

10
Q

Ex: suppose the proportion of HM students with a graphing calculator is p=.43. If you were to take a sample of size 50, what is the probability that in the sample, 30% or less had a graphing calculator?

A

Let p hat = the sample proportion of students with a GC
np = 21.5 and n(1-p) = 28.5 are both greater than or equal to 10, and assume the population is at least 500 (ten times the sample)
So, mu p hat = .43 and sigma p hat = SQRT (.43(.57)/50) = .07
Z = (.3-.43)/.07 = -1.86. P(p hat <= .3) = normalcdf (-100000000, -1.86, 0, 1), which is about .031
If the sample size is only ten, then np = 4.3 < 10, so the normal approximation can’t be used. Instead X ~ B(10, .43), and find P(X <= 3) from that
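
The same calculation sketched in Python instead of on a calculator. `NormalDist` stands in for normalcdf, and the small `binom_cdf` helper is a made-up name for the exact binomial sum used in the n = 10 case.

```python
import math
from statistics import NormalDist

p, n = 0.43, 50

# Normal approximation: p hat is approximately N(p, sqrt(p(1-p)/n)).
sigma_p_hat = math.sqrt(p * (1 - p) / n)
approx_prob = NormalDist(mu=p, sigma=sigma_p_hat).cdf(0.30)
print("P(p hat <= .30), n = 50, normal approximation:", approx_prob)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ B(n, p), summed term by term."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# With n = 10, p hat <= .30 means X <= 3 successes, so use the binomial directly.
exact_small_sample = binom_cdf(3, 10, p)
print("P(X <= 3), n = 10, exact binomial:", exact_small_sample)
```

The normal result differs slightly from the hand calculation because sigma p hat is not rounded to .07 here.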

11
Q

Sample mean rules:

A
  1. Mu sub x bar = mu
  2. Sigma sub x bar = sigma / SQRT (n)
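  3. The population needs to be at least ten times the sample size (N >= 10n)
  4. n needs to be at least 30 for x bar to be approximately normal (CLT), unless the population itself is normal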
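
A quick sketch of rule 2, assuming a made-up sigma of 10, to show how the spread of x bar shrinks as n grows. The function name is illustrative.

```python
import math


def sd_of_sample_mean(sigma, n):
    """Sigma sub x bar = sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


# Hypothetical population standard deviation of 10.
for n in (25, 100, 400):
    print(f"n = {n}: sigma sub x bar = {sd_of_sample_mean(10, n)}")
```

Quadrupling the sample size cuts the standard deviation of x bar in half.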
12
Q

Ex. Suppose that the population of residents of a local town has a mean annual income of 50000 and standard deviation of 35000. If a random sample of 50 residents is selected, what is the probability that their sample mean is within $5000 of the true mean?

A

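n is >= 30 and assume at least 500 people in the town (ten times the sample)
So, mu sub x bar = 50000 and sigma sub x bar = 35000/SQRT(50), which is about 4950
Z = (45000-50000) / sigma sub x bar = -1.01 and, by symmetry, Z = (55000-50000) / sigma sub x bar = 1.01
Then use normalcdf (-1.01, 1.01, 0, 1), which is about .69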
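
The same example sketched in Python, with `NormalDist` playing the role of normalcdf. It assumes the conditions above hold, so x bar is treated as approximately normal.

```python
import math
from statistics import NormalDist

mu, sigma, n = 50_000, 35_000, 50

# Sampling distribution of x bar: mean mu, standard deviation sigma / sqrt(n).
sigma_x_bar = sigma / math.sqrt(n)
x_bar_dist = NormalDist(mu=mu, sigma=sigma_x_bar)

# "Within $5000 of the true mean" means 45000 <= x bar <= 55000.
prob_within = x_bar_dist.cdf(55_000) - x_bar_dist.cdf(45_000)
print("sigma sub x bar:", sigma_x_bar)
print("P(45000 <= x bar <= 55000):", prob_within)
```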

13
Q

What influences variability?

A

SAMPLE SIZE, NOT POPULATION SIZE

14
Q

Point estimate

A

A single number that represents a plausible value for the population characteristic. It is a single point on the number line. For example, we estimate that 61% of adults favor…

15
Q

Interval estimate

A

A range of plausible values for that characteristic. Interval on a number line.

16
Q

Two things to look for when choosing a statistic to estimate a population characteristic:

A

No bias and low variability.

17
Q

Unbiased statistic

A

A statistic whose mean value is equal to the population characteristic being estimated. Its sampling distribution is centered in the right place.

18
Q

Commonly used stats:

A

X bar for mu
P hat for p
s for sigma

Mean, proportion, standard deviation

19
Q

Confidence interval

A

A point estimate is not the only plausible value, so statisticians usually report an interval of plausible values for the population characteristic based on the sample. This interval is called a confidence interval.
More likely to be correct than a point estimate!

20
Q

General form of a confidence interval:

A

Point estimate +- (multiplier) (standard error)

21
Q

Confidence interval for a parameter has two parts:

A
  1. An interval calculated from the data with form estimate +- margin of error.
  2. Confidence level, c, which gives the overall success rate of the method for calculating the confidence interval.
22
Q

What does confidence level of 95% mean?

A

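About 95% of all intervals calculated in the same way, from repeated random samples, will contain the true population parameter (for example, the true population mean). It describes the success rate of the method, not of any one interval.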
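
A small simulation of that idea, assuming a made-up normal population with known sigma and a z interval for the mean. All constants and names are hypothetical; the point is only that the fraction of intervals capturing mu lands near .95.

```python
import math
import random
from statistics import NormalDist

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population and sampling plan.
MU, SIGMA, N, REPS = 50.0, 10.0, 40, 2000

z_star = NormalDist().inv_cdf(0.975)     # multiplier for 95% confidence
margin = z_star * SIGMA / math.sqrt(N)   # same margin of error every time

hits = 0
for _ in range(REPS):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = sum(sample) / N
    # Count the interval as a success if it contains the true mean.
    if x_bar - margin <= MU <= x_bar + margin:
        hits += 1

coverage = hits / REPS
print("fraction of intervals containing mu:", coverage)
```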

23
Q

Standard error of p hat

A

SE = SQRT ((p hat)(1-p hat) / n)

24
Q

Calculating one sample z confidence interval for a population proportion

A

P hat +- z STAR * SQRT ((p hat)(1-p hat)/n)

25
Q

How to find z star

A

For confidence level C,

Z star = abs (invNorm ((1-C)/2))
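
The same lookup sketched in Python, where `NormalDist().inv_cdf` plays the role of invNorm. The function name `z_star` is made up for illustration.

```python
from statistics import NormalDist


def z_star(confidence_level):
    """Critical value z* for a confidence level C given as a decimal."""
    tail_area = (1 - confidence_level) / 2
    return abs(NormalDist().inv_cdf(tail_area))


for c in (0.90, 0.95, 0.99):
    print(f"C = {c}: z* = {z_star(c):.3f}")
```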

26
Q

One sample z confidence interval for a population proportion 4 step plan

A
  1. Name of test– 1 sample, z, 95% confidence interval for p = true proportion of…
  2. A. Random sample?
    B. Large sample size – n (p hat) >= 10 and n(1-p hat) >= 10
    C. Independence – N >= 10n
  3. Calculation:
    Interval = P hat +- Z* * SQRT (p hat(1-p hat)/n)
  4. Conclusion: I am C% confident that the true proportion of … is between … and …
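
Steps 2 and 3 can be sketched as one function. This is an illustration under assumptions: the function name, its arguments, and the sample counts in the example are made up, and the random-sample condition cannot be checked in code.

```python
import math
from statistics import NormalDist


def one_prop_z_interval(successes, n, confidence_level=0.95, population_size=None):
    """One-sample z interval for a proportion, with the condition checks."""
    failures = n - successes
    # Large sample size: n(p hat) and n(1 - p hat) are the success and failure counts.
    if successes < 10 or failures < 10:
        raise ValueError("need at least 10 successes and 10 failures")
    # Independence: N >= 10n, checked only if a population size is supplied.
    if population_size is not None and population_size < 10 * n:
        raise ValueError("population should be at least 10 times the sample")

    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical data: 305 successes in a random sample of 500.
low, high = one_prop_z_interval(305, 500, 0.95, population_size=100_000)
print(f"95% interval for p: {low:.3f} to {high:.3f}")
```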
27
Q

Margin of error

A

The plus minus part in the confidence interval

ME = z* * SQRT (p hat(1-p hat)/n)

28
Q

How to calculate sample size needed when given a desired margin of error

A

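Assume p hat = .5 when there is no prior estimate. This is the most conservative choice because it makes p hat(1-p hat) as large as possible.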

Then plug into the equation ME = z star * SQRT (p hat(1-p hat)/n) and solve for n. Round up!
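
Solving that equation for n gives n >= (z star / ME)^2 * p hat(1-p hat). A minimal sketch, with a made-up function name and a 3% margin of error as the example:

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(margin_of_error, confidence_level=0.95, p_guess=0.5):
    """Smallest n with z* * sqrt(p(1-p)/n) <= margin_of_error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    n = (z / margin_of_error) ** 2 * p_guess * (1 - p_guess)
    return math.ceil(n)  # always round up


print(sample_size_for_proportion(0.03))
```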

29
Q

Benefit and tradeoff of a bigger margin of error

A

More likely that p is in the interval

But, much less precision in a wider interval

30
Q

Benefit and tradeoff of decreasing ME

A

Increase sample size –> more precision

But a larger sample takes longer to collect and is more expensive

31
Q

When to use a z interval for a population mean

A
Only when sigma, the population standard deviation, is known!
Need to meet the conditions:
1. SRS
2. Normality 
3. Independence (N>=10n)
32
Q

If we don’t know sigma for an interval with the population mean, what do you do?

A

Use s and do a t interval instead of sigma and a z interval

Standard error = s/SQRT (n)

33
Q

Sample size for a desired margin of error with means

A

Start from ME >= z* (sigma)/SQRT (n) and solve for n: n >= (z* (sigma)/ME)^2. Round up!
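
The same calculation as a short sketch. The function name and the example values (sigma = 15, ME = 2) are made up.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, margin_of_error, confidence_level=0.95):
    """Smallest n with z* * sigma / sqrt(n) <= margin_of_error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return math.ceil((z * sigma / margin_of_error) ** 2)  # always round up


print(sample_size_for_mean(sigma=15, margin_of_error=2))
```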

34
Q

T distribution

A

Use for a mean when the population standard deviation is unknown.
Not a normal curve: the tails have more area because there is more variability, since the t statistic depends on two statistics, the sample mean and the sample standard deviation.

35
Q

Degrees of freedom

A

n -1 where n is the sample size

36
Q

What happens as the degrees of freedom increase?

A

More approximately normal

The larger the sample, the better. As n increases, s –> sigma and the t distribution –> z distribution (normal curve)

37
Q

4 step procedure for a t interval:

A
  1. C% t interval for mu = the true mean of …
  2. A. Random sample?
    B. Sample size is large (n>=30) or the population is normal (graph the data, check and discuss the shape. If there are outliers or extreme skewness, then proceed with caution)
    C. N >= 10n
  3. Calculate the interval–> x bar +- t* (s / SQRT (n))
  4. Interpret– C% confident true mean of … is from ___ to ___
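
Step 3 sketched in Python. The standard library has no inverse t function, so this sketch assumes t* is read from a t table and passed in. The data and the function name are made up, and 2.262 is the table value for 95% confidence with 9 degrees of freedom.

```python
import math
from statistics import mean, stdev


def t_interval(data, t_star):
    """x bar +- t* (s / sqrt(n)); t* comes from a t table with df = n - 1."""
    n = len(data)
    x_bar = mean(data)
    standard_error = stdev(data) / math.sqrt(n)  # stdev uses the n - 1 divisor
    margin = t_star * standard_error
    return x_bar - margin, x_bar + margin


# Hypothetical sample of n = 10 measurements, so df = 9.
data = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7, 12.5, 12.1]
low, high = t_interval(data, t_star=2.262)
print(f"95% t interval for mu: {low:.3f} to {high:.3f}")
```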
38
Q

Paired t procedures

A

Same as the ordinary one-sample t procedures, but applied to the differences within each pair. The mean being estimated is one of:
  1. The mean difference in the responses to the two treatments within matched pairs of subjects in the entire population
  2. The mean difference in response to the two treatments for individuals in the population (when the same subject receives both treatments)
  3. The mean difference between before and after measurements for all individuals in the population
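
A sketch of the paired idea with made-up before and after measurements for 8 subjects. The differences are treated as a single sample, and t* = 2.365 is the table value for 95% confidence with 7 degrees of freedom.

```python
import math
from statistics import mean, stdev

# Hypothetical before/after measurements on the same 8 subjects.
before = [140, 152, 138, 160, 147, 155, 149, 143]
after = [135, 150, 136, 151, 145, 150, 147, 140]

# Reduce each pair to one number: the difference within the pair.
differences = [b - a for b, a in zip(before, after)]

n = len(differences)
mean_diff = mean(differences)
standard_error = stdev(differences) / math.sqrt(n)

t_star = 2.365  # from a t table: 95% confidence, df = n - 1 = 7
low = mean_diff - t_star * standard_error
high = mean_diff + t_star * standard_error
print(f"mean difference = {mean_diff}, 95% interval: {low:.2f} to {high:.2f}")
```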

39
Q

Sample size for t procedure

A

If n < 15, the data need to be close to normal. If there are outliers or the data are clearly not normal, don’t use t procedures
If n >= 15, t procedures can be used except when there are outliers or strong skewness
If n >= 30, t procedures can be used even when the distribution is clearly skewed