Stat - Exam #2 Flashcards Preview

Flashcards in Stat - Exam #2 Deck (90)
1

What is the Sampling Distribution of the Statistics?

-A statistic calculated from a sample varies from sample to sample;
-It varies because each statistic is a random variable that follows some probability density CURVE with a LOCATION and SPREAD;

2

What is Sampling Distribution?

A probability density curve of all possible values of a statistic computed for a sample of size (n);
-Focus on the Population Means

3

What is the Law of Large Numbers?

As the sample size gets LARGER, the difference between the sample average and the population mean gets SMALLER
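
A minimal simulation sketch of this idea, assuming a Uniform(0, 1) population (mean 0.5) chosen only for illustration. The helper name gap_from_mean is hypothetical. The gap printed for each n should generally shrink as n grows, although any single run can wobble.

```python
import random

def gap_from_mean(population_mean, draws):
    """Return |sample average - population mean| for the given draws."""
    return abs(sum(draws) / len(draws) - population_mean)

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (10, 1_000, 100_000):
    # Assumed example population: Uniform(0, 1), whose mean is 0.5
    draws = [rng.random() for _ in range(n)]
    print(n, gap_from_mean(0.5, draws))
```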

4

What is affected by SAMPLE SIZE with normally distributed data?

-With normally distributed data, the MEAN of the sample average is NOT affected by sample size, but the standard deviation of the sample average IS affected by sample size

5

What is Sampling Distribution of Sample Average?

If data are distributed normally with mean (u) and standard deviation (sigma), then the average of a sample of size (n) will be distributed normally with mean (u) and standard deviation [sigma/(sq. rt of n)]

6

IF/THEN of Sampling Distribution of Sample Average

IF: x has shape NOR with location (u) and spread (sigma)
THEN: x-bar has shape NOR with location (u) and spread {sigma/(sq. rt of n)}

7

What is the Standard Error of the Mean?

The standard deviation of the sample average;
— sigma_x-bar = sigma/(sq. rt of n)
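
As a quick sketch of the formula above, the function below computes sigma/(sq. rt of n). The function name and the example value sigma = 10 are assumptions made for illustration.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sample average: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size n must be positive")
    return sigma / math.sqrt(n)

# Hypothetical population with sigma = 10: the spread shrinks as n grows
for n in (4, 25, 100):
    print(n, standard_error(10, n))
```

Quadrupling the sample size cuts the standard error in half, because n sits under a square root.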

8

How do you find the shape of a sample for data NOT normally distributed?

-The sample average and the sample standard deviation can still be calculated; by the Central Limit Theorem, the shape of the sample average is approximately NORMAL when the sample is large enough, so the z-curve and the z-table can be used

9

What is the Central Limit Theorem?

- When there are at least 30 data points (from a population of any shape, with mean u and standard deviation sigma) the SAMPLE AVERAGE will...
1. follow the NORMAL shape
2. have mean (u) — same as population;
3. and have standard deviation {sigma/(sq. rt. of n)}
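
A rough simulation sketch of points 2 and 3, assuming an Exponential(1) population (clearly not normal, with u = 1 and sigma = 1) and n = 30. The helper sample_averages is hypothetical. The code checks only location and spread; a histogram of the averages would be needed to see the normal shape.

```python
import random
import statistics

def sample_averages(n, reps, rng):
    """Draw `reps` samples of size n from Exponential(1) and return their averages."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]

rng = random.Random(42)  # fixed seed so the run is repeatable
x_bars = sample_averages(n=30, reps=5000, rng=rng)
print("mean of x-bar:", statistics.fmean(x_bars))  # expected to be near u = 1
print("sd of x-bar:  ", statistics.stdev(x_bars))  # expected to be near 1/sqrt(30)
```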

10

LESS than 30 data points

-Unknown shape;
-Mean (u);
-Standard deviation {sigma/(sq. rt. of n)}

11

Greater than or Equal to 30 data points

-NORMAL shape;
-Mean (u);
-Standard deviation {sigma/(sq. rt. of n)}

12

Difference in the Sampling Distribution and Central Limit Theorem

Sampling distribution of the mean deals with the location and spread of the sample average;
-The central limit theorem deals only with the SHAPE of the sample average

13

What if the population standard deviation is UNKNOWN?

-Use the sample standard deviation to calculate confidence interval estimates of a population parameter;
-The sample standard deviation can be calculated ANYTIME there is a sample

14

What is used to get degrees of freedom and critical values of the sample test?

The t-table

15

What do you calculate when the population standard deviation is UNKNOWN?

-Replace the population standard deviation with the SAMPLE standard deviation = t-transformation that yields a t-statistic

16

What is a t-transformation?

Converts a sample average into a t-statistic

t = (x-bar - u) / [s/(sq. rt of n)]
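
A small sketch of the t-transformation, with made-up numbers (x-bar = 52, u = 50, s = 4, n = 16) used only as an example. The function name is an assumption.

```python
import math

def t_statistic(x_bar, mu, s, n):
    """t = (x-bar - mu) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    if n < 2:
        raise ValueError("need at least 2 observations to compute s")
    return (x_bar - mu) / (s / math.sqrt(n))

# Hypothetical numbers: x-bar = 52, mu = 50, s = 4, n = 16
print(t_statistic(52, 50, 4, 16))  # 2.0
```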

17

What is t-distribution?

If a simple random sample of size (n) is taken from a population that follows the normal distribution, then the t-statistic follows the t-distribution with (n-1) degrees of freedom

18

What is a t-statistic?

t_(alpha/2), (n-1), where:

- alpha/2 = gives the area in one tail = Column of t-table;
- n-1 = gives the degrees of freedom = Row of t-table

19

How do you use the t-table?

-Need to know the area under the tail and the degrees of freedom;
-If the exact degrees of freedom are not in the table, follow the practice of always going down to the next lower degrees of freedom in the table;
-NOTE: the LAST row of the t-table is the same as the z-table

20

What is a reasonable value for the population mean?

A CONFIDENCE INTERVAL gives a set of values that are reasonable choices for the population mean based on the information in the SAMPLE data

21

Where does the level of confidence come from?

The NORMAL probability curve

22

What is Inferential Stats?

Use the information from a sample to make conclusions about the population

23

What is an Interval Estimate?

-Value of sample stat is very seldom the exact population parameter, but pretty close;
-Calculate a sample stat and an INTERVAL indicating how close the stat is to the population parameter;
**Central to Inferential Stats

24

What are the major methods of Inferential Stats?

1. Confidence Interval Estimation = give an estimate of the value of the UNKNOWN population parameter;

2. Hypothesis Testing = a claim is made about a population, then sample data are collected and used to test this claim

25

Which standard deviation is ALWAYS known?

-Sample!;
- Population is not usually known in everyday practice

26

What is required when the population standard deviation is UNKNOWN?

Requires the use of a t-value
(Z-scores are only applicable when pop. standard deviation is already known)

27

What is the Point Estimate of a population parameter?

The value of the sample statistic used to estimate the population parameter

28

What is the Point Estimate of the population MEAN?

The value of the sample average;
-BEST estimator of the population mean

Sample Average = POINT ESTIMATOR
Actual value of Sample Average = POINT ESTIMATE

29

What is the Point Estimate of the population STANDARD DEVIATION?

The value of the sample standard deviation

30

What values are estimated?

ONLY the values of the population parameters, NEVER the values of the sample statistics

31

What are the Estimators for population parameters?

Location = Mean, Median, Mode

Spread = Standard Deviation, Range, IQR

32

What are the properties of a good estimator?

1. Unbiased = expected value of estimator equals value of parameter;

2. Consistent = larger sample makes estimator more accurate;

3. Efficient = estimator has the smallest standard deviation

33

What is a Confidence Interval Estimate?

A range of values in an interval on the real number line;
-Expresses the natural uncertainty in the estimate

34

What is a Confidence Level?

The proportion of confidence intervals calculated from a large number of random samples that contain the value of the population parameter

Denoted = CL;
Common Values = 90%, 95%, 99%;
Decided by the researcher
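
A simulation sketch of this definition, assuming a normal population with u = 50, sigma = 10 (known), and samples of size 25; all of these values and the function name coverage are hypothetical. It builds many 95% z-intervals and reports the fraction that contain the true mean, which should come out close to 0.95.

```python
import math
import random
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, cl_percent, reps, rng):
    """Fraction of z-intervals (sigma known) that contain the true mean mu."""
    alpha = 1 - cl_percent / 100
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_(alpha/2)
    margin = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        x_bar = fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / reps

print(coverage(mu=50, sigma=10, n=25, cl_percent=95, reps=4000, rng=random.Random(3)))
```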

35

What is the Significance Level?

The area outside of the region of confidence;

Denoted = alpha
Calculated = {1 - (CL/100)}

36

What is a Critical Value?

The pair of values that bound the region of confidence

Denoted = (+/- z_alpha/2), (+/- t_alpha/2, n-1)

37

What is the Confidence Interval Estimate of a Population Parameter of a Population when Standard Deviation is KNOWN?

-An estimate of the value of a population parameter consisting of
1. an interval of numbers, and
2. a level of confidence that the interval contains the value of the population parameter

Denoted = CI% = (LCL,UCL)
LCL — Lower Confidence Limit; UCL — Upper Confidence Limit

38

What determines the WIDTH of the confidence interval?

Comes from the confidence level CHOSEN and the spread of the sample average

39

What is given by the confidence level?

Gives the area in the RIGHT tail (alpha/2), which gives the critical values bounding the region of confidence (+/-z_alpha/2)

40

Where does the CENTER of the confidence interval come from?

The CENTER of the confidence interval comes from the value of the SAMPLE AVERAGE (x-bar)

41

What are the steps of a Confidence Interval?

1. First find the critical values — defines the width of the interval (which is centered around the population mean, which is unknown)
2. Need to slide the interval over until it is centered on the sample average, which is known;
3. Convert the critical values into x-values;
4. Two x-values in the proper form make the confidence interval

42

What is the Margin of Error?

The RIGHT term in a confidence interval estimate;
-Determines the width of the confidence interval;
-Anything that changes the margin of error changes the width of the confidence interval

Margin = {z_(alpha/2)} x {sigma/(sq. rt of n)}

43

What happens with a REDUCED margin of error?

NARROWS confidence interval

44

What happens with an INCREASED margin of error?

WIDENED confidence interval

45

What are the 3 ways to change the Margin of Error?

1. Sample size: Increase = Narrow; Decrease = Widen;
2. Confidence level: Increase = Widen; Decrease = Narrow;
3. Standard deviation of the population: IMPOSSIBLE to change
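
The first two effects can be checked numerically. This is only a sketch: the function name, sigma = 12, and the sample sizes are assumed values, and the critical z-score comes from Python's standard normal distribution rather than a printed z-table.

```python
import math
from statistics import NormalDist

def margin_of_error(cl_percent, sigma, n):
    """z_(alpha/2) * sigma / sqrt(n) for a confidence level given in percent."""
    alpha = 1 - cl_percent / 100
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * sigma / math.sqrt(n)

base = margin_of_error(95, sigma=12, n=36)
print("baseline (95%, n=36):", base)
print("larger n (n=144)    :", margin_of_error(95, sigma=12, n=144))  # narrower
print("higher CL (99%)     :", margin_of_error(99, sigma=12, n=36))   # wider
```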

46

What is the method for the Confidence Interval Estimate (z) of the Population Mean?

(Pop. SD is KNOWN)

1. Statistics = Sample Average (x-bar) & Population Standard Deviation (sigma);

2. Critical Value = Confidence Level (CL) & Critical Z-score (z_alpha/2);

3. Compute = CI% = {x-bar +/- (z_alpha/2) (sigma/sqrt. n)}

4. State = CI% = (LCL, UCL)
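
The four steps can be sketched as a short function. The name z_confidence_interval and the example numbers (x-bar = 100, sigma = 15, n = 36) are assumptions; the critical z-score is computed with the standard library instead of being read from a z-table.

```python
import math
from statistics import NormalDist

def z_confidence_interval(x_bar, sigma, n, cl_percent):
    """Return (LCL, UCL) for the population mean when sigma is known."""
    alpha = 1 - cl_percent / 100                  # significance level
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # z_(alpha/2)
    margin = z_crit * sigma / math.sqrt(n)        # margin of error
    return x_bar - margin, x_bar + margin

lcl, ucl = z_confidence_interval(x_bar=100, sigma=15, n=36, cl_percent=95)
print(f"95% CI = ({lcl:.2f}, {ucl:.2f})")  # 95% CI = (95.10, 104.90)
```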

47

How is calculating confidence intervals estimates different when the population standard deviation is NOT known?

-Use the sample standard deviation (which can always be calculated) to calculate confidence interval estimates of a population parameter;
-The major difference is that the DEGREES of FREEDOM must be known and the t-table must be used to get critical values

48

What is the method for the Confidence Interval Estimate (t) of the Population Mean?

(Pop. SD is NOT known)

1. Statistics = Sample Average (x-bar) & SAMPLE Standard Deviation (s);

2. Critical Value = Confidence Level (CL), Degrees of Freedom (n-1) & Critical t-value {t_(alpha/2), (n-1)};

3. Compute = CI% = {x-bar +/- [t_(alpha/2), (n-1)] x [s/(sqrt. n)]}

4. State = CI% = (LCL, UCL)
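
A sketch of the same steps in code. Python's standard library has no t-distribution, so this version assumes the caller reads the critical t-value from a t-table and passes it in as t_crit. The sample data and the function name are hypothetical.

```python
import math
import statistics

def t_confidence_interval(data, t_crit):
    """Return (LCL, UCL) using the sample SD and a t critical value.

    t_crit is t_(alpha/2), (n-1), read from a t-table by the caller.
    """
    n = len(data)
    x_bar = statistics.fmean(data)
    s = statistics.stdev(data)  # sample SD (divides by n - 1)
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical sample of n = 5, so df = 4; the t-table gives t_(0.025), 4 = 2.776
sample = [48, 50, 52, 49, 51]
print(t_confidence_interval(sample, t_crit=2.776))
```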

49

What is a Hypothesis Test?

1. First write 2 statements about a population parameter;
— Status quo value of the population parameter is given in the first statement;
— Claim made by the researcher is given in the second statement;
2. Sample data are collected and analyzed;
3. Finally conclude which statement is closer to the truth

50

How is a hypothesis test different from a confidence interval calculation?

With a hypothesis test, you first must make some claim about a population parameter;
-Then collect data and determine if the claim is reasonable or not

51

What are the 3 steps of a hypothesis test?

1. Hypothesize = a set of hypotheses is written giving the status quo and the researcher's claim;

2. Analyze = sample data are collected and analyzed;

3. Conclude = a conclusion as to which statement is closer to the truth is made

52

What is a hypothesis?

A statement about a population parameter;
EX: u=50 (pop mean)
- Can be for one or more populations
- Must be about the value of a population parameter (never a sample stat)
- Made BEFORE data are collected, and the sample must be appropriate to the statement;
**Two Types = Null and Alternative Hypothesis

53

What is a Null Hypothesis?

A statement that the population parameter has the status quo value;
- Denoted = “H-naught” (H0);
- EX = H0 : u = 72;
- Assumed TRUE in the hypothesis test until the sample evidence proves OTHERWISE;
-ALWAYS contains an EQUAL SIGN

54

What is an Alternative Hypothesis?

-A statement that a population parameter does NOT have the status quo value;
**Gives the researcher's CLAIM;

- Denoted = “H-one” (H1);
- EX = H1 : u /= 72, H1 : u < 72, or H1 : u > 72;
- Assumed FALSE in the hypothesis test until the sample evidence proves otherwise;
-NEVER contains an EQUAL SIGN

55

What is the purpose of analysis of a hypothesis test?

-Decide which hypothesis is closer to the truth, the null hypothesis or the alternative hypothesis;
-3 scenarios for a hypothesis test =
1. z-Test of the Mean
2. t-Test of the Mean
3. z-Test of Proportion

56

What are the 3 methods in each scenario to conduct a hypothesis test?

1. Critical Value Method = Traditional
2. P-value Method = Modern
3. Confidence Interval Method = Two-Sided

57

What is the conclusion to a hypothesis?

-Must choose only one of the two conclusions to end a hypothesis test;
1. REJECT the null hypothesis, or
2. NOT REJECT the null hypothesis
*Never enough information to prove a hypothesis is true, so a hypothesis can be shown to be false or not false — never shown true

58

What is Hypothesis Testing?

A procedure that uses our knowledge of probability with evidence from a sample to test a claim about a characteristic of a population;
— Claim is about the value of a population parameter;
— Can be for one or more populations

59

What are the Assumptions of Hypothesis Testing?

1. Simple random sample;
2. Sample average is normally distributed

60

What is the logic behind hypothesis testing?

1. Assume status quo value (null) is TRUE (this is NOT confidence intervals);

2. Examine sample data;
— Sample evidence CLOSE to the status quo SUPPORTS that value (null)
— Sample evidence FAR from the status quo REFUTES that value and supports the alternative

3. Make one of two conclusions
— Status quo is reasonable given the sample
— Status quo is not reasonable given the sample

61

How do you define “close” and “far” from the status quo?

-Using the properties of the normal curve;
-Determine the critical z-score (remember that z-score of a point is just how many standard deviations away from the mean)
— Any value CLOSER to the mean than the z-score is CLOSE;
— Any value farther away from the mean than the z-score is FAR
*Z-scores are then converted to x-values that support or refute the null hypothesis

62

95% Level of Confidence

-Any value inside Z-scores of +/-1.96 is CLOSE to the mean and supports the null;
-Any value outside Z-scores +/-1.96 is FAR from the mean and refutes the null hypothesis (reject the null)

63

What are the 3 situation scenarios in hypothesis testing?

1. Two-tail situation;
2. One-tail situation to the left, and
3. One-tail situation to the right

64

Two-Tail Situations

The null hypothesis can be rejected by sample evidence that is too big OR too small (rejection in both tail regions)
-Two-tail hypothesis: H0: u = 72; H1: u /= 72

65

One-Tail Situations

The null hypothesis can be rejected by sample evidence that is ONLY too big or too small (rejection in ONLY ONE tail region)
-Right-Tail hypothesis: H0: u = 72; H1: 72 < u
-Left-Tail hypothesis: H0: u = 72; H1: u < 72

66

What is a Type 1 Error?

The null hypothesis is TRUE; but we REJECT the null hypothesis in the test;
-Probability denoted “alpha”

67

What is a Type II Error?

The null hypothesis is FALSE, but we do NOT REJECT the null in the hypothesis test;
- Probability denoted “beta”

68

Which type of Error is stats most concerned with?

-Type I;
-Choose the probability of making a Type 1 Error early in the hypothesis test;
-Usually let a Type II error float to whatever it becomes;
**Probability of a Type I Error = Level of Significance

69

What is Level of Significance?

The probability of making a Type I Error;
-Denoted “alpha”
-Rejection region in the hypothesis testing

70

How do you choose the level of significance?

-If the consequences of making a Type I error are severe, choose the level of significance to be SMALL (alpha = 0.01);
-If the consequences are NOT severe, the level of significance should be larger (alpha = 0.05 or 0.10);
*Inverse relationship of Type I and Type II errors;
*Raising the probability of a Type I error (raising alpha) reduces the probability of a Type II error
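
The inverse relationship can be illustrated numerically. This sketch goes slightly beyond the cards: it assumes a right-tail z-test of H0: u = 72 with known sigma, and it assumes a specific true mean (75) so that beta can be computed. All numbers and the function name are hypothetical.

```python
import math
from statistics import NormalDist

def type_ii_probability(alpha, mu0, mu_true, sigma, n):
    """beta for a right-tail z-test of H0: u = mu0 when the true mean is mu_true."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)         # right-tail critical value
    shift = (mu_true - mu0) / (sigma / math.sqrt(n))  # true mean in standard-error units
    # We fail to reject when z_0 <= z_alpha; under the true mean, z_0 ~ Normal(shift, 1)
    return NormalDist().cdf(z_alpha - shift)

# As alpha rises from 0.01 to 0.10, the printed beta falls
for alpha in (0.01, 0.05, 0.10):
    print(alpha, round(type_ii_probability(alpha, mu0=72, mu_true=75, sigma=9, n=36), 3))
```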

71

How do Significance Level and Confidence Level relate?

-Like two sides of a coin;
-A 5% significance level means a 95% confidence level of giving the correct conclusion

72

What is Level of Confidence?

-The probability of NOT making a Type I Error;
- Calculation: 1-alpha

73

What are the 3 methods of conducting a hypothesis test about a population mean using z-scores (population standard deviation KNOWN)?

1. Critical Value Method = Traditional;
2. P-Value Method = Modern;
3. Confidence Interval Method = Two-Sided

74

What is a Critical Value?

*Critical Value Method (z) =
A z-score which is critical to separate the REJECTION REGION from the ACCEPTANCE REGION;

-Denoted: +/- z_alpha/2, -z_alpha, +z_alpha

75

What is the Rejection Region of the Critical Value?

The set of all z-scores that are FAR from the mean, such that a NULL hypothesis is REJECTED;

- Denoted: alpha;
-Sometimes called the Critical Region

76

What is a Test Statistic?

A z-score, calculated from sample data, which is used to test if the NULL hypothesis is closer to the truth;
-Denoted: z_0;

-Calculation: z_0 = {(sample mean - pop. mean)/(pop. SD/ sqrt of sample (n))}

*SAME calculation for Left, Right, and Two-Tail Critical Values
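
A minimal sketch of the calculation, using made-up values (H0: u = 100, sigma = 15, n = 36, x-bar = 105). The function name is an assumption.

```python
import math

def z_test_statistic(x_bar, mu0, sigma, n):
    """z_0 = (x-bar - mu0) / (sigma / sqrt(n)); sigma is the known population SD."""
    return (x_bar - mu0) / (sigma / math.sqrt(n))

# Hypothetical test of H0: u = 100 with sigma = 15, n = 36, x-bar = 105
print(z_test_statistic(105, 100, 15, 36))  # 2.0
```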

77

Left-Tail Critical Value Method

1. Hypothesis:
H0: u = u0;
H1: u < u0;

2. Critical Value = -z_alpha;

3. Calculation

4. Reject: z_0 < -z_alpha;

5. Conclusion: Do, or do not, reject null

78

Two-Tail Critical Value Method

1. Hypothesis:
H0: u = u0;
H1: u NOT equal u0;

2. Critical Value = +/-z_alpha/2;

3. Calculation

4. Reject: z_0 < -z_alpha/2 OR z_alpha/2 < z_0;

5. Conclusion: Do, or do not, reject null

79

Right Tail Critical Value

1. Hypothesis:
H0: u = u0;
H1: u > u0;

2. Critical Value = +z_alpha;

3. Calculation

4. Reject: z_alpha < z_0;

5. Conclusion: Do, or do not, reject null
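
The rejection rules for the left-tail, two-tail, and right-tail cases can be sketched together in one function. The name reject_null and the tail labels are assumptions, and the critical values come from the standard library rather than a z-table.

```python
from statistics import NormalDist

def reject_null(z0, alpha, tail):
    """Critical Value Method for a z-test; tail is 'left', 'right', or 'two'."""
    std_normal = NormalDist()
    if tail == "left":
        return z0 < -std_normal.inv_cdf(1 - alpha)  # z_0 < -z_alpha
    if tail == "right":
        return z0 > std_normal.inv_cdf(1 - alpha)   # z_alpha < z_0
    if tail == "two":
        z_half = std_normal.inv_cdf(1 - alpha / 2)  # z_(alpha/2)
        return z0 < -z_half or z0 > z_half
    raise ValueError("tail must be 'left', 'right', or 'two'")

# Same test statistic, three different alternatives
for tail in ("left", "two", "right"):
    print(tail, reject_null(z0=1.8, alpha=0.05, tail=tail))
```

With z_0 = 1.8 and alpha = 0.05, only the right-tail test rejects, since 1.8 exceeds 1.645 but not 1.96.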

80

When can you use the Critical Value method?

-Method is ROBUST for small deviations from normality (use normal probability plot),

- But NOT robust for data with outliers = use boxplot

81

What is a P-Value?

The probability, if the experiment were repeated under the assumption that the null hypothesis is true, of getting a test statistic as extreme as, or more extreme than, the value observed;

-This is the area under the curve from the TEST STAT out to the end of the tail (toward INFINITY);
-In one tail if it is a one-tail situation and in both tails if it is a two-tail situation

82

What is the P-Value method?

1. Test Stat for Left-Tail, Two-Tail, and Right-Tail = z_0;

2. Extreme Values:
-Left Tail = z < z_0;
-Two Tail = (z < -|z_0|) + (+|z_0| < z);
-Right Tail = z_0 < z

3. Calculate = (Area from Z-table) X (Number of Tails)
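
A sketch of the area calculation, using the standard normal curve from Python's standard library in place of the z-table. The function name and tail labels are assumptions.

```python
from statistics import NormalDist

def p_value(z0, tail):
    """P-value for a z-test; tail is 'left', 'right', or 'two'."""
    std_normal = NormalDist()
    if tail == "left":
        return std_normal.cdf(z0)                 # area where z < z_0
    if tail == "right":
        return 1 - std_normal.cdf(z0)             # area where z_0 < z
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z0)))  # one-tail area times 2 tails
    raise ValueError("tail must be 'left', 'right', or 'two'")

print(round(p_value(1.96, "two"), 3))     # 0.05
print(round(p_value(-1.645, "left"), 3))  # 0.05
```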

83

What are the advantages of using the P-Value OVER the Critical Value Method?

1. The decision in the P-value method is made the SAME way every time (compare the P-value to alpha), so there is no need to look up a different critical value every time
2. P-value gives info about the STRENGTH of evidence; P-value close to the level of significance means that the evidence for making a conclusion is WEAK; P-value far from the level of significance means that the evidence for making a conclusion is STRONG

84

How do you decide on the null hypothesis using a P-Value?

P-Value GREATER than alpha = DO NOT Reject the Null;

P-Value LESS than alpha = REJECT the Null
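
The rule can be written as a tiny function. The card does not say what to do when the P-value exactly equals alpha; this sketch assumes "do not reject" in that case, which is a choice made here and not something stated in the deck.

```python
def decide(p_value, alpha):
    """Return the conclusion of the test under the P-Value Method."""
    if p_value < alpha:
        return "reject the null"
    # Assumption: a tie (p_value == alpha) is treated as "do not reject"
    return "do not reject the null"

print(decide(0.003, 0.05))  # reject the null
print(decide(0.20, 0.05))   # do not reject the null
```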

85

When are Confidence Intervals used?

ONLY when you have a two-tail hypothesis test;
-Because rejection region in a confidence interval is ALWAYS in both tails

86

What must be used to test claims about a population mean when the population standard deviation is NOT known?

-The sample standard deviation is always known (or can be calculated); use it together with the *t-table* (NOT the z-table)

87

What is a Critical Value?

-a t-value which is critical to separate the rejection region from the acceptance region;

-Denoted: (+/- t_alpha/2, n-1), (-t_alpha, n-1), (+t_alpha, n-1)

88

What is a Test Statistic for hypothesis tests when the population standard deviation is NOT known (using t-test)?

-a t-value, calculated from SAMPLE data, which is used to test if the NULL hypothesis is closer to the truth;
-Denoted: t_0;

-Calculation: t_0 = (sample mean - pop. mean)/(s/ sqrt. n)

(s = sample standard deviation; n = sample size)

89

What is the J-Method to Find Area (t)?

*Method ONLY for the t-table;
1. Start in the left margin at the degrees-of-freedom;
2. Go across the row until you find the number closest to the value of the test stat;
3. Read area in one tail at the top of the column
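
The three steps can be sketched with a single row of a t-table. The row below holds the usual tabled values for 15 degrees of freedom; the function name is hypothetical, and using the absolute value of the test stat (so left-tail statistics work too) is an assumption added here.

```python
# One row of a t-table (df = 15): area in one tail -> t-value
T_ROW_DF_15 = {0.10: 1.341, 0.05: 1.753, 0.025: 2.131, 0.01: 2.602, 0.005: 2.947}

def j_method_area(t0, table_row):
    """Return the one-tail area whose tabled t-value is closest to |t0|."""
    area, _ = min(table_row.items(), key=lambda item: abs(item[1] - abs(t0)))
    return area

# Test stat 2.0 is closest to 2.131, so the one-tail area read off is 0.025
print(j_method_area(2.0, T_ROW_DF_15))  # 0.025
```

For a two-tail test the area read from the table would then be doubled, following the P-Value method.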

90

How do you decide the null using the t-table once the P-value is determined?

-Once the P-value is obtained, the decision is made the SAME way as when the population standard deviation is KNOWN