Flashcards in Stat - Exam #3 Deck (68)
How do you make inferences on two DEPENDENT samples (paired samples)?
-Need to convert the two population situation to a one population situation;
-Take the difference between the values of the two individuals in a pair and treat the mean of the differences (d-bar) as the stat of choice
What is the null or status quo of dependent samples?
u_d = 0 (the population mean of the differences is zero; d-bar is the sample estimate of u_d)
What are dependent samples?
-When the individuals selected for one sample influence which individuals are in the second sample;
-Also called "matched-pair samples"
What are independent samples?
When the individuals selected for one sample do NOT influence which individuals are in the second group
What is the best stat for testing a paired sample?
-the DIFFERENCE (d_i);
-Subtract the value of one individual of the matched-pair from the value of the other individual in the matched-pair;
-The mean (d-bar) and standard deviation (s_d) are calculated normally
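A minimal sketch of the steps above, using only the Python standard library. The before/after values and the function name paired_summary are hypothetical, chosen only for illustration.

```python
import math
import statistics

# Hypothetical before/after measurements for five matched pairs.
before = [12.0, 15.0, 11.0, 14.0, 13.0]
after = [10.0, 14.0, 11.0, 11.0, 12.0]

def paired_summary(x, y):
    """Return (d_bar, s_d, t0) for paired samples x and y."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    d = [a - b for a, b in zip(x, y)]   # d_i: one difference per matched pair
    n = len(d)
    d_bar = statistics.mean(d)          # sample mean of the differences
    s_d = statistics.stdev(d)           # sample SD of the differences (n - 1 denominator)
    t0 = d_bar / (s_d / math.sqrt(n))   # t-stat under H_0: u_d = 0, with n - 1 df
    return d_bar, s_d, t0

d_bar, s_d, t0 = paired_summary(before, after)
print(f"d-bar = {d_bar:.3f}, s_d = {s_d:.3f}, t0 = {t0:.3f}")
```

Once the differences are formed, the rest is an ordinary one-sample t-test on the d_i values.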
What are the Assumptions for DEPENDENT Samples?
1. Random sample of matched pairs;
2. Sample average of the difference data is normally distributed;
3. Population standard deviations are NOT known (same for one-sample t-test)
What is the null hypothesis in a paired-sample hypothesis test?
The mean difference between the paired samples is zero (H_0: u_d=0)
What is the confidence interval for a paired-sample?
Statistic +/- (Critical Value x Standard Error);
-The point estimate is the sample average of the difference data (d-bar), the critical value is a t-value with (n-1) degrees of freedom, and the standard error of the mean has the usual form (s_d/sqrt(n))
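As a rough illustration, the interval can be computed as point estimate +/- (t critical value x standard error). The data and the function name paired_confidence_interval are hypothetical. The critical value is passed in by hand because the standard library has no t-distribution; 2.776 is the two-sided 95% table value for 4 degrees of freedom.

```python
import math
import statistics

def paired_confidence_interval(differences, t_critical):
    """Return (lower, upper) for the population mean difference u_d."""
    n = len(differences)
    d_bar = statistics.mean(differences)                            # point estimate
    standard_error = statistics.stdev(differences) / math.sqrt(n)   # s_d / sqrt(n)
    margin = t_critical * standard_error
    return d_bar - margin, d_bar + margin

# Hypothetical differences for five matched pairs, so df = n - 1 = 4.
differences = [2.0, 1.0, 0.0, 3.0, 1.0]
# Assumed input: two-sided 95% t critical value for 4 df, read from a t-table.
lower, upper = paired_confidence_interval(differences, t_critical=2.776)
print(f"95% CI for u_d: ({lower:.3f}, {upper:.3f})")
```

If the interval contains 0, the data are consistent with H_0: u_d = 0 at the matching significance level.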
When can you make inferences about two population means?
-When the individuals in the two samples are UNRELATED to each other
How do you make inferences on two INDEPENDENT samples?
-Calculate the mean of each sample and treat the DIFFERENCE in the sample means as the stat of choice;
-The status quo in this case is (u1-u2=0), meaning the means of the two populations are the SAME
What is needed to conduct inferential stats on independent samples?
Sampling Distribution =
1. Mean and
2. Standard Deviation….
of the difference in the means of the samples
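For reference, the standard result for two independent samples can be written out as below. Here \mu and \sigma correspond to the "u" and population standard deviation used in these cards, and n_1, n_2 are the two sample sizes.

```latex
\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2,
\qquad
\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}
```

When the population standard deviations are not known, the sample standard deviations s_1 and s_2 stand in for \sigma_1 and \sigma_2.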
What are the two possible situations when the population standard deviations are NOT known?
*Can ALWAYS calc the sample standard deviation;
1. Equal population standard deviations;
2. Unequal population standard deviations
What test can be used to determine equality in variances, but is NOT recommended?
An “F-Test” can be used, but it is not robust to even small deviations from normality
What is the best estimate of the population standard deviation when the pop. standard deviations are EQUAL?
-Pooling together the two sample standard deviations;
-A t-stat using the pooled standard deviation exactly follows the t-distribution with (n1+n2-2) degrees of freedom
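A sketch of the pooled calculation, assuming the usual pooled-variance formula. The two samples and the function name pooled_t are hypothetical. With n = n1 + n2 total observations, the degrees of freedom are n - 2 = n1 + n2 - 2.

```python
import math
import statistics

def pooled_t(sample1, sample2):
    """Return (s_pooled, t0, df) for the equal-standard-deviation two-sample t-test."""
    n1, n2 = len(sample1), len(sample2)
    var1 = statistics.variance(sample1)   # s1^2
    var2 = statistics.variance(sample2)   # s2^2
    df = n1 + n2 - 2
    # Weighted average of the two sample variances, weighted by their df.
    s_pooled = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / df)
    standard_error = s_pooled * math.sqrt(1 / n1 + 1 / n2)
    # Test statistic under H_0: u1 - u2 = 0.
    t0 = (statistics.mean(sample1) - statistics.mean(sample2)) / standard_error
    return s_pooled, t0, df

# Hypothetical independent samples.
group1 = [5.0, 7.0, 9.0]
group2 = [2.0, 4.0, 6.0, 8.0]
s_pooled, t0, df = pooled_t(group1, group2)
print(f"s_pooled = {s_pooled:.3f}, t0 = {t0:.3f}, df = {df}")
```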
What is the best estimate of the population standard deviation when the pop. standard deviations are UNEQUAL?
-an exact method of inference does NOT exist because you CANNOT determine degrees of freedom;
-There is NO formula for a t-stat that follows the t-distribution;
-But WELCH’S APPROX and SATTERTHWAITE’S APPROX are close;
-Both use the same formula with different degrees of freedom approximations
What is Welch’s Approximation of Degrees of Freedom?
-Take the SMALLER of the two sample sizes (n1 or n2), then subtract 1 to determine the degrees of freedom for the t-test (t_0);
-A conservative approximation;
-Easy to use by hand
What is Satterthwaite’s Approximation of Degrees of Freedom?
-A more exact approximation that uses an extensive equation;
-Best used when calculated by machine
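The two approximations can be compared side by side in a short sketch. The data and the function name unequal_variance_t are hypothetical. The "extensive equation" is assumed here to be the formula commonly called the Welch-Satterthwaite equation. The naming follows these cards: the conservative df is min(n1, n2) - 1.

```python
import math
import statistics

def unequal_variance_t(sample1, sample2):
    """Return (t0, df_conservative, df_satterthwaite) for unequal standard deviations."""
    n1, n2 = len(sample1), len(sample2)
    v1 = statistics.variance(sample1) / n1   # s1^2 / n1
    v2 = statistics.variance(sample2) / n2   # s2^2 / n2
    # Same t-stat formula for both approximations; only the df differ.
    t0 = (statistics.mean(sample1) - statistics.mean(sample2)) / math.sqrt(v1 + v2)
    df_conservative = min(n1, n2) - 1        # easy to do by hand
    df_satterthwaite = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t0, df_conservative, df_satterthwaite

# Hypothetical independent samples.
group1 = [5.0, 7.0, 9.0]
group2 = [2.0, 4.0, 6.0, 8.0]
t0, df_cons, df_satt = unequal_variance_t(group1, group2)
print(f"t0 = {t0:.3f}, conservative df = {df_cons}, Satterthwaite df = {df_satt:.3f}")
```

The Satterthwaite df is usually not a whole number and falls between min(n1, n2) - 1 and n1 + n2 - 2, which is why the smaller-sample rule is the conservative choice.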
How many columns of data can ANOVA be used for?
-As many columns of data as needed
What are the three possible hypotheses with 3 columns of data for a t-test?
1. H0: u1=u2;
2. H0: u1=u3;
3. H0: u2=u3
What is the problem with running 3 independent t-tests?
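-The chance of a false positive (Type I error) adds up each time a t-test is run, since the t-tests are essentially independent of one another;
-Overall alpha is roughly .05+.05+.05 = .15 (exactly 1-(.95)^3 = .143 if the tests are independent);
*Chance of error is about 3x higher than preferred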
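The arithmetic can be checked with a quick sketch. The variable names are illustrative, and the "exact" figure assumes the three tests really are independent.

```python
import math

alpha = 0.05                      # significance level of each individual t-test
num_columns = 3
num_tests = math.comb(num_columns, 2)   # number of pairwise t-tests among the columns

additive_bound = num_tests * alpha                    # simple additive estimate
exact_if_independent = 1 - (1 - alpha) ** num_tests   # P(at least one false positive)

print(f"tests: {num_tests}")
print(f"additive estimate: {additive_bound:.3f}")
print(f"independent-test value: {exact_if_independent:.3f}")
```

The additive figure is an upper bound that slightly overstates the independent-test value. With more columns the number of pairwise tests grows quickly, and so does the inflation.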
What is a significance level?
The chance of making an incorrect conclusion by rejecting a null hypothesis that is actually true (alpha, the Type I error rate)
How do you make inferences on 3 columns of data?
-Use a stat test that compares two VARIANCES to each other, instead of two means to each other;
-Called an F-TEST because an F-stat is calculated that follows an F-distribution, which supplies the critical values and P-values
What is ANOVA?
A statistical method used to test whether the population means of three or more columns of data are equal to each other;
-Uses the F-test to make decisions;
-Requires only ONE hypothesis test so it controls the significance level at the level set no matter how many populations compared
What is One-Way ANOVA?
An ANOVA method with only one classification variable, called a FACTOR;
-The factor is DISCRETE data that can have as many levels as needed
What is a FACTOR?
A classification variable used to separate data into several columns;
-EX: Age, gender, species, class, etc;
-LEVELS of a factor are the possible categories
Factor = Classification
Levels= Freshman, Sophomore
What is an ANOVA Hypothesis test?
Conducts ONE hypothesis test to find out whether all of the population means (three in this example) are EQUAL to each other (H0: u1=u2=u3)
What is the alternative hypothesis test with ANOVA?
-At least ONE population mean is DIFFERENT from the others;
-Cannot tell if only one population mean is different or if all are different
What stat test does ANOVA use to test the hypothesis?
-F-test to find out which hypothesis (null or alternative) is closer to the truth
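A minimal sketch of how the one-way ANOVA F-stat is built, assuming the standard between-groups / within-groups decomposition. The three columns of data and the function name one_way_anova_f are hypothetical.

```python
import statistics

def one_way_anova_f(groups):
    """Return (f_stat, df_between, df_within) for a list of data columns."""
    k = len(groups)                              # number of factor levels
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the column means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own column mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical data: one factor with three levels.
columns = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_between, df_within = one_way_anova_f(columns)
print(f"F = {f_stat:.3f} with ({df_between}, {df_within}) degrees of freedom")
```

A large F-stat means the column means vary more than the scatter within the columns would explain, which favors the alternative hypothesis. The decision itself still needs an F critical value or P-value.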
What is an F-test?
A stat test to determine whether two VARIANCES are equal;
-Tests for the equality of two variances
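As a rough sketch, the F-stat for two variances is the ratio of the two sample variances. The data and the function name variance_ratio are hypothetical, and some textbooks put the larger variance in the numerator by convention, which this sketch does not do.

```python
import statistics

def variance_ratio(sample1, sample2):
    """Return (f_stat, df_numerator, df_denominator) for H_0: equal variances."""
    f_stat = statistics.variance(sample1) / statistics.variance(sample2)   # s1^2 / s2^2
    return f_stat, len(sample1) - 1, len(sample2) - 1

# Hypothetical independent samples.
group1 = [5.0, 7.0, 9.0]
group2 = [2.0, 4.0, 6.0, 8.0]
f_stat, df_num, df_den = variance_ratio(group1, group2)
print(f"F = {f_stat:.3f} with ({df_num}, {df_den}) degrees of freedom")
```

A ratio near 1 is consistent with equal variances. How far from 1 counts as significant depends on the F-distribution with these degrees of freedom.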