Flashcards in Statistics exam 3 Deck (64):
alpha
the symbol for level of significance
beta
the probability of failing to reject (accepting) a false null hypothesis
alternative hypothesis
the hypothesis that the researcher wants to prove or verify; a statement about the value of a parameter that is either "less than", "greater than", or "not equal to".
approximate two-sample t test
a test for comparing the means of two independent samples or two treatments where the test statistic has an approximate t distribution. The formula for computing degrees of freedom is complicated.
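The card above does not say which degrees-of-freedom formula it means. A minimal sketch, assuming it is the usual Welch-Satterthwaite approximation (the function name `welch_df` is hypothetical):

```python
def welch_df(s1, n1, s2, n2):
    """Approximate degrees of freedom for the two-sample t statistic.

    Assumption: the Welch-Satterthwaite approximation; the card itself
    only says the formula is complicated.
    """
    v1 = s1 ** 2 / n1  # estimated variance of the first sample mean
    v2 = s2 ** 2 / n2  # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
```

The result always falls between the conservative value min(n1-1, n2-1) and the pooled value n1+n2-2.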
ANOVA (analysis of variance)
a statistical procedure for testing the equality of means using variances.
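As a rough illustration of "testing the equality of means using variances", the sketch below computes the one-way ANOVA F statistic as the ratio of between-group variation to within-group variation. The function name `anova_f` and the list-of-lists input are assumptions made for this example.

```python
import statistics

def anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups: list of samples, each a list of numbers with at least 2 values.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((len(g) - 1) * statistics.variance(g) for g in groups)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within, k - 1, n_total - k
```

A large F means the group means differ more than the within-group variability would explain.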
Central Limit Theorem:
the name of the statement telling us that when sampling from a non-normal population, the sampling distribution of x bar is approximately Normal whenever the sample is large and random.
Claimed parameter value
the value of the parameter given in the null hypothesis
conditions
the basic premises for inferential procedures. If the conditions are not met, the results may not be valid.
conditions necessary for a one-sample t procedure (using t* for C.I. or getting P value from t table):
normality of the original population and SRS. Check data collection and, if n ≥ 40, apply CLT.
Conditions necessary for a two-sample t procedure (using t* for C.I. or getting P-value from t table)
normality of both populations and either stratified sample (independent SRS's) or random allocation. Check data collection and, if n1 + n2 ≥ 40, apply CLT.
conditions necessary for matched pairs t procedure (using t* for C.I. or getting P-value from t table)
normality of population of differences and either SRS or random allocation. Check data collection and, if the number of pairs ≥ 40, apply CLT.
Conditions necessary for ANOVA
normality of all populations, equality of variances, and either stratified sample (independent SRS's) or random allocation. Check data collection; if n1 + n2 + ... + nk ≥ 40, apply CLT; and check that the largest standard deviation divided by the smallest standard deviation is less than 2.
confidence interval
an estimate of the value of a parameter in interval form with an associated level of confidence; in other words, a list of reasonable or plausible values for the parameter based on the value of a statistic.
conservative two-sample t test
a test for comparing the means from two independent samples or two treatments where the degrees of freedom are taken to be the minimum of (n1-1) and (n2-1).
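A minimal sketch of the conservative approach, assuming the null hypothesis is a difference of 0 and the usual unpooled standard error; the function name `conservative_two_sample_t` is hypothetical.

```python
import math

def conservative_two_sample_t(xbar1, s1, n1, xbar2, s2, n2):
    """Return (t, df) using the conservative degrees of freedom."""
    # Standard error of the difference between the two sample means
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    t = (xbar1 - xbar2) / se
    df = min(n1 - 1, n2 - 1)  # the smaller of the two one-sample df values
    return t, df
```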
decreases
what happens to the width of a confidence interval when sample size is increased (or level of confidence is decreased)
degrees of freedom
a characteristic of the t-distribution; a measure of the amount of information available for estimating sigma using s.
a condition for ANOVA; the condition is met when the largest standard deviation divided by the smallest standard deviation is less than 2.
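The rule of thumb on this card is easy to check directly. A small sketch, where the function name `equal_variance_ok` is made up for illustration:

```python
def equal_variance_ok(sample_sds, limit=2.0):
    """True when largest sd / smallest sd is below the limit (2 on the card).

    sample_sds: the sample standard deviations of the groups, all positive.
    """
    return max(sample_sds) / min(sample_sds) < limit
```

For example, standard deviations of 2.1, 3.0 and 3.9 give a ratio of about 1.86, so the condition is met.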
estimated standard deviation of x bar.
called standard error of x bar and equals s/sqrt(n); measures variability of the sampling distribution of x bar.
Fail to reject H0
The appropriate statistical conclusion when the P-value is greater than alpha
results from statistical analysis performed on non-random samples or experimental data obtained without random allocation of treatments to individuals.
inference
using results about sample statistics to draw conclusions about population parameters
laws of probability
the basis for hypothesis testing and confidence interval estimation
level of confidence
The percent of the time that the confidence interval estimation procedure will give you intervals containing the value of the parameter being estimated. After data are collected, level of confidence is no longer a probability because a calculated confidence interval either contains the value of the parameter or it doesn't.
level of significance (symbolized by alpha)
the probability of rejecting a true null hypothesis; equivalently, the largest risk a researcher is willing to take of rejecting a true null hypothesis
lower tailed test (also called a left-tailed test)
a test with "<" in the alternative hypothesis. This is a one-sided test.
Margin of error for 95% confidence
the maximum amount that a statistic value will differ from the parameter value for the middle 95% of the distribution of all possible statistics.
matched pairs
either two measurements are taken on each individual, such as pre and post, OR two individuals are matched by a third variable (different from the explanatory variable and the response variable), such as identical twins or windows matched by installer when comparing installation time of two brands of windows.
matched pairs t test
the hypothesis testing method for matched pairs data. The typical null hypothesis is H0: mu.d = 0, where mu.d is the mean difference between treatments. For this test, a difference is computed within every pair. The mean and standard deviation of these differences are computed and used in computing the test statistic.
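The steps on this card can be sketched as follows, assuming a null hypothesis of zero mean difference; the function name `matched_pairs_t` and the argument names are illustrative only.

```python
import math
import statistics

def matched_pairs_t(first, second):
    """Return (t, df) for paired measurements, testing a mean difference of 0."""
    # One difference per pair
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    d_bar = statistics.mean(diffs)   # mean of the differences
    s_d = statistics.stdev(diffs)    # standard deviation of the differences
    t = d_bar / (s_d / math.sqrt(n))
    return t, n - 1
```

The degrees of freedom are the number of pairs minus 1, because the test is a one-sample t test on the differences.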
mu0
the claimed value of the population mean given in H0.
performing two or more tests of significance on the same data set. This inflates the overall alpha (probability of type I error) for the tests. (The more analyses performed, the greater the chance of falsely rejecting at least one true null hypothesis)
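To see how quickly the overall alpha grows, the sketch below computes the chance of at least one false rejection across k tests. It assumes the tests are independent and all null hypotheses are true, which the card does not state; the function name `overall_alpha` is hypothetical.

```python
def overall_alpha(alpha, k):
    """Probability of at least one type I error in k independent tests,
    each run at level alpha, when every null hypothesis is true."""
    return 1 - (1 - alpha) ** k
```

With alpha = 0.05, ten such tests give an overall type I error probability of about 0.40.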
null hypothesis
The hypothesis of no difference or no change. The hypothesis that the researcher assumes to be true until sample results indicate otherwise. Generally, the hypothesis that the researcher wants to disprove.
The difference between the observed statistic and the claimed parameter value; x bar - mu
one-sided or one-tailed test
a test where the alternative hypothesis contains either "<" or ">"
one sample t test
an inferential statistical procedure that uses the mean from one sample of data for either estimating the mean of the population or testing whether the mean of the population equals some claimed value.
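For the testing half of this card, a minimal sketch of the one-sample t statistic; the function name `one_sample_t` and the argument `mu0` (the claimed mean from H0) are labels chosen for this example.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t, df) for testing H0: population mean equals mu0."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)      # sample standard deviation
    standard_error = s / math.sqrt(n)
    return (x_bar - mu0) / standard_error, n - 1
```

The P-value would then be read from a t table using the returned degrees of freedom.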
outlier
an observation that falls outside the pattern of the data set.
parameter
a characteristic of a population that is usually unknown; this characteristic could be the mean, median, proportion, standard deviation, etc.
pooled two-sample t test
a test for comparing the means of two independent samples or two treatments where the test statistic has an exact t distribution. Degrees of freedom = n1 + n2 - 2. Because this test requires that the two populations have equal variances, the approximate two-sample t is recommended instead.
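A sketch of the pooled calculation, assuming a null hypothesis of zero difference and the standard pooled variance estimate (the card gives only the degrees of freedom); the function name `pooled_two_sample_t` is hypothetical.

```python
import math

def pooled_two_sample_t(xbar1, s1, n1, xbar2, s2, n2):
    """Return (t, df) for the pooled two-sample t test of equal means."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (xbar1 - xbar2) / se, df
```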
power
the probability of rejecting a false null hypothesis; computed as 1 - beta. Increase power by increasing sample size.
practical significance
a difference between the observed statistic and the claimed parameter value that is large enough to be worth reporting. To assess practical significance, look at the numerator of the test statistic and ask "is it worth anything?" If yes, then results are also of practical significance. Do not assess practical significance unless results are statistically significant.
P-value
The probability of getting a test statistic as extreme or more extreme than the value observed, assuming H0 is true. Equivalently, the probability of obtaining a test statistic value at least as far from what H0 predicts as the value actually obtained, if H0 were true.
Reject H0
The appropriate statistical conclusion when the P-value is less than alpha
robust
a statistical procedure that is not sensitive to moderate deviations from an assumption upon which it is based; in other words, the confidence level or P-value does not change very much if the conditions for use are not met.
sampling variability
the variability of sample results from one sample to the next; something we must measure in order to do inference effectively. Margin of error only covers sampling variability.
standard deviation (s)
a measure of the variability (spread) of data in a sample
standard deviation of x bar
a measure of the variability of the sampling distribution of x bar; equals sigma/sqrt(n)
standard error of x bar
a measure of the variability of the sampling distribution of x bar; estimates the standard deviation of the sampling distribution of x bar; computed using formula s/sqrt(n)
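The formula on this card can be computed directly from raw data. A small sketch using only the Python standard library (the function name `standard_error_of_mean` is made up for this example):

```python
import math
import statistics

def standard_error_of_mean(sample):
    """Standard error of x bar: sample standard deviation over sqrt(n)."""
    s = statistics.stdev(sample)  # uses the n - 1 divisor
    return s / math.sqrt(len(sample))
```

Quadrupling the sample size roughly halves the standard error, since n appears under a square root.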
standard error of a statistic
an estimate of the standard deviation of the sampling distribution of the statistic; in other words, it is a measure of the variability of the statistic.
statistic
a characteristic of a sample; a number computed from sample data (without any knowledge of the value of a parameter) used to estimate the value of a parameter. Examples include x bar, the sample mean, and s, the sample standard deviation.
statistical significance
a difference between the observed statistic and the claimed parameter value as given in H0 that is too large to be due to chance.
statistically significant
results of a study that differ too much from what we would expect from randomization alone to be attributed to chance variation.
symbols for statistics
x bar, s
symbols for parameters
mu, sigma
t distribution
a distribution specified by degrees of freedom used to model test statistics for the one sample t test, the two sample t test, etc. where sigma is unknown. Also used to obtain a confidence interval for estimating a population mean, or the difference between two population means, etc.
t*
the multiplier of standard error in computing margin of error for estimating a mean (or the difference between two means). The value for t* is found on the t table at the intersection of the appropriate df row and level of confidence column.
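A minimal sketch of how the multiplier is used for a one-sample confidence interval. The caller supplies t* from the t table; the function name `t_confidence_interval` is hypothetical.

```python
import math

def t_confidence_interval(x_bar, s, n, t_star):
    """Return (lower, upper) for the mean: x bar plus or minus t* times SE.

    t_star must be looked up by the caller for df = n - 1 and the
    desired level of confidence.
    """
    margin_of_error = t_star * s / math.sqrt(n)
    return x_bar - margin_of_error, x_bar + margin_of_error

# Example: n = 25 gives df = 24; the 95% table value is t* = 2.064
low, high = t_confidence_interval(50.0, 10.0, 25, 2.064)
```

Here the margin of error is 2.064 times 10/sqrt(25) = 4.128, so the interval runs from 45.872 to 54.128.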
Test of significance
procedure used to assess the evidence against a claim (hypothesis) about the value of a parameter
test statistic
a number that summarizes the data for a test of significance and is used to obtain a P-value.
two-sample t procedure
a statistical procedure used to compare the means from two populations either with a test of their equality or by estimating the difference between the two population means.
two sided test
a test where the alternative hypothesis contains "not equal to"
type I error
the error made when a true null hypothesis is rejected. You reject H0 when H0 is true
type II error
the error made when a false null hypothesis is not rejected. You fail to reject H0 when H0 is false.
unbiased
a condition where the mean of all possible statistic values equals the parameter that the statistic estimates.
upper tailed test
a test with ">" in the alternative hypothesis.
variance
The square of standard deviation. Sample variance is s^2 and population variance is sigma^2.