Statistics Flashcards
Cluster sampling
involves selecting units or groups of individuals from the population (e.g., schools, hospitals, clinics).
exists in contrast to simple random sampling and stratified random sampling (which involve selecting individuals from the population)
Probability Sampling
When using probability sampling, each element in the target population has a known chance of being selected for inclusion in the sample.
Methods of probability sampling include (see the sketch below):
- simple random sampling,
- stratified random sampling, and
- cluster sampling.
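As a rough illustration, the sketch below draws each type of sample from a hypothetical pandas DataFrame; the population of 1,000 people, the school and stratum columns, and the sample sizes are made-up values for demonstration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical population: 1,000 people nested in 20 schools, two strata.
population = pd.DataFrame({
    "person_id": np.arange(1000),
    "school": rng.integers(0, 20, size=1000),              # cluster membership
    "stratum": rng.choice(["urban", "rural"], size=1000),
})

# Simple random sampling: each individual has an equal, known chance of selection.
srs = population.sample(n=100, random_state=0)

# Stratified random sampling: individuals are sampled within each stratum.
stratified = population.groupby("stratum").sample(frac=0.1, random_state=0)

# Cluster sampling: whole schools are selected, then everyone in those schools is included.
chosen_schools = rng.choice(population["school"].unique(), size=5, replace=False)
cluster = population[population["school"].isin(chosen_schools)]
```

Note that in cluster sampling the unit of selection is the group (the school), not the individual, which is the contrast drawn in the cluster sampling card above.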
Non-Parametric Tests
Nonparametric tests are inferential statistical tests used to analyze nominal or ordinal data (or interval or ratio data when the assumptions for a parametric test have not been met). They include (see the sketch below):
- chi-square test
- Mann-Whitney U test
- Wilcoxon matched-pairs test
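Each of these tests is available in scipy.stats; the sketch below applies them to made-up data (a 2x2 frequency table for the chi-square test, two independent groups for the Mann-Whitney U test, and paired scores for the Wilcoxon test).

```python
import numpy as np
from scipy import stats

# Chi-square test of independence on a 2x2 frequency table (nominal data).
observed = np.array([[30, 10],
                     [20, 40]])
chi2, p_chi2, dof, expected = stats.chi2_contingency(observed)

# Mann-Whitney U test: two independent groups, ranked (ordinal) data.
group_a = [3, 5, 4, 6, 7, 2]
group_b = [8, 9, 7, 6, 10, 9]
u_stat, p_u = stats.mannwhitneyu(group_a, group_b)

# Wilcoxon matched-pairs signed-rank test: two related (paired) measurements.
pre  = [10, 12, 9, 15, 11, 13]
post = [11, 14, 12, 19, 16, 7]
w_stat, p_w = stats.wilcoxon(pre, post)
```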
Benefits of Parametric Tests
An advantage of parametric tests is that they are more statistically “powerful” than nonparametric tests (i.e., more likely to detect a true effect when one exists).
They include the Student’s t-test and the analysis of variance.
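A minimal sketch of both tests using scipy.stats, with made-up scores:

```python
from scipy import stats

# Student's t-test: compare the means of two independent groups.
group_1 = [23, 25, 28, 30, 27, 26]
group_2 = [31, 29, 33, 35, 30, 32]
t_stat, p_t = stats.ttest_ind(group_1, group_2)

# One-way analysis of variance: compare the means of three (or more) groups.
group_3 = [40, 38, 42, 41, 39, 43]
f_stat, p_f = stats.f_oneway(group_1, group_2, group_3)
```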
Parametric Tests
Parametric tests are inferential statistical tests that are used when the data to be analyzed represent an interval or ratio scale and when certain assumptions about the population distribution(s) have been met - i.e., when scores on the variable of interest are normally distributed and when there is homoscedasticity (population variances are equal).
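These two assumptions are commonly screened with a normality test and a test of equal variances; one reasonable approach (not the only one) is sketched below using the Shapiro-Wilk and Levene tests from scipy.stats on made-up data.

```python
from scipy import stats

group_1 = [23, 25, 28, 30, 27, 26, 24, 29]
group_2 = [31, 29, 33, 35, 30, 32, 34, 28]

# Normality: Shapiro-Wilk test on each group's scores.
p_norm_1 = stats.shapiro(group_1).pvalue
p_norm_2 = stats.shapiro(group_2).pvalue

# Homoscedasticity: Levene's test for equal population variances.
p_equal_var = stats.levene(group_1, group_2).pvalue

# Non-significant results (e.g., p > .05) are consistent with the assumptions being met.
```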
Normal Curve/Areas Under The Normal Curve
In a normal distribution,
- about 68% of observations fall between the scores that are plus and minus one standard deviation from the mean,
- about 95% between the scores that are plus and minus two standard deviations from the mean, and
- about 99.7% between the scores that are plus and minus three standard deviations from the mean.
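These areas can be recovered from the standard normal cumulative distribution function; a quick check with scipy:

```python
from scipy.stats import norm

# Area under the standard normal curve within ±1, ±2, and ±3 standard deviations.
for k in (1, 2, 3):
    area = norm.cdf(k) - norm.cdf(-k)
    print(f"within ±{k} SD: {area:.4f}")  # ~0.6827, ~0.9545, ~0.9973
```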
Experimentwise Error Rate
The experimentwise error rate (also known as the familywise error rate) is the probability of making a Type I error (rejecting the null hypothesis when it is actually true, i.e., claiming an “effect” when there is no effect).
As the number of statistical comparisons in a study increases, the experimentwise error rate increases.
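Assuming the comparisons are independent and each is tested at the same per-comparison alpha, the experimentwise error rate is 1 - (1 - alpha)^c for c comparisons; a quick illustration:

```python
alpha = 0.05  # per-comparison Type I error rate

# Experimentwise (familywise) error rate grows with the number of comparisons.
for c in (1, 3, 5, 10):
    experimentwise = 1 - (1 - alpha) ** c
    print(f"{c} comparisons: {experimentwise:.3f}")  # 0.050, 0.143, 0.226, 0.401
```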
Mixed (Split Plot) ANOVA
The mixed ANOVA is a type of factorial ANOVA that is used when a study includes at least one between-groups independent variable and one within-subjects independent variable.
Cross-Validation/Shrinkage
Cross-validation refers to validating a correlation coefficient (e.g., a criterion-related validity coefficient) on a new sample. Because the same chance factors operating in the original sample are not operating in the subsequent sample, the correlation coefficient tends to “shrink” on cross-validation. In terms of the multiple correlation coefficient (R), shrinkage is greatest when the original sample is small and the number of predictors is large.
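Shrinkage can also be estimated without collecting a new sample by using the adjusted R-squared formula, a common formula-based alternative to empirical cross-validation; the sample sizes and predictor counts below are illustrative.

```python
def adjusted_r_squared(r_squared, n, k):
    # Adjust R^2 for sample size (n) and number of predictors (k).
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Shrinkage is greatest when the sample is small and the number of predictors is large.
print(adjusted_r_squared(0.50, n=200, k=3))   # ~0.49 (little shrinkage)
print(adjusted_r_squared(0.50, n=20,  k=10))  # ~-0.06 (severe shrinkage)
```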
One-Way ANOVA F Ratio
The one-way ANOVA yields an F-ratio that indicates whether any group means differ significantly. The F-ratio represents a measure of treatment effects plus error divided by a measure of error only (MSB/MSW, the mean square between groups divided by the mean square within groups). When the treatment has had an effect, the F-ratio is larger than 1.0.
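A worked sketch of the computation on made-up scores, cross-checked against scipy:

```python
import numpy as np
from scipy import stats

groups = [np.array([4, 5, 6, 5]),
          np.array([7, 8, 9, 8]),
          np.array([10, 11, 12, 11])]

all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()
k, n = len(groups), len(all_scores)

# Mean square between groups: treatment effects plus error.
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ms_between = ss_between / (k - 1)

# Mean square within groups: error only.
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
ms_within = ss_within / (n - k)

f_ratio = ms_between / ms_within            # well above 1.0 when the treatment has an effect
f_check, p_value = stats.f_oneway(*groups)  # same F-ratio
```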
One-Way ANOVA
The one-way ANOVA is a parametric statistical test used to compare the means of two or more groups when a study includes one IV and one DV that is measured on an interval or ratio scale.
Trend Analysis
Trend analysis is a type of analysis of variance that is used to assess linear and nonlinear trends when the independent variable is quantitative.
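A minimal sketch using orthogonal polynomial contrast coefficients for three equally spaced levels of the IV; the group means are made up, and a full trend analysis would also test each contrast against the ANOVA error term.

```python
import numpy as np

# Group means of the DV at three equally spaced levels of a quantitative IV
# (e.g., low, medium, and high dose) -- illustrative numbers only.
means = np.array([10.0, 14.0, 15.0])

# Orthogonal polynomial contrast coefficients for three levels.
linear    = np.array([-1, 0, 1])
quadratic = np.array([1, -2, 1])

# Larger (absolute) contrast values suggest stronger linear or quadratic trends.
print("linear trend:   ", linear @ means)     # 5.0
print("quadratic trend:", quadratic @ means)  # -3.0
```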
Sampling Distribution
How is it Used
The sampling distribution is used in inferential statistics to determine how likely it is to obtain a particular sample mean given the following (a worked example appears below):
- the population mean,
- the population standard deviation,
- the sample size, and
- the level of significance.
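A worked example with illustrative numbers (population mean 100, population standard deviation 15, sample size 36, alpha = .05):

```python
import math
from scipy.stats import norm

mu, sigma, n, alpha = 100, 15, 36, 0.05

sem = sigma / math.sqrt(n)   # standard error of the mean (2.5)
z = (105 - mu) / sem         # where a sample mean of 105 falls in the sampling distribution (2.0)
p = 1 - norm.cdf(z)          # probability of a sample mean this high or higher (~0.0228)

print(p < alpha)             # True: unlikely at the .05 level of significance
```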
Standard Error of the Mean
The standard error of the mean is the standard deviation of the sampling distribution of the mean. It is equal to the population standard deviation divided by the square root of the sample size.
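In symbols, where sigma is the population standard deviation and n is the sample size:

```latex
\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}
```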
Sampling Distribution
Shape and Mean
- The sampling distribution is normally shaped, and
- its mean is equal to the population mean.
Sampling Distribution of the Mean
Definition
The sampling distribution of the mean is the distribution of sample means that would be obtained if an infinite number of equal-size samples were randomly selected from the population and the mean for each sample was calculated.
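This definition can be approximated by simulation; in the sketch below, the (deliberately non-normal) population, the sample size, and the number of samples are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical, positively skewed population of 100,000 scores.
population = rng.exponential(scale=10, size=100_000)
mu, sigma = population.mean(), population.std()

# Draw many equal-size samples and record each sample's mean.
n = 50
sample_means = np.array([rng.choice(population, size=n).mean() for _ in range(10_000)])

print(sample_means.mean(), mu)                 # close to the population mean
print(sample_means.std(), sigma / np.sqrt(n))  # close to the standard error of the mean
```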
Dependent Variables
The dependent variable (DV) is the variable that is believed to be affected by the independent variable and is observed and measured.
Independent Variables
The independent variable (IV) is the variable that is believed to have an effect on the dependent variable and is varied or manipulated by the researcher in an experimental research study.
Each independent variable in a study must have at least two levels.
Scales Of Measurement
- nominal
- ordinal
- interval
- ratio
A nominal scale yields “frequency data” (the frequency of observations in each nominal category). Ordinal, interval, and ratio scales provide scale values or scores.
negatively skewed distribution
In a negatively skewed distribution, the majority of scores are on the high side of the distribution but a few are on the low (negative) side, and the mode is greater than the median, which is greater than the mean.
positively skewed distribution
In a positively skewed distribution, most scores are on the low side of the distribution but a few are on the high (positive) side, and the mean is greater than the median, which, in turn, is greater than the mode.
Skewed Distributions
Skewed distributions are asymmetrical distributions in which the majority of scores are located on one side of the distribution.
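A quick simulation check of the mean-median ordering in a positively skewed distribution (the exponential population is an arbitrary choice; the mode is omitted because it is not well defined for continuous simulated scores):

```python
import numpy as np

rng = np.random.default_rng(1)

# Positively skewed scores with a long right tail (e.g., reaction times or incomes).
scores = rng.exponential(scale=10, size=10_000)

print(np.mean(scores) > np.median(scores))  # True: the mean is pulled toward the high tail
```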
Random Assignment
Random assignment involves randomly assigning subjects to treatment groups and is sometimes referred to as “randomization.”
It is considered the “hallmark” of true experimental research because it enables an investigator to conclude that any observed effect of an IV on the DV is due to the IV rather than to pre-existing differences between the groups or other sources of systematic error.
(Random assignment must not be confused with random selection, which refers to randomly selecting subjects from the population.)
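A minimal sketch of the procedure, assuming a hypothetical pool of 20 already-selected subjects split into two equal-size groups:

```python
import numpy as np

rng = np.random.default_rng(7)

# Subjects already selected for the study (random selection is a separate step).
subjects = [f"S{i:02d}" for i in range(1, 21)]

# Random assignment: shuffle the pool, then split it into treatment and control groups.
shuffled = rng.permutation(subjects)
treatment_group, control_group = shuffled[:10], shuffled[10:]
```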
Mode
The mode is the most frequently occurring score or category, and it is used as a measure of central tendency for nominal variables or variables that are being treated as nominal variables.