Statistics and Research Design - Flash Cards
(45 cards)
Effect Size
An effect size is a measure of the magnitude of the relationship between independent and dependent variables and is useful for interpreting the relationship’s clinical or practical significance (e.g., for comparing the clinical effectiveness of two or more treatments). Several methods are used to calculate an effect size, including Cohen’s d (which indicates the difference between two groups in terms of standard deviation units) and eta squared (which indicates the percent of variance in the dependent variable that is accounted for by variance in the independent variable).
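A minimal Python sketch (with hypothetical treatment and control scores) of how Cohen’s d can be computed as the difference between two group means expressed in pooled standard deviation units:

```python
import numpy as np

def cohens_d(group1, group2):
    """Cohen's d: mean difference expressed in pooled standard deviation units."""
    g1, g2 = np.asarray(group1, dtype=float), np.asarray(group2, dtype=float)
    n1, n2 = len(g1), len(g2)
    # Pooled standard deviation (each group's variance computed with n - 1)
    pooled_sd = np.sqrt(((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1)) / (n1 + n2 - 2))
    return (g1.mean() - g2.mean()) / pooled_sd

# Hypothetical treatment and control scores
treatment = [24, 27, 30, 29, 26, 31]
control = [21, 23, 25, 22, 24, 26]
print(round(cohens_d(treatment, control), 2))  # difference in standard deviation units
```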
Cluster Analysis
Cluster analysis is a multivariate technique that is used to group people or objects into a smaller number of mutually exclusive and exhaustive subgroups (clusters) based on their similarities - i.e., to group people or objects so that the identified subgroups have within-group homogeneity and between-group heterogeneity.
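As an illustration, one common clustering algorithm (k-means, here via scikit-learn, applied to hypothetical data) assigns each case to exactly one cluster so that within-cluster variability is minimized:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical scores for 6 people on two measures
X = np.array([[1.0, 2.0], [1.2, 1.8], [0.9, 2.1],
              [8.0, 9.0], [8.3, 8.7], [7.9, 9.2]])

# k-means groups cases into k mutually exclusive clusters by minimizing
# within-cluster (within-group) variability
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)  # cluster membership for each case, e.g., [0 0 0 1 1 1] or [1 1 1 0 0 0]
```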
Experimentwise Error Rate
The experimentwise error rate (also known as the familywise error rate) is the probability of making at least one Type I error across all of the statistical comparisons conducted in a study. As the number of statistical comparisons increases, the experimentwise error rate increases.
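For roughly independent comparisons each tested at the same per-comparison alpha, the experimentwise error rate is often approximated as 1 - (1 - alpha)^c, where c is the number of comparisons; a quick calculation shows how it grows:

```python
# Approximate experimentwise (familywise) error rate for c independent
# comparisons, each tested at the same per-comparison alpha:
# P(at least one Type I error) = 1 - (1 - alpha)^c
alpha = 0.05
for c in (1, 3, 5, 10):
    print(c, round(1 - (1 - alpha) ** c, 3))
# 1 0.05, 3 0.143, 5 0.226, 10 0.401 -- the rate grows with the number of comparisons
```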
Standard Deviation
The standard deviation is a measure of dispersion (variability) of scores around the mean of the distribution. It is the square root of the variance and is calculated by dividing the sum of the squared deviation scores by N (or N - 1) and taking the square root of the result.
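A short worked example in Python (hypothetical scores) showing the calculation with N in the denominator and with N - 1 (the unbiased sample estimate):

```python
import math

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores
n = len(scores)
mean = sum(scores) / n

sum_sq_dev = sum((x - mean) ** 2 for x in scores)
population_sd = math.sqrt(sum_sq_dev / n)        # divide by N
sample_sd = math.sqrt(sum_sq_dev / (n - 1))      # divide by N - 1 (unbiased estimate)
print(population_sd, sample_sd)                  # 2.0 and about 2.14
```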
Random Error
Random error is error that is unpredictable and varies from one observation to the next. Sampling error and measurement error are types of random error.
Random Assignment
Random assignment involves randomly assigning subjects to treatment groups and is sometimes referred to as “randomization.” It is considered the “hallmark” of true experimental research because it enables an investigator to conclude that any observed effect of an IV on the DV is due to the IV rather than to error. (Random assignment must not be confused with random selection, which refers to randomly selecting subjects from the population.)
One-Way ANOVA
The one-way ANOVA is a parametric statistical test used to compare the means of two or more groups when a study includes one IV and one DV that is measured on an interval or ratio scale. The one-way ANOVA yields an F-ratio that indicates whether any of the group means differ significantly. The F-ratio represents a measure of treatment effects plus error divided by a measure of error only (MSB/MSW). When the treatment has had an effect, the F-ratio is larger than 1.0.
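A minimal sketch using scipy.stats.f_oneway (hypothetical scores for three groups) that returns the F-ratio and its p-value:

```python
from scipy import stats

# Hypothetical DV scores for three treatment groups
group1 = [10, 12, 11, 13, 12]
group2 = [14, 15, 13, 16, 15]
group3 = [9, 8, 10, 9, 11]

f_ratio, p_value = stats.f_oneway(group1, group2, group3)
print(f_ratio, p_value)  # F = MSB/MSW; a small p suggests at least one mean differs
```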
MANOVA (Multivariate Analysis of Variance)
The MANOVA is a form of the ANOVA that is used when a study includes one or more IVs and two or more DVs that are each measured on an interval or ratio scale. Use of the MANOVA helps reduce the experimentwise error rate and increase power by simultaneously analyzing the effects of the IV(s) on all of the DVs.
Factorial ANOVA
The factorial ANOVA is the appropriate statistical test when a study includes two or more IVs (i.e., when the study has used a factorial design) and a single DV that is measured on an interval or ratio scale. It is also referred to as a two-way ANOVA, three-way ANOVA, etc., with the words “two” and “three” referring to the number of IVs.
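One way to run a factorial (two-way) ANOVA in Python is with statsmodels; the sketch below uses hypothetical data with two IVs (drug, therapy) and one continuous DV, and requests both main effects and the interaction:

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical balanced 2 x 2 factorial data: two IVs (drug, therapy), one continuous DV
df = pd.DataFrame({
    "drug":    ["yes", "yes", "no", "no"] * 4,
    "therapy": ["cbt", "none"] * 8,
    "dv":      [12, 9, 8, 6, 13, 10, 7, 5, 11, 9, 8, 7, 14, 8, 6, 6],
})

# 'C(drug) * C(therapy)' requests both main effects and the interaction
model = ols("dv ~ C(drug) * C(therapy)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```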
Alpha
Alpha is the probability of rejecting the null hypothesis when it is true; i.e., the probability of making a Type I error. The value of alpha is set by the experimenter prior to collecting or analyzing the data. In psychological research, alpha is commonly set at .01 or .05.
Normal Curve/Areas Under The Normal Curve
A normal curve is a symmetrical bell-shaped distribution that is defined by a specific mathematical formula. When scores on a variable are normally distributed, a predictable proportion of observations falls within the areas of the normal curve that are defined by the standard deviation: In a normal distribution, about 68% of observations fall between the scores that are plus and minus one standard deviation from the mean, about 95% between the scores that are plus and minus two standard deviations from the mean, and about 99.7% between the scores that are plus and minus three standard deviations from the mean.
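These proportions can be verified from the standard normal cumulative distribution function, e.g., with scipy.stats.norm:

```python
from scipy.stats import norm

# Proportion of a normal distribution falling within k standard deviations of the mean
for k in (1, 2, 3):
    proportion = norm.cdf(k) - norm.cdf(-k)
    print(k, round(proportion, 4))
# 1 0.6827, 2 0.9545, 3 0.9973
```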
Randomized Block ANOVA
The randomized block ANOVA is the appropriate statistical test when blocking has been used as a method for controlling an extraneous variable (i.e., when the extraneous variable is treated as an independent variable). It allows an investigator to statistically analyze the main and interaction effects of the extraneous variable.
Null And Alternative Hypotheses
In experimental research, an investigator tests a verbal research hypothesis by simultaneously testing two competing statistical hypotheses. The first of these, the null hypothesis, is stated in a way that implies that the independent variable does not have an effect on the dependent variable. The second statistical hypothesis, the alternative hypothesis, states the opposite of the null hypothesis and is expressed in a way that implies that the independent variable does have an effect.
LISREL
LISREL is a structural equation (causal) modeling technique that is used to verify a predefined causal model or theory. It is more complex than path analysis: it allows two-way (non-recursive) paths and takes into account observed variables, the latent traits they are believed to measure, and the effects of measurement error.
Cross-Validation/Shrinkage
Cross-validation refers to validating a correlation coefficient (e.g., a criterion-related validity coefficient) on a new sample. Because the same chance factors operating in the original sample are not operating in the subsequent sample, the correlation coefficient tends to “shrink” on cross-validation. In terms of the multiple correlation coefficient (R), shrinkage is greatest when the original sample is small and the number of predictors is large.
Chi-Square Test (Single-Sample And Multiple-Sample)
The chi-square test is a nonparametric statistical test that is used with nominal data (or data that are being treated as nominal data) - i.e., when the data to be compared are frequencies in each category. The single-sample chi-square test is used when the study includes one variable; the multiple-sample chi-square test is used when it includes two or more variables. (When counting variables for the chi-square test, independent and dependent variables are both included.)
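A brief sketch (hypothetical frequencies) of both versions using scipy.stats: chisquare for the single-sample (goodness-of-fit) test and chi2_contingency for the multiple-sample test:

```python
from scipy.stats import chisquare, chi2_contingency

# Single-sample (goodness-of-fit) test: observed frequencies in one variable's categories
observed = [18, 22, 20, 40]           # e.g., preference counts for four brands
print(chisquare(observed))            # default expectation: equal frequencies per category

# Multiple-sample test: a 2 x 3 contingency table of frequencies for two variables
table = [[30, 20, 10],
         [15, 25, 20]]
chi2, p, dof, expected = chi2_contingency(table)
print(chi2, p, dof)
```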
Multiple Regression/Multicollinearity
Multiple regression is a multivariate technique that is used for predicting a score on a continuous criterion based on performance on two or more continuous and/or discrete predictors. The output of multiple regression is a multiple correlation coefficient (R) and a multiple regression equation. Ideally, predictors included in a multiple regression equation will have low correlations with each other and high correlations with the criterion. A high correlation between predictors is referred to as multicollinearity.
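A minimal sketch (hypothetical data, ordinary least squares via numpy) showing a two-predictor regression equation, the multiple correlation R as the correlation between predicted and observed criterion scores, and a simple multicollinearity check:

```python
import numpy as np
from numpy.linalg import lstsq

# Hypothetical data: two predictors (X1, X2) and a continuous criterion (y)
X1 = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 12.0])
X2 = np.array([1.0, 3.0, 2.0, 5.0, 4.0, 6.0])
y  = np.array([5.0, 9.0, 11.0, 16.0, 17.0, 22.0])

# Fit y = b0 + b1*X1 + b2*X2 by ordinary least squares
X = np.column_stack([np.ones_like(X1), X1, X2])
coefs, *_ = lstsq(X, y, rcond=None)
print(coefs)  # intercept and regression weights

# Multiple correlation R: correlation between predicted and observed criterion scores
print(np.corrcoef(X @ coefs, y)[0, 1])

# Multicollinearity check: correlation between the predictors themselves
print(np.corrcoef(X1, X2)[0, 1])  # a high value signals multicollinearity
```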
Experimental Research (True and Quasi-Experimental)
Experimental research involves conducting an empirical study to test hypotheses about the relationships between independent and dependent variables. A true experimental study permits greater control over experimental conditions, and its “hallmark” is random assignment to groups. A quasi-experimental study permits less control and does not include random assignment (e.g., because it uses pre-existing or intact groups).
Sampling Distribution of the Mean/Standard Error of the Mean
The sampling distribution of the mean is the distribution of sample means that would be obtained if an infinite number of equal-size samples were randomly selected from the population and the mean for each sample was calculated. The sampling distribution is approximately normal in shape (increasingly so as the sample size increases), its mean is equal to the population mean, and its standard deviation (the standard error of the mean) is equal to the population standard deviation divided by the square root of the sample size. The sampling distribution is used in inferential statistics to determine how likely it is to obtain a particular sample mean given the population mean, the population standard deviation, the sample size, and the level of significance.
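A short simulation (hypothetical population values) comparing the theoretical standard error of the mean with the standard deviation of many sample means:

```python
import numpy as np

rng = np.random.default_rng(0)
pop_mean, pop_sd, n = 100, 15, 25

# Theoretical standard error of the mean: population SD divided by the square root of n
print(pop_sd / np.sqrt(n))  # 3.0

# Empirical check: means of many equal-size random samples from the population
sample_means = [rng.normal(pop_mean, pop_sd, n).mean() for _ in range(10_000)]
print(np.mean(sample_means), np.std(sample_means))  # close to 100 and 3.0
```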
Skewed Distributions
Skewed distributions are asymmetrical distributions in which the majority of scores are located on one side of the distribution. In a positively skewed distribution, most scores are on the low side of the distribution but a few scores are on the high (positive) side, and the mean is greater than the median, which, in turn, is greater than the mode. In a negatively skewed distribution, the majority of scores are on the high side of the distribution but a few are on the low (negative) side, and the mode is greater than the median, which is greater than the mean.
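A tiny numeric illustration with a hypothetical positively skewed set of scores, where the mean exceeds the median, which exceeds the mode:

```python
from statistics import mean, median, mode

# Hypothetical positively skewed scores: most scores are low, a few are high
scores = [1, 2, 2, 2, 3, 3, 4, 10]
print(mean(scores), median(scores), mode(scores))  # 3.375 > 2.5 > 2
```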
Mixed Designs
Mixed designs are a type of factorial design in which at least one IV is a between-groups variable and at least one IV is a within-subjects variable.
Probability Sampling
When using probability sampling, each element in the target population has a known chance of being selected for inclusion in the sample. Methods of probability sampling include simple random sampling, stratified random sampling, and cluster sampling. In contrast to simple random sampling and stratified random sampling (which involve selecting individuals from the population), cluster sampling involves selecting units or groups of individuals from the population (e.g., schools, hospitals, clinics).
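A simplified Python sketch (hypothetical population of 100 people across 5 schools) contrasting the three methods: simple random and stratified sampling select individuals, whereas cluster sampling selects whole schools:

```python
import random

random.seed(0)
population = [{"id": i, "school": f"school_{i % 5}"} for i in range(100)]

# Simple random sampling: every individual has an equal chance of selection
simple = random.sample(population, 10)

# Stratified random sampling: randomly sample individuals within each stratum (school)
strata = {}
for person in population:
    strata.setdefault(person["school"], []).append(person)
stratified = [p for members in strata.values() for p in random.sample(members, 2)]

# Cluster sampling: randomly select whole units (schools), then include their members
chosen_schools = random.sample(sorted(strata), 2)
cluster = [p for s in chosen_schools for p in strata[s]]

print(len(simple), len(stratified), len(cluster))  # 10, 10, 40
```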
Within-Subjects Designs
Within-subjects designs are experimental research designs in which each subject receives, at different times, each level of the IV (or combinations of the IVs) so that comparisons on the DV are made within subjects rather than between groups. The single-group time-series design is a type of within-subjects design.