Flashcards in Chapter 10 Analyzing Data Deck (41)
Analysis of variance (ANOVA)
A *parametric* procedure used to test whether there is a difference among *three or more group means*.
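As a rough illustration of what the procedure computes, the sketch below works out a one-way ANOVA F ratio by hand for three small groups. The scores and the helper name `one_way_f` are made up for illustration; a real analysis would also look up the probability associated with F.

```python
from statistics import mean

def one_way_f(groups):
    """Return the one-way ANOVA F ratio for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    k = len(groups)          # number of groups
    n = len(all_scores)      # total number of scores

    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of the scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Hypothetical scores for three groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio = one_way_f(groups)
print(f_ratio)
```

A larger F ratio means the group means differ more than the spread within the groups would suggest.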
Chi-square (χ²)
1. A *nonparametric* procedure used to assess whether a relationship exists between two nominal level variables; symbolized as χ².
2. The most commonly reported *nonparametric* statistic (i.e., used when data are not normally distributed) for determining whether groups are different.
3. Compares the frequency of an observed occurrence (actual number in each category) with the frequency of an expected occurrence (based on theory or past experience).
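The observed-versus-expected comparison can be sketched as a short calculation. This is a minimal example with made-up counts, and the function name `chi_square` is hypothetical; it shows only the statistic itself, not the significance lookup.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected across categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts: 50 patients, expected to split evenly between two categories.
observed = [30, 20]
expected = [25, 25]

chi2 = chi_square(observed, expected)
print(chi2)  # 2.0
```

The further the observed counts drift from the expected counts, the larger the statistic becomes.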
Correlation
A measure that defines the relationship between two variables.
Descriptive statistics
1. Statistics that *describe, organize, and summarize* data.
2. Based on *frequency* and includes *measures of central tendency* and *measures of dispersion*.
Homogeneity of variance
Situation in which the variances of the dependent variable do not differ significantly between or among groups.
Inferential statistics
1. Statistics that *generalize findings* from a sample to a population. Based on *parameters*.
2. Two types: Parameter estimation and hypothesis testing.
Level of confidence
Probability level at which the research hypothesis is accepted with confidence. *A 0.05 level of confidence is the standard among researchers* (that is, researchers are willing to accept a statistically significant result occurring by chance 5 times out of 100).
Mean
1. A measure of central tendency calculated by summing a set of scores and dividing the sum by the total number of scores; also called the average. Represented by x̄ or M.
2. The calculation takes into account *each score* in the distribution. Easily used in more advanced statistical analyses.
3. A precise, stable, and reliable measure, but it is sensitive to *outliers* (the mean will be "pulled" in the direction of extreme values).
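A quick sketch of the calculation and of the "pull" from an outlier, using Python's standard `statistics` module. The scores are made up for illustration.

```python
from statistics import mean

scores = [2, 4, 6, 8, 10]
with_outlier = scores + [100]   # add one extreme value

mean_scores = mean(scores)              # sum 30 divided by 5 scores
mean_with_outlier = mean(with_outlier)  # sum 130 divided by 6 scores

print(mean_scores)        # 6
print(mean_with_outlier)  # roughly 21.67
```

One extreme score moves the mean from 6 to well above every other score in the set.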
Measures of central tendency
1. Descriptive statistics that describe the location or approximate center of a distribution of data.
2. *Three types: Mean, median, and mode.*
3. The specific measure is chosen based on the level of measurement, shape, or form of the distribution of data and research objective.
Measures of dispersion
1. Descriptive statistics that depict the spread or variability among a set of numerical data.
2. *Three types: Range, variance, and standard deviation.* (Also used, but less frequently: Percentile and interquartile range.)
Median
1. A measure of central tendency that represents the *middle score or midpoint* in a distribution. Represented by Mdn; sometimes known as the 50th percentile.
2. An *ordinal statistic* based on ranks: Calculated by first arranging the scores in rank order. If there is an odd number of scores, the median is the middle score. If there is an even number of scores, the median is the point halfway between the two middle scores (and thus would not be a score that appears in the distribution).
3. Does *not* take into account each score in the distribution and is *not* sensitive to extreme scores.
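The odd and even cases, and the insensitivity to extreme scores, can be checked with a few made-up scores. This sketch uses `median` from Python's standard `statistics` module, which sorts the scores before finding the midpoint.

```python
from statistics import median

odd_scores = [7, 1, 5]          # ranked: 1, 5, 7
even_scores = [7, 1, 5, 3]      # ranked: 1, 3, 5, 7
with_outlier = [1, 3, 5, 7, 1000]

median_odd = median(odd_scores)          # middle score
median_even = median(even_scores)        # halfway between 3 and 5
median_outlier = median(with_outlier)    # unaffected by the extreme score

print(median_odd)      # 5
print(median_even)     # 4.0
print(median_outlier)  # 5
```

In the even case the result, 4.0, is not a score that appears in the distribution.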
Mode
1. The score or value that occurs most frequently in a distribution; a measure of central tendency used most often with *nominal-level data*.
2. There may be more than one mode for any distribution of scores - data with a single mode are called *unimodal* and data with two modes are called *bimodal*.
3. Can be applied to *any* set of data at the nominal, ordinal, or interval/ratio level of measurement.
4. Easily identified when a *frequency distribution* is used.
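A small sketch of reading the mode off a frequency distribution, using made-up nominal data. `Counter` builds the frequency distribution and `multimode` (Python 3.8 or later) returns every value tied for most frequent.

```python
from collections import Counter
from statistics import multimode

# Hypothetical nominal-level data: blood types of six patients.
blood_types = ["A", "B", "B", "O", "O", "AB"]

frequencies = Counter(blood_types)   # frequency distribution
modes = multimode(blood_types)       # all most-frequent values

print(frequencies)
print(modes)  # ['B', 'O']
```

Two values tie for most frequent here, so this data set is bimodal.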
Negative correlation
Correlation in which high scores for one variable are paired with low scores for the other variable.
Outlier
Data point isolated from other data points; extreme score in a data set.
Parameter
Numerical characteristic of a population (e.g., population mean, population standard deviation).
Positive correlation
Correlation in which high scores for one variable are paired with high scores for the other variable, or low scores for one variable are paired with low scores for the other variable.
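The direction of a correlation shows up in the sign of a correlation coefficient. The cards do not name a specific coefficient, so the sketch below assumes Pearson's r as one common choice; the helper `pearson_r` and the data are made up for illustration.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [2, 4, 6, 8, 10]    # rises with hours studied
errors_made = [10, 8, 6, 4, 2]   # falls as hours studied rise

r_positive = pearson_r(hours_studied, exam_score)
r_negative = pearson_r(hours_studied, errors_made)

print(r_positive)  # 1.0
print(r_negative)  # -1.0
```

These data are perfectly linear, so the coefficients sit at the extremes; real data usually fall somewhere between -1 and 1.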
Probability
1. Likelihood that an event will occur, given all possible outcomes. Represented by 𝑝 and expressed as a decimal (e.g., 0.5 represents a 50% likelihood).
2. A system of rules for analyzing a set of outcomes and a means of predicting.
3. Helps evaluate the accuracy of a statistic and *test a hypothesis*, but does *not* represent the amount of validity associated with a research hypothesis.
Range
1. A measure of variability that is the difference between the lowest and highest values in a distribution.
2. The *simplest measure of dispersion* - the lowest score is subtracted from the highest score.
3. Considered an *unstable measure* because it is based on only two values in the distribution. It is extremely sensitive to outliers and does not take into account variations in scores between extremes.
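A minimal sketch of the calculation, with made-up scores, showing how a single outlier changes the result:

```python
scores = [12, 15, 15, 16, 18]
with_outlier = scores + [60]   # add one extreme score

# Lowest score subtracted from the highest score.
score_range = max(scores) - min(scores)
range_with_outlier = max(with_outlier) - min(with_outlier)

print(score_range)         # 6
print(range_with_outlier)  # 48
```

Only the two end values enter the calculation, which is why one extreme score changes the range from 6 to 48.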
Robust
Referring to results from statistical analyses that are close to being valid, even though the researcher does not rigidly adhere to assumptions associated with parametric procedures.
Skewed distribution
1. A distribution of scores with a few outlying observations in either direction.
2. Has an off-centered peak and a longer tail in one direction.
3. The mean, median, and mode have *constant* positions: The *mode* is closest to the peak of the curve; the *mean* is closest to the tail; the *median* falls somewhere between the mean and the mode.
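The ordering of the three measures can be checked on a small, made-up data set with a long tail toward the high scores (a positive skew). This is only a sketch using Python's standard `statistics` module.

```python
from statistics import mean, median, mode

# Hypothetical scores: most are low, a few are very high.
scores = [1, 2, 2, 2, 3, 3, 4, 5, 9, 20]

skew_mode = mode(scores)      # most frequent score, near the peak
skew_median = median(scores)  # halfway between the two middle scores
skew_mean = mean(scores)      # pulled toward the long tail

print(skew_mode)    # 2
print(skew_median)  # 3.0
print(skew_mean)    # 5.1
```

With the tail on the high side, the order is mode, then median, then mean; a tail on the low side reverses it.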
Standard deviation (SD)
1. The most frequently used measure of variability; the average distance that scores vary from the mean.
2. The *square root of the variance* - summarizes data in the same unit of measurement as the original data.
3. The most stable measure of variability; it takes into account each score in the distribution, and is sensitive to outliers.
Symmetrical distribution
A distribution of scores in which the mean, median, and mode are all the *same*.
t-test
A popular *parametric* procedure for assessing whether *two group means* are significantly different from one another.
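As a sketch of what the procedure computes, the example below calculates a t statistic by hand. The card does not say which version of the test is meant, so this assumes two independent groups with equal variances (the pooled-variance form); the helper `independent_t` and the scores are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(a, b):
    """Pooled-variance t statistic for two independent groups."""
    na, nb = len(a), len(b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    standard_error = sqrt(pooled * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error

# Hypothetical scores for two groups.
treatment = [4, 5, 6]
control = [1, 2, 3]

t_stat = independent_t(treatment, control)
print(round(t_stat, 3))  # 3.674
```

The statistic is the difference between the two group means divided by its standard error; deciding significance still requires comparing it against the t distribution.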
Variance
1. Measure of variability, which is the average squared deviation from the mean (i.e., the sum of all squared deviations divided by the number of scores).
2. Not widely reported on its own because it is expressed in squared units rather than in the original unit of measurement.
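The link between variance and standard deviation can be sketched with made-up scores. Because the definition above divides by the number of scores, the example uses `pvariance` and `pstdev` from Python's standard `statistics` module; `variance` and `stdev` in the same module divide by one less than the number of scores, as is usual for samples.

```python
from math import sqrt
from statistics import mean, pstdev, pvariance

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # mean is 5

# Average squared deviation from the mean, computed by hand.
squared_deviations = [(x - mean(scores)) ** 2 for x in scores]
variance_by_hand = sum(squared_deviations) / len(scores)

# Standard deviation is the square root of the variance.
sd_by_hand = sqrt(variance_by_hand)

print(variance_by_hand)   # 4.0
print(sd_by_hand)         # 2.0
print(pvariance(scores))  # same value from the library
print(pstdev(scores))     # same value from the library
```

The variance (4.0) is in squared units, while the standard deviation (2.0) is back in the units of the original scores.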
Parametric vs. nonparametric procedures
1. Parametric procedures: Require certain assumptions to be met for statistical findings to be valid - the dependent variable is measured on an interval/ratio scale that is *normally distributed* in the population, and groups are mutually exclusive.
2. Nonparametric procedures: Make no assumptions about the shape of the distribution; usually used when data represent an ordinal or nominal scale; easier to calculate but less powerful than parametric procedures.
The research design and type of data collected determine selection of appropriate _
When data represent either an interval or ratio scale, the preferred measure of central tendency is the _
Mean. (The mean is also sometimes appropriate for ordinal data.)
When data represent a skewed distribution, the appropriate measure of central tendency is the _
Median. (It gives a balanced picture because it is not pulled toward extreme scores or outliers.)
When nominal data are used, the appropriate measure of central tendency is the _
Mode.