Chapter 10 Analyzing Data Flashcards by Paul Dearing

Analysis of variance (ANOVA)

A parametric procedure used to test whether there is a difference among three or more group means.

Chi-square

A nonparametric procedure used to assess whether a relationship exists between two nominal level variables; symbolized as χ².
The most commonly reported nonparametric statistic (i.e., one that does not assume the data are normally distributed) for determining whether groups are different.
Compares the frequency of an observed occurrence (actual number in each category) with the frequency of an expected occurrence (based on theory or past experience).
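The observed-versus-expected comparison can be sketched in a few lines. This is a minimal illustration, assuming made-up counts for two categories; the function name `chi_square` and the data are hypothetical, and the sketch computes only the statistic, not its significance level.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected across categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: 50 respondents, with an even split expected by theory.
observed = [30, 20]
expected = [25, 25]

statistic = chi_square(observed, expected)
print(statistic)  # 2.0
```

The larger the gap between observed and expected frequencies, the larger the statistic becomes.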

Correlation

A measure that describes the strength and direction of the relationship between two variables.

Descriptive statistics

Statistics that describe, organize, and summarize data.
Based on frequency; include measures of central tendency and measures of dispersion.

Homogeneity of variance

Situation in which the variances of the dependent variable do not differ significantly between or among groups.

Inferential statistics

Statistics that generalize findings from a sample to a population. Based on parameters.
Two types: Parameter estimation and hypothesis testing.

Level of confidence

Probability level at which the research hypothesis is accepted with confidence; more commonly called the level of significance (alpha). A 0.05 level of confidence is the standard among researchers (this means that researchers are willing to accept a 5-in-100 chance that a statistically significant result occurred by chance).

Mean

A measure of central tendency calculated by summing a set of scores and dividing the sum by the total number of scores; also called the average. Represented by x̄ or M.
The calculation takes into account each score in the distribution. Easily used in more advanced statistical analyses.
A precise, stable, and reliable measure, but it is sensitive to outliers (the mean will be “pulled” in the direction of extreme values).
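A short sketch of how a single extreme value pulls the mean, using Python's standard `statistics` module. The scores are invented for illustration only.

```python
from statistics import mean

# Hypothetical scores; the second list swaps the top score for an outlier.
scores = [2, 4, 6, 8, 10]
scores_with_outlier = [2, 4, 6, 8, 100]

mean_plain = mean(scores)                 # sum 30 / 5 scores
mean_outlier = mean(scores_with_outlier)  # sum 120 / 5 scores

print(mean_plain)    # 6
print(mean_outlier)  # 24
```

Only one score changed, yet the mean moved from 6 to 24, toward the extreme value.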

Measures of central tendency

Descriptive statistics that describe the location or approximate center of a distribution of data.
Three types: Mean, median, and mode.
The specific measure is chosen based on the level of measurement, the shape or form of the distribution of data, and the research objective.

Measures of dispersion

Descriptive statistics that depict the spread or variability among a set of numerical data.
Three types: Range, variance, and standard deviation. (Also used, but less frequently: Percentile and interquartile range.)

Median

A measure of central tendency that represents the middle score or midpoint in a distribution. Represented by Mdn; sometimes known as the 50th percentile.
An ordinal statistic based on ranks: Calculated by first arranging the scores in rank order. If there is an odd number of scores, the median is the middle score. If there is an even number of scores, the median is the point halfway between the two middle scores (and thus would not be a score that appears in the distribution).
Does not take into account each score in the distribution and is not sensitive to extreme scores.
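The odd-count and even-count rules can be checked with the standard `statistics.median` function, which sorts the scores before locating the midpoint. The data below are hypothetical.

```python
from statistics import median

odd_scores = [7, 3, 9, 1, 5]       # rank order: 1, 3, 5, 7, 9
even_scores = [7, 3, 9, 1]         # rank order: 1, 3, 7, 9
extreme_scores = [7, 3, 900, 1, 5]  # same as odd_scores, but 9 became 900

median_odd = median(odd_scores)          # the middle score
median_even = median(even_scores)        # halfway between 3 and 7
median_extreme = median(extreme_scores)  # unchanged by the extreme score

print(median_odd)      # 5
print(median_even)     # 5.0
print(median_extreme)  # 5
```

In the even-count case the median (5.0) is not one of the scores in the distribution, and replacing 9 with 900 leaves the median where it was.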

Mode

The score or value that occurs most frequently in a distribution; a measure of central tendency used most often with nominal-level data.
There may be more than one mode for any distribution of scores - data with a single mode are called unimodal and data with two modes are called bimodal.
Can be applied to any set of data at the nominal, ordinal, or interval/ratio level of measurement.
Easily identified when a frequency distribution is used.
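As a rough illustration, a frequency distribution built with `collections.Counter` makes the mode easy to read off, and `statistics.multimode` (Python 3.8+) returns every mode when there is more than one. The category labels and scores are made up.

```python
from collections import Counter
from statistics import multimode

# Hypothetical nominal-level data.
blood_types = ["A", "O", "B", "O", "A", "AB", "O"]
frequencies = Counter(blood_types)  # frequency distribution
unimodal = multimode(blood_types)   # a single mode

# Hypothetical interval-level scores with two equally frequent values.
scores = [1, 2, 2, 3, 4, 4, 5]
bimodal = multimode(scores)

print(frequencies["O"])  # 3
print(unimodal)          # ['O']
print(bimodal)           # [2, 4]
```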

Negative correlation

Correlation in which high scores for one variable are paired with low scores for the other variable.

Outlier

Data point isolated from other data points; extreme score in a data set.

Parameter

Numerical characteristic of a population (e.g., population mean, population standard deviation).

Positive correlation

Correlation in which high scores for one variable are paired with high scores for the other variable, or low scores for one variable are paired with low scores for the other variable.

Probability

Likelihood that an event will occur, given all possible outcomes. Represented by 𝑝 and expressed as a decimal (e.g., 0.5 represents a 50% likelihood).
A system of rules for analyzing a set of outcomes and a means of predicting.
Helps evaluate the accuracy of a statistic and test a hypothesis, but does not represent the amount of validity associated with a research hypothesis.

Range

A measure of variability that is the difference between the lowest and highest values in a distribution.
The simplest measure of dispersion - the lowest score is subtracted from the highest score.
Considered an unstable measure because it is based on only two values in the distribution. It is extremely sensitive to outliers and does not take into account variations in scores between extremes.
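A minimal sketch of the calculation, with a hypothetical helper named `value_range` and invented scores, showing why the range is considered unstable.

```python
def value_range(scores):
    """Highest score minus lowest score."""
    return max(scores) - min(scores)


scores = [12, 15, 15, 16, 18]
scores_with_outlier = [12, 15, 15, 16, 60]  # only the top score differs

range_plain = value_range(scores)
range_outlier = value_range(scores_with_outlier)

print(range_plain)    # 6
print(range_outlier)  # 48
```

The three middle scores are identical in both lists, but the range ignores them and jumps from 6 to 48 because of one extreme value.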

Robust

Referring to results from statistical analyses that are close to being valid, even though the researcher does not rigidly adhere to assumptions associated with parametric procedures.

Skewed distribution

A distribution of scores with a few outlying observations in one direction (either high or low).
Has an off-centered peak and a longer tail in one direction.
The mean, median, and mode have predictable relative positions: The mode is closest to the peak of the curve; the mean is closest to the tail; the median falls somewhere between the mean and the mode.

Standard deviation (SD)

The most frequently used measure of variability; the average distance that scores vary from the mean.
The square root of the variance - summarizes data in the same unit of measurement as the original data.
The most stable measure of variability; it takes into account each score in the distribution, and is sensitive to outliers.

Symmetrical distribution

A distribution of scores in which the mean, median, and mode are all the same.

𝑡-test

A popular parametric procedure for assessing whether two group means are significantly different from one another.
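One common form of the test, the independent-samples t with pooled variance, can be sketched as follows. This is an illustration under stated assumptions (two independent groups with roughly equal variances); the function name `independent_t` and the group scores are hypothetical, and the sketch stops at the t value and degrees of freedom without looking up significance.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Return (t, degrees of freedom) using a pooled variance estimate."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # statistics.variance is the sample variance (divides by n - 1).
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical scores for two groups.
treatment = [5, 6, 7, 8, 9]
control = [1, 2, 3, 4, 5]

t_value, df = independent_t(treatment, control)
print(round(t_value, 2), df)  # 4.0 8
```

The t value expresses the difference between the two group means in units of its standard error; it would then be compared against a critical value for the given degrees of freedom.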

Variance

Measure of variability, which is the average squared deviation from the mean (i.e., the sum of all squared deviations divided by the number of scores).
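Rarely reported as a descriptive statistic because it is expressed in squared units rather than the original unit of measurement; the standard deviation is usually reported instead.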
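The definition above translates directly into code. The sketch below follows the card's formula (dividing by the number of scores); note that many software packages divide by n - 1 instead when working with a sample. The function name and scores are hypothetical.

```python
from math import sqrt


def population_variance(scores):
    """Average squared deviation from the mean (divides by n)."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / n


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # mean is 5

variance_value = population_variance(scores)  # squared deviations sum to 32
standard_deviation = sqrt(variance_value)     # back in the original units

print(variance_value)      # 4.0
print(standard_deviation)  # 2.0
```

Taking the square root returns the result to the original unit of measurement, which is the standard deviation.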

Parametric vs. nonparametric procedures

Parametric procedures: Require certain assumptions to be met for statistical findings to be valid - the dependent variable is measured on an interval/ratio scale that is normally distributed in the population, and groups are mutually exclusive.
Nonparametric procedures: Make no assumptions about the shape of the distribution; usually used when data represent an ordinal or nominal scale; easier to calculate but less powerful than parametric procedures.

The research design and type of data collected determine selection of appropriate _

Statistical procedures.

When data represent either an interval or ratio scale, the preferred measure of central tendency is the _

Mean. (The mean is also sometimes appropriate for ordinal data.)

When data represent a skewed distribution, the appropriate measure of central tendency is the _

Median. (Provides a balanced picture when the distribution contains extreme scores or outliers.)

When nominal data are used, the appropriate measure of central tendency is the _

Mode.

Regardless of the purpose of the study, the researcher uses the _ as a preliminary indicator of central tendency.

Mode.

The _ is sometimes used as a point in the distribution at which scores can be divided into two categories containing the same number of respondents.

Median.

Deviated score

The difference between a raw score and the mean of that distribution (i.e., x - x̄).
Forms the basis for the variance and the standard deviation.
The sum of the deviated scores in a distribution always equals 0.
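A quick check of the zero-sum property, computing each deviated score as the raw score minus the mean. The scores are made up for illustration.

```python
scores = [3, 5, 7, 9]
mean = sum(scores) / len(scores)  # 6.0

# Each deviated score is the raw score minus the mean.
deviations = [x - mean for x in scores]
deviation_sum = sum(deviations)

print(deviations)     # [-3.0, -1.0, 1.0, 3.0]
print(deviation_sum)  # 0.0
```

Because the positive and negative deviations always cancel, the variance squares each deviation before averaging.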

The standard deviation is an appropriate measure of dispersion for distributions that are _

Symmetrical, not skewed.

Parameter estimation

Inferential statistical estimate of a characteristic of a population that is evaluated by a single number (point estimate) or an interval (interval estimate).

Confidence interval

Confidence associated with an interval estimate.
A range of values that has some specified probability (e.g., 0.95 or 0.99) of including a particular population parameter.
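One way to sketch an interval estimate for a population mean is the normal approximation below. This is an assumption-laden illustration: the helper name `mean_confidence_interval` and the sample values are hypothetical, and with a sample this small a t-based interval would normally be preferred.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation interval around the sample mean."""
    # z is about 1.96 for 0.95 and about 2.58 for 0.99.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    center = mean(sample)
    standard_error = stdev(sample) / sqrt(len(sample))
    return center - z * standard_error, center + z * standard_error


# Hypothetical sample of eight measurements.
sample = [98.2, 98.6, 98.4, 99.0, 98.8, 98.5, 98.7, 98.3]

low_95, high_95 = mean_confidence_interval(sample, 0.95)
low_99, high_99 = mean_confidence_interval(sample, 0.99)
print(round(low_95, 2), round(high_95, 2))
print(round(low_99, 2), round(high_99, 2))
```

Raising the specified probability from 0.95 to 0.99 widens the interval: more confidence is bought with less precision.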

Hypothesis testing

Inferential statistical procedure that allows the researcher to formulate a hypothesis concerning a parameter, sample the population of interest, and make objective decisions about the sample results of the study.
Two types: Parametric and nonparametric procedures.

Pearson's 𝑟

The most widely used correlation coefficient - describes the relationship between two variables. Expressed as a decimal between +1.0 and -1.0.
A parametric procedure used when both variables are interval or ratio-level measures.
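The coefficient can be computed by hand from deviations around each variable's mean. The sketch below is illustrative only; the function name `pearson_r` and the paired scores are hypothetical, chosen so that one pairing is perfectly positive and the other perfectly negative.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation from deviations around each mean."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of the paired deviations.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cross / (spread_x * spread_y)


hours = [1, 2, 3, 4, 5]
rising_scores = [2, 4, 6, 8, 10]   # high paired with high
falling_scores = [10, 8, 6, 4, 2]  # high paired with low

r_positive = pearson_r(hours, rising_scores)
r_negative = pearson_r(hours, falling_scores)
print(round(r_positive, 2))  # 1.0
print(round(r_negative, 2))  # -1.0
```

Real data rarely reach either extreme; values near 0 indicate little or no linear relationship.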

The _ statistic is the appropriate test when variables are measured on a nominal scale and the researcher counts the number of items in each category.

Chi-square.

A statistical procedure that is appropriate even when the assumptions are violated is said to be _

Robust.

Parameter vs. statistic

A parameter is a characteristic of a population whereas a statistic is a characteristic of a sample.

A post-hoc test is _

A follow-up test to a significant ANOVA, used when there are three or more groups to identify which group means differ.

(41 cards)