AP Statistics Flashcards

Question

Association

Answer 1

In statistics, association tells you whether two variables are related. The direction of an association is described by a sign, either positive (+) or negative (-).

Answer 2

Simpson's paradox, which goes by several names, is a phenomenon in probability and statistics, in which a trend appears in several different groups of data but disappears or reverses when these groups are combined.
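The reversal can be checked with simple arithmetic. The sketch below uses illustrative counts (not taken from these flashcards) for two hypothetical options, A and B, measured in two groups; the names `groups`, `rate`, and `combined` are assumptions made for the example.

```python
# Illustrative counts as (successes, trials) for two options in two groups.
groups = {
    "group 1": {"A": (81, 87), "B": (234, 270)},
    "group 2": {"A": (192, 263), "B": (55, 80)},
}


def rate(successes, trials):
    """Success rate as a fraction between 0 and 1."""
    return successes / trials


# Within each group, option A has the higher success rate.
for name, g in groups.items():
    print(name, "A:", round(rate(*g["A"]), 3), "B:", round(rate(*g["B"]), 3))

# Combine the groups by adding successes and trials for each option.
combined = {
    option: (
        sum(g[option][0] for g in groups.values()),
        sum(g[option][1] for g in groups.values()),
    )
    for option in ("A", "B")
}

# After combining, option B has the higher success rate: the trend reverses.
print("combined A:", round(rate(*combined["A"]), 3),
      "B:", round(rate(*combined["B"]), 3))
```

The reversal happens here because the two options were tried unequally often in the two groups, so the combined rates are weighted differently.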

Answer 3

A Dot Plot, also called a dot chart or strip plot, is a type of simple histogram-like chart used in statistics for relatively small data sets where values fall into a number of discrete bins (categories).

Answer 4

The center is the median and/or mean of the data. The spread is the range of the data. The shape describes the overall form of the distribution: whether it is symmetric, how many peaks it has, whether it is skewed to the left or right, and whether it is uniform.

Answer 5

The mode is the number that is repeated more often than any other.
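As a small sketch, Python's standard `statistics` module can find the mode. The data values below are made up for illustration.

```python
from statistics import mode, multimode

data = [2, 3, 3, 5, 7, 3, 5]

# The value that occurs most often (3 appears three times).
most_common = mode(data)

# When several values tie for most frequent, multimode returns all of them.
tied = multimode([1, 1, 2, 2, 3])

print(most_common, tied)
```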

Answer 6

The center of a distribution is the middle of a distribution. For example, the center of 1 2 3 4 5 is the number 3. ... Look at a graph, or a list of the numbers, and see if the center is obvious.

Answer 7

Measures of spread describe how similar or varied the set of observed values is for a particular variable (data item). Measures of spread include the range, quartiles and the interquartile range, variance, and standard deviation.

Answer 8

In statistics, the range of a set of data is the difference between the largest and smallest values.
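A minimal sketch of that subtraction, using made-up values:

```python
data = [4, 9, 1, 7, 12]

# Range = largest value minus smallest value.
data_range = max(data) - min(data)

print(data_range)
```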

Answer 9

In statistics, an outlier is a data point that differs significantly from other observations. An outlier may be due to variability in the measurement or it may indicate experimental error; the latter are sometimes excluded from the data set. An outlier can cause serious problems in statistical analyses.

Answer 10

A symmetric distribution is a type of distribution where the left side of the distribution mirrors the right side. By definition, a symmetric distribution is never a skewed distribution. ... The normal distribution is symmetric. It is also a unimodal distribution (it has one peak).

Answer 11

In a distribution that is skewed right (also known as positively skewed), the mean is typically greater than the median. The tail of the distribution on the right-hand (positive) side is also longer than the tail on the left-hand side.
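The relationship between mean and median can be seen in a small made-up data set with one large value stretching the right tail. This is only an illustration; the variable names are assumptions.

```python
from statistics import mean, median

# Most values are small; the single large value creates a long right tail.
data = [1, 2, 2, 3, 3, 3, 4, 20]

data_mean = mean(data)      # pulled toward the long tail
data_median = median(data)  # resistant to the extreme value

print(data_mean, data_median)
```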

Answer 12

A distribution that is skewed left has exactly the opposite characteristics of one that is skewed right: the mean is typically less than the median; the tail of the distribution is longer on the left-hand side than on the right-hand side; and the median is closer to the third quartile than to the first quartile.

Answer 13

A unimodal distribution is a distribution with one clear peak or most frequent value. The values increase at first, rising to a single peak, and then decrease. ... The normal distribution is an example of a unimodal distribution; the normal curve has one local maximum (peak).

Answer 14

A multimodal distribution is a probability distribution with more than one peak, or “mode.” A distribution with one peak is called unimodal. A distribution with two peaks is called bimodal. A distribution with two peaks or more is multimodal.

Answer 15

A stem and leaf plot is a way to plot data where each value is split into a stem (the leading digit or digits) and a leaf (the last digit). ... A stem with a very long row of leaves has a large amount of data.
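A rough sketch of building such a plot for non-negative whole numbers, assuming the tens part is the stem and the ones digit is the leaf. The function name `stem_and_leaf` and the data are made up for illustration.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group non-negative whole numbers by stem (tens) and leaf (ones digit)."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows[stem].append(leaf)
    return dict(rows)


data = [12, 15, 21, 23, 23, 27, 28, 34]
plot = stem_and_leaf(data)

# Print one row per stem, with its leaves in increasing order.
for stem in sorted(plot):
    print(f"{stem} | {' '.join(str(leaf) for leaf in plot[stem])}")
```

In this output the row for stem 2 is the longest, so the 20s hold the most data.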

Answer 16

Split stems is a term used to describe stem-and-leaf plots that have more than one row on the stem for the same interval. For example, one stem of 1 holds leaves 0-4, and a second stem of 1 holds leaves 5-9.

Answer 17

Back-to-back stemplots are a graphic option for comparing data from two populations. The center of a back-to-back stemplot consists of a column of stems, with a vertical line on each side. Leaves representing one data set extend from the right, and leaves representing the other data set extend from the left.

Answer 18

A plot is a graphical technique for representing a data set, usually as a graph showing the relationship between two or more variables.

Answer 19

A histogram is a bar graph-like representation of data that buckets a range of outcomes into columns along the x-axis. The y-axis represents the number count or percentage of occurrences in the data for each column and can be used to visualize data distributions.
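The bucketing step can be sketched as follows, assuming equal-width bins that start at multiples of the bin width. The helper `histogram_counts` and the data are hypothetical.

```python
from collections import Counter


def histogram_counts(values, bin_width):
    """Count how many values fall in each bin [left, left + bin_width)."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


data = [3, 7, 12, 15, 18, 21, 22, 29]
counts = histogram_counts(data, 10)

# Text version of the histogram: one row per bin, one '#' per observation.
for left, n in counts.items():
    print(f"[{left}, {left + 10}): {'#' * n}")
```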

Answer 20

A mean score is an average score, often denoted by $\bar{x}$ ("x-bar"). It is the sum of the individual scores divided by the number of individuals.

Answer 21

The median is a simple measure of central tendency. To find the median, we arrange the observations in order from smallest to largest value. If there is an odd number of observations, the median is the middle value. If there is an even number of observations, the median is the average of the two middle values.
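The odd and even cases described above can be sketched directly. The function below is an illustrative implementation, not a required method.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([5, 1, 3]))     # odd number of observations
print(median([4, 1, 3, 2]))  # even number of observations
```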

Answer 22

The interquartile range (IQR) is a measure of variability, based on dividing a data set into quartiles. Quartiles divide a rank-ordered data set into four equal parts. The values that separate the parts are called the first, second, and third quartiles, and they are denoted by Q1, Q2, and Q3, respectively. For example, consider the following numbers: 1, 3, 4, 5, 5, 6, 7, 11. Q1 is the middle value in the first half of the data set. Since there is an even number of data points in the first half of the data set, the middle value is the average of the two middle values; that is, Q1 = (3 + 4)/2 or Q1 = 3.5. Q3 is the middle value in the second half of the data set. Again, since the second half of the data set has an even number of observations, the middle value is the average of the two middle values; that is, Q3 = (6 + 7)/2 or Q3 = 6.5. The interquartile range is Q3 minus Q1, so IQR = 6.5 - 3.5 = 3.
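The worked example can be reproduced with a short sketch that takes Q1 and Q3 as the medians of the lower and upper halves. For an odd number of observations, this sketch leaves the overall median out of both halves, which is one common convention; other conventions (and some software defaults) can give slightly different quartiles. The helper names are assumptions.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[: n // 2]        # first half
    upper = ordered[(n + 1) // 2:]   # second half (skips the median if n is odd)
    return median(lower), median(ordered), median(upper)


data = [1, 3, 4, 5, 5, 6, 7, 11]
q1, q2, q3 = quartiles(data)
iqr = q3 - q1

print(q1, q2, q3, iqr)
```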

Answer 23

The five-number summary is a set of descriptive statistics that provides information about a dataset: the minimum, the first quartile (Q1), the median, the third quartile (Q3), and the maximum.

Answer 24

Information that gives a quick and simple description of the data. It can include the mean, median, mode, minimum value, maximum value, range, standard deviation, etc.

Answer 25

In descriptive statistics, a box plot or boxplot is a method for graphically depicting groups of numerical data through their quartiles. Box plots may also have lines extending from the boxes (whiskers) indicating variability outside the upper and lower quartiles, hence the terms box-and-whisker plot and box-and- ...

Answer 26

A quantity expressing by how much the members of a group differ from the mean value for the group.

Answer 27

The variance is a numerical value that indicates how widely individuals in a group vary. If individual observations vary greatly from the group mean, the variance is large, and vice versa. It is important to distinguish between the variance of a population and the variance of a sample: they have different notation, and they are computed differently. The variance of a population is denoted by $\sigma^2$, and the variance of a sample by $s^2$. The variance of a population is defined by the formula $\sigma^2 = \sum (X_i - \mu)^2 / N$, where $\sigma^2$ is the population variance, $\mu$ is the population mean, $X_i$ is the $i$th element from the population, and $N$ is the number of elements in the population. The variance of a sample is defined by a slightly different formula, $s^2 = \sum (x_i - \bar{x})^2 / (n - 1)$, where $s^2$ is the sample variance, $\bar{x}$ is the sample mean, $x_i$ is the $i$th element from the sample, and $n$ is the number of elements in the sample. Using this formula, the variance of the sample is an unbiased estimate of the variance of the population. Finally, the variance is equal to the square of the standard deviation.
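The two formulas differ only in the divisor, which a short sketch can make concrete. The data values are made up, and the results are compared against Python's standard `statistics` module as a sanity check.

```python
from math import isclose
from statistics import pvariance, stdev, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)

# Sum of squared deviations from the mean of the data.
data_mean = sum(data) / n
sum_sq = sum((x - data_mean) ** 2 for x in data)

# Population formula divides by N; sample formula divides by n - 1.
population_variance = sum_sq / n
sample_variance = sum_sq / (n - 1)

print(population_variance, sample_variance)

# The library functions implement the same two formulas.
print(isclose(population_variance, pvariance(data)))
print(isclose(sample_variance, variance(data)))

# The variance is the square of the standard deviation.
print(isclose(stdev(data) ** 2, sample_variance))
```

Treating the same eight numbers as a sample rather than a whole population gives a slightly larger variance, because the divisor is 7 instead of 8.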