Think Stats - Allen Downey Flashcards

Question

In a statistical study, what does 'representative' mean?

Answer 1

A sample is representative if every member of the population has the same chance of being in the sample.

Answer 2

The technique of increasing the representation of a subpopulation in order to avoid errors due to small sample sizes.

Answer 3

Values collected and recorded with little or no checking, calculation or interpretation.

Answer 4

Processes that include validating data, identifying errors, and translating between data types and representations.

Answer 5

The values that appear in a sample and the frequency of each.

Answer 6

A mapping from values to frequencies, or a graph that shows this mapping.

Answer 7

The number of times a value appears in a sample.

Answer 8

The most frequent value in a sample, or one of the most frequent values.
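
The histogram, frequency, and mode described in the last few answers can be sketched with the standard library. This is a minimal illustration, not the book's own code; the sample values are made up.

```python
from collections import Counter

# Hypothetical sample, chosen only for illustration.
sample = [1, 2, 2, 3, 5]

# A histogram as a mapping from each value to its frequency.
hist = Counter(sample)

# The frequency of a single value (0 for values that never appear).
freq_of_two = hist[2]

# The mode: the value with the highest frequency.
mode = hist.most_common(1)[0][0]
```

If several values tie for the highest frequency, `most_common(1)` returns just one of them, which matches the "one of the most frequent values" wording.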

Answer 9

An idealization of a bell-shaped distribution; also known as a Gaussian distribution.

Answer 10

A distribution in which all values have the same frequency.

Answer 11

The part of a distribution at the high and low extremes.

Answer 12

A characteristic of a sample or population; intuitively, it is an average or typical value.

Answer 13

A value far from the central tendency.

Answer 14

A measure of how spread out the values in a distribution are.

Answer 15

A statistic that quantifies some aspect of a distribution, like central tendency or spread.

Answer 16

A summary statistic often used to quantify spread.

Answer 17

The square root of variance, also used as a measure of spread.
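
A short sketch of variance and standard deviation as measures of spread. It assumes the variance is the mean squared deviation from the mean (dividing by n, not n - 1); the function names and data are illustrative.

```python
import math

def variance(values):
    """Mean squared deviation from the mean (divides by n)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)

def std(values):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(variance(values))

# Hypothetical sample with mean 5.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
sample_variance = variance(sample)
sample_std = std(sample)
```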

Answer 18

A summary statistic intended to quantify the size of an effect like a difference between groups.
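
One common effect-size statistic is Cohen's d: the difference in group means divided by a pooled standard deviation. The card does not name a specific statistic, so treat the choice of Cohen's d, the function name, and the data below as assumptions.

```python
import math

def cohen_effect_size(group1, group2):
    """Difference in means divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    mean1 = sum(group1) / n1
    mean2 = sum(group2) / n2
    # Variances computed by dividing by n (an assumption of this sketch).
    var1 = sum((x - mean1) ** 2 for x in group1) / n1
    var2 = sum((x - mean2) ** 2 for x in group2) / n2
    pooled_var = (n1 * var1 + n2 * var2) / (n1 + n2)
    return (mean1 - mean2) / math.sqrt(pooled_var)

# Hypothetical groups whose means differ by 1.
d = cohen_effect_size([2, 4, 6], [1, 3, 5])
```

Because d is expressed in units of standard deviation, it can be compared across studies that measure different quantities.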

Answer 19

A result, like a difference between groups, that is relevant in practice.

Answer 20

A representation of a distribution as a function that maps from values to probabilities.

Answer 21

A frequency expressed as a fraction of the sample size.

Answer 22

The process of dividing a frequency by a sample size to get a probability.
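
Normalization can be sketched by building a histogram and dividing each frequency by the sample size. The helper name `make_pmf` and the data are hypothetical.

```python
from collections import Counter

def make_pmf(sample):
    """Map each value to its probability (frequency / sample size)."""
    hist = Counter(sample)
    n = len(sample)
    return {value: freq / n for value, freq in hist.items()}

# Hypothetical sample: the value 2 appears in 2 of 5 positions.
pmf = make_pmf([1, 2, 2, 3, 5])
total = sum(pmf.values())
```

After normalization the probabilities add up to 1 (up to floating-point rounding).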

Answer 23

In a pandas DataFrame, the index is a special column that contains the row labels.

Answer 24

The percentage of values in a distribution that are less than or equal to a given value.

Answer 25

The value associated with a given percentile rank.
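
Percentile rank and percentile are inverse lookups of each other, which a small sketch makes concrete. The function names and scores are illustrative; other interpolation conventions for percentiles exist.

```python
def percentile_rank(values, x):
    """Percentage of values less than or equal to x."""
    count = sum(1 for v in values if v <= x)
    return 100.0 * count / len(values)

def percentile(values, rank):
    """Smallest value whose percentile rank is at least `rank`."""
    for v in sorted(values):
        if percentile_rank(values, v) >= rank:
            return v
    return None  # only reached if rank is above 100

# Hypothetical exam scores.
scores = [55, 66, 77, 88, 99]
rank_of_88 = percentile_rank(scores, 88)
fiftieth = percentile(scores, 50)
```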

Answer 26

A function that maps from values to their cumulative probabilities. CDF(x) is the fraction of the sample less than or equal to x.
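
A minimal sketch of evaluating CDF(x) directly from a sample by counting values less than or equal to x. The function name and data are assumptions for illustration.

```python
def eval_cdf(sample, x):
    """Fraction of the sample less than or equal to x."""
    count = sum(1 for v in sample if v <= x)
    return count / len(sample)

# Hypothetical sample.
sample = [1, 2, 2, 3, 5]
cdf_at_0 = eval_cdf(sample, 0)
cdf_at_2 = eval_cdf(sample, 2)
cdf_at_5 = eval_cdf(sample, 5)
```

The function is defined for any x, not only values in the sample: below the minimum it is 0, and at or above the maximum it is 1.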

Answer 27

A function that maps from a cumulative probability, p, to the corresponding value.

Answer 28

The 50th percentile, often used as a measure of central tendency.

Answer 29

The difference between the 75th and 25th percentiles, used as a measure of spread.
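
The inverse CDF, the median, and the interquartile range fit together: the median is the inverse CDF at 0.5, and the IQR is the gap between the inverse CDF at 0.75 and at 0.25. The sketch below assumes a simple "smallest value whose cumulative probability reaches p" convention with no interpolation; libraries often interpolate and may give slightly different answers.

```python
import math

def inverse_cdf(sample, p):
    """Smallest sample value whose cumulative probability is at least p."""
    ordered = sorted(sample)
    # Index of the first value with CDF >= p; clamp so p == 0 maps to the minimum.
    index = max(math.ceil(p * len(ordered)) - 1, 0)
    return ordered[index]

# Hypothetical sample.
sample = [1, 2, 3, 4, 5, 6, 7, 8]
median = inverse_cdf(sample, 0.5)
iqr = inverse_cdf(sample, 0.75) - inverse_cdf(sample, 0.25)
```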

Answer 30

A sequence of values that correspond to equally spaced percentile ranks. For example, the quartiles of a distribution are the 25th, 50th and 75th percentiles.

Answer 31

A property of a sampling process. “With replacement” means that the same value can be chosen more than once; “without replacement” means that once a value is chosen, it is removed from the population.
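
The two sampling modes can be contrasted with Python's `random` module. The population and seed are arbitrary choices for this sketch.

```python
import random

# Hypothetical population of distinct values.
population = [1, 2, 3, 4, 5]
rng = random.Random(17)  # seeded only so the sketch is repeatable

# With replacement: the same element may be drawn more than once.
with_replacement = rng.choices(population, k=5)

# Without replacement: each element can be drawn at most once.
without_replacement = rng.sample(population, k=5)
```

Drawing five items without replacement from five elements always yields a permutation of the population, while the draw with replacement may contain repeats. Note that `rng.sample` does not modify `population` itself; it simply never picks the same element twice.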

Answer 32

The distribution of values in a sample.

Answer 33

A distribution whose CDF (cumulative distribution function) is an analytic function.

Answer 34

A useful simplification. Analytic distributions are often good models of more complex empirical distributions.

Answer 35

The elapsed time between two events.

Answer 36

A function that maps from a value, x, to the fraction of values that exceed x, which is 1-CDF(x).
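
A brief sketch showing that counting the values that exceed x gives the same result as 1 - CDF(x). Function names and data are illustrative.

```python
def eval_cdf(sample, x):
    """Fraction of the sample less than or equal to x."""
    return sum(1 for v in sample if v <= x) / len(sample)

def eval_ccdf(sample, x):
    """Fraction of the sample that exceeds x."""
    return sum(1 for v in sample if v > x) / len(sample)

# Hypothetical sample.
sample = [1, 2, 2, 3, 5]
ccdf_at_2 = eval_ccdf(sample, 2)
complement = 1 - eval_cdf(sample, 2)
```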

Answer 37

The normal distribution with mean 0 and standard deviation 1.

Answer 38

A plot of the values in a sample versus random values from a standard normal distribution.
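
The points for such a plot can be sketched by sorting the sample, sorting an equal number of random draws from a standard normal distribution, and pairing them up. The function name and seed are assumptions, and the actual plotting step is left out.

```python
import random

def normal_probability_points(sample, seed=0):
    """Pair sorted standard-normal draws (x) with sorted sample values (y)."""
    rng = random.Random(seed)  # seeded only for repeatability
    normal_values = sorted(rng.gauss(0, 1) for _ in sample)
    return list(zip(normal_values, sorted(sample)))

# Hypothetical sample.
points = normal_probability_points([3.1, 1.2, 2.4, 4.8, 2.9])
```

If the sample is roughly normal, the points fall close to a straight line whose intercept and slope approximate the sample's mean and standard deviation.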

Answer 39

The p-value (or probability value): the probability of obtaining test results at least as extreme as the results observed, on the assumption that the null hypothesis is correct.
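
One way to estimate this probability is a permutation test: shuffle the pooled data many times and count how often the shuffled difference in means is at least as large as the observed one. The choice of test statistic, the function name, and the parameters below are assumptions of this sketch, not something the card specifies.

```python
import random

def permutation_p_value(group1, group2, iters=1000, seed=0):
    """Estimate the p-value for the absolute difference in group means."""
    rng = random.Random(seed)  # seeded only for repeatability

    def diff_means(a, b):
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = diff_means(group1, group2)
    pooled = list(group1) + list(group2)
    n1 = len(group1)
    count = 0
    for _ in range(iters):
        # Under the null hypothesis, group labels are interchangeable.
        rng.shuffle(pooled)
        if diff_means(pooled[:n1], pooled[n1:]) >= observed:
            count += 1
    return count / iters

# Hypothetical groups.
p_same = permutation_p_value([1, 2, 3], [1, 2, 3])
p_diff = permutation_p_value([1, 2, 3, 4], [11, 12, 13, 14])
```

Identical groups have an observed difference of zero, so every shuffle is at least as extreme and the estimate is 1.0; well-separated groups give a much smaller estimate.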

(63 cards)