Chapter 3 Flashcards

(41 cards)

1
Q

3 main “measures of center”

A
  1. Mean
  2. Median
  3. Mode
2
Q

Mean

A

Obtained by dividing the sum of all values by the number of values in the data set.

3
Q

Median

A

The value that divides a data set that has been sorted in increasing order into two equal halves.

4
Q

Mode

A

The value that occurs w/ the highest frequency in a data set.

5
Q

Mean for population data

A

μ = Σx / N (sum of all x values divided by the population size N)

6
Q

Mean for sample data

A

x̄ (x-bar) = Σx / n (sum of all x values divided by the sample size n)
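
Sketch (not from the original cards): the population and sample formulas use the same arithmetic and differ only in their symbols (μ and N versus x̄ and n). The function name `mean` and the data below are made up for illustration.

```python
def mean(values):
    """Arithmetic mean: sum of all values divided by the number of values."""
    if not values:
        raise ValueError("mean of an empty data set is undefined")
    return sum(values) / len(values)


# Made-up sample data, for illustration only.
sample = [2, 4, 4, 5, 10]
x_bar = mean(sample)  # (2 + 4 + 4 + 5 + 10) / 5
```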

7
Q

2 steps to calculate the median

A
  1. Sort the data set into increasing order
  2. Find the value that divides the sorted data set in two equal parts.
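
Sketch (not from the original cards): the two steps above in Python. Averaging the two middle values when the count is even is a common convention that the card does not state, so treat it as an assumption.

```python
def median(values):
    """Median via the two steps: sort, then take the middle of the sorted data."""
    ordered = sorted(values)          # step 1: sort in increasing order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:                    # odd count: a single middle value
        return ordered[mid]
    # even count: average the two middle values (assumed convention)
    return (ordered[mid - 1] + ordered[mid]) / 2
```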
8
Q

Can there be no mode?

A

Yes

9
Q

Can modes be from qualitative data?

A

Yes

10
Q

Can there be more than one mode?

A

Yes
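
Sketch (not from the original cards): one way to find modes that covers the three "Yes" answers about the mode (no mode, qualitative data, more than one mode). The helper name `modes` is hypothetical, and "no mode when no value repeats" is an assumed convention.

```python
from collections import Counter


def modes(values):
    """Return every value tied for the highest frequency, or [] if nothing repeats."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []  # assumed convention: no value repeats, so there is no mode
    return sorted(v for v, c in counts.items() if c == top)


no_mode = modes([1, 2, 3])                   # no value repeats
two_modes = modes([1, 2, 2, 3, 3])           # 2 and 3 both occur twice
qualitative = modes(["red", "blue", "red"])  # works on categories too
```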

11
Q

What are the mean, median, and mode of a symmetric histogram or distribution curve?

A

Mean = median = mode (for a symmetric distribution with a single peak)

12
Q

Mean, median, and mode of a right-skewed histogram

A

Mean > median > mode

13
Q

Left-skewed histogram mean, median, and mode

A

Mean < median < mode

14
Q

Trimmed mean

A

After we drop K% of the values from each end of a ranked data set, the mean of the remaining values is called the K% trimmed mean.
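
Sketch (not from the original cards): a K% trimmed mean in Python. Rounding the number of dropped values down when K% of the data is not a whole number is an assumption; the card does not say how to handle that case.

```python
def trimmed_mean(values, k_percent):
    """Drop k_percent of the values from each end of the ranked data, then average."""
    ordered = sorted(values)
    n = len(ordered)
    drop = int(n * k_percent / 100)  # values dropped per end, rounded down (assumption)
    kept = ordered[drop:n - drop]
    if not kept:
        raise ValueError("trimming removed every value")
    return sum(kept) / len(kept)


# Made-up data with one extreme value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
result = trimmed_mean(data, 10)  # drops 1 and 100, averages 2..9
```

Here the plain mean is 14.5, while the 10% trimmed mean is 5.5, which shows how trimming reduces the pull of the extreme value.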

15
Q

Weighted mean

A

Used when the values of a data set are assigned different weights.
Weighted mean = Σ(x × w) / Σw
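
Sketch (not from the original cards): the weighted-mean formula in Python. The scores and weights are made up for illustration.

```python
def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(x * w for x, w in zip(values, weights)) / total_weight


# Made-up example: an 80 that counts once and a 90 that counts three times.
score = weighted_mean([80, 90], [1, 3])  # (80*1 + 90*3) / 4
```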

16
Q

Measures of dispersion tell us…

A

How much variation exists around the typical value (the measure of center)

17
Q

3 main measures of dispersion

A
  1. Range
  2. Variance
  3. Standard deviation
18
Q

Range

A

The difference between the largest value and the smallest value.

19
Q

Variance

A

A measure of how much the values in a data set differ from the mean, based on the squared deviations from the mean.

20
Q

Standard deviation

A

A measure of how far data points typically lie from the mean. It is the square root of the variance.
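
Sketch (not from the original cards): the cards give no variance formula, so this uses the standard definitions, dividing the sum of squared deviations by N for a population and by n - 1 for a sample. The function names and data are made up.

```python
import math


def variance(values, sample=True):
    """Sum of squared deviations from the mean, divided by n - 1 (sample) or N (population)."""
    n = len(values)
    if n == 0 or (sample and n < 2):
        raise ValueError("not enough values")
    m = sum(values) / n
    squared_deviations = sum((x - m) ** 2 for x in values)
    return squared_deviations / (n - 1 if sample else n)


def std_dev(values, sample=True):
    """Standard deviation: the square root of the variance."""
    return math.sqrt(variance(values, sample))


# Made-up data with mean 5 and squared deviations summing to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]
pop_var = variance(data, sample=False)  # 32 / 8
pop_sd = std_dev(data, sample=False)
```

Because the deviations are squared before they are summed, neither result can come out negative.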

21
Q

Range formula

A

Largest value - smallest value

22
Q

Disadvantages of range

A
  1. Only based on 2 values
  2. Affected by outliers
23
Q

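Can the variance and the standard deviation be negative?

A

No. Both are always zero or positive, because the variance is built from squared deviations and the standard deviation is its square root.
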
24
Q

Units for standard deviation

A

Same as the original units

25
Units for variance
The square of the original data's units
26
Coefficient of variation ( CV )
A measure of relative variability. Useful for comparing the variation of two data sets whose values differ in magnitude or units.
27
Variance and SD depend on...
The units of measurement
28
Coefficient of variation units
None. The CV is unitless and is expressed as a percentage of the mean.
29
Coefficient of variation formula
CV = (SD / mean) × 100%
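
Sketch (not from the original cards): the CV formula in Python, using the sample standard deviation from the standard library (an assumption; the card does not say which SD to use). The made-up heights show that the CV does not change when the units change.

```python
import statistics


def coefficient_of_variation(values):
    """CV as a percentage: 100 * SD / mean (sample SD assumed)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100 * statistics.stdev(values) / mean


# Same made-up heights in centimetres and in metres.
cv_cm = coefficient_of_variation([160, 170, 180])
cv_m = coefficient_of_variation([1.6, 1.7, 1.8])
```

The SD itself is 10 in centimetres and 0.1 in metres, but both CV values are about 5.9%.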
30
Mean of grouped data for population data
μ = Σ(frequency × midpoint) / N
31
Mean of grouped data for sample data
x̄ = Σ(frequency × midpoint) / n
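
Sketch (not from the original cards): the grouped-data mean in Python. Each class is given as a hypothetical (lower limit, upper limit, frequency) tuple, and the midpoint is taken as the average of the two limits.

```python
def grouped_mean(classes):
    """Mean of grouped data: sum of frequency * midpoint over the total frequency."""
    total_frequency = sum(f for _, _, f in classes)
    if total_frequency == 0:
        raise ValueError("total frequency must be positive")
    weighted_sum = sum(((lower + upper) / 2) * f for lower, upper, f in classes)
    return weighted_sum / total_frequency


# Made-up frequency table: (lower, upper, frequency)
table = [(0, 10, 2), (10, 20, 5), (20, 30, 3)]
result = grouped_mean(table)  # (5*2 + 15*5 + 25*3) / 10
```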
32
Standard deviation
A measure of how far data points typically lie from the mean (the square root of the variance).
33
Chebyshev's theorem
For any number k greater than 1, at least (1 - 1/k^2) of the data values lie within k standard deviations of the mean.
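
Sketch (not from the original cards): the Chebyshev lower bound as a small Python function. The function name is made up.

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of values within k standard deviations of the mean (k > 1)."""
    if k <= 1:
        raise ValueError("the theorem only applies for k greater than 1")
    return 1 - 1 / k ** 2


within_2_sd = chebyshev_lower_bound(2)  # at least 75% of the values
within_3_sd = chebyshev_lower_bound(3)  # at least 8/9, roughly 89%
```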
34
Chebyshev's theorem works for...
Any distribution shape
35
Empirical rule
Applies when the distribution is bell-shaped (also called normal or Gaussian):
  1. About 68% of observations lie within 1 SD of the mean
  2. About 95% of observations lie within 2 SDs of the mean
  3. About 99.7% of observations lie within 3 SDs of the mean
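
Sketch (not from the original cards): turning the empirical rule into concrete intervals. The mean of 100 and SD of 15 are made-up values, and the rule only applies if the data are roughly bell-shaped.

```python
def empirical_rule_intervals(mean, sd):
    """Return (approximate share, lower bound, upper bound) for 1, 2, and 3 SDs."""
    shares = {1: 0.68, 2: 0.95, 3: 0.997}
    return [(share, mean - k * sd, mean + k * sd) for k, share in shares.items()]


# Made-up bell-shaped distribution with mean 100 and SD 15.
intervals = empirical_rule_intervals(100, 15)
```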
36
Quartile
Three summary measures (Q1, Q2, Q3) that divide a ranked data set into four equal parts, each containing 25% of the observations. Q2 is the same as the median.
37
Interquartile range (IQR)
The difference between Q3 and Q1: IQR = Q3 - Q1. Another measure of dispersion. Small IQR = less spread-out data; large IQR = more spread-out data.
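
Sketch (not from the original cards): quartiles and the IQR in Python. Quartile conventions vary between textbooks and software; this version assumes Q1 and Q3 are the medians of the lower and upper halves of the ranked data, leaving out the middle value when the count is odd.

```python
import statistics


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention (an assumption)."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return statistics.median(lower), statistics.median(ordered), statistics.median(upper)


# Made-up data.
q1, q2, q3 = quartiles([1, 3, 5, 7, 9, 11, 13, 15])
iqr = q3 - q1
```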
38
Percentiles
99 summary measures that divide a ranked data set into 100 equal parts. Each portion contains 1% of the observations of a data set.
39
The (approximate) value of the kth percentile in a sample of size n is:
Pk = value of the (kn / 100)th term in a ranked data set. Always round the position up.
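
Sketch (not from the original cards): the percentile rule in Python, with the position rounded up as the card says. The function name and data are made up.

```python
import math


def kth_percentile(values, k):
    """Approximate kth percentile: the ceil(k * n / 100)th term of the ranked data."""
    if not 0 < k < 100:
        raise ValueError("k must be between 0 and 100")
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("empty data set")
    position = math.ceil(k * n / 100)  # 1-based position, rounded up
    return ordered[position - 1]


# Made-up data: 12 values, so the 70th percentile sits at position 8.4, rounded up to 9.
data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120]
p70 = kth_percentile(data, 70)
```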
40
Percentile rank of a given value x in a data set
Percentile rank of x = (number of values less than x / total number of values in the data set) × 100%
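
Sketch (not from the original cards): the percentile-rank formula in Python. The scores are made up.

```python
def percentile_rank(values, x):
    """Percentage of values in the data set that are strictly less than x."""
    if not values:
        raise ValueError("empty data set")
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)


# Made-up scores: three of the five values are below 80.
rank = percentile_rank([50, 60, 70, 80, 90], 80)
```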
41
Box-and-whisker plot
Shows 5 measures:
  1. Median
  2. Q1
  3. Q3
  4. Minimum
  5. Maximum
Lower inner fence = Q1 - 1.5 × IQR
Upper inner fence = Q3 + 1.5 × IQR
Outliers are plotted outside the fences.
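
Sketch (not from the original cards): computing the inner fences and flagging outliers in Python. The quartiles are passed in directly because quartile conventions vary; the data and function names are made up.

```python
def inner_fences(q1, q3):
    """Return (lower fence, upper fence) at 1.5 * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def outliers(values, q1, q3):
    """Values that fall outside the inner fences."""
    lower, upper = inner_fences(q1, q3)
    return [v for v in values if v < lower or v > upper]


# Made-up data with Q1 = 4 and Q3 = 12, so IQR = 8.
fences = inner_fences(4, 12)
flagged = outliers([1, 3, 5, 7, 9, 11, 13, 40], 4, 12)
```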