Midterm - Important Flashcards

(22 cards)

1
Q

Mode

A

The value that occurs most often. A data set with no repeated values has no mode.

2
Q

Standard Deviation

A

How far a group of numbers is spread out from the mean. Square root of the variance.
sqrt of ((sum of (x - mean)^2)/N)

3
Q

Variance

A

The average squared distance of the data points from the mean.
(sum of (x - mean)^2)/N
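
As a quick check of the formula, here is a minimal sketch that divides by N exactly as the card does (the population form). The function names and the sample data are illustrative assumptions, not part of the deck.

```python
import math

def population_variance(values):
    """(sum of (x - mean)^2) / N, dividing by N as on the card."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)

def population_std_dev(values):
    """Standard deviation is the square root of the variance."""
    return math.sqrt(population_variance(values))

# Hypothetical sample data chosen so the arithmetic is easy to follow.
data = [2, 4, 4, 4, 5, 5, 7, 9]
variance = population_variance(data)  # mean is 5, squared deviations sum to 32
std_dev = population_std_dev(data)
print(variance, std_dev)
```

For this data the squared deviations sum to 32, so the variance is 32/8 = 4 and the standard deviation is 2.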

4
Q

Standardized Score/z-score

A

How many standard deviations a value lies from the mean.
(x - mean)/(std dev)
For roughly normal data:
Values with |z| > 1.96 fall outside the central ~95% of the data
Values with |z| > 2.58 fall outside the central ~99% of the data
Values with |z| > 3.0 are treated as “definite outliers”
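
A minimal sketch of the z-score formula, assuming the population standard deviation (dividing by N) and hypothetical sample data. The 1.96 cutoff is the one listed on the card.

```python
import math

def z_scores(values):
    """Return (x - mean) / std dev for every value, using the population std dev."""
    n = len(values)
    mean = sum(values) / n
    std_dev = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return [(x - mean) / std_dev for x in values]

# Hypothetical data: mean is 5 and the population std dev is 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]
scores = z_scores(data)

# Values whose |z| exceeds the ~95% cutoff from the card.
flagged = [x for x, z in zip(data, scores) if abs(z) > 1.96]
print(scores, flagged)
```

Here the value 9 has z = 2.0, so it is the only one past the 1.96 cutoff.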

5
Q

Percentile

A

Percentiles divide the ordered data into 100 equal parts. The nth percentile is the value below which n% of observations fall.

6
Q

Quantiles

A

General term for dividing data into equal-sized groups.
4 parts: quartiles
5 parts: quintiles
10 parts: deciles
100 parts: percentiles

7
Q

Percentile Calculation - Greater Than

A

To find the value greater than p% of the values (data sorted in ascending order):
1. Multiply (p/100) * n
2. Round up
3. Add 1 and use the value at that position.
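
The steps above can be sketched as follows. This is a minimal illustration that assumes p is a whole-number percentage (0 to 100) and that the data is already sorted in ascending order. The function name and the sample data are hypothetical.

```python
import math

def value_greater_than(sorted_values, p):
    """Value that is strictly greater than p% of the values."""
    n = len(sorted_values)
    rank = math.ceil(p * n / 100) + 1  # multiply, round up, then add 1
    if rank > n:
        raise ValueError("no value is greater than that share of the data")
    return sorted_values[rank - 1]     # ranks are 1-based, list indexes are 0-based

# Hypothetical sorted data with n = 10.
data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
print(value_greater_than(data, 25))  # 0.25 * 10 = 2.5, rounds up to 3, plus 1 is rank 4
print(value_greater_than(data, 50))  # 0.50 * 10 = 5, plus 1 is rank 6
```

With n = 10, the 4th value (40) is greater than three values, which is the smallest count that exceeds 25% of the data.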

8
Q

Percentile Calculation - Greater Than or Equal To

A

To find the value greater than or equal to p% of the values (data sorted in ascending order):
1. Multiply (p/100) * n
2. Round up and use the value at that position.
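
A minimal sketch of this rule, assuming p is a whole-number percentage (0 to 100) and the data is already sorted in ascending order. The function name and the sample data are hypothetical.

```python
import math

def value_greater_or_equal(sorted_values, p):
    """Value that is greater than or equal to p% of the values."""
    n = len(sorted_values)
    rank = math.ceil(p * n / 100)   # multiply, then round up
    rank = max(rank, 1)             # guard for p = 0 (an assumption, not on the card)
    return sorted_values[rank - 1]  # ranks are 1-based, list indexes are 0-based

# Hypothetical sorted data with n = 10.
data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
print(value_greater_or_equal(data, 25))  # 0.25 * 10 = 2.5, rounds up to rank 3
print(value_greater_or_equal(data, 50))  # 0.50 * 10 = 5, rank 5
```

The 3rd value (30) is greater than or equal to three values, itself included, which covers at least 25% of the data.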

9
Q

Percentile Calculation - Interpolation

A

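To find the pth percentile (data sorted in ascending order):
1. rank = (p/100) * (n + 1)
2. If rank is an integer, use the value at that rank.
3. Otherwise, take the values at the ranks just below and just above.
4. Multiply the difference between those two values by the fractional part of the rank.
5. Add the result to the lower-rank value (equivalently, subtract (1 - fraction) times the difference from the higher-rank value).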
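
A worked sketch of the interpolation steps, assuming p is given as a percentage (0 to 100), the data is sorted in ascending order, and the rank falls between 1 and n. The function name and the sample data are hypothetical.

```python
import math

def percentile_interpolated(sorted_values, p):
    """pth percentile using rank = (p/100) * (n + 1) with linear interpolation."""
    n = len(sorted_values)
    rank = p * (n + 1) / 100
    if rank < 1 or rank > n:
        raise ValueError("rank falls outside the data; this sketch does not extrapolate")
    lower_rank = math.floor(rank)
    fraction = rank - lower_rank
    lower_value = sorted_values[lower_rank - 1]  # ranks are 1-based
    if fraction == 0:
        return lower_value                       # integer rank: use that value directly
    upper_value = sorted_values[lower_rank]      # value at the next rank up
    return lower_value + fraction * (upper_value - lower_value)

# Hypothetical sorted data with n = 10.
data = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
print(percentile_interpolated(data, 25))  # rank 2.75: 20 + 0.75 * (30 - 20)
print(percentile_interpolated(data, 50))  # rank 5.5: 50 + 0.5 * (60 - 50)
```

For p = 25 the rank is 2.75, so the result sits three quarters of the way from the 2nd value (20) to the 3rd value (30), giving 27.5.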

10
Q

Symmetric

A

The left side and the right side are roughly mirrored
mean = median
skewness = 0

11
Q

Skewed Left

A

The left side has a long tail, while the right side has a cluster of values
mean < median
skewness < 0

12
Q

Skewed Right

A

The left side has a cluster of values, while the right side has a long tail
mean > median
skewness > 0

13
Q

Skewness

A

A measure of the amount and direction of skew, or departure from symmetry.
|skewness| < 0.5 is roughly symmetric
|skewness| between 0.5 and 1 is slight skewness
|skewness| > 1 is substantial skewness
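
The card gives no formula, so the sketch below assumes the moment coefficient of skewness (third central moment divided by the 1.5 power of the second). It applies the thresholds to the absolute value, since the sign only gives the direction of the skew. All names and data are hypothetical.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2^1.5 (an assumed definition)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment (variance)
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

def describe_skew(value):
    """Apply the card's thresholds to the magnitude of the skewness."""
    magnitude = abs(value)
    if magnitude < 0.5:
        return "symmetric"
    if magnitude <= 1:
        return "slight"
    return "substantial"

symmetric_data = [1, 2, 3, 4, 5]      # mirrored around 3
right_skewed_data = [1, 1, 1, 2, 10]  # cluster on the left, long tail on the right
print(skewness(symmetric_data), describe_skew(skewness(symmetric_data)))
print(skewness(right_skewed_data), describe_skew(skewness(right_skewed_data)))
```

The mirrored data gives a skewness of 0, while the long right tail pushes the second data set above 1.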

14
Q

Kurtosis

A

A measure of tail heaviness. Larger values of kurtosis indicate a greater presence of extreme values in the distribution.

15
Q

Mesokurtic

A

Kurtosis is roughly 3. Matches a normal distribution.

16
Q

Excess kurtosis

A

Kurtosis minus 3. Mesokurtic is approximately zero.
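
A minimal sketch of the subtraction, assuming kurtosis is computed as the fourth central moment divided by the squared variance (the card does not state a formula). The function names and data are hypothetical.

```python
def kurtosis(values):
    """Fourth central moment divided by the squared variance (an assumed definition)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2

def excess_kurtosis(values):
    """Kurtosis minus 3, so a normal distribution scores about zero."""
    return kurtosis(values) - 3

# Hypothetical evenly spread data with no extreme values.
data = [1, 2, 3, 4, 5]
print(kurtosis(data), excess_kurtosis(data))
```

This evenly spread data has a kurtosis of about 1.7, so its excess kurtosis is about -1.3, below the zero mark of a normal distribution.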

17
Q

Leptokurtic

A

Kurtosis > 3. Heavy (fat) tails and a peaked center. More outliers.

18
Q

Platykurtic

A

Kurtosis < 3. Lighter (thinner) tails and a flatter peak. Fewer outliers.

19
Q

Correlation

A

A statistical measure that expresses the extent to which two variables are linearly related.
Positive - the variables change in the same direction
Zero - no linear relationship
Negative - the variables change in opposite directions
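
The card does not name a specific coefficient, so the sketch below assumes the Pearson correlation coefficient, a common measure of linear association. The function name and data are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance scaled by both spreads, ranging from -1 to 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]   # moves with xs: positive correlation
falling = [10, 8, 6, 4, 2]  # moves against xs: negative correlation
print(pearson_r(xs, rising), pearson_r(xs, falling))
```

A perfectly linear rising relationship gives r = 1 and a perfectly linear falling one gives r = -1. Values near 0 indicate no linear relationship.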

20
Q

Causation

A

One event is the result of the occurrence of the other event. Cause and effect.

21
Q

Simpson's Paradox

A

Each group of data shows one trend, but that trend reverses when the groups are combined.
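
To make the reversal concrete, here is a small sketch with made-up counts (not real data). Option A has the higher success rate inside each group, yet option B has the higher rate once the groups are pooled, because the two options are spread unevenly across the groups.

```python
# Hypothetical (successes, trials) counts for two options in two groups.
counts = {
    "group 1": {"A": (8, 10), "B": (70, 100)},
    "group 2": {"A": (30, 100), "B": (2, 10)},
}

def rate(successes, trials):
    return successes / trials

# Within every group, A's success rate beats B's (0.8 vs 0.7, and 0.3 vs 0.2).
a_wins_each_group = all(
    rate(*group["A"]) > rate(*group["B"]) for group in counts.values()
)

def combined_rate(option):
    """Pool successes and trials across all groups before dividing."""
    successes = sum(group[option][0] for group in counts.values())
    trials = sum(group[option][1] for group in counts.values())
    return successes / trials

# Pooled: A is 38/110 and B is 72/110, so the ordering flips.
a_wins_combined = combined_rate("A") > combined_rate("B")
print(a_wins_each_group, a_wins_combined)
```

Most of A's trials fall in the low-success group while most of B's fall in the high-success group, which is what drives the flip.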