1.3 and 1.4 Overview and Descriptive Stats Flashcards

(25 cards)

1
Q

Smoothed histogram

A

Density estimate

2
Q

Sample Statistics

A

Numbers describing a sample distribution

  • measures of CENTER (mean, median)
  • measures of spread (standard deviation, range, IQR)
  • other: min, max, quartiles
3
Q

sample mean

A

x bar

4
Q

population mean

A

μ (mu)
usually impossible to compute because you can't collect data from every member of the population

handled theoretically, e.g. for infinite populations

5
Q

drawback of the Mean

A

sensitive to outliers

-if the data is not symmetric, the mean isn’t very good at measuring the center

6
Q

median

A

~x
middle value of the ordered data
outliers barely affect the median = resistant
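
A quick sketch of what "resistant" means, using made-up numbers (the data and variable names are assumptions for illustration): swapping one value for an outlier moves the mean a lot but leaves the median where it was.

```python
from statistics import mean, median

# Made-up data; the second list replaces the largest value with an outlier.
data = [2, 3, 4, 5, 6]
with_outlier = [2, 3, 4, 5, 60]

# The mean shifts toward the outlier, the median stays at the middle value.
print("no outlier:   mean =", mean(data), " median =", median(data))
print("with outlier: mean =", mean(with_outlier), " median =", median(with_outlier))
```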

7
Q

median when n is odd vs even

A

odd, after ordering, median = (n+1)/2 th data value

even, after ordering, median = avg of n/2 th and (n/2 + 1)th data values
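
The odd/even rule above can be sketched in Python. The function name `median_by_position` is hypothetical; note the rule counts positions from 1 while Python lists count from 0.

```python
def median_by_position(values):
    """Median via the (n+1)/2 rule for odd n and the n/2, n/2 + 1 rule for even n."""
    ordered = sorted(values)          # the rule only works after ordering
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one data value")
    if n % 2 == 1:
        # odd: the (n+1)/2-th value (subtract 1 for 0-based indexing)
        return ordered[(n + 1) // 2 - 1]
    # even: average of the n/2-th and (n/2 + 1)-th values
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


print(median_by_position([5, 1, 3]))     # n = 3, odd
print(median_by_position([4, 1, 3, 2]))  # n = 4, even
```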

8
Q

using population mean and population median

A
1. symmetric distribution: μ = ~μ
(mean & median are close together)
2. negatively skewed distribution: μ < ~μ
3. positively skewed distribution: μ > ~μ
the mean gets pulled toward the long tail (the direction of the skew); the median is resistant
9
Q

Quartiles

A

Q1: median of the data values below ~x (25th percentile)
Q3: median of the data values above ~x (75th percentile)
range = max - min

10
Q

Boxplot

A

visual representation of 5 number summary (max, min, median, Q1, Q3)

11
Q

Interquartile Range (IQR)

A

range of middle 50% of data

IQR = Q3 - Q1

12
Q

How to identify outliers

A
  • find IQR
  • multiply by 1.5 (1.5 IQR rule)
  • subtract that # from Q1 (lower fence) and add it to Q3 (upper fence)
  • anything below Q1 - 1.5 IQR or above Q3 + 1.5 IQR = outlier
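
A minimal sketch of the 1.5 IQR rule, assuming quartiles are the medians of the lower and upper halves of the ordered data (other quartile conventions exist and can give slightly different fences). The function name `iqr_outliers` and the data are made up for illustration.

```python
from statistics import median


def iqr_outliers(values):
    """Return the values outside Q1 - 1.5*IQR and Q3 + 1.5*IQR (needs n >= 2)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                                   # values below the median
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]  # values above it
    q1, q3 = median(lower), median(upper)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr    # subtract from Q1
    high_fence = q3 + 1.5 * iqr   # add to Q3
    return [x for x in ordered if x < low_fence or x > high_fence]


print(iqr_outliers([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]))
```

In the example, Q1 = 3 and Q3 = 8, so IQR = 5 and the fences are -4.5 and 15.5; only 100 falls outside them.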
13
Q

Deviation from mean

A

xi - x bar

14
Q

summation of (xi-x bar) =

A

0, always

bc…
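
One standard way to see why the deviations always sum to zero (not necessarily the reasoning the card had in mind) is to split the sum and use the definition of the sample mean:

```latex
\sum_{i=1}^{n} (x_i - \bar{x})
  = \sum_{i=1}^{n} x_i - n\bar{x}
  = n\bar{x} - n\bar{x}
  = 0,
\qquad \text{since } \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i .
```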

15
Q

sample variance s^2

A

s^2 = summation of (xi - x bar)^2 / (n - 1)
*if you know the mean and n-1 of the values, then you know the last value
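
A short sketch of the sample variance formula on made-up data, checked against `statistics.variance` from the Python standard library (which also divides by n - 1). The helper name `sample_variance` is hypothetical.

```python
from statistics import variance


def sample_variance(values):
    """s^2 = sum of squared deviations from x bar, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    deviations = [x - x_bar for x in values]   # these always sum to 0
    return sum(d ** 2 for d in deviations) / (n - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up sample, x bar = 5
print(sample_variance(data))      # by hand: 32 / 7
print(variance(data))             # library version, same divisor
```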

16
Q

n-1

A

degrees of freedom

since summation of (xi-x bar) = 0, the last deviation can always be calculated if the first n-1 are known

17
Q

Sample Standard Deviation

A

how spread out the data is
(only really good with symmetric data)
s = sqrt(s^2)
NOT resistant to outliers

18
Q

When to use standard deviation (s) vs IQR as measure of spread***

A
  • measure of center = mean, use (s)
  • measure of center = median, use IQR

19
Q

Finite population standard deviation

A

sigma (σ)

σ = sqrt(summation of (xi - μ)^2 / N)
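
A sketch contrasting the population formula (divide by N) with the sample formula (divide by n - 1), assuming the made-up list below is the entire population. `population_sd` is a hypothetical helper; `pstdev` and `stdev` are the standard-library versions.

```python
import math
from statistics import pstdev, stdev


def population_sd(values):
    """sigma = sqrt(sum of squared deviations from mu, divided by N)."""
    N = len(values)
    mu = sum(values) / N
    return math.sqrt(sum((x - mu) ** 2 for x in values) / N)


population = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up data, treated as the whole population
print(population_sd(population))        # divides by N
print(pstdev(population))               # library population SD, same divisor
print(stdev(population))                # sample SD divides by N - 1, so it is larger
```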

20
Q

Letters rule of thumb

A
roman = sample
greek = population (parameters)
mu (μ) = mean pop
x bar = mean sample
sigma (σ) = standard deviation pop
s = standard deviation sample
21
Q

Standard deviation and the mean tell if your data is

A

where the data is centered (mean) and how dispersed it is (standard deviation); most informative when the data is roughly symmetric

22
Q

Bell shaped data =

A

approximately normally distributed

23
Q

Empirical Rule

A

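68-95-99.7% rule (for bell-shaped data)
within 1 standard deviation of the mean = about 68% of data
within 2 standard deviations of the mean = about 95% of data
within 3 standard deviations of the mean = about 99.7% of data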
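
The three percentages can be checked against an exact normal distribution. This sketch uses `statistics.NormalDist` (Python 3.8+); the helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.4f}")
```

The printed values are close to 0.68, 0.95 and 0.997, which is where the rule's name comes from; real data only follows them to the extent it is actually bell shaped.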

24
Q

Z-score

A
  • standard deviation = ruler on normal distribution
  • measurements = z-scores
  • = how many standard deviations above or below the mean a data point is
  • z = (x - mean) / standard deviation
  • no units, mean of all z-scores = 0, standard deviation of all = 1
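
A small sketch of the z-score formula on made-up data, assuming the sample mean and sample standard deviation (s) are used. It also checks the last bullet: the z-scores have mean 0 and standard deviation 1. The function name `z_scores` is hypothetical.

```python
from statistics import mean, stdev


def z_scores(values):
    """z = (x - mean) / standard deviation, for every data point."""
    m = mean(values)
    s = stdev(values)            # sample standard deviation (divides by n - 1)
    return [(x - m) / s for x in values]


data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample
z = z_scores(data)
print([round(v, 2) for v in z])
print("mean of z:", round(mean(z), 10), " SD of z:", round(stdev(z), 10))
```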
25
Q

K-th percentile (Pk)

A

k% of observations are less than the value
ex. P99 = 99th percentile: the # which 99% of data is below
Q1 = P25
~x = P50
Q3 = P75
93rd percentile = 7% of students scored higher than you
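
A rough sketch of the "percent of data below a value" idea, using made-up exam scores from 1 to 100. The function name `percent_below` is hypothetical, and textbooks differ on how ties and exact cut points are handled.

```python
def percent_below(values, x):
    """Percent of observations strictly less than x."""
    return 100 * sum(1 for v in values if v < x) / len(values)


scores = list(range(1, 101))      # made-up scores: 1, 2, ..., 100
print(percent_below(scores, 94))  # 93 of the 100 scores are below 94
print(percent_below(scores, 51))  # half the scores are below 51
```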