Lecture 4 Flashcards

1
Q

Percentile

A

For any set of n measurements arranged in ascending order, the pth percentile is a number such that p% of the measurements fall below that number and (100-p)% fall above it
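
The definition can be checked on a small data set. This is a minimal sketch; the helper name `percent_below` and the sample values are assumptions made for illustration, not part of the lecture.

```python
def percent_below(values, x):
    """Return the percentage of measurements strictly below x."""
    count = sum(1 for v in values if v < x)
    return 100 * count / len(values)


measurements = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# 7 of the 10 measurements are below 8, so 8 sits near the 70th percentile.
print(percent_below(measurements, 8))  # 70.0
```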

2
Q

Quartiles

A

Let n denote the number of observations in a data set. Arrange the observed values in ascending order. The quartiles are the values at these positions:
First quartile (Q1): position (n+1)/4
Second quartile (Q2): position (n+1)/2
Third quartile (Q3): position 3(n+1)/4

3
Q

If a position is not a whole number, what is used?

A

linear interpolation
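
One way to apply the (n+1)-based quartile positions with linear interpolation is sketched below. The function names `value_at_position` and `quartiles` are hypothetical, and clamping out-of-range positions to the smallest or largest value is an assumption, not something the lecture states.

```python
def value_at_position(sorted_values, position):
    """Return the value at a 1-based position, interpolating if fractional."""
    lower = int(position)          # whole-number part of the position
    fraction = position - lower    # fractional part drives the interpolation
    if lower < 1:                  # assumption: clamp to the minimum
        return sorted_values[0]
    if lower >= len(sorted_values):  # assumption: clamp to the maximum
        return sorted_values[-1]
    below = sorted_values[lower - 1]
    above = sorted_values[lower]
    return below + fraction * (above - below)


def quartiles(values):
    """Return (Q1, Q2, Q3) using positions (n+1)/4, (n+1)/2, 3(n+1)/4."""
    data = sorted(values)
    n = len(data)
    q1 = value_at_position(data, (n + 1) / 4)
    q2 = value_at_position(data, (n + 1) / 2)
    q3 = value_at_position(data, 3 * (n + 1) / 4)
    return q1, q2, q3


# n = 6, so the positions are 1.75, 3.5 and 5.25.
print(quartiles([2, 4, 6, 8, 10, 12]))  # (3.5, 7.0, 10.5)
```

For example, position 1.75 lies three quarters of the way from the 1st value (2) to the 2nd value (4), giving Q1 = 2 + 0.75(4 - 2) = 3.5.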

4
Q

Interquartile Range (IQR)

A

The sample interquartile range of a variable, denoted IQR, is the difference between its third and first quartiles; it gives the range of the middle 50% of the observed values.
IQR=Q3-Q1

5
Q

Q1 is roughly what percentile?

A

25th percentile

6
Q

Q2 is what percentile?

A

50th percentile

7
Q

Q3 is what percentile?

A

75th percentile

8
Q

Five number summary

A

consists of minimum, maximum, and quartiles written in increasing order
Min, Q1, Q2, Q3, Max
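
A five number summary could be computed as in the sketch below. It assumes Python's standard `statistics.quantiles` with `method="exclusive"`, which interpolates at positions based on n+1; the function name `five_number_summary` and the sample data are made up for illustration.

```python
import statistics


def five_number_summary(values):
    """Return (Min, Q1, Q2, Q3, Max) for a list of numbers."""
    data = sorted(values)
    # n=4 splits the data into quartiles; "exclusive" uses (n+1)-based positions.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="exclusive")
    return data[0], q1, q2, q3, data[-1]


summary = five_number_summary([12, 2, 8, 4, 10, 6])
print(summary)  # (2, 3.5, 7.0, 10.5, 12)

# The IQR follows directly from the summary: Q3 - Q1.
iqr = summary[3] - summary[1]
print(iqr)  # 7.0
```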

9
Q

Sample Z score

A

z = (x - sample mean)/s, where s is the sample standard deviation

10
Q

Population Z score

A

z = (x - population mean)/population standard deviation

11
Q

Interpretation of Z score

A

Sign: whether score is above (+) or below (-) the mean
Number: Distance between the score and mean in standard deviation units
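
The sample and population z scores can be sketched as follows. The function names and the example numbers (a score of 85 against a mean of 70 and standard deviation of 10) are assumptions chosen for illustration.

```python
import statistics


def sample_z_scores(values):
    """Return the z score of every value, using the sample mean and s."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return [(x - mean) / s for x in values]


def population_z_score(x, mu, sigma):
    """Return the z score of x given a known population mean and std. dev."""
    return (x - mu) / sigma


# Positive sign: 85 is above the mean; magnitude: 1.5 standard deviations away.
print(population_z_score(85, 70, 10))  # 1.5
# Negative sign: 55 is below the mean by the same distance.
print(population_z_score(55, 70, 10))  # -1.5

z = sample_z_scores([2, 4, 4, 4, 5, 5, 7, 9])
```

In the sample case, values below the sample mean get negative z scores and values above it get positive ones, so the z scores of a data set always sum to zero (up to rounding).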

12
Q

what is not affected by an extreme value in the data set?

A

median

13
Q

Outlier

A

An observation that is unusually large or small relative to the other values in a data set. Outliers are typically attributable to one of the following causes:
1) the measurement was observed, recorded, or entered into the computer incorrectly
2) the measurement comes from a different population
3) the measurement is correct but represents a rare event

14
Q

Box plot

A

based on the five number summary and can be used to provide a graphical display of the center and variation of the observed values of variables in a data set

15
Q

How to construct a box plot

A

1) determine the five number summary
2) draw a horizontal (or vertical) axis on which the numbers obtained in step 1 can be located. Above the axis, mark the quartiles and the minimum and maximum with vertical (horizontal) lines
3) connect the quartiles to each other to make a box, and then connect the box to the minimum and maximum with lines
4) Determine the lower inner fence, Q1-1.5(IQR), and the upper inner fence, Q3+1.5(IQR); lines (whiskers) are drawn from each hinge to the inner fence boundaries
5) Determine the lower outer fence, Q1-3(IQR), and the upper outer fence, Q3+3(IQR)

16
Q

Interpreting a box plot

A

the line inside the box represents the median
the length of the box is the IQR: measure of sample variability
comparing the lengths of the whiskers may determine skewness
fewer than 5% of observations should fall outside the inner fences

17
Q

Rule of thumb for detecting outliers

A

observations falling between the inner and outer fences are deemed suspect outliers. Observations falling beyond the outer fence are deemed highly suspect outliers
Suspect outliers: between Q1-3(IQR) and Q1-1.5(IQR) or between Q3+1.5(IQR) and Q3+3(IQR)
Highly suspect outliers: less than Q1-3(IQR) or greater than Q3+3(IQR)
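
The rule of thumb can be sketched in code as below. The function name `classify_outliers`, the quartile values, and the sample observations are hypothetical; how to treat a value that lands exactly on a fence is not covered by the rule, and this sketch counts such values as inside the fence.

```python
def classify_outliers(values, q1, q3):
    """Split values into suspect and highly suspect outliers by the fences."""
    iqr = q3 - q1
    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)  # lower and upper inner fences
    outer = (q1 - 3 * iqr, q3 + 3 * iqr)      # lower and upper outer fences
    suspect, highly_suspect = [], []
    for x in values:
        if x < outer[0] or x > outer[1]:
            highly_suspect.append(x)   # beyond an outer fence
        elif x < inner[0] or x > inner[1]:
            suspect.append(x)          # between an inner and an outer fence
    return inner, outer, suspect, highly_suspect


# With Q1 = 10 and Q3 = 20, IQR = 10: inner fences -5 and 35, outer -20 and 50.
inner, outer, suspect, highly_suspect = classify_outliers(
    [12, 15, 18, 40, 60, -25], q1=10, q3=20
)
print(inner)           # (-5.0, 35.0)
print(outer)           # (-20, 50)
print(suspect)         # [40]
print(highly_suspect)  # [60, -25]
```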