HW3 CH2 - Measures of Center, Variation, 5-Number Summary, Box Plots Flashcards

1
Q

Define the measure of center

A

Descriptive measure that reveals the center or most typical values of a data set

2
Q

What is a sample mean?

A

sum of all values divided by the total number of observations in the data set

3
Q

how do you obtain the sample mean?

A

add up all the observations and divide the sum by the number of observations
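
A minimal sketch of that calculation in Python. The data values are made up for illustration and the variable names are only placeholders.

```python
# Sample mean: add up the observations, then divide by how many there are.
data = [2, 4, 4, 5, 10]  # hypothetical sample, invented for this example

x_bar = sum(data) / len(data)
print(x_bar)  # 5.0
```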

4
Q

what is the symbol for sample mean?

A

x with a bar above it (x̄, read “x-bar”)

5
Q

what is the symbol for population mean?

A

the Greek letter μ (“mu”), which looks like a u with a tail

6
Q

what is a median?

A

A number that divides the top 50% of the data from the bottom 50%

7
Q

how do you find the median?

A

sort the values from least to greatest; with an odd number of observations the median is the middle value, with an even number it is (sum of the two middle values)/2
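
The odd and even cases can be sketched as a short Python function. The function name `median` and the sample lists are assumptions made for this example, not part of the deck.

```python
def median(values):
    """Middle value of the sorted data (average of the two middle values if n is even)."""
    ordered = sorted(values)          # least to greatest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                    # odd count: one middle value
        return ordered[mid]
    # even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([7, 1, 5]))     # 5
print(median([7, 1, 5, 3]))  # 4.0
```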

8
Q

what is the mode?

A

the value that occurs most often in the data set, provided its frequency is greater than 1

9
Q

Is it possible for a data set to have 2 or more modes? (T/F)

A

true: two or more values can tie for the highest frequency
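
A small sketch of finding every mode, assuming the convention used in this deck that a value only counts as a mode if it occurs more than once. The function name `modes` is hypothetical.

```python
from collections import Counter

def modes(values):
    """Return all values tied for the highest frequency, or [] if nothing repeats."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:                      # no value repeats, so there is no mode
        return []
    return [v for v, c in counts.items() if c == top]

print(modes([1, 1, 2, 2, 3]))  # [1, 2]  (two modes)
print(modes([4, 4, 5]))        # [4]
print(modes([1, 2, 3]))        # []
```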

10
Q

what is a resistant measure?

A

a measure is robust (resistant) if extreme values have little to no influence on its outcome

11
Q

which is the robust measure, the mean or the median?

A

median
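
One way to see the difference is to change a single value into an extreme one and recompute both measures. This is only an illustration with made-up numbers, using Python's standard `statistics` module.

```python
import statistics

clean = [10, 12, 13, 15, 16]          # made-up sample
with_outlier = [10, 12, 13, 15, 160]  # same sample with one extreme value

for label, data in (("clean", clean), ("with outlier", with_outlier)):
    print(label, "mean:", statistics.mean(data), "median:", statistics.median(data))
# The mean moves from 13.2 to 42, while the median stays at 13.
```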

12
Q

What are measures of variation (dispersion)?

A

descriptive measures that describe how much variation or “spread” there is in a data set

13
Q

what is range?

A

The difference between the largest observation and the smallest observation

14
Q

what are the disadvantages of range?

A
  1. measure is based only on 2 values
  2. not resistant: highly susceptible to outliers
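
Both disadvantages show up in a quick sketch. The data sets and the helper name `data_range` are invented for illustration.

```python
def data_range(values):
    """Range = largest observation minus smallest observation."""
    return max(values) - min(values)

clustered = [1, 5, 5, 5, 9]    # made-up data, mostly near the middle
spread_out = [1, 1, 5, 9, 9]   # made-up data, mostly at the extremes
# Only the two end values matter, so both sets get the same range.
print(data_range(clustered), data_range(spread_out))  # 8 8

with_outlier = [1, 5, 5, 5, 90]  # one extreme value
print(data_range(with_outlier))  # 89
```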
15
Q

what is deviation?

A

The difference between an observation and the mean

16
Q

what is a sample standard deviation?

A

roughly, the average distance between an observation and the mean
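
The "roughly" can be made precise with the usual formula, s = sqrt( sum of (xi - x̄)^2 / (n - 1) ). A minimal sketch, where the function name `sample_std` and the data are assumptions for this example:

```python
import math

def sample_std(values):
    """Sample standard deviation: sqrt of the summed squared deviations over n - 1."""
    n = len(values)
    x_bar = sum(values) / n                            # sample mean
    squared_devs = [(x - x_bar) ** 2 for x in values]  # squared deviations from the mean
    return math.sqrt(sum(squared_devs) / (n - 1))

print(sample_std([2, 4, 4, 5, 10]))  # 3.0
```

For this sample the mean is 5, the squared deviations add up to 36, and 36 / 4 = 9, so s = 3.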

17
Q

Is range resistant?

A

no

18
Q

Does range show how spread out the data is?

19
Q

is standard deviation robust?

A

no; outliers affect the standard deviation, so it is not a resistant measure

20
Q

Why transform data?

A

to change units, to make the shape of the distribution more symmetric, or to make the relationship between 2 variables linear

21
Q

define parameter

A

numerical summary of the population

22
Q

define statistic

A

numerical summary of the sample

23
Q

define quartiles

A

values that divide the ordered data set into 4 equal parts

24
Q

What is the interquartile range?

A

the difference between the third and the first quartiles (Q3 - Q1)

25
What is the 5 number summary?
it consists of:
  1. minimum value
  2. first quartile (Q1)
  3. second quartile (median)
  4. third quartile (Q3)
  5. maximum value
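
A sketch of computing the five values. Quartile conventions differ between textbooks and software; this example assumes the "median of each half" method (for an odd count, the middle value is left out of both halves), and the function names are hypothetical.

```python
def _median_sorted(ordered):
    """Median of an already sorted, non-empty list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max); needs at least 2 observations."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]               # bottom half of the data
    upper = ordered[half + (n % 2):]     # top half (skips the middle value when n is odd)
    return (ordered[0], _median_sorted(lower), _median_sorted(ordered),
            _median_sorted(upper), ordered[-1])

print(five_number_summary([1, 3, 4, 6, 7, 8, 10, 12]))  # (1, 3.5, 6.5, 9.0, 12)
```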
26
What is an outlier?
A value that is distant from other observations in the data set
27
define a boxplot
a graph that displays the distribution of a data set using the 5 number summary, and on which outliers are easy to see
28
what advantage does a histogram have over a boxplot?
displays more information about the distribution of a data set
29
Define Dot plot
a graphical display of data using dots (each dot = one value in the data set); limit value grouping
30
define stem and leaf plot
a table in which each value is split into a "stem" (leading digit or digits) and a "leaf" (last digit)
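
A rough sketch of building such a table, assuming non-negative whole numbers where the leaf is the ones digit and the stem is everything in front of it. The function name and data are made up for illustration.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (value // 10) to its sorted list of leaves (value % 10)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)
    return dict(stems)

table = stem_and_leaf([23, 12, 30, 15, 21, 23])
for stem, leaves in table.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
# 1 | 2 5
# 2 | 1 3 3
# 3 | 0
```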
31
What are the advantages of stem-and-leaf plots and dotplots?
they display every individual value in the data set
32
what are the disadvantages of stem-and-leaf plots and dotplots?
when the data set is large they are not informative; use a histogram instead
33
what is a histogram?
a graph drawn using vertical bars, where bar height = frequency
34
what does a frequency histogram
35
what do outliers affect?
Mean and standard deviation (not resistant measures)
36
what are the degrees of freedom?
n - 1, the divisor in the sample variance
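
A short sketch of where the n - 1 comes from: the deviations from the sample mean always add up to zero, so once n - 1 of them are known the last one is fixed. The data and the function name `sample_variance` are assumptions for this example.

```python
def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)

data = [3, 5, 10]                        # made-up sample, mean = 6
x_bar = sum(data) / len(data)
deviations = [x - x_bar for x in data]   # [-3.0, -1.0, 4.0]
print(sum(deviations))                   # 0.0, so only n - 1 deviations are free to vary
print(sample_variance(data))             # 13.0  (26 / 2)
```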
37
Name the 4-step process to organize a statistical problem
  1. state: what is the practical question?
  2. plan: what specific statistical operations does this problem call for?
  3. solve: analyze the data with graphs and computations
  4. conclude: give your practical conclusion
38
The mean is a measure of center whereas the standard deviation measures the ____________ of data about the mean.
variability
39
The line in the box of a boxplot marks where the __________ is.
median
40
Standard Deviation measures...
the variability of the data about the mean; roughly, the typical distance between an observation and the mean
41
what is deviation?
The difference between an observation and the mean: xi - x̄
42
how do you figure out if an observation is an outlier?
it is an outlier if it falls below the lower limit (Q1 - 1.5 x IQR) or above the upper limit (Q3 + 1.5 x IQR)
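
A minimal sketch of the usual 1.5 x IQR rule: a value is flagged when it lies below Q1 - 1.5 x IQR or above Q3 + 1.5 x IQR. The quartiles are passed in rather than computed, since quartile methods vary; the function names and data are hypothetical.

```python
def outlier_limits(q1, q3):
    """Return (lower limit, upper limit) from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(values, q1, q3):
    """Values that fall outside the lower and upper limits."""
    lower, upper = outlier_limits(q1, q3)
    return [x for x in values if x < lower or x > upper]

data = [1, 3, 4, 6, 7, 8, 10, 30]         # made-up data
q1, q3 = 3.5, 9.0                         # quartiles by the median-of-halves method
print(outlier_limits(q1, q3))             # (-4.75, 17.25)
print(find_outliers(data, q1, q3))        # [30]
```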
43
How do you find the interquartile range?
Q3 - Q1 = IQR
44