Lecture 2 Flashcards

(22 cards)

1
Q

4 aspects needed to correctly describe a single quantitative variable

A

Centre, spread (there are several ways to measure it), skew (positive, negative, or symmetric), and weird things (outliers, multiple modes)

2
Q

Centre

A

Where most of the data are located

3
Q

Spread (2)

A

(1) The range over which we see most of the data; (2) how alike or different the observations are

4
Q

Skew

A

The direction in which the spread extends further (the side with the longer tail)

5
Q

Weird things

A

Some points lie really far away from the rest (outliers), or the data have 2 centres (multiple modes)

6
Q

Dotplot advantages and disadvantages

A

Advantages: you get to see all the data points; easy to interpret. Disadvantage: gets messy quickly when there are a lot of data
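
As a rough illustration of this trade-off, here is a minimal text dotplot in Python. The function name `dotplot` and the data are hypothetical, and the sketch assumes integer observations.

```python
from collections import Counter

def dotplot(values):
    """Return a text dotplot: one row per integer value, one dot per observation."""
    counts = Counter(values)
    rows = []
    for v in range(min(values), max(values) + 1):
        rows.append(f"{v:>3} | " + "." * counts[v])
    return "\n".join(rows)

# Hypothetical data, invented for illustration only.
data = [1, 2, 2, 3, 3, 3, 5]
print(dotplot(data))
```

Every observation is visible as its own dot, which is the advantage on the card. With thousands of observations the rows would become too long to read, which is the disadvantage.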

7
Q

Histogram advantages and disadvantages

A

Advantages: easy to pick up on (get an idea of) centre, spread, skew, multiple modes, and even outliers; produced by most statistical packages (programs). Disadvantage: different bin widths can give different interpretations of the data
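
The bin-width point can be sketched in a few lines of Python. The helper `bin_counts` and the data are hypothetical. Bins are assumed to start at multiples of the width.

```python
from collections import Counter

def bin_counts(values, width):
    """Count observations per bin, where each bin is [k*width, (k+1)*width)."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))

# Hypothetical data, invented for illustration only.
data = [1, 2, 2, 3, 7, 8, 8, 9]

wide = bin_counts(data, 10)   # a single bin: the two clusters are hidden
narrow = bin_counts(data, 5)  # two bins: the two clusters show up
print(wide)
print(narrow)
```

The same eight values look like one lump with a width of 10 and like two separate groups with a width of 5.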

8
Q

Mode

A

The value that occurs most often

9
Q

We say that the median is _____________ to outliers

A

robust (an outlier changes it very little, if at all)
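
A small sketch of what robustness means in practice, using Python's standard `statistics` module. The data values are hypothetical.

```python
from statistics import mean, median

# Hypothetical data, invented for illustration only.
data = [10, 12, 13, 15, 16]
with_outlier = data[:-1] + [160]  # replace the largest value with an outlier

print(median(data), median(with_outlier))  # the median stays at 13
print(mean(data), mean(with_outlier))      # the mean moves from 13.2 to 42
```

The median only depends on the middle of the ordered data, so one extreme value leaves it alone here, while the mean is pulled strongly towards the outlier.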

10
Q

Mean advantages and disadvantages

A

Good for estimating population means and has good inferential properties, but it is affected by outliers and skewed data

11
Q

Median advantages and disadvantages

A

Easy to interpret and not influenced by outliers, but it has poor inferential properties and takes longer to calculate

12
Q

Mode advantages and disadvantages

A

Shows the highest concentration of data and lets us see bimodal data, but class definition matters (precision of the intervals: too coarse can give several modes, and too fine gives no mode)
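
A minimal sketch of the class-definition issue, using `statistics.multimode` from the Python standard library. The data are hypothetical, and grouping by whole units with `int` is just one possible choice of classes.

```python
from statistics import multimode

# Hypothetical measurements, invented for illustration only.
data = [1.1, 1.6, 2.2, 2.3, 2.4, 3.7]

# Classes too fine: every value is unique, so every value ties as "the mode".
exact = multimode(data)

# Classes of width 1 (truncate to the whole unit): a single clear mode appears.
by_unit = multimode([int(x) for x in data])

print(exact)
print(by_unit)
```

With the raw values no observation repeats, so the mode is uninformative. Once the values are grouped into classes of width 1, the class starting at 2 stands out.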

13
Q

Population and sample mean notation

A

Population mean: µ

Sample mean: x̄ or X̄ (x-bar or X-bar)

14
Q

Determine skew with the median and mean

A

Mean to the left of the median = left skew; mean to the right of the median = right skew; mean = median = symmetric
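
The rule on this card can be sketched as a small Python function. The name `skew_direction` and the data are hypothetical. This is a rule of thumb only: a mean equal to the median does not by itself guarantee a symmetric distribution.

```python
from statistics import mean, median

def skew_direction(values):
    """Rule of thumb from the card: compare the mean with the median."""
    m, med = mean(values), median(values)
    if m > med:
        return "right skew"
    if m < med:
        return "left skew"
    return "symmetric"

# Hypothetical data sets, invented for illustration only.
print(skew_direction([1, 2, 3, 4, 20]))     # mean 6, median 3
print(skew_direction([1, 17, 18, 19, 20]))  # mean 15, median 18
print(skew_direction([1, 2, 3, 4, 5]))      # mean 3, median 3
```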

15
Q

Range: definition, pros and cons

A

Difference between max and min values (it’s a measure of spread). Easy to compute but sensitive to outliers
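
A short Python sketch of the range and its sensitivity to a single outlier. The helper name `value_range` and the data are hypothetical.

```python
def value_range(values):
    """Range as defined on the card: largest value minus smallest value."""
    return max(values) - min(values)

# Hypothetical data, invented for illustration only.
data = [4, 7, 8, 9, 12]

print(value_range(data))          # 12 - 4 = 8
print(value_range(data + [100]))  # 100 - 4 = 96
```

Adding one extreme observation changes the range from 8 to 96, even though the other five values are unchanged.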

16
Q

Sample variance measurement (S²)
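
The answer to this card is missing from the deck. For reference, the standard definition of the sample variance is shown below, on the assumption that this is what the card showed. Here n is the sample size and X̄ is the sample mean, as used elsewhere in the deck.

```latex
S^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( X_i - \bar{X} \right)^2
```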

17
Q

Population variance (σ²)
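
The answer to this card is missing from the deck. For reference, the standard definition of the population variance is shown below, on the assumption that this is what the card showed. The symbol N for the population size is an assumption. µ is the population mean.

```latex
\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} \left( X_i - \mu \right)^2
```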

18
Q

Why squared deviations (2)

A

(1) The sum of (Xi - X̄) over all values of Xi is 0, so the raw deviations cancel out and cannot measure spread. (2) Absolute values are not good for inference, so we use squared deviations
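
The first reason can be checked numerically. A minimal Python sketch with hypothetical data:

```python
from statistics import mean

# Hypothetical data, invented for illustration only (the mean is 5).
data = [2, 4, 4, 6, 9]
xbar = mean(data)

deviations = [x - xbar for x in data]    # -3, -1, -1, 1, 4
sum_dev = sum(deviations)                # positive and negative deviations cancel
sum_sq = sum(d ** 2 for d in deviations) # squaring removes the signs

print(sum_dev)  # 0
print(sum_sq)   # 9 + 1 + 1 + 1 + 16 = 28
```

The raw deviations always sum to 0, whatever the data, while the squared deviations give a positive total that grows with the spread.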

19
Q

To remember in squared deviations

A

Variance is measured in squared units

20
Q

Sample standard deviation (S)

A

Formula: S = √(S²), i.e. a square root taken over the entire sample variance formula
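
A quick check of this relationship with Python's standard `statistics` module, which computes the sample variance (dividing by n - 1) in `variance` and the sample standard deviation in `stdev`. The data are hypothetical.

```python
from math import isclose, sqrt
from statistics import stdev, variance

# Hypothetical data, invented for illustration only.
data = [2, 4, 4, 6, 9]

s2 = variance(data)  # sample variance: 28 / (5 - 1) = 7
s = sqrt(s2)         # sample standard deviation, about 2.65

print(s2, s)
print(isclose(s, stdev(data)))  # True: stdev is the square root of variance
```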

21
Q

Units of S

A

Same units as data themselves

22
Q

T or F: The standard deviation is the average absolute deviation from the mean

A

False, but it doesn’t hurt much to think of it as the average distance of the observations from the mean
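
To see that the two quantities really differ, here is a minimal Python comparison on hypothetical data. The variable name `mad` (mean absolute deviation) is an assumption, and the sample standard deviation is used to match the S of the earlier cards.

```python
from statistics import mean, stdev

# Hypothetical data, invented for illustration only (the mean is 5).
data = [2, 4, 4, 6, 9]
xbar = mean(data)

mad = mean([abs(x - xbar) for x in data])  # (3 + 1 + 1 + 1 + 4) / 5 = 2
sd = stdev(data)                           # sqrt(28 / 4), about 2.65

print(mad, sd)
```

The two numbers are on the same scale and of similar size, which is why the informal reading is harmless, but they are not equal.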