Midterm Flashcards Preview

Flashcards in Midterm Deck (37)
1

Experiments vs observational studies

Experiment -> the researcher controls the treatment and can include a control group. Observational study -> we only observe, so we can establish association but not causality.

2

Confounding variable

Variable that can affect both the treatment and the response

3

Random sample

Sample selected by chance, so it tends to be representative of the population and is not systematically biased

4

3 types of bias

Selection bias. Non-response bias. Measurement errors

5

Selection bias explanation

A subset of the experimental units in the population is excluded (has no chance of being selected for the experiment), so the sample is not a random sample

6

Non-response bias explanation

Inability to obtain data on all experimental units selected for the sample

7

Measurement error explanation

Inaccuracies in values recorded.

8

Commonly used displays for qualitative data

Pie charts, bar plots

9

Simpson's paradox explanation

A third (confounding) variable changes the relationship between two other QUALITATIVE variables. It is caused by an imbalance in the distribution of the categories of the third variable with respect to the first two.
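A minimal sketch of the paradox, using counts invented purely for illustration (the treatment names `A` and `B` and the categories `mild` and `severe` are hypothetical). Treatment A has the higher success rate within each category, yet the lower rate once the categories are pooled, because most of A's cases fall in the harder category.

```python
# Hypothetical (successes, trials) counts, invented purely for illustration.
counts = {
    "A": {"mild": (90, 100), "severe": (300, 1000)},
    "B": {"mild": (800, 1000), "severe": (20, 100)},
}

# Success rate of each treatment within each category of the third variable.
within = {
    treatment: {group: s / n for group, (s, n) in groups.items()}
    for treatment, groups in counts.items()
}

# Success rate of each treatment after pooling the categories together.
overall = {
    treatment: sum(s for s, _ in groups.values()) / sum(n for _, n in groups.values())
    for treatment, groups in counts.items()
}

for treatment in counts:
    print(treatment, within[treatment], round(overall[treatment], 3))
```

The reversal comes entirely from the imbalance: A is mostly applied to severe cases and B mostly to mild ones.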

10

Graphical displays for quantitative data

Boxplot, dotplot, histogram

11

Adv/Disadv of the dotplot

A: Get to see all the data points + easy to interpret. D: Gets messy quickly with lots of data

12

Adv/Disadv of histograms

A: Easy to pick up on all aspects of quantitative data (centre, spread, etc.) + made by most statistical packages. D: Different bin widths can give different interpretations

13

Mode

Number (or centre of the bin) that occurs most often (that has the most observations)

14

Frequency vs Percentage

Frequency = number of observations in the bin. Percentage = share of the total number of observations that the bin represents
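As a small illustration of the two quantities, the sketch below bins some made-up observations and reports both the frequency and the percentage for each bin. The data and the bin width of 10 are assumptions chosen only for the example.

```python
from collections import Counter

# Hypothetical observations, invented for illustration.
data = [12, 15, 21, 22, 25, 28, 31, 33, 47, 48]
bin_width = 10  # assumed bin width; other widths give other tables

# Label each observation by the lower edge of its bin, e.g. 25 -> 20.
frequency = Counter((x // bin_width) * bin_width for x in data)

# Percentage = share of all observations that fall in the bin.
percentage = {edge: 100 * count / len(data) for edge, count in frequency.items()}

for edge in sorted(frequency):
    print(f"[{edge}, {edge + bin_width}): "
          f"frequency={frequency[edge]}, percentage={percentage[edge]}%")

# The modal bin is the one with the highest frequency.
modal_bin = max(frequency, key=frequency.get)
print("modal bin starts at", modal_bin)
```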

15

Different possibilities for the mode

Unimodal, Multimodal or no mode

16

Adv/Disadv. of mean

A: Good for estimating the population mean + good inferential properties. D: Influenced by outliers and skewed data

17

Adv/disadv. of median

A: Easy to interpret + not influenced by outliers. D: Bad inferential properties + longer to calculate
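To see the contrast between the mean and the median, here is a minimal sketch with made-up numbers: adding a single outlier moves the mean a long way but barely moves the median. The data values are hypothetical.

```python
import statistics

# Hypothetical data, invented for illustration.
clean = [2, 3, 3, 4, 5]
with_outlier = clean + [100]  # same data plus one extreme value

mean_clean = statistics.mean(clean)
median_clean = statistics.median(clean)
mean_outlier = statistics.mean(with_outlier)
median_outlier = statistics.median(with_outlier)

print("mean:  ", mean_clean, "->", mean_outlier)
print("median:", median_clean, "->", median_outlier)
```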

18

Adv/disadv of mode

A: Shows where the highest concentration of data is + useful for bimodal data. D: Bin width (class definition) matters

19

Sample vs pop. mean symbols

x̄ (x bar) and μ

20

Sample var/std vs Pop. var/std

s² or s vs σ² or σ
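A short sketch of how the two pairs of symbols differ in practice, assuming the usual convention that the sample variance s² divides by n - 1 while the population variance σ² divides by n. The data are made up, and Python's standard `statistics` module is used only as one convenient way to compute them.

```python
import statistics

# Hypothetical data, invented for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Treating the data as a sample: divide by n - 1.
s_squared = statistics.variance(data)
s = statistics.stdev(data)

# Treating the data as the whole population: divide by n.
sigma_squared = statistics.pvariance(data)
sigma = statistics.pstdev(data)

print("s^2 =", s_squared, " s =", s)
print("sigma^2 =", sigma_squared, " sigma =", sigma)
```

The variance is in squared units of the data, while the standard deviation is back in the original units.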

21

Different measures of centre

Mean, median, mode

22

Different measures of spread

Range, IQR, Var or std

23

Something particular in variance

squared units

24

% of observations with Z-scores within ±1, ±2, ±3 under the empirical rule

68, 95, 99.7
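These percentages can be checked numerically, assuming the data follow a normal distribution (the empirical rule is a rule of thumb for bell-shaped data). For a standard normal Z, the probability that |Z| ≤ k equals erf(k / √2); the sketch below evaluates it for k = 1, 2, 3.

```python
import math

def share_within(k):
    """Probability that a standard normal Z-score lies between -k and k."""
    return math.erf(k / math.sqrt(2))

# Shares for 1, 2 and 3 standard deviations from the mean.
shares = {k: share_within(k) for k in (1, 2, 3)}

for k, p in shares.items():
    print(f"within {k} standard deviation(s): {100 * p:.1f}%")
```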

25

What are the three quartiles on a boxplot?

Q1 is 25th percentile, Q2 is median, Q3 is 75th percentile

26

Where whiskers end on boxplot

At the most extreme data point within 1.5 IQRs of the EDGE of the box in either direction

27

Extreme outliers in boxplots

Anything beyond 3 IQRs of the EDGE of the box
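The boxplot rules above can be sketched as follows. The data are made up, the label `mild` for points between 1.5 and 3 IQRs from the box is a name chosen for this example, and the quartile method (`"inclusive"` in Python's `statistics.quantiles`) is an assumption, since textbooks and packages compute quartiles slightly differently.

```python
import statistics

# Hypothetical data, invented for illustration.
data = [2, 4, 5, 6, 7, 8, 9, 18, 40]

# Quartile conventions differ between textbooks and packages; the
# "inclusive" method is an assumption made for this sketch.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Limits measured from the edges of the box (Q1 and Q3).
inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outer_low, outer_high = q1 - 3 * iqr, q3 + 3 * iqr

# Whiskers end at the most extreme points within 1.5 IQRs of the box.
inside = [x for x in data if inner_low <= x <= inner_high]
whiskers = (min(inside), max(inside))

# Extreme outliers lie more than 3 IQRs from the box.
extreme = [x for x in data if x < outer_low or x > outer_high]
# Remaining outliers lie between 1.5 and 3 IQRs from the box.
mild = [x for x in data if x not in inside and x not in extreme]

print("Q1, Q2, Q3:", q1, q2, q3, " IQR:", iqr)
print("whiskers:", whiskers, " mild:", mild, " extreme:", extreme)
```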

28

Adv/disadv. of variance

A: Good inferential properties. D: Influenced by outliers and skewed data

29

Adv/disadv. of IQR

A: Not influenced by outliers. D: Bad inferential properties and longer to calculate

30

Adv/disadv. of range

A: Easy to calculate. D: Highly influenced by outliers and bad inferential properties