Midterm Flashcards

(37 cards)

1
Q

Experiments vs observational studies

A

Experiment -> the researcher controls the treatment and can include a control group. Observational study -> we only observe, so we can establish association but not causality

2
Q

Confounding variable

A

Variable that can affect both the treatment and the response

3
Q

Random sample

A

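Sample in which every unit of the population has the same chance of being selected. It tends to be representative of the population and avoids selection bias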

4
Q

3 types of bias

A

Selection bias. Non-response bias. Measurement errors

5
Q

Selection bias explanation

A

A subset of the experimental units in the population is excluded or has no chance of being selected for the experiment, so the sample is not random

6
Q

Non-response bias explanation

A

Inability to obtain data on all experimental units selected for the sample

7
Q

Measurement error explanation

A

Inaccuracies in values recorded.

8
Q

Commonly used displays for qualitative data

A

Pie charts, bar plots

9
Q

Simpson’s paradox explanation

A

A third (confounding) variable changes, or even reverses, the relationship between two other QUALITATIVE variables. It comes from an imbalance in the distribution of the third variable's categories with respect to the first two.
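
As a rough illustration of the reversal, here is a minimal Python sketch. The counts are hypothetical and chosen only for illustration, and the names `counts`, `rate` and `overall_rate` are assumptions, not part of the course material. Treatment A does better inside each category of the third variable (stone size), yet B looks better once the categories are pooled, because A was used mostly on the harder "large" cases.

```python
# Hypothetical (successes, total) counts, chosen for illustration only.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def rate(successes, total):
    """Success rate within one category."""
    return successes / total

def overall_rate(groups):
    """Success rate after pooling all categories of the third variable."""
    successes = sum(s for s, _ in groups.values())
    total = sum(t for _, t in groups.values())
    return successes / total

# A has the higher success rate inside every category...
a_wins_within = all(
    rate(*counts["A"][g]) > rate(*counts["B"][g]) for g in ("small", "large")
)
# ...but B has the higher success rate once categories are pooled.
b_wins_overall = overall_rate(counts["B"]) > overall_rate(counts["A"])

print(a_wins_within, b_wins_overall)
```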

10
Q

Graphical displays for quantitative data

A

Boxplot, dotplot, histogram

11
Q

Adv/Disadv of the dotplot

A

A: Get to see all the data points + easy to interpret D: Gets messy quickly with lots of data

12
Q

Adv/Disadv of histograms

A

A: Easy to pick up on all aspects of quantitative data (centre, spread, etc.) + made by most statistical packages D: Different bin widths can give different interpretations

13
Q

Mode

A

Number (or centre of the bin) that occurs most often (that has the most observations)

14
Q

Frequency vs Percentage

A

Frequency = number of observations in the bin. Percentage = share of the total observations that the bin represents

15
Q

Different possibilities for the mode

A

Unimodal, Multimodal or no mode

16
Q

Adv/Disadv. of mean

A

A: Good for estimating the pop. mean + good inferential properties D: Influenced by outliers and skewed data

17
Q

Adv/disadv. of median

A

A: Easy to interpret + not infl. by outliers D: Bad inferential properties + longer to calculate
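
To make the outlier point on the mean and median cards concrete, here is a small sketch using Python's standard `statistics` module. The data values are made up for illustration. Adding one extreme value drags the mean far from the bulk of the data, while the median barely moves.

```python
from statistics import mean, median

# Made-up data for illustration only.
data = [2, 3, 3, 4, 5]
with_outlier = data + [100]  # same data plus one extreme value

# How far each measure of centre moves when the outlier is added.
mean_shift = mean(with_outlier) - mean(data)
median_shift = median(with_outlier) - median(data)

print(f"mean:   {mean(data)} -> {mean(with_outlier)}")
print(f"median: {median(data)} -> {median(with_outlier)}")
```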

18
Q

Adv/disadv of mode

A

A: Shows where the data are most concentrated + useful for bimodal data D: Depends on bin width (class definition)

19
Q

Sample vs pop. mean symbols

A

$\bar{x}$ (sample) vs $\mu$ (population)

20
Q

Sample var/std vs Pop. var/std

A

$s^2$ or $s$ vs $\sigma^2$ or $\sigma$

21
Q

Different measures of centre

A

Mean, median, mode

22
Q

Different measures of spread

A

Range, IQR, Var or std

23
Q

Something particular in variance

A

squared units

24
Q

% of observations within Z-scores of 1, 2, 3 w/ empirical rule
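
The empirical rule is usually quoted as roughly 68%, 95% and 99.7% of observations lying within 1, 2 and 3 standard deviations of the mean. The sketch below checks those figures under the assumption that the data are exactly normal; the helper name `share_within` is hypothetical.

```python
import math

def share_within(k):
    """P(|Z| < k) for a standard normal Z, via the error function."""
    return math.erf(k / math.sqrt(2))

# Share of observations within 1, 2 and 3 standard deviations.
shares = {k: share_within(k) for k in (1, 2, 3)}

for k, p in shares.items():
    print(f"within {k} SD: {p:.1%}")
```

For real data that are only roughly bell-shaped, these percentages are approximations.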

25
What are the three quartiles on boxplot
Q1 is 25th percentile, Q2 is median, Q3 is 75th percentile
26
Where whiskers end on boxplot
At the most extreme data point within 1.5 IQRs of the EDGE of the box, in either direction
27
Extreme outliers in boxplots
Anything beyond 3 IQRs of the EDGE of the box
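
The whisker and extreme-outlier rules above can be sketched as follows. This is a minimal sketch, not a definitive implementation: quartile conventions differ between textbooks and packages, and the `method="inclusive"` choice, the function name `boxplot_summary` and the data are all assumptions made for illustration.

```python
from statistics import quantiles

def boxplot_summary(data):
    """Whisker ends and extreme outliers using the 1.5 IQR and 3 IQR rules."""
    # Assumed quartile convention; other conventions give slightly different values.
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Whiskers stop at the most extreme points still inside the fences.
    inside = [x for x in data if low_fence <= x <= high_fence]
    # Extreme outliers lie more than 3 IQRs beyond the edge of the box.
    extreme = [x for x in data if x < q1 - 3 * iqr or x > q3 + 3 * iqr]
    return {"whiskers": (min(inside), max(inside)), "extreme_outliers": extreme}

# Made-up data for illustration only.
summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(summary)
```
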
28
Adv/disadv. of variance
A : Good inferential properties D : Influenced by outliers and skewed data
29
Adv/disadv. of IQR
A : Not infl. by outliers D: Bad inferential properties and longer to calculate
30
Adv/disadv. of range
A : Easy to calculate D : Highly inf. by outliers and bad inferential properties
31
Something particular we can see with boxplots
Possible to have no whisker on one side.
32
Alternative formula for variance
$s^2 = \frac{1}{n-1}\left(\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right)$
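
As a quick check, this sketch compares the alternative (shortcut) formula with `statistics.variance` from the Python standard library, which computes the sample variance with the n-1 divisor. The function name `shortcut_variance` and the data are made up for illustration.

```python
from statistics import variance

def shortcut_variance(xs):
    """Sample variance via (sum of squares - n * mean^2) / (n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    return (sum(x * x for x in xs) - n * mean**2) / (n - 1)

# Made-up data for illustration only.
data = [2.0, 4.0, 4.0, 5.0, 7.0]

shortcut = shortcut_variance(data)
reference = variance(data)  # definitional sample variance
print(shortcut, reference)
```

The two agree up to floating-point rounding. With very large values the shortcut can lose precision, because it subtracts two nearly equal numbers.
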
33
Sensitivity (of a test) as a probability
Pr(positive | disease)
34
Specificity (of a test) as a probability
Pr(negative | not disease)
35
Probability of false positive
Pr(positive | not disease)
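
The three conditional probabilities on cards 33 to 35 can be estimated from a 2x2 table of test results. The counts below are hypothetical, and the variable names are assumptions chosen for illustration.

```python
# Hypothetical 2x2 table of test results, for illustration only.
true_pos = 90    # diseased, test positive
false_neg = 10   # diseased, test negative
true_neg = 950   # not diseased, test negative
false_pos = 50   # not diseased, test positive

# Pr(positive | disease)
sensitivity = true_pos / (true_pos + false_neg)
# Pr(negative | not disease)
specificity = true_neg / (true_neg + false_pos)
# Pr(positive | not disease), which equals 1 - specificity
false_positive_rate = false_pos / (true_neg + false_pos)

print(sensitivity, specificity, false_positive_rate)
```
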
36
Multiplicative rule
$\Pr(A \cap B) = \Pr(A \mid B)\,\Pr(B)$ (probability of the intersection = conditional probability times Pr(B))
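
A small worked example of the multiplicative rule, assuming a standard 52-card deck with 4 aces and two cards drawn without replacement. Here B is "first card is an ace" and A is "second card is an ace". The variable names are made up for illustration.

```python
from fractions import Fraction

# Pr(B): first card is an ace.
p_first_ace = Fraction(4, 52)
# Pr(A | B): second card is an ace, given the first was an ace.
p_second_ace_given_first = Fraction(3, 51)

# Multiplicative rule: Pr(A and B) = Pr(A | B) * Pr(B)
p_both_aces = p_second_ace_given_first * p_first_ace

print(p_both_aces)
```
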
37
Limitation to using counting rules
All outcomes of the experiment must be equally likely