Midterm Flashcards by Django E.

Experiments vs observational studies

Experiment: the researcher controls the treatment and can include a control group. Observational study: we only observe, so we can establish association but not causality.

Confounding variable

Variable that can affect both the treatment and the response

Random sample

Sample selected at random so that it is representative of the population and not biased

3 types of bias

Selection bias. Non-response bias. Measurement errors

Selection bias explanation

A subset of the population's experimental units is excluded or has no chance of being selected for the sample, so the sample is not random

Non-response bias explanation

Inability to obtain data on all experimental units selected for the sample

Measurement error explanation

Inaccuracies in values recorded.

Commonly used displays for qualitative data

Pie charts, bar plots

Simpson’s paradox explanation

A third, confounding variable changes (or even reverses) the relationship between two other QUALITATIVE variables. It comes from an imbalance in the distribution of the categories of the third variable with respect to the first two.
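
A minimal sketch of the paradox, using invented counts (the group names "small"/"large" and treatments "A"/"B" are hypothetical, chosen only for illustration). Treatment A does better inside each category of the third variable, yet B looks better once the categories are pooled, because A is mostly applied to the harder category.

```python
# Hypothetical (successes, total) counts, invented only to illustrate the paradox.
counts = {
    "small": {"A": (81, 87), "B": (234, 270)},
    "large": {"A": (192, 263), "B": (55, 80)},
}

def success_rate(successes, total):
    return successes / total

# Success rate of each treatment inside each category of the third variable.
within = {
    group: {t: success_rate(*cell) for t, cell in cells.items()}
    for group, cells in counts.items()
}

# Success rate of each treatment after pooling the categories together.
overall = {}
for t in ("A", "B"):
    successes = sum(counts[g][t][0] for g in counts)
    total = sum(counts[g][t][1] for g in counts)
    overall[t] = success_rate(successes, total)

for group, rates in within.items():
    print(group, {t: round(r, 3) for t, r in rates.items()})
print("overall", {t: round(r, 3) for t, r in overall.items()})
```

The imbalance is visible in the totals: A has 263 of its 350 cases in "large", while B has 270 of its 350 cases in "small".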

Graphical displays for quantitative data

Boxplot, dotplot, histogram

Adv/Disadv of the dotplot

A: Shows every data point + easy to interpret. D: Gets messy quickly with a lot of data

Adv/Disadv of histograms

A: Easy to pick up on all aspects of quantitative data (centre, spread, etc.) + made by most statistical packages. D: Different bin widths can give different interpretations

Mode

Number (or centre of the bin) that occurs most often (that has the most observations)

Frequency vs Percentage

Frequency = number of observations in the bin. Percentage = share of the total number of observations that the bin represents

Different possibilities for the mode

Unimodal, Multimodal or no mode
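
A small sketch of the three cases for raw (unbinned) values. The function name `describe_modes` is hypothetical, and it assumes one common convention: "no mode" means that several distinct values all occur equally often.

```python
from collections import Counter

def describe_modes(values):
    """Return (label, modes) under the convention assumed in the lead-in."""
    counts = Counter(values)
    top = max(counts.values())  # highest frequency observed
    modes = sorted(v for v, c in counts.items() if c == top)
    if len(counts) > 1 and len(modes) == len(counts):
        return "no mode", []    # every distinct value is tied
    if len(modes) == 1:
        return "unimodal", modes
    return "multimodal", modes

print(describe_modes([1, 2, 2, 3]))
print(describe_modes([1, 1, 2, 3, 3]))
print(describe_modes([1, 2, 3]))
```

For binned data the same idea applies to bin counts instead of raw values, so the answer can change with the bin width.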

Adv/Disadv. of mean

A: Good for estimating the population mean + good inferential properties. D: Influenced by outliers and skewed data

Adv/disadv. of median

A: Easy to interpret + not influenced by outliers. D: Bad inferential properties + longer to calculate
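
A quick illustration of the outlier point for the mean and the median, using a made-up data set in which one value is replaced by an outlier. Only the standard `statistics` module is used.

```python
import statistics

clean = [2, 3, 4, 5, 6]
with_outlier = [2, 3, 4, 5, 60]  # hypothetical data: the largest value becomes an outlier

# How far each measure of centre moves when the outlier is introduced.
mean_shift = statistics.mean(with_outlier) - statistics.mean(clean)
median_shift = statistics.median(with_outlier) - statistics.median(clean)

print("mean shift:", mean_shift)
print("median shift:", median_shift)
```

The mean is pulled toward the outlier, while the median stays at the middle value.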

Adv/disadv of mode

A: Shows where the data are most concentrated + useful for bimodal data. D: Depends on the bin width (class definition)

sample vs pop mean symbols

x̄ (x bar) for the sample mean and μ for the population mean

Sample var/std vs Pop. var/std

s² or s for the sample vs σ² or σ for the population

Different measures of centre

Mean, median, mode

Different measures of spread

Range, IQR, variance or standard deviation

Something particular in variance

squared units

% of observations with Z-scores within ±1, ±2, ±3 under the empirical rule

68, 95, 99.7
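
These percentages can be checked against a normal distribution, which is the bell-shaped case the empirical rule describes. A minimal sketch using `statistics.NormalDist` from the Python standard library; the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution with Z-score between -k and +k."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

shares = {k: share_within(k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"within {k} standard deviation(s): {100 * share:.1f}%")
```

For data that are not roughly bell-shaped, the actual percentages can differ from these values.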

What are the three quartiles on a boxplot

Q1 is 25th percentile, Q2 is median, Q3 is 75th percentile

Where whiskers end on boxplot

At the most extreme data point within 1.5 IQRs of the EDGE of the box, in either direction

Extreme outliers in boxplots

Anything beyond 3 IQRs of the EDGE of the box
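
The whisker and outlier rules can be sketched as below. The function name `boxplot_summary` and the data are hypothetical, and the quartiles use the "inclusive" method of `statistics.quantiles`, which is an assumption: textbooks and software use several slightly different quartile definitions.

```python
import statistics

def boxplot_summary(values):
    # Quartile method is an assumption; other conventions give slightly different Q1/Q3.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Fences are measured from the EDGES of the box (Q1 and Q3), not from the median.
    inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outer_low, outer_high = q1 - 3 * iqr, q3 + 3 * iqr
    inside = [v for v in values if inner_low <= v <= inner_high]
    return {
        "quartiles": (q1, q2, q3),
        # Whiskers stop at the most extreme data points inside the 1.5 IQR fences.
        "whiskers": (min(inside), max(inside)),
        "outliers": sorted(v for v in values if v < inner_low or v > inner_high),
        "extreme_outliers": sorted(v for v in values if v < outer_low or v > outer_high),
    }

summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(summary)
```

In this made-up sample the upper whisker stops at 9, and 30 lies beyond 3 IQRs of the upper edge of the box, so it counts as an extreme outlier.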

Adv/disadv. of variance

A : Good inferential properties D : Influenced by outliers and skewed data

Adv/disadv. of IQR

A: Not influenced by outliers. D: Bad inferential properties and longer to calculate

Adv/disadv. of range

A: Easy to calculate. D: Highly influenced by outliers and bad inferential properties

Something particular we can see with boxplots

Possible to have no whisker on one side.

Alternative formula for variance

s² = (Σ xᵢ² − n·x̄²) / (n − 1), i.e. 1/(n − 1) times (the sum of the squared xᵢ minus n times the squared mean)
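
A short check that this shortcut agrees with the usual sample variance. The function name `variance_shortcut` and the sample are hypothetical; `statistics.variance` is the standard-library sample variance (divisor n − 1).

```python
import math
import statistics

def variance_shortcut(values):
    n = len(values)
    mean = sum(values) / n
    sum_of_squares = sum(x * x for x in values)
    # (sum of squared values - n * squared mean) / (n - 1)
    return (sum_of_squares - n * mean ** 2) / (n - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
shortcut = variance_shortcut(data)
definition = statistics.variance(data)
print(shortcut, definition, math.isclose(shortcut, definition))
```

Here the sum of squares is 232, n times the squared mean is 8 × 25 = 200, and the variance is 32/7.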

Sensitivity (of a test) as a probability

Pr (positive | disease)

Specificity (of a test) as a probability

Pr (negative | not disease)

Probability of false positive

Pr (positive | not disease)
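
The three conditional probabilities above can be read off a 2x2 table of test result versus true disease status. A minimal sketch with invented counts (all variable names and numbers are hypothetical):

```python
# Hypothetical counts from a 2x2 table of test result vs true disease status.
true_positive, false_negative = 90, 10    # people with the disease
false_positive, true_negative = 50, 950   # people without the disease

# Pr (positive | disease)
sensitivity = true_positive / (true_positive + false_negative)
# Pr (negative | not disease)
specificity = true_negative / (true_negative + false_positive)
# Pr (positive | not disease)
false_positive_rate = false_positive / (false_positive + true_negative)

print(sensitivity, specificity, false_positive_rate)
```

The false positive probability and the specificity condition on the same group (no disease), so they add up to 1.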

Multiplicative rule

Pr (A and B) = Pr (A | B) * Pr (B), i.e. the probability of the intersection equals the conditional probability times Pr (B)
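
A worked example of the rule, assuming a standard 52-card deck with two cards drawn without replacement (the events and variable names are chosen for illustration). B is "the first card is an ace" and A is "the second card is an ace"; the result is checked by listing every ordered pair of cards.

```python
from fractions import Fraction
from itertools import permutations

# B = first card is an ace, A = second card is an ace (drawn without replacement).
pr_b = Fraction(4, 52)
pr_a_given_b = Fraction(3, 51)
pr_a_and_b = pr_a_given_b * pr_b  # multiplicative rule

# Brute-force check: list every ordered pair of distinct cards.
deck = ["ace"] * 4 + ["other"] * 48
pairs = list(permutations(range(52), 2))
both_aces = sum(1 for i, j in pairs if deck[i] == "ace" and deck[j] == "ace")
enumerated = Fraction(both_aces, len(pairs))

print(pr_a_and_b, enumerated)
```

Both approaches give 1/221.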

Limitation to using counting rules

The experiment must have equally likely outcomes (each outcome has the same probability)
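
A small sketch of why this matters, assuming two fair six-sided dice. Counting works on the 36 ordered rolls because they are equally likely; it fails if the 11 possible sums are treated as the outcomes, because the sums are not equally likely.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# 36 ordered outcomes, equally likely under the fair-dice assumption.
rolls = list(product(range(1, 7), repeat=2))
pr_sum_7 = Fraction(sum(1 for a, b in rolls if a + b == 7), len(rolls))

# Wrong use of counting: the 11 possible sums are NOT equally likely.
sums = Counter(a + b for a, b in rolls)
naive = Fraction(1, len(sums))

print("correct:", pr_sum_7, "naive:", naive)
```

Counting favourable rolls gives 6/36 = 1/6 for a sum of 7, whereas the naive count over sums would give 1/11.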

Midterm Flashcards

(37 cards)