Lec 3 Flashcards

1
Q

Sample space
Event
Probability
Complement

A

Sample space: set of all possible outcomes of an experiment

Event: any set of outcomes (from sample space) of interest (Eg: a red card, card of diamonds)

Probability (of an event): the relative frequency of the set of outcomes (comprising the event) over an indefinitely large (infinite) # of repetitions of the experiment (trials)
Eg: In the long run, what proportion of the time would you expect a diamond? 0.25

Complement: of an event A, the set of outcomes in the sample space that are not in the event A (aka A’, Ā, Aᶜ)
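
A minimal Python sketch of these four definitions, assuming a standard 52-card deck in which every card is equally likely (so the long-run relative frequency equals a simple count ratio). The names `deck`, `prob`, and `diamonds` are illustrative and not from the lecture.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 52 (rank, suit) outcomes of drawing one card.
ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = set(product(ranks, suits))

def prob(event):
    """Probability of an event (a set of outcomes), assuming equally likely cards."""
    return Fraction(len(event), len(deck))

# Event: the card drawn is a diamond.
diamonds = {card for card in deck if card[1] == "diamonds"}

# Complement: every outcome in the sample space that is not in the event.
not_diamonds = deck - diamonds

print(prob(diamonds), prob(not_diamonds))
```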

2
Q

Union of two events
Union of more than 2 events
Intersection of 2 events
Intersection of more than 2 events
Mutually exclusive events

A

Union of two events: it means A or B, ie the outcomes that belong to A, to B, or to both (shown as A U B)

Union of more than 2 events: A1 or A2 or A3 … or Ak

Intersection of 2 events: it means A and B, or the outcomes that belong to both A and B
(shown as A ∩ B, AB, A and B)

Intersection of more than 2 events: A1 and A2 and A3 … and Ak

Mutually exclusive events: A and B cannot occur at the same time, ie P(AB) = 0

3
Q

Exhaustive
Partition
Independence
Independence formula

A

Exhaustive: at least one of them must occur (Eg when you roll a die, one of {1,2,3,4,5,6} must occur)

Partition: a set of events that are both mutually exclusive and exhaustive, so they split the sample space into non-overlapping pieces (image)

Independence: knowing one event happened doesn’t change the probability of the other event

Independence formula: P(AB) = P(A)P(B)

4
Q

Formula for A U B

Formula for A U B, if A and B are mutually exclusive

Are mutually exclusive events independent?

What happens when we sum up events that are a partition of the sample space?

Formula for complement

A

Formula for A U B: P(A U B) = P(A) + P(B) – P(AB)

Formula for A U B, if A and B are mutually exclusive: P(AUB) = P(A) + P(B)

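Are mutually exclusive events independent? No (if A and B are mutually exclusive then P(AB) = 0, which cannot equal P(A)P(B) when both events have nonzero probability)

What happens when we sum up events that are a partition of the sample space?
P(A1) + P(A2) + … + P(Ak) = P(S) = 1

Formula for complement: P(A’) = 1 – P(A)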
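
The rules on this card can be checked by counting outcomes. The sketch below assumes one roll of a fair die; the events `A`, `B`, and `C` are made up for illustration.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}  # sample space for one roll of a fair die

def prob(event):
    """Probability of an event, assuming all six faces are equally likely."""
    return Fraction(len(event), len(S))

A = {2, 4, 6}      # even number
B = {1, 2, 3, 4}   # at most 4
C = {1, 3, 5}      # odd number (mutually exclusive with A)

# General addition rule: P(A U B) = P(A) + P(B) - P(AB)
union_ab = prob(A) + prob(B) - prob(A & B)

# Mutually exclusive case: P(A U C) = P(A) + P(C), because P(AC) = 0
union_ac = prob(A) + prob(C)

# A and C partition S, so their probabilities sum to 1,
# and C is the complement of A: P(A') = 1 - P(A)
complement_a = 1 - prob(A)

# Independence check: A and B are independent, A and C are not
a_b_independent = prob(A & B) == prob(A) * prob(B)
a_c_independent = prob(A & C) == prob(A) * prob(C)
```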

5
Q

Formulas:
The conditional probability of B given A

The conditional probability of B given A if A and B are independent

The conditional probability of A given B

General formula for P(AB) or 2 ways to express P(AB)

Formula to infer independence

A

Formulas:
The conditional probability of B given A: P(B|A) = P(AB)/P(A)

The conditional probability of B given A if A and B are independent: P(B|A) = P(A) P(B)/ P(A) = P(B)

The conditional probability of A given B: P(A|B) = P(AB)/P(B)

General formula for P(AB) or 2 ways to express P(AB): P(AB) = P(A) P(B|A) = P(A|B) P(B)

Formula to infer independence: A and B are independent if (and only if) P(AB) = P(A) x P(B)
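
A small sketch of the conditional probability formulas, again assuming a standard 52-card deck with equally likely cards. The helper `cond_prob` and the events `red`, `hearts`, and `kings` are illustrative names only.

```python
from fractions import Fraction
from itertools import product

ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = set(product(ranks, suits))

def prob(event):
    return Fraction(len(event), len(deck))

def cond_prob(b, a):
    """P(B|A) = P(AB) / P(A); requires P(A) > 0."""
    return prob(a & b) / prob(a)

red = {c for c in deck if c[1] in ("diamonds", "hearts")}
hearts = {c for c in deck if c[1] == "hearts"}
kings = {c for c in deck if c[0] == "K"}

# P(heart | red): knowing the card is red changes the chance of a heart.
p_heart_given_red = cond_prob(hearts, red)

# Two ways to write P(AB): P(A) P(B|A) and P(A|B) P(B).
way_1 = prob(red) * cond_prob(hearts, red)
way_2 = cond_prob(red, hearts) * prob(hearts)

# Kings and hearts are independent: P(king | heart) equals P(king).
p_king_given_heart = cond_prob(kings, hearts)
```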

6
Q

Bayes Theorem

How many formulas

Sensitivity
Specificity

High sensitivity
High specificity

SPIN
SNOUT

A

Bayes’ theorem: helps us determine P(A|B) when we know P(B|A), P(A), and P(B)

Bayes’ theorem formula: there are 3

Sensitivity: if you have the disease, what is the prob of +ve test

Specificity: if you do not have the disease, what is the prob of -ve test

high sensitivity: if you have disease, you have high prob of a +ve result
If you have a -ve result on a test with HIGH sensitivity, there’s a high chance you DON’T have the disease (good for ruling out)

High specificity: if you have no disease, you have a high chance of a -ve result
So, a +ve result means that you are very likely to have the disease

(NOTE: In Epi, we want to know, given the test result, do we have covid/disease? We can determine this with SPIN and SNOUT)

SPIN: SPecific test, Positive result = rule IN the disease
SNOUT: SeNsitive test, Negative result = rule OUT the disease
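
A hedged sketch of how Bayes’ theorem turns sensitivity and specificity into "given the test result, do I have the disease?". The prevalence, sensitivity, and specificity values are made-up numbers for illustration, and the function names are not from the lecture.

```python
from fractions import Fraction

def p_disease_given_positive(prevalence, sensitivity, specificity):
    """Bayes' theorem: P(D | +) = P(+ | D) P(D) / P(+)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

def p_healthy_given_negative(prevalence, sensitivity, specificity):
    """Bayes' theorem: P(no D | -) = P(- | no D) P(no D) / P(-)."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)

# Hypothetical test: 1% of people have the disease,
# sensitivity 95%, specificity 90%.
prevalence = Fraction(1, 100)
sensitivity = Fraction(95, 100)
specificity = Fraction(90, 100)

ppv = p_disease_given_positive(prevalence, sensitivity, specificity)
npv = p_healthy_given_negative(prevalence, sensitivity, specificity)
print(float(ppv), float(npv))
```

The denominator in each function is P(B) written out with the partition {disease, no disease}, so the answer depends on P(A) (how common the disease is) as well as on sensitivity and specificity.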

7
Q

Random variable

Capital letters (X,Y)
Small letters (x,y)

Discrete random variable
Examples

Continuous random variable
Examples

A

Random variable: it assigns a real number to each point (outcome) in the sample space

Capital letters (X,Y) = random variables
Small letters (x,y) = actual values

Discrete random variable: its possible values are countable (you can list them)
Eg: # of heads from 4 coin tosses, # of particles from a radioactive source in 1min, # hospital admissions

Continuous random variable: not countable, usually measured (eg height, weight)

8
Q

Probability mass function

Formula

Variable type

2 conditions for pmf

E

A

Probability mass function (pmf)
In a sample space, there is a discrete random variable “X”. The pmf gives the probability that X is EQUAL to a value “x”
(Denoted by small letter f(x))

Formula: f(x) = P(X = x)

Variable type: Only for discrete variables

2 conditions of pmf: each pmf value (a probability) is b/w 0 and 1; summation of all probabilities in the sample space = 1
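
As a sketch, the pmf of X = # of heads from 4 tosses of a fair coin can be built by listing the sample space and counting. The variable names are illustrative; the fair coin is an assumption.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space: all 16 equally likely sequences of 4 tosses.
outcomes = list(product("HT", repeat=4))

# The random variable X maps each outcome to its number of heads.
counts = Counter(seq.count("H") for seq in outcomes)

# pmf: f(x) = P(X = x)
pmf = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

# The 2 conditions of a pmf.
each_between_0_and_1 = all(0 <= f <= 1 for f in pmf.values())
sums_to_1 = sum(pmf.values()) == 1
```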

9
Q

The cumulative distribution function (CDF)

Formula

A

Cumulative distribution function: the probability that a random variable “X” has a value LESS THAN or EQUAL to “x”

Denoted by capital letter F(x)

Formula: F(x) = P (X ≤ x)

Variable type: discrete and continuous random variables
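
For a discrete variable, F(x) is just the running total of the pmf. A minimal sketch, assuming the pmf for # of heads in 4 fair coin tosses (the function name `cdf` is illustrative):

```python
from fractions import Fraction

# pmf of X = number of heads in 4 tosses of a fair coin.
pmf = {
    0: Fraction(1, 16),
    1: Fraction(4, 16),
    2: Fraction(6, 16),
    3: Fraction(4, 16),
    4: Fraction(1, 16),
}

def cdf(x):
    """F(x) = P(X <= x): add up f(value) for every value not exceeding x."""
    return sum((f for value, f in pmf.items() if value <= x), Fraction(0))

print(cdf(2))  # probability of at most 2 heads
```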

10
Q

How to use pmf to find the expected value or mean for random variable X

How to use pmf to find the VARIANCE for random variable X (2 formulas)

A

Use pmf to find the expected value or mean for random variable X: multiply each value x by its probability f(x) and add them up, µ = E(X) = Σ x f(x)
(Eg: What is the expected value of a roll of a fair die?) µ = E(X) = (1)1/6 + (2)1/6 + (3)1/6 + (4)1/6 + (5)1/6 + (6)1/6 = 3.5
It’s the avg of #s 1 to 6

Use pmf to find the VARIANCE for random variable X (2 formulas): σ² = Var(X) = Σ (x − µ)² f(x), or equivalently σ² = E(X²) − µ² where E(X²) = Σ x² f(x)
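
A sketch of both calculations for the fair-die example. It assumes the two variance formulas meant on this card are the definition, Σ (x − µ)² f(x), and the shortcut, E(X²) − µ²; the variable names are illustrative.

```python
from fractions import Fraction

# pmf of one roll of a fair die: f(x) = 1/6 for x = 1..6
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# Mean: weight each value by its probability and add up.
mean = sum(x * f for x, f in pmf.items())

# Variance, formula 1 (definition): expected squared distance from the mean.
var_definition = sum((x - mean) ** 2 * f for x, f in pmf.items())

# Variance, formula 2 (shortcut): E(X^2) minus the squared mean.
mean_of_squares = sum(x ** 2 * f for x, f in pmf.items())
var_shortcut = mean_of_squares - mean ** 2

print(mean, var_definition, var_shortcut)
```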

11
Q

Factorial

Combination
formula

A

Factorial: n! = n × (n − 1) × … × 2 × 1 (and 0! = 1)

Combination: The number of ways to choose “x” items from a set of n items, n! / (x! (n − x)!)
(eg how many ways to choose 12 jurors from a pool of 20 people)
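
The jury example can be worked with Python's standard `math` module. The sketch assumes the usual combination formula n! / (x! (n − x)!), where order of selection does not matter.

```python
from math import comb, factorial

n, x = 20, 12  # pool of 20 people, choose 12

# Directly from the factorial formula n! / (x! (n - x)!)
ways_from_formula = factorial(n) // (factorial(x) * factorial(n - x))

# Same number using the built-in helper.
ways_builtin = comb(n, x)

print(ways_builtin)
```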

12
Q

Define binomial distribution

Variable type

Notation and explanation for binomial distribution

Formula

pmf for binomial distribution formula

The pattern on graphs - if p increases, what happens to the distribution

mean of binomial distribution

variance of binomial distribution

A

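Binomial distribution: The distribution of the # of “SUCCESS” outcomes when an experiment with only “SUCCESS” or “FAILURE” outcomes is repeated for “n” independent trials, each with the same probability of success

Variables: discrete

Notation: X ~ Bin(n,p)
The random variable “X” follows a binomial distribution with “n” experimental trials and probability of success “p” on each trial

pmf for binomial distribution formula: f(x) = P(X = x) = C(n, x) p^x (1 − p)^(n − x), for x = 0, 1, …, n, where C(n, x) is the # of combinations (image)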

Graphs: As probability goes up (p = 0.5 -> 0.75), it becomes more skewed to the left (the tail is on the left side)

mean of binomial distribution: E(X) = µ = np

variance of binomial distribution: σ² = np (1 – p)
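
A sketch that checks the mean and variance formulas against the pmf, assuming the standard binomial pmf C(n, x) p^x (1 − p)^(n − x). The values n = 10 and p = 3/4 are arbitrary illustration choices.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, Fraction(3, 4)
pmf = {x: binomial_pmf(x, n, p) for x in range(n + 1)}

# Mean and variance computed straight from the pmf.
mean = sum(x * f for x, f in pmf.items())
variance = sum((x - mean) ** 2 * f for x, f in pmf.items())

print(mean, variance)  # compare with np and np(1 - p)
```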

13
Q

Define poisson distribution

Variable type

pmf for random variable X formula

mean for poisson distribution

variance for poisson distribution

A

Poisson distribution: gives the probability of a given # of events happening over a specified period of time (or other fixed interval), when events occur independently at a constant average rate λ

Variables: discrete

pmf for random variable X formula: f(x) = P(X = x) = e^(−λ) λ^x / x!, for x = 0, 1, 2, …
e ≈ 2.72

mean for poisson distribution: E(X) = µ = λ

variance for poisson distribution: σ² = λ
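
A sketch checking that the Poisson mean and variance both come out to λ, assuming the standard pmf e^(−λ) λ^x / x!. The rate λ = 3 is made up, and the infinite sum is cut off at x = 59, where the leftover probability is negligible.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with mean lam."""
    return exp(-lam) * lam ** x / factorial(x)

lam = 3.0
xs = range(60)  # truncation point; terms beyond this are vanishingly small

total = sum(poisson_pmf(x, lam) for x in xs)
mean = sum(x * poisson_pmf(x, lam) for x in xs)
variance = sum((x - mean) ** 2 * poisson_pmf(x, lam) for x in xs)

print(total, mean, variance)
```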

14
Q

Probability density function (pdf)

variable type:

pdf formula: what it does

A

Probability density function (pdf): the area under the curve b/w 2 values (eg a and b) on the horizontal axis is equal to the probability that “X” (the random variable) is b/w those 2 values

Variable type: continuous

pdf formula: P(a ≤ X ≤ b) = ∫ f(x) dx from a to b, ie integrating the pdf f(x) computes the area under the curve

mean for pdf …
variance for pdf …
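
A numerical sketch of "probability = area under the curve". The density f(x) = 2x on [0, 1] is a made-up example, not one from the lecture, and the mean and variance lines assume the usual continuous analogues of the pmf formulas (an integral in place of the sum).

```python
def integrate(g, a, b, steps=10_000):
    """Midpoint-rule approximation of the area under g between a and b."""
    width = (b - a) / steps
    return sum(g(a + (i + 0.5) * width) for i in range(steps)) * width

def pdf(x):
    """Hypothetical density: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2 * x if 0 <= x <= 1 else 0.0

# P(0.2 <= X <= 0.5) is the area under the pdf between 0.2 and 0.5.
area = integrate(pdf, 0.2, 0.5)

# The total area under a pdf is 1.
total = integrate(pdf, 0.0, 1.0)

# Assumed formulas: mean = integral of x f(x), variance = integral of (x - mean)^2 f(x).
mean = integrate(lambda x: x * pdf(x), 0.0, 1.0)
variance = integrate(lambda x: (x - mean) ** 2 * pdf(x), 0.0, 1.0)
```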

15
Q

Explain X ~ N (µ, σ²)

variable type for normal distribution

Notation for Standard normal distribution

Process to transform NORMAL distribution to STANDARD NORMAL distribution

A

X ~ N (µ, σ²): “X” (random variable) follows a normal or Gaussian distribution that has a mean µ and variance σ²

variable type: continuous

pdf for normal distribution formula …

Standard normal distribution: Z ~ N(0,1), ie a normal distribution with mean 0 and variance 1

pdf for standard normal distribution formula …

Process to transform NORMAL distribution to STANDARD NORMAL distribution: subtract the mean and divide by the standard deviation, Z = (X − µ)/σ (image)
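
A sketch of standardizing with Python's `statistics.NormalDist`, assuming the usual transformation Z = (X − µ)/σ. The numbers (µ = 100, σ = 15, x = 130) are made up. Note that `NormalDist` takes the standard deviation σ as its second argument, not the variance σ².

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0   # hypothetical X ~ N(100, 15^2)
x = 130.0

# Standardize: how many standard deviations is x above the mean?
z = (x - mu) / sigma

# P(X <= x) computed two ways: on the standard normal scale and directly.
p_via_z = NormalDist(0, 1).cdf(z)
p_direct = NormalDist(mu, sigma).cdf(x)

print(z, p_via_z, p_direct)
```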

16
Q

For W, the mean and variance after a Location shift ONLY

For W, the mean and variance after a Scale shift ONLY

For W, the mean and variance after a Scale AND location shift

A

Transformations:
Random variable = X
mean = µ
variance = σ²
constants = a, b

For W, the mean and variance after a Location shift ONLY
New variable W = X + b
mean: µ + b
variance: σ²

For W, the mean and variance after a Scale shift ONLY
New variable W = aX
mean: aµ
variance: a²σ²

For W, the mean and variance after a Scale AND location shift
New variable W = aX + b
mean: aµ + b
variance: a²σ²
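
The three rules can be checked exactly on a small example. The sketch assumes X is one roll of a fair die, with made-up constants a = 3 and b = 10; the helper names are illustrative.

```python
from fractions import Fraction

def mean(pmf):
    return sum(value * f for value, f in pmf.items())

def variance(pmf):
    mu = mean(pmf)
    return sum((value - mu) ** 2 * f for value, f in pmf.items())

# X = one roll of a fair die
pmf_x = {x: Fraction(1, 6) for x in range(1, 7)}
a, b = 3, 10

# Build the pmf of each transformed variable W by relabelling the values of X.
pmf_shift = {x + b: f for x, f in pmf_x.items()}      # W = X + b
pmf_scale = {a * x: f for x, f in pmf_x.items()}      # W = aX
pmf_both = {a * x + b: f for x, f in pmf_x.items()}   # W = aX + b

mu_x, var_x = mean(pmf_x), variance(pmf_x)
```

A location shift moves the mean but leaves the spread alone, while a scale change multiplies the variance by a² because variance is measured in squared units.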

17
Q

Covariance

Variance

Notation for Covariance

If X and Y are independent, then Cov (X,Y) =?

Is the converse true? Why?

A

Covariance measures the relationship between X and Y (2 random variables), specifically how much they vary together

Variance: measures the spread of a data set around its mean

Notation: Cov(X, Y) = E[(X − µX)(Y − µY)]
Denoted as an expectation (E)
X and Y are random variables with means µX and µY respectively

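If X and Y are independent, then Cov (X,Y) = 0
IMPORTANT: Cov (X,Y) = 0 does not imply X and Y are independent
IOW: the converse is not true, because covariance only picks up linear association (X and Y can be related non-linearly and still have Cov = 0)
Covariance = how the 2 vary together (direction of the linear relationship; its size depends on the units of X and Y)
Correlation = covariance rescaled by the two standard deviations, so it also shows how strongly the two are related
Covariance alone does not tell you the strength of the correlation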
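
A small counterexample sketch for "Cov = 0 does not imply independence". It assumes X is equally likely to be −1, 0, or 1 and sets Y = X², so Y is completely determined by X; this example is illustrative and not from the lecture.

```python
from fractions import Fraction

# Joint pmf of (X, Y) where X is uniform on {-1, 0, 1} and Y = X^2.
joint = {
    (-1, 1): Fraction(1, 3),
    (0, 0): Fraction(1, 3),
    (1, 1): Fraction(1, 3),
}

mean_x = sum(x * p for (x, y), p in joint.items())
mean_y = sum(y * p for (x, y), p in joint.items())

# Cov(X, Y) = E[(X - mean_x)(Y - mean_y)]
cov = sum((x - mean_x) * (y - mean_y) * p for (x, y), p in joint.items())

# Independence would need P(X=0, Y=0) = P(X=0) P(Y=0).
p_x0 = sum(p for (x, y), p in joint.items() if x == 0)
p_y0 = sum(p for (x, y), p in joint.items() if y == 0)
p_x0_y0 = joint[(0, 0)]
```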

18
Q

the mean and variance when we sum RANDOM variables

the variance when we sum RANDOM variables that are independent

the mean and variance when we sum NORMAL, BINOMIAL, POISSON variables that are independent

A

Sum of random variables
Random variable X
mean = µx
variance = σx²

Random variable Y
mean = µy
variance = σy²

Let W = X + Y
Now, W has
mean = µx + µy
variance = σx² + σy² + 2Cov(X,Y)

If X and Y are independent
variance of W = σx² + σy²

Special Case – Normal variable
Given
X ~ N(µx, σx²)
Y ~ N(µy, σy²)
X and Y are independent

Let W = X + Y
Then W ~ N(µx + µy, σx² + σy²)
IOW: normal + normal = normal
It is also the case that sums of independent binomials (with the same p) or independent Poissons are binomial or Poisson respectively
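
A sketch checking the variance-of-a-sum rule on two made-up joint distributions: two independent fair dice, and the fully dependent case Y = X. The helper `summary` is an illustrative name.

```python
from fractions import Fraction
from itertools import product

def summary(joint):
    """Return Var(X), Var(Y), Cov(X, Y) and Var(X + Y) from a joint pmf."""
    ex = sum(x * p for (x, y), p in joint.items())
    ey = sum(y * p for (x, y), p in joint.items())
    var_x = sum((x - ex) ** 2 * p for (x, y), p in joint.items())
    var_y = sum((y - ey) ** 2 * p for (x, y), p in joint.items())
    cov = sum((x - ex) * (y - ey) * p for (x, y), p in joint.items())
    # Variance of W = X + Y computed directly, without using the rule.
    var_w = sum((x + y - ex - ey) ** 2 * p for (x, y), p in joint.items())
    return var_x, var_y, cov, var_w

faces = range(1, 7)

# Two independent dice: every pair has probability 1/36.
independent = {(x, y): Fraction(1, 36) for x, y in product(faces, faces)}

# Fully dependent: Y is the same roll as X.
dependent = {(x, x): Fraction(1, 6) for x in faces}

ind = summary(independent)
dep = summary(dependent)
```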

19
Q

2 types of discrete probability distributions

2 types of continuous prob distributions

probability distribution for discrete and continuous

A

2 types of discrete probability distributions: binomial and Poisson; look at probability mass function (pmf)

2 types of continuous prob distributions: normal/Gaussian distribution, standard normal distribution; look at probability density function (pdf)

probability distribution for discrete and continuous: cumulative distribution function

20
Q
A
  • 2 types of sampling: nonprob, and prob
    o Nonprobability sampling:
      ▪ Involves convenience sampling and voluntary sampling
      ▪ Both are susceptible to selection bias
    o Prob sampling: allows us to get a sample that is representative, so the results produce valid inferences
      ▪ It uses random sampling techniques: simple random sampling, systematic sampling, stratified random sampling, cluster sampling
  • Simple random sample
    o Each subject in the pop has an equal chance of being selected
  • Systematic sample
    o Subjects from pop are selected starting from a random starting pt, and then at every fixed interval
    o The interval is determined by dividing the size of pop by desired size of sample
  • Stratified sample
    o A simple random sample is taken from each of a # of distinct strata of the pop
  • Cluster sampling
    o Used when natural gps exist in pop
    o Pop is divided into clusters, and a simple random sample of clusters is selected
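
The four probability sampling techniques can be sketched with Python's `random` module. Everything here is a made-up illustration: a population of 100 numbered subjects, two strata, and ten clusters of ten. Cluster sampling is shown in its one-stage form (whole clusters drawn at random), which is an assumption about what the notes intend.

```python
import random

rng = random.Random(0)             # fixed seed so the sketch is repeatable
population = list(range(1, 101))   # 100 subjects, labelled 1..100
n = 10                             # desired sample size

# Simple random sample: every subject has an equal chance of selection.
srs = rng.sample(population, n)

# Systematic sample: interval k = pop size / sample size, random start.
k = len(population) // n
start = rng.randrange(k)
systematic = population[start::k]

# Stratified sample: a simple random sample from each stratum.
strata = {"stratum_1": population[:60], "stratum_2": population[60:]}
stratified = [s for group in strata.values() for s in rng.sample(group, 5)]

# Cluster sample: randomly pick whole clusters, then take everyone in them.
clusters = [population[i:i + 10] for i in range(0, 100, 10)]
chosen_clusters = rng.sample(clusters, 2)
cluster_sample = [s for cluster in chosen_clusters for s in cluster]
```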
21
Q
A
  • Eqn for z statistic:
    o Assumes the value of pop standard deviation σ is known
  • When the # of trials n is large and p (prob of success for each trial) is near 0.5
    o Binomial distribution is approx. equal to normal distribution
  • Prob distributions using sample stats are sampling distributions
  • Sampling distributions: prob distribution of a stat for all possible samples of a given size from a pop
  • Central limit theorem: the distribution of sample means is approx. a normal distribution when the sample size is large
  • t-distribution also resembles normal distribution
    o Variability in sampling distribution of t depends on the sample size n
  • 2 other cont prob distributions: chi-sqr, F distributions
  • X
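
Two of the points above can be sketched numerically. The equation for the z statistic is missing from this card, so the code assumes the standard one-sample form z = (x̄ − µ)/(σ/√n); the function name and all numbers are made up for illustration. The second part compares an exact binomial probability with its normal approximation for large n and p = 0.5, using a continuity correction of 0.5.

```python
from math import comb, sqrt
from statistics import NormalDist

def z_statistic(sample_mean, pop_mean, pop_sd, n):
    """Assumed form: z = (x_bar - mu) / (sigma / sqrt(n)), with sigma known."""
    return (sample_mean - pop_mean) / (pop_sd / sqrt(n))

# Hypothetical sample: mean 103 from n = 25, population mean 100 and sd 15.
z = z_statistic(103, 100, 15, 25)

# Normal approximation to Bin(n = 100, p = 0.5): P(X <= 55)
n, p = 100, 0.5
exact = sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(56))
approx = NormalDist(n * p, sqrt(n * p * (1 - p))).cdf(55.5)

print(z, exact, approx)
```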