Probability and Stats Flashcards

1
Q

Law of Large Numbers

A

As the sample size grows, the sample mean gets closer to the population mean
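A minimal stdlib simulation of this convergence (the sample sizes and seed below are arbitrary choices):

```python
import random
import statistics

random.seed(0)

# Population: Uniform(0, 1), whose true mean is 0.5.
POP_MEAN = 0.5

# Track how far the sample mean sits from the population mean
# as the sample size grows.
errors = {}
for n in (10, 1_000, 100_000):
    sample = [random.random() for _ in range(n)]
    errors[n] = abs(statistics.mean(sample) - POP_MEAN)
```

For large n the error all but vanishes, which is the LLN in action.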

2
Q

Bayes Rule

A

P(A|B) = P(B|A) P(A)/P(B)
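A worked instance of the rule with hypothetical disease-screening numbers (all rates below are assumptions for illustration):

```python
# Assumed rates for a hypothetical screening test.
p_disease = 0.01             # P(A): prior prevalence
p_pos_given_disease = 0.99   # P(B|A): sensitivity
p_pos_given_healthy = 0.05   # false positive rate

# P(B) by the law of total probability.
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes rule: P(A|B) = P(B|A) * P(A) / P(B)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
```

The posterior comes out to about 0.17: even a 99%-sensitive test gives a low posterior when the prior is small.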

3
Q

Probability vs Likelihood

A

Probability: parameters fixed, ask about the data
e.g., P(X > 32 | mu, std_dev)
Likelihood: data fixed, ask how well candidate parameters/distributions explain the observation(s); used to find the best-fitting parameters
e.g., L(mu, std | X)
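The distinction in code: hold the data fixed and score candidate parameters by their (log-)likelihood. The data and the candidate grid below are made up; sigma is fixed at 1 for simplicity:

```python
import math

data = [4.1, 3.9, 4.0, 4.2, 3.8]  # fixed observations (made up)

def gaussian_log_likelihood(mu, sigma, xs):
    """log L(mu, sigma | xs) = sum of log N(x; mu, sigma)."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)
        for x in xs
    )

# Probability question: parameters fixed, ask about data, e.g. P(X > 32 | mu, sigma).
# Likelihood question: data fixed, ask about parameters:
candidates = [3.0, 3.5, 4.0, 4.5]
best_mu = max(candidates, key=lambda mu: gaussian_log_likelihood(mu, 1.0, data))
```

The sample mean is exactly 4.0, so the grid point 4.0 maximizes the likelihood.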

4
Q

Bayes Nets

A

DAG whose nodes are random variables; each variable is conditioned only on its parents (local conditional probability tables)
e.g., rain -> cricket -> traffic

5
Q

Bayes Error

A

The lowest achievable error rate for any classifier on a given problem (the error of the optimal "Bayes" classifier); it is irreducible because the class-conditional distributions overlap
e.g., if a spam and a non-spam email can have identical feature values, even the best possible model misclassifies some of them

6
Q

Markov Decision Process

A

A framework for sequential decision making: states S, actions A, transition probabilities P(s' | s, a), rewards R(s, a), and a discount factor gamma
Markov property: the next state depends only on the current state and action, not the full history
e.g., a robot navigating a grid world
7
Q

Hidden Markov Models

A

Hidden state (X) = a Markov process
Observable (Y) = depends only on the current hidden state X
Captures temporal relations between hidden states (transitions) and how those states emit observations (emissions)
e.g., language modeling: Y = word, X = part of speech
"Tall player fell"
Transitions (hidden Markov process): P(adj, noun, verb) = P(adj) P(noun | adj) P(verb | noun)
Emissions: P("Tall player fell" | adj, noun, verb) = P(Tall | adj) P(player | noun) P(fell | verb)
Joint = transitions x emissions

8
Q

Full joint distribution from Bayes Net

A

P(x1, x2, …, xn) = prod_i P(x_i | parents(x_i))
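The factorization made concrete on the toy rain -> cricket -> traffic chain from the Bayes Nets card (all CPT numbers below are assumed for illustration):

```python
from itertools import product

# Assumed CPTs for the chain rain -> cricket -> traffic.
p_rain = {True: 0.3, False: 0.7}
p_cricket_given_rain = {True: {True: 0.9, False: 0.1},
                        False: {True: 0.2, False: 0.8}}
p_traffic_given_cricket = {True: {True: 0.1, False: 0.9},
                           False: {True: 0.6, False: 0.4}}

def joint(rain, cricket, traffic):
    # P(r, c, t) = P(r) * P(c | r) * P(t | c):
    # each node conditioned on its parents only.
    return (p_rain[rain]
            * p_cricket_given_rain[rain][cricket]
            * p_traffic_given_cricket[cricket][traffic])

# Sanity check: the joint sums to 1 over all assignments.
total = sum(joint(r, c, t) for r, c, t in product([True, False], repeat=3))
```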

9
Q

Decoding in HMM

A

Find the most probable sequence of hidden states, given a sequence of observations
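This is what the Viterbi algorithm computes. A minimal sketch on a classic toy weather model (all probabilities are assumed for illustration):

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most probable hidden-state sequence given observations (log-space)."""
    # V[t][s]: log-prob of the best path ending in state s at time t.
    V = [{s: math.log(start_p[s] * emit_p[s][obs[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda p: V[t - 1][p] + math.log(trans_p[p][s]))
            V[t][s] = V[t - 1][prev] + math.log(trans_p[prev][s] * emit_p[s][obs[t]])
            back[t][s] = prev
    # Trace back from the best final state.
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]

# Toy model (numbers assumed): hidden weather, observed activity.
states = ["Rainy", "Sunny"]
start_p = {"Rainy": 0.6, "Sunny": 0.4}
trans_p = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
           "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit_p = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
          "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

decoded = viterbi(["walk", "shop", "clean"], states, start_p, trans_p, emit_p)
```

Dynamic programming keeps only the best path into each state at each step, so decoding is linear in the sequence length rather than exponential.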

10
Q

Confidence Interval

A

Collect many sample means, e.g., via bootstrapping (resampling with replacement)
Take the interval covering the middle 95% (a typical confidence level) of those sample means
That interval is the 95% CI
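The bootstrap percentile procedure above, as a stdlib sketch (the data, resample count, and seed are arbitrary choices):

```python
import random
import statistics

random.seed(1)

data = [12.1, 9.8, 11.3, 10.2, 10.9, 9.5, 11.8, 10.4]  # made-up sample

# Bootstrap: resample with replacement, record each resample's mean.
n_boot = 10_000
boot_means = sorted(
    statistics.mean(random.choices(data, k=len(data))) for _ in range(n_boot)
)

# 95% CI = interval covering the middle 95% of the bootstrap means.
lo, hi = boot_means[int(0.025 * n_boot)], boot_means[int(0.975 * n_boot)]
```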

11
Q

p-value

A

H0 (null) = no difference; H1 = there is a difference
p-value = probability of a result at least as extreme as the one observed, assuming H0 is true
If p < alpha, reject H0

alpha = acceptable False Positive Rate
i.e., the chance of claiming a difference when there is none in reality

Cons: widely misinterpreted; a p-value is not the probability that H0 is true
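A p-value made concrete via a permutation test (the two groups, resample count, and seed are made up for illustration):

```python
import random
import statistics

random.seed(42)

group_a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]                        # made-up data
group_b = [100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110]

observed = abs(statistics.mean(group_b) - statistics.mean(group_a))

# Under H0 the group labels are exchangeable: shuffle the labels and
# count how often a difference at least this extreme arises by chance.
pooled = group_a + group_b
extreme = 0
n_perm = 999
for _ in range(n_perm):
    random.shuffle(pooled)
    a, b = pooled[:len(group_a)], pooled[len(group_a):]
    if abs(statistics.mean(b) - statistics.mean(a)) >= observed:
        extreme += 1

p_value = (extreme + 1) / (n_perm + 1)  # add-one to avoid p = 0
```

Here the groups differ hugely, so p comes out far below a typical alpha of 0.05 and H0 is rejected.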

12
Q

Student’s t-test

A

t = (x_bar - mu) / (s / sqrt(n)), where s is the sample estimate of the population std dev

Test: if |t| exceeds the critical value for the given confidence level (1 - alpha) and degrees of freedom, reject H0

Cons: the two-sample version assumes equal variances in both populations
Pros: designed for small sample sizes; the t-distribution approaches the normal distribution as n grows
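The one-sample t statistic computed by hand (the data and the hypothesized mean mu_0 are made up):

```python
import math
import statistics

data = [5.1, 4.8, 5.4, 5.0, 4.9, 5.3]  # made-up sample
mu_0 = 5.0                              # hypothesized population mean

x_bar = statistics.mean(data)
s = statistics.stdev(data)              # sample std dev (n - 1 in the denominator)
n = len(data)

t = (x_bar - mu_0) / (s / math.sqrt(n))
# Compare |t| against the t critical value for n - 1 = 5 degrees of
# freedom at the chosen alpha; reject H0 if |t| is larger.
```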

13
Q

Welch’s t-test

A

Doesn’t assume equal variances. Like Student’s t-test, it assumes (approximately) normally distributed populations

14
Q

Central Limit Theorem

A

The distribution of the sample mean approaches a normal distribution as the sample size grows, regardless of the population’s distribution (finite variance assumed)
Mean = mu, standard deviation = sigma / sqrt(n) (the standard error)
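A quick sanity check of the Central Limit Theorem by simulation (stdlib only; the sample size, repetition count, and seed are arbitrary choices):

```python
import random
import statistics

random.seed(7)

# Means of many samples drawn from a (non-normal) Uniform(0, 1) population.
n = 30
sample_means = [statistics.mean(random.random() for _ in range(n))
                for _ in range(2_000)]

# CLT predicts these means are roughly normal with mean 0.5 and
# std dev sigma / sqrt(n), where sigma^2 = 1/12 for Uniform(0, 1).
predicted_se = (1 / 12) ** 0.5 / n ** 0.5
observed_se = statistics.stdev(sample_means)
```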
15
Q

Binomial distribution

A

Pr(x | n, p) = C(n, x) p^x (1 - p)^(n - x)
x = # of successes (e.g., # who prefer orange Fanta) out of n trials, given success probability p

p^x (1 - p)^(n - x) = probability that exactly x trials succeed in one particular configuration
C(n, x) = number of configurations with x successes out of n
e.g., orange vs grape Fanta taste test
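The PMF written out directly (the taste-test numbers n = 10 and p = 0.5 are assumed for illustration):

```python
from math import comb

def binom_pmf(x, n, p):
    """Pr(x | n, p) = C(n, x) p^x (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# e.g., 10 taste-testers, each preferring orange Fanta with p = 0.5:
p_five = binom_pmf(5, 10, 0.5)  # 252 / 1024

# Sanity check: the PMF sums to 1 over all possible success counts.
total = sum(binom_pmf(x, 10, 0.5) for x in range(11))
```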

16
Q

Poisson distribution

A

Pr(x | lambda) = lambda^x e^(-lambda) / x!
x = # of events in a fixed interval, given a constant average rate lambda and independent events
e.g., # of calls arriving at a call center per hour
Limit of the binomial as n -> infinity, p -> 0 with np = lambda

17
Q

ANOVA

A

Analysis of Variance: uses variances to compare the means of 2 or more populations
- H0 = all means are equal; H1 = at least one mean differs

F = variance between treatments / variance within treatments = MS_treatments / MS_errors

Look up the F critical value for (k - 1, N - k) degrees of freedom, where k = # of treatments and N = total # of samples, to see if the differences are significant
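The F statistic computed from scratch on three made-up treatment groups:

```python
import statistics

# Three made-up treatment groups.
groups = [[1, 2, 3], [2, 3, 4], [10, 11, 12]]

k = len(groups)                      # number of treatments
N = sum(len(g) for g in groups)      # total number of samples
grand_mean = sum(sum(g) for g in groups) / N

# Between-treatment and within-treatment sums of squares.
ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)

ms_treatments = ss_between / (k - 1)  # df = k - 1
ms_errors = ss_within / (N - k)       # df = N - k
F = ms_treatments / ms_errors
```

The third group sits far from the other two, so F is large and H0 (equal means) would be rejected at the usual thresholds.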

18
Q

Expectation Maximization

A

Iterative method for maximum likelihood estimation with latent variables:
E-step: compute the expected (posterior) assignment of the latent variables given the current parameters
M-step: update the parameters to maximize the expected log-likelihood
Repeat until convergence
e.g., fitting a Gaussian Mixture Model
19
Q

Maximum Likelihood Estimation

A

Choose the parameters that maximize the likelihood (usually the log-likelihood) of the observed data
e.g., for a Gaussian, the MLE of mu is the sample mean
20
Q

Law of total variance

A

Var(Y) = E[Var(Y | X)] + Var(E[Y | X])
Mnemonic "EVVE": Expectation of Variance + Variance of Expectation
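The identity can be verified numerically on a tiny discrete example (all values below are assumed; X is 0 or 1 with equal probability, and Y is uniform over two values given X):

```python
# Y | X=0 is 0 or 2 (equally likely); Y | X=1 is 10 or 14 (equally likely).
y_given_x = {0: [0, 2], 1: [10, 14]}

def mean(xs):
    return sum(xs) / len(xs)

def var(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# Right-hand side: E[Var(Y|X)] + Var(E[Y|X]).
e_of_var = mean([var(ys) for ys in y_given_x.values()])
var_of_e = var([mean(ys) for ys in y_given_x.values()])

# Left-hand side: Var(Y) over the full joint
# (each (x, y) pair is equally likely here).
all_y = [y for ys in y_given_x.values() for y in ys]
total_var = var(all_y)
```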