# Probability and Statistics Basics Flashcards

Prob: What are the two equivalent definitions of events A and B being independent?

P(A,B) = P(A)P(B)

OR

P(A | B) = P(A), whenever P(B) > 0 (knowing that B occurred does not change the probability of A)

Prob: What are the two equivalent definitions of random variables Y1 and Y2 being independent?

F(y1,y2) = F1(y1)F2(y2) (The joint dist factors to the marginal dists)

OR

F1(y1) = F(y1 | Y2 = y2) for all values of y2 (The marginal distribution for either variable is the same as the conditional distribution given any value of the other variable)

Prob: Conceptually, what does it mean for A and B to be independent, either as variables or as events?

A and B are independent variables if the value of one variable gives you no information about the value of the other.

A and B are independent events if knowing whether one event happened or not gives you no information on whether the other happened.

Prob: What are the 2 equivalent definitions of variables X and Y to be uncorrelated?

Their linear correlation coefficient is 0.

OR

E[XY] = E[X]E[Y]. (Strictly, this says their covariance is 0, but as long as both variances are nonzero, the covariance is 0 iff the correlation coefficient is 0)

Prob: Does 2 variables being independent imply they are uncorrelated?

Yes

Prob: Does 2 variables being uncorrelated imply they are independent?

No

Prob: What is an example of a distribution of 2 variables such that they are uncorrelated, but not independent? Why is it true in this case?

X ~ U(-1,1) and Y = X^2

Here, E(XY) = E(X^3) = 0 and E(X) = 0, so E(XY) = 0 = E(X)E(Y). Both hold because the distributions of X and of XY = X^3 are symmetric around 0

But, the value of X gives you information about Y – it in fact tells you Y specifically.
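
The example above can be checked numerically. The following is a rough simulation sketch, not a proof: the sample size, the seed, and the threshold 0.5 are arbitrary choices made for illustration. It estimates the covariance (which should be near 0) and then shows that conditioning on X changes the mean of Y.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable (arbitrary choice)

n = 100_000
xs = [random.uniform(-1.0, 1.0) for _ in range(n)]
ys = [x * x for x in xs]  # Y is a deterministic function of X


def mean(values):
    return sum(values) / len(values)


# Sample version of Cov(X, Y) = E[XY] - E[X]E[Y]; should be close to 0.
cov_xy = mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)

# Dependence check: the mean of Y changes once we know |X| > 0.5.
mean_y = mean(ys)
mean_y_given_large_x = mean([y for x, y in zip(xs, ys) if abs(x) > 0.5])

print(f"sample covariance:         {cov_xy:.4f}")
print(f"mean of Y:                 {mean_y:.3f}")
print(f"mean of Y given |X| > 0.5: {mean_y_given_large_x:.3f}")
```

The exact values are E[Y] = 1/3 and E[Y | |X| > 0.5] = 7/12, so the two sample means should differ noticeably even though the covariance is essentially 0.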

Prob: What is Bayes’ Theorem?

P(A|B) = P(B|A)P(A) / P(B), for P(B) > 0

Prob: What is a useful form of E[X^2]

E[X^2] = V[X] + (E[X])^2

Prob: What are DeMorgan’s Laws?

(A union B)^{c} = A^{c} and B^{c}

(A and B)^{c} = A^{c} union B^{c}

Prob: What is an experiment?

An activity with an observable outcome.

Ex. Rolling a die, or rolling 2 dice, or flipping a coin…

Prob: What is an outcome?

A unique result of an experiment.

For example, rolling a 6, where the experiment was rolling a die.

Prob: What is a sample space?

All of the possible outcomes of an experiment.

For example, {1,2,3,4,5,6}, when the experiment is rolling a die.

Prob: What is an event?

A collection of outcomes forming a subset of the sample space.

For example, rolling an even number, if the experiment is rolling a die.

Prob: What is a formula for P(A union B)?

P(A) + P(B) - P(A and B)
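
As a quick sanity check of this formula, here is a small sketch using a fair die. The events A (even roll) and B (roll greater than 3) are chosen here purely for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}  # illustrative event: roll is even
B = {4, 5, 6}  # illustrative event: roll is greater than 3


def prob(event):
    # All outcomes of a fair die are equally likely.
    return Fraction(len(event), len(sample_space))


lhs = prob(A | B)                      # P(A union B), counted directly
rhs = prob(A) + prob(B) - prob(A & B)  # formula from the card
print(lhs, rhs)  # 2/3 2/3
```

Subtracting P(A and B) removes the outcomes 4 and 6, which would otherwise be counted twice.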

Prob: What is linearity of expectation?

E[cX + kY] = cE[X] + kE[Y], even if X and Y are dependent
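
To see that dependence does not matter, here is a minimal sketch with a fair die where Y = X^2 is completely determined by X. The constants c = 2 and k = 3 are arbitrary illustrative choices.

```python
from fractions import Fraction

outcomes = range(1, 7)  # fair six-sided die
p = Fraction(1, 6)      # probability of each outcome
c, k = 2, 3             # arbitrary constants for the illustration


def expect(g):
    # E[g(X)] for the die roll X.
    return sum(p * g(x) for x in outcomes)


e_x = expect(lambda x: x)      # E[X]  = 7/2
e_y = expect(lambda x: x * x)  # E[Y]  = 91/6, where Y = X^2 depends on X

lhs = expect(lambda x: c * x + k * x * x)  # E[cX + kY] computed directly
rhs = c * e_x + k * e_y                    # cE[X] + kE[Y]
print(lhs, rhs)  # 105/2 105/2
```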

Prob: What is one potentially convenient way to find P(A and B) when A and B are dependent?

P(A)*P(B|A), or P(B)*P(A|B)

Stat: What proportion of points drawn from a normal distribution will fall within 1 standard deviation? 2? 3?

Approximately 68% within 1, 95% within 2, 99.7% within 3
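
These percentages can be reproduced from the error function, since P(|Z| < k) = erf(k / sqrt(2)) for a standard normal Z. A short sketch using only the standard library (the helper name `prob_within` is made up for this example):

```python
import math


def prob_within(k):
    """P(|Z| < k) for a standard normal Z, via the error function."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```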

Prob: What is the law of total probability?

If you can decompose the sample space S into n parts B1,…,Bn, then

P(A) = P(A|B1)P(B1) + … + P(A|Bn)P(Bn)

A common form is

P(A) = P(A|B)P(B) + P(A|B^{c})P(B^{c})

Prob: What trick is often used in the denominator of a Bayes’ Rule problem?

Law of total probability
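
A worked sketch of this trick, using a made-up diagnostic test. All three input probabilities below are hypothetical numbers chosen for illustration. The denominator P(+) is expanded as P(+|D)P(D) + P(+|D^{c})P(D^{c}).

```python
from fractions import Fraction

# Hypothetical numbers, for illustration only.
p_d = Fraction(1, 100)                # P(D): prior probability of disease
p_pos_given_d = Fraction(95, 100)     # P(+|D): true positive rate
p_pos_given_not_d = Fraction(5, 100)  # P(+|D^c): false positive rate

# Law of total probability for the denominator P(+).
p_pos = p_pos_given_d * p_d + p_pos_given_not_d * (1 - p_d)

# Bayes' rule: P(D|+) = P(+|D)P(D) / P(+)
posterior = p_pos_given_d * p_d / p_pos
print(posterior, float(posterior))
```

With these numbers the posterior is 19/118, roughly 16%, even though the test is "95% accurate", because the disease is rare.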

Prob: What is a probability density function, or pdf f(), typically used for?

For a given probability distribution, you can integrate f() over an interval (or area, or n-d region) to find the probability that the random variable will take a value in that interval/region.

Prob: What is a cumulative distribution function F(), or cdf, typically used for? How is it related to the pdf f()?

For a given probability distribution of RV X, F(x) = P(X <= x)

If you integrate f() from -inf to a, you get F(a)

Prob: What is the formula for the expected value of discrete RV X?

E[X] = sum over all values x of x*P(X=x)

Prob: What is the formula for E[g(X)], or the expected value of a function g of continuous RV X, with pdf f()?

E[g(X)] = integral from -inf to inf of g(x)f(x) dx

Prob: V[aX+b]?

a^{2}V[X]
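
A short derivation of this result, starting from the definition V[X] = E[(X - E[X])^2] and using linearity of expectation. The shift b cancels and the scale a comes out squared.

```latex
\begin{aligned}
V[aX+b] &= E\big[(aX + b - E[aX+b])^2\big] \\
        &= E\big[(aX + b - aE[X] - b)^2\big] \\
        &= E\big[a^2 (X - E[X])^2\big] \\
        &= a^2 V[X]
\end{aligned}
```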

Prob: Technically, what does it mean for a random variable Y to be memoryless?

P(Y > a+b | Y > b) = P(Y > a) for all a, b >= 0

Prob: Conceptually, what does it mean for a probability distribution Y to be memoryless?

For an experiment, past behavior has no bearing on future behavior. For example, if you’re waiting for a bus to come and it follows a memoryless distribution (such as an exponential one), if you wait 5 minutes and there’s still no bus, the probability distribution of when it will arrive *starting now, after 5 minutes* is the same as it was when the experiment began.
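
The bus example can be checked directly for an exponential waiting time, whose survival function is P(Y > t) = exp(-rate * t). In this sketch the rate and the times a and b are arbitrary illustrative values.

```python
import math

rate = 0.2  # hypothetical: buses arrive at 0.2 per minute on average
a = 3.0     # additional minutes we are asking about
b = 5.0     # minutes already waited


def survival(t):
    """P(Y > t) for an exponential waiting time with the given rate."""
    return math.exp(-rate * t)


# P(Y > a+b | Y > b) = P(Y > a+b) / P(Y > b), since {Y > a+b} implies {Y > b}.
conditional = survival(a + b) / survival(b)

print(conditional, survival(a))  # the two values agree up to rounding
```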

Prob: What is the phenomenon being observed in a geometric probability distribution?

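We have a repeatable trial, such as a coin toss, with probability p of succeeding, and we keep performing independent attempts until we succeed. The geometric random variable Y counts the number of attempts up to and including the first success.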

Prob: If we have probability p of succeeding, what is the probability that geometric random variable Y=y?

(1-p)^{y-1}p

Prob: What is the expected value of a geometric random variable with probability p of success?

1/p
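
A numerical sketch that ties this to the pmf (1-p)^{y-1}p: summing y times the pmf over y should give 1/p. The value p = 0.25 and the cutoff of 1000 terms are arbitrary choices; the truncated tail is negligible for this p.

```python
p = 0.25  # illustrative success probability


def geometric_pmf(y, p):
    """P(Y = y): y-1 failures followed by one success."""
    return (1 - p) ** (y - 1) * p


terms = range(1, 1001)  # truncate the infinite sum; the tail is negligible here
total_probability = sum(geometric_pmf(y, p) for y in terms)
expected_value = sum(y * geometric_pmf(y, p) for y in terms)

print(total_probability)  # very close to 1
print(expected_value)     # very close to 1/p = 4
```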

Prob: What phenomenon is observed by a binomial probability distribution?

We have an event, such as flipping a coin, with probability p of success, and we look to see how many of our n trials will be successes.

Prob: If our binomial distribution has trials with probability p of success, and we conduct n trials, what is the probability that y will be successes (assuming 0 <= y <= n)? And what is the intuition behind this result?

(n choose y) p^{y}(1-p)^{n-y}

p^{y}(1-p)^{n-y} is the probability of one specific sequence with y successes (so y specific positions being successes, and the other n-y being failures). But we need the probability of any such sequence; these sequences are disjoint and equally likely, so we sum their probabilities by multiplying by the number of such sequences, which is n choose y.

Prob: What is the expected value of a binomial distribution with n trials and probability of success p?

np
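
A small exact check of the binomial pmf and its mean, using `math.comb` for n choose y. The values n = 10 and p = 3/10 are arbitrary illustrative choices, and `Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction
from math import comb

n = 10               # illustrative number of trials
p = Fraction(3, 10)  # illustrative success probability


def binomial_pmf(y):
    """P(Y = y) = (n choose y) p^y (1-p)^(n-y)."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)


total_probability = sum(binomial_pmf(y) for y in range(n + 1))
expected_value = sum(y * binomial_pmf(y) for y in range(n + 1))

print(total_probability)  # 1
print(expected_value)     # 3, which equals n*p
```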

Prob: What is the expected value of Uniform(a,b)

(a+b)/2

Prob: What is the pdf f(x) of Uniform(a,b)

f(x) = 1/(b-a) for a <= x <= b, and 0 otherwise

Prob: In words, what is the law of large numbers?

When sampling independently from a distribution, as the number of samples grows, the sample mean will tend towards the expected value of the distribution.
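
A quick simulation sketch of this, using rolls of a fair die (expected value 3.5). The seed and the sample sizes are arbitrary, and a single run only illustrates the trend rather than proving it.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable (arbitrary choice)


def sample_mean_of_die_rolls(n):
    """Average of n simulated rolls of a fair six-sided die."""
    return sum(random.randint(1, 6) for _ in range(n)) / n


for n in (10, 1_000, 100_000):
    print(n, sample_mean_of_die_rolls(n))
```

The printed means should generally drift closer to 3.5 as n grows, with the largest sample landing within a few hundredths of it.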

Stat: In normal distribution notation N(a,b), is b sigma, or sigma^{2}?

sigma^{2}

Normal: If Y follows N(µ,σ^{2}), what is the formula for the z-score Z of Y=y?

Z = (y - µ)/σ

Normal: If Y follows N(µ,σ^{2}), what (in words) is the z-score of Y=y?

The number of standard deviations σ that y is above or below the mean µ.

Stat: What does the standard normal distribution Z follow?

Z follows N(0,1)

Stat: In the context of normal distributions, what is the function shown below, what is its input from an arbitrary normal distribution N, and what does it tell us?

It is the CDF of the standard normal distribution Z.

Its input is the z-score of your result.

It tells us the probability of getting a result with a z-score as low as or lower than your result.
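
Putting the z-score and the standard normal CDF together, here is a minimal sketch using the identity Phi(z) = (1 + erf(z / sqrt(2))) / 2. The distribution N(100, 15^2) and the observed value 130 are hypothetical numbers chosen for illustration.

```python
import math


def z_score(y, mu, sigma):
    """Number of standard deviations that y lies above or below mu."""
    return (y - mu) / sigma


def standard_normal_cdf(z):
    """P(Z <= z) for Z ~ N(0, 1), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2)))


# Hypothetical example: Y ~ N(100, 15^2), observed y = 130.
z = z_score(130, 100, 15)
prob_at_or_below = standard_normal_cdf(z)

print(z)                           # 2.0
print(round(prob_at_or_below, 4))  # 0.9772
```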

Prob: What is the formula for Cov(X,Y)?

Cov(X,Y) = E[XY] - E[X]E[Y]