Distributions Flashcards
binomial distribution
fixed number, N, of Bernoulli trials, with random variable X = # of successes after the N trials
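A quick sketch using scipy.stats (the N=10, p=0.4 values are just made-up illustration):

```python
from scipy.stats import binom

# P(X = 3 successes) in N = 10 Bernoulli trials with success probability p = 0.4
print(binom.pmf(3, n=10, p=0.4))
# mean = N*p and variance = N*p*(1-p)
print(binom.mean(10, 0.4), binom.var(10, 0.4))
```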
geometric distribution
a potentially unbounded sequence of Bernoulli trials, with random variable X = # of trials until the 1st success
(the discrete analog of the exponential distribution)
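A small sketch with scipy.stats, whose geom also counts the trial number of the first success (p = 0.3 is an arbitrary choice):

```python
from scipy.stats import geom

p = 0.3
# P(first success occurs on trial 4) = (1-p)^3 * p
print(geom.pmf(4, p))
print((1 - p) ** 3 * p)   # same value by hand
```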
negative binomial
a potentially unbounded sequence of Bernoulli trials, with random variable X = # of trials until the r-th success
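Illustration with scipy.stats; note scipy's nbinom counts the failures before the r-th success, so a trial count X maps to X - r failures (r = 3, p = 0.4 are arbitrary):

```python
from scipy.stats import nbinom

r, p = 3, 0.4
x = 7                            # total trials until the 3rd success
# scipy's nbinom is parameterized by the number of failures, k = x - r
print(nbinom.pmf(x - r, r, p))   # P(X = 7) under the trial-count convention
```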
hypergeometric
N items, with r<=N being of a certain kind; choose n<=N items without replacement; random variable X = # of the r special items that end up in the sample of n
(e.g. fish-stocking experiment: the distribution of the number of tagged fish in a catch)
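A sketch of the fish-stocking example with scipy.stats (the population, tag, and catch sizes are made up):

```python
from scipy.stats import hypergeom

N_pop, r_tagged, n_catch = 50, 10, 8
# P(exactly 2 tagged fish in a catch of 8, drawn without replacement)
print(hypergeom.pmf(2, N_pop, r_tagged, n_catch))
```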
Poisson distribution
counts of “chained” iid exponential inter-arrival times over a fixed time (or space) interval of length s; random variable X = number of events in s (one of the two main distributions associated with a Poisson process)
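Sketch with scipy.stats, assuming a rate L per unit time and an interval of length s (values are arbitrary):

```python
from scipy.stats import poisson

L, s = 2.0, 3.0
# the event count in an interval of length s is Poisson with mean L*s
print(poisson.pmf(5, L * s))   # P(exactly 5 events)
```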
multinomial distribution
fixed number, N, of trials, each with k possible outcomes having probabilities p1,…,pk; random variable X = the tuple of counts observed for outcomes 1 thru k
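Sketch with scipy.stats for k = 3 outcomes (the probabilities and counts are arbitrary):

```python
from scipy.stats import multinomial

# probability of the count tuple (3, 5, 2) in N = 10 trials
print(multinomial.pmf([3, 5, 2], n=10, p=[0.2, 0.5, 0.3]))
```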
uniform distribution
constant pdf, 1/(b-a), on [a,b] (every value in the interval equally likely)
mean=(a+b)/2; var=(b-a)^2/12
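Sketch with scipy.stats, which parameterizes the interval [a,b] as loc=a, scale=b-a (a=2, b=5 are arbitrary):

```python
from scipy.stats import uniform

a, b = 2.0, 5.0
U = uniform(loc=a, scale=b - a)
print(U.mean(), U.var())   # (a+b)/2 = 3.5 and (b-a)^2/12 = 0.75
```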
exponential distribution
continuous pdf, of type L exp(-L x), on [0,inf)
mean=1/L; var=1/L^2
has the memoryless property: given X>=s, the remaining amount X-s is again exponential with parameter L, i.e. P(X>=s+t | X>=s) = P(X>=t) (after renormalizing)
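A numerical check of the memoryless property with scipy.stats (the L, s, t values are arbitrary; scipy uses scale = 1/L):

```python
from scipy.stats import expon

L = 0.5
X = expon(scale=1 / L)
s, t = 2.0, 3.0
# memoryless: P(X >= s + t | X >= s) equals P(X >= t)
print(X.sf(s + t) / X.sf(s), X.sf(t))
```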
gamma distribution
sum of r “chained” iid exponential waiting times; random variable X = length of time (or distance) until the r-th event occurs (one of the two main distributions associated with a Poisson process)
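A simulation sketch of the “chained exponentials” view, assuming rate L and r events (values are arbitrary):

```python
import numpy as np
from scipy.stats import gamma

rng = np.random.default_rng(0)
L, r = 0.5, 4
# time to the r-th event = sum of r iid exponential inter-arrival times
waits = rng.exponential(scale=1 / L, size=(100_000, r)).sum(axis=1)
print(waits.mean(), gamma(a=r, scale=1 / L).mean())   # both close to r/L = 8
```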
Weibull distribution
models a broad range of random variables, largely of the nature of a time to failure or time between events
a L^a x^{a-1} exp(-(Lx)^a)
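A check of the pdf formula against scipy.stats.weibull_min (shape a, scale 1/L; the a, L, x values are arbitrary):

```python
import numpy as np
from scipy.stats import weibull_min

a, L, x = 1.5, 2.0, 0.7
by_hand = a * L**a * x**(a - 1) * np.exp(-(L * x) ** a)   # the flashcard pdf
print(by_hand, weibull_min(c=a, scale=1 / L).pdf(x))      # should agree
```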
beta distribution
used to model proportions
related to the binomial distribution: the roles are swapped, with the numbers of successes/failures acting as parameters and the (binomial) “probability” (proportion of successes) acting as the random variable (Beta can be considered a conjugate prior to the binomial, among others)
(Gam(a+b)/(Gam(a)Gam(b))) x^{a-1} (1-x)^{b-1}
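Sketch of the conjugate-prior use with scipy.stats (the Beta(2,2) prior and the 7 successes / 3 failures are made up):

```python
from scipy.stats import beta

a, b, s, f = 2, 2, 7, 3
# Beta(a, b) prior on a binomial proportion; posterior after s successes, f failures
posterior = beta(a + s, b + f)
print(posterior.mean())   # posterior mean (a+s)/(a+b+s+f) = 9/14
```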
normal distribution
- X ~ N(mu,sig^2)
- (1/(sig sqrt(2 pi))) exp(-(x-mu)^2 / (2 sig^2))
- 68% is within 1 std dev of mean
- 95% is within 2 std devs of mean
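A quick check of the 68/95 rule with scipy.stats:

```python
from scipy.stats import norm

# probability mass within 1 and 2 standard deviations of the mean
print(norm.cdf(1) - norm.cdf(-1))   # ~0.68
print(norm.cdf(2) - norm.cdf(-2))   # ~0.95
```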
lognormal distribution
- if log(X) is normally distributed, then X has a lognormal distribution
- note mu and sigma are for the log of the RV values
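Sketch of scipy's parameterization, where s = sigma and scale = exp(mu) refer to log(X) (the mu, sigma values are arbitrary):

```python
import numpy as np
from scipy.stats import lognorm

mu, sigma = 1.0, 0.5
X = lognorm(s=sigma, scale=np.exp(mu))
print(X.mean(), np.exp(mu + sigma**2 / 2))   # both equal exp(mu + sigma^2/2)
```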
Chi-squared distribution
given iid X_i ~ N(0,1), random variable Y=X_1^2+…+X_k^2 has a Chi-square distribution with k degrees of freedom
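A simulation sketch of the definition (k = 5 is arbitrary):

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(1)
k = 5
y = (rng.standard_normal((100_000, k)) ** 2).sum(axis=1)   # sum of k squared N(0,1)'s
print(y.mean(), chi2(df=k).mean())   # both close to k
```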
Student’s t-distribution
- for sample mean distribution, when population variance is unknown (population assumed approximately normal)
- if the pop. variance is known, the standardized sample mean is approximately N(0,1) by the CLT
- Student's t allows using the sample variance in lieu of the population variance
- the scaled sample variance, (n-1)s^2/sig^2, has a Chi-square distribution with n-1 degrees of freedom, provided the population is approximately normal (Cochran's theorem)
- form the t-statistic, (x-mu) / (s/sqrt(n)), for n samples with sample mean x and (unbiased, Bessel-corrected) sample variance s^2
- the denominator is the estimated standard error of the sample mean (from a linear combination of r.v.'s)
- if using the CLT instead, the denominator has the same form, with the population std dev sig in place of s
- the t-statistic has a t-distribution, which can be written as N(0,1) / sqrt(X^2 / v) (Hayter's formulation)
- X^2 is a Chi-square random variable
- v is the degrees of freedom, which equals n-1 for Student's t
- the numerator and denominator random variables are independent
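Sketch of the t-statistic on simulated data (the sample size, mean, and std dev are made up):

```python
import numpy as np
from scipy.stats import t

rng = np.random.default_rng(2)
n, mu = 12, 5.0
x = rng.normal(loc=mu, scale=2.0, size=n)
t_stat = (x.mean() - mu) / (x.std(ddof=1) / np.sqrt(n))   # Bessel-corrected sample sd
print(2 * t.sf(abs(t_stat), df=n - 1))   # two-sided p-value with v = n - 1 df
```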