Lecture 4 Flashcards
(13 cards)
What is the binomial distribution used for?
Whenever you have a fixed number of independent ‘yes/no’ or ‘success/failure’ trials, each with the same probability of success, and you want to know how many successes you get.
This distribution is useful for scenarios like customer retention rates.
Provide a business example of the binomial distribution.
Emailing 100 past customers where each has a 30% chance of buying again.
This illustrates calculating the probability of various numbers of customers returning.
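A minimal sketch of the email example, using only the Python standard library. The function name binomial_pmf and the thresholds chosen (exactly 30 buyers, at least 35 buyers) are illustrative assumptions, not part of the lecture.

```python
from math import comb

def binomial_pmf(k, n, p):
    # P(exactly k successes in n independent trials, each with success probability p)
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 100, 0.30  # 100 emails, 30% chance each customer buys again

prob_exactly_30 = binomial_pmf(30, n, p)
prob_at_least_35 = sum(binomial_pmf(k, n, p) for k in range(35, n + 1))
expected_buyers = n * p  # mean of a binomial distribution

print(f"P(exactly 30 buyers)  = {prob_exactly_30:.4f}")
print(f"P(at least 35 buyers) = {prob_at_least_35:.4f}")
print(f"Expected buyers       = {expected_buyers:.1f}")
```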
When should you use the Poisson distribution?
Whenever you’re counting how many times a fairly rare event happens in a fixed window, assuming these events occur independently at a roughly constant average rate.
Examples include counting calls in an hour or website sign-ups in a day.
Give a business example of the Poisson distribution.
Knowing on average 5 purchases happen per hour, the Poisson distribution gives the probability of seeing 0, 1, 2, … purchases in the next hour.
This relates to measuring frequency of purchase.
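A minimal sketch of the purchases-per-hour example, assuming the rate of 5 per hour from the card. The function name poisson_pmf is a hypothetical helper introduced here for illustration.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    # P(exactly k events in the window) when events arrive at average rate lam per window
    return lam**k * exp(-lam) / factorial(k)

rate = 5  # average purchases per hour

for k in range(0, 4):
    print(f"P({k} purchases in the next hour) = {poisson_pmf(k, rate):.4f}")

prob_zero = poisson_pmf(0, rate)
# Probability of more than 10 purchases: complement of 0..10
prob_more_than_10 = 1 - sum(poisson_pmf(k, rate) for k in range(0, 11))
print(f"P(more than 10 purchases) = {prob_more_than_10:.4f}")
```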
What is the multinomial distribution used for?
When you run a fixed number of trials, but each trial can result in more than two categories.
This is applicable in scenarios like brand selection across multiple options.
Provide a business example of the multinomial distribution.
If every purchase picks one of three brands with specific probabilities, it tells you the chance of specific counts across those categories.
For instance, 40% chance Brand A, 30% Brand B, 30% Brand C across 100 purchases.
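A minimal sketch of the three-brand example. The function name multinomial_pmf and the specific split asked about (40 / 30 / 30 purchases) are assumptions chosen for illustration.

```python
from math import factorial

def multinomial_pmf(counts, probs):
    # P(observing exactly these category counts) in sum(counts) independent trials
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)  # exact integer division at every step
    prob = float(coef)
    for c, p in zip(counts, probs):
        prob *= p**c
    return prob

brand_probs = [0.40, 0.30, 0.30]  # Brand A, Brand B, Brand C
counts = [40, 30, 30]             # one specific outcome across 100 purchases

prob_split = multinomial_pmf(counts, brand_probs)
print(f"P(A=40, B=30, C=30) = {prob_split:.5f}")
```

Even the single most likely split has a small probability, because 100 purchases can be divided among three brands in many ways.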
Why is the normal distribution so important?
Empirical reality → Many real-world measurements (heights, weights, ratings) “look” normal when you gather lots of data.
Sampling theory → Even if the original data aren’t normal, the distribution of sample means becomes approximately normal once the sample is moderately large, provided the data have finite variance (Central Limit Theorem).
Foundation for inference → The χ², t, and F distributions (used in hypothesis tests and ANOVAs) are mathematical “children” of normal variables.
Error theory → Repeated measurement errors usually show a normal pattern (small errors are common, large errors are rare), so we model noise as normal.
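The sampling-theory point can be checked with a small simulation. This is a sketch under assumptions not in the lecture: the raw data are drawn from an exponential distribution with mean 1 (strongly right-skewed), each sample has 50 observations, and 2000 samples are taken with a fixed seed.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

def sample_mean(n):
    # Mean of n draws from an exponential distribution with mean 1
    draws = [rng.expovariate(1.0) for _ in range(n)]
    return statistics.fmean(draws)

sample_size = 50
sample_means = [sample_mean(sample_size) for _ in range(2000)]

center = statistics.fmean(sample_means)
spread = statistics.stdev(sample_means)

# Theory: the means centre on 1 with standard deviation 1 / sqrt(sample_size)
print(f"mean of sample means  = {center:.3f} (theory: 1.000)")
print(f"stdev of sample means = {spread:.3f} (theory: {1 / sample_size ** 0.5:.3f})")
```

A histogram of sample_means would look roughly bell-shaped even though the individual draws are heavily skewed.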
What is a normal distribution?
A symmetric, bell-shaped curve where most values cluster around a central average and taper off equally on both sides.
A normal distribution is fully characterized by its mean and standard deviation.
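Because the mean and standard deviation fix the whole curve, they also fix how much of the data lies within a given distance of the mean. A short sketch using statistics.NormalDist from the standard library; the mean of 70 and standard deviation of 10 are hypothetical values chosen for illustration.

```python
from statistics import NormalDist

scores = NormalDist(mu=70, sigma=10)  # hypothetical mean and standard deviation

# Share of values within 1 and 2 standard deviations of the mean
within_1_sd = scores.cdf(80) - scores.cdf(60)
within_2_sd = scores.cdf(90) - scores.cdf(50)

print(f"within 1 sd: {within_1_sd:.4f}")
print(f"within 2 sd: {within_2_sd:.4f}")
```

These shares (about 68% and 95%) are the same for every normal distribution, whatever its mean and standard deviation.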
How does a log-normal distribution differ from a normal distribution?
It is a right-skewed curve where taking the logarithm of values produces a normal distribution, with most raw values being small and a few being very large.
Log-normal distributions are often found in financial data.
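A small simulation sketch of the contrast. The parameters (underlying normal with mean 0 and standard deviation 1, 10,000 draws, fixed seed) are assumptions for illustration only.

```python
import math
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Log-normal draws: the log of each value is normal with mean 0 and sd 1
raw = [rng.lognormvariate(0.0, 1.0) for _ in range(10_000)]
logs = [math.log(x) for x in raw]

raw_mean, raw_median = statistics.fmean(raw), statistics.median(raw)
log_mean, log_median = statistics.fmean(logs), statistics.median(logs)

# Right skew: a few very large values pull the raw mean above the raw median
print(f"raw values: mean = {raw_mean:.3f}, median = {raw_median:.3f}")
# After taking logs the data are symmetric, so mean and median nearly agree
print(f"log values: mean = {log_mean:.3f}, median = {log_median:.3f}")
```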
What does the chi-square (χ²) distribution represent?
A right-skewed curve that describes the distribution of a sum of squared independent standard normal variables, used to test how well observed counts match expected counts.
It is commonly used in hypothesis testing.
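A sketch of the observed-versus-expected comparison. The observed counts below are hypothetical, and the expected shares (40% / 30% / 30%) are an assumption chosen for illustration. The 5% critical value for 2 degrees of freedom, 5.991, is taken from standard chi-square tables.

```python
observed = [45, 25, 30]              # hypothetical counts for three categories
expected_share = [0.40, 0.30, 0.30]  # assumed shares under the null hypothesis

total = sum(observed)
expected = [total * share for share in expected_share]

# Chi-square statistic: sum of (observed - expected)^2 / expected
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
degrees_of_freedom = len(observed) - 1

critical_5pct = 5.991  # table value for 2 degrees of freedom at the 5% level
reject_null = chi_sq > critical_5pct

print(f"chi-square = {chi_sq:.4f}, df = {degrees_of_freedom}, reject at 5%: {reject_null}")
```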
What is a t-distribution?
A bell-shaped curve similar to the normal but with heavier tails, accounting for the extra uncertainty that comes from estimating the standard deviation from a small sample.
The t-distribution is particularly useful for small sample sizes; as the sample grows, it approaches the normal distribution.
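A sketch of a one-sample t statistic on a small sample. The data and the hypothesized mean of 50 are hypothetical. The critical values quoted in the comments (2.571 for 5 degrees of freedom, 1.96 for the normal, both two-sided at 5%) come from standard tables.

```python
import math
import statistics

sample = [52.0, 48.0, 55.0, 50.0, 47.0, 53.0]  # hypothetical small sample
hypothesized_mean = 50.0                        # hypothetical null value

n = len(sample)
sample_mean = statistics.fmean(sample)
sample_sd = statistics.stdev(sample)  # standard deviation estimated from the sample
standard_error = sample_sd / math.sqrt(n)

t_stat = (sample_mean - hypothesized_mean) / standard_error
degrees_of_freedom = n - 1

# Heavier tails mean a stricter cutoff than the normal's 1.96
t_critical = 2.571  # two-sided 5% critical value for 5 degrees of freedom
significant = abs(t_stat) > t_critical

print(f"t = {t_stat:.3f}, df = {degrees_of_freedom}, significant at 5%: {significant}")
```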
What is the purpose of the F-distribution?
A right-skewed curve formed by the ratio of two independent chi-square variables, each divided by its degrees of freedom, used to test whether two variances are equal.
It is often used in ANOVA (Analysis of Variance).
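A sketch of the variance-ratio statistic on two small samples. Both data sets are hypothetical; turning the ratio into a p-value would additionally need an F table or a statistics library, which is left out here.

```python
import statistics

group_a = [10.0, 12.0, 9.0, 11.0, 13.0]   # hypothetical sample A
group_b = [10.0, 11.0, 10.0, 11.0, 10.0]  # hypothetical sample B

var_a = statistics.variance(group_a)  # sample variance (divides by n - 1)
var_b = statistics.variance(group_b)

# F statistic: ratio of the two sample variances
f_ratio = var_a / var_b
df_numerator = len(group_a) - 1
df_denominator = len(group_b) - 1

print(f"F = {f_ratio:.3f} with ({df_numerator}, {df_denominator}) degrees of freedom")
```

A ratio near 1 is consistent with equal variances; a ratio far from 1 suggests they differ.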
What is a correlation?
Correlation measures how strongly two variables move together in a linear fashion, on a scale from -1 (they move in opposite directions) through 0 (no linear relationship) to +1 (they increase or decrease in tandem).
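A sketch of the Pearson correlation coefficient computed from its definition. The function name pearson_r and both data series are hypothetical, introduced only for illustration.

```python
import math

def pearson_r(xs, ys):
    # Pearson correlation: covariance of x and y divided by the product of their spreads
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

ad_spend = [1, 2, 3, 4, 5]  # hypothetical values
sales = [2, 4, 5, 4, 5]     # hypothetical values

r = pearson_r(ad_spend, sales)
print(f"r = {r:.4f}")
```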