Chapter 4 - Discrete Distributions Flashcards Preview

Flashcards in Chapter 4 - Discrete Distributions Deck (17):

Random variable

- random variable = function that assigns numeric values to different events in a sample space
- two types of random variables = discrete, continuous


Discrete random variable

a random variable whose possible values form a discrete set, that is, a finite or countably infinite set of numeric values, each with positive probability


Continuous random variable

a random variable whose possible values cannot be enumerated, for example any value in an interval (uncountably infinite)


Probability-mass function

- the values taken by a discrete random variable and its associated probabilities can be expressed by a rule or relationship called a probability-mass function
- assigns to any possible value r of a discrete random variable X the probability P(X = r)
- this assignment is made for all values r that have positive probability


Expression of a probability-mass function

- a pmf can be displayed in a tabular form, or it can be expressed as a mathematical formula giving the probabilities of all possible values
- the probability of any particular value must be between 0 and 1, and the sum of the probabilities of all values must be exactly equal to 1
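
A minimal sketch of the two conditions above, assuming a pmf stored as a Python dictionary. The coin-toss pmf and the helper name `is_valid_pmf` are illustrative and do not come from the deck.

```python
from math import isclose

# Hypothetical pmf in tabular form: X = number of heads in two fair coin tosses.
pmf = {0: 0.25, 1: 0.50, 2: 0.25}

def is_valid_pmf(pmf):
    """Return True if every probability lies in [0, 1] and they sum to 1."""
    each_in_range = all(0.0 <= p <= 1.0 for p in pmf.values())
    sums_to_one = isclose(sum(pmf.values()), 1.0)
    return each_in_range and sums_to_one

print(is_valid_pmf(pmf))
```

`isclose` is used instead of `==` because floating-point sums can differ from 1 by a tiny rounding error.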


Frequency distributions

- a list of each value in the data set and a corresponding count of how frequently the value occurs
- the frequency distribution can be considered as a sample analog to a probability distribution
- frequency distribution gives the actual proportion of points in a sample that correspond to specific values



the appropriateness of a model can be assessed by comparing the observed sample-frequency distribution with the probability distribution
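
One way to make this comparison concrete is to tabulate the observed proportions next to the model probabilities. The sketch below assumes a small made-up sample and a made-up model pmf; both are hypothetical and only illustrate the bookkeeping.

```python
from collections import Counter

# Hypothetical sample: number of heads in two coin tosses, repeated 10 times.
sample = [0, 1, 1, 2, 1, 0, 2, 1, 1, 2]
# Hypothetical model: pmf for two tosses of a fair coin.
model_pmf = {0: 0.25, 1: 0.50, 2: 0.25}

counts = Counter(sample)   # frequency distribution: value -> count
n = len(sample)

# Observed proportion of sample points at each value.
observed = {value: counts[value] / n for value in sorted(model_pmf)}

for value in sorted(model_pmf):
    print(value, counts[value], observed[value], model_pmf[value])
```

Large gaps between the observed proportions and the model probabilities would suggest the model fits poorly; a formal goodness-of-fit test is outside the scope of this card.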


Expected value of a discrete random variable

- if a random variable has a large number of values with positive probability, then the pmf is not a useful summary
- measures of location and spread can be developed for a random variable in the same way as for samples
- expected value is also called population mean
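
The expected value of a discrete random variable is the sum of each value times its probability, E(X) = sum of r * P(X = r). A short sketch, assuming the same kind of dictionary pmf (the numbers are hypothetical):

```python
# Hypothetical pmf: X = number of heads in two fair coin tosses.
pmf = {0: 0.25, 1: 0.50, 2: 0.25}

# E(X): weight each value r by its probability P(X = r) and add them up.
expected_value = sum(r * p for r, p in pmf.items())

print(expected_value)
```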


Variance of a discrete random variable

- the analog of the sample variance for a random variable
- also called population variance
- the variance represents the spread, relative to the expected value, of all values that have positive probability
- as a rule of thumb, approximately 95% of the probability mass falls within two standard deviations of the mean of a random variable; the exact share depends on the distribution
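
The variance is the probability-weighted average of squared deviations from the expected value, Var(X) = sum of (r - E(X))^2 * P(X = r), and the standard deviation is its square root. The sketch below uses a hypothetical pmf and also computes the mass within two standard deviations directly, so the rule of thumb can be checked for a particular distribution.

```python
from math import sqrt

# Hypothetical pmf: X = number of heads in two fair coin tosses.
pmf = {0: 0.25, 1: 0.50, 2: 0.25}

mean = sum(r * p for r, p in pmf.items())
variance = sum((r - mean) ** 2 * p for r, p in pmf.items())
sd = sqrt(variance)

# Probability mass lying within two standard deviations of the mean.
mass_within_2sd = sum(p for r, p in pmf.items() if abs(r - mean) <= 2 * sd)

print(mean, variance, sd, mass_within_2sd)
```

For this small example all of the mass lies within two standard deviations, which shows that the 95% figure is only an approximation.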


Cumulative-distribution function

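- the cumulative-distribution function (cdf) of a random variable X gives, for any value x, the probability P(X ≤ x)
- for a discrete random variable, the cdf is a step function: it jumps at each value with positive probability and is flat in between
- as the number of possible values increases, the cdf approaches a smooth curve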
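
A sketch of a discrete cdf, F(x) = P(X ≤ x), built by adding up pmf values. The pmf and the function name `cdf` are hypothetical; evaluating between possible values shows the flat parts of the steps.

```python
# Hypothetical pmf: X = number of heads in two fair coin tosses.
pmf = {0: 0.25, 1: 0.50, 2: 0.25}

def cdf(x):
    """F(x) = P(X <= x): add up the pmf over all values not exceeding x."""
    return sum(p for r, p in pmf.items() if r <= x)

# The cdf is flat between possible values and jumps at each of them.
for x in (-1, 0, 0.5, 1, 1.5, 2, 3):
    print(x, cdf(x))
```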



- in a matched-pair design, each sample/case is matched with a normal control of the same sex and age
- if two controls are to be chosen in order from n candidates, the first can be chosen in n ways; once the first control is chosen, the second control can be chosen in (n-1) ways
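
Extending this counting argument, the number of ordered selections (permutations) of k items from n is n * (n-1) * ... * (n-k+1). A small sketch, assuming a hypothetical pool of 10 candidate controls; `permutations_count` is an illustrative helper, and `math.perm` is the standard-library equivalent (Python 3.8+).

```python
import math

def permutations_count(n, k):
    """Number of ordered selections of k items from n: n * (n-1) * ... * (n-k+1)."""
    count = 1
    for i in range(k):
        count *= n - i
    return count

n_controls = 10  # hypothetical number of candidate controls

# Two controls chosen in order: n ways for the first, n - 1 for the second.
print(permutations_count(n_controls, 2))
print(math.perm(n_controls, 2))
```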



- in an unmatched study design, cases and controls are selected in no particular order
- thus, the number of ways of selecting k objects out of n without respect to order is referred to as the number of combinations of n things taken k at a time
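
The number of combinations of n things taken k at a time is n! / (k! (n-k)!). A minimal sketch with a hypothetical pool of 10 candidates; `combinations_count` is an illustrative helper, and `math.comb` is the standard-library equivalent (Python 3.8+).

```python
import math

def combinations_count(n, k):
    """Number of unordered selections of k items from n: n! / (k! * (n-k)!)."""
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

n_candidates = 10  # hypothetical pool size

print(combinations_count(n_candidates, 2))
print(math.comb(n_candidates, 2))
```

Each unordered selection of k items corresponds to k! ordered ones, which is why the count is smaller than the number of ordered selections by a factor of k!.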


Binomial distribution

- a sample of n independent trials, each of which can have only two possible outcomes
- the probability of a success at each trial is assumed to be some constant p
- the probability of a failure at each trial is 1-p=q
- number of trials n is finite, and the number of events can be no larger than n
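
Under these assumptions, the probability of exactly k successes in n trials is C(n, k) * p^k * q^(n-k), where C(n, k) is the number of combinations. A sketch with hypothetical values n = 10 and p = 0.3; `binomial_pmf` is an illustrative name.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p): C(n, k) * p^k * q^(n-k), with q = 1 - p."""
    q = 1 - p
    return comb(n, k) * p**k * q**(n - k)

n, p = 10, 0.3  # hypothetical number of trials and success probability
probabilities = [binomial_pmf(k, n, p) for k in range(n + 1)]

for k, prob in enumerate(probabilities):
    print(k, round(prob, 4))
```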


Calculating binomial probabilities

- for sufficiently large n, the normal distribution can be used to approximate the binomial distribution and tables of the normal distribution can be used to evaluate binomial probabilities
- if the sample size is not large enough to use the normal approximation, exact binomial probabilities can be obtained from an electronic table (for example, a spreadsheet or statistical software)
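
The sketch below compares an exact cumulative binomial probability with a normal approximation that uses mean np and standard deviation sqrt(npq). The continuity correction (adding 0.5 to k) is a common refinement assumed here, not something the card states, and the values n = 100, p = 0.5, k = 55 are hypothetical.

```python
from math import comb, erf, sqrt

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_cdf(z):
    """Standard normal cdf, written in terms of the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_approx_binomial_cdf(k, n, p):
    """Approximate P(X <= k) using a normal with mean np and variance npq.

    The +0.5 continuity correction is an assumption of this sketch.
    """
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return normal_cdf((k + 0.5 - mean) / sd)

n, p, k = 100, 0.5, 55  # hypothetical values
exact = binomial_cdf(k, n, p)
approx = normal_approx_binomial_cdf(k, n, p)
print(exact, approx, abs(exact - approx))
```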


Expected value and variance of the binomial distribution

- the expected number of successes in n trials is the probability of success in one trial multiplied by n, which equals np
- the variance of the number of successes is npq, where q=1-p
- for a given number of trials n, the binomial distribution has the highest variance when p=1/2
- variance decreases as p moves away from 1/2, becoming 0 when p=0 or p=1
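
These statements can be checked numerically by computing the mean and variance straight from the binomial pmf and comparing them with np and np(1-p). The helper name and the choice n = 10 are hypothetical.

```python
from math import comb

def binomial_mean_and_variance(n, p):
    """Mean and variance of Binomial(n, p), summed directly from the pmf."""
    pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    mean = sum(k * prob for k, prob in enumerate(pmf))
    variance = sum((k - mean) ** 2 * prob for k, prob in enumerate(pmf))
    return mean, variance

n = 10  # hypothetical number of trials
for p in (0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0):
    mean, variance = binomial_mean_and_variance(n, p)
    # Columns: p, computed mean, np, computed variance, np(1-p)
    print(p, round(mean, 4), n * p, round(variance, 4), round(n * p * (1 - p), 4))
```

The printed variances rise toward p = 0.5 and fall back to 0 at p = 0 and p = 1.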


Poisson distribution

- the Poisson distribution is the second most frequently used discrete distribution after binomial distribution
- it is usually associated with rare events
- the number of trials is essentially infinite and the number of events can be indefinitely large
- however, probability of k events becomes very small as k increases
- the plotted distribution tends to become more symmetric as the time interval increases, or more specifically, as μ increases
- for a Poisson distribution with parameter μ, the mean and variance are both equal to μ
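
For a Poisson distribution whose parameter is the expected number of events (called `mu` in the code), the probability of exactly k events is e^(-mu) * mu^k / k!. The sketch below assumes a hypothetical value mu = 2 and checks that the mean and variance both come out close to mu; the infinite sum is truncated at 60 terms, which is an approximation that is adequate for small mu.

```python
from math import exp, factorial

def poisson_pmf(k, mu):
    """P(X = k) for a Poisson distribution with parameter mu."""
    return exp(-mu) * mu**k / factorial(k)

mu = 2.0          # hypothetical expected number of events
ks = range(60)    # truncation of the infinite range of k (an approximation)

mean = sum(k * poisson_pmf(k, mu) for k in ks)
variance = sum((k - mean) ** 2 * poisson_pmf(k, mu) for k in ks)

print(mean, variance)

# The probability of k events becomes very small as k increases.
for k in (0, 2, 5, 10, 20):
    print(k, poisson_pmf(k, mu))
```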


Poisson approximation to the binomial distribution

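- the binomial distribution with large n and small p can be accurately approximated by a Poisson distribution with parameter μ=np
- the mean of the binomial distribution is np and its variance is npq; because q is close to 1 when p is small, npq ≈ np, so the mean and variance are nearly equal, as they are for a Poisson distribution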
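
A sketch of the approximation, comparing binomial probabilities with Poisson probabilities that use the parameter np. The values n = 1000 and p = 0.002 are hypothetical choices of a large n and a small p.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, mu):
    """P(X = k) for a Poisson distribution with parameter mu."""
    return exp(-mu) * mu**k / factorial(k)

n, p = 1000, 0.002   # hypothetical: large n, small p
mu = n * p           # Poisson parameter used for the approximation

# Largest pointwise difference between the two pmfs over k = 0..10.
max_diff = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, mu)) for k in range(11))

for k in range(6):
    print(k, round(binomial_pmf(k, n, p), 5), round(poisson_pmf(k, mu), 5))
print(max_diff)
```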