Flashcards in Chapter 4 - Discrete Distributions Deck (17):

1

## Random variable

###
- random variable = function that assigns numeric values to different events in a sample space

- two types of random variables = discrete, continuous

2

## Discrete random variable

### a random variable whose possible values form a discrete set, either finite or countably infinite (e.g., 0, 1, 2, ...)

3

## Continuous random variable

### a random variable whose possible values cannot be enumerated because they form a continuum (e.g., any value in an interval)

4

## Probability-mass function

###
- the values taken by a discrete random variable and its associated probabilities can be expressed by a rule or relationship called a probability-mass function

- assigns to each possible value r of a discrete random variable X the probability P(X = r)

- this assignment is made for all values r that have positive probability

5

## Expression of a probability-mass function

###
- a pmf can be displayed in a tabular form, or it can be expressed as a mathematical formula giving the probabilities of all possible values

- the probability of any particular value must be between 0 and 1, and the sum of the probabilities of all values must be exactly equal to 1
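
The two conditions above can be checked mechanically when a pmf is stored as a table. This is a minimal sketch: the coin-toss pmf and the helper name `is_valid_pmf` are illustrative assumptions, not part of the deck.

```python
from fractions import Fraction

# Hypothetical pmf in tabular form: number of heads in two fair coin tosses.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def is_valid_pmf(table):
    """Return True if every probability lies in [0, 1] and they sum to exactly 1."""
    each_in_range = all(0 <= p <= 1 for p in table.values())
    return each_in_range and sum(table.values()) == 1

print(is_valid_pmf(pmf))  # True
```

Exact fractions are used so that the "sum equals 1" check is not disturbed by floating-point rounding.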

6

## Frequency distributions

###
- a list of each value in the data set and a corresponding count of how frequently the value occurs

- the frequency distribution can be considered as a sample analog to a probability distribution

- the frequency distribution gives the actual proportion of points in a sample that correspond to specific values
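
As a rough illustration, a frequency distribution can be tabulated with the standard library. The sample values below are made up for the example.

```python
from collections import Counter

# Hypothetical sample: number of episodes observed in 10 subjects.
sample = [0, 1, 1, 2, 0, 1, 3, 1, 0, 2]

counts = Counter(sample)  # value -> how often it occurs
n = len(sample)

# Sample proportions, the analog of the probabilities in a pmf.
proportions = {value: count / n for value, count in sorted(counts.items())}

for value, prop in proportions.items():
    print(value, counts[value], prop)
```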

7

## Goodness-of-fit

### the appropriateness of a model can be assessed by comparing the observed sample-frequency distribution with the probability distribution
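
One informal way to make this comparison is to list observed counts next to the counts the model would predict. The sketch below assumes a made-up model and sample; it is a side-by-side table only, not a formal goodness-of-fit test.

```python
from collections import Counter

# Hypothetical model: number of heads in two fair coin tosses.
model_pmf = {0: 0.25, 1: 0.50, 2: 0.25}

# Hypothetical observed sample of 20 such experiments.
sample = [1, 0, 1, 2, 1, 1, 0, 2, 1, 0, 1, 2, 1, 1, 0, 2, 1, 1, 2, 1]

n = len(sample)
observed = Counter(sample)

# value -> (observed count, count expected under the model)
comparison = {
    value: (observed[value], n * prob) for value, prob in model_pmf.items()
}
for value, (obs, exp) in comparison.items():
    print(f"value={value} observed={obs} expected={exp:.1f}")
```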

8

## Expected value of a discrete random variable

###
- if a random variable has a large number of values with positive probability, then the pmf is not a useful summary

- measures of location and spread can be developed for a random variable in the same way as for samples

- the expected value is also called the population mean
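
The expected value is the sum of each possible value multiplied by its probability. A minimal sketch, using a hypothetical coin-toss pmf and an illustrative helper name:

```python
from fractions import Fraction

# Hypothetical pmf: number of heads in two fair coin tosses.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def expected_value(table):
    """Population mean: sum of each value times its probability."""
    return sum(value * prob for value, prob in table.items())

mean = expected_value(pmf)
print(mean)  # 1
```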

9

## Variance of a discrete random variable

###
- the analog of the sample variance for a random variable

- also called the population variance

- the variance represents the spread, relative to the expected value, of all values that have positive probability

- as a rule of thumb, approximately 95% of the probability mass falls within two standard deviations of the mean of a random variable; the exact figure depends on the distribution
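
The variance is the probability-weighted average of squared deviations from the expected value. The sketch below computes it for a hypothetical pmf (heads in ten fair coin tosses) and checks how much probability mass lies within two standard deviations of the mean in that one case.

```python
import math

# Hypothetical pmf: number of heads in ten fair coin tosses.
pmf = {r: math.comb(10, r) / 2**10 for r in range(11)}

mean = sum(r * p for r, p in pmf.items())
variance = sum((r - mean) ** 2 * p for r, p in pmf.items())
sd = math.sqrt(variance)

# Probability mass within two standard deviations of the mean.
mass_within_2sd = sum(p for r, p in pmf.items() if abs(r - mean) <= 2 * sd)

print(mean, variance, round(mass_within_2sd, 4))  # 5.0 2.5 0.9785
```

Here the mass is about 98% rather than exactly 95%, which shows that the two-standard-deviation figure is approximate.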

10

## Cumulative-distribution function

###
- for a discrete random variable, the cdf looks like a series of steps and is called a step function

- as the number of possible values increases, the cdf approaches a smooth curve
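
The cdf gives F(x) = P(X <= x), so for a discrete random variable it stays flat between possible values and jumps at each one. A small sketch with a hypothetical pmf shows this step behavior:

```python
from itertools import accumulate

# Hypothetical pmf: number of heads in two fair coin tosses.
values = [0, 1, 2]
probs = [0.25, 0.50, 0.25]

# Running totals give F(r) = P(X <= r) at each possible value.
cumulative = list(accumulate(probs))

def cdf(x):
    """Step function: constant between possible values, jumping at each one."""
    total = 0.0
    for value, running in zip(values, cumulative):
        if value <= x:
            total = running
    return total

print([cdf(x) for x in (-1, 0, 0.5, 1, 2, 3)])  # [0.0, 0.25, 0.25, 0.75, 1.0, 1.0]
```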

11

## Permutations

###
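- in a matched-pair design, each case is matched with a normal control of the same sex and age

- if n controls are available, the first control can be chosen in n ways; once the first control is chosen, the second control can be chosen in (n-1) ways

- in general, the number of permutations of n things taken k at a time is n(n-1)...(n-k+1), the number of ways of selecting k items out of n when order matters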
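
The counting argument (n choices for the first pick, n-1 for the second, and so on) can be sketched as a running product. The helper name and the numbers are illustrative; `math.perm` is the standard-library equivalent in Python 3.8 and later.

```python
import math

def permutations_count(n, k):
    """Ordered selections of k items from n: n * (n-1) * ... * (n-k+1)."""
    result = 1
    for i in range(k):
        result *= n - i
    return result

# Hypothetical example: choosing an ordered pair of controls from 5 candidates.
print(permutations_count(5, 2))  # 20
print(math.perm(5, 2))           # 20
```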

12

## Combinations

###
- in an unmatched study design, cases and controls are selected in no particular order

- thus, the number of ways of selecting k things out of n without respect to order is referred to as the number of combinations of n things taken k at a time
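
When order is ignored, each unordered group of k items corresponds to k! ordered selections, which gives the count n! / (k! (n-k)!). A minimal sketch with an illustrative helper name; `math.comb` is the standard-library equivalent in Python 3.8 and later.

```python
import math

def combinations_count(n, k):
    """Unordered selections of k items from n: n! / (k! * (n - k)!)."""
    return math.factorial(n) // (math.factorial(k) * math.factorial(n - k))

# Hypothetical example: choosing 2 controls from 5 candidates, order ignored.
print(combinations_count(5, 2))  # 10
print(math.comb(5, 2))           # 10
```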

13

## Binomial distribution

###
- a sample of n independent trials, each of which can have only two possible outcomes

- the probability of a success at each trial is assumed to be some constant p

- the probability of a failure at each trial is q = 1 - p

- the number of trials n is finite, and the number of events can be no larger than n
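
Under these assumptions, the probability of exactly k successes in n trials is the number of combinations of n things taken k at a time, multiplied by p^k q^(n-k). A short sketch, with an illustrative function name and example values:

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for n independent trials with constant success probability p."""
    q = 1 - p
    return math.comb(n, k) * p**k * q**(n - k)

# Hypothetical example: 2 successes in 5 trials with p = 0.5.
print(binomial_pmf(2, 5, 0.5))  # 0.3125
```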

14

## Calculating binomial probabilities

###
- for sufficiently large n, the normal distribution can be used to approximate the binomial distribution and tables of the normal distribution can be used to evaluate binomial probabilities

- if the sample size is not large enough to use the normal approximation, then an electronic table (e.g., a spreadsheet or statistical-software function) can be used to obtain exact binomial probabilities
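
Both routes can be compared in a few lines. The sketch below assumes Python 3.8+ for `statistics.NormalDist`. The 0.5 continuity correction is a common refinement added here as an assumption, not something the card states, and the values n = 100, p = 0.5 are hypothetical.

```python
import math
from statistics import NormalDist

def binomial_cdf_exact(k, n, p):
    """P(X <= k), summing the binomial pmf directly."""
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))

def binomial_cdf_normal(k, n, p):
    """P(X <= k) from a normal curve with mean np and variance npq,
    using a continuity correction of 0.5 (an assumption of this sketch)."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return NormalDist(mean, sd).cdf(k + 0.5)

# Hypothetical example: P(X <= 45) for n = 100 trials with p = 0.5.
exact = binomial_cdf_exact(45, 100, 0.5)
approx = binomial_cdf_normal(45, 100, 0.5)
print(round(exact, 4), round(approx, 4))
```

For small n the two numbers can differ noticeably, which is when the exact sum is the safer choice.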

15

## Expected value and variance of the binomial distribution

###
- the expected number of successes in n trials is the probability of success in one trial, p, multiplied by n, which equals np

- for a given number of trials n, the binomial distribution has the highest variance when p=1/2

- the variance, npq, decreases as p moves away from 1/2, becoming 0 when p = 0 or p = 1
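
With mean np and variance npq, where q = 1 - p, the behavior of the variance across p can be tabulated directly. The value n = 20 and the function names are illustrative.

```python
def binomial_mean(n, p):
    """Expected number of successes in n trials."""
    return n * p

def binomial_variance(n, p):
    """Variance of the number of successes: n * p * q with q = 1 - p."""
    return n * p * (1 - p)

n = 20  # hypothetical number of trials
for p in (0.0, 0.1, 0.3, 0.5, 0.7, 0.9, 1.0):
    print(p, binomial_mean(n, p), binomial_variance(n, p))
```

The printed variances rise to their maximum at p = 0.5 and fall to 0 at both ends.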

16

## Poisson distribution

###
- the Poisson distribution is the second most frequently used discrete distribution after the binomial distribution

- it is usually associated with rare events

- the number of trials is essentially infinite and the number of events can be indefinitely large

- however, probability of k events becomes very small as k increases

- the plotted distribution tends to become more symmetric as the time interval increases, or more specifically, as μ increases

- for a Poisson distribution with parameter μ, the mean and variance are both equal to μ
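
The equality of mean and variance can be checked numerically from the Poisson pmf, P(X = k) = e^(-mu) mu^k / k!. In this sketch the parameter is named `mu`, its value 3.0 is hypothetical, and the infinite sum is truncated at k = 59 on the assumption that the remaining tail is negligible for this parameter.

```python
import math

def poisson_pmf(k, mu):
    """P(X = k) = exp(-mu) * mu**k / k!"""
    return math.exp(-mu) * mu**k / math.factorial(k)

mu = 3.0  # hypothetical parameter value

# Truncate the infinite sum where the remaining tail is negligible.
ks = range(60)
mean = sum(k * poisson_pmf(k, mu) for k in ks)
variance = sum((k - mean) ** 2 * poisson_pmf(k, mu) for k in ks)

print(round(mean, 6), round(variance, 6))  # 3.0 3.0
```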

17