Chapter 4: Normal Distribution Flashcards Preview

Flashcards in Chapter 4: Normal Distribution Deck (44)
1
Q

The Normal Distribution

A
  • The normal distribution is the most important one in all of probability and statistics.
  • Many numerical populations have distributions that can be fit very closely by an appropriate normal curve.
  • Examples include heights, weights, and other physical characteristics, measurement errors in scientific experiments, reaction times in psychological experiments, measurements of intelligence and aptitude, scores on various tests, and numerous economic measures and indicators.

2
Q

Normal Distribution definition

A
3
Q

Parameters of the Normal Distribution

A
4
Q

Normal Distribution Graphs with different parameters (means and variances)

A
5
Q

Normal Distribution Graphs with different parameters (means and variances) (contd.)

A
6
Q

σ and µ in Normal Distribution Graph

A
  • Each density curve is symmetric about µ and bell-shaped, so the center of the bell (point of symmetry) is both the mean of the distribution and the median.
  • The value of σ is the distance from µ to the inflection points of the curve (the points at which the curve changes from turning downward to turning upward).
  • Large values of σ yield graphs that are quite spread out about µ, whereas small values of σ yield graphs with a high peak above µ and most of the area under the graph quite close to µ.
  • Thus a large σ implies that a value of X far from µ may well be observed, whereas such a value is quite unlikely when σ is small.
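
The effect of σ described above can be checked numerically. The following is a minimal sketch using Python's standard-library NormalDist; the mean of 0, the three σ values, and the distance of 2 units are hypothetical choices made only for illustration.

```python
from statistics import NormalDist


def peak_height(mu, sigma):
    """Height of the normal density curve at its center (x = mu)."""
    return NormalDist(mu, sigma).pdf(mu)


def tail_beyond(mu, sigma, d):
    """P(|X - mu| > d): probability of landing more than d units from mu."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu - d) + (1 - dist.cdf(mu + d))


mu = 0.0  # hypothetical mean, chosen only for illustration
for sigma in (0.5, 1.0, 2.0):  # hypothetical standard deviations
    print(
        f"sigma={sigma}: peak height={peak_height(mu, sigma):.3f}, "
        f"P(|X - mu| > 2)={tail_beyond(mu, sigma, 2):.4f}"
    )
```

As σ grows, the printed peak height falls and the probability of a value far from µ rises, which matches the bullets above.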
7
Q

Every normal curve (regardless of its mean or standard deviation) conforms to the following “rule”:

A
  • About 68% of the area under the curve falls within 1 standard deviation of the mean.
  • About 95% of the area under the curve falls within 2 standard deviations of the mean.
  • About 99.7% of the area under the curve falls within 3 standard deviations of the mean.
  • Collectively, these points are known as the empirical rule or the 68-95-99.7 rule. Clearly, given a normal distribution, most outcomes will be within 3 standard deviations of the mean.
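
The three percentages can be reproduced from the normal cdf. This is a small sketch using Python's standard-library NormalDist; the helper name area_within is an illustrative choice, not something from the slides.

```python
from statistics import NormalDist


def area_within(k, mu=0.0, sigma=1.0):
    """Area under the normal curve within k standard deviations of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


for k in (1, 2, 3):
    print(f"within {k} sd: {area_within(k):.4f}")
```

Calling area_within with any other mean and standard deviation returns the same three areas, which is why the rule holds for every normal curve.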
8
Q

The Standard Normal Distribution

A
9
Q

Parameters of a Standard Normal Distribution

A

Definition
The normal distribution with parameter values µ = 0 and σ = 1 is called the standard normal distribution.
A random variable having a standard normal distribution is called a standard normal random variable and will be denoted by Z.

The pdf of Z is: f(z; 0, 1) = (1/√(2π)) e^(–z²/2), for –∞ < z < ∞.

10
Q

Example 13

A

(see PowerPoint slides 13-19)

11
Q

Example 14: 99th Percentile

A
  • The 99th percentile of the standard normal distribution is that value on the horizontal axis such that the area under the z curve to the left of the value is .9900.
  • Appendix Table A.3 gives, for fixed z, the area under the standard normal curve to the left of z, whereas here we have the area and want the value of z.
  • This is the “inverse” of the problem P(Z <= z) = ?, so the table is used in an inverse fashion:
  • Find .9900 (or the entry closest to it) in the body of the table; the row and column in which it lies identify the 99th z percentile.

Here .9901 lies at the intersection of the row marked 2.3 and column marked .03, so the 99th percentile is (approximately) z = 2.33.
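
The same inverse lookup can be done in software instead of with the table. A minimal sketch, assuming Python's standard-library NormalDist (its inv_cdf method plays the role of reading Table A.3 backwards):

```python
from statistics import NormalDist

std_normal = NormalDist()  # mu = 0, sigma = 1

# Inverse problem: given the left-tail area, find z.
z99 = std_normal.inv_cdf(0.99)
print(f"99th percentile: z = {z99:.4f}")

# Forward check: area to the left of the table value z = 2.33.
print(f"area left of 2.33: {std_normal.cdf(2.33):.4f}")
```

The computed percentile agrees with the table value 2.33 to two decimal places.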

12
Q

Percentiles of the Standard Normal Distribution

A
  • In general, the (100p)th percentile is identified by the row and column of Appendix Table A.3 in which the entry p is found (e.g., the 67th percentile is obtained by finding .6700 in the body of the table, which gives z = .44).
  • If p does not appear, the number closest to it is often used, although linear interpolation gives a more accurate answer.
  • For example, to find the 95th percentile, we look for .9500 inside the table.
  • Although .9500 does not appear, both .9495 and .9505 do, corresponding to z = 1.64 and 1.65, respectively.
  • Since .9500 is halfway between the two probabilities that do appear, we will use 1.645 as the 95th percentile and –1.645 as the 5th percentile.
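
The interpolation step for the 95th percentile can be sketched as follows. The table entries (.9495 at z = 1.64 and .9505 at z = 1.65) are the ones quoted above; the function name interpolate_z is hypothetical, and the comparison value comes from Python's standard-library NormalDist.

```python
from statistics import NormalDist


def interpolate_z(p, p_lo, z_lo, p_hi, z_hi):
    """Linear interpolation between two neighboring table entries."""
    return z_lo + (p - p_lo) / (p_hi - p_lo) * (z_hi - z_lo)


# Table entries quoted in the text: .9495 <-> 1.64 and .9505 <-> 1.65
z_table = interpolate_z(0.9500, 0.9495, 1.64, 0.9505, 1.65)

# Direct numerical inverse of the standard normal cdf, for comparison
z_exact = NormalDist().inv_cdf(0.95)

print(f"interpolated: {z_table:.4f}, computed: {z_exact:.4f}")
```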
13
Q

zα Notation for z Critical Values

A

In statistical inference, we will need the values on the horizontal z-axis that capture certain small tail areas under the standard normal curve.

Notation
zα will denote the value on the z-axis for which α of the area under the z curve lies to the right of zα.
(See Figure 4.19.)

For example, z.10 captures upper-tail area .10, and z.01 captures upper-tail area .01.

Since α of the area under the z curve lies to the right of zα, 1 – α of the area lies to its left. Thus zα is the 100(1 – α)th percentile of the standard normal distribution.

By symmetry, the area under the standard normal curve to the left of –zα is also α. The zα’s are usually referred to as z critical values.
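
Because a z critical value is just the 100(1 – α)th percentile, it can be computed directly. A minimal sketch using Python's standard-library NormalDist; the function name z_alpha is an illustrative choice.

```python
from statistics import NormalDist

std_normal = NormalDist()


def z_alpha(alpha):
    """z critical value: the point with upper-tail area alpha to its right."""
    return std_normal.inv_cdf(1 - alpha)


for alpha in (0.10, 0.05, 0.01):
    z = z_alpha(alpha)
    # By symmetry, the area to the left of -z is also alpha.
    print(f"alpha={alpha}: z={z:.3f}, area left of -z={std_normal.cdf(-z):.4f}")
```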

14
Q

Most Useful z percentiles and zα values

A
15
Q

Example 15

A
16
Q

Non-standard Normal Distributions

A
17
Q

Non-standard Normal Distributions (contd.)

A
18
Q

Non-standard Normal Distributions (contd. part 2)

A

The key idea of the proposition is that by standardizing, any probability involving X can be expressed as a probability involving a standard normal rv Z, so that Appendix Table A.3 can be used.

This is illustrated in Figure 4.21.

19
Q

Example 16

  • The time that it takes a driver to react to the brake lights on a decelerating vehicle is critical in helping to avoid rear-end collisions.
  • The article “Fast-Rise Brake Lamp as a Collision-Prevention Device” (Ergonomics, 1993: 391–395) suggests that reaction time for an in-traffic response to a brake signal from standard brake lights can be modeled with a normal distribution having mean value 1.25 sec and standard deviation of .46 sec.

What is the probability that reaction time is between 1.00 sec and 1.75 sec?

A
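
One way to work this out is to standardize both endpoints and use the standard normal cdf. This is a sketch under the stated model (µ = 1.25, σ = .46), not a transcription of the slide; the helper name prob_between is hypothetical, and Python's standard-library NormalDist stands in for Appendix Table A.3.

```python
from statistics import NormalDist

std_normal = NormalDist()  # stands in for Appendix Table A.3


def prob_between(a, b, mu, sigma):
    """P(a <= X <= b) for X normal(mu, sigma), computed by standardizing."""
    z_a = (a - mu) / sigma
    z_b = (b - mu) / sigma
    return std_normal.cdf(z_b) - std_normal.cdf(z_a)


mu, sigma = 1.25, 0.46  # reaction-time model from the example
p = prob_between(1.00, 1.75, mu, sigma)
print(f"P(1.00 <= X <= 1.75) = {p:.4f}")
```

The result is roughly .57; a table-based answer can differ in the third decimal place because the z values are rounded to two decimals before the lookup.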
20
Q

Example 16 contd.

A
21
Q

Percentiles of an Arbitrary Normal Distribution

A
22
Q

Example 18

The amount of distilled water dispensed by a certain machine is normally distributed with mean value 64 oz and standard deviation .78 oz.

What container size c will ensure that overflow occurs only .5% of the time? If X denotes the amount dispensed, the desired condition is that P(X > c) = .005, or, equivalently, that P(X <= c) = .995.

A

Thus c is the 99.5th percentile of the normal distribution with µ = 64 and σ = .78.
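
The percentile can be computed by un-standardizing: c = µ + σ · (99.5th percentile of Z). A minimal sketch, assuming Python's standard-library NormalDist in place of the table lookup:

```python
from statistics import NormalDist

mu, sigma = 64.0, 0.78  # dispensing model from the example (oz)

# 99.5th percentile of the standard normal distribution
z = NormalDist().inv_cdf(0.995)

# Un-standardize to get the percentile of X
c = mu + z * sigma
print(f"z = {z:.3f}, container size c = {c:.2f} oz")

# Check: overflow probability P(X > c) should be about .005
overflow = 1 - NormalDist(mu, sigma).cdf(c)
print(f"P(X > c) = {overflow:.4f}")
```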

23
Q

Example 18 contd.

A
24
Q

The Normal Distribution and Discrete Populations

A
  • The normal distribution is often used as an approximation to the distribution of values in a discrete population.
  • In such situations, extra care should be taken to ensure that probabilities are computed in an accurate manner.
25
Q

Normal approximation to binomial

A
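
A common form of the approximation uses µ = np and σ = √(np(1 – p)) together with a continuity correction of .5, which is one instance of the "extra care" needed for discrete populations. The sketch below is an assumption about what the slide covers, and the values n = 50, p = .25, x = 10 are hypothetical, chosen only to compare the approximation against the exact binomial cdf.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_cdf(x, n, p):
    """Exact P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


def normal_approx_cdf(x, n, p, correction=0.5):
    """Normal approximation to P(X <= x), with a continuity correction."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return NormalDist(mu, sigma).cdf(x + correction)


n, p, x = 50, 0.25, 10  # hypothetical values for illustration
exact = binom_cdf(x, n, p)
corrected = normal_approx_cdf(x, n, p)
uncorrected = normal_approx_cdf(x, n, p, correction=0.0)
print(f"exact={exact:.4f}, corrected={corrected:.4f}, uncorrected={uncorrected:.4f}")
```

Comparing the three printed values shows how much the .5 correction matters when a continuous curve is used for a discrete count.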
26
Q

Gamma Distribution

A
27
Q

Gamma Density Curves

A
28
Q

Gamma Distribution Parameters

A
29
Q

The Chi-squared Distribution

A
30
Q

Chi-squared Densities

A
31
Q

Chi-Squared Distribution

A
  • Basis for a number of procedures in statistical inference (apparent in coming lectures)
  • Statistical tables for the chi-squared distribution appear in Table A.7 of your textbook.
32
Q

Log-Normal Distribution

A
33
Q

The Weibull Distribution

A
34
Q

The Weibull Distribution (contd.)

A
35
Q

Weibull Distribution Density Curves

A
36
Q

Weibull Distribution Parameters

A
37
Q

Cdf of Weibull Distribution

A
38
Q

Example 25: Weibull Distribution

A
39
Q

Example 25: Weibull Distribution (contd.)

A
40
Q

Probability Plots Introduction

A
  • An investigator will often have obtained a numerical sample x1, x2, …, xn and wish to know whether it is plausible that it came from a population distribution of some particular type (e.g., from a normal distribution).
  • For one thing, many formal procedures from statistical inference are based on the assumption that the population distribution is of a specified type.
  • The use of such a procedure is inappropriate if the actual underlying probability distribution differs greatly from the assumed type.
  • For example, the article “Toothpaste Detergents: A Potential Source of Oral Soft Tissue Damage” (Intl. J. of Dental Hygiene, 2008: 193–198) contains the following statement:
  • “Because the sample number for each experiment (replication) was limited to three wells per treatment type, the data were assumed to be normally distributed.”
  • As justification for this leap of faith, the authors wrote that “Descriptive statistics showed standard deviations that suggested a normal distribution to be highly likely.” Note: This argument is not very persuasive.
  • Additionally, understanding the underlying distribution can sometimes give insight into the physical mechanisms involved in generating the data.
  • An effective way to check a distributional assumption is to construct what is called a probability plot.

41
Q

Probability Plots

A
  • The essence of such a plot is that if the distribution on which the plot is based is correct, the points in the plot should fall close to a straight line.
  • If the actual distribution is quite different from the one used to construct the plot, the points will likely depart substantially from a linear pattern.
42
Q

Example 29: Probability Plots

A
  • The value of a certain physical constant is known to an experimenter.
  • The experimenter makes n = 10 independent measurements of this value, using a particular measurement device and records the resulting measurement errors (error = observed value – true value).
  • The percentiles of the sample data appear below. The needed standard normal (z) percentiles are also displayed in the table.
  • Is it plausible that the random variable measurement error has a standard normal distribution?
    • Thus the points in the probability plot are:

(–1.645, –1.91), (–1.037, –1.25),…, and (1.645, 1.56).
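
The z percentiles in these pairs can be regenerated in code. The sketch below assumes the i-th smallest of n observations is treated as the 100(i – .5)/n th sample percentile; that convention is not stated on the card but is consistent with the first pair shown, where –1.645 is the 5th z percentile for n = 10. Only the z values are computed here, since the full sample is not reproduced on the card.

```python
from statistics import NormalDist


def z_percentiles(n):
    """Standard normal percentiles paired with the n ordered observations.

    Assumes the i-th smallest observation is the 100(i - .5)/n th percentile.
    """
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


zs = z_percentiles(10)
print([round(z, 3) for z in zs])
```

Each z value is then paired with the sorted observation of the same rank to form the plotted points. Values read from a printed table may differ from these in the last digit.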

43
Q

Example 29: Probability Plots (contd.)

A
  • Figure 4.33 shows the resulting plot. Although the points deviate a bit from the 45° line, the predominant impression is that this line fits the points very well.
  • The plot suggests that the standard normal distribution is a reasonable probability model for measurement error.
44
Q

Example 29: contd. pt. 2

A
  • Similarly, the two largest sample observations are much smaller than the associated z percentiles.
  • This plot indicates that the standard normal distribution would not be a plausible choice for the probability model that gave rise to these observed measurement errors.