Week 2 - Lesson 5.2 The Density Curve of the Normal Distribution Flashcards

1
Q

An idealized representation of a distribution in which the area under the curve is defined to be 1. The curve need not be normal, but the normal density curve will be the most useful to us.

A

Density curve

2
Q

We already know from the Empirical Rule that approximately 2/3 of the data in a normal distribution lies within 1
standard deviation of the mean. With a normal density curve, this means that about 68% of the total area under the curve is within z-scores of ±1. Look at the following three density curves:

A

-read

3
Q

You may have noticed that the density curve changes shape at two points in each of our examples. These are the
points where the curve changes concavity. Between these two points, on either side of the mean, the
curve is concave down. (It looks like a mountain, or ’n’ shape.) Beyond these points, the curve is concave
up. (It looks like a valley, or ’u’ shape.) The points at which the curve changes concavity
are called the inflection points. On a normal density curve, these inflection points are always exactly
one standard deviation away from the mean.
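
The claim about concavity can be checked numerically. The following is a minimal Python sketch, not part of the original lesson: the helper `second_difference`, the step size, and the choice of the standard normal curve are all assumptions made for illustration. It estimates the second derivative of the density with a central finite difference; a negative value indicates concave down and a positive value indicates concave up.

```python
from statistics import NormalDist

def second_difference(dist, x, h=1e-4):
    """Finite-difference estimate of the density's second derivative at x
    (hypothetical helper, not a library function)."""
    return (dist.pdf(x + h) - 2 * dist.pdf(x) + dist.pdf(x - h)) / h ** 2

# Standard normal curve (mean 0, standard deviation 1), chosen for illustration.
curve = NormalDist(mu=0.0, sigma=1.0)

# Slightly inside one standard deviation of the mean: expect concave down.
inside = second_difference(curve, 0.9)
# Slightly outside one standard deviation of the mean: expect concave up.
outside = second_difference(curve, 1.1)

print(inside < 0, outside > 0)  # True True
```

The sign flips between 0.9 and 1.1 standard deviations, which is consistent with an inflection point at one standard deviation from the mean.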

A

-read

4
Q

Example: Estimate the standard deviation of the distribution represented by the following histogram.

A

This distribution is fairly normal, so we could draw a density curve to approximate it as follows:

Now estimate the inflection points as shown below:

It appears that the mean is about 0.5 and that the x-coordinates of the inflection points are about 0.45 and 0.55,
respectively. This would lead to an estimate of about 0.05 for the standard deviation.

The actual statistics for this distribution are as follows:
s ≈ 0.04988
x̄ ≈ 0.4997

We can verify these figures by using the expectations from the Empirical Rule. In the following graph, we have
highlighted the bins that are contained within one standard deviation of the mean.

If you estimate the relative frequencies from each bin, their total is remarkably close to 68%. Make sure to divide
the relative frequencies from the bins on the ends by 2 when performing your calculation.

5
Q

While it is convenient to estimate areas under a normal curve using the Empirical Rule, we often need more precise
methods to calculate these areas. Luckily, we can use formulas or technology to help us with the calculations.

A

-read

6
Q

All normal distributions have the same basic shape, so rescaling and re-centering can be used
to change any normal distribution into one with a mean of 0 and a standard deviation of 1. This configuration is
referred to as a standard normal distribution. In a standard normal distribution, the variable along the horizontal
axis is the z-score. This score is another measure of the performance of an individual score in a population. To
review, the z-score measures how many standard deviations a score is away from the mean. The z-score of the term
x in a population distribution whose mean is µ and whose standard deviation is σ is given by: z = (x − µ) / σ. Since σ is
always positive, z will be positive when x is greater than µ and negative when x is less than µ. A z-score of 0 means
that the term has the same value as the mean. The value of z is the number of standard deviations the given value of
x is above or below the mean.

A

-read

7
Q

Example: On a nationwide math test, the mean was 65 and the standard deviation was 10. If Robert scored 81, what
was his z-score?

A

z = (81 − 65) / 10 = 1.6
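
The arithmetic behind this answer can be sketched in Python. The function name `z_score` is an assumption made for illustration; it applies the definition of a z-score, the distance from the mean divided by the standard deviation.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Robert's score of 81 on a test with mean 65 and standard deviation 10.
robert_z = z_score(81, mean=65, sd=10)
print(robert_z)  # 1.6
```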

8
Q

Example: On a college entrance exam, the mean was 70 and the standard deviation was 8. If Helen’s z-score was
−1.5, what was her exam score?

A

x = 70 + (−1.5)(8) = 58
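
Going from a z-score back to a raw score reverses the z-score formula. The sketch below assumes a z-score of −1.5, the value that yields the stated answer of 58; the function name `score_from_z` is hypothetical.

```python
def score_from_z(z, mean, sd):
    """Raw score that sits z standard deviations away from the mean."""
    return mean + z * sd

# Helen's exam: mean 70, standard deviation 8, z-score of -1.5.
helen_score = score_from_z(-1.5, mean=70, sd=8)
print(helen_score)  # 58.0
```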

9
Q

Now you will see how z-scores are used to determine the probability of an event.
Suppose you were to toss 8 coins 256 times. The following figure shows the histogram and the approximating
normal curve for the experiment. The random variable represents the number of tails obtained.

A

The blue section of the graph represents the probability that exactly 3 of the coins turned up tails. One way to
determine this is by the following:

Geometrically, this probability represents the area of the blue shaded bar divided by the total area of the bars. The
area of the blue shaded bar is approximately equal to the area under the normal curve from 2.5 to 3.5.
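
The comparison between the bar and the area under the curve can be sketched in Python. This is an illustration under stated assumptions: the coins are taken to be fair, and the approximating normal curve is assumed to have the binomial mean n·p = 4 and standard deviation √(n·p·(1 − p)) = √2, neither of which is stated explicitly in the card.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 8, 0.5  # 8 coins, assumed fair

# Exact binomial probability of exactly 3 tails: C(8, 3) / 2**8 = 56/256.
exact = comb(n, 3) * p ** 3 * (1 - p) ** (n - 3)

# Assumed approximating normal curve with the binomial mean and standard deviation.
approx_curve = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

# Area under the normal curve from 2.5 to 3.5, matching the width of the bar at 3.
approx = approx_curve.cdf(3.5) - approx_curve.cdf(2.5)

print(exact, approx)
```

Under these assumptions the exact value is 0.21875 and the normal-curve area is roughly 0.217, so the two agree to about two decimal places.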

Since areas under normal curves correspond to the probability of an event occurring, a special normal distribution
table is used to calculate the probabilities. This table can be found in any statistics book, but it is seldom used today.
The following is an example of a table of z-scores and a brief explanation of how it works:

10
Q

The values inside the given table represent the areas under the standard normal curve for values between 0 and
the given z-score. For example, to determine the area under the curve between z-scores of 0 and 2.36, look in
the intersecting cell for the row labeled 2.3 and the column labeled 0.06. The area under the curve is 0.4909. To
determine the area between 0 and a negative value, look in the intersecting cell of the row and column which sum
to the absolute value of the number in question. For example, the area under the curve between −1.3 and 0 is equal
to the area under the curve between 0 and 1.3, so look at the cell that is the intersection of the 1.3 row and the 0.00
column. (The area is 0.4032.)
It is extremely important, especially when you first start with these calculations, that you get in the habit of relating
it to the normal distribution by drawing a sketch of the situation. In this case, simply draw a sketch of a standard
normal curve with the appropriate region shaded and labeled.
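
The table lookups described here can be reproduced in Python as a rough cross-check. The helper `table_area` is hypothetical; it assumes the table stores the area between 0 and |z|, as the card describes, and uses `statistics.NormalDist` from the standard library.

```python
from statistics import NormalDist

def table_area(z):
    """Area under the standard normal curve between 0 and z.

    Uses |z| because the curve is symmetric about 0 (hypothetical helper).
    """
    return NormalDist().cdf(abs(z)) - 0.5

print(round(table_area(2.36), 4))   # 0.4909
print(round(table_area(-1.30), 4))  # 0.4032
```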

A

-read

11
Q

Example: Find the probability of choosing a value that is greater than z = −0.528. Before even using the table, first
draw a sketch and estimate the probability. This z-score is just below the mean, so the answer should be more than
0.5.

A

Next, read the table to find the correct probability for the data below this z-score. We must first round this z-score
to −0.53, so this will slightly under-estimate the probability, but it is the best we can do using the table. The table
returns a value of 0.5 − 0.2019 = 0.2981 as the area below this z-score. Because the area under the density curve is
equal to 1, we can subtract this value from 1 to find the correct probability of about 0.7019.
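
The effect of rounding the z-score can be seen with a short Python sketch. It assumes the z-score in question is −0.528 (just below the mean, as the card states) and uses the standard library's `statistics.NormalDist` in place of the table.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Table-style answer: z rounded to two decimals before the lookup.
table_style = 1 - std_normal.cdf(-0.53)

# Same calculation without rounding the z-score.
unrounded = 1 - std_normal.cdf(-0.528)

print(table_style, unrounded)
```

The rounded version comes out near 0.7019 and the unrounded version near 0.7012, so rounding changes the answer in the fourth decimal place.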

What about values between two z-scores? While it is an interesting and worthwhile exercise to do this using a table,
it is so much simpler using software or a graphing calculator.

12
Q

Example: Find P(−2.60 < z < 1.30)

A

This probability can be calculated as follows:
P(−2.60 < z < 1.30) = P(z < 1.30) − P(z < −2.60) = 0.9032 − 0.0047 = 0.8985

13
Q

The probability P(−2.60 < z < 1.30) can also be found using the TI-83/84 calculator. Use the ’normalcdf(−2.60, 1.30, 0, 1)’ command, and the
calculator will return the result 0.898538. The syntax for this command is ’normalcdf(min, max, µ, σ)’. When
using this command, you do not need to first standardize. You can use the mean and standard deviation of the given
distribution.
Technology Note: The ’normalcdf(’ Command on the TI-83/84 Calculator
Your graphing calculator has already been programmed to calculate probabilities for a normal density curve using
what is called a cumulative distribution function. The command you will use is found in the DISTR menu, which you
can bring up by pressing [2ND][DISTR].
Press [2] to select the ’normalcdf(’ command, which has a syntax of ’normalcdf(lower bound, upper bound, mean,
standard deviation)’.
The command has been programmed so that if you do not specify a mean and standard deviation, it will default to
the standard normal curve, with µ = 0 and σ = 1.
For example, entering ’normalcdf(−1, 1)’ will specify the area within one standard deviation of the mean, which we
already know to be approximately 0.68.
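
For readers without a TI-83/84, a rough Python analogue of the command can be sketched as follows. The function below is an assumption made for illustration: it borrows the calculator's name and argument order, but it is built on the standard library's `statistics.NormalDist` and is not the calculator's own implementation.

```python
from statistics import NormalDist

def normalcdf(lower, upper, mean=0.0, sd=1.0):
    """Area under a normal curve between lower and upper.

    Hypothetical Python analogue of the calculator command; like the
    calculator, it defaults to the standard normal curve.
    """
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(upper) - dist.cdf(lower)

within_one_sd = normalcdf(-1, 1)        # roughly 0.68
between = normalcdf(-2.60, 1.30, 0, 1)  # roughly 0.8985

print(within_one_sd, between)
```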

A

-read

14
Q

’normalcdf(a, b, µ, σ)’ gives values based on the cumulative distribution function of a normal distribution. In other words, it gives the
probability of an event occurring between x = a and x = b, or the area under the probability density curve between the
vertical lines x = a and x = b, where the normal distribution has a mean of µ and a standard deviation of σ. If µ and
σ are not specified, it is assumed that µ = 0 and σ = 1.

A

-read

15
Q

Example: Find the probability that x < 1.58.

A

The calculator command must have both an upper and lower bound. Technically, though, the density curve does not
have a lower bound, as it continues infinitely in both directions. We do know, however, that a very small percentage
of the data is more than 3 standard deviations to the left of the mean. Use −3 as the lower bound and see what answer
you get.

The answer is fairly accurate, but you must remember that there is really still some area under the probability density
curve, even though it is just a little, that we are leaving out if we stop at −3. If you look at the z-table, you can see
that we are, in fact, leaving out about 0.5 − 0.4987 = 0.0013. Next, try going out to −4 and −5.

Once we get to −5, the answer is quite accurate. Since we cannot really capture all the data, entering a sufficiently
small value should be enough for any reasonable degree of accuracy. A quick and easy way to handle this is to enter
−99999 (or “a bunch of nines”). It really doesn’t matter exactly how many nines you enter. The difference between
five and six nines will be beyond the accuracy that even your calculator can display.
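
How much area each choice of lower bound leaves out can be tabulated with a short Python sketch. It assumes the standard normal curve and uses `statistics.NormalDist` rather than the calculator; the dictionary name `omitted` is chosen here for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Area to the left of each candidate lower bound, i.e. the part of the
# curve that is ignored when the calculation starts at that bound.
omitted = {lower: std_normal.cdf(lower) for lower in (-3, -4, -5, -99999)}

for lower, area in omitted.items():
    print(lower, area)
```

The omitted area is about 0.0013 at −3, shrinks by orders of magnitude at −4 and −5, and is effectively zero at −99999, which is why "a bunch of nines" works as a stand-in for negative infinity.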

16
Q

Example: Find the probability for x ≥ −0.528.

A

Right away, we are at an advantage using the calculator, because we do not have to round off the z-score. Enter the
’normalcdf(’ command, using −0.528 to “a bunch of nines.” The nines represent a ridiculously large upper bound
that will ensure that the unaccounted-for probability will be so small that it will be virtually undetectable.

Remember that because of rounding, our answer from the table was slightly too small, so when we subtracted it from
1, our final answer was slightly too large. The calculator answer of about 0.70125 is a more accurate approximation
than the answer arrived at by using the table.

17
Q

In most practical problems involving normal distributions, the curve will not be standardized as we have seen so far, with µ = 0
and σ = 1. When using a z-table, you will first have to standardize the distribution by calculating the z-score(s).

A

-read

18
Q

Example: A candy company sells small bags of candy and attempts to keep the number of pieces in each bag
the same, though small differences due to random variation in the packaging process lead to different amounts in
individual packages. A quality control expert from the company has determined that the number of pieces in
each bag is normally distributed, with a mean of 57.3 and a standard deviation of 1.2. Endy opened a bag of candy
and felt he was cheated. His bag contained only 55 candies. Does Endy have reason to complain?

A

To determine if Endy was cheated, first calculate the z-score for 55:
z = (55 − 57.3) / 1.2 ≈ −1.917

Using a table, the probability of experiencing a value this low is approximately 0.5 − 0.4719 = 0.0281. In other
words, there is about a 3% chance that you would get a bag of candy with 55 or fewer pieces, so Endy should feel
cheated.
Using a graphing calculator, the results would look as follows (the ’Ans’ function has been used to avoid rounding
off the z-score):

However, one of the advantages of using a calculator is that it is unnecessary to standardize. We can simply enter
the mean and standard deviation from the original population distribution of candy, avoiding the z-score calculation
completely.
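
Both routes can be compared in a short Python sketch, using the standard library's `statistics.NormalDist` as a stand-in for the calculator. The variable names are chosen here for illustration; the numbers are those given in the example.

```python
from statistics import NormalDist

mean, sd, observed = 57.3, 1.2, 55

# Route 1: standardize first, then use the standard normal curve.
z = (observed - mean) / sd
p_via_z = NormalDist().cdf(z)

# Route 2: skip the z-score and use the original distribution directly.
p_direct = NormalDist(mu=mean, sigma=sd).cdf(observed)

print(z, p_via_z, p_direct)
```

Both routes give a probability of roughly 0.028 for a bag with 55 or fewer pieces. The small gap from the table's 0.0281 comes from rounding the z-score before the lookup.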

19
Q

A density curve is an idealized representation of a distribution in which the area under the curve is defined as 1,
or in terms of percentages, a probability of 100%. A normal density curve is simply a density curve for a normal
distribution. Normal density curves have two inflection points, which are the points on the curve where it changes
concavity. These points correspond to the points in the normal distribution that are exactly 1 standard deviation away
from the mean. Applying the Empirical Rule tells us that the area under the normal density curve between these
two points is approximately 0.68. This is most commonly thought of in terms of probability (e.g., the probability of
choosing a value at random from this distribution and having it be within 1 standard deviation of the mean is 0.68).
Calculating other areas under the curve can be done by using a z-table or by using the ’normalcdf(’ command on the
TI-83/84 calculator. A z-table often provides the area under the standard normal density curve between the mean
and a particular z-score. The calculator command allows you to specify two values, either standardized or not, and
will calculate the area under the curve between these values.

A

-read