OpenIntro 4 Flashcards
(32 cards)
Percentiles
Percentile is the percentage of observations that fall below a given data point. Graphically, percentile is the area below the probability distribution curve to the left of that observation.
Z scores
Z = (observation − mean) / SD
The Z score of an observation is the number of standard deviations it falls above or below the mean. Z scores are defined for distributions of any shape, but only when the distribution is normal can we use Z scores to calculate percentiles. (Observations more than 2 SD away from the mean are usually considered unusual.)
Calculating the Z score
Z = (observation − mean) / SD
If the Z score is negative, the observation is below the mean.
If the Z score is positive, the observation is above the mean.
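A minimal R sketch of this calculation, reusing the numbers from the pnorm(1800, mean = 1500, sd = 300) card in this set. The variable names are only illustrative.

```r
# Values reused from the pnorm(1800, mean = 1500, sd = 300) card
observation <- 1800
mean_value  <- 1500
sd_value    <- 300

# Z score: how many SDs the observation lies from the mean
z <- (observation - mean_value) / sd_value
z   # 1, i.e. one SD above the mean

# The percentile from the Z score matches the direct pnorm() call
pnorm(z)
pnorm(observation, mean = mean_value, sd = sd_value)
```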

pnorm() of a Z score to get the percentile
round(pnorm(1.50), digits=4)
output: 0.9332 = 93.32% of possums fall below this Z score
pnorm() gives the area under the distribution up to (to the left of) the Z score
1 - pnorm() gives the area from the Z score onward (to the right)
Find the proportion of head lengths between 2 Z scores
Subtract the pnorm() of the lower Z score from the pnorm() of the upper Z score
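As a sketch, the proportion of a normal distribution lying between two Z scores is the difference of the two pnorm() areas. The Z scores -0.5 and 1.5 below are made-up values for illustration, not possum data.

```r
z_lower <- -0.5   # hypothetical lower Z score
z_upper <- 1.5    # hypothetical upper Z score

# Area to the left of the upper Z score minus area to the left of the lower one
area_between <- pnorm(z_upper) - pnorm(z_lower)
round(area_between, 4)   # 0.6247
```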
Solving equation for percentile
Calculating percentiles
There are many ways to compute percentiles/areas under the
curve:
In R:
> pnorm(1800, mean = 1500, sd = 300)
output: 0.8413447
At a Heinz ketchup factory, the amounts that go into bottles of ketchup
are supposed to be normally distributed with:
mean 36 oz.
standard deviation 0.11 oz.
What percent of bottles have less than
35.8 ounces of ketchup?
pnorm(35.8, mean = 36, sd = 0.11)
output: 0.0345 = 3.45% of bottles
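The same answer can be reached by standardizing first, and the complement gives the share of bottles with more than 35.8 oz. A small sketch using only the numbers from this card (variable names are illustrative):

```r
mu    <- 36     # mean in oz.
sigma <- 0.11   # standard deviation in oz.

z <- (35.8 - mu) / sigma    # Z score of a 35.8 oz. bottle
p_below <- pnorm(z)         # same area as pnorm(35.8, mean = 36, sd = 0.11)
p_above <- 1 - p_below      # share of bottles with more than 35.8 oz.

round(c(z = z, below = p_below, above = p_above), 4)
```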
qnorm()
Bernoulli random variable
A random variable with only 2 possible outcomes (success or failure)
Geometric distribution
If p represents probability of success, (1 − p) represents probability
of failure, and n represents number of independent trials
P(success on the nth trial) = (1 − p)^(n−1) × p
Geometric distribution needs:
- independence: outcomes of trials don’t affect each other
- identical: the probability of success is the same for each trial
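A hedged R sketch of the formula. Note that R's dgeom() counts the failures before the first success, so success on the nth trial corresponds to dgeom(n - 1, p). The value p = 0.35 is borrowed from the shock-experiment cards, and n = 3 is an arbitrary choice for illustration.

```r
p <- 0.35   # probability of success (taken from the shock-experiment cards)
n <- 3      # illustrative: first success on the 3rd trial

by_formula <- (1 - p)^(n - 1) * p
by_dgeom   <- dgeom(n - 1, prob = p)   # dgeom() takes the number of failures

c(by_formula, by_dgeom)   # both 0.147875
```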
Example from book:
On average, how many transistors would you expect to be produced before the first with a defect? What is the standard deviation?
With p = 0.02 (the probability of a defect):
1/0.02 = 50 transistors are expected to be produced until the first defective one
SD: sqrt((1 - 0.02) / 0.02^2) = 49.50
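The two formulas can be checked in R. This is a sketch using the card's p = 0.02; the variable names are illustrative.

```r
p <- 0.02   # probability that a transistor is defective

expected_trials <- 1 / p                 # mean of the geometric distribution
sd_trials       <- sqrt((1 - p) / p^2)   # standard deviation

expected_trials       # 50
round(sd_trials, 2)   # 49.5
```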
Rolling a 6 (geometric distribution)
Suppose we randomly select four individuals to participate in this
experiment. What is the probability that exactly 1 of them will
refuse to administer the shock?
The Binomial distribution describes the probability of having
exactly k successes in n independent Bernoulli trials with
probability of success p.
Let’s call these people Allen (A), Brittany (B), Caroline (C), and
Damian (D). Each of the four scenarios (A, B, C, or D being the only one who refuses) satisfies the
condition of “exactly 1 of them refuses to administer the shock”.
Each scenario has probability 0.35 × 0.65^3 = 0.0961.
The probability of exactly 1 of 4 people refusing to administer
the shock is the sum of all of these probabilities.
0.0961 + 0.0961 + 0.0961 + 0.0961 = 4 × 0.0961 = 0.3844
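A sketch of the same calculation in R, assuming the refusal probability of 0.35 used in the shock-experiment cards. The unrounded result is 0.384475; the 0.3844 above comes from rounding each scenario to 0.0961 before adding.

```r
p_refuse <- 0.35   # probability that one person refuses (from the shock-experiment cards)

# One specific scenario: one person refuses, the other three do not
one_scenario <- p_refuse * (1 - p_refuse)^3

# Four such scenarios, one per person who could be the refuser
by_hand   <- choose(4, 1) * one_scenario
by_dbinom <- dbinom(1, size = 4, prob = p_refuse)

round(c(one_scenario, by_hand, by_dbinom), 4)   # 0.0961 0.3845 0.3845
```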
When is it a binomial distribution?
What part does pnorm() return?
The area under the curve to the left of the input (the lower tail)
Use 1 - pnorm() to get the area to the right of the input
68-95-99.7 rule
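The rule says that in a normal distribution about 68%, 95%, and 99.7% of observations fall within 1, 2, and 3 standard deviations of the mean. A quick check with pnorm(), as a sketch:

```r
k <- 1:3   # number of SDs on either side of the mean

# Area between -k and +k on the standard normal curve
within_k_sd <- pnorm(k) - pnorm(-k)
round(within_k_sd, 4)   # 0.6827 0.9545 0.9973
```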
Further on Z scores
The z-score is positive if the value lies above the mean, and negative if it lies below the mean
A z-score describes the position of a raw score in terms of its distance from the mean, when measured in standard deviation units
It tells you how far above or below the population mean your raw score is, in standard deviation units.
qnorm() and pnorm()
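The qnorm() function is simply the inverse of pnorm(): qnorm() takes a percentile and returns a Z score, while pnorm() takes a Z score and returns a percentile.
e.g.:
> qnorm(.10)  # input: percentile
[1] -1.281552
> pnorm(-1.28)  # input: Z score
[1] 0.1002726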
Find the cutoff
- the warmest 5 %
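The cutoff for the top 5% is the 95th percentile, which qnorm() gives directly. Converting it back to the original units needs a mean and SD, which this card does not state, so the values below are hypothetical placeholders.

```r
z_cutoff <- qnorm(0.95)   # Z score with 95% of the area to its left, about 1.645

mean_temp <- 77   # hypothetical mean, not from the cards
sd_temp   <- 5    # hypothetical SD, not from the cards

# Convert the Z score back to the original scale
cutoff <- mean_temp + z_cutoff * sd_temp
cutoff

# Equivalent one-step call
qnorm(0.95, mean = mean_temp, sd = sd_temp)
```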
The 4 conditions of the binomial distribution
1: The number of observations n is fixed.
2: Each observation is independent.
3: Each observation represents one of two outcomes (“success” or “failure”).
4: The probability of “success” p is the same for each observation.
The National Vaccine Information Center estimates that 90% of Americans have
had chickenpox by the time they reach adulthood.
(b) Calculate the probability that exactly 97 out of 100 randomly sampled American adults had chickenpox during childhood.
(c) What is the probability that exactly 3 out of a new sample of 100 American adults have not had chickenpox in their childhood?
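> dbinom(97, 100, .90)
[1] 0.005891602
> dbinom(3, 100, .10)
[1] 0.005891602
Both answers match because 97 of 100 having had chickenpox is the same event as 3 of 100 not having had it.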
Geometric distribution - expected value
How many people is Dr. Smith expected to test before finding the first one that refuses to administer the shock?
The expected value, or the mean, of a geometric distribution is defined as 1/p.
µ = 1/p = 1/0.35 = 2.86
She is expected to test 2.86 people before finding the first one that refuses to administer the shock.
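One way to build intuition for this expected value is a simulation, sketched here under the card's p = 0.35. The seed and the number of simulated runs are arbitrary choices.

```r
set.seed(1)   # arbitrary seed, only for reproducibility
p <- 0.35     # probability that a person refuses

# rgeom() returns the number of failures before the first success,
# so add 1 to count the trial on which the success happens
trials_needed <- rgeom(100000, prob = p) + 1

mean(trials_needed)   # should land close to 1 / p
1 / p
```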
Choose function
choose(n, k) gives the number of ways to arrange k successes in n trials.
Note: You can also use R for these calculations:
> choose(9,2)
output: 36
A 2012 Gallup survey suggests that 26.2% of Americans are obese. Among a random sample of 10 Americans, what is the probability that exactly 8 are obese?
(binomial distribution)
choose(10, 8) × (0.262)^8 × (0.738)^2 = 45 × (0.262)^8 × (0.738)^2 = 0.0005
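A sketch checking this in R, both with the formula and with dbinom(). The variable names are illustrative.

```r
n <- 10      # sample size
k <- 8       # number of obese people in the sample
p <- 0.262   # probability that one American is obese

by_formula <- choose(n, k) * p^k * (1 - p)^(n - k)
by_dbinom  <- dbinom(k, size = n, prob = p)

c(by_formula, by_dbinom)   # both about 0.0005
```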