Stats Exam 2 Flashcards Preview

Flashcards in Stats Exam 2 Deck (60):
1

probability (chance)

The likelihood that something will occur. Probability is a mathematical description of randomness and uncertainty. It is a way to measure or quantify uncertainty. The probability of an event ranges from 0 to 1 (0 ≤ P(A) ≤ 1). Probability can be expressed in decimals or percentages.
2 approaches: classical and relative frequencies

2

classical (theoretical) approach

predictable events
(rolling dice)
equally likely, predictable outcomes
P= # of ways to succeed / # of possible outcomes
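
A minimal sketch of the classical formula, assuming two fair six-sided dice. The helper name `classical_probability` and the "sum to 7" event are illustrative, not from the deck.

```python
from fractions import Fraction
from itertools import product

def classical_probability(sample_space, is_success):
    """P = (# of ways to succeed) / (# of equally likely outcomes)."""
    outcomes = list(sample_space)
    successes = [o for o in outcomes if is_success(o)]
    return Fraction(len(successes), len(outcomes))

# Sample space for two fair dice: 36 equally likely (die1, die2) pairs.
two_dice = list(product(range(1, 7), repeat=2))

# Event: the two dice sum to 7 (6 of the 36 pairs).
p_sum_7 = classical_probability(two_dice, lambda roll: sum(roll) == 7)
print(p_sum_7)  # 1/6
```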

3

relative frequencies (empirical)

outcomes NOT inherently predictable
1) run a bunch of trials
2) count the number of successes
P= successful trials/ all trials
empirically derived information
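
The steps above can be sketched as a simulation. The thumbtack scenario and its 40% rate are hypothetical; in a real study the true rate is unknown, which is why trials are run.

```python
import random

def relative_frequency(trial, n_trials, seed=0):
    """Estimate P as (successful trials) / (all trials)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    successes = sum(1 for _ in range(n_trials) if trial(rng))
    return successes / n_trials

def tack_lands_point_up(rng):
    # Hypothetical trial: assume the tack lands point-up 40% of the time.
    return rng.random() < 0.40

estimate = relative_frequency(tack_lands_point_up, n_trials=10_000)
print(estimate)  # close to 0.40, but not exactly 0.40
```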

4

P=0

no chance of occurring

5

P=1

absolute certainty

6

event

A specific outcome (or set of outcomes) of a trial, defined by a scenario or question.

7

simple events

events that cannot be broken down further

8

sample space

the collection of all possible outcomes

9

equal likelihood rule

when all outcomes are equally likely, the probability of event A is the number of ways A can happen divided by the number of outcomes in the sample space

10

Non-disjoint

a double-negative term that just means the events CAN happen together

11

Addition Rule for non-disjoint events

P(A or B) = P(A) + P(B) - P(A and B)
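
A short worked example of this rule, assuming a standard 52-card deck (the card scenario is an illustration, not part of the deck of flashcards).

```python
from fractions import Fraction

# A = draw a heart, B = draw a king, from a standard 52-card deck.
p_heart = Fraction(13, 52)
p_king = Fraction(4, 52)
p_heart_and_king = Fraction(1, 52)  # the king of hearts belongs to both events

# Subtract the overlap so the king of hearts is not counted twice.
p_heart_or_king = p_heart + p_king - p_heart_and_king
print(p_heart_or_king)  # 4/13
```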

12

disjoint

events that CANNOT co-occur

13

OR vs AND

OR= ADD
AND= MULTIPLY

14

Addition Rule for disjoint outcomes

P(A or B) = P(A) + P(B)

15

Multiply independent probabilities when...

1. two or more conditions must exist
2. two or more outcomes must occur together or sequentially
The events must be independent and non-disjoint.
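
As a small illustration, assuming a fair die and a fair coin that do not influence each other (the helper name `p_all_occur` is hypothetical):

```python
from fractions import Fraction
from math import prod

def p_all_occur(independent_probs):
    """P(A and B and ...) for independent events: multiply the probabilities."""
    return prod(independent_probs)

p_six = Fraction(1, 6)    # roll a 6 on a fair die
p_heads = Fraction(1, 2)  # then flip heads on a fair coin

p_six_then_heads = p_all_occur([p_six, p_heads])
print(p_six_then_heads)  # 1/12
```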

16

complementary probabilities

two mutually exclusive outcomes with a combined probability of 1
P(A) + P(not A)=1

17

P(at least one)

Use complementary probabilities
1) the complement of "at least one" is "none"
P(at least one) + P(none) = 1
2) P(none) is calculated in one step
3) Subtract P(none) from 1
P(at least one) = 1 - P(none)
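
The three steps can be sketched as follows, assuming four independent rolls of a fair die (an illustrative scenario).

```python
from fractions import Fraction

# Question: what is P(at least one 6) in four rolls of a fair die?
p_no_six_one_roll = Fraction(5, 6)
n_rolls = 4

# Step 2: P(none) in one step, multiplying independent probabilities.
p_none = p_no_six_one_roll ** n_rolls

# Step 3: subtract P(none) from 1.
p_at_least_one = 1 - p_none
print(p_at_least_one)         # 671/1296
print(float(p_at_least_one))  # roughly 0.518
```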

18

conditional probability

the probability of an event or condition given that another influential event or condition already occurred

19

conditional probability notation

P(B|A)= probability of B given that A has occurred or among subjects characterized by A

20

Benefits of 2-way table in probability

1) easy to set up
2) clearly display the sample space, event, & simple events
3) conditional probabilities can be found with fewer calculations & no formulas
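
A sketch of reading conditional probabilities straight off a 2-way table. The counts are made up for illustration, and `total` is a hypothetical helper.

```python
from fractions import Fraction

# Hypothetical counts (rows: A / not A, columns: B / not B).
table = {
    ("A", "B"): 30,     ("A", "not B"): 70,
    ("not A", "B"): 20, ("not A", "not B"): 180,
}

def total(row=None, col=None):
    """Sum the cells matching the given row and/or column label."""
    return sum(n for (r, c), n in table.items()
               if row in (None, r) and col in (None, c))

# Conditioning on A restricts the sample space to row A (100 subjects).
p_b_given_a = Fraction(table[("A", "B")], total(row="A"))          # 30/100
p_not_b_given_a = Fraction(table[("A", "not B")], total(row="A"))  # 70/100

# Conditioning on B restricts the sample space to column B (50 subjects).
p_a_given_b = Fraction(table[("A", "B")], total(col="B"))          # 30/50

print(p_b_given_a, p_a_given_b)  # 3/10 3/5
```

Here P(B|A) and P(A|B) differ and do not sum to 1, while P(B|A) and P(not B|A) share a sample space and do.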

21

important notes regarding conditional probabilities

1) P(A|B) is the inverse of P(B|A).
2) P(A|B) is NOT the complement of P(B|A)
3) Complementary probabilities have the same sample space: P(A|B) and P(not A|B)

22

4 tests to identify conditional probabilities

All are false or all are true. If ANY are false, the probabilities are conditional.
1) P(B | A) = P(B)
2) P(A | B) = P(A)
3) P(B | A) = P(B | not A)
4) P(A and B) = P(A) * P(B)
only need one test

23

General Addition Rules for conditional probabilities

P(A or B) = P(A) + P(B) - P(A and B)

24

General multiplication rule for conditional probabilities

P(A and B) = P(A)*P(B|A)

25

conditional probabilities formula

P(B|A) = P(A and B)/ P(A)
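
A quick check of the formula, assuming one roll of a fair die (the events A and B are chosen only for illustration).

```python
from fractions import Fraction

die = range(1, 7)
A = {x for x in die if x > 3}       # A: roll is greater than 3 -> {4, 5, 6}
B = {x for x in die if x % 2 == 0}  # B: roll is even -> {2, 4, 6}

p_a = Fraction(len(A), 6)
p_a_and_b = Fraction(len(A & B), 6)

p_b_given_a = p_a_and_b / p_a
print(p_b_given_a)  # 2/3

# Same answer by counting inside the reduced sample space A.
assert p_b_given_a == Fraction(len(A & B), len(A))
# Rearranged, the formula is the general multiplication rule.
assert p_a_and_b == p_a * p_b_given_a
```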

26

Law of total probability

P(B) = P(A) * P(B | A) + P(not A) * P(B | not A)
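
A worked instance of the formula with hypothetical numbers: A = "it rains", B = "the bus is late".

```python
from fractions import Fraction

# All three inputs are made-up values for illustration.
p_rain = Fraction(3, 10)                 # P(A)
p_late_given_rain = Fraction(1, 2)       # P(B | A)
p_late_given_no_rain = Fraction(1, 10)   # P(B | not A)

# Weight each conditional probability by how often its condition occurs.
p_late = (p_rain * p_late_given_rain
          + (1 - p_rain) * p_late_given_no_rain)
print(p_late)  # 11/50
```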

27

Bayes Theorem

1) A method for mathematically relating inverse conditional probabilities
2) Applies to sequential events that are not independent (are conditional) and assumes that one event has already happened.
relates P(A|B) to P(B|A)
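
One common way to write the theorem is P(A|B) = P(A) * P(B|A) / P(B), with P(B) expanded as P(A) * P(B|A) + P(not A) * P(B|not A). A sketch under that form follows; the function name `bayes` and the factory numbers are hypothetical.

```python
from fractions import Fraction

def bayes(p_a, p_b_given_a, p_b_given_not_a):
    """Return P(A|B) from P(A), P(B|A), and P(B|not A)."""
    # Denominator: P(B), summed over the cases A and not A.
    p_b = p_a * p_b_given_a + (1 - p_a) * p_b_given_not_a
    return p_a * p_b_given_a / p_b

# Hypothetical factory: machine A makes 60% of parts; 2% of A's parts
# are defective (B), versus 5% of parts from the other machine.
p_a_given_defect = bayes(
    p_a=Fraction(3, 5),
    p_b_given_a=Fraction(1, 50),
    p_b_given_not_a=Fraction(1, 20),
)
print(p_a_given_defect)  # 3/8
```

Note that P(B|A) is 2% while the inverse, P(A|B), is 37.5%.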

28

Sensitivity

Test accuracy when a condition is present. P(S|D)

29

Specificity

Test accuracy when a condition is absent.
P(not S|not D)

30

False Positive

+ test, but no disease/condition
P(S|not D)

31

False Negative

- test, but disease/condition exists
P(not S|D)

32

Positive predictive value (PPV)

The probability a disease/condition is present, given a positive test.
P(D|S)

33

Negative Predictive Value (NPV)

The probability a disease/condition is absent, given a negative test.
P(not D|not S)
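
The predictive values can be derived from sensitivity, specificity, and the prevalence P(D). The sketch below uses the deck's S (positive test) and D (disease) notation; the function name and the input values are hypothetical.

```python
from fractions import Fraction

def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) = (P(D|S), P(not D|not S))."""
    true_pos = sensitivity * prevalence               # P(S and D)
    false_pos = (1 - specificity) * (1 - prevalence)  # P(S and not D)
    true_neg = specificity * (1 - prevalence)         # P(not S and not D)
    false_neg = (1 - sensitivity) * prevalence        # P(not S and D)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Hypothetical test: 95% sensitive, 90% specific, 1% prevalence.
ppv, npv = predictive_values(
    sensitivity=Fraction(95, 100),
    specificity=Fraction(90, 100),
    prevalence=Fraction(1, 100),
)
print(float(ppv))  # roughly 0.088
print(float(npv))  # roughly 0.9994
```

With a rare condition, most positive tests are false positives even when sensitivity and specificity are high.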

34

random variable

A variable that acquires unique numerical values determined by random trials.

Random trials have outcomes that are determined by chance and influenced by probability.

35

2 types of random variables

discrete & continuous

36

discrete random variables

-countable and determined by chance (“X”)
-quantified with whole numbers

37

continuous random variable

-Measurable and determined by chance
-Measured on interval or ratio scales

38

P(X=x)

X= big X, a discrete random variable
x= little x, specific values of X

39

binomial variables

a variable with only two possible outcomes,
e.g. a cancer/no cancer prognosis

40

4 Criteria for binomial variables

1) A fixed number of trials exists (n)
2) Each trial is independent
3) The same two possible outcomes per trial
"success" (the outcome of interest)
"failure" (the complementary outcome)
4) The probability of success (p), or failure (1-p) is the same for every trial.

The language of binomial variables
“X is Binomial with n = … and p = …”
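
When the four criteria hold, P(X = x) can be computed with the standard binomial formula, C(n, x) * p^x * (1-p)^(n-x). The deck does not state this formula; the sketch below and the coin-flip values are added for illustration.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) when X is Binomial with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# X is Binomial with n = 10 and p = 1/2 (ten fair coin flips).
n, p = 10, Fraction(1, 2)
p_three_heads = binomial_pmf(3, n, p)
print(p_three_heads)  # 15/128

# The probabilities over all possible values 0..n sum to 1.
total_probability = sum(binomial_pmf(x, n, p) for x in range(n + 1))
```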

41

density curve/ probability density function

-Total area = 1
-Continuous data (Interval or Ratio)
-Probability of value ranges = Area
Area under the curve corresponds to probability

42

Understanding the likelihood of events boils down to:

1) knowing how many standard deviations away from the mean the events are.
2) assigning the events to tail areas.

43

Z score

= how many standard deviations a value is from the population mean

A measure of how many standard deviations a given value, sample mean, or sample proportion is from the mean. Z scores are used only with normal distributions. A Z score marks off the area under the normal curve to its left, which is the probability associated with the range of values less than the one used in the Z score. The complement of this area is the area to the right of the Z score, which is the probability associated with the range of values greater than the one used in the Z score.
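
A minimal sketch using Python's standard-library `statistics.NormalDist`. The population values (mean 100, standard deviation 15) and the helper `z_score` are assumptions for illustration.

```python
from statistics import NormalDist

def z_score(x, mean, std_dev):
    """How many standard deviations x lies from the mean."""
    return (x - mean) / std_dev

z = z_score(130, mean=100, std_dev=15)
print(z)  # 2.0

# Area left of z: probability of a value less than 130.
area_left = NormalDist().cdf(z)
# Complement: probability of a value greater than 130.
area_right = 1 - area_left
print(round(area_left, 4), round(area_right, 4))  # 0.9772 0.0228
```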

44

Likelihood =

Area

45

Rare Event Rule

If under a given assumption, the probability of a particular observed event is very small (

46

Normal as Approximation to Binomial

1) The binomial is often close enough to a normal curve, such that we can use Z-scores.
Specifically, when:


2) Continuity correction: discrete values made “continuous-like”.

3) Calculate Z scores using mean and std dev of binomial.
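
A sketch of steps 2 and 3, using the standard binomial mean n*p and standard deviation sqrt(n*p*(1-p)), which the deck does not spell out. The values n = 100, p = 0.5, and the cutoff 45 are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.5  # hypothetical binomial: 100 fair coin flips
k = 45           # goal: P(X <= 45)

# Step 3 inputs: mean and standard deviation of the binomial.
mean = n * p                     # 50.0
std_dev = sqrt(n * p * (1 - p))  # 5.0

# Step 2, continuity correction: "X <= 45" becomes "below 45.5".
z = (k + 0.5 - mean) / std_dev
approx = NormalDist().cdf(z)

# Exact binomial probability, for comparison.
exact = sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))
print(round(z, 2), round(approx, 3), round(exact, 3))
```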

47

Law of Large numbers

The actual (or true) probability of an event (A) is estimated by the relative frequency with which the event occurs in a long series of trials. As the number of trials increases, the relative frequency approaches the actual probability. Thus, as the number of trials increases, the empirical probability gets closer and closer to the theoretical probability.
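
This can be illustrated with a simulation, assuming a fair die whose theoretical P(6) is 1/6. The function name and the trial counts are arbitrary choices.

```python
import random

def relative_frequency_of_six(n_rolls, seed=0):
    """Relative frequency of a 6 in n_rolls simulated fair-die rolls."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    sixes = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == 6)
    return sixes / n_rolls

theoretical = 1 / 6
estimates = {n: relative_frequency_of_six(n) for n in (10, 1_000, 100_000)}

for n, estimate in estimates.items():
    error = abs(estimate - theoretical)
    print(f"n={n:>7}  estimate={estimate:.4f}  error={error:.4f}")
```

The error usually shrinks as n grows, although any single run can fluctuate.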

48

Random experiment

An experiment that produces an outcome that cannot be predicted in advance (hence the uncertainty).

49

venn diagram

A visual display of events in a sample space, showing complementary, disjoint, and non-disjoint events

50

tests for independence

If any of the 4 tests below are true, the variables are independent. They will either all be true, or they will all be false. When the tests are false, the variables A and B are dependent on each other, and their probabilities are conditional.
1) P(B | A) = P(B)
2) P(A | B) = P(A)
3) P(B | A) = P(B | not A)
4) P(A and B) = P(A) * P(B)
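
A sketch that runs all four tests from P(A), P(B), and P(A and B). It assumes 0 < P(A) < 1 and P(B) > 0, and uses exact fractions so equality checks are reliable; the function name and the example values are hypothetical.

```python
from fractions import Fraction

def independence_tests(p_a, p_b, p_a_and_b):
    """Return the results of the four tests as a list of booleans."""
    p_b_given_a = p_a_and_b / p_a
    p_a_given_b = p_a_and_b / p_b
    p_b_given_not_a = (p_b - p_a_and_b) / (1 - p_a)
    return [
        p_b_given_a == p_b,              # test 1
        p_a_given_b == p_a,              # test 2
        p_b_given_a == p_b_given_not_a,  # test 3
        p_a_and_b == p_a * p_b,          # test 4
    ]

# Independent example: all four tests are true.
independent = independence_tests(
    Fraction(1, 2), Fraction(1, 3), Fraction(1, 6))

# Dependent example: all four tests are false.
dependent = independence_tests(
    Fraction(1, 3), Fraction(1, 6), Fraction(1, 10))

print(independent)  # [True, True, True, True]
print(dependent)    # [False, False, False, False]
```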

51

probability tree

A diagram for showing probabilities for events that occur in stages and that involve conditional probabilities.

52

probability distribution

A distribution of all possible values and probabilities of a random variable. It represents a population of values, not a sample. Probabilities range between 0 and 1 and must sum to 1.

53

probability histogram

A histogram with discrete events on the X-axis and probability on the Y-axis. Each rectangle has a width of 1, and the areas of all rectangles sum to 1.

54

mean of a random variable

Average of events weighted by their probability of occurring

55

variance

Squared standard deviation (σ²).

56

standard deviation of a random variable

The typical (or long-run average) distance between the mean of the random variable and the values it takes.
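
The mean, variance, and standard deviation of a discrete random variable can be computed directly from its probability distribution. The distribution below (number of heads in two fair coin flips) is an illustrative assumption.

```python
from fractions import Fraction
from math import sqrt

# Value -> probability for X = number of heads in two fair coin flips.
distribution = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# A valid probability distribution sums to 1.
assert sum(distribution.values()) == 1

# Mean: each value weighted by its probability.
mean = sum(x * p for x, p in distribution.items())

# Variance: probability-weighted squared distance from the mean.
variance = sum((x - mean) ** 2 * p for x, p in distribution.items())

# Standard deviation: square root of the variance.
std_dev = sqrt(variance)

print(mean, variance, round(std_dev, 4))  # 1 1/2 0.7071
```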

57

probability density curve

A smooth curve showing the relationship between a continuous random variable and probability. The area under the curve = 1. The curve can be subdivided into ranges of values with a probability equal to their area, but the probability of any single exact value is 0, so only ranges of values have meaningful probabilities.

58

Normal curve

The distribution of a continuous random variable that has a single peak and is symmetrical about the center

59

normal table

a table of z scores and probabilities. The table indicates the probability of a normal variable taking on any value less than the standardized z score provided.

60

unstandardizing a z score

When given a probability and asked to find the value associated with this probability, we find the z score corresponding to the probability, and solve the z score formula for x.
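
Solving z = (x - mean) / std_dev for x gives x = mean + z * std_dev. A sketch using `statistics.NormalDist.inv_cdf` in place of a printed normal table; the population values and the 90% target are hypothetical.

```python
from statistics import NormalDist

mean, std_dev = 100, 15  # assumed population mean and standard deviation
probability = 0.90       # area to the left of the unknown value x

# Step 1: find the z score whose left-tail area is the given probability.
z = NormalDist().inv_cdf(probability)

# Step 2: solve the z score formula for x.
x = mean + z * std_dev

print(round(z, 2), round(x, 1))  # 1.28 119.2
```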