Probability and CLT Flashcards
(35 cards)
What is the difference between descriptive and inferential statistics?
Descriptive statistics: A description of some collected data (sample); e.g. the average age of ..
Inferential statistics: What are the properties of a population? The observed data is assumed to be sampled from a larger population. Thus we need population estimates and hypothesis testing (we have uncertainty).
Why does inferential statistics need probabilities?
“If we know what a ‘random’ distribution looks like, we can tell random variation from non-random variation. Specific individual cases are unpredictable, but they follow predictable laws in the aggregate. Once we learn to identify this ‘pattern of chance,’ we can confidently distinguish it from patterns that are not due to random phenomena.”
What does P(A) indicate?
Probability of A: the proportion of elements of some set that satisfy the condition A. Mathematically, probability measures the size of a set relative to the sample space Ω.
e.g. the probability of rolling an even number with a fair six-sided die = 3/6
How would you calculate the probability of selecting a male student in a psychology lecture hall with 400 students, 80 of whom are male?
P(selected student = Male):
P(X = male) = Nmale / Ntot (set space)
= 80/400
= 0.2
What are the two basic rules when calculating with probabilities?
Sum rule and product rule
What two basic concepts are there in probability theory?
Dependent vs independent probabilities and conditional probabilities
Explain the sum rule
P (A or B) = P (A) + P (B)
The probability that one of several events occurs is the sum of the probabilities of the individual events.
This holds only if the events are mutually exclusive (they cannot happen at the same time).
Example (one roll of a die):
P (X = 1 or X = 2) = P (X = 1) + P (X = 2)
= 1/6 + 1/6
= 2/6 = 1/3
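The die example can be checked by listing the sample space in R. This is a minimal sketch that assumes a fair six-sided die; the helper name `prob` is made up for illustration.

```r
# Sample space of a fair six-sided die (assumption: all outcomes equally likely)
omega <- 1:6

# Hypothetical helper: probability = share of outcomes that satisfy the event
prob <- function(event) mean(event)

p_one <- prob(omega == 1)
p_two <- prob(omega == 2)
p_one_or_two <- prob(omega == 1 | omega == 2)

# Sum rule for mutually exclusive events: both sides equal 2/6
all.equal(p_one_or_two, p_one + p_two)
```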
What can you do if the events are not mutually exclusive?
Use a more general sum rule:
P (A or B) = P (A) + P (B) − P (A and B)
P (M or SP) = P (M) + P (SP) − P (M and SP)
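A small sketch of the general sum rule with overlapping events. The two events used here (an even roll, a roll greater than 3) are chosen for illustration and are not from the cards; a fair die is assumed.

```r
omega <- 1:6                   # fair six-sided die (assumption)
a <- omega[omega %% 2 == 0]    # event A: even number (2, 4, 6)
b <- omega[omega > 3]          # event B: greater than 3 (4, 5, 6)

# Hypothetical helper: size of a set relative to the sample space
p <- function(set) length(set) / length(omega)

p_union   <- p(union(a, b))                    # counted directly: {2, 4, 5, 6}
p_general <- p(a) + p(b) - p(intersect(a, b))  # general sum rule
p_naive   <- p(a) + p(b)                       # overcounts the overlap {4, 6}

all.equal(p_union, p_general)
```

The naive sum comes out as 1 because the outcomes 4 and 6 are counted twice; subtracting P (A and B) removes the double count.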
How do we calculate P (A and B)? (3)
The product rule:
P (A and B) = P (A) ∗ P (B) (if A and B are independent)
P (A and B) = P (A) ∗ P (B | A) (general form, needed if A and B are dependent)
P (A and B) = P (B) ∗ P (A | B) (general form, needed if A and B are dependent)
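The product rule can be sketched with two examples that are not from the cards: two flips of a fair coin (independent) and drawing two aces in a row from a standard 52-card deck without replacement (dependent).

```r
# Independent events: two flips of a fair coin (assumed example)
p_heads <- 0.5
p_two_heads <- p_heads * p_heads

# Dependent events: two aces in a row, drawn without replacement (assumed example)
p_first_ace <- 4 / 52
p_second_ace_given_first <- 3 / 51   # one ace and one card are already gone
p_two_aces <- p_first_ace * p_second_ace_given_first

# Cross-check by counting: pairs of aces among all pairs of cards
p_two_aces_counted <- choose(4, 2) / choose(52, 2)
all.equal(p_two_aces, p_two_aces_counted)
```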
When are two events independent?
When the occurrence of one event cannot influence the other's outcome.
Mathematically when are two events independent?
Is P (Y = passedMath) = P (Y = passedMath | X = M)? Or, equivalently, is P (X = M) = P (X = M | Y = passedMath)?
If so, the events X = M and Y = passedMath are independent.
In general, A and B are independent if the probability of A given B is the same as the probability of A: P (A | B) = P (A).
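The independence check can be sketched on a small table of counts. The 80 male students out of 400 come from the earlier card; the pass and fail counts are invented for illustration and were chosen so that the events turn out independent.

```r
# Hypothetical counts: rows are sex, columns are the math result
counts <- matrix(c(40, 40,
                   160, 160),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(sex = c("M", "F"),
                                 math = c("passed", "failed")))
n_total <- sum(counts)

p_pass         <- sum(counts[, "passed"]) / n_total             # P(Y = passedMath)
p_pass_given_m <- counts["M", "passed"] / sum(counts["M", ])    # P(Y = passedMath | X = M)
p_m            <- sum(counts["M", ]) / n_total                  # P(X = M)
p_m_and_pass   <- counts["M", "passed"] / n_total               # P(X = M and Y = passedMath)

# Independent if conditioning on X = M does not change the probability
independent <- isTRUE(all.equal(p_pass, p_pass_given_m))
independent
```

With these made-up counts the product rule for independent events also holds: `p_m * p_pass` equals `p_m_and_pass`.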
In probability theory, what is conditional probability?
A measure of the probability of an event given that another event has occurred (by assumption, presumption, assertion or evidence).
What does a probability distribution show?
It can be thought of as providing the probabilities of occurrence of the different possible outcomes of an experiment.
What kind of mathematical function can we use to describe the expected number of heads and tails when we flip a coin n times?
Binomial distribution (two terms): P(k successes) = (n choose k) * p^k * (1 - p)^(n - k)
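The binomial formula can be written out by hand and compared with R's built-in `dbinom`. This is a sketch: the function name `binom_prob` and the choice of 10 flips of a fair coin are assumptions made for illustration.

```r
# Binomial probability written out from the formula
binom_prob <- function(k, n, p) {
  choose(n, k) * p^k * (1 - p)^(n - k)
}

n <- 10    # number of coin flips (arbitrary choice)
p <- 0.5   # probability of heads, assuming a fair coin
k <- 0:n   # every possible number of heads

manual  <- binom_prob(k, n, p)
builtin <- dbinom(k, size = n, prob = p)

all.equal(manual, builtin)  # the two approaches agree
sum(manual)                 # probabilities over all outcomes add up to 1
```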
What is meant by a Xhosa exam?
An exam in which you are given a word in the African language Xhosa and two possible translations. Ideally you are purely guessing, so the number of correct answers follows a binomial distribution with p = 0.5.
What is the probability of getting exactly 01011 in a Xhosa exam?
0.5 x 0.5 x 0.5 x 0.5 x 0.5 = 0.5^5 = 0.03125
You take a Xhosa exam with 5 questions. The sum score is 3. In how many possible ways can you get that sum score?
What function in R would help with this?
choose(5, 3), which returns 10
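To see where the count comes from, the answer patterns can be listed with `combn`. A minimal sketch; the variable names are chosen for illustration.

```r
n_items <- 5
score <- 3

# Each column holds the positions of the three correctly answered items
patterns <- combn(n_items, score)

ncol(patterns)          # number of distinct answer patterns
choose(n_items, score)  # same count without listing the patterns
```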
How do we calculate the probability of getting a sum score of 3?
You could count all the possible sequences that result in a sum score of 3 and divide by the number of all possible sequences, but there is an easier way using the binomial density function dbinom in R:
dbinom(3, 5, .5)
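Both routes can be compared directly. The sketch below lists every possible answer sequence for 5 items and counts those with a sum score of 3; counting works here only because p = 0.5 makes all sequences equally likely.

```r
# All 2^5 = 32 possible answer sequences (0 = wrong, 1 = right)
sequences <- expand.grid(rep(list(0:1), 5))
sum_scores <- rowSums(sequences)

# Route 1: favourable sequences divided by all sequences
p_count <- mean(sum_scores == 3)

# Route 2: the binomial density function
p_dbinom <- dbinom(3, size = 5, prob = 0.5)

all.equal(p_count, p_dbinom)  # both give 10/32 = 0.3125
```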
How do we calculate the probability of getting a sum score of 3 or less?
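You add the probabilities of getting a sum score of 0 to 3, or equivalently subtract the probabilities of 4 and 5 from 1, e.g. in R:
p <- .5 # probability of a correct guess
1 - (choose(5, 5) * p^5 * (1-p)^0 + choose(5, 4) * p^4 * (1-p)^1) # 1 - (P(5) + P(4))
# or simply pbinom(3, 5, .5)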
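The three routes to P(sum score ≤ 3) can be placed side by side. A short sketch using only base R functions; the variable names are chosen for illustration.

```r
n <- 5
p <- 0.5

direct     <- sum(dbinom(0:3, size = n, prob = p))      # P(0) + P(1) + P(2) + P(3)
complement <- 1 - sum(dbinom(4:5, size = n, prob = p))  # 1 - (P(4) + P(5))
cumulative <- pbinom(3, size = n, prob = p)             # built-in cumulative probability

c(direct, complement, cumulative)  # all three are 26/32 = 0.8125
```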
How can you find in R the lowest sum score at which the cumulative probability reaches 30%?
qbinom(.3, 5, .5) (returns the smallest sum score x for which P(X ≤ x) is at least 0.3)
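What `qbinom` returns can be traced against the cumulative probabilities from `pbinom`. A minimal sketch for the 5-item exam; the variable names are chosen for illustration.

```r
n <- 5
p <- 0.5

# Cumulative probabilities P(X <= x) for x = 0, ..., 5
cdf <- pbinom(0:n, size = n, prob = p)
names(cdf) <- 0:n
cdf

q30 <- qbinom(0.3, size = n, prob = p)

# Same result by hand: first score whose cumulative probability reaches 0.3
# (minus 1 because R positions start at 1 while scores start at 0)
manual <- min(which(cdf >= 0.3)) - 1

c(q30, manual)  # both are 2, since P(X <= 1) = 0.1875 and P(X <= 2) = 0.5
```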
Statistical inference makes propositions about a population, using data drawn from that population with some form of sampling.
How can we calculate how likely our observed data is? (2)
- Sampling distributions
- Central limit theorem
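Both ideas can be illustrated with a small simulation: draw many samples, compute each sample's mean, and look at how those means are distributed. This is a sketch only; the sample size, the number of repetitions and the seed are arbitrary assumptions, and the 10-item guessing exam is borrowed from the other cards.

```r
set.seed(1)        # arbitrary seed so the run is reproducible

n_items     <- 10    # items per exam (assumption)
p           <- 0.5   # probability of a correct guess
sample_size <- 50    # students per sample (assumption)
n_samples   <- 5000  # number of repeated samples (assumption)

# Sampling distribution: the mean sum score of each simulated sample
sample_means <- replicate(
  n_samples,
  mean(rbinom(sample_size, size = n_items, prob = p))
)

mean(sample_means)  # close to n_items * p
sd(sample_means)    # close to sqrt(n_items * p * (1 - p) / sample_size)
```

A histogram of the simulated means (`hist(sample_means)`) should look roughly bell-shaped, which is the pattern the central limit theorem describes.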
How do we quantify the expected variation of our estimation procedure?
Confidence intervals (next cue cards)
The sum scores on 10 Xhosa items have, for the population of 1000 psychology students, a binomial distribution.
What would n, p and the mean sum score μ be?
n = 10 (items), p = .5 (probability of a correct answer), μ = n × p = 5 (mean sum score)
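The mean sum score can be checked in two ways: with the shortcut n × p, and as the probability-weighted average of all possible sum scores. A minimal sketch; the variable names are chosen for illustration.

```r
n <- 10   # number of items
p <- 0.5  # probability of a correct answer
k <- 0:n  # all possible sum scores

mu_formula  <- n * p                                   # shortcut for the binomial mean
mu_weighted <- sum(k * dbinom(k, size = n, prob = p))  # each score weighted by its probability

all.equal(mu_formula, mu_weighted)  # both give 5
```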