Week 1 Flashcards

(70 cards)

1
Q

Probability

A

A branch of mathematics concerned with the analysis of random phenomena.

2
Q

Random phenomena

A

processes with an uncertain outcome.
(e.g., flipping a coin; gambling games)

3
Q

Inferential statistics and probability are related because…

A

sampling a group of people from the population is a random phenomenon.

4
Q

In probability, we know the true model/mechanism in the population. Based on the true model, we compute…

A

the probability of different outcomes.

e.g., If I flip a fair coin 10 times, how likely is it that I will get 5 heads?
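The coin question above can be answered with a short R sketch. It assumes the flips are independent, so that the number of heads follows a binomial distribution; the variable name p_five_heads is only illustrative.

```r
# Probability of exactly 5 heads in 10 flips of a fair coin.
# dbinom() is the binomial PMF: x = number of heads, size = number of flips.
p_five_heads <- dbinom(x = 5, size = 10, prob = 0.5)
print(p_five_heads)  # about 0.246
```

Here the model (a fair coin) is known, and the probability of the outcome is computed from it.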

5
Q

In inferential statistics, we do NOT know…

A

the true model/mechanism in the population. We infer the true model based on the outcomes from our sample data.

e.g., If my friend flips a coin 10 times and gets 10 heads, are they playing a trick on me? In other words, is the coin a fair coin?
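One informal way to reason about the friend's coin is to ask how likely 10 heads in 10 flips would be if the coin were fair. This is a minimal sketch assuming independent flips, not a formal hypothesis test; p_ten_heads is an illustrative name.

```r
# If the coin were fair, each flip lands heads with probability 0.5,
# so 10 heads in a row has probability 0.5^10.
p_ten_heads <- 0.5^10
print(p_ten_heads)  # about 0.001, i.e. 1 in 1024
```

Because this outcome would be so rare under a fair coin, the sample data cast doubt on the "fair coin" model.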

6
Q

In probability, the term experiment is used in a loose sense to mean…

A

a procedure for which the outcome is uncertain.

Examples of experiments include:
§ an experimental study
§ toss of a coin

7
Q

sample space of the experiment

A

The set of all possible outcomes of an experiment.

It is denoted by S.

§ Best to think of the sample space as an area.

8
Q

random event or an event

A

A subset of the sample space

If the experiment consists of flipping two coins, then S = {HH, HT, TH, TT}, and an event can be getting a head on the first coin: E = {HH, HT}.

9
Q

Probability measure

A

A function that maps the random events in the sample space onto the real numbers between 0 and 1.

The function “measures” the area of the event out of the whole sample space.
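The "area" idea can be sketched in R for the two-coin experiment. The sketch assumes all four outcomes are equally likely (fair coins), which the card does not state; the names S, E, and p_E are illustrative.

```r
# Sample space for flipping two coins: every combination of H and T.
S <- expand.grid(first = c("H", "T"), second = c("H", "T"),
                 stringsAsFactors = FALSE)

# Event E: a head on the first coin, i.e. the subset {HH, HT}.
E <- S[S$first == "H", ]

# With equally likely outcomes, P(E) is the share of the sample space in E.
p_E <- nrow(E) / nrow(S)
print(p_E)  # 0.5
```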

10
Q

Probability of an event E is denoted as

A

P(E)

11
Q

Frequentist and Bayesian perspectives have different conceptualizations of…

A

the probability measure.

They hold different views on how we should map the events in the sample space onto the real numbers between 0 and 1.

12
Q

N(E) represents the…

A

number of times in the first N repetitions of the experiment that the event E occurs.

13
Q

In the frequentist perspective, what is the probability of an event?

A

The probability of the event is the limiting proportion of times the event E occurs as we perform the same experiment infinitely many times (i.e., as N approaches infinity): P(E) = lim N(E)/N.

Probability is the long-run relative frequency of the event's occurrence, hence the name frequentist perspective.
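The long-run idea can be illustrated with a simulation. This is a sketch under assumptions: a fair coin, E = "the flip is a head", a finite N standing in for "infinitely many" repetitions, and an arbitrary seed.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
N <- 100000
flips <- rbinom(N, size = 1, prob = 0.5)     # 1 = head, 0 = tail

# N(E) / N after each flip: running proportion of heads so far.
running_prop <- cumsum(flips) / seq_len(N)

# The proportion settles near 0.5 as the number of repetitions grows.
print(running_prop[c(10, 100, 1000, N)])
```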

14
Q

In the Bayesian perspective, what is the probability of an event?

A

The probability represents your degree of subjective belief about the occurrence of an event.

15
Q

frequentist definition

A

long-run relative frequency

16
Q

Bayesian definition

A

degree of belief

17
Q

Properties of the frequentist perspective

A

Objective/Unambiguous

Can’t assign probability to events that are not replicable

18
Q

Properties of the Bayesian perspective

A

Subjective/Ambiguous

Can assign probability to any event

19
Q

What is a random variable?

A

A random variable is a function that maps random events in the sample space of an experiment onto the real number line.

Through a random variable, we can use numbers to quantify or represent the occurrence of an event.
- usually denoted by a capital letter (e.g., X or Y)
- different from an algebraic variable (e.g., the a in a + 5), which stands for any unspecified number.

20
Q

An indicator (or Bernoulli) random variable (X) maps…

A

the occurrence of the event to 1.
the non-occurrence of the event to 0

21
Q

How to denote a Bernoulli random variable:

For example, let X indicate whether we get a head after a coin flip.

A

X(H) = 1
X(T) = 0
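Since a random variable is a function from outcomes to numbers, the indicator for "head" can be written literally as an R function. This is only an illustrative sketch; the coding of outcomes as the strings "H" and "T" is an assumption.

```r
# Indicator (Bernoulli) random variable: maps the outcome "H" to 1
# and any other outcome (here "T") to 0.
X <- function(outcome) ifelse(outcome == "H", 1, 0)

print(X("H"))                 # 1
print(X("T"))                 # 0
print(X(c("H", "T", "H")))    # 1 0 1
```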

22
Q

Discrete random variables

A

can only take on specific values, usually whole numbers

e.g., indicator random variable (X = 0, 1); binomial random variable (X = 0, 1, 2, 3, ...)

countable number of values.

23
Q

Continuous random variables

A

can take on any value in an interval

e.g., normal random variable: can take on any value on the real number line from negative to positive infinity (e.g., X = 0.00001)

uncountable number of values.

24
Q

What does the probability measure of the random variable map?

A

For a random variable, the probability measure maps the values of the random variable onto a value between 0 and 1, which measures the likelihood of the values of the random variable.

25
Probability distribution
Each random variable has a probability distribution. § Discrete: probability mass function (PMF), which tells us the probability associated with each possible value of the random variable. § Continuous: probability density function (PDF).
26
Probability mass function of random variable
In the example of X being the indicator random variable representing getting a head after a fair coin flip, the PMF of X is P(X = 0) = 0.5 and P(X = 1) = 0.5.
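The same PMF can be reproduced in R. The sketch below relies on the fact that a Bernoulli random variable is a binomial random variable with a single trial, so dbinom() with size = 1 gives its PMF; the name pmf is illustrative.

```r
# PMF of the indicator of a head on one fair coin flip.
# A Bernoulli variable is a binomial variable with size = 1.
pmf <- dbinom(c(0, 1), size = 1, prob = 0.5)
names(pmf) <- c("P(X=0)", "P(X=1)")
print(pmf)       # 0.5 and 0.5
print(sum(pmf))  # a PMF sums to 1 over all possible values
```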
27
Bernoulli Distribution
If a random variable is a Bernoulli random variable, we can say that the random variable follows the Bernoulli distribution. By Bernoulli distribution, we mean the probability distribution associated with the Bernoulli random variable.
28
If X is a Bernoulli random variable where P(X=1) = p then we can write
X ~ Ber(p), where the symbol “~” stands for “follows” and Ber stands for the Bernoulli distribution.
29
For brand-named random variables, their distributions are characterized by a small number of parameters. Explain "parameter" in this context.
For X ~ Ber(p), p is the parameter that fully describes the Bernoulli distribution. Parameters are considered non-random, fixed quantities. This usage of the term “parameter” is slightly different from, but related to, its use to mean quantities computed with population data.
30
Normal distribution
A normal random variable (a.k.a., Gaussian random variable) is a continuous random variable that follows the famous “bell curve” distribution.
31
The “bell curve” distribution is called the __________________________________________________________ of the normal random variable
The “bell curve” distribution is called the probability density function (PDF) of the normal random variable.
32
The normal random variable is characterized by two parameters:
1. expected value µ; 2. variance σ^2
33
How to denote X as a normal random variable:
X ~ N(µ, σ^2)
34
Standard Normal Random Variable and how to denote it
When the normal random variable has a mean of 0 and a variance of 1, it is called the standard normal random variable, usually denoted as Z ~ N(0, 1).
35
We can transform any normal random variable to the standard normal random variable. If X ~ N(µ, σ^2), how do you transform X to follow the standard normal distribution?
Z = (X - µ)/σ
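A quick numerical check of the standardization, as a sketch: the values µ = 100, σ^2 = 400, and x = 130 are assumptions chosen for illustration (they match the N(100, 400) population used later in the deck).

```r
mu <- 100
sigma <- sqrt(400)   # standard deviation = square root of the variance
x <- 130

# Standardize: subtract the mean, then divide by the standard deviation.
z <- (x - mu) / sigma
print(z)  # 1.5

# The probability below x under N(mu, sigma^2) equals the probability
# below z under the standard normal N(0, 1).
p_x <- pnorm(x, mean = mu, sd = sigma)
p_z <- pnorm(z)
print(all.equal(p_x, p_z))  # TRUE
```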
36
What is the probability of a continuous random variable taking on any specific value?
The probability of a continuous random variable taking on any specific value is always zero, so we cannot meaningfully talk about the probability of a single value. For a continuous random variable, we can only talk about the probability of the random variable taking on a range of possible values.
37
Cumulative distribution function (CDF)
Tells us the probability of a random variable taking on a value that is equal to or less than a cutoff point a, written P(X ≤ a). For a continuous random variable, P(X ≤ a) = P(X < a), and it is the area under the PDF curve below a.
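The "area under the curve" reading of the CDF can be checked numerically. This sketch assumes X ~ N(100, 400) and the cutoffs 80 and 120 purely for illustration; it compares a difference of two CDF values with a numerical integral of the PDF.

```r
mu <- 100
sigma <- 20

# P(X <= 120): area under the PDF below 120.
p_below_120 <- pnorm(120, mean = mu, sd = sigma)

# P(80 < X <= 120): difference of two CDF values.
p_between <- pnorm(120, mean = mu, sd = sigma) - pnorm(80, mean = mu, sd = sigma)

# The same range probability as an area: integrate the PDF from 80 to 120.
area <- integrate(dnorm, lower = 80, upper = 120, mean = mu, sd = sigma)$value

print(c(cdf_difference = p_between, integrated_area = area))  # both about 0.683
```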
38
68–95–99.7 Rule
The 68–95–99.7 rule is a shorthand for remembering the percentage of values in a normal distribution that lie within 1, 2, and 3 standard deviations of the mean: about 68%, 95%, and 99.7%, respectively.
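The three percentages can be recovered from the standard normal CDF. A minimal sketch, with illustrative variable names:

```r
# Probability of falling within k standard deviations of the mean,
# for k = 1, 2, 3, using the standard normal CDF.
k <- 1:3
within_k_sd <- pnorm(k) - pnorm(-k)
print(round(within_k_sd, 3))  # 0.683 0.954 0.997
```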
39
There are four R functions for the normal distribution:
dnorm(), pnorm(), qnorm(), rnorm()
40
dnorm()
The dnorm() function computes the PDF of the normal distribution. It outputs the probability density of a normal random variable at a specific value. It is not commonly used because, for continuous random variables, the probability of a range of values (i.e., the area under the PDF) is more important.
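A small sketch of dnorm(), with values chosen only for illustration. It also shows why a density is not a probability: for a narrow distribution the density can exceed 1.

```r
# Height of the standard normal PDF at 0: 1 / sqrt(2 * pi).
d_at_zero <- dnorm(0, mean = 0, sd = 1)
print(d_at_zero)  # about 0.399

# A density is not a probability: with a small sd it can be larger than 1.
d_narrow <- dnorm(0, mean = 0, sd = 0.1)
print(d_narrow)   # about 3.99
```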
41
pnorm()
The pnorm() function computes the CDF of the normal distribution. It outputs the probability of a normal random variable taking on values below the quantile value.
Need to input:
§ q: the quantile value at which you want to compute the probability.
§ mean: value for the parameter µ.
§ sd: value for the parameter σ.
Other input:
§ lower.tail: logical; whether you want the lower tail (TRUE) or the upper tail (FALSE) probability. By default, lower.tail = TRUE.
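A sketch of pnorm() with both tail settings, assuming X ~ N(100, 400) and the cutoff 120 for illustration:

```r
# P(X <= 120) for X ~ N(100, 400); sd is the square root of the variance.
p_lower <- pnorm(q = 120, mean = 100, sd = sqrt(400))

# P(X > 120): same call with lower.tail = FALSE.
p_upper <- pnorm(q = 120, mean = 100, sd = sqrt(400), lower.tail = FALSE)

print(c(lower = p_lower, upper = p_upper))  # about 0.841 and 0.159
```

The two tail probabilities add up to 1.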
42
qnorm()
The qnorm() function computes the quantile value given a probability below the quantile value. It outputs the quantile value.
Need to input:
§ p: the probability below the quantile value.
§ mean: value for the parameter µ.
§ sd: value for the parameter σ.
Other input:
§ lower.tail: logical; whether the probability you specified for p is the lower tail (TRUE) or the upper tail (FALSE) probability. By default, lower.tail = TRUE.
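A sketch showing that qnorm() is the inverse of pnorm(), using the standard normal and the probability 0.975 as illustrative choices:

```r
# Quantile with 97.5% of the standard normal distribution below it.
q_lower <- qnorm(p = 0.975, mean = 0, sd = 1)
print(q_lower)  # about 1.96

# The same cutoff, specified through the upper tail probability 0.025.
q_upper <- qnorm(p = 0.025, mean = 0, sd = 1, lower.tail = FALSE)

# pnorm() undoes qnorm(): feeding the quantile back returns the probability.
print(pnorm(q_lower))  # 0.975
```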
43
In the R functions what do you need to remember about the variance?
Note: Remember to take the square root of the variance to get the standard deviation for the argument sd.
44
rnorm()
Generates/simulates random numbers from the normal distribution. Suppose our population data follow a normal distribution N(100, 400), and we want to simulate randomly sampling 10 values from the population. Then we can use rnorm(n = 10, mean = 100, sd = sqrt(400)).
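The call above can be run as follows. The seed and the second, larger sample are additions for illustration: the seed makes the draws reproducible, and the large sample shows that the simulated values match the stated mean and standard deviation.

```r
set.seed(42)  # arbitrary seed so the simulated values are reproducible

# Ten simulated values from the N(100, 400) population.
draws <- rnorm(n = 10, mean = 100, sd = sqrt(400))
print(round(draws, 1))

# With many draws, the sample mean and sd approach 100 and 20.
big <- rnorm(n = 100000, mean = 100, sd = sqrt(400))
print(c(mean = mean(big), sd = sd(big)))
```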
45
Binomial distribution: Notation? What kind of random variable?
X ~ Bin(N, p); discrete random variable
46
Chi-square distribution: Notation? What kind of random variable?
X ~ χ^2(df); continuous random variable
47
t distribution: Notation? What kind of random variable?
X ~ t(df); continuous random variable
48
Characteristics of random variables
§ Associated with random events. § Have a probability distribution. § Can take on more than one possible value. § Denoted using capital letters (e.g., X, Y).
49
Constants or Fixed Values
§ Associated with non-random events. § Do not have a probability distribution. § Can only take on one possible value. § Denoted using small letters (e.g., a, x).
50
What does a random variable quantify?
a random procedure’s different outcomes.
51
Once you see the random procedure’s outcome, the observed value is called...
the realized value of a random variable. The realized value of a random variable is treated as a constant.
52
Empirical probability distribution
We can also realize a random variable multiple times and then graph the empirical probability distribution. For example, we can realize the random variable 10 times by flipping a fair coin 10 times. The empirical probability distribution is an estimate of the theoretical probability distribution.
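The ten coin flips can be simulated to compare the empirical and theoretical distributions. This is a sketch with an arbitrary seed; with only 10 realizations the empirical proportions will usually differ somewhat from 0.5.

```r
set.seed(7)  # arbitrary seed, only for reproducibility

# Realize the indicator random variable 10 times (1 = head, 0 = tail).
flips <- rbinom(10, size = 1, prob = 0.5)

# Empirical distribution: observed proportion of each value.
counts <- table(factor(flips, levels = c(0, 1)))
empirical <- as.numeric(counts) / length(flips)
names(empirical) <- c("0", "1")

# Theoretical distribution for a fair coin.
theoretical <- c("0" = 0.5, "1" = 0.5)

print(rbind(empirical, theoretical))
```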
53
Usually, the population data of a variable are assumed to follow....
the normal distribution
54
Sample statistics (e.g., the sample mean) across repeated studies are _____________________ __________
random variables; they have a probability distribution
55
Population parameters (e.g., the population mean) are __________________
constants; they do not have a probability distribution
56
The sample data are random across repeated sampling; therefore, sample statistics are also ___________
random
57
Population parameters are considered _____________________________ in the Frequentist perspective.
constants (or fixed values)
58
Do population parameters have any probability distributions associated?
No, because they are constants.
59
Parameters of a random variable:
numerical quantities that fully describe a distribution, e.g., µ and σ^2 in X ~ N(µ, σ^2)
60
Population parameters:
numerical quantities characterizing the population data
61
From the Bayesian perspective, population parameters are considered...
random variables because we are uncertain about their values. In Bayesian statistics, you can specify a probability distribution for each parameter, called the prior distribution.
62
CLT roughly implies what?
that when we add or average a large number of random variables, the sum or the mean of the random variables is itself a random variable that approximately follows a normal distribution. In other words, when you add or average many random events together and use a random variable to quantify the result, the probability distribution of that random variable is approximately normal.
63
CLT formula
At a large n, Xbar approximately follows a normal distribution N(µ_xbar = µ, σ^2_xbar = σ^2/n)
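The formula can be checked by simulation. This sketch makes several assumptions not in the card: the population is exponential with rate 1 (so µ = 1 and σ^2 = 1, and the population is clearly not normal), n = 50, and 10,000 repeated samples stand in for "infinitely many".

```r
set.seed(123)  # arbitrary seed, only for reproducibility
n <- 50        # sample size in each repeated study
reps <- 10000  # number of repeated samples

# Draw `reps` samples of size n and keep each sample mean.
sample_means <- replicate(reps, mean(rexp(n, rate = 1)))

# CLT prediction: mean of Xbar = mu = 1, variance of Xbar = sigma^2 / n = 0.02.
print(c(mean = mean(sample_means), var = var(sample_means)))
```

Plotting hist(sample_means) would show a roughly bell-shaped histogram even though the exponential population is skewed.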
64
µ_xbar
the mean of the sampling distribution of the sample mean Xbar
65
o
66
σ_xbar
the standard deviation of the sampling distribution of the sample mean Xbar, also called the standard error of the mean (SEM)
67
In essence, the CLT roughly implies
that when we add or average a large number of random variables, each with finite µ and σ^2, the sum or the mean of the random variables approximately follows a normal distribution. This implies that when you add different random events together and map them onto a number line, the result approximately follows the normal distribution.
68
One of the most common applications of the CLT is regarding
the sampling distribution of the sample mean.
69
What is the sampling distribution of the sample mean?
The sampling distribution of the sample mean is the distribution of the sample mean over repeated samples. § “Over repeated samples” means “conducting the same experiment (with a fixed sample size n) infinitely many times.” § It is related to the frequentist perspective.
70
According to the CLT, at a large n the sampling distribution of the sample mean is approximately a ______________ distribution
normal distribution.