Hypothesis Testing Flashcards

1
Q

Checking if a random variable is a normal random variable

A

Use a “normal probability plot” (also called a normal Q-Q plot).
The sorted data are plotted against the quantiles of a theoretical normal distribution, so that normal data form an approximately straight line.
Axes: ordered data value vs. theoretical z-score.
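A minimal sketch of how the points of such a plot could be computed, using only the standard library. The plotting position (i - 0.5) / n and the helper names are assumptions for illustration; the correlation of the points is used here as a rough numeric stand-in for "looks like a straight line".

```python
import random
from statistics import NormalDist, mean

def normal_plot_points(data):
    """Return (theoretical z-score, ordered value) pairs for a normal probability plot."""
    n = len(data)
    ordered = sorted(data)
    # (i - 0.5) / n is one common plotting-position convention (an assumption here).
    z = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(z, ordered))

def straightness(points):
    """Pearson correlation of the plotted points; close to 1 means nearly straight."""
    xs, ys = zip(*points)
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)
normal_data = [rng.gauss(10, 2) for _ in range(500)]      # should look straight
skewed_data = [rng.expovariate(1.0) for _ in range(500)]  # should bend

r_normal = straightness(normal_plot_points(normal_data))
r_skewed = straightness(normal_plot_points(skewed_data))
print(r_normal, r_skewed)
```

The skewed sample should give a visibly lower correlation than the normal one, which is the numeric counterpart of the curve bending away from the line.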

2
Q

Whose performance was more impressive (assuming a normal distribution)?

A

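You need the value, the mean and the standard deviation for each performance; then you can compute the z-score, z = (x − μ) / σ. This shows how many standard deviations away from the mean the value is. The further from the mean in the favourable direction, the more impressive :D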
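As a small worked example, assuming two made-up performances measured on different scales (all numbers below are hypothetical):

```python
def z_score(value, mean, std):
    """Number of standard deviations that value lies above the mean."""
    return (value - mean) / std

# Hypothetical data: person A scored 85 on a test with mean 70 and std 10,
# person B scored 620 on a test with mean 500 and std 100.
z_a = z_score(85, 70, 10)
z_b = z_score(620, 500, 100)

more_impressive = "A" if z_a > z_b else "B"
print(z_a, z_b, more_impressive)
```

Here A is 1.5 standard deviations above the mean and B only 1.2, so A's result is the more impressive one even though the raw score is smaller.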

3
Q

PMF and CDF

A

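Can be used for hypothesis testing: was this sample taken from that distribution?
Examples: binomial distribution (PMF) and normal distribution (CDF).
Assume as the null hypothesis that the random variable follows that distribution, observe it n times, and
use the PMF/CDF to compute how likely an outcome at least as extreme as the one we got would be (the p-value).
If p < 0.05, the result is significant: we reject the hypothesis that the data were taken from that distribution.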
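A minimal sketch of this idea for the binomial case, assuming a hypothetical experiment of 100 coin flips with 60 heads and a one-sided test of "the coin is fair":

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_upper_tail(k, n, p):
    """P(X >= k): probability of an outcome at least as extreme as k."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

n, heads = 100, 60                          # hypothetical observation
p_value = binom_upper_tail(heads, n, 0.5)   # null hypothesis: fair coin
significant = p_value < 0.05
print(p_value, significant)
```

The p-value comes out a little under 0.03, so at the 0.05 level the fair-coin hypothesis would be rejected in this one-sided test.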

4
Q

Bernoulli & binomial

A

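Bernoulli: a single trial with outcome 0 or 1. Binomial: the number of 1s when the trial is repeated n times independently. Both are discrete.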

CLT tells us that if 𝑛 is large, binomial random variables will be distributed approximately normally.
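To see how close the normal approximation gets, here is a small sketch comparing an exact binomial CDF value with the normal one. The parameters n = 100, p = 0.5 and the continuity correction of 0.5 are choices made for this illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.5   # hypothetical binomial parameters
k = 55

# Exact P(X <= k) from the binomial PMF.
exact = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

# Normal approximation with mean n*p and standard deviation sqrt(n*p*(1-p)),
# evaluated at k + 0.5 (continuity correction).
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p))).cdf(k + 0.5)

print(exact, approx)
```

Both values are close to 0.864, which is why the normal distribution is often used in place of the binomial when n is large.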

5
Q

Central Limit Theorem

A

CLT
The central limit theorem states that if you have a population with mean μ and standard deviation σ and take sufficiently large random samples of size n from the population with replacement, then the sample means will be approximately normally distributed, with mean μ and standard deviation σ/√n. A test based on it is therefore only valid for big sample sizes; for small ones use Student’s t-test.
In practice: assume the sample mean is normal, then compute a z-score from the sample data, or use the normal CDF, with the null hypothesis being the average of interest.
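A minimal sketch of such a z-test, assuming the population standard deviation is known and using hypothetical numbers (sample mean 52 from n = 100 observations, null-hypothesis mean 50, σ = 10):

```python
from math import sqrt
from statistics import NormalDist

def z_test(sample_mean, mu0, sigma, n):
    """Two-sided one-sample z-test; returns (z, p_value)."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))   # standard error is sigma / sqrt(n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # both tails of the standard normal
    return z, p_value

z, p_value = z_test(sample_mean=52.0, mu0=50.0, sigma=10.0, n=100)
print(z, p_value)
```

With these numbers z is 2.0 and the two-sided p-value is about 0.046, just below the usual 0.05 threshold.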

6
Q

Student’s t-test

A

When n is small, the Central Limit Theorem can no longer be used. In this case, if the samples are drawn from an approximately normal distribution, then the correct distribution to use is the Student’s t distribution with n − 1 degrees of freedom.
It looks like a normal distribution with heavier tails.
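A small sketch of a one-sample t-test on a made-up sample of 10 values. The standard library has no t distribution, so instead of a p-value this compares the statistic with the tabulated two-sided 5% critical value for 9 degrees of freedom (2.262); the data and the null mean of 5.0 are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.5, 5.9, 5.3, 5.4, 5.7]  # hypothetical data
mu0 = 5.0                                                    # null-hypothesis mean

n = len(sample)
# stdev() is the sample standard deviation (divides by n - 1).
t_stat = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))

T_CRIT_DF9 = 2.262  # table value: two-sided, alpha = 0.05, df = n - 1 = 9
reject_null = abs(t_stat) > T_CRIT_DF9
print(t_stat, reject_null)
```

The statistic is about 4.34, well beyond the critical value, so the null hypothesis would be rejected here. For the normal distribution the corresponding cutoff would be 1.96; the larger t value reflects the heavier tails.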

7
Q

error type

A

+ A type I error is the incorrect rejection of a true null hypothesis (a “false positive”).
+ A type II error is the failure to reject a false null hypothesis (a “false negative”).

8
Q

data dredging / p-hacking

A

The misuse of data analysis to find patterns in data that can be presented as statistically significant. It dramatically increases the risk of false positives (incorrect rejection of the null hypothesis: a false significant difference) while understating that risk in what is reported. This is done by performing many statistical tests on the data and only reporting those that come back with significant results.
To guard against this: check the finding on held-out data (e.g. cross-validation), or correct the significance threshold for the number of tests performed.
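To make the inflation concrete, here is a short sketch of the chance of at least one false positive when many independent tests are run at level 0.05. The choice of 20 tests is hypothetical, and the Bonferroni correction shown is one standard remedy that is not named in the card itself.

```python
alpha = 0.05   # per-test significance level
m = 20         # hypothetical number of independent tests performed

# Probability that at least one of m true null hypotheses is rejected.
family_wise_error = 1 - (1 - alpha) ** m

# Bonferroni correction: test each hypothesis at alpha / m instead.
bonferroni_alpha = alpha / m
family_wise_error_corrected = 1 - (1 - bonferroni_alpha) ** m

print(family_wise_error, family_wise_error_corrected)
```

With 20 uncorrected tests the chance of at least one spurious "significant" result is roughly 64%; with the corrected threshold it stays just under 5%.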

9
Q

inference

A

Drawing a conclusion about a population from a sample. The sample won’t be a perfect representation of the population, so the conclusion always carries some uncertainty.

10
Q

confidence interval

A

An interval around a parameter estimated from a sample of the full population.
For the mean: take the sample mean plus or minus the critical z-score (1.96 for 95%) times the standard error, s/√n.
The 95% confidence interval does not mean that with probability 95%, the true value of 𝜇 lies within the interval.
A 95% confidence interval means that if we were to repeat the same experiment many times, and compute the confidence interval using the same formula, 95% of the time it would contain the true value of 𝜇.
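The repeated-experiment reading can be checked with a small simulation. This is a sketch under stated assumptions: a hypothetical normal population (μ = 10, σ = 3), samples of size 50, and the large-sample z-based interval rather than a t-based one.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Large-sample (z-based) confidence interval for the mean."""
    z_star = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z_star * stdev(sample) / sqrt(len(sample))
    centre = mean(sample)
    return centre - margin, centre + margin

rng = random.Random(1)
TRUE_MU, TRUE_SIGMA = 10.0, 3.0   # hypothetical population parameters
trials, hits = 1000, 0
for _ in range(trials):
    sample = [rng.gauss(TRUE_MU, TRUE_SIGMA) for _ in range(50)]
    low, high = mean_confidence_interval(sample)
    hits += low <= TRUE_MU <= high   # did this interval contain the true mean?

coverage = hits / trials
print(coverage)
```

The printed coverage should land near 0.95: each individual interval either contains μ or it does not, but the procedure captures μ in about 95% of repetitions.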

11
Q

two-sample 𝑡-test

A

2 independent samples; we compare their means.

Use a two-proportion z-test instead if dealing with categorical (yes/no) variables.
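A minimal sketch of the statistic for comparing two means. It uses Welch's version, which does not assume equal variances; that choice, the helper name and the data are assumptions for illustration. Turning the statistic into a p-value needs the t distribution's CDF, which the standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's two-sample t statistic and its approximate degrees of freedom."""
    va = variance(a) / len(a)   # squared standard error of mean(a)
    vb = variance(b) / len(b)   # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
    return t, df

group_a = [12.1, 11.8, 12.5, 12.9, 12.2, 12.6]   # hypothetical measurements
group_b = [11.2, 11.5, 10.9, 11.7, 11.1, 11.4]

t_stat, df = welch_t(group_a, group_b)
print(t_stat, df)
```

A statistic above 5 with roughly 9 degrees of freedom is far out in the tail of the t distribution, so these two made-up groups would be judged to have different means.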

12
Q

A/B testing

A

Use 2 versions and show each to a randomly chosen group of people. Then test whether there is a statistically significant difference between the 2.
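A sketch of how such a comparison could be run when the outcome is a conversion (yes/no), using a pooled two-proportion z-test. The counts are hypothetical, and the z-test itself is one common choice rather than something the card prescribes.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference between two proportions."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)   # rate under the null
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical experiment: version A converts 100 of 1000 visitors,
# version B converts 130 of 1000 visitors.
z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
print(z, p_value)
```

Here z is about 2.1 and the p-value about 0.036, so at the 0.05 level version B's higher conversion rate would count as statistically significant.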

13
Q

Chi square

A

Tests whether two categorical variables are independent.

Null hypothesis: the variables are independent.
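A minimal sketch of the test on a hypothetical 2x2 table of counts, without a continuity correction. For a 2x2 table there is 1 degree of freedom, and a chi-square variable with 1 degree of freedom is the square of a standard normal, which lets the p-value be computed with the standard library alone.

```python
from math import sqrt
from statistics import NormalDist

def chi_square_statistic(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

table = [[30, 20],   # hypothetical counts, e.g. group 1: yes / no
         [20, 30]]   # group 2: yes / no
chi2 = chi_square_statistic(table)

# Valid for 1 degree of freedom only (2x2 table).
p_value = 2 * (1 - NormalDist().cdf(sqrt(chi2)))
print(chi2, p_value)
```

Every expected count is 25, the statistic is 4.0 and the p-value is about 0.046, so independence would be rejected at the 0.05 level.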

14
Q

ANOVA

A

Compares the means of 3 or more groups. Null hypothesis: all group means are equal.
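A sketch of the one-way ANOVA F statistic for three made-up groups. The data are hypothetical, and the decision uses the tabulated 5% critical value for (2, 12) degrees of freedom, about 3.89, since the standard library has no F distribution.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

groups = [            # hypothetical measurements for three groups
    [4, 5, 6, 5, 5],
    [6, 7, 8, 7, 7],
    [9, 8, 10, 9, 9],
]
f_stat = one_way_anova_f(groups)

F_CRIT_2_12 = 3.89   # table value: alpha = 0.05, df = (2, 12)
reject_null = f_stat > F_CRIT_2_12
print(f_stat, reject_null)
```

The group means are 5, 7 and 9 with very little spread inside each group, so F comes out at 40 and the hypothesis of equal means is rejected.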
