Statistical distributions + Hypothesis testing Flashcards

(27 cards)

1
Q

What is a random variable

A

A variable whose value depends on the outcome of a RANDOM event - the value isn’t known until the experiment is carried out.
A random variable can take on a range of specific values

2
Q

What is a sample space

A

The set of all possible values a random variable can take

3
Q

What makes a variable discrete

A

It can only take specific, separate numerical values (e.g. whole numbers) rather than every value in an interval

4
Q

What does a probability distribution do

A

Fully describes the probability of any outcome in the sample space

5
Q

What is a discrete uniform distribution

A

A distribution in which every value in the sample space of a discrete random variable has the same probability

6
Q

Probability mass function definition

A

A function that gives the probability that a discrete random variable is exactly equal to some value. Sometimes it is also known as the discrete probability density function/frequency function

7
Q

Sum of probabilities for a random variable X

A

ΣP(X = x) = 1, where the sum is taken over all values little x that X can take

8
Q

When can you model a random variable X with a binomial distribution B(n,p), where X is the number of successful trials

A
  • There are a fixed number of trials n
  • There are two possible outcomes (success and failure)
  • There is a fixed probability of success (p)
  • TRIALS ARE INDEPENDENT OF EACH OTHER (always specify this in assumptions)

P(X=r) = nCr(p^r)(1-p)^(n-r)

where n is the index and p is the parameter
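As a quick check of the formula above, here is a minimal Python sketch of the binomial probability mass function. The function name `binomial_pmf` and the example values n = 10, p = 0.4 are illustrative choices, not part of the card.

```python
from math import comb


def binomial_pmf(r: int, n: int, p: float) -> float:
    """Return P(X = r) for X ~ B(n, p) using nCr * p^r * (1-p)^(n-r)."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


# Illustrative values only (assumed for this example): X ~ B(10, 0.4)
probabilities = [binomial_pmf(r, 10, 0.4) for r in range(11)]

# The probabilities over the whole sample space 0..n should sum to 1
print(sum(probabilities))
```

Summing the list confirms that the probabilities across the sample space add up to 1, up to floating-point rounding.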

9
Q

What does a cumulative probability function for a random variable X tell us

A

The sum of the probabilities of all values up to and including the given value of x, i.e. P(X≤x). For the binomial distribution these are tabulated for various values of n and p
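A cumulative probability can be sketched in Python as a running sum of the binomial mass function. This assumes X ~ B(n, p); the name `binomial_cdf` is a hypothetical helper, not something from the card.

```python
from math import comb


def binomial_cdf(x: int, n: int, p: float) -> float:
    """Return P(X <= x) for X ~ B(n, p): add up P(X = r) for r = 0..x inclusive."""
    return sum(comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(x + 1))


# Illustrative values only: P(X <= 3) for X ~ B(10, 0.4)
print(binomial_cdf(3, 10, 0.4))
```

An upper-tail probability follows from the complement: P(X ≥ r) = 1 - P(X ≤ r - 1).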

10
Q

How to carry out a hypothesis test

A
  • Assume the null hypothesis H₀ is true
  • Consider how likely it is that a value at least as extreme as the observed value of the test statistic would occur
  • If this probability is less than a given threshold (the significance level), reject the null hypothesis
11
Q

What is a hypothesis (elaborate on key words used in definition)

A

A statement made about a population parameter (a number that describes something about an entire group/population)

12
Q

How to test a hypothesis

A

By taking a sample from the population or carrying out an experiment

13
Q

What is the test statistic

A

The result of the experiment carried out on/statistic calculated from the sample

Note that the test statistic is a random variable with a DISTRIBUTION, e.g. X~B(10, p), where p isn’t known until we start making assumptions

For a hypothesis test involving the binomial distribution, the test statistic is always the number of successes

14
Q

Null and alternative hypotheses definitions

A

Null (H₀) - The hypothesis assumed to be true
Alternative (H₁) - Tells you about the parameter if that assumption is shown to be wrong

15
Q

Two tailed vs one tailed tests

A

One tailed - H₁:p<… or H₁:p>…
Two tailed - H₁:p ≠…

16
Q

Critical region definition

A

The region of the probability distribution for which the null hypothesis would be rejected if the test statistic falls within it

17
Q

Formally writing out hypothesis test

A

A test statistic is modelled as B(10,p), and a hypothesis test at the 5% significance level uses H₀: p=0.4, H₁: p<0.4

Assuming null to be true, X has distribution X~B(10,0.4)
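The example in this card can be carried through numerically. The sketch below assumes a hypothetical observed value of 1 success, which is not given in the card, and tests it against X~B(10, 0.4) at the 5% level using the lower tail (because H₁: p<0.4). The function names are illustrative.

```python
from math import comb


def lower_tail_probability(x: int, n: int, p: float) -> float:
    """Return P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(x + 1))


def reject_null_lower_tail(observed: int, n: int, p0: float, significance: float) -> bool:
    """One-tailed test of H0: p = p0 against H1: p < p0.

    Reject H0 when P(X <= observed), calculated assuming H0 is true,
    is below the significance level.
    """
    return lower_tail_probability(observed, n, p0) < significance


n, p0, significance = 10, 0.4, 0.05
observed = 1  # hypothetical observed number of successes (assumption)

tail = lower_tail_probability(observed, n, p0)
print(f"P(X <= {observed}) = {tail:.4f}")
print("Reject H0" if reject_null_lower_tail(observed, n, p0, significance) else "Do not reject H0")
```

With this assumed observation the tail probability is about 0.0464, which is below 0.05, so H₀ would be rejected. An observed value of 2 gives about 0.1673, so H₀ would not be rejected.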

18
Q

Critical value

A

First value to fall within critical region

19
Q

Actual significance level

A

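The probability of incorrectly rejecting the null hypothesis, i.e. the actual probability that the test statistic falls within the critical region when H₀ is true. A value in the critical region only provides evidence that the null hypothesis is incorrect; it does not prove it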

20
Q

Calculating critical region w/o thought

A

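Upper tail: find the smallest value r for which P(X≥ r) < significance %
The critical region is then X≥ r (r and everything greater than r)
Lower tail: find the largest value r for which P(X≤ r) < significance %; the critical region is then X≤ r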
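The upper-tail search can be sketched in Python as follows. The example assumes a hypothetical test of H₀: p=0.4 against H₁: p>0.4 with X~B(10, 0.4) at the 5% level; the function name is illustrative. The tail probability returned alongside the critical value is the actual significance level.

```python
from math import comb


def upper_critical_value(n: int, p0: float, significance: float):
    """Find the smallest r with P(X >= r) < significance for X ~ B(n, p0).

    Returns (r, P(X >= r)), where the second value is the actual
    significance level, or None if no value of r qualifies.
    """
    for r in range(n + 1):
        tail = sum(comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(r, n + 1))
        if tail < significance:
            return r, tail
    return None


# Hypothetical example (assumed values): H0: p = 0.4, H1: p > 0.4, n = 10, 5% level
critical_value, actual_significance = upper_critical_value(10, 0.4, 0.05)
print(f"Critical region: X >= {critical_value}")
print(f"Actual significance level: {actual_significance:.4f}")
```

Here P(X ≥ 7) is about 0.0548, which is not below 0.05, while P(X ≥ 8) is about 0.0123, so the critical region is X ≥ 8 and the actual significance level is about 1.23%.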

21
Q

What is a hypothesis test

A

A hypothesis test uses a sample or an experiment to determine whether or not to reject the hypothesis.

22
Q

Critical value/acceptance region

A

The critical value is the first value to fall inside the critical region.
The acceptance region is the region where we do not reject (accept) the null hypothesis

23
Q

Difference between r and ρ

A

r is the PMCC (product moment correlation coefficient) for a sample, ρ is the PMCC for a population
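As an illustration of the sample value r, the sketch below computes the PMCC from paired data using the standard formula r = Sxy / √(Sxx × Syy). The data and the function name `sample_pmcc` are made up for the example.

```python
from math import sqrt


def sample_pmcc(xs, ys):
    """Return the product moment correlation coefficient r of a sample."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical sample: a perfect positive linear relationship gives r = 1
print(sample_pmcc([1, 2, 3, 4], [2, 4, 6, 8]))
```

The population value ρ is generally unknown; r from a sample is used as evidence about it.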

24
Q

Assumptions made in hyp test using pmcc

A

The assumption that the population has a bivariate normal distribution.

25
Q

Residual

A

The difference between an actual y value from the bivariate data and the value predicted by the regression line: y(actual) - (a+bx)

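The residual calculation can be sketched in a few lines of Python. The regression coefficients a and b and the data points below are hypothetical values chosen for illustration.

```python
def residuals(xs, ys, a, b):
    """Return y(actual) - (a + b*x) for each data point."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Hypothetical regression line y = 1 + 2x and two data points
print(residuals([1, 2], [3, 6], a=1, b=2))
```

A positive residual means the data point lies above the regression line, and a negative residual means it lies below.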
26
Q

When can a normal distribution be used to approximate a binomial one

A

  • n ≥ 20 and p ≈ 0.5
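To see how close the approximation is under these conditions, the sketch below compares an exact binomial probability with its normal approximation. It relies on standard results not stated in the card: the approximating normal has mean np and variance np(1-p), and a continuity correction of 0.5 is applied. The values n = 20, p = 0.5 and x = 12 are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_cdf(x: int, n: int, p: float) -> float:
    """Exact P(X <= x) for X ~ B(n, p)."""
    return sum(comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(x + 1))


def normal_approx_cdf(x: int, n: int, p: float) -> float:
    """Approximate P(X <= x) using N(np, np(1-p)) with a continuity correction."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return NormalDist(mean, sd).cdf(x + 0.5)


n, p, x = 20, 0.5, 12  # illustrative values (assumed)
exact = binomial_cdf(x, n, p)
approx = normal_approx_cdf(x, n, p)
print(f"exact {exact:.4f}, normal approximation {approx:.4f}")
```

For these values the two probabilities agree to about three decimal places.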
27