Probability and Statistics Flashcards

1
Q

What is the binomial coefficient

A

(n k) = n! / (k! (n-k)!)
number of ways to choose k items from n

2
Q

what is a Bernoulli trial

A

an experiment with only 2 possible outcomes (success or failure)

3
Q

rule for probability that 2 independent events both occur

A

‘and’ rule -> multiplication
P(a and b) = P(a) × P(b)

4
Q

rule for probability that one or another event occurs

A

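‘or’ rule -> addition
P(a or b) = P(a) + P(b) - P(a and b)
(the last term is 0 if the events are mutually exclusive)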

5
Q

how to find the probability that a occurs given that b occurs

A

P(a | b) = P(a and b) / P(b)
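
The ratio above can be checked by brute-force counting. The following is a minimal sketch, assuming two fair dice; the events a (total is 8) and b (first die is even) are hypothetical examples chosen for illustration, not part of the original card.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, as the fraction of outcomes where it holds."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

# Hypothetical events used only for this illustration.
a = lambda o: o[0] + o[1] == 8   # the total is 8
b = lambda o: o[0] % 2 == 0      # the first die is even

# Conditional probability: P(a and b) / P(b)
p_a_given_b = prob(lambda o: a(o) and b(o)) / prob(b)
print(p_a_given_b)  # 1/6
```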

6
Q

pdf for binomial distribution
P(X=k) =

A

(n k) p^k (1-p)^(n-k)

binomial coefficient × probability of k successes × probability of n-k failures
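
As a rough illustration, the binomial formula can be written directly with Python's standard library. The function name binomial_pmf is an assumption made for this sketch.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for n independent trials, each with success probability p."""
    # comb(n, k) is the binomial coefficient n! / (k! (n-k)!)
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Probability of exactly 2 heads in 4 tosses of a fair coin:
# 6 * 0.25 * 0.25
print(binomial_pmf(2, 4, 0.5))  # 0.375
```

Summing binomial_pmf(k, n, p) over k = 0..n should give 1, which is a quick sanity check on the formula.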

7
Q

definition of Expectation

A

the sum of all possible outcomes, weighted by their probabilities
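
A small sketch of this definition for a discrete variable, assuming a fair six-sided die as the example; the helper name expectation is hypothetical.

```python
from fractions import Fraction

def expectation(dist):
    """Sum of outcomes weighted by their probabilities.

    dist maps each outcome to its probability.
    """
    return sum(x * p for x, p in dist.items())

# A fair six-sided die: each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
print(expectation(die))  # 7/2
```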

8
Q

When can the Poisson distribution be used

A

large n
small p
(ie rare events)

9
Q

formula for µ, the density parameter

A

µ = np (=E(x))
n = number of trials
p = probability of success

10
Q

pdf for poisson distribution P(X=k) ≈

A

e^(-µ) µ^k / k!
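
The sketch below compares the Poisson formula, written with the standard negative exponent e^(-µ), against the exact binomial for large n and small p. The values n = 1000 and p = 0.002 are made up for illustration.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """Exact binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, mu):
    """Poisson probability of k events when the mean count is mu."""
    return exp(-mu) * mu**k / factorial(k)

# Illustrative rare-event setting: many trials, small success probability.
n, p = 1000, 0.002
mu = n * p  # mu = np

for k in range(5):
    print(k, round(binomial_pmf(k, n, p), 4), round(poisson_pmf(k, mu), 4))
```

The two columns should agree closely, and the agreement gets worse as p grows or n shrinks.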

11
Q

E(x) for binomial distribution

A

np

12
Q

E(x) for Poisson distribution

A

µ = np

13
Q

what are the parameters for the geometric distribution

A

p, probability of success

14
Q

pdf for geometric distribution
P(X = k) =

A

(1 - p)^(k-1) p

probability of k-1 failures followed by one success
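
A minimal sketch of the geometric formula, where k is the trial on which the first success occurs. It also checks numerically that the mean is 1 / p; the value p = 0.2 and the cut-off of 2000 terms are arbitrary choices for this example.

```python
def geometric_pmf(k: int, p: float) -> float:
    """P(first success happens on trial k): k-1 failures, then one success."""
    return (1 - p)**(k - 1) * p

p = 0.2

# Approximate the expectation by summing k * P(X = k) over many terms;
# the terms beyond k = 2000 are negligibly small for this p.
mean = sum(k * geometric_pmf(k, p) for k in range(1, 2000))
print(round(mean, 6))  # 5.0, matching 1 / p
```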

15
Q

expectation E(x) for geometric distribution

A

1 / p

16
Q

parameters for the exponential distribution

A

lambda = the rate parameter

17
Q

what’s the difference between the exponential and geometric distribution

A

geometric = discrete
exponential = continuous
the exponential is the continuous limit of the geometric: it can be used to model the geometric when the number of trials gets large and p gets very small

18
Q

pdf for exponential distribution
f(x) =

A

lambda e^(-lambda x), for x ≥ 0

19
Q

cdf for exponential distribution
F(x) =

A

1 - e^(-lambda x)

(if you can’t remember it, integrate the pdf between 0 and x)
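
The integration tip can be checked numerically. This is a sketch under stated assumptions: the midpoint rule, the step count, and the values lambda = 0.5 and x = 3 are all choices made for the example.

```python
from math import exp

def exp_pdf(x, lam):
    """Exponential density: lambda * e^(-lambda x), for x >= 0."""
    return lam * exp(-lam * x)

def exp_cdf(x, lam):
    """Closed-form cdf: 1 - e^(-lambda x)."""
    return 1 - exp(-lam * x)

def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

lam, x = 0.5, 3.0
numeric = integrate(lambda t: exp_pdf(t, lam), 0.0, x)
print(abs(numeric - exp_cdf(x, lam)) < 1e-6)  # True
```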

20
Q

what does the cdf show

A

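an expression that gives the probability that a random variable X is less than or equal to x, F(x) = P(X ≤ x)
(for a non-negative variable such as the exponential, this is the probability that X falls between 0 and x)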

21
Q

expected value of the exponential distribution

A

1 / lambda

22
Q

parameters of the normal distribution

A

µ - the mean
sigma - the standard deviation (std)

23
Q

expectation for normal distribution

A

µ (the mean)

24
Q

what does the Z scale do (normal distribution)

A

measures how many stds a point lies from the mean of its parent distribution

normalises the data

25
formula for Z scale
Z = (Xi - µ) / std
Xi = point
µ = mean of parent distribution
std = std of parent distribution
26
critical value for 2 tailed standard normal at alpha=0.05
±1.96
(µ ± 1.96 × sigma if not normalised)
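A short sketch tying the Z scale to the critical value, using Python's statistics.NormalDist to recover 1.96 rather than hard-coding it. The point 130 and the parent distribution (mean 100, std 15) are made-up example numbers.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

# Two-tailed critical value at alpha = 0.05: the 97.5th percentile
# of the standard normal, approximately 1.96.
alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Hypothetical point from a parent distribution with mean 100, std 15.
z = z_score(130, 100, 15)
print(z, abs(z) > z_crit)  # 2.0 True
```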
27
when is the t distribution used
small sample size
population std unknown (estimated from the sample)
28
difference between t distribution and normal distribution
t has longer tails, therefore has more extreme critical values for the same significance level
as the sample size increases, the t distribution tends to the normal
29
formula for t scale
t = (X - µ) / Sx
X = sample mean
µ = population mean (often unknown)
Sx = standard error of mean
30
what is standard error of mean (SEM)
Sx = s / root(n)
s = sample std
n = sample size
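The SEM and the t scale can be sketched together with the standard library. The sample values and the hypothesised mean of 5.0 are invented for illustration; statistics.stdev is the sample standard deviation (n - 1 in the denominator).

```python
from math import sqrt
from statistics import mean, stdev

def sem(sample):
    """Standard error of the mean: sample std divided by root(n)."""
    return stdev(sample) / sqrt(len(sample))

def t_statistic(sample, mu0):
    """Distance of the sample mean from mu0, in units of the SEM."""
    return (mean(sample) - mu0) / sem(sample)

# Made-up measurements, tested against a hypothesised mean of 5.0.
sample = [5.1, 4.9, 5.6, 5.2, 5.0, 5.4]
print(t_statistic(sample, 5.0))
```

The result would then be compared with a t table value for n - 1 degrees of freedom.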
31
what is the p value
probability of observing a result equal to or more extreme than the outcome, assuming the null hypothesis is true
32
what is a type one error
rejecting the null when it's true
'false positive'
33
what is a type two error
failing to reject the null when it's false
'false negative'
34
what is alpha level
significance level at which we reject the null
probability of a type one error
35
why shouldn't you use multiple t tests for multiple comparisons
the overall probability of making at least one type 1 error grows with every extra test
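How fast the error rate grows can be sketched as follows, assuming the tests are independent (a simplification, since pairwise t tests on the same groups are not fully independent). The function name familywise_error is hypothetical.

```python
def familywise_error(alpha: float, m: int) -> float:
    """Chance of at least one type 1 error across m independent tests."""
    # Each test avoids a false positive with probability (1 - alpha).
    return 1 - (1 - alpha)**m

# 10 tests is the number of pairwise comparisons among 5 groups.
for m in (1, 3, 10):
    print(m, round(familywise_error(0.05, m), 3))
# 1 0.05
# 3 0.143
# 10 0.401
```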
36
what should you use instead of multiple t tests for comparisons
ANOVA
37
what is the within-group variance
comparing the distribution of replicates to their treatment mean
38
what is the among/between group variance
comparing the distribution of the treatment means to the grand mean
39
what is the F statistic in ANOVA
F = among-group variance / within-group variance
40
what are treatments in ANOVA
the different samples
41
what are replicates in ANOVA
sample units within treatments
42
formula for Chi-square test statistic
∑ ((o - e)^2 / e)
o = observed count
e = expected count
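A minimal sketch of the Chi-square statistic, where o is an observed count and e the matching expected count. The counts are made up: 120 rolls of a die that is expected to be fair.

```python
def chi_square(observed, expected):
    """Sum over categories of (o - e)^2 / e."""
    return sum((o - e)**2 / e for o, e in zip(observed, expected))

# Hypothetical counts for faces 1..6 over 120 rolls; a fair die expects 20 each.
observed = [18, 22, 20, 25, 15, 20]
expected = [20] * 6
print(round(chi_square(observed, expected), 3))  # 2.9
```

The statistic would then be compared with a table value for (number of categories - 1) degrees of freedom.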
43
formula for Pearsons r test statistic
(use Z scale)
r = ∑(Zxi × Zyi) / (n - 1)
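A sketch of Pearson's r on the Z scale. Note that it uses the product of the paired z-scores, which is the standard definition, and the sample standard deviation from statistics.stdev. The data are invented for illustration.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Sum of products of paired z-scores, divided by n - 1."""
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    z_products = (((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys))
    return sum(z_products) / (len(xs) - 1)

# Made-up data lying exactly on a straight line, so r should be 1.
xs = [1, 2, 3, 4, 5]
ys = [2 * x + 1 for x in xs]
print(round(pearson_r(xs, ys), 6))  # 1.0
```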
44
formula for slope estimate, b of a regression line
b = ∑(Xi - X)(Yi - Y) / ∑(Xi - X)^2
Xi = x values
X = mean of x values
Yi = y values
Y = mean of y values
45
residual formula
residual = Yi - Ŷi
observed y value minus the value of y on the regression line
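The slope and residual formulas can be sketched together. The intercept a = Y - bX is not on the cards; it is the usual least-squares intercept and is added here as an assumption so that predicted values can be computed. The data are made up.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares intercept a and slope b of the line y = a + b x."""
    x_bar, y_bar = mean(xs), mean(ys)
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar)**2 for x in xs))
    a = y_bar - b * x_bar  # assumption: the line passes through (x_bar, y_bar)
    return a, b

def residuals(xs, ys):
    """Observed y minus the y value on the fitted regression line."""
    a, b = fit_line(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]

# Made-up points lying exactly on y = 1 + 2x.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
print(fit_line(xs, ys))   # (1.0, 2.0)
print(residuals(xs, ys))  # [0.0, 0.0, 0.0, 0.0]
```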
46
problems with regression analysis
- induced correlations (ie values that sum to 100% or 1, such as mineral compositions, may falsely indicate correlation between variables)
- correlation vs causation
- pseudoreplication (data all taken from a single area don't represent the whole population)
47
What is the t-test used for
- test whether a sample is drawn from a population of specific mean
- test if means of 2 samples differ
48
what is the ANOVA test used for
- test whether ≥ 3 samples are drawn from populations with equal means (like Student's t, extended to more than 2 samples)
49
what is the Chi-square test used for
- test how well observed categorical data fits a given model/expected values
50
How to find within-group variance
s.s / d.f
s.s. = ∑(Xi - X)^2 -> squared distance of replicates from their treatment means
d.f. = n - 1 for each treatment, added together (ie total replicates - number of treatments)
51
how to find among (between) group variance
s.s / d.f
s.s. = ∑ n(Xt - Xg)^2 -> squared distance of treatment means from grand mean, weighted by the n replicates in each treatment
d.f. = number of treatments - 1
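A minimal sketch of a one-way ANOVA F statistic built from the two variances. It assumes the standard form in which squared distances are used and each treatment's term in the among-group sum is weighted by its number of replicates. The three treatments are invented data.

```python
from statistics import mean

def anova_f(groups):
    """F = among-group variance / within-group variance (one-way ANOVA)."""
    k = len(groups)                         # number of treatments
    n_total = sum(len(g) for g in groups)   # total number of replicates
    grand_mean = mean([x for g in groups for x in g])

    # Within: squared distance of each replicate from its treatment mean.
    ss_within = sum((x - mean(g))**2 for g in groups for x in g)
    # Among: squared distance of each treatment mean from the grand mean,
    # weighted by the number of replicates in that treatment.
    ss_among = sum(len(g) * (mean(g) - grand_mean)**2 for g in groups)

    ms_within = ss_within / (n_total - k)   # d.f. = replicates - treatments
    ms_among = ss_among / (k - 1)           # d.f. = treatments - 1
    return ms_among / ms_within

# Made-up data: 3 treatments with 3 replicates each.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(round(anova_f(groups), 6))  # 13.0
```

Here the numerator has 2 degrees of freedom and the denominator has 6, which are the values needed to look up the F table.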
52
when do you reject ANOVA null hypothesis
when F statistic > table value, based on numerator and denominator degrees of freedom
53
assumptions for t test
- data from normally distributed populations
- data from populations of equal variance
- samples drawn at random from parent distributions
54
assumptions for ANOVA
- data drawn from normally distributed populations
- data from populations of equal variance
- data independent of one another