Model Fitting Flashcards

(42 cards)

1
Q

{x1, …, xM} =

A

random sample from pdf p(x) with mean μ and variance σ^2

2
Q

sample mean

A

μ hat = (1/M) sum from i=1 to M of xi

3
Q

as sample size increases, the sample mean becomes

A

increasingly concentrated near the true mean

4
Q

var(μ hat)=

A

σ^2/M

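A quick numerical check of var(μ hat) = σ^2/M (a minimal sketch, not part of the original deck; the exponential pdf, seed, and sample sizes are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
M = 100          # sample size
trials = 20_000  # number of independent samples

# exponential pdf with scale 2: mean mu = 2, variance sigma^2 = 4
samples = rng.exponential(scale=2.0, size=(trials, M))
mu_hat = samples.mean(axis=1)  # one sample mean per trial

print(mu_hat.var())  # empirical var(mu hat), close to...
print(4.0 / M)       # ...the predicted sigma^2 / M = 0.04
```
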
5
Q

for any pdf with finite variance σ^2, as M approaches infinity, μ hat follows

A

a normal pdf with mean μ and variance σ^2/M

6
Q

the central limit theorem explains

A

the importance of the normal pdf in statistics

but it is still based on the asymptotic behaviour of an infinite ensemble of samples that we didn't actually observe

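The CLT can also be seen by simulation (again a sketch; the uniform pdf is chosen purely because it is clearly non-normal but has finite variance):

```python
import numpy as np

rng = np.random.default_rng(1)
M = 50
trials = 50_000

# uniform on [0, 1]: mu = 0.5, sigma^2 = 1/12
mu_hat = rng.uniform(0.0, 1.0, size=(trials, M)).mean(axis=1)

# standardise: (mu_hat - mu) / sqrt(sigma^2 / M) should be ~ N(0, 1)
z = (mu_hat - 0.5) / np.sqrt((1.0 / 12.0) / M)
print(z.mean(), z.std())  # approximately 0 and 1
```
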
7
Q

bivariate normal pdf

A

p(x, y), which is specified by μx, μy, σx, σy, ρ

often used in the physical sciences to model the joint pdf of two random variables

8
Q

the first four parameters of the bivariate normal pdf are

A

equal to the following expectation values

E(x)=μx
E(y)=μy
var(x)=σx^2
var(y)=σy^2

9
Q

the parameter ρ is known as the

A

correlation coefficient

10
Q

what does the correlation coefficient satisfy?

A

E[(x-μx)(y-μy)] = ρσxσy

11
Q

if ρ = 0, then

A

x and y are independent (this holds for the bivariate normal; in general ρ = 0 only means they are uncorrelated)

12
Q

what is E[(x-μx)(y-μy)] = ρσxσy also known as

A

the covariance of x and y, often denoted cov(x, y)

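A sketch verifying cov(x, y) = ρσxσy by sampling a bivariate normal (the parameter values below are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
mu_x, mu_y = 1.0, -2.0
sig_x, sig_y, rho = 2.0, 0.5, 0.7

# build the covariance matrix from sig_x, sig_y and rho
cov = [[sig_x**2, rho * sig_x * sig_y],
       [rho * sig_x * sig_y, sig_y**2]]
x, y = rng.multivariate_normal([mu_x, mu_y], cov, size=100_000).T

print(np.cov(x, y)[0, 1])   # sample covariance of x and y
print(rho * sig_x * sig_y)  # predicted value: 0.7
```
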
13
Q

what does the covariance define?

A

how one variable (x) varies with another (y)

14
Q

ρ > 0

A

positive correlation
y tends to increase as x increases

15
Q

ρ < 0

A

negative correlation
y tends to decrease as x increases

16
Q

the contours of the bivariate normal pdf become narrower and steeper as

A

|ρ| approaches 1

17
Q

what is Pearson's product moment correlation coefficient?

A

r

given sampled data, used to estimate the correlation between variables

18
Q

if p(x,y) is bivariate normal, then r is

A

an estimator of ρ
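
Given sampled data, r can be computed with np.corrcoef (a sketch; the true ρ = 0.7 is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(3)
rho = 0.7
x, y = rng.multivariate_normal(
    [0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=10_000).T

r = np.corrcoef(x, y)[0, 1]  # Pearson's product moment correlation
print(r)                     # close to 0.7: r estimates rho
```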

19
Q

the correlation coefficient is a unitless version of

A

the covariance

20
Q

if x and y are independent variables, cov(x,y)=

A

0

so p(x,y)=p(x)p(y)

21
Q

the method of least squares

A

workhorse method for fitting lines and curves to data in the physical sciences

useful demonstration of underlying statistical principles

22
Q

ordinary least squares

A

the scatter in a plot of (x, y) is assumed to arise from errors in only one of the two variables (conventionally y)

23
Q

ordinary least squares - can write

A

yi = a + b·xi + Ɛi, i.e. each observed yi as a straight line in xi plus a residual Ɛi

24
Q

what is Ɛi

A

the residual of the ith data point

i.e. the difference between the observed value of yi and the value predicted by the best fit, characterised by parameters a and b

25
Q

we assume that the Ɛi are

A

an independently and identically distributed random sample from some underlying probability distribution function with mean zero and variance σ^2 (residuals are equally likely to be positive or negative and all have equal variance)

26
Q

dS/da = 0 when

A

a = a_LS (here S is the sum of squared residuals and a_LS is the least-squares estimate of a)
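
A minimal ordinary least squares sketch for the straight-line model yi = a + b·xi + Ɛi (the true parameters and noise level are invented); np.polyfit solves dS/da = dS/db = 0 in closed form:

```python
import numpy as np

rng = np.random.default_rng(4)
a_true, b_true, sigma = 1.0, 2.5, 0.3

x = np.linspace(0.0, 10.0, 50)
y = a_true + b_true * x + rng.normal(0.0, sigma, size=x.size)

# minimise S = sum of (yi - a - b*xi)^2; polyfit returns [b_LS, a_LS]
b_ls, a_ls = np.polyfit(x, y, deg=1)
residuals = y - (a_ls + b_ls * x)
print(a_ls, b_ls)        # close to 1.0 and 2.5
print(residuals.mean())  # close to 0, as assumed for the residuals
```
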
27
Weighted least squares is an efficient method that makes good use of
small data sets
28
weighted least squares - in the case where σi^2 is constant for all i, the formulae
reduce to those for the unweighted case
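
A weighted least squares sketch (the heteroscedastic errors below are made up). Each point is weighted by 1/σi^2 in S; np.polyfit's w argument expects 1/σi. With constant σi this reduces to the unweighted fit:

```python
import numpy as np

rng = np.random.default_rng(5)
x = np.linspace(0.0, 10.0, 50)
sigma_i = np.where(x < 5.0, 0.2, 1.0)  # smaller errors for x < 5
y = 1.0 + 2.5 * x + rng.normal(0.0, sigma_i)

# each term enters S as (residual / sigma_i)^2, so precise points dominate
b_wls, a_wls = np.polyfit(x, y, deg=1, w=1.0 / sigma_i)
print(a_wls, b_wls)  # close to 1.0 and 2.5
```
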
29
Q

the principle of maximum likelihood is a method to

A

estimate the parameters of a distribution that best fit the observed data

30
Q

principle of maximum likelihood - first

A

decide which model we think best describes the process that generated the data

31
Q

maximum likelihood estimation is a method that will find the values

A

of μ and σ that result in the curve that best fits the data

32
Q

assuming all events are independent, the total probability of observing all of the data is

A

the product of observing each data point individually (i.e. the product of the individual probabilities)
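
A maximum likelihood sketch for a normal model (the data are simulated; scipy is assumed available). Because the events are independent, the likelihood is the product of the individual probabilities, so we minimise the negative log-likelihood:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(6)
data = rng.normal(loc=3.0, scale=1.5, size=1_000)

def neg_log_likelihood(params):
    mu, sigma = params
    # -log of the product of individual pdfs = -sum of the log pdfs
    return -np.sum(norm.logpdf(data, loc=mu, scale=sigma))

result = minimize(neg_log_likelihood, x0=[0.0, 1.0],
                  bounds=[(None, None), (1e-6, None)])
print(result.x)  # close to [3.0, 1.5], the MLE of mu and sigma
```
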
33
Q

when is chi2 used?

A

when we know there are definite outcomes, e.g. flipping a coin, or measuring whether an email arrival rate is constant in time => no errors on the measurement

34
Q

when is reduced chi2 used?

A

when we know there is uncertainty or variance in a measured quantity, e.g. measuring the flux from a galaxy => errors on the measurement

35
Q

Poisson distribution, k =

A

1 (mean)

36
Q

normal distribution, k =

A

2 (mean and variance)

37
Q

degrees of freedom =

A

N - k - 1

38
Q

for the reduced chi2, we don't know the number of outcomes, so the degrees of freedom are

A

the number of data points

39
Q

p-value

A

if the null hypothesis were true, how probable is it that we would measure as large, or larger, a value of chi2?

40
Q

standard value to reject a hypothesis

A

a p-value < 0.05

41
Q

if we obtain a very small p-value (e.g. a few percent), we can interpret this as

A

providing little support for the null hypothesis, which we may then choose to reject
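
A chi2 and p-value sketch for a definite-outcome case like coin flipping (the counts are invented; scipy.stats.chisquare is one way to compute this):

```python
from scipy.stats import chisquare

# 100 coin flips; null hypothesis: the coin is fair (50/50 expected)
observed = [58, 42]
expected = [50, 50]

# degrees of freedom: N - k - 1 = 2 outcomes - 0 fitted params - 1 = 1
stat, p_value = chisquare(observed, f_exp=expected)
print(stat)     # chi2 = sum of (O - E)^2 / E = 2.56
print(p_value)  # ~0.11 > 0.05, so we do not reject the null hypothesis
```
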
42