Statistics Flashcards
(10 cards)
notation
Xi - random var with distribution N(a, b^2)
xi - observed value
sample mean for observations x1, x2, ..., xn
x’ = (1/n) * sum from i = 1 to n of xi
sample var for observations
s^2 = 1/(n-1) * sum((xi - x’)^2) = 1/(n-1) * (sum(xi^2) - n*x’^2)
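A minimal Python sketch to check that the definition of the sample variance and its shortcut form agree. The shortcut only works when the whole difference sum(xi^2) - n*x’^2 is divided by n-1. The function names and the data values are made up for illustration.

```python
from statistics import variance


def sample_mean(xs):
    """x' = (1/n) * sum of xi."""
    return sum(xs) / len(xs)


def sample_var(xs):
    """s^2 from the definition: squared deviations from x', divided by n-1."""
    n = len(xs)
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (n - 1)


def sample_var_shortcut(xs):
    """s^2 from the shortcut: (sum(xi^2) - n*x'^2) / (n-1)."""
    n = len(xs)
    m = sample_mean(xs)
    return (sum(x * x for x in xs) - n * m * m) / (n - 1)


xs = [2.0, 4.0, 4.0, 5.0, 7.0]  # hypothetical observations
print(sample_mean(xs), sample_var(xs), sample_var_shortcut(xs))
print(variance(xs))  # the standard library also divides by n-1
```

The shortcut form can lose precision when the mean is large relative to the spread, so the definition form is usually the safer one to compute with.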
sample correlation of the observations
r = sum((xi-x’)(yi-y’)) / sqrt(sum((xi-x’)^2) * sum((yi-y’)^2)), where r measures the linear correlation between the x and y observations
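A short Python sketch of the sample correlation, written directly from the sums of products and squares of deviations. The function name and the data are hypothetical; exactly linear data should give r = 1 or r = -1 up to rounding.

```python
import math


def sample_corr(xs, ys):
    """r = sum((xi-x')(yi-y')) / sqrt(sum((xi-x')^2) * sum((yi-y')^2))."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]       # hypothetical x observations
ys_up = [2 * x + 1 for x in xs]      # exactly linear, increasing
ys_down = [-3 * x + 2 for x in xs]   # exactly linear, decreasing
print(sample_corr(xs, ys_up), sample_corr(xs, ys_down))
```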
what is the statistic, estimator and estimate
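statistic - a function of the data; applied to the random sample X1, ..., Xn it is itself a random variable
λ’ = 1/X’ - estimator (a statistic used to estimate a parameter, written in terms of the random variables)
λ’ = 1/x’ - estimate (the number obtained by plugging in the observed values)
an estimator is a random variable and has a distribution; an estimate is a fixed number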
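A small simulation sketch in Python of the estimator/estimate distinction, assuming exponential data with rate λ (the card only gives the formula 1/x’, so the exponential model, the rate 2.0, and the sample sizes are assumptions made for illustration). Each new sample gives a different estimate, which is what it means for the estimator to have a distribution.

```python
import random
from statistics import fmean


def rate_estimate(xs):
    """Estimate of the rate: 1 / x', computed from observed values."""
    return 1 / fmean(xs)


def draw_sample(rng, rate, n):
    """Draw n observations from an exponential distribution (assumed model)."""
    return [rng.expovariate(rate) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the run is repeatable
true_rate = 2.0         # hypothetical true parameter

# Same estimator, five different samples, five different estimates.
estimates = [rate_estimate(draw_sample(rng, true_rate, 50)) for _ in range(5)]
print(estimates)
```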
methods of estimation
least squares estimation is useful because it does not make any assumption about the distribution of the data, only about the mean
maximum likelihood estimation does require a particular distributional form for the data
properties of estimators
bias = E[Φ’ - Φ]
Φ’ - estimator of parameter Φ
mean squared error = E[(Φ’ - Φ)^2]
E[(Φ’-Φ)^2] = Var(Φ’) if Φ’ is unbiased
E[(Φ’ - Φ)^2] = Var(Φ’) + bias^2 in general; the bias^2 term is nonzero when Φ’ is biased
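A simulation sketch in Python of bias and mean squared error, comparing two estimators of a variance: dividing by n (biased) and by n-1 (unbiased). The normal model, the true variance 4, the sample size 5 and the number of repetitions are all assumptions chosen for illustration. Within the simulation, the empirical MSE equals the empirical variance plus bias^2 up to rounding.

```python
import random
from statistics import fmean, pvariance, variance


def mse_parts(estimates, true_value):
    """Return (bias, variance, mse) of a list of simulated estimates."""
    bias = fmean(estimates) - true_value
    var = pvariance(estimates)  # spread of the estimates around their own mean
    mse = fmean([(e - true_value) ** 2 for e in estimates])
    return bias, var, mse


rng = random.Random(0)
true_var = 4.0        # hypothetical true parameter (sigma = 2)
n, reps = 5, 2000     # small samples make the bias easy to see

samples = [[rng.gauss(0.0, 2.0) for _ in range(n)] for _ in range(reps)]
biased = [pvariance(s) for s in samples]    # divides by n
unbiased = [variance(s) for s in samples]   # divides by n-1

for name, est in (("divide by n", biased), ("divide by n-1", unbiased)):
    bias, var, mse = mse_parts(est, true_var)
    print(name, "bias:", bias, "var:", var, "mse:", mse, "var + bias^2:", var + bias ** 2)
```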
confidence level
expressed as a percentage, with common values 90%, 95% or 99%; it is the probability that the interval-building procedure produces an interval containing the true parameter
α - level associated with confidence level
c - confidence level
α = 1 - c
CI for mean of normal data, σ^2 known; X1, X2, …, Xn are a random sample from a normal distribution N(a, σ^2)
a 100(1- α)% confidence interval for a is based symmetrically around X’ as:
(X’ - z * σ/sqrt(n), X’ + z * σ/sqrt(n))
where
z is the 1 - α/2 quantile of the standard normal distribution: z solves Φ(z) = 1 - α/2, with Φ here denoting the standard normal CDF; in Matlab it is evaluated with the quantile function norminv(1 - α/2, 0, 1)
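A minimal Python sketch of the known-σ interval. It uses `NormalDist().inv_cdf` from the standard library as the analogue of the Matlab normal quantile call; the function name `z_interval` and the data are hypothetical.

```python
import math
from statistics import NormalDist, fmean


def z_interval(xs, sigma, alpha=0.05):
    """100(1 - alpha)% CI for the mean when sigma is known."""
    n = len(xs)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # standard normal quantile
    half_width = z * sigma / math.sqrt(n)
    centre = fmean(xs)
    return centre - half_width, centre + half_width


xs = [9.6, 10.4, 10.1, 9.9, 10.3, 9.7, 10.2, 9.8]  # hypothetical observations
print(z_interval(xs, sigma=0.5))              # 95% interval
print(z_interval(xs, sigma=0.5, alpha=0.01))  # 99% interval, wider
```

Lowering α (raising the confidence level) increases z and widens the interval; increasing n narrows it at the rate 1/sqrt(n).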
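CI for mean of normal data, σ^2 unknown: uses the t distribution with n-1 degrees of freedom
the t distribution is similar to the normal but has heavier tails; σ is replaced by the sample standard deviation s and z by the t quantile, giving (X’ - t * s/sqrt(n), X’ + t * s/sqrt(n)). t quantiles are calculated in Matlab using tinv(p, v), where p is the probability 1 - α/2 and v is the degrees of freedom n-1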
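A Python sketch of the unknown-σ interval. The standard library has no t quantile function, so this sketch computes one numerically (Simpson's rule for the CDF, bisection for the quantile) as a stand-in for the Matlab t quantile call. In practice a statistics library such as SciPy (`scipy.stats.t.ppf`) would be the usual choice. All function names and the data are hypothetical.

```python
import math
from statistics import fmean, stdev


def t_pdf(x, v):
    """Density of the t distribution with v degrees of freedom."""
    log_c = (math.lgamma((v + 1) / 2) - math.lgamma(v / 2)
             - 0.5 * math.log(v * math.pi))
    return math.exp(log_c - (v + 1) / 2 * math.log1p(x * x / v))


def t_cdf(x, v, steps=2000):
    """CDF by Simpson's rule on [0, |x|], using symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, v) + t_pdf(a, v)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, v)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_quantile(p, v):
    """Quantile for 0.5 <= p < 1, found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, v) < p:  # expand until the quantile is bracketed
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, v) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_interval(xs, alpha=0.05):
    """100(1 - alpha)% CI for the mean when sigma is unknown."""
    n = len(xs)
    t = t_quantile(1 - alpha / 2, n - 1)
    half_width = t * stdev(xs) / math.sqrt(n)  # stdev divides by n-1
    centre = fmean(xs)
    return centre - half_width, centre + half_width


xs = [4.9, 5.1, 5.0, 5.2, 4.8, 5.0, 5.1, 4.9, 5.0, 5.0]  # hypothetical data
print(t_quantile(0.975, len(xs) - 1))
print(t_interval(xs))
```

For small n the t quantile is noticeably larger than the normal quantile, so the interval is wider than the known-σ version; as n grows the two quantiles converge.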