terms and definitions Flashcards
the problem of multiple comparisons
beware whenever someone runs many tests and then picks the one that looks best
eg we flip 1000 fair coins 100 times each; we then select the 10 “best” coins that came up heads most often, claiming these coins are “lucky”–but chance alone guarantees some coins land near the top, so we have no causal claim to the top 10’s favoritism of heads
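A quick simulation of this card (hypothetical setup, Python stdlib only): every coin is fair, yet the post-hoc “best” coins look lucky.

```python
import random

random.seed(0)  # deterministic for reproducibility

# 1000 fair coins, 100 flips each; count heads per coin
heads = [sum(random.random() < 0.5 for _ in range(100)) for _ in range(1000)]

# cherry-pick the 10 "best" coins after the fact
top10 = sorted(heads, reverse=True)[:10]
print(top10)  # all well above the expected 50 heads, though every coin is fair
```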
prior probability
in Bayesian statistical inference, prior probability is the probability of an event based on established knowledge, before empirical data is collected; “what is originally believed before new evidence is introduced”
e.g. consider classifying an illness, knowing only that a person has a fever and a headache. These symptoms are indications of both influenza and of Ebola virus. But far more people have the flu than Ebola (the prior probability of influenza is much higher than that of Ebola) so based on those symptoms, you would classify the illness as the flu
posterior probability
in Bayesian statistics, the posterior probability is the probability after conditioning on new evidence–ie after the desired “new” information has come in–used to refine the probability distribution and make an “adjusted guess”
a posterior can, in turn, become a prior, if still newer information arrives that leads to its revision
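A minimal sketch of a prior-to-posterior update with Bayes’ rule, using the flu/Ebola card above; the base rates and symptom likelihoods are made-up numbers for illustration only.

```python
# made-up base rates and symptom likelihoods, for illustration only
prior = {"flu": 0.999, "ebola": 0.001}     # P(disease)
likelihood = {"flu": 0.90, "ebola": 0.95}  # P(fever & headache | disease)

# Bayes' rule: posterior is proportional to prior * likelihood
unnorm = {d: prior[d] * likelihood[d] for d in prior}
total = sum(unnorm.values())
posterior = {d: unnorm[d] / total for d in unnorm}
print(posterior)  # flu still dominates: the prior swamps the likelihood
```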
margin of error (in context of samples)
Since a sample is used to represent a population, the sample’s results are expected to differ from what the result would have been if you had surveyed the entire population. This difference is called the margin of error. [eg in inferences on population means]
t-statistic
usually in context of hypothesis testing between means (eg between two sample means, or between a sample mean and a population mean)
a general form for t statistics is,
t(ŷ) = (ŷ-y)/s.e.(ŷ),
ie the t statistic for point estimate ŷ is the recentered ŷ divided by the standard error for point estimate ŷ
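The general form above, sketched as a one-sample t statistic computed by hand (hypothetical data; ŷ is the sample mean, y the hypothesized population mean):

```python
import math

sample = [5.1, 4.9, 5.3, 5.0, 5.2, 4.8, 5.4, 5.1]  # hypothetical data
y = 5.0                   # hypothesized population mean
n = len(sample)
y_hat = sum(sample) / n   # point estimate: the sample mean

# s.e.(y_hat): unbiased sample variance (Bessel), divided by n, square root
s2 = sum((x - y_hat) ** 2 for x in sample) / (n - 1)
se = math.sqrt(s2 / n)

t = (y_hat - y) / se      # recentered estimate over its standard error
print(t)
```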
probability triplet; event space, sample space
- a probability space, (O,A,P)
- O the sample space (eg the real line)
- A a sigma algebra of subsets of O (eg Borel sigma algebra)
- P a measure, normalized as P(O)=1
- subsets of O are called events; elements of A are called random events (ie they are measurable, so P assigns them a probability)
- if O is countable, we generally call A the event space
- a random variable then maps events in the probability space to some associated state space (ie assigns values to events)
state space
the state space “separates” the random events from the values assigned to those events
for each outcome in the sample space (eg the result of 10 coin tosses), we can assign a value; a random variable’s state space consists of these values
variance
as the second centered moment:
var(X) = E( [X-E(X)]^2 ) = E(X^2)-(E(X))^2
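The two forms of the second centered moment can be checked numerically on a small made-up discrete distribution:

```python
# a small made-up discrete distribution
values = [0, 1, 2, 3]
probs = [0.1, 0.2, 0.3, 0.4]

ex = sum(v * p for v, p in zip(values, probs))       # E(X)
ex2 = sum(v * v * p for v, p in zip(values, probs))  # E(X^2)
var_centered = sum((v - ex) ** 2 * p for v, p in zip(values, probs))
var_moment = ex2 - ex ** 2
print(var_centered, var_moment)  # the two forms agree
```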
linear functions of a random variable
let f(X) = aX+b; then:
- E[f(X)] = aE(X) + b
- var[f(X)] = a^2 var(X)
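Both identities can be verified on a toy sample (they hold exactly for sample means and variances as well):

```python
a, b = 3.0, -2.0
xs = [1.0, 4.0, 2.5, 7.0, 3.5]   # a toy sample
fx = [a * x + b for x in xs]

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return sum((x - m) ** 2 for x in v) / len(v)

# E[aX+b] = a E(X) + b  and  var(aX+b) = a^2 var(X)
print(mean(fx), a * mean(xs) + b)
print(var(fx), a ** 2 * var(xs))
```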
statistical inference
deducing properties of an underlying probability distribution from a dataset
statistic
property of a sample from the population
point estimate (and bias, relative efficiency, and MSE)
- a point estimate, x_e, of a population parameter, x_a, is a best guess of the value of x_a
- the bias of a point estimate is, bias = E(x_e)-x_a
- standard unbiased estimates include:
  - sample mean for the mean of any distribution
  - p_e = X/n for binomial B(n,p_a)
  - sum_i (x_i - x̄)^2 / (n-1), with x̄ the sample mean, for the variance of any distribution (dividing by n-1 rather than n is the Bessel correction)
- sampling distribution is the distribution of the point estimate derived from samples
- relative efficiency for two different point estimates, var(x_e1) / var(x_e2)
- mean square error for point estimate, E((x_e-x_a)^2) = var(x_e) + bias^2
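A simulation sketch of bias and the Bessel correction: dividing by n underestimates the variance by a factor (n-1)/n on average, while dividing by n-1 is unbiased. Numbers are illustrative.

```python
import random

random.seed(2)  # deterministic for reproducibility

true_var = 4.0  # samples drawn from N(0, sd=2)
n = 5
trials = 50_000

biased_sum = unbiased_sum = 0.0
for _ in range(trials):
    xs = [random.gauss(0, 2) for _ in range(n)]
    m = sum(xs) / n
    ss = sum((x - m) ** 2 for x in xs)
    biased_sum += ss / n          # divide by n: biased low
    unbiased_sum += ss / (n - 1)  # divide by n-1: Bessel correction

biased = biased_sum / trials      # ~ (n-1)/n * true_var = 3.2
unbiased = unbiased_sum / trials  # ~ 4.0
print(biased, unbiased)
```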
standard error (and eg means)
- standard error for a point estimate is the standard deviation of the sampling distribution
- ie we repeatedly pull n samples from the population, and consider the s.d. of the resulting distribution of parameter values
- eg for s.e. on the mean for n samples from a population with variance sig^2, s.e. = sig / sqrt(n)
- standard error is often estimated from a sample, and called the standard error estimate (or just standard error)
- eg the estimated s.e. on the mean for n samples from a population is sig_e / sqrt(n-1), where sig_e is the “biased” sample standard deviation (sqrt(n) in its denominator) and dividing by sqrt(n-1) applies the Bessel correction; equivalently, s / sqrt(n), with s the unbiased sample standard deviation
- typically for point estimates, the s.e. is estimated from a sample, and a distribution fitted to the case (such as Student’s t, assuming the population is approximately normal) is used
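Computing the estimated standard error of the mean from a single (made-up) sample, using the unbiased sample standard deviation:

```python
import math

sample = [12.0, 15.0, 11.0, 14.0, 13.0, 16.0, 10.0, 13.0]  # made-up data
n = len(sample)
xbar = sum(sample) / n

# unbiased sample standard deviation (n-1 in the denominator)
s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
se = s / math.sqrt(n)  # equals sig_e / sqrt(n-1) with the biased s.d.
print(xbar, se)
```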
error vs residual
error–an error term accounts for inherent randomness in a statistical model for a given population’s data; the parameters for the statistical model are generally not known
residual–after using sample data to form estimates of statistical model parameters, the difference between the predictive model’s output and the sample observation is the residual
eg
- a very simple statistical model supposes the population of people heights is, mean height + error (note we do not necessarily know the mean height of the population)
- we take a sample of people, take the average height, and our fitted model is now, mean sample height; when we use this to make predictions (note this has no dependencies on predictor variables), the difference between sample observation and prediction is the residual
law of large numbers
the sample mean converges to the population mean as sample size increases (in probability for the weak law; almost surely for the strong law)
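A sketch: the running mean of fair-coin flips (true mean 0.5) settles toward 0.5 as n grows.

```python
import random

random.seed(3)  # deterministic for reproducibility

flips = [random.random() < 0.5 for _ in range(100_000)]  # fair coin, mean 0.5
for n in (10, 1_000, 100_000):
    print(n, sum(flips[:n]) / n)  # running mean approaches 0.5
```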
error types (I and II) and power of a test
- a table with rows H_0 accepted, H_0 rejected, and columns H_0 true, H_0 false
- type I error corresponds to the lower-left cell–H_0 was falsely rejected; can occur with the problem of “too many tests”
- type II error corresponds to the upper-right cell–H_0 was falsely accepted; can occur when sample sizes are too small
- in context of an H_A, a type II error means we’ve concluded H_0 is plausible when in reality H_A is true
- power of a hypothesis test = 1 - probability of type II error
- the probability the hypothesis was not falsely accepted (higher is better)
- with respect to an H_A:
- a high-power test discriminates well between the hypotheses: failing to reject H_0 is then good evidence against H_A
- with a low-power test, even if H_0 seems plausible, H_A may still be true–a weak result
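A Monte Carlo sketch of power (hypothetical two-sided z-test at alpha = 0.05, true effect mu = 0.5, sigma = 1): larger samples give higher power, ie fewer type II errors.

```python
import math
import random

random.seed(4)  # deterministic for reproducibility

def rejects(n, mu_true, z_crit=1.96):
    """Two-sided z-test of H_0: mu = 0 at alpha ~ 0.05, known sigma = 1."""
    xs = [random.gauss(mu_true, 1) for _ in range(n)]
    z = (sum(xs) / n) * math.sqrt(n)  # sample mean over its s.e. 1/sqrt(n)
    return abs(z) > z_crit

# estimate power = P(reject H_0 | H_A: mu = 0.5) by simulation
powers = {}
for n in (10, 50):
    powers[n] = sum(rejects(n, 0.5) for _ in range(5_000)) / 5_000
print(powers)  # power grows with sample size
```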
Bonferroni (multiple comparisons)
- a method for reducing type I error (false rejection of the null–eg, when comparing population means, falsely declaring a difference in means significant) across large numbers of comparisons; arguably very conservative
- simply divide the desired alpha level for the p-values (eg 0.05) by the number of tests run
- see also Holm-Bonferroni, FDR, and FWER
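The plain Bonferroni procedure on a set of hypothetical p-values:

```python
alpha = 0.05
p_values = [0.001, 0.008, 0.012, 0.04, 0.30]  # hypothetical p-values
m = len(p_values)

threshold = alpha / m  # Bonferroni: test each p-value at alpha / m
significant = [p for p in p_values if p < threshold]
print(threshold, significant)  # only the first two survive the correction
```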
Simpson’s paradox
an association between two random variables appears in the population as a whole, but disappears or reverses when the population is divided into subgroups (or vice versa)
eg graduate admission rates at a university are overall lower for women (total relationship), while within each department the effect disappears or reverses (partial relationship)
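A made-up numerical example of the admissions case: women have the lower admission rate overall, yet the higher rate in every department (because they apply more often to the harder department).

```python
# hypothetical admissions counts: (admitted, applied)
data = {
    "dept A": {"women": (18, 20), "men": (80, 100)},
    "dept B": {"women": (10, 100), "men": (2, 30)},
}

def overall_rate(group):
    admitted = sum(data[d][group][0] for d in data)
    applied = sum(data[d][group][1] for d in data)
    return admitted / applied

# total relationship: women admitted at the lower rate overall
print(overall_rate("women"), overall_rate("men"))

# partial relationship: women admitted at the higher rate in each department
for d in data:
    w, m = data[d]["women"], data[d]["men"]
    print(d, w[0] / w[1], m[0] / m[1])
```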