Statistics/ML Flashcards

1
Q

Bonferroni

A

Alpha/m: test each of m hypotheses at level alpha/m to control the family-wise error rate at alpha.
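
A minimal sketch applying the correction to a list of p-values (the p-values here are made up for illustration):

```python
# Bonferroni: reject H0_i only if p_i < alpha / m
alpha = 0.05
p_values = [0.001, 0.04, 0.03, 0.20]  # hypothetical p-values
m = len(p_values)
reject = [p < alpha / m for p in p_values]
print(reject)  # [True, False, False, False]
```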

2
Q

Bagging

A

Take B bootstrap samples of the training data, fit a model to each, and average their predictions (majority vote for classification). This reduces variance.
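
A minimal sketch with decision trees as the base learner (synthetic data; sklearn assumed available):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X[:, 0] + rng.normal(size=200)

# Fit one tree per bootstrap sample, then average the predictions
models = []
for _ in range(50):
    idx = rng.integers(0, len(X), size=len(X))  # sample rows with replacement
    models.append(DecisionTreeRegressor().fit(X[idx], y[idx]))

y_hat = np.mean([m.predict(X) for m in models], axis=0)
```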

3
Q

Random forest

A

Bagging with decision trees, but at each split only a random subset of the covariates (e.g. p/3 of them for regression, sqrt(p) for classification) is eligible. This decorrelates the trees.
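
In sklearn the per-split subsampling is the max_features argument; a quick sketch on synthetic data:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
y = X[:, 0] - 2 * X[:, 1] + rng.normal(size=100)

# max_features is the fraction of covariates each split may choose from;
# 1/3 mirrors the p/3 rule of thumb for regression
rf = RandomForestRegressor(n_estimators=200, max_features=1/3, random_state=0).fit(X, y)
```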

4
Q

Support vector machines

A

Maximize the margin M subject to every point lying on the correct side of the decision boundary at distance at least M(1 - eps_i), with slack variables eps_i >= 0 and sum of eps_i <= C; the budget C controls how many margin violations are tolerated. SVMs extend this with kernels.
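
A quick sketch with sklearn's SVC on made-up data; note that sklearn's C penalizes the slack (larger C = harder margin), which is the inverse role of the budget C above:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Linear soft-margin classifier; C trades margin width against violations
clf = SVC(kernel="linear", C=1.0).fit(X, y)
```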

5
Q

Newton Raphson

A

x_{n+1} = x_n - f(x_n)/f'(x_n)  (iterate to find a root of f)
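
A minimal sketch finding sqrt(2) as the root of f(x) = x^2 - 2:

```python
def newton(f, fprime, x0, tol=1e-10, max_iter=100):
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x -= step
        if abs(step) < tol:  # stop once the update is tiny
            break
    return x

print(newton(lambda x: x**2 - 2, lambda x: 2 * x, x0=1.0))  # ~1.41421356
```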

6
Q

Gradient descent

A

x_{n+1} = x_n - gamma * gradient of F(x_n), where gamma is the step size (learning rate)
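
Sketch minimizing F(x) = (x - 3)^2, whose gradient is 2(x - 3):

```python
gamma = 0.1  # step size / learning rate
x = 0.0
for _ in range(100):
    grad = 2 * (x - 3)   # gradient of F at the current x
    x -= gamma * grad
print(x)  # ~3.0, the minimizer
```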

7
Q

Logit

A

P(y=1) = e^(X beta)/(e^(X beta)+1)

log(p/(1-p)) = X beta
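
A quick numeric check that the two forms are inverses (numpy assumed; eta = X beta is a made-up scalar here):

```python
import numpy as np

def sigmoid(eta):
    return np.exp(eta) / (1 + np.exp(eta))  # P(y=1) given eta = X beta

eta = 0.7
p = sigmoid(eta)
print(np.log(p / (1 - p)))  # recovers 0.7
```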

8
Q

K nearest neighbors

A

Predict from the k nearest training points: plurality vote for classification, mean of their responses for regression.
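
A bare-bones regression version with a plain distance computation (numpy assumed):

```python
import numpy as np

def knn_regress(X_train, y_train, x_new, k=5):
    # Euclidean distance from x_new to every training point
    dists = np.linalg.norm(X_train - x_new, axis=1)
    nearest = np.argsort(dists)[:k]     # indices of the k closest points
    return y_train[nearest].mean()      # mean response (use a vote for classification)
```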

9
Q

Std error

A

Std deviation of a statistic's sampling distribution, or an estimate of it. E.g. for the sample mean:

(1/sqrt(n)) * sqrt( (1/(n-1)) * sum (xi - mean)^2 )  =  s/sqrt(n)
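
Quick check against scipy, which also uses the n-1 denominator by default:

```python
import numpy as np
from scipy import stats

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
se = x.std(ddof=1) / np.sqrt(len(x))  # s / sqrt(n)
print(se, stats.sem(x))               # identical
```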

10
Q

Normal density

A

(1/(sigma * sqrt(2 pi))) * e^( -(x - mu)^2 / (2 sigma^2) )
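
Sanity check against scipy.stats.norm (mu, sigma, x are arbitrary values):

```python
import numpy as np
from scipy.stats import norm

mu, sigma, x = 1.0, 2.0, 0.5
manual = (1 / (sigma * np.sqrt(2 * np.pi))) * np.exp(-((x - mu) ** 2) / (2 * sigma**2))
print(manual, norm.pdf(x, loc=mu, scale=sigma))  # identical
```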

11
Q

T test

A

Assumes normal data, but fine for large samples by the CLT.

t = tau-hat / se(tau-hat)

E.g. a difference in means divided by

sigma-hat * sqrt(1/n1 + 1/n2)
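
Manual two-sample (pooled) t statistic versus scipy, on simulated data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x1, x2 = rng.normal(0, 1, 30), rng.normal(0.5, 1, 40)

n1, n2 = len(x1), len(x2)
# Pooled estimate of sigma^2
s2 = ((n1 - 1) * x1.var(ddof=1) + (n2 - 1) * x2.var(ddof=1)) / (n1 + n2 - 2)
t = (x1.mean() - x2.mean()) / (np.sqrt(s2) * np.sqrt(1 / n1 + 1 / n2))
print(t, stats.ttest_ind(x1, x2).statistic)  # identical (equal-variance t-test)
```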

12
Q

Covariance of beta hat for regression

A

sigma^2 * (X'X)^(-1)

Estimate sigma^2 with
(1/(n - p)) * sum of (y - X beta-hat)^2
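
Computing it directly on simulated data (numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
y = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - p)            # unbiased estimate of sigma^2
cov_beta = sigma2_hat * np.linalg.inv(X.T @ X)  # estimated Cov(beta-hat)
se_beta = np.sqrt(np.diag(cov_beta))            # standard errors of coefficients
```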

13
Q

Law of large numbers

A

Lim as n -> inf of
P( |mean(Y1, ..., Yn) - mu| >= epsilon ) = 0,
for any epsilon > 0
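
Quick simulation: the running mean of coin flips settles near mu = 0.5:

```python
import numpy as np

rng = np.random.default_rng(0)
flips = rng.integers(0, 2, size=100_000)
running_mean = flips.cumsum() / np.arange(1, len(flips) + 1)
print(running_mean[[9, 999, 99_999]])  # drifts toward 0.5 as n grows
```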

14
Q

Central Limit Theorem

A

Limit as n -> infinity of
P( (1/(sigma/sqrt(n))) * (mean(Y1, ..., Yn) - mu) <= z ) = Phi(z)

I.e.

(1/(sigma/sqrt(n))) * (Ybar - mu) = sqrt(n) * (Ybar - mu) / sigma
converges in distribution to a standard normal.

Note that Ybar has std dev
sigma / sqrt(n)
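
Simulation: standardized means of (decidedly non-normal) exponential draws look standard normal:

```python
import numpy as np

rng = np.random.default_rng(0)
n, mu, sigma = 50, 1.0, 1.0  # Exponential(1) has mean 1 and std dev 1
means = rng.exponential(scale=1.0, size=(10_000, n)).mean(axis=1)
z = (means - mu) / (sigma / np.sqrt(n))  # standardize each sample mean
print(z.mean(), z.std())                 # ~0, ~1
```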

15
Q

SUTVA

A

Stable Unit Treatment Value Assumption

The response of one unit depends only on its own treatment, not on the treatment of others.

E.g. if some people are assigned to travel on public transportation and some in cars, SUTVA wouldn't hold, because each assignment affects the traffic everyone experiences.

16
Q

Kmeans

A

Start with k initial centroids.
Assign each point to its nearest centroid.
Recompute each centroid as the mean of its assigned points.
Repeat until assignments stop changing.
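
A bare-bones sketch of the loop (numpy assumed; ignores the empty-cluster edge case):

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]  # initial k points
    for _ in range(n_iter):
        # Assign each point to its nearest centroid
        labels = np.argmin(np.linalg.norm(X[:, None] - centroids, axis=2), axis=1)
        # Recompute centroids as cluster means (assumes no cluster goes empty)
        centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centroids
```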

17
Q

SMOTE

A

For class imbalance:
select a point of the uncommon class at random,
then select one of its nearest same-class neighbors at random,
draw a line between them, choose a random spot on that line, and create a synthetic point there with the uncommon label.

You can also undersample the majority class and/or oversample (i.e. repeat) the minority class.
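
Sketch of the interpolation step for one synthetic point (numpy assumed; simplified to the single nearest neighbor, whereas real implementations such as imbalanced-learn's SMOTE pick among k neighbors):

```python
import numpy as np

rng = np.random.default_rng(0)
X_min = rng.normal(size=(20, 2))  # stand-in minority-class points

i = rng.integers(len(X_min))                     # random minority point
dists = np.linalg.norm(X_min - X_min[i], axis=1)
j = np.argsort(dists)[1]                         # its nearest minority neighbor
lam = rng.uniform()                              # random spot on the segment
x_new = X_min[i] + lam * (X_min[j] - X_min[i])   # synthetic minority sample
```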

18
Q

Accuracy

A

(TP + TN)/
(TP + TN + FP + FN)

19
Q

Precision

A

TP/(TP + FP)
PREcision is TP divided by PREdicted positive

20
Q

Sensitivity

A

TP/(TP + FN)
SeNsitivity is Positive
Correct positives among all actual positives

Also called recall, or the true positive rate

21
Q

Specificity

A

TN/(TN + FP)

SPIN: sPecificity is Negative
Correct negatives among all actual negatives
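
The four metrics above from one confusion matrix, as a quick reference (counts are made up):

```python
TP, TN, FP, FN = 40, 30, 10, 20  # hypothetical confusion-matrix counts

accuracy    = (TP + TN) / (TP + TN + FP + FN)  # 0.7
precision   = TP / (TP + FP)                   # 0.8
sensitivity = TP / (TP + FN)                   # 0.667 (recall)
specificity = TN / (TN + FP)                   # 0.75
```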

22
Q

Amazon recommendation systems, Netflix

A

Item-based collaborative filtering algorithms, using cosine distance between item vectors.
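
Sketch of the core similarity computation on a hypothetical user-by-item rating matrix:

```python
import numpy as np

# Rows = users, columns = items (made-up ratings, 0 = unrated)
R = np.array([[5, 3, 0],
              [4, 0, 4],
              [1, 1, 5]], dtype=float)

def cosine_sim(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Similarity between the rating vectors of items 0 and 1
print(cosine_sim(R[:, 0], R[:, 1]))
```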

23
Q

Type 1 and 2

A

Type 1: false positive, i.e. mistaken rejection of the null hypothesis
Type 2: false negative, i.e. failure to reject a false null

24
Q

softmax

A

exp(z_i) / (sum over j of exp(z_j))
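
A numerically stable version (subtracting the max doesn't change the result):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # shift for numerical stability
    return e / e.sum()

print(softmax(np.array([1.0, 2.0, 3.0])))  # nonnegative, sums to 1
```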

25
Q

imputing missing values

A

Can just impute with the mean, median, etc., or some constant.

"MICE stands for Multivariate Imputation via Chained Equations, and it's one of the most common packages for R users. It assumes the missing values are missing at random (MAR). The basic idea behind the algorithm is to treat each variable that has missing values as a dependent variable in regression and treat the others as independent (predictors)."

https://www.r-bloggers.com/2023/01/imputation-in-r-top-3-ways-for-imputing-missing-data/
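
In Python, sklearn's IterativeImputer follows the same chained-equations idea (it is explicitly experimental, hence the enabling import); a tiny sketch:

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer, SimpleImputer

X = np.array([[1.0, 2.0], [3.0, np.nan], [5.0, 6.0]])

print(SimpleImputer(strategy="mean").fit_transform(X))    # mean imputation
print(IterativeImputer(random_state=0).fit_transform(X))  # regression-based, MICE-style
```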

26
Q

gbm hyperparameters

A

Shrinkage or learning rate:
add only a small multiple of each new tree at each step.

Bag fraction:
"fraction of independent training observations selected to create the next tree in the expansion. Introduces randomness in the model fit; if bag_fraction < 1 then running the same model twice will result in similar but different fits."

num_features: "number of random features/columns to use in training model."

Interaction depth: max depth of each tree (although defined differently in gbm3).

Some of this is from the gbm3 documentation.
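
Rough sklearn analogues, for orientation (these are GradientBoostingRegressor's names, not gbm3's, and max_features applies per split):

```python
from sklearn.ensemble import GradientBoostingRegressor

gbm = GradientBoostingRegressor(
    learning_rate=0.05,  # shrinkage
    subsample=0.5,       # bag fraction
    max_features=0.8,    # random fraction of features considered per split
    max_depth=3,         # tree depth (gbm3's interaction depth is analogous)
    n_estimators=500,
)
```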

27
Q

boxplot definitions

A

Box spans the 25th to 75th percentiles (the IQR), with a line at the median; the outer lines (whiskers) extend to the most extreme points within 1.5*IQR of the box; points beyond that are drawn as outliers.
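
Computing the fences by hand on made-up data (numpy assumed):

```python
import numpy as np

x = np.array([1, 2, 2, 3, 4, 4, 5, 9, 15.0])
q1, q3 = np.percentile(x, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
print(x[(x < lower) | (x > upper)])  # points a boxplot would draw as outliers
```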