Full Course Flashcards
(157 cards)
What is a Machine Learning Model?
A function that maps input data to predicted outputs based on training.
What is a Cost Function?
A mathematical function measuring how wrong the model’s predictions are (e.g., Mean Squared Error).
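As a quick illustration (a minimal sketch, not course code), MSE can be computed directly with NumPy:

    import numpy as np

    def mse(y_true, y_pred):
        # Mean Squared Error: average of the squared prediction errors
        return np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)

    print(mse([3.0, 5.0], [2.5, 5.5]))  # 0.25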
What are Training and Test Errors?
Training error: Error on the training data. Test error: Error on unseen data — more important for model evaluation.
What is Overfitting?
The model fits noise instead of the underlying pattern; low training error, high test error.
What is Bias-Variance Trade-off?
Bias: Error from wrong assumptions. Variance: Error from model sensitivity to data. Need a balance to minimise total prediction error.
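For squared-error loss, the expected prediction error at a point decomposes as follows (assuming additive noise with variance σ², which is irreducible):

    Expected test error = Bias² + Variance + σ²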
What is Cross Validation?
Repeatedly splits data into training and validation sets to estimate test error reliably (e.g., k-fold CV).
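A minimal 5-fold CV sketch, assuming scikit-learn is available; the synthetic dataset is purely illustrative:

    from sklearn.datasets import make_regression
    from sklearn.linear_model import LinearRegression
    from sklearn.model_selection import KFold, cross_val_score

    X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
    cv = KFold(n_splits=5, shuffle=True, random_state=0)
    scores = cross_val_score(LinearRegression(), X, y, cv=cv,
                             scoring="neg_mean_squared_error")
    print(-scores.mean())  # estimated test MSE, averaged over the 5 folds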
What is a Machine Learning Pipeline?
A structured sequence: data cleaning → feature engineering → model training → validation → deployment.
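One way such a sequence might look in scikit-learn (a hedged sketch; the step names and hyperparameters are made up for illustration):

    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler, PolynomialFeatures
    from sklearn.linear_model import LinearRegression

    pipe = Pipeline([
        ("scale", StandardScaler()),                  # cleaning / preprocessing
        ("features", PolynomialFeatures(degree=2)),   # feature engineering
        ("model", LinearRegression()),                # model training
    ])
    # pipe.fit(X_train, y_train); pipe.score(X_test, y_test)  # validation step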
What is a Linear Regression Model?
Predicts continuous outcome as a linear function of inputs.
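In symbols (the standard form, with p predictors and an error term ε):

    y = β₀ + β₁x₁ + … + βₚxₚ + ε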
What is Polynomial Regression?
Extends linear regression by including polynomial terms (e.g., x^2, x^3).
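A quick degree-2 fit with NumPy (illustrative numbers only); note the model is still linear in its coefficients:

    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(0, 1, 20)
    y = 1 + 2 * x - 3 * x**2 + rng.normal(0, 0.05, 20)
    coeffs = np.polyfit(x, y, deg=2)
    print(coeffs)  # roughly [-3, 2, 1], highest degree first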
What is a General Linear Model?
Extends linear regression to include multiple predictors and interaction terms, while remaining linear in the parameters.
What are Linear Basis Functions?
Transform the inputs into a new feature space (e.g., polynomials, splines) before fitting a linear model.
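The model stays linear in the weights even though the basis functions φⱼ may be nonlinear in x:

    f(x) = w₁φ₁(x) + w₂φ₂(x) + … + wₘφₘ(x)   (e.g., φⱼ(x) = xʲ, or a spline)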
What are Estimators and the Likelihood Function?
Estimator: A rule for estimating parameters from data. Likelihood function: The probability of the observed data given the parameters, viewed as a function of the parameters.
What are Maximum Likelihood Estimates (MLEs)?
Parameter values that maximise the likelihood function.
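Equivalently, MLEs maximise the log-likelihood of the observed data x₁, …, xₙ:

    θ̂_MLE = argmax_θ Σᵢ log p(xᵢ | θ)

For example, for a normal model with known variance, the MLE of the mean is the sample mean.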
What are the Bias, Variance, and Mean Squared Error of Estimators?
Bias: Difference between estimator’s expected value and true parameter. Variance: Variability of estimator. MSE = Bias² + Variance.
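Written out for an estimator θ̂ of a parameter θ:

    MSE(θ̂) = E[(θ̂ − θ)²] = Bias(θ̂)² + Var(θ̂)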
What is Asymptotic Optimality of MLEs?
MLEs are consistent and asymptotically efficient as the sample size increases.
What are Confidence Intervals?
An interval estimate that, under repeated sampling, contains the true parameter with a specified coverage probability (e.g., 95%).
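A standard example (assuming a known standard deviation σ), the 95% confidence interval for a mean:

    x̄ ± 1.96 · σ / √n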
What is Hypothesis Testing and p-values?
Tests whether an effect exists; the p-value measures the strength of evidence against the null hypothesis.
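A minimal one-sample t-test sketch, assuming SciPy is available (the simulated data are illustrative only):

    import numpy as np
    from scipy import stats

    x = np.random.default_rng(1).normal(loc=0.3, scale=1.0, size=50)
    t_stat, p_value = stats.ttest_1samp(x, popmean=0.0)  # H0: true mean = 0
    print(t_stat, p_value)  # small p-value -> evidence against H0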
What is Frequentist Inference for Linear Regression?
Uses OLS estimates, confidence intervals, t-tests, and F-tests for model inference.
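A minimal OLS sketch with statsmodels, assuming it is installed (simulated data and coefficients are purely illustrative):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 2))
    y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.3, size=100)
    results = sm.OLS(y, sm.add_constant(X)).fit()
    print(results.summary())  # coefficients, t-tests, F-test, 95% CIs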
What are the Assumptions and Limitations of Regression Output?
Assumptions: Linearity, Independence, Homoscedasticity, Normality of errors. Limitations: Sensitive to outliers and multicollinearity.
What are the Elements of Bayesian Inference?
Updates beliefs using Bayes’ theorem: Posterior ∝ Likelihood × Prior.
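In full, for parameter θ and data x:

    p(θ | x) = p(x | θ) · p(θ) / p(x) ∝ Likelihood × Prior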
What are the Prior and Posterior Distributions?
Prior: Beliefs before seeing data. Posterior: Updated beliefs after data.
What are Conjugate Models?
Priors chosen so that the posterior belongs to the same distributional family as the prior.
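The classic example is the Beta-Binomial pair:

    Prior: θ ~ Beta(α, β); data: x successes in n trials
    Posterior: θ | x ~ Beta(α + x, β + n − x)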
What are the Bayes Estimators?
Posterior mean or median used as point estimates.
What are the Credible Intervals?
Bayesian version of confidence intervals; contains the parameter with a specified posterior probability (e.g., 95%).
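For instance, with the Beta posterior above (assuming a Beta(1, 1) prior and 7 successes in 10 trials, purely for illustration), a 95% credible interval can be read off the posterior quantiles using SciPy:

    from scipy import stats

    posterior = stats.beta(1 + 7, 1 + 3)            # Beta(8, 4) posterior
    lower, upper = posterior.ppf([0.025, 0.975])    # central 95% credible interval
    print(lower, upper)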