Regression Models Flashcards

(34 cards)

1
Q

What is supervised learning?

A

Learning a function that maps an input and some parameters to a predicted output (label), which is then evaluated based on a ground truth value

2
Q

What is an observation in supervised learning?

A

One row of data in the dataset

3
Q

What is a feature in supervised learning?

A

One column of data in the dataset

4
Q

What are hyperparameters?

A

Parameters that we define before running the model, and are not learned directly from the data

5
Q

Models are typically trained in two phases: […] and […]

A

Training and evaluation (/prediction)

6
Q

The training phase involves…

A

Teaching the model which predictor values correspond to which outputs, i.e. learning the parameters that define the relationship between the features and the target

7
Q

The prediction/evaluation phase involves…

A

Gaining new observations and feeding them into our trained model to create a prediction

8
Q

An update rule is defined which is calculated using the value from a […].

A

Loss function

9
Q

What is a loss function?

A

A quantitative measure of error in predicted values of a model

10
Q

Problems with quantitative responses are usually […] problems, while those with qualitative responses are usually […] problems.

A

Regression, classification

11
Q

Provide an example of a regression model used for qualitative responses.

A

Logistic regression, since it estimates the probability of a choice

12
Q

What are regression models?

A

Regression models are used to model any continuous target or outcome (e.g. loss, revenue)

13
Q

How does a linear regression model create predictions?

A

By defining a straight line that attempts to pass as close as possible to each point; we can then substitute a new x value into the line’s equation to obtain the predicted value y’

14
Q

In a linear regression model, epsilon represents…

A

The error term: the random, irreducible noise in the data that the linear relationship between the features and the target cannot explain

15
Q

In a linear regression model, b0 and b1 represent…

A

Parameters representing the y-intercept and slope of the line respectively, estimated from the data (e.g. via least squares)

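A minimal sketch of the prediction step from the last three cards, assuming a single feature x and already-fitted coefficients b0 (intercept) and b1 (slope); the variable names and sample values are illustrative, not from the deck:

```python
import numpy as np

def predict(x, b0, b1):
    """Linear regression prediction: y' = b0 + b1 * x.
    The error term epsilon is not used at prediction time;
    it only describes the noise in the observed data."""
    return b0 + b1 * np.asarray(x)

# Example: intercept 2.0, slope 0.5
print(predict([1.0, 2.0, 3.0], b0=2.0, b1=0.5))  # [2.5 3.  3.5]
```
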
16
Q

What is Mean Squared Error (MSE), and how can we use it for a linear regression model?

A

MSE is the sum of squared errors divided by the number of data points, which we can use to measure the error of our model and adjust coefficients accordingly
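
A short sketch of the MSE calculation described above, assuming y_true and y_pred are sequences of the same length (the names and sample values are illustrative):

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean Squared Error: sum of squared errors divided by the number of points."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return np.mean((y_true - y_pred) ** 2)

print(mse([3.0, 5.0, 7.0], [2.5, 5.0, 8.0]))  # (0.25 + 0 + 1) / 3 ≈ 0.417
```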

17
Q

Error metrics are useful for training a model because…

A

They summarise performance in a single, easy-to-interpret number, so we can focus on that value rather than the raw predictions when generating model/data insights

18
Q

What is a black-box model?

A

A black-box model is a learning model that does not expose or explain its internal decision process, creating a sort of ‘black box’ between input and output

19
Q

The best step-by-step practice for modelling involves…

A

Establishing the ideal cost function, developing multiple models with different hyperparameters, and comparing the results according to our loss function (establishing, developing, comparing)
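
A hedged sketch of that establish/develop/compare loop, using MSE as the loss and NumPy's polyfit with the polynomial order standing in for "different hyperparameters"; the data is made up for illustration:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.0, 2.1, 4.2, 8.9, 16.1, 25.2])

# Establish the cost function (MSE), develop models with different
# hyperparameters (polynomial order), then compare by the loss.
results = {}
for order in (1, 2, 3):
    coeffs = np.polyfit(x, y, deg=order)   # fit an order-n polynomial
    y_pred = np.polyval(coeffs, x)         # predictions (here on the training data)
    results[order] = np.mean((y - y_pred) ** 2)

# In practice, compare on a held-out test set (see the later cards on testing).
print(results, "-> best order:", min(results, key=results.get))
```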

20
Q

We can calculate the parameters of our linear regression model by…

A

Writing each data point’s equation in terms of its error epsilon, summing the squared errors over all points, then taking the derivative with respect to each bn, setting it to zero, and solving (the least-squares approach).
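
For the single-feature case, minimising the sum of squared errors in this way gives the standard closed-form result (written here in LaTeX for reference):

```latex
\hat{b}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
\hat{b}_0 = \bar{y} - \hat{b}_1 \bar{x}
```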

21
Q

The variation between the regression line and the mean of the response is called…

A

Explained variation

22
Q

The variation between the observed data points and the regression line is called…

A

Unexplained variation

23
Q

Total variation is calculated as the sum of…

A

Explained and unexplained variation
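
In symbols, with y_i the observations, ŷ_i the predictions, and ȳ the mean of the response, the decomposition is usually written:

```latex
\underbrace{\sum_i (y_i - \bar{y})^2}_{\text{total variation}}
= \underbrace{\sum_i (\hat{y}_i - \bar{y})^2}_{\text{explained}}
+ \underbrace{\sum_i (y_i - \hat{y}_i)^2}_{\text{unexplained}}
```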

24
Q

How can we add more curvature to our linear regression model?

A

By using a polynomial regression instead
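
A minimal sketch of adding curvature by fitting a higher-order polynomial instead of a straight line, assuming NumPy and illustrative data:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 0.5, 1.8, 4.9, 9.7])

linear_coeffs = np.polyfit(x, y, deg=1)  # straight line: b1*x + b0
quad_coeffs = np.polyfit(x, y, deg=2)    # adds an x^2 term for curvature

print(np.polyval(linear_coeffs, 2.5))    # prediction from the straight line
print(np.polyval(quad_coeffs, 2.5))      # prediction from the curved fit
```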

25

Q

If a first order polynomial regression is a linear regression that generates a straight line, a second order polynomial regression generates…

A

A curve with one local minimum/maximum

26

Q

The order of a polynomial refers to…

A

The largest exponent in any of the terms

27

Q

Are polynomial regression models linear?

A

Yes, the model is still a linear combination of the (polynomial) features, i.e. it remains linear in its parameters

28

Q

What is the bias-variance tradeoff?

A

Model adjustments that decrease bias often increase variance, and vice versa; the tradeoff is therefore analogous to a complexity tradeoff

29

Q

What can we use to measure the complexity of a model?

A

Bayesian Information Criterion (BIC)

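For reference, the usual form of the criterion is

```latex
\mathrm{BIC} = k \ln(n) - 2 \ln(\hat{L})
```

where k is the number of parameters, n the number of observations, and L̂ the maximised likelihood; the k ln(n) term penalises model complexity, and lower BIC is better.
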
30

Q

Why should we always include an unseen testing set?

A

If we train and evaluate only on the training set, our model may fit it too closely (overfit), learning only how to replicate the training data rather than predict new data

31

Q

If knowledge of the test set leaks into the training set, we can encounter an issue called…

A

Data leakage

32

Q

What is cross validation?

A

Instead of using one training and testing set, you split the dataset into several folds (e.g. four), train and test the model on each resulting train/test combination, then compare the error between them

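A hedged sketch of k-fold cross validation done by hand with NumPy (four folds, matching the card), reusing a straight-line fit as the model; the data is synthetic and only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 40)
y = 3.0 * x + 2.0 + rng.normal(scale=1.0, size=x.size)

k = 4
folds = np.array_split(rng.permutation(x.size), k)  # shuffled index folds

fold_errors = []
for i in range(k):
    test_idx = folds[i]
    train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
    b1, b0 = np.polyfit(x[train_idx], y[train_idx], deg=1)  # train on k-1 folds
    y_pred = b0 + b1 * x[test_idx]                          # test on the held-out fold
    fold_errors.append(np.mean((y[test_idx] - y_pred) ** 2))

print(fold_errors, "mean:", np.mean(fold_errors))
```
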
33

Q

What is the problem with randomly sampling the training set?

A

If the data has a large bias (e.g. one class is heavily over-represented), that bias may not be properly represented in the random samples

34

Q

What problem does stratified sampling solve and how?

A

Bias in the training set; it divides the dataset into groups (strata) based on the different classes, then randomly samples from each group in proportion to that group’s share of the overall data
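
A minimal sketch of stratified sampling with NumPy, assuming a label array y with two classes; each class (stratum) is sampled with the same fraction, so class proportions are preserved in the training set (values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
y = np.array([0] * 90 + [1] * 10)       # imbalanced labels: 90% class 0, 10% class 1
train_fraction = 0.8

train_idx = []
for cls in np.unique(y):
    cls_idx = np.flatnonzero(y == cls)  # indices belonging to this stratum
    n_take = int(round(train_fraction * cls_idx.size))
    train_idx.extend(rng.choice(cls_idx, size=n_take, replace=False))

print(np.bincount(y[np.array(train_idx)]))  # [72  8]: proportions preserved
```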