Linear Regression Flashcards

(8 cards)

1
Q

What is supervised learning?

A

A learning paradigm where a model is trained on input-output pairs (xᵢ, yᵢ) to learn a mapping from inputs to outputs.
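
A minimal sketch of the idea, assuming NumPy and a made-up toy dataset:

import numpy as np

# Toy input-output pairs (xᵢ, yᵢ); here the true mapping is y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

# A supervised learner fits f so that f(xᵢ) ≈ yᵢ; np.polyfit with
# degree 1 returns the slope and intercept of the best-fit line.
slope, intercept = np.polyfit(x, y, deg=1)
print(slope, intercept)  # ≈ 2.0, 1.0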

2
Q

How is the linear regression model expressed using basis functions?

A

f(x) = ∑ₖ βₖ φₖ(x), where φₖ are basis functions (e.g., φ₀(x) = 1, φ₁(x) = x).
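
A short sketch of the expansion, assuming NumPy and basis functions picked for illustration:

import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])

# Each basis function φₖ becomes one column of the design matrix.
basis = [
    lambda x: np.ones_like(x),  # φ₀(x) = 1 (intercept)
    lambda x: x,                # φ₁(x) = x
]
Phi = np.column_stack([phi(x) for phi in basis])  # shape (n, 2)

# The model f(x) = ∑ₖ βₖ φₖ(x) is then a matrix-vector product.
beta = np.array([1.0, 2.0])
f_x = Phi @ beta  # equals 1 + 2x at each sample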

3
Q

What loss function does linear regression use and why?

A

Mean Squared Error: L = (1/n) ∑ᵢ (yᵢ − f(xᵢ))²; it penalizes larger errors quadratically and is differentiable, which enables both the closed-form solution and gradient-based optimization.
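
A one-liner makes the definition concrete (NumPy assumed; the numbers are made up):

import numpy as np

y_true = np.array([1.0, 3.0, 5.0])
y_pred = np.array([1.5, 2.5, 5.0])

# Squaring the residuals weights a large miss far more than a small one.
mse = np.mean((y_true - y_pred) ** 2)
print(mse)  # (0.25 + 0.25 + 0.0) / 3 ≈ 0.167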

4
Q

What is the normal equation for the closed-form solution?

A

β = (Φᵀ Φ)⁻¹ Φᵀ y, where Φ is the design matrix of basis functions.
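
A sketch of the solution in code, assuming NumPy and synthetic data; np.linalg.solve is used instead of an explicit inverse, which is the numerically safer route:

import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=50)
y = 2.0 * x + 1.0 + rng.normal(0, 0.5, size=50)  # noisy line

Phi = np.column_stack((np.ones_like(x), x))  # n × 2 design matrix

# Solve (Φᵀ Φ) β = Φᵀ y rather than forming the inverse explicitly.
beta = np.linalg.solve(Phi.T @ Phi, Phi.T @ y)
print(beta)  # ≈ [1.0, 2.0] (intercept, slope)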

5
Q

How does gradient descent optimize parameters?

A

β_new = β_old − rate × ∂L/∂β; each step moves the parameters in the negative-gradient direction, reducing the loss.
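
A minimal sketch of the update on a toy one-parameter loss, with the gradient written by hand:

# Minimize L(β) = (β − 3)²; its gradient is ∂L/∂β = 2(β − 3).
rate = 0.1
beta = 0.0
for _ in range(100):
    grad = 2 * (beta - 3)
    beta = beta - rate * grad  # step against the gradient
print(beta)  # ≈ 3.0, the minimizer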

6
Q

What is the gradient descent update rule in linear regression?

A

β_new = β_old − rate × (2/n) × (Φᵀ Φ β_old − Φᵀ y), a step in the negative-gradient direction of the MSE loss; the 2/n factor comes from differentiating the averaged squared error.
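
A runnable sketch of the full loop, assuming NumPy and synthetic data generated for the example:

import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, size=100)
y = 2.0 * x + 1.0 + rng.normal(0, 0.1, size=100)

Phi = np.column_stack((np.ones_like(x), x))
n = len(y)

beta = np.zeros(2)
rate = 0.1
for _ in range(5000):
    grad = (2 / n) * (Phi.T @ Phi @ beta - Phi.T @ y)  # ∂MSE/∂β
    beta = beta - rate * grad
print(beta)  # ≈ [1.0, 2.0] (intercept, slope)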

7
Q

How does learning rate affect gradient descent convergence?

A

A small learning rate yields slow convergence; a large learning rate can cause oscillation or divergence.
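
Both regimes show up on a toy quadratic (a sketch; the rates are picked only to illustrate the contrast):

# Minimize L(β) = β² (gradient 2β) with a small and a too-large rate.
for rate in (0.01, 1.1):
    beta = 1.0
    for _ in range(50):
        beta = beta - rate * 2 * beta  # β is scaled by (1 − 2·rate)
    print(rate, beta)
# rate 0.01: β shrinks slowly toward 0 (factor 0.98 per step)
# rate 1.1:  |β| grows every step (factor −1.2), i.e., divergence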

8
Q

How is the design matrix Φ constructed?

A

Φ = np.vstack((np.ones(n), x)).T, stacking a row of ones (for the intercept) on top of the feature row and transposing, so that each row of Φ is one sample and each column one basis function (shape n × 2, as the normal equation expects).
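
A quick shape check, assuming NumPy and a tiny made-up feature vector:

import numpy as np

x = np.array([2.0, 4.0, 6.0])
n = len(x)

Phi = np.vstack((np.ones(n), x)).T
print(Phi.shape)  # (3, 2): one row per sample, one column per basis function
print(Phi)
# [[1. 2.]
#  [1. 4.]
#  [1. 6.]]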
