Intro to Deep Learning Flashcards
(27 cards)
What is the goal of supervised learning?
To learn a function that maps inputs to outputs using labeled data.
What does the model output in supervised learning?
A prediction ŷ for a given input x.
What are θ in supervised learning models?
The parameters (weights, biases) the model learns during training.
What is inference in supervised learning?
Making a prediction using the learned function on new input data.
What does the loss function measure?
How far the predicted value ŷ is from the actual output y.
What is the total loss over a dataset?
The sum of individual losses across all training examples.
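The "sum of individual losses" idea can be sketched in a few lines. This is an illustrative example, not from the cards; the squared-error per-example loss used here is one common choice (it reappears in the linear regression cards below):

```python
# Total loss over a dataset = sum of per-example losses.
# example_loss and the sample values are illustrative choices.
def example_loss(y_pred, y_true):
    # Squared error for a single training example.
    return (y_true - y_pred) ** 2

def total_loss(predictions, targets):
    # Sum the individual losses across all training examples.
    return sum(example_loss(p, t) for p, t in zip(predictions, targets))

print(total_loss([1.0, 2.0], [1.5, 2.0]))  # 0.25
```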
How is the model trained?
By adjusting parameters to minimize the total loss using optimization.
What does gradient descent do in training?
It updates model parameters in the direction that reduces the loss.
What does η (eta) represent in gradient descent?
The learning rate, which controls the step size of updates.
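The gradient descent update and the role of η can be shown on a toy one-parameter loss. A minimal sketch, assuming the simple loss L(θ) = (θ − 3)², whose gradient we can write by hand; the function names are illustrative:

```python
# One-parameter gradient descent on L(theta) = (theta - 3)**2,
# which is minimized at theta = 3. eta is the learning rate.
def gradient(theta):
    return 2 * (theta - 3)  # dL/dtheta

def gd_step(theta, eta):
    # Move parameters in the direction that reduces the loss:
    # against the gradient, with step size controlled by eta.
    return theta - eta * gradient(theta)

theta = 0.0
for _ in range(100):
    theta = gd_step(theta, eta=0.1)
print(round(theta, 4))  # approaches 3.0, the minimizer
```

Setting η too large makes the updates overshoot and diverge; too small and training crawls, which is why the step size matters.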
What does overfitting mean in supervised learning?
The model fits training data well but performs poorly on new data.
What is underfitting?
When the model is too simple to capture patterns in the training data.
What is generalization?
The model’s ability to perform well on unseen (test) data.
What is the purpose of a test set?
To evaluate how well the model generalizes beyond training data.
Why can’t we just brute-force all parameters in training?
The parameter space is too large in realistic models.
Why don’t we solve models analytically?
Because complex models (e.g., deep nets) lack closed-form solutions.
What is the model in 1D linear regression?
ŷ = θ₀ + θ₁x
What loss function is used in linear regression?
Squared error: (y - ŷ)²
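The two cards above (model ŷ = θ₀ + θ₁x, squared-error loss) combine into a complete training loop via gradient descent. A minimal sketch; the learning rate, step count, and sample data are illustrative assumptions:

```python
# Fit y_hat = t0 + t1 * x by gradient descent on squared error.
# t0 is the intercept (theta_0), t1 the slope (theta_1).
def fit_linear(xs, ys, eta=0.01, steps=5000):
    t0, t1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = (t0 + t1 * x) - y   # y_hat - y
            g0 += 2 * err             # d/d(t0) of (y_hat - y)**2
            g1 += 2 * err * x         # d/d(t1) of (y_hat - y)**2
        t0 -= eta * g0 / n            # gradient descent updates
        t1 -= eta * g1 / n
    return t0, t1

# Toy data generated exactly by y = 1 + 2x (an assumption for the demo).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
t0, t1 = fit_linear(xs, ys)
print(round(t0, 3), round(t1, 3))  # close to 1.0 and 2.0
```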
What does each data point in the dataset provide?
A training pair (xᵢ, yᵢ) for learning the function.
What does the term ‘model family’ mean?
A set of functions defined by the form of f(x; θ)
What do we do in the training phase?
Fit the model to data by minimizing loss using labeled examples.
What is the role of θ₀ and θ₁ in a linear model?
θ₀ is the intercept, and θ₁ is the slope of the line.
What is the optimization objective in supervised learning?
Minimize the loss function over all training samples.
What do we mean by testing a model?
Evaluating its performance on data it hasn’t seen before.
What happens if we don’t include a loss function?
The model has no criterion for learning from mistakes.