3) The supervised learning problem Flashcards
(49 cards)
In supervised learning, what are the input space, label space, and concept function
What is a loss function in supervised learning, and what is its purpose
What is the hypothesis space in supervised learning
The hypothesis space H is a set of functions h:X→Y that serve as potential candidates for the concept function c
How is the generalisation error of a hypothesis h defined, and what does it represent
What is the goal of the supervised learning task, and how is it formulated as an optimisation problem
What is training data
Training data refers to pairs (x_1, y_1), …, (x_N, y_N) ∈ X×Y, which are used to learn and approximate the concept function c
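A minimal sketch of what such training pairs can look like in the concept setting, where each label is y_i = c(x_i). The concept function `c`, the helper `make_training_data`, the input range [-1, 1] and the seed are all assumptions made up for this illustration; they are not taken from the cards.

```python
import random
from typing import List, Tuple


def c(x: float) -> float:
    """Hypothetical concept function, chosen only for illustration."""
    return 3.0 * x + 1.0


def make_training_data(n: int, seed: int = 0) -> List[Tuple[float, float]]:
    """Draw n inputs from X = [-1, 1] and label each one with c."""
    rng = random.Random(seed)  # local generator so the sketch is reproducible
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, c(x)) for x in xs]


data = make_training_data(5)
for x_i, y_i in data:
    print(f"x = {x_i:+.3f}, y = {y_i:+.3f}")
```

In practice c is unknown and only the pairs are observed; the function is written out here just to generate labelled examples.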
How is the empirical error of a hypothesis h defined using training data
What is the empirical learning problem, and what is the goal of training in supervised learning
What is the data-generating distribution setting in supervised learning, and how does it generalise the concept setting
What is the difference between regression and classification problems in supervised learning
Describe examples of regression and classification problems
Regression -
* House price prediction
* Temperature prediction
* Stock price forecasting
Classification -
* Spam email detection
* Disease diagnosis (disease present or not)
* Image recognition (e.g. digits)
What is a parametric model in the context of hypothesis space
- A parametric model is one where the model is fully described using a fixed number of parameters.
- The hypothesis space is the set of all functions the model can represent, and each function is determined by a specific choice of parameters
- Think of it like: Hypothesis space = all possible models we can get by plugging different values into our parameterized formula
Formally - H = {h(·, w) : w ∈ W}, where W is the set of admissible parameter values and each h(·, w) is a function X→Y
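The idea that each choice of parameters picks out one function from the hypothesis space can be sketched as follows. The linear family used here (a weighted sum of the inputs) and the name `make_hypothesis` are assumptions chosen for illustration, not something the cards specify.

```python
from typing import Callable, Sequence


def make_hypothesis(w: Sequence[float]) -> Callable[[Sequence[float]], float]:
    """Return the function x -> sum_i w_i * x_i for a fixed parameter vector w."""
    def h(x: Sequence[float]) -> float:
        if len(x) != len(w):
            raise ValueError("x and w must have the same length")
        return sum(w_i * x_i for w_i, x_i in zip(w, x))
    return h


# Two different parameter choices give two different members of the same family.
h_a = make_hypothesis([1.0, 0.0])
h_b = make_hypothesis([0.5, -2.0])

x = [2.0, 1.0]
print(h_a(x))  # 1.0*2.0 + 0.0*1.0 = 2.0
print(h_b(x))  # 0.5*2.0 - 2.0*1.0 = -1.0
```

Here the parameter set W is all of R^2, so the hypothesis space is every function obtainable by calling `make_hypothesis` with some two-component w.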
How is the zero-one loss (0-1 loss) defined
Describe an example of a linear classifier defined in a binary classification problem with Y={−1,1} and X=R^M
How does a linear classifier use a hyperplane to classify points in a binary classification problem
What is quadratic loss
What is the space of square-integrable functions
In regression with quadratic loss, how can we minimise the generalisation error
Describe the proof that the conditional expectation is the best predictor under squared loss
How is linear regression formulated using quadratic loss and dictionary functions
What are some common choices for dictionary functions in linear regression
How are the generalisation error and empirical error expressed in linear regression, and what type of optimisation problem does this lead to
How is the logistic function defined in a probabilistic classification setting
What is logistic regression, and how does it model classification problems