Week 1 Flashcards

(26 cards)

1
Q

Definition of machine learning

A

a program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance on tasks in T, as measured by P, improves with experience E

2
Q

Define classification generically

A

input x provides some feature representation, e.g. colour, shape, etc. Output y takes values in a prescribed set {1, 2, …, k}, where k is the number of classes. A function f: R^d -> {1, …, k} is constructed such that if an object with features x in R^d belongs to class y, then f(x) = y

Alternatively, construct a function that, given the features, returns the probability of each class
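
This can be sketched in Python; the threshold rule and feature meanings below are made up purely for illustration:

```python
# Sketch: a classifier is just a function f: R^d -> {1, ..., k} from a
# feature vector to a class label. The rule here is hypothetical.

def f(x):
    """Classify a 2-feature vector x into class 1 or class 2
    using a made-up threshold rule."""
    return 1 if x[0] + x[1] > 1.0 else 2

# The probabilistic alternative: return a probability for each class.
def f_prob(x):
    p_class1 = min(1.0, max(0.0, (x[0] + x[1]) / 2.0))
    return {1: p_class1, 2: 1.0 - p_class1}

print(f([0.9, 0.8]))        # falls on the class-1 side of the rule
print(f_prob([0.5, 0.5]))   # probabilities sum to 1
```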

3
Q

Difference between unsupervised and supervised learning

A

unsupervised has only input features. supervised has input-output pairs
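
A minimal sketch of the difference, with made-up numbers:

```python
# Supervised learning sees (input, output) pairs; unsupervised learning
# sees the inputs only. The numbers are made up for illustration.

supervised_data = [([1.0, 2.0], 1), ([0.5, 0.1], 2)]   # (features, label) pairs
unsupervised_data = [x for x, _ in supervised_data]     # same features, no labels

print(unsupervised_data)
```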

4
Q

what are two approaches to measuring Performance P

A

-Accuracy: the proportion of correctly classified examples, i.e. the number of correctly classified examples divided by the total number of examples
-Residual: used for regression tasks; the difference between the true output and the predicted output
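
Both measures can be sketched as follows (labels and outputs are made up):

```python
def accuracy(y_true, y_pred):
    # proportion of correctly classified examples
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def residuals(y_true, y_pred):
    # regression: difference between true output and predicted output
    return [t - p for t, p in zip(y_true, y_pred)]

print(accuracy([1, 2, 1, 1], [1, 2, 2, 1]))   # 3 of 4 correct -> 0.75
print(residuals([3.0, 5.0], [2.5, 5.5]))       # [0.5, -0.5]
```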

5
Q

what are the 3 types of learning tasks?

A

supervised, unsupervised, reinforcement

6
Q

describe supervised learning

A

we have a training dataset of input-output pairs; the output is the supervising information. We aim for our model to map inputs to predictions that are close to the true outputs

7
Q

describe unsupervised learning

A

we only have feature information. We aim to find a pattern/relationship among the features, whether by a partition of the data into different groups or by an internal representation of the input (mapping higher-dimensional data to lower-dimensional data), such that the relationships between examples are preserved

8
Q

Reinforcement learning

A

a learning task in which the agent interacts with the environment, aiming to maximise some reward function (equivalently, minimise a loss function)

9
Q

what are two common tasks in supervised learning and define their use cases

A

Classification: used to predict discrete values
Regression: used to predict continuous values

11
Q

what are two common tasks in unsupervised learning and define their use cases?

A

Clustering: the algorithm tries to detect similar groups using some similarity measure to quantify the similarity. we aim to find a partition of the training dataset into different groups. examples within the same group are similar under the similarity measure, examples from other groups are dissimilar

Dimensionality Reduction: simplifies data without losing too much information. an example may have many features only some of which are actually important to learning. Dimensionality reduction aims to transform the original features into some few features whilst keeping as much information as possible
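
Clustering can be sketched with a minimal 1-D k-means (made-up data; a real implementation would handle empty groups and check convergence):

```python
# Minimal 1-D k-means sketch (k = 2): examples in the same group are close
# under the similarity measure (here, absolute distance), examples in
# different groups are far apart.

def kmeans_1d(points, iters=10):
    c1, c2 = min(points), max(points)   # crude initial centroids
    for _ in range(iters):
        # assignment step: each point joins its nearer centroid's group
        g1 = [p for p in points if abs(p - c1) <= abs(p - c2)]
        g2 = [p for p in points if abs(p - c1) > abs(p - c2)]
        # update step: centroids move to their group means
        c1 = sum(g1) / len(g1)
        c2 = sum(g2) / len(g2)
    return sorted(g1), sorted(g2)

data = [0.1, 0.2, 0.15, 5.0, 5.2, 4.9]
print(kmeans_1d(data))   # two obvious groups: points near 0 and points near 5
```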

12
Q

Give a slightly more thorough account of reinforcement learning

A

The agent interacts with the environment.
At each time step:
-the agent receives observations (e.g. (x, y)), which give it information about the state
-the agent chooses an action, which affects the state
The agent periodically receives a reward.
The agent wants to learn a policy, a mapping from observations to actions, which maximises the average reward over time.

13
Q

give a simple account of a reinforcement learning example

A

Consider an agent in a maze:
-the agent observes its location (x, y), which is the state
-the agent can take an action by moving north, south, east or west
-after taking an action, the agent's new state is e.g. (x+1, y)
-the environment gives the agent feedback in terms of reward; here the reward may be -1 per step, since we want to escape the maze as soon as possible
-the agent chooses its action according to a policy, a map from state to action; the agent learns this policy to obtain maximum reward over time
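
The maze example can be sketched as a toy loop (a made-up 1-D corridor, not a real RL algorithm; the agent here follows a fixed policy rather than learning one):

```python
# Toy sketch: state = position in a corridor, reward = -1 per step,
# episode ends at the exit position `goal`. A policy maps state -> action.

def run_episode(policy, start=0, goal=3, max_steps=100):
    state, total_reward = start, 0
    for _ in range(max_steps):
        if state == goal:
            break
        action = policy(state)           # +1 = east (toward exit), -1 = west
        state = max(0, state + action)   # the environment transitions the state
        total_reward += -1               # -1 per step: escape as soon as possible
    return total_reward

def always_east(state):
    return 1   # a good policy for this corridor

print(run_episode(always_east))   # reaches the goal in 3 steps -> reward -3
```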

14
Q

what are the components of an agent's interaction with the environment in reinforcement learning

A

the agent takes an action a_t, which acts on the environment; the environment returns a reward r_t and a state s_t to the agent, and the cycle repeats (agent -> action a_t -> environment -> reward r_t, state s_t -> back to agent)

15
Q

how are the output values y represented in classification and regression

A

classification: y in {-1, +1}
regression: y in the set of real numbers

16
Q

what arguments do loss functions take

A

model prediction value
actual value
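
For example, the squared loss is a function of exactly these two arguments (a sketch):

```python
def squared_loss(prediction, actual):
    # loss takes the model's prediction and the actual value
    return (prediction - actual) ** 2

print(squared_loss(2.5, 3.0))   # (2.5 - 3.0)^2 = 0.25
```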

17
Q

if someone says the model can be deployed because it performs well on training data, why are they a numpty?

A

The model may be overfitting the training data, and good performance on one particular sample of data does not make it an accurate model. The model must first be put through validation and testing before it is deployed

18
Q

What is the formal definition of training error

A

Err_train(f) = (1/n) * Sum_{i=1}^{n} Loss( f(x^(i)), y^(i) )

The loss function could be, for example, the squared loss: Loss( f(x^(i)), y^(i) ) = ( f(x^(i)) - y^(i) )^2
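
A sketch of this formula with squared loss and made-up data:

```python
# Err_train(f) = (1/n) * sum_i Loss(f(x^(i)), y^(i)), using squared loss.

def squared_loss(pred, actual):
    return (pred - actual) ** 2

def training_error(f, xs, ys):
    n = len(xs)
    return sum(squared_loss(f(x), y) for x, y in zip(xs, ys)) / n

def model(x):
    return 2 * x   # a hypothetical model

print(training_error(model, [1.0, 2.0], [2.0, 5.0]))   # ((2-2)^2 + (4-5)^2)/2 = 0.5
```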

19
Q

What is the formal definition of testing error

A

Err_test(f) = Err_train(f) + ( Err_test(f) - Err_train(f) )

The term Err_test(f) - Err_train(f) is called the generalization gap, Gen_gap(f)

20
Q

what is the generalization gap and what value is it typically greater than and why

A

The generalization gap is the difference between the test error and the training error

it is typically larger than 0, as the model is trained on the training dataset and can therefore achieve a small training error there, while the test data is unseen during training

21
Q

how do under-fitted / overfitted models perform on test vs training data

A

-underfitted: poor on training data and poor on test data
-overfitted: good on training data, poor on test data

22
Q

describe underfit, overfit and optimum curve representations of training models

A

-underfit: the fitted curve is too simple and misses the underlying trend in the training data
-optimum: the curve captures the underlying trend without following the noise
-overfit: the curve passes through (nearly) every training point, fitting the noise rather than the trend

23
Q

what does overfitting mean in plain words

A

model performs well on training data but does not generalize well to test data; this happens when the model is too complex for the amount of data

24
Q

describe the relationship between test and training error as a function of model complexity

A

training error typically decreases as model complexity increases; test error first decreases and then increases once the model starts to overfit, giving a U-shaped curve. The optimum complexity is where the test error is smallest

25
Q

What is the standard partition of data (training/testing)

A

80% training, 20% testing
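
A sketch of such a split (real workflows usually shuffle the data first):

```python
# Standard 80/20 partition of a dataset into training and testing sets.

def train_test_split(data, train_fraction=0.8):
    cut = int(len(data) * train_fraction)
    return data[:cut], data[cut:]

train, test = train_test_split(list(range(10)))
print(len(train), len(test))   # 8 2
```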
26
Q

What is the difference between the training set and validation set

A

the training set is used to fit the model, while the validation set is used to choose the best model among the candidate prediction functions