Linear regression Flashcards

1
Q

Data-set

A

Supervised Learning problem:
{(x1,y1),…,(xN,yN)}
xi ∈ R^d
yi ∈ R

2
Q

Hypothesis

A

Assuming a linear relation between x and y
h(x) = sum(i=0,d) wi*xi = w’ x

where
w = [w0, w1, …, wd]’
x = [x0, x1, …, xd]’ with x0 = 1 (constant coordinate, so w0 acts as the bias term)
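A minimal sketch of this hypothesis in Python with NumPy, assuming the inputs arrive as an N x d array without the constant coordinate. The helper names `add_bias` and `h`, and the example weights, are hypothetical and chosen only for illustration.

```python
import numpy as np

def add_bias(X):
    """Prepend the constant coordinate x0 = 1 to every row of X (shape N x d)."""
    X = np.asarray(X, dtype=float)
    return np.hstack([np.ones((X.shape[0], 1)), X])

def h(w, X):
    """Evaluate h(x) = w' x for each row x of X; w has length d + 1."""
    return add_bias(X) @ np.asarray(w, dtype=float)

# Hypothetical example with d = 2 and w = [w0, w1, w2]
w = [1.0, 2.0, -1.0]
X = [[3.0, 4.0], [0.0, 0.0]]
preds = h(w, X)  # first row: 1 + 2*3 - 1*4 = 3; second row: only the bias, 1
```

Folding the bias into w0 through x0 = 1 is what lets the whole hypothesis be written as a single inner product.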

3
Q

Learning algorithm

A

Minimize, with respect to h (equivalently, the weight vector w), the sum of squared differences between the predictions h(xn) and the targets yn.

In theory, we would minimize the out-of-sample error:
Eout(h) = E[ (h(x) - f(x))^2 ]

where f is the unknown target function and the expectation is over x. Since f and the distribution of x are unknown, in practice we minimize the in-sample error:
Ein(h) = 1/N * sum(n=1,N) (h(xn) - yn)^2
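The in-sample error can be computed directly from its definition. The sketch below assumes the input matrix already contains the constant column x0 = 1; the function name `in_sample_error` and the toy data are illustrative, not part of the original cards.

```python
import numpy as np

def in_sample_error(w, X, y):
    """Ein = (1/N) * sum_n (w' x_n - y_n)^2, with X already containing x0 = 1."""
    residuals = X @ w - y
    return float(np.mean(residuals ** 2))

# Toy data lying exactly on the line y = 1 + 2x
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])

ein_exact = in_sample_error(np.array([1.0, 2.0]), X, y)  # perfect fit
ein_off = in_sample_error(np.array([0.0, 2.0]), X, y)    # every prediction is off by 1
```

With the correct weights the error is zero; dropping the bias shifts every prediction by 1, so the mean squared error is 1.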

4
Q

Analytical formula that solves the problem

A

Least squares formula:
w^ = (X’ X)^-1 X’ Y

where X is the N x (d+1) input matrix (row n is xn’, with first entry x0 = 1)
and Y ∈ R^N is the output vector. The formula requires X’ X to be invertible.
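As a hedged illustration, the least squares formula can be evaluated by solving the normal equations (X’ X) w = X’ Y rather than forming the inverse explicitly. The function name `fit_least_squares`, the random data, and the weights `w_true` are assumptions made for this sketch.

```python
import numpy as np

def fit_least_squares(X, y):
    """Solve the normal equations (X' X) w = X' y; assumes X' X is invertible."""
    return np.linalg.solve(X.T @ X, X.T @ y)

rng = np.random.default_rng(0)
N, d = 50, 3
# Input matrix of shape N x (d+1): a column of ones followed by d random features
X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, d))])
w_true = np.array([0.5, -1.0, 2.0, 3.0])
y = X @ w_true  # noiseless targets, so the fit should recover w_true

w_hat = fit_least_squares(X, y)
# np.linalg.lstsq computes the same minimizer and also copes with a singular X' X
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
```

In practice `np.linalg.lstsq` is usually the safer choice, since it does not rely on X’ X being well conditioned.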

5
Q

Generalization

A

Theorem:

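Eout(h) = E[ (h(x) - f(x))^2 ] = Ein(h) + O(d/N)

i.e. the gap between out-of-sample and in-sample error grows with the dimension d and shrinks as the number of examples N increases.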
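A small simulation can make the d/N dependence tangible. This is only a sketch under stated assumptions: Gaussian inputs, a linear target with additive Gaussian noise, and both errors measured against noisy targets (so the out-of-sample estimate includes the noise variance, unlike the definition against f above). The function `average_gap` and all of its parameters are hypothetical.

```python
import numpy as np

def average_gap(N, d=5, sigma=1.0, trials=200, n_test=2000, seed=0):
    """Average (estimated Eout - Ein) of least squares fits over many data sets."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=d + 1)

    def sample(n):
        # Inputs with x0 = 1 and targets from a noisy linear model
        X = np.hstack([np.ones((n, 1)), rng.normal(size=(n, d))])
        return X, X @ w_true + sigma * rng.normal(size=n)

    gaps = []
    for _ in range(trials):
        X, y = sample(N)
        w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
        X_test, y_test = sample(n_test)  # fresh data approximates Eout
        e_in = np.mean((X @ w_hat - y) ** 2)
        e_out = np.mean((X_test @ w_hat - y_test) ** 2)
        gaps.append(e_out - e_in)
    return float(np.mean(gaps))

gap_small_N = average_gap(N=20)
gap_large_N = average_gap(N=200)
```

With d fixed, the average gap for N = 20 should come out clearly larger than for N = 200, in line with the O(d/N) term.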
