Lecture 15 - Neural Networks Part 3 Flashcards

1
Q

What is the generalised delta rule?

Show the formula.

A

A rule that can be used to modify the weights to minimize |z − y| regardless of the form of the output function, provided that function is differentiable (z is the target output, y the actual output)

Δwj = a (z − y) (dy/dwj)

where a is the learning rate. For the logistic output function, dy/dwj = xj · y(1 − y)
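A minimal sketch of one update step under this rule, assuming a single logistic output unit (the function names, the sample inputs, and a = 0.1 are illustrative, not from the lecture):

```python
import math

def logistic(net):
    """Logistic output function: y = 1 / (1 + e^(-net))."""
    return 1.0 / (1.0 + math.exp(-net))

def delta_rule_update(w, x, z, a=0.1):
    """One generalised-delta-rule step for a single logistic unit.

    dy/dwj expands to xj * y * (1 - y) for the logistic, so each
    weight moves by a * xj * (z - y) * y * (1 - y).
    """
    y = logistic(sum(wj * xj for wj, xj in zip(w, x)))
    return [wj + a * (z - y) * y * (1 - y) * xj
            for wj, xj in zip(w, x)]

print(delta_rule_update(w=[0.5, -0.3], x=[1.0, 2.0], z=1.0))
```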

2
Q

What is the formula of the logistic function?

What are its properties?

A

y = 1 / (1 + e^(−w̄·x̄))

Curve has a sigmoid shape

Differentiable

Monotonically increasing (i.e. a higher input always gives a higher output)

Tends to 1 as input tends to +inf

Tends to 0 as input tends to -inf

Equal to 0.5 when the input is 0
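A quick numeric spot-check of these properties (a sketch; here the net input w̄·x̄ is passed in directly as a single number):

```python
import math

def logistic(net):
    return 1.0 / (1.0 + math.exp(-net))

# Spot-check the listed properties at a few sample inputs
print(logistic(0.0))                    # 0.5 when the input is 0
print(logistic(10.0), logistic(-10.0))  # ~1 and ~0 at the extremes
print(logistic(1.0) > logistic(0.5))    # True: monotonically increasing
```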

3
Q

What is the derivative of the logistic function?

A

y(1 − y), where y is the logistic output (the derivative is taken with respect to the net input w̄·x̄)
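A finite-difference sanity check of this identity (a sketch; the sample point 0.7 is arbitrary):

```python
import math

def logistic(net):
    return 1.0 / (1.0 + math.exp(-net))

# Compare y(1 - y) with a central finite difference at an arbitrary point
net, h = 0.7, 1e-6
y = logistic(net)
print(y * (1 - y))                                      # analytic derivative
print((logistic(net + h) - logistic(net - h)) / (2*h))  # numeric estimate
```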

4
Q

Why is training hidden units difficult?

A

There is no direct way of knowing what their expected outputs should be, since the training data only gives targets for the output layer

5
Q

What is the solution to the difficulty of training hidden units?

A

Devise a way of making plausible guesses at what the outputs should be

6
Q

To use the generalised delta rule, the output function must be ____________

A

Differentiable

7
Q

How are hidden units trained?

A

Feeding back the error from the output layer

-> the estimated error of a hidden unit is the weighted sum of the errors of the output units
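A minimal sketch of that weighted sum (the names and numbers are illustrative; a full implementation would also multiply by the hidden unit's own y(1 − y) term before updating its weights):

```python
# Estimated error of a hidden unit: each output unit's error, weighted
# by the connection from this hidden unit to that output unit
def hidden_error(output_errors, weights_to_outputs):
    return sum(e * w for e, w in zip(output_errors, weights_to_outputs))

output_errors = [0.2, -0.1, 0.05]      # illustrative (z - y) values
weights_to_outputs = [0.4, 0.7, -0.2]  # illustrative connection weights
print(hidden_error(output_errors, weights_to_outputs))
```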

8
Q

Passing back a weighted error to train a hidden unit is known as ?

A

Back propagation

9
Q

What is the issue with large values of the learning rate a in back propagation?

A

Gradient descent may repeatedly overshoot, making it impossible to find a minimum error value

10
Q

What is the issue with a small value of the learning rate a in back propagation?

A

It prolongs the gradient descent and increases the chance of settling in a local (rather than the global) minimum
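Both learning-rate issues can be seen on a toy error surface E(w) = w², whose only minimum is at w = 0 (a sketch, not from the lecture; a convex surface cannot show the local-minimum trap, but it does show the overshoot and the slow crawl):

```python
# Gradient descent on E(w) = w**2, whose gradient is 2w
def descend(a, w=1.0, steps=20):
    for _ in range(steps):
        w -= a * 2 * w
    return w

print(descend(a=1.1))   # large a: every step overshoots, |w| grows, no minimum found
print(descend(a=0.01))  # small a: moves toward 0, but only very slowly
```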

11
Q

What are two ways of determining when to stop training using back propagation?

A

Keep going until the error falls below a given threshold

Keep going until the average change in error is small
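Both stopping tests, sketched over a simulated error sequence (the 0.99 decay factor, the threshold, and the window size are all illustrative):

```python
threshold, min_change, window = 0.01, 1e-4, 10

errors, error = [], 1.0
for epoch in range(10_000):
    error *= 0.99                      # stand-in for one epoch of weight updates
    errors.append(error)
    if error < threshold:              # 1) error below a given threshold
        print(f"epoch {epoch}: error below threshold")
        break
    if len(errors) >= window:
        avg_change = (errors[-window] - errors[-1]) / window
        if avg_change < min_change:    # 2) average error change is small
            print(f"epoch {epoch}: average change is small")
            break
```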

12
Q

How can we reduce overfitting in back propagation networks?

A

Decrease the number of hidden units

Add weight decay: on each iteration, all weights decrease slightly (sketched below)

Use a large number of high-quality training samples
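The weight-decay idea in particular is a one-liner (a sketch; the decay factor 0.001 is illustrative):

```python
# Weight decay: after each training iteration, shrink every weight slightly
def apply_weight_decay(weights, decay=0.001):
    return [w * (1 - decay) for w in weights]

weights = [0.8, -1.5, 0.3]
print(apply_weight_decay(weights))  # every weight pulled a little toward zero
```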

13
Q

What is the problem with decreasing the number of hidden units and adding weight decay to combat overfitting in back propagation networks?

A

Both restrict the complexity of the function the network can compute

14
Q

What is cross validation, as it relates to overfitting?

A

Split the training data into two sets: one used to modify the weights and one used to test them

Evaluate on the test set after each weight change

Use the weights that gave the best test result
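The bookkeeping pattern, sketched with stand-ins for the network itself (only the keep-the-best logic is the point here; the update rule and held-out error are placeholders):

```python
# Keep whichever weights scored best on the held-out test set
best_error, best_weights = float("inf"), None
weights = [0.9, -0.4, 0.2]

for step in range(100):
    weights = [w * 0.97 for w in weights]          # stand-in for a weight change
    held_out_error = sum(abs(w) for w in weights)  # stand-in for test-set error
    if held_out_error < best_error:
        best_error, best_weights = held_out_error, list(weights)

print(best_error, best_weights)  # the weights that gave the best result
```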

15
Q

For what purpose are back propagation networks particularly useful?

A

Approximation, regression, prediction, classification
