Lecture 15 - Neural Networks Part 3 Flashcards

1
Q

What is the generalised delta rule?

Show the formula.

A

A rule that can be used to modify the weights to minimize |z − y| regardless of the form of the output function, provided that function is differentiable (z is the target output, y the actual output)

Δwj = a (z − y) (dy/dwj)

where a is the learning rate. For the logistic output function, dy/dwj = xj · y(1 − y)
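A minimal sketch of one update step under this rule, assuming a single logistic output unit (the function names, the sample inputs, and a = 0.1 are illustrative, not from the lecture):

```python
import math

def logistic(net):
    """Logistic output function: y = 1 / (1 + e^(-net))."""
    return 1.0 / (1.0 + math.exp(-net))

def delta_rule_update(w, x, z, a=0.1):
    """One generalised-delta-rule step for a single logistic unit.

    dy/dwj expands to xj * y * (1 - y) for the logistic, so each
    weight moves by a * xj * (z - y) * y * (1 - y).
    """
    y = logistic(sum(wj * xj for wj, xj in zip(w, x)))
    return [wj + a * (z - y) * y * (1 - y) * xj
            for wj, xj in zip(w, x)]

print(delta_rule_update(w=[0.5, -0.3], x=[1.0, 2.0], z=1.0))
```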

2
Q

What is the formula of the logistic function?

What are its properties?

A

y = 1 / (1 + e^(−w̄·x̄))

Curve has a sigmoid shape

Differentiable

Monotonically increasing (i.e. a higher input always gives a higher output)

Tends to 1 as input tends to +inf

Tends to 0 as input tends to -inf

Equal to 0.5 when the input is 0
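A quick numeric spot-check of these properties (a sketch; here the net input w̄·x̄ is passed in directly as a single number):

```python
import math

def logistic(net):
    return 1.0 / (1.0 + math.exp(-net))

# Spot-check the listed properties at a few sample inputs
print(logistic(0.0))                    # 0.5 when the input is 0
print(logistic(10.0), logistic(-10.0))  # ~1 and ~0 at the extremes
print(logistic(1.0) > logistic(0.5))    # True: monotonically increasing
```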

3
Q

What is the derivative of the logistic function?

A

y(1 − y), where y is the logistic output (the derivative is taken with respect to the net input w̄·x̄)
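A finite-difference sanity check of this identity (a sketch; the sample point 0.7 is arbitrary):

```python
import math

def logistic(net):
    return 1.0 / (1.0 + math.exp(-net))

# Compare y(1 - y) with a central finite difference at an arbitrary point
net, h = 0.7, 1e-6
y = logistic(net)
print(y * (1 - y))                                      # analytic derivative
print((logistic(net + h) - logistic(net - h)) / (2*h))  # numeric estimate
```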

4
Q

Why is training hidden units difficult?

A

There is no direct way of knowing what their expected outputs should be, since the training data only gives targets for the output layer

5
Q

What is the solution to the difficulty of training hidden units?

A

Devise a way of making plausible guesses at what the outputs should be

6
Q

To use the generalised delta rule, the output function must be ____________

A

Differentiable

7
Q

How are hidden units trained?

A

Feeding back the error from the output layer

-> the estimated error of a hidden unit is the weighted sum of the errors of the output units
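A minimal sketch of that weighted sum (the names and numbers are illustrative; a full implementation would also multiply by the hidden unit's own y(1 − y) term before updating its weights):

```python
# Estimated error of a hidden unit: each output unit's error, weighted
# by the connection from this hidden unit to that output unit
def hidden_error(output_errors, weights_to_outputs):
    return sum(e * w for e, w in zip(output_errors, weights_to_outputs))

output_errors = [0.2, -0.1, 0.05]      # illustrative (z - y) values
weights_to_outputs = [0.4, 0.7, -0.2]  # illustrative connection weights
print(hidden_error(output_errors, weights_to_outputs))
```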

8
Q

Passing back a weighted error to train a hidden unit is known as ?

A

Back propagation

9
Q

What is the issue with large values of the learning rate a in back propagation?

A

Gradient descent may repeatedly overshoot, making it impossible to find a minimum error value

10
Q

What is the issue with a small value of the learning rate a in back propagation?

A

It prolongs the gradient descent and increases the chance of settling in a local (rather than the global) minimum
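Both learning-rate issues can be seen on a toy error surface E(w) = w², whose only minimum is at w = 0 (a sketch, not from the lecture; a convex surface cannot show the local-minimum trap, but it does show the overshoot and the slow crawl):

```python
# Gradient descent on E(w) = w**2, whose gradient is 2w
def descend(a, w=1.0, steps=20):
    for _ in range(steps):
        w -= a * 2 * w
    return w

print(descend(a=1.1))   # large a: every step overshoots, |w| grows, no minimum found
print(descend(a=0.01))  # small a: moves toward 0, but only very slowly
```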

11
Q

What are two ways of determining when to stop training using back propagation?

A

Keep going until the error falls below a given threshold

Keep going until the average change in error is small
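Both stopping tests, sketched over a simulated error sequence (the 0.99 decay factor, the threshold, and the window size are all illustrative):

```python
threshold, min_change, window = 0.01, 1e-4, 10

errors, error = [], 1.0
for epoch in range(10_000):
    error *= 0.99                      # stand-in for one epoch of weight updates
    errors.append(error)
    if error < threshold:              # 1) error below a given threshold
        print(f"epoch {epoch}: error below threshold")
        break
    if len(errors) >= window:
        avg_change = (errors[-window] - errors[-1]) / window
        if avg_change < min_change:    # 2) average error change is small
            print(f"epoch {epoch}: average change is small")
            break
```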

12
Q

How can we reduce overfitting in back propagation networks?

A

Decrease the number of hidden units

Add weight decay: on each iteration, all weights decrease slightly (sketched below)

Use a large number of high-quality training samples
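The weight-decay idea in particular is a one-liner (a sketch; the decay factor 0.001 is illustrative):

```python
# Weight decay: after each training iteration, shrink every weight slightly
def apply_weight_decay(weights, decay=0.001):
    return [w * (1 - decay) for w in weights]

weights = [0.8, -1.5, 0.3]
print(apply_weight_decay(weights))  # every weight pulled a little toward zero
```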

13
Q

What is the problem with decreasing the number of hidden units and adding weight decay to combat overfitting in back propagation networks?

A

Both restrict the complexity of the function the network can compute

14
Q

What is cross validation, as it relates to overfitting?

A

Split the training data into two sets: one used to modify the weights and one used to test them

Evaluate on the test set after each weight change

Use the weights that gave the best test result
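The bookkeeping pattern, sketched with stand-ins for the network itself (only the keep-the-best logic is the point here; the update rule and held-out error are placeholders):

```python
# Keep whichever weights scored best on the held-out test set
best_error, best_weights = float("inf"), None
weights = [0.9, -0.4, 0.2]

for step in range(100):
    weights = [w * 0.97 for w in weights]          # stand-in for a weight change
    held_out_error = sum(abs(w) for w in weights)  # stand-in for test-set error
    if held_out_error < best_error:
        best_error, best_weights = held_out_error, list(weights)

print(best_error, best_weights)  # the weights that gave the best result
```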

15
Q

For what purpose are back propagation networks particularly useful?

A

Approximation, regression, prediction, classification
