Principles of deep learning in artificial networks Flashcards
(28 cards)
Deep learning approach (1)
Learn from experience (machine learning):
- No formal rules of transformations
- No ‘knowledge base’
- No logical inference
Deep learning approach (2)
Process inputs through a hierarchy of concepts:
- Each concept defined by its relationship to simpler concepts
- So, build complicated concepts out of simpler concepts
Course goal (1)
Explore the relationship between cognitive science and AI
Course goal (2)
Focus on deep learning in artificial machine learning networks and comparison to biological systems
- Which biological processes do deep networks imitate?
- What is missing in artificial networks?
- What might make AI/machine learning more like biological intelligence/learning?
Course goal (3)
Become familiar with the use of AI in cognitive science research
Course goal (4)
Build some deep learning networks to do human-like tasks
Why deep learning?
AI has made great advances in tasks that are:
- Described by formal mathematical rules
- Relatively simple for computers
- Difficult for humans
AI has been less effective in tasks that are:
- Hard/impossible to describe using formal mathematical rules
- BUT easy for humans to perform (intuitive or automatic)
Simulation of neural computation
Representation & features
Machine learning performance depends on the ….
representation of the case to be classified
what information the computer is given about the situation
Representation & features
Each piece of input information is known as a …
feature
(The same feature can be represented in different formats, and it is often easy to convert between them. The chosen format strongly affects the difficulty of the task.)
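A minimal Python sketch of how the chosen format affects task difficulty (the disc-versus-ring data below is a hypothetical example, not from the course): the two classes are hard to separate in Cartesian coordinates but trivially separable once converted to a polar (radius) representation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: an inner disc (class 0) and an outer ring (class 1).
radius = np.concatenate([rng.uniform(0.0, 1.0, 200),   # inner disc
                         rng.uniform(2.0, 3.0, 200)])  # outer ring
theta = rng.uniform(0.0, 2 * np.pi, 400)
labels = np.concatenate([np.zeros(200), np.ones(200)])

# Cartesian format: neither x nor y alone separates the classes.
x, y = radius * np.cos(theta), radius * np.sin(theta)

# Polar format: the radius feature alone separates the classes perfectly.
print("separable by a single threshold on radius:",
      np.all((radius > 1.5) == labels.astype(bool)))   # True
```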
Representation in deep networks
- Useful features may need to be transformed or extracted first.
- So deep networks have multiple representations -> each is built from an earlier representation
- This can: Transform features to a different format before learning their links to the output AND extract complex features from simpler features
- Essentially multiple steps in a program
- Each layer can be seen as the computer’s memory state after executing a set of instructions.
- Deeper networks execute more instructions in sequence. Just like in a computer program, the individual steps are generally very simple.
- Complex outcomes emerge from interactions between many simple steps
What is a deep network?
A learning network that transforms or extracts features using:
- Multiple nonlinear processing units
- Arranged in multiple layers with:
- Hierarchical organisation
- Different levels of representation and abstraction
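A minimal numpy sketch of this definition (the layer sizes and random weights are placeholders, not a trained model): several layers of nonlinear units, each computing its level of representation from the previous one.

```python
import numpy as np

rng = np.random.default_rng(1)

def relu(x):
    # Nonlinear processing unit: negative activations are set to zero.
    return np.maximum(x, 0.0)

def layer(h, w, b):
    # One level of representation, built from the previous level.
    return relu(h @ w + b)

x = rng.normal(size=(1, 8))                        # input features
w1, b1 = rng.normal(size=(8, 16)), np.zeros(16)    # lower layer: simple features
w2, b2 = rng.normal(size=(16, 16)), np.zeros(16)   # middle layer: combinations of simple features
w3, b3 = rng.normal(size=(16, 4)), np.zeros(4)     # top layer: more abstract features

h1 = layer(x, w1, b1)    # first representation
h2 = layer(h1, w2, b2)   # built from h1
out = layer(h2, w3, b3)  # built from h2
print(out.shape)         # (1, 4)
```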
20th century view of object recognition
- Builds a representation of local image features
- Builds a representation of larger-scale shapes and surfaces
- Matches shapes and surfaces with stored object representations -> recognition
Why nonlinear functions?
Any operation that can be done with only linear functions of the input can be straightforwardly described by formal mathematical rules, so it is not a good use for deep networks.
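A quick numpy check of this point: stacking linear layers without any nonlinearity collapses into a single linear map, so the extra depth adds nothing (random matrices used purely for illustration).

```python
import numpy as np

rng = np.random.default_rng(2)
W1 = rng.normal(size=(5, 7))
W2 = rng.normal(size=(7, 3))
x = rng.normal(size=(1, 5))

# Two linear "layers" in sequence...
deep_linear = (x @ W1) @ W2
# ...are exactly one linear layer with the combined weight matrix.
single_linear = x @ (W1 @ W2)
print(np.allclose(deep_linear, single_linear))  # True
```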
Name the complex nonlinear function with four operations or processing steps
Filter, threshold, pool and normalize
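A rough numpy sketch of the four steps applied to a single feature map (the 8x8 input, 3x3 filter, and 2x2 pooling window are arbitrary assumptions, not a particular network):

```python
import numpy as np

rng = np.random.default_rng(3)
image = rng.normal(size=(8, 8))    # one-channel input "image"
kernel = rng.normal(size=(3, 3))   # one filter with arbitrary weights

# 1. Filter/convolve: each unit's response depends on a 3x3 neighbourhood of the input.
filtered = np.array([[np.sum(image[i:i + 3, j:j + 3] * kernel)
                      for j in range(6)] for i in range(6)])

# 2. Threshold/rectify (ReLU): negative activations are set to zero.
rectified = np.maximum(filtered, 0.0)

# 3. Pool: downsample by taking the maximum over 2x2 blocks.
pooled = rectified.reshape(3, 2, 3, 2).max(axis=(1, 3))

# 4. Normalise: rescale to zero mean and unit standard deviation.
normalised = (pooled - pooled.mean()) / pooled.std()

print(filtered.shape, pooled.shape)  # (6, 6) (3, 3)
```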
Name 1 issue which arises with ReLU
It has no maximum output, while a biological neuron does have a maximum firing rate.
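A small sketch of this difference; the clipping value 1.0 below is an arbitrary stand-in for a neuron's maximum firing rate.

```python
import numpy as np

def relu(x):
    # Standard ReLU: no upper bound on the output.
    return np.maximum(x, 0.0)

def clipped_relu(x, max_rate=1.0):
    # Rectify but also impose a maximum, loosely analogous to a biological
    # neuron's maximum firing rate (max_rate is an arbitrary choice here).
    return np.clip(x, 0.0, max_rate)

x = np.array([-2.0, 0.5, 3.0])
print(relu(x))          # [0.  0.5 3. ]
print(clipped_relu(x))  # [0.  0.5 1. ]
```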
What does the filter operation do especially?
The response of each unit depends on several neighbouring inputs, so the units after filtering respond to a certain area of the input image, and the activations of neighbouring units will often be similar. After several filter steps, each integrating inputs over an area, each unit responds to an extensive area of the input, so neighbouring units represent very similar information.
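A rough numpy illustration of this point (a simple 3x3 averaging filter stands in for a learned filter): after each filter step every unit integrates a larger area of the original input, and neighbouring units become more strongly correlated.

```python
import numpy as np

rng = np.random.default_rng(4)
signal = rng.normal(size=(32, 32))  # input "image" of independent values

def filter3x3(x):
    # Each output unit averages a 3x3 neighbourhood of its input
    # (a stand-in for a learned filter).
    n = x.shape[0] - 2
    return np.array([[x[i:i + 3, j:j + 3].mean() for j in range(n)]
                     for i in range(n)])

def neighbour_corr(x):
    # Correlation between horizontally adjacent units.
    return np.corrcoef(x[:, :-1].ravel(), x[:, 1:].ravel())[0, 1]

layer1 = filter3x3(signal)
layer2 = filter3x3(layer1)
print(round(neighbour_corr(signal), 2))  # near 0: input values are independent
print(round(neighbour_corr(layer1), 2))  # clearly positive after one filter step
print(round(neighbour_corr(layer2), 2))  # higher still after a second filter step
```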
What does the pooling operation do?
Downsamples the units to improve computational efficiency. Discards some data in favour of computational efficiency.
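A minimal sketch of 2x2 max pooling (the window size is a common but arbitrary choice): only the largest value in each block is kept, so three out of every four activations are discarded.

```python
import numpy as np

activations = np.array([[1.0, 3.0, 0.0, 2.0],
                        [4.0, 2.0, 1.0, 1.0],
                        [0.0, 1.0, 5.0, 0.0],
                        [2.0, 0.0, 1.0, 3.0]])

# 2x2 max pooling: keep only the largest value in each block.
pooled = activations.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)
# [[4. 2.]
#  [2. 5.]]
```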
The threshold and pool operations use …
max functions. That is why, by the pool stage, the mean activation is above zero and the range is arbitrary.
What does the normalisation operation do?
It linearly rescales each feature map's responses to all images to have a mean activation of zero (and a standard deviation of one).
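A small sketch of per-feature-map normalisation (the random activations stand in for one layer's responses to a batch of images):

```python
import numpy as np

rng = np.random.default_rng(5)
# Hypothetical responses of 3 feature maps to 100 images.
responses = rng.normal(loc=2.0, scale=5.0, size=(100, 3))

# Normalise each feature map across all images: zero mean, unit standard deviation.
normalised = (responses - responses.mean(axis=0)) / responses.std(axis=0)

print(normalised.mean(axis=0).round(3))  # ~[0. 0. 0.]
print(normalised.std(axis=0).round(3))   # [1. 1. 1.]
```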
Why is normalisation important? Name 4 reasons.
- Machine learning generally assumes that data reflects measurements of independent and identically-distributed (IID) variables. Normalisation forces identical distributions.
- If the activation function depends on whether the unit's response is above or below zero, then with zero-mean inputs and zero-mean filters about half of the units will be active and half inactive (see the sketch after this list). This even split of activation is a very efficient way to store information in a network of limited size.
- Having the same range for all feature maps and layers means the same maximum threshold in the activation function can be used throughout the network.
- As a result of these and other technical considerations, both training and final classification accuracy are far better after normalisation.
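A quick numpy check of the second point in the list above (random zero-mean inputs and filters, purely illustrative): after rectification roughly half of the units are active.

```python
import numpy as np

rng = np.random.default_rng(6)
inputs = rng.normal(size=(1000, 50))  # zero-mean inputs
filters = rng.normal(size=(50, 20))   # zero-mean filter weights

responses = inputs @ filters             # pre-activations, symmetric around zero
active = np.maximum(responses, 0.0) > 0  # units still active after rectification

print(round(active.mean(), 2))           # ~0.5: about half of the units are active
```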
Filter/convolve:
determine how well each group of nearby pixels matches each of a group of filters
Threshold/rectify:
introduces a nonlinearity by setting negative activations of units to zero (and possibly also setting a maximum activation)
Pool:
Downsample the units to improve computational efficiency
Normalise:
Rescale responses of each feature map to have mean zero and standard deviation one, so each feature map contributes similarly to classification