Principles of deep learning in artificial networks Flashcards
(28 cards)
Deep learning approach (1)
Learn from experience (machine learning):
- No formal rules of transformations
- No ‘knowledge base’
- No logical inference
Deep learning approach (2)
Process inputs through a hierarchy of concepts:
- Each concept defined by its relationship to simpler concepts
- So, build complicated concepts out of simpler concepts
Course goal (1)
Explore the relationship between cognitive science and AI
Course goal (2)
Focus on deep learning in artificial machine learning networks and comparison to biological systems
- Which biological processes do deep networks imitate?
- What is missing in artificial networks?
- What might make AI/machine learning more like biological intelligence/learning?
Course goal (3)
Become familiar with the use of AI in cognitive science research
Course goal (4)
Build some deep learning networks to do human-like tasks
Why deep learning?
AI has made great advances in tasks that are:
- Described by formal mathematical rules
- Relatively simple for computers
- Difficult for humans
AI has been less effective in tasks that are:
- Hard/impossible to describe using formal mathematical rules
- BUT easy for humans to perform (intuitive or automatic)
Simulation of neural computation
Representation & features
Machine learning performance depends on the ….
representation of the case to be classified
what information the computer is given about the situation
Representation & features
Each piece of input information is known as a …
feature
(The same feature can be represented in different formats, and it is often easy to convert between them. The chosen format strongly affects the difficulty of the task.)
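A minimal Python sketch of how the chosen format affects task difficulty (the disc-versus-ring data below is a hypothetical example, not from the course): the two classes are hard to separate in Cartesian coordinates but trivially separable once converted to a polar (radius) representation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: an inner disc (class 0) and an outer ring (class 1).
radius = np.concatenate([rng.uniform(0.0, 1.0, 200),   # inner disc
                         rng.uniform(2.0, 3.0, 200)])  # outer ring
theta = rng.uniform(0.0, 2 * np.pi, 400)
labels = np.concatenate([np.zeros(200), np.ones(200)])

# Cartesian format: neither x nor y alone separates the classes.
x, y = radius * np.cos(theta), radius * np.sin(theta)

# Polar format: the radius feature alone separates the classes perfectly.
print("separable by a single threshold on radius:",
      np.all((radius > 1.5) == labels.astype(bool)))   # True
```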
Representation in deep networks
- Useful features may need to be transformed or extracted first.
- So deep networks have multiple representations -> each is built from an earlier representation
- This can: Transform features to a different format before learning their links to the output AND extract complex features from simpler features
- Essentially multiple steps in a program
- Each layer can be seen as the computer’s memory state after executing a set of instructions.
- Deeper networks execute more instructions in sequence. Just like in a computer program, the individual steps are generally very simple.
- Complex outcomes emerge from interactions between many simple steps
What is a deep network?
A learning network that transforms or extracts features using:
- Multiple nonlinear processing units
- Arranged in multiple layers with:
- Hierarchical organisation
- Different levels of representation and abstraction
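A minimal numpy sketch of this definition (the layer sizes and random weights are placeholders, not a trained model): several layers of nonlinear units, each computing its level of representation from the previous one.

```python
import numpy as np

rng = np.random.default_rng(1)

def relu(x):
    # Nonlinear processing unit: negative activations are set to zero.
    return np.maximum(x, 0.0)

def layer(h, w, b):
    # One level of representation, built from the previous level.
    return relu(h @ w + b)

x = rng.normal(size=(1, 8))                        # input features
w1, b1 = rng.normal(size=(8, 16)), np.zeros(16)    # lower layer: simple features
w2, b2 = rng.normal(size=(16, 16)), np.zeros(16)   # middle layer: combinations of simple features
w3, b3 = rng.normal(size=(16, 4)), np.zeros(4)     # top layer: more abstract features

h1 = layer(x, w1, b1)    # first representation
h2 = layer(h1, w2, b2)   # built from h1
out = layer(h2, w3, b3)  # built from h2
print(out.shape)         # (1, 4)
```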
20th century view of object recognition
- Builds a representation of local image features
- Builds a representation of larger-scale shapes and surfaces
- Matches shapes and surfaces with stored object representations -> recognition
Why nonlinear functions?
Any operation that can be done with only linear functions of the input can be straightforwardly described by formal mathematical rules, so it is not a good use for deep networks.
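A quick numpy check of this point: stacking linear layers without any nonlinearity collapses into a single linear map, so the extra depth adds nothing (random matrices used purely for illustration).

```python
import numpy as np

rng = np.random.default_rng(2)
W1 = rng.normal(size=(5, 7))
W2 = rng.normal(size=(7, 3))
x = rng.normal(size=(1, 5))

# Two linear "layers" in sequence...
deep_linear = (x @ W1) @ W2
# ...are exactly one linear layer with the combined weight matrix.
single_linear = x @ (W1 @ W2)
print(np.allclose(deep_linear, single_linear))  # True
```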
Name the complex nonlinear function with four operations or processing steps
Filter, threshold, pool and normalize
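A rough numpy sketch of the four steps applied to a single feature map (the 8x8 input, 3x3 filter, and 2x2 pooling window are arbitrary assumptions, not a particular network):

```python
import numpy as np

rng = np.random.default_rng(3)
image = rng.normal(size=(8, 8))    # one-channel input "image"
kernel = rng.normal(size=(3, 3))   # one filter with arbitrary weights

# 1. Filter/convolve: each unit's response depends on a 3x3 neighbourhood of the input.
filtered = np.array([[np.sum(image[i:i + 3, j:j + 3] * kernel)
                      for j in range(6)] for i in range(6)])

# 2. Threshold/rectify (ReLU): negative activations are set to zero.
rectified = np.maximum(filtered, 0.0)

# 3. Pool: downsample by taking the maximum over 2x2 blocks.
pooled = rectified.reshape(3, 2, 3, 2).max(axis=(1, 3))

# 4. Normalise: rescale to zero mean and unit standard deviation.
normalised = (pooled - pooled.mean()) / pooled.std()

print(filtered.shape, pooled.shape)  # (6, 6) (3, 3)
```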
Name 1 issue which arises with ReLU
It has no maximum output, while a biological neuron does have a maximum firing rate.
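A small sketch of this difference; the clipping value 1.0 below is an arbitrary stand-in for a neuron's maximum firing rate.

```python
import numpy as np

def relu(x):
    # Standard ReLU: no upper bound on the output.
    return np.maximum(x, 0.0)

def clipped_relu(x, max_rate=1.0):
    # Rectify but also impose a maximum, loosely analogous to a biological
    # neuron's maximum firing rate (max_rate is an arbitrary choice here).
    return np.clip(x, 0.0, max_rate)

x = np.array([-2.0, 0.5, 3.0])
print(relu(x))          # [0.  0.5 3. ]
print(clipped_relu(x))  # [0.  0.5 1. ]
```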
What does the filter operation do especially?
The response of each unit depends on several neighbouring inputs, so the units after filtering respond to a certain area of the input image, and the activations of neighbouring units will often be similar. After several filter steps, each integrating inputs over an area, each unit responds to an extensive area of the input, so neighbouring units represent very similar information.
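A rough numpy illustration of this point (a simple 3x3 averaging filter stands in for a learned filter): after each filter step every unit integrates a larger area of the original input, and neighbouring units become more strongly correlated.

```python
import numpy as np

rng = np.random.default_rng(4)
signal = rng.normal(size=(32, 32))  # input "image" of independent values

def filter3x3(x):
    # Each output unit averages a 3x3 neighbourhood of its input
    # (a stand-in for a learned filter).
    n = x.shape[0] - 2
    return np.array([[x[i:i + 3, j:j + 3].mean() for j in range(n)]
                     for i in range(n)])

def neighbour_corr(x):
    # Correlation between horizontally adjacent units.
    return np.corrcoef(x[:, :-1].ravel(), x[:, 1:].ravel())[0, 1]

layer1 = filter3x3(signal)
layer2 = filter3x3(layer1)
print(round(neighbour_corr(signal), 2))  # near 0: input values are independent
print(round(neighbour_corr(layer1), 2))  # clearly positive after one filter step
print(round(neighbour_corr(layer2), 2))  # higher still after a second filter step
```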
What does the pooling operation do?
Downsamples the units to improve computational efficiency. Discards some data in favour of computational efficiency.
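A minimal sketch of 2x2 max pooling (the window size is a common but arbitrary choice): only the largest value in each block is kept, so three out of every four activations are discarded.

```python
import numpy as np

activations = np.array([[1.0, 3.0, 0.0, 2.0],
                        [4.0, 2.0, 1.0, 1.0],
                        [0.0, 1.0, 5.0, 0.0],
                        [2.0, 0.0, 1.0, 3.0]])

# 2x2 max pooling: keep only the largest value in each block.
pooled = activations.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)
# [[4. 2.]
#  [2. 5.]]
```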
The threshold and pool operations use …
max functions. That is why, by the pool stage, the mean activation is above zero and the range is arbitrary.
What does the normalisation operation do?
It linearly rescales each feature map's responses to all images to have a mean activation of zero (and a standard deviation of one).
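A small sketch of per-feature-map normalisation (the random activations stand in for one layer's responses to a batch of images):

```python
import numpy as np

rng = np.random.default_rng(5)
# Hypothetical responses of 3 feature maps to 100 images.
responses = rng.normal(loc=2.0, scale=5.0, size=(100, 3))

# Normalise each feature map across all images: zero mean, unit standard deviation.
normalised = (responses - responses.mean(axis=0)) / responses.std(axis=0)

print(normalised.mean(axis=0).round(3))  # ~[0. 0. 0.]
print(normalised.std(axis=0).round(3))   # [1. 1. 1.]
```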
Why is normalisation important? Name 4 reasons.
- Machine learning generally assumes that data reflects measurements of independent and identically-distributed (IID) variables. Normalisation forces identical distributions.
- If the activation function depends on whether the unit's response is above or below zero, then with zero-mean inputs and zero-mean filters about half of the units will be active and half inactive (see the sketch after this list). This even split of activation is a very efficient way to store information in a network of limited size.
- Having the same range for all feature maps and layers means the same maximum threshold in the activation function can be used throughout the network.
- As a result of these and other technical considerations, both training and final classification accuracy are far better after normalisation.
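A quick numpy check of the second point in the list above (random zero-mean inputs and filters, purely illustrative): after rectification roughly half of the units are active.

```python
import numpy as np

rng = np.random.default_rng(6)
inputs = rng.normal(size=(1000, 50))  # zero-mean inputs
filters = rng.normal(size=(50, 20))   # zero-mean filter weights

responses = inputs @ filters             # pre-activations, symmetric around zero
active = np.maximum(responses, 0.0) > 0  # units still active after rectification

print(round(active.mean(), 2))           # ~0.5: about half of the units are active
```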
Filter/convolve:
determine how well each group of nearby pixels matches each of a group of filters
Threshold/rectify:
introduces a nonlinearity by setting negative activations of units to zero (and possibly also setting a maximum activation)
Pool:
Downsample the units to improve computational efficiency
Normalise:
Rescale responses of each feature map to have mean zero and standard deviation one, so each feature map contributes similarly to classification