Old Exam Questions (GAMLE SPM) Flashcards
What are artificial neural networks?
Parallel distributed computing systems
In the delta rule, how are the weight changes computed?
By the difference between target and output, multiplied by the input (and scaled by the learning rate): Δw = η (target − output) · input
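The rule above can be sketched in NumPy; the learning rate, input, weights, and target below are made-up illustrative values:

```python
import numpy as np

# Delta rule: dw = eta * (target - output) * input
eta = 0.1                        # learning rate (illustrative)
x = np.array([1.0, 0.5, -1.0])   # input pattern (illustrative)
w = np.array([0.2, -0.4, 0.1])   # current weights (illustrative)
target = 1.0

output = w @ x                   # linear unit output
delta_w = eta * (target - output) * x
w = w + delta_w                  # one update step; the error shrinks
```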
In simulated annealing, what happens at high temperatures?
At high temperatures the network randomly explores the state space
What is the difference between batch and online (or mini-batch) learning?
Batch learning: gradient descent on the global error function, i.e. the gradient is first computed across all patterns and then the weights are changed.
Online learning: stochastic gradient descent on a partial error function, i.e. the weights are changed according to the gradient computed for a single example (or for a small number of examples in a mini-batch).
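A minimal sketch of both regimes on a toy linear-regression problem (the data, learning rate, and epoch counts are arbitrary assumptions):

```python
import numpy as np

# Toy data: 20 patterns, 3 inputs, noise-free linear targets
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w

eta = 0.05

# Batch: one weight update per epoch, from the gradient over ALL patterns
w_batch = np.zeros(3)
for _ in range(1000):
    grad = -(X.T @ (y - X @ w_batch)) / len(X)
    w_batch -= eta * grad

# Online (stochastic): one weight update per pattern
w_online = np.zeros(3)
for _ in range(300):
    for xi, yi in zip(X, y):
        grad = -(yi - xi @ w_online) * xi
        w_online -= eta * grad
```

Both schedules recover the true weights here; they differ in when the gradient is computed and how often the weights change.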
What does AUC mean, what is its range, and what does an AUC of 1 mean?
Area under the (ROC) curve. It ranges between 0 and 1 and represents how good the classifier is; an AUC of 1 means the predictions are 100 pct. correct (no false positives, 100 pct. true positives).
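AUC can equivalently be read as the probability that a randomly chosen positive example is scored above a randomly chosen negative one; a small sketch with made-up scores:

```python
# AUC via pairwise ranking: fraction of (positive, negative) pairs
# where the positive gets the higher score (ties count half).
def auc(labels, scores):
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    pairs = [(p, n) for p in pos for n in neg]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in pairs)
    return wins / len(pairs)

labels = [0, 0, 1, 1]                          # made-up example
perfect = auc(labels, [0.1, 0.2, 0.8, 0.9])    # every positive outranks every negative
chance = auc(labels, [0.8, 0.2, 0.1, 0.9])     # half the pairs are ranked correctly
```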
What are variational autoencoders?
Variational Autoencoders (VAEs) are a type of generative model. They belong to the broader family of autoencoders, but they learn a probabilistic latent representation, which lets them generate new data samples by sampling from the latent space.
What is associative learning (Hopfield networks)?
Associative learning, in the context of Hopfield Networks, refers to learning and recalling associations between patterns. Hopfield Networks are recurrent neural networks that can store specific patterns and later retrieve them from partial or noisy cues.
What is a discount factor / gamma?
The discount factor, often denoted by the symbol “γ” (gamma), is a parameter used in reinforcement learning algorithms. It determines the importance of future rewards in the decision-making process of an intelligent agent. The discount factor is a value between 0 and 1.
When an agent makes decisions in a sequential environment, it often receives rewards at each time step. The discount factor allows the agent to weigh immediate rewards against future rewards. A discount factor of 1 means that the agent considers future rewards with equal importance to immediate rewards, while a discount factor less than 1 gives less weight to future rewards.
How can a gamma / discount factor of less than 1 be useful?
(Reinforcement learning)
In practical terms, a discount factor less than 1 encourages the agent to prioritize short-term rewards, making its planning horizon finite. This is often useful in scenarios where the long-term consequences of actions are uncertain or less relevant.
Handling Infinite Horizons:
When dealing with infinite horizons, a discount factor less than 1 ensures that the sum of future rewards converges to a finite value. This is particularly important in mathematical formulations of reinforcement learning algorithms.
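The effect of γ can be seen by computing the discounted return G = Σ_t γ^t r_t for a short, made-up reward stream:

```python
# Discounted return, accumulated backwards: G_t = r_t + gamma * G_{t+1}
def discounted_return(rewards, gamma):
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

rewards = [1.0, 1.0, 1.0, 1.0]               # illustrative reward stream
g_full = discounted_return(rewards, 1.0)     # gamma = 1: all rewards count equally
g_myopic = discounted_return(rewards, 0.5)   # gamma < 1: future rewards shrink
```

With γ < 1 an infinite stream of bounded rewards sums to a finite value (here it would approach 1 / (1 − 0.5) = 2), which is exactly the convergence property mentioned above.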
Question 1: What does it mean when a problem is not linearly separable, and which models are used to address this issue?
a) The problem is easy to solve with linear models.
b) The problem is complex and requires non-linear models.
c) The problem has no solution; no models can address it.
d) The problem is linear, and any model can be applied.
Answer 1: The correct answer is b) The problem is complex and requires non-linear models.
Question 2: What is the purpose of the Hebb rule?
a) To suppress neural activation
b) To reinforce connections between correlated neurons
c) To randomize weight changes
d) To activate hidden neurons randomly
Answer 2: The correct answer is b) To reinforce connections between correlated neurons.
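A minimal Hebbian update sketch, where Δw_ij = η · y_i · x_j strengthens the connection between co-active neurons (the learning rate and activities are illustrative values):

```python
import numpy as np

# Hebb rule: weight change proportional to the product of pre- and
# postsynaptic activity; correlated pairs get reinforced.
eta = 0.1                         # learning rate (illustrative)
x = np.array([1.0, -1.0, 1.0])    # presynaptic activities (illustrative)
y = np.array([1.0, 1.0])          # postsynaptic activities (illustrative)

delta_W = eta * np.outer(y, x)    # one weight per (post, pre) pair
```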
Question 3: In the delta rule, what does the weight change depend on?
a) The learning rate
b) The input
c) The difference between target and output multiplied by the input
d) The activation function
Answer 3: The correct answer is c) The difference between target and output multiplied by the input.
Question 4: Which of the following methods is effective for improving generalization?
a) Backpropagation
b) Pruning
c) Simulated annealing
d) Reinforcement learning
Answer 4: The correct answer is b) Pruning.
How do Boltzmann Machines utilize hidden units to learn higher-order correlations in the data?
Boltzmann Machines use hidden units to learn higher-order correlations in the data by employing stochastic activations, Gibbs sampling, and simulated annealing during the iterative update process.
What is the positive phase in the training of Restricted Boltzmann Machines using contrastive divergence?
The positive phase involves presenting the pattern to the network, clamping it to the visible neurons, and computing the correlations between all visible and hidden neurons.
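A sketch of this positive phase for a tiny RBM; the layer sizes, weights, and training pattern below are made-up assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Tiny RBM: 6 visible units, 3 hidden units (sizes are illustrative)
rng = np.random.default_rng(1)
n_visible, n_hidden = 6, 3
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))
b_hid = np.zeros(n_hidden)

# Positive phase: clamp a training pattern to the visible units...
v = np.array([1, 0, 1, 1, 0, 1], dtype=float)

# ...compute hidden activation probabilities given the clamped visibles...
p_hid = sigmoid(v @ W + b_hid)        # P(h_j = 1 | v)

# ...and record the visible-hidden correlations, <v_i h_j>_data,
# which enter the contrastive-divergence weight update with a plus sign.
positive_corr = np.outer(v, p_hid)
```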
Explain the challenges associated with training Recurrent Neural Networks (RNNs).
Training RNNs is challenging due to the issue of vanishing or exploding gradients over many time steps, making it difficult to effectively backpropagate and update weights.
How does backpropagation work, and what is its role in training multilayer neural networks?
Backpropagation is a method to train multilayer neural networks by computing gradients and backpropagating them through all layers, adjusting weights to minimize error. It is crucial for finding optimal weights and minimizing the error in the network.
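A minimal backpropagation sketch for a two-layer network with a tanh hidden layer; the sizes, data, and learning rate are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=4)                 # input (illustrative)
t = np.array([1.0])                    # target (illustrative)

W1 = rng.normal(scale=0.5, size=(3, 4))
W2 = rng.normal(scale=0.5, size=(1, 3))

# Forward pass
h = np.tanh(W1 @ x)                    # hidden activations
y = W2 @ h                             # linear output
err = y - t

# Backward pass: gradients via the chain rule, layer by layer
grad_W2 = np.outer(err, h)
delta_h = (W2.T @ err) * (1 - h**2)    # tanh'(a) = 1 - tanh(a)^2
grad_W1 = np.outer(delta_h, x)

# One small gradient step reduces the output error
eta = 0.05
W1 -= eta * grad_W1
W2 -= eta * grad_W2
new_err = W2 @ np.tanh(W1 @ x) - t
```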
Describe the structure and purpose of Convolutional Neural Networks (CNNs).
CNNs are designed with convolutional and pooling layers to process data in the form of multiple arrays. Convolutional layers detect local conjunctions of features, while pooling layers merge semantically similar features. Stages of convolutional and pooling layers are stacked, followed by more convolutional and fully connected layers.
Explain the storage and retrieval process in Hopfield Networks.
Hopfield Networks are fully recurrent neural networks used to store and retrieve data patterns. Storage is performed by gradually changing the connection weights with a Hebbian-like learning rule, so that the stored patterns become stable states of the network. Retrieval involves iteratively updating the state of the neurons until such a stable state is reached.
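A small sketch of both steps, storing two patterns with the Hebbian outer-product rule and retrieving one from a corrupted cue (the patterns and network size are made up):

```python
import numpy as np

# Two binary (+/-1) patterns to store (illustrative)
patterns = np.array([
    [1, -1, 1, -1, 1, -1],
    [1, 1, -1, -1, 1, 1],
])

# Storage: Hebbian outer-product rule, no self-connections
W = sum(np.outer(p, p) for p in patterns).astype(float)
np.fill_diagonal(W, 0)

# Retrieval: start from a corrupted version of pattern 0 and
# update the states until the network settles into a stable state
state = patterns[0].copy()
state[0] *= -1                              # flip one bit
for _ in range(10):                         # synchronous updates for simplicity
    state = np.where(W @ state >= 0, 1, -1)
```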
What is the purpose of Simulated Annealing, and how does it work?
Simulated Annealing is a method used to minimize the energy of a system by first heating it up and then gradually cooling it down. It is employed to escape poor local minima and find better (ideally global) energy minima, and is particularly useful in optimization problems.
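A sketch on a toy one-dimensional energy landscape; the energy function, temperature schedule, and proposal step are illustrative assumptions:

```python
import math
import random

# Bumpy toy landscape: quadratic bowl plus oscillations, minimum near x = 2
def energy(x):
    return (x - 2.0) ** 2 + math.sin(5 * x)

random.seed(0)
x = -4.0          # start far from the minimum
T = 5.0           # initial (high) temperature
while T > 1e-3:
    candidate = x + random.uniform(-0.5, 0.5)
    dE = energy(candidate) - energy(x)
    # Downhill moves are always accepted; uphill moves with
    # probability exp(-dE / T), so at high T the walk is nearly random
    # and at low T it becomes greedy descent.
    if dE < 0 or random.random() < math.exp(-dE / T):
        x = candidate
    T *= 0.99     # gradual cooling
```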
Describe the concept of Competitive Learning and how it is implemented.
Competitive Learning involves projecting input patterns to a pool of neurons with lateral inhibitory connections. The neuron with the highest activation gradually dominates others through competitive dynamics. The implementation follows a winner-takes-all approach, where the neuron with the highest activation becomes the winner.
What is the significance of Self-Organizing Maps (SOMs) in competitive layer structures?
In competitive layer structures, Self-Organizing Maps impose a topological structure where each neuron forms coalitions with its neighbors. This allows for a more accurate mapping of the input space into a lower-dimensional manifold, enabling neurons to compete with distant neurons but cooperate with close neighbors to represent input patterns effectively.
How does the learning process of Self-Organizing Maps (SOMs) work?
In the learning process of SOMs, the input vector is compared to the weight vector of each hidden neuron. The neuron with the closest weights is declared the winner. The weights of the winner neuron and its neighbors are adapted, with nearby neurons receiving similar updates. This process helps respond to nearby input patterns.
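One learning step can be sketched as follows; the grid size, learning rate, and neighbourhood width are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
grid = 5                                    # 1-D map of 5 neurons (illustrative)
W = rng.uniform(size=(grid, 2))             # one weight vector per neuron
x = np.array([0.9, 0.1])                    # input pattern (illustrative)

eta, sigma = 0.5, 1.0                       # learning rate, neighbourhood width

# Winner: the neuron whose weight vector is closest to the input
bmu = np.argmin(np.linalg.norm(W - x, axis=1))

# Adapt the winner and its neighbours; nearby neurons get similar updates
dist_before = np.linalg.norm(W[bmu] - x)
for j in range(grid):
    h = np.exp(-((j - bmu) ** 2) / (2 * sigma**2))  # neighbourhood function
    W[j] += eta * h * (x - W[j])
dist_after = np.linalg.norm(W[bmu] - x)
```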
What is the objective of Principal Component Analysis (PCA) in the context of neural networks?
PCA, or Principal Component Analysis, is a statistical technique that aims to find directions of maximum variability in a given dataset. In the context of neural networks, PCA helps discover a set of linearly uncorrelated variables that explain as much variance as possible, achieved through a rotation of the data to maximize variance in new axes.
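A sketch of PCA via eigendecomposition of the covariance matrix, on made-up correlated 2-D data:

```python
import numpy as np

# Correlated 2-D data: almost all variance lies along one direction
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([t, 0.5 * t + 0.1 * rng.normal(size=200)])

Xc = X - X.mean(axis=0)                    # centre the data
cov = Xc.T @ Xc / (len(Xc) - 1)            # sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order

# First principal component: eigenvector with the largest eigenvalue
pc1 = eigvecs[:, -1]
explained = eigvals[-1] / eigvals.sum()    # fraction of variance explained
```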