Neural Networks and Deep Learning Foundations Flashcards

1
Q

What is the main difference between Traditional Machine Learning and Deep Learning?

A

Traditional Machine Learning involves manually selecting features, while Deep Learning learns features automatically from raw data.

2
Q

What is the purpose of Gradient Descent in machine learning?

A

To iteratively adjust the model's weights in the direction that reduces the loss, improving prediction accuracy.

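A minimal gradient-descent sketch in Python, minimizing a toy squared-error loss (w - 3)^2 (purely illustrative, not from the deck):

  w = 0.0    # initial weight
  lr = 0.1   # learning rate
  for step in range(50):
      grad = 2 * (w - 3)   # derivative of the loss (w - 3)**2
      w -= lr * grad       # step against the gradient, downhill
  print(w)   # approaches 3.0, where the loss is smallest
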
3
Q

What are the steps involved in the Backpropagation process?

A
  • Forward Pass (compute predictions)
  • Compute Loss (compare predictions to targets)
  • Backward Pass (propagate the error backward to compute gradients)
  • Gradient Descent (update the weights)
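
A hand-rolled Python sketch of the four steps for a single weight, assuming one made-up training pair and a squared-error loss:

  x, y_true = 2.0, 10.0   # one training example (made up)
  w, lr = 1.0, 0.01
  for epoch in range(100):
      y_pred = w * x                     # 1. forward pass
      loss = (y_pred - y_true) ** 2      # 2. compute loss
      grad = 2 * (y_pred - y_true) * x   # 3. backpropagation (chain rule)
      w -= lr * grad                     # 4. gradient descent update
  print(w)   # converges toward 5.0, since 5.0 * 2.0 = 10.0
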
4
Q

True or False: Neural Networks are inspired by the structure of the human brain.

A

True

5
Q

What are the three types of layers in a Neural Network?

A
  • Input Layer
  • Hidden Layers
  • Output Layer
6
Q

Fill in the blank: The equation for a prediction in a neural network is Prediction = Input × Weight → _______.

A

[Activation Function]

7
Q

What are the two types of Loss Functions mentioned?

A
  • Binary Cross-Entropy
  • Categorical Cross-Entropy
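
In Keras, the loss function is chosen when the model is compiled; a sketch assuming a model object named model already exists:

  # Binary classification (single sigmoid output):
  model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

  # Multi-class classification (softmax output with one-hot labels):
  model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
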
8
Q

What is the role of an Optimizer in deep learning?

A

To make training faster and more stable by controlling how the weights are updated, for example by adapting the learning rate.

9
Q

What dataset is used in the example of Handwritten Digit Recognition?

A

MNIST Dataset

10
Q

What is the first step in building a simple neural network using Keras?

A

Import Libraries

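One plausible set of imports for the MNIST example (the deck doesn't list the exact libraries, so treat these as an assumption):

  from tensorflow.keras.datasets import mnist
  from tensorflow.keras.models import Sequential
  from tensorflow.keras.layers import Dense, Flatten
  from tensorflow.keras.utils import to_categorical
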
11
Q

What is the purpose of normalizing pixel values in the MNIST dataset?

A

To improve training efficiency by scaling values from 0-255 to 0-1.

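A sketch of the scaling step, assuming the data is loaded through Keras:

  from tensorflow.keras.datasets import mnist

  (x_train, y_train), (x_test, y_test) = mnist.load_data()
  x_train = x_train / 255.0   # pixel values 0-255 scaled to 0-1
  x_test = x_test / 255.0
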
12
Q

What is One-Hot Encoding used for in the context of neural networks?

A

To convert integer class labels into vectors the network's output can be compared against (e.g., 3 becomes [0,0,0,1,0,0,0,0,0,0]).

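A sketch using Keras's to_categorical, assuming the integer labels (0-9) from the MNIST data:

  from tensorflow.keras.utils import to_categorical

  y_train = to_categorical(y_train, 10)   # e.g., label 3 -> [0,0,0,1,0,0,0,0,0,0]
  y_test = to_categorical(y_test, 10)
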
13
Q

What activation function is used in the first layer of the example neural network?

A

ReLU (Rectified Linear Unit)

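One plausible model matching this card, assuming flattened 28x28 inputs and 10 output classes (the layer sizes are illustrative):

  from tensorflow.keras.models import Sequential
  from tensorflow.keras.layers import Dense, Flatten

  model = Sequential([
      Flatten(input_shape=(28, 28)),    # flatten each 28x28 image to 784 values
      Dense(128, activation='relu'),    # first layer uses ReLU
      Dense(10, activation='softmax'),  # one probability per digit 0-9
  ])
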
14
Q

What metric is used to evaluate the performance of the model on test data?

A

Accuracy

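A sketch of evaluation, assuming the model was compiled with metrics=['accuracy'] and already trained:

  test_loss, test_acc = model.evaluate(x_test, y_test)
  print(f"Test accuracy: {test_acc:.4f}")
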
15
Q

True or False: Keras simplifies the process of building neural networks.

A

True

16
Q

What is the goal of training a neural network on the MNIST dataset?

A

To correctly predict handwritten digits (0-9).

17
Q

What is the goal of Backpropagation and Gradient Descent?

A

Minimize the error between predicted and actual outputs.

18
Q

What do activation functions prevent in neural networks?

A

They prevent neural networks from behaving like linear regression models (stacked linear layers collapse into a single linear transformation) and allow them to learn complex, non-linear relationships.

19
Q

What is the equation without activation functions?

A

output = dot(W, input) + b

20
Q

What is the equation with activation functions?

A

output = ReLU(dot(W, input) + b)
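
A NumPy sketch contrasting the two equations (the shapes are illustrative):

  import numpy as np

  W = np.random.randn(4, 3)   # weight matrix (illustrative shape)
  b = np.zeros(4)             # bias vector
  x = np.random.randn(3)      # input vector

  linear = np.dot(W, x) + b            # without an activation function
  relu_out = np.maximum(linear, 0.0)   # with ReLU applied elementwise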

21
Q

What is a Linear Activation Function?

A

Output = Input (Straight line)

22
Q

What is the main issue with the Sigmoid activation function?

A

Vanishing Gradient – When values go beyond ±3, the gradient becomes tiny, and learning slows down.
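
A quick numeric check of the saturation, using the sigmoid derivative s * (1 - s):

  import math

  def sigmoid(x):
      return 1 / (1 + math.exp(-x))

  for x in (0.0, 3.0, 6.0):
      s = sigmoid(x)
      print(x, s * (1 - s))   # gradient: 0.25 at 0, ~0.045 at 3, ~0.0025 at 6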

23
Q

What does the Softmax activation function do?

A

Converts values into probabilities that sum to 1.

24
Q

What is an example of Softmax output?

A
  • Cat: 70%
  • Dog: 20%
  • Bird: 10%
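
A NumPy sketch of softmax producing exactly these probabilities (the raw scores are made up to match):

  import numpy as np

  scores = np.array([2.0, 0.75, 0.05])   # raw outputs for cat, dog, bird (made up)
  probs = np.exp(scores) / np.exp(scores).sum()
  print(probs.round(2))   # [0.7 0.2 0.1], probabilities summing to 1
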
25
Q

What are the benefits of the Tanh activation function?

A

Maps inputs between -1 and 1, allowing negative values.

26
Q

What is a key problem with Tanh?

A

Still suffers from vanishing gradient for large values.

27
Q

What distinguishes ReLU from other activation functions?

A

Only activates for positive values; negative inputs become 0.

28
Q

What is a problem associated with ReLU?

A

Dead Neurons – if a neuron only gets negative values, it stops learning.

29
Q

How does Leaky ReLU address the dead neuron problem?

A

Gives negative inputs a small negative slope instead of zeroing them out.
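
A Python sketch of both functions side by side, using a slope of 0.01 for the leaky variant (a common default, assumed here):

  def relu(x):
      return max(0.0, x)                 # negative inputs become 0

  def leaky_relu(x, alpha=0.01):
      return x if x > 0 else alpha * x   # negatives keep a small slope

  print(relu(-2.0), leaky_relu(-2.0))    # 0.0 -0.02
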
30
Q

What is the best use case for the Sigmoid activation function?

A

Probability outputs (binary classification).

31
Q

What problems are associated with Softmax?

A

Can be computationally expensive.

32
Q

When is Tanh best utilized?

A

In situations where negative values are needed.

33
Q

What are the common uses for ReLU?

A

Deep learning models (default choice).

34
Q

What is the main advantage of Leaky ReLU?

A

Prevents dead neurons and improves training.

35
Q

Fill in the blank: Activation functions allow _______ models to learn complex data.

A

[deep learning]

36
Q

What is the most commonly used activation function for hidden layers?

A

ReLU

37
Q

What activation function is best for multi-class classification?

A

Softmax

38
Q

What is the purpose of activation functions?

A

They enable deep networks to learn complex data relationships.

39
Q

What is a key limitation of linear activation functions?

A

They can't handle complex patterns.

40
Q

Name a common activation function used for probabilities.

A

Sigmoid

41
Q

What is the main drawback of the sigmoid activation function?

A

It suffers from vanishing gradients.

42
Q

Which activation function is used in multi-class classification?

A

Softmax

43
Q

What range does the Tanh activation function map inputs to?

A

-1 and 1.

44
Q

What is the most commonly used activation function?

A

ReLU

45
Q

What problem does Leaky ReLU address?

A

The 'dying neuron' problem.