Lecture 9 Flashcards

Convolutional Neural Network (CNN)

1
Q

Convolutional Neural Network (CNN)

A

A Convolutional Neural Network (CNN) is a type of deep learning algorithm that is particularly well-suited for image recognition and processing tasks. It is made up of multiple layers, including convolutional layers, pooling layers, and fully connected layers.

2
Q

How do we process and recognize images?

A

For visual perception, different neuronal cells are tuned to different
orientations. For example, some respond to vertical edges, some to
horizontal, some to diagonal, etc. These neuronal cells are organized in a
columnar architecture and function together to fulfill visual perception tasks.

3
Q

Key Insights from Mammalian Vision

A
  • An image is not processed, perceived or understood in one huge lump
  • The vision system considers small chunks of the visual field and
    extracts key features from each
  • Features are combined at later stages of processing into something
    recognizable as an object
  • This insight suggests that at the lowest level we can slide a small
    “receptive window” over input data – convolution – to process small
    chunks of input
4
Q

What is Happening in a Convolutional Layer?

A

Filters are composed of two parts:
* A set of weights
* An activation function

5
Q

Convolution

A

Convolution is the summation of the element-wise product of two matrices.
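
A minimal NumPy sketch of this definition (the image and kernel values are illustrative, not from the lecture):

import numpy as np

def convolve_step(patch, kernel):
    # Summation of the element-wise product of two matrices
    return np.sum(patch * kernel)

image = np.array([[1, 2, 0],
                  [3, 1, 2],
                  [0, 1, 1]])
kernel = np.array([[1, 0],
                   [0, 1]])

# Slide the 2x2 kernel over the 3x3 image (stride 1) -> 2x2 output
out = np.zeros((2, 2))
for i in range(2):
    for j in range(2):
        out[i, j] = convolve_step(image[i:i+2, j:j+2], kernel)
print(out)  # [[2. 4.]
            #  [4. 2.]]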

6
Q

Sets of Layers in Typical Sequences

A

The convolution, non-linear, and pooling layers are typically used as a set. Multiple sets of the above three layers can appear in a CNN design.

7
Q

Sets of Layers in Typical Sequences

A

Input -> Conv. -> Non-linear -> Pooling -> Conv. -> Non-linear -> Pooling -> … -> Output

8
Q

Sets of Layers in Typical Sequences

A

After a few sets, the output is typically sent to one or two fully
connected (dense) hidden layers.
* A fully connected layer is an ordinary neural network layer, as in other neural networks.
* The typical activation function is the sigmoid function.
* The output is typically a class (classification) or a real number (regression).
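
A minimal Keras sketch of this layer pattern (the layer sizes and input shape are illustrative assumptions, not from the lecture):

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),           # e.g., grayscale images
    layers.Conv2D(32, 3, activation="relu"),  # Conv. + non-linear
    layers.MaxPooling2D(2),                   # Pooling
    layers.Conv2D(64, 3, activation="relu"),  # second set
    layers.MaxPooling2D(2),
    layers.Flatten(),
    layers.Dense(64, activation="sigmoid"),   # fully connected hidden layer
    layers.Dense(10, activation="softmax"),   # class output
])
model.summary()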

9
Q

Keras/TensorFlow in Python

A

Many different software platforms support neural network analysis generally, and CNNs in particular. Python was used to build some of the earliest tools, but as an interpreted language
Python is far too slow to actually fit neural models at scale. Instead, we use a “front end”/“back
end” arrangement to take advantage of the efficiency of languages like C++ and CUDA (a GPU language). Here, we are using the Keras package as the “front end” for setting up our model and
data, and then Keras passes this to the TensorFlow back end to do the actual model fitting.

10
Q

Two Keras Model Types

A

  • Sequential
  • (Functional) Model

11
Q

Sequential

A
  • Simplest approach and used in the majority of examples
  • Allows for one “input tensor” and
    one “output tensor”
  • Each successive layer of the model is “stacked” on the previous layer
  • The layers are connected in order
    of how they are invoked and the
    connections between layers are
    made automatically
12
Q

(Functional) Model

A
  • More complex and flexible
    approach – addresses difficult
    “non-standard” computing
    problems
  • Allows for more than one “input
    tensor” and more than one
    “output tensor”
  • The output of a layer can be
    connected to more than one
    subsequent layer (think of this like
    parallel branches)
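
A hedged sketch of the Functional API with parallel branches (the layer sizes are illustrative; contrast this with the stacked Sequential example earlier):

from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(64,))
x = layers.Dense(32, activation="relu")(inputs)

# The output of one layer feeds two parallel branches
branch_a = layers.Dense(16, activation="relu")(x)
branch_b = layers.Dense(16, activation="relu")(x)
merged = layers.concatenate([branch_a, branch_b])

outputs = layers.Dense(1)(merged)
model = keras.Model(inputs=inputs, outputs=outputs)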
13
Q

What is a Tensor?

A
  • A tensor is a multi-dimensional data structure
    A first-rank tensor can be a vector
    A second-rank tensor can be a matrix

Is a matrix the same as a second-rank tensor?

“All squares are rectangles, but not all rectangles are squares.”

Tensors obey specific transformation rules as part of the structure they have,
but matrices do not necessarily have this.
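
A small NumPy illustration of tensor ranks (purely illustrative):

import numpy as np

vector = np.array([1.0, 2.0, 3.0])      # first-rank tensor (1-D)
matrix = np.array([[1.0, 2.0],
                   [3.0, 4.0]])         # second-rank tensor (2-D)
batch = np.zeros((32, 28, 28, 1))       # fourth-rank: a batch of images

print(vector.ndim, matrix.ndim, batch.ndim)  # 1 2 4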

14
Q

Many Types of Layers Supported

A
  • Each layer has a particular
    architectural configuration meant
    to accomplish a particular kind of
    task
  • For example, we know that pooling layers do data reduction while highlighting strong features
  • Each layer has options for size,
    initialization, and activation
    function
15
Q

Many Types of Layers Supported

A
  • Partial list:
  • Preprocessing layers (e.g., text)
  • Core layers (basic types, e.g., “Dense”)
  • Convolution layers (1D, 2D, and 3D)
  • Pooling layers (1D, 2D, and 3D; max or
    average)
  • Recurrent layers (e.g., LSTM)
  • Normalization and regularization layers
  • Attention layers (multi-head)
  • Reshaping/merging
  • Activation layers
16
Q

Activation Function Reminder

A
  • The “secret sauce” of neural
    networks is non-linear activation
    functions
  • Linear functions model linear
    phenomena; anything more
    complex and we get predictions
    that only work in a narrow range
  • After the inputs to a neural node
    are summed, the activation
    function produces an output value
    (Y) based on the sum of the input
    values (X) according to non-linear
    curves such as the sigmoid or ReLU
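
A minimal sketch of two such curves, assuming the lecture slide showed sigmoid and ReLU (both appear elsewhere in these cards):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))  # squashes the summed inputs into (0, 1)

def relu(x):
    return np.maximum(0.0, x)        # zero for negative sums, linear above

x = np.array([-2.0, 0.0, 2.0])       # summed inputs (X) to a node
print(sigmoid(x))                    # [0.119 0.5   0.881]
print(relu(x))                       # [0. 0. 2.]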
17
Q

Loss Function, Optimizer, Metrics

A
  • A loss function (AKA “cost” or “error” function) is an expression
    that produces a value for “how wrong we are” with a set of
    predictions
  • There are two big groups of loss functions, one for classification
    tasks (probabilistic losses) and one for metric prediction tasks
    (regression losses)
  • The most well known (and widely used) regression loss is “mean
    squared error” – the mean of the squared differences between
    predicted and actual y values
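
Mean squared error as a one-line NumPy sketch (the values are illustrative):

import numpy as np

y_true = np.array([3.0, 5.0, 2.0])  # actual y values
y_pred = np.array([2.5, 5.0, 4.0])  # predicted y values

mse = np.mean((y_true - y_pred) ** 2)  # mean of squared differences
print(mse)  # (0.25 + 0.0 + 4.0) / 3 ≈ 1.417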
18
Q

Loss Function, Optimizer, Metrics

A
  • Optimizers (in Keras) control
    the practicalities of how
    model fitting pursues the loss
    function
  • “stochastic gradient descent”
    – imagine a skier making small
    random turns to go downhill
    as quickly as possible
  • The AdaDelta optimizer can adjust
    the learning rate dynamically
    to make model fitting more
    efficient
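
A hedged example of selecting these optimizers in Keras (the commented compile call is a placeholder for a real model):

from tensorflow import keras

# Plain stochastic gradient descent with a fixed learning rate
sgd = keras.optimizers.SGD(learning_rate=0.01)

# AdaDelta adapts the learning rate dynamically during fitting
adadelta = keras.optimizers.Adadelta()

# model.compile(optimizer=adadelta, loss="mse", metrics=["mae"])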
19
Q

Embedding Layer -
Tweet Matrix

A
  • Each tweet t_i consists of a sequence of tokens w_1, w_2, …, w_{n_i}. L1 is the maximum tweet length. Short tweets are padded using zero padding.
  • Every word is represented as a d-dimensional word vector
  • The publicly available pre-trained GloVe word vectors for Twitter
    (Pennington et al., 2014) are used; see the sketch below.
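
A minimal sketch of building the Tweet Matrix, assuming a glove dictionary mapping tokens to d-dimensional vectors has already been loaded (the names, sizes, and random range for unknown tokens are assumptions):

import numpy as np

L1, d = 30, 100  # maximum tweet length, word-vector dimension (assumed values)
glove = {}       # token -> d-dimensional GloVe vector, loaded elsewhere

def tweet_matrix(tokens):
    # One row per token; short tweets are zero-padded up to L1 rows
    M = np.zeros((L1, d))
    for i, w in enumerate(tokens[:L1]):
        M[i] = glove.get(w, np.random.uniform(-0.25, 0.25, d))  # random if unknown
    return M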
20
Q

Embedding Layer -
Hash-Emo Matrix

A
  • Hashtags, emoticons, and emojis
  • For each tweet t_i, we extract hashtags h_1, h_2, … and emoticons/emojis e_1, e_2, … and concatenate the hashtag and emoticon/emoji vectors
  • L2 is the height of the Hash-Emo Matrix. Tweets with fewer than L2
    hash-emo features are padded with zeros, while tweets with more
    hash-emo features than L2 are truncated (see the sketch below).
  • d-dimensional word vectors from GloVe
  • Random initialization is used when no word vector is found for a particular word or emoticon
  • For emojis, we first map each one to something descriptive and then generate random word vectors
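
A minimal sketch of the padding/truncation step, assuming each hash-emo feature has already been mapped to a d-dimensional vector:

import numpy as np

L2, d = 10, 100  # Hash-Emo Matrix height, vector dimension (assumed values)

def hash_emo_matrix(vectors):
    # Truncate if more than L2 features; zero-pad if fewer
    M = np.zeros((L2, d))
    for i, v in enumerate(vectors[:L2]):
        M[i] = v
    return M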
21
Q

Convolutional Layer

A
  • Apply m filters of varying window sizes over the Tweet Matrix from
    the embedding layer
  • The window size (k) refers to the number of adjacent word vectors in the Tweet Matrix that are filtered together (when k > 1); see the sketch below
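
A hedged Keras sketch of m filters with several window sizes applied to the Tweet Matrix (the sizes and window choices are illustrative assumptions):

from tensorflow import keras
from tensorflow.keras import layers

L1, d, m = 30, 100, 64
tweet_matrix = keras.Input(shape=(L1, d))  # output of the embedding layer

# One Conv1D branch per window size k: each filter spans k adjacent word vectors
branches = [layers.Conv1D(m, k, activation="relu")(tweet_matrix)
            for k in (2, 3, 4)]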
22
Q

Dropout and Max Pooling Layer

A
  • ReLU is applied before the dropout layer
  • Dropout is used as a regularization strategy to avoid overfitting
  • Max-pooling extracts the maximum value for each filter
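
A minimal Keras sketch of this step (the dropout rate and shapes are assumptions):

from tensorflow import keras
from tensorflow.keras import layers

conv_out = keras.Input(shape=(28, 64))         # e.g., ReLU-activated conv output
dropped = layers.Dropout(0.5)(conv_out)        # regularization against overfitting
pooled = layers.GlobalMaxPooling1D()(dropped)  # the maximum value for each filter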
23
Q

Dropout and Max Pooling Layer

A

1 3 2 1 3
2 9 1 1 5        9 9 5
1 3 2 3 2   ->   9 9 5
8 3 5 1 0        8 6 9
5 6 1 2 9

A 3x3 max-pooling window slides over the 5x5 input (stride 1), extracting the highest value at each position.
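
The example can be reproduced with a short NumPy sketch:

import numpy as np

x = np.array([[1, 3, 2, 1, 3],
              [2, 9, 1, 1, 5],
              [1, 3, 2, 3, 2],
              [8, 3, 5, 1, 0],
              [5, 6, 1, 2, 9]])

# 3x3 max-pooling window, stride 1 -> 3x3 output
out = np.array([[x[i:i+3, j:j+3].max() for j in range(3)]
                for i in range(3)])
print(out)  # [[9 9 5]
            #  [9 9 5]
            #  [8 6 9]]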

24
Q

Fully Connected Layer

A
  • Maps the inputs to a number of outputs corresponding to the
    number of classes we have.
  • Emotion recognition: a multi-class classification task
    • Softmax as the activation function and categorical cross-entropy as the loss function
    • The output of the softmax function is equivalent to a categorical probability
      distribution, which indicates the probability of each class being true
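
A hedged sketch of this output layer and the matching compile step (the number of classes and feature size are illustrative assumptions):

from tensorflow import keras
from tensorflow.keras import layers

num_classes = 6  # e.g., six emotion classes (assumed)

model = keras.Sequential([
    keras.Input(shape=(128,)),  # pooled feature vector (assumed size)
    layers.Dense(num_classes, activation="softmax"),
])
model.compile(loss="categorical_crossentropy",
              optimizer="adam",
              metrics=["accuracy"])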