Chapter 4: Early Connectionism & Perceptrons, First AI Winter Flashcards

1
Q

What did Donald Hebb publish, and in what year?

A

Donald Hebb published The Organization of Behavior: A Neuropsychological Theory in 1949. His work attempted to explain associative learning, a mechanism now called Hebbian learning:
“Cells that fire together, wire together”

2
Q

What is SNARC? Who created it and when?

A

SNARC (Stochastic Neural Analog Reinforcement Calculator) was the first artificial neural network machine, created by Marvin Minsky and Dean Edmonds in 1951. It learned its parameters through Hebbian/Skinnerian mechanisms.

3
Q

When was the Perceptron invented? Who created it?

A

Frank Rosenblatt invented the perceptron in 1958. The first version was simulated in software on an IBM 704 and later implemented in custom hardware (the Mark I Perceptron). The theory of perceptrons was set out more fully a few years later in Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms (1962).

4
Q

What features did the perceptron have that were different from the artificial neuron created by McCulloch & Pitts in 1943?

A
  • The perceptron uses numerical (real-valued) inputs and weights instead of binary values
  • It uses a bias term instead of a threshold value

A neuron with n + 1 inputs is characterized by weights w_0, …, w_n, each of which is a real number. The first weight w_0 is the bias term; it is paired with a constant input x_0 = 1.

5
Q

How do you calculate the output of a perceptron?

A

Sum the products of the weights w_0, …, w_n and the input values x_0, …, x_n with matching indices (with x_0 = 1 as the constant bias input). The perceptron then outputs 1 if this weighted sum is non-negative and 0 otherwise.
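A minimal sketch of this computation in Python (illustrative only, not from the course; it assumes a 0/1 step activation and the constant bias input x_0 = 1):

    def perceptron_output(weights, inputs):
        # weights = [w0, w1, ..., wn]; inputs = [x1, ..., xn]
        x = [1.0] + list(inputs)          # prepend the constant bias input x0 = 1
        activation = sum(w * xi for w, xi in zip(weights, x))
        return 1 if activation >= 0 else 0

    # Example: a perceptron computing logical AND of two binary inputs.
    and_weights = [-1.5, 1.0, 1.0]                 # w0 (bias), w1, w2
    print(perceptron_output(and_weights, [1, 1]))  # 1
    print(perceptron_output(and_weights, [0, 1]))  # 0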

6
Q

Explain how weights can be updated using the Perceptron learning rule.

A

The perceptron update rule has 4 components:
- The current weight
- The learning rate
- The error (the difference between the desired output y and the actual output ŷ)
- The input associated with the weight

New weights are computed by this formula:
w_{t+1} = w_t + r · (y_i − ŷ_i) · x_i

(The reason the input value is included in the learning rule is to adjust the weight in proportion to how much it contributed to the error)

Training stops once the total error is small enough.
Total error: E = (1/n) · Σ_i (y_i − ŷ_i)
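A minimal training-loop sketch in Python (illustrative only; the learning rate, number of epochs, and the zero-error stopping condition are assumptions, not from the course):

    def train_perceptron(data, labels, r=0.1, epochs=100):
        # data: list of input vectors [x1, ..., xn]; labels: desired outputs y in {0, 1}
        n = len(data[0])
        w = [0.0] * (n + 1)                      # w[0] is the bias term
        for _ in range(epochs):
            total_error = 0.0
            for x, y in zip(data, labels):
                xb = [1.0] + list(x)             # constant bias input x0 = 1
                y_hat = 1 if sum(wi * xi for wi, xi in zip(w, xb)) >= 0 else 0
                error = y - y_hat                # (y_i − ŷ_i)
                # Each weight moves in proportion to its input's contribution to the error.
                w = [wi + r * error * xi for wi, xi in zip(w, xb)]
                total_error += abs(error)
            if total_error / len(data) == 0:     # stop when the total error is small (here: zero)
                break
        return w

For a linearly separable dataset this loop eventually finds a separating set of weights (the perceptron convergence theorem).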

7
Q

When updating the weights of the perceptron, explain the direction of the weight change if the output was too small.

A

If (y_i − ŷ_i) > 0, the output was too small.
- If x_{i,j} > 0, weight w_j gets increased ⇒ activation goes up
- If x_{i,j} < 0, weight w_j gets decreased ⇒ activation goes up

Analogously (but in the other direction) when (y_i − ŷ_i) < 0.
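A quick worked example (the numbers are chosen purely for illustration): with learning rate r = 0.1, error (y_i − ŷ_i) = +1, and input x_{i,j} = −2, the update is Δw_j = 0.1 · 1 · (−2) = −0.2, so w_j decreases. The resulting change in activation from this input is Δw_j · x_{i,j} = (−0.2) · (−2) = +0.4, i.e. the activation goes up, as intended when the output was too small.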

8
Q

What is an affine hyperplane?

A

An affine hyperplane is an affine subspace of dimension n − 1 in an n-dimensional space, i.e. it is defined by a single linear equation and, unlike a linear hyperplane, does not need to pass through the origin. Informally, it is a flat surface: in a two-dimensional space it is a straight line, in a three-dimensional space it is a plane, and so on. In machine learning, an affine hyperplane is often used to separate different classes of data in a multi-dimensional space.

Formally, an affine hyperplane is the set of points that satisfy:
w_1 · x_1 + w_2 · x_2 + … + w_n · x_n + b = 0
where the w_i are the weights and b is the bias term. This is the general equation of a hyperplane in n-dimensional space.
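A small sketch in Python (illustrative; the example weights and points are made up) of using this equation to decide on which side of the hyperplane a point lies:

    def hyperplane_side(w, b, x):
        # Sign of w·x + b: positive and negative values lie on opposite sides
        # of the hyperplane; a value of zero means the point lies on it.
        return sum(wi * xi for wi, xi in zip(w, x)) + b

    # Example: the line x_1 + x_2 − 1 = 0 in 2D (w = [1, 1], b = -1).
    print(hyperplane_side([1, 1], -1, [2, 2]))   # 3, one side
    print(hyperplane_side([1, 1], -1, [0, 0]))   # -1, the other side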

9
Q

Define linear separability.

A

Two sets A, B ⊆ ℝ^n are called linearly separable if there is a vector v ∈ ℝ^n and a scalar c ∈ ℝ such that, for all a ∈ A and b ∈ B:
v · a ≥ c and v · b < c

  • There is a separating hyperplane such that A and B lie on different sides
  • This is encoded in a neuron with weights w̃ = (−c, v_1, …, v_n)
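A small Python sketch (illustrative; the sets, v, and c are made-up numbers) checking a candidate (v, c) and building the corresponding neuron weights w̃ = (−c, v_1, …, v_n):

    def separates(v, c, A, B):
        # True if v·a >= c for every a in A and v·b < c for every b in B.
        dot = lambda u, x: sum(ui * xi for ui, xi in zip(u, x))
        return all(dot(v, a) >= c for a in A) and all(dot(v, b) < c for b in B)

    A = [(2, 2), (3, 1)]          # one class
    B = [(0, 0), (-1, 1)]         # the other class
    v, c = (1, 1), 2              # candidate direction and threshold
    print(separates(v, c, A, B))  # True

    # The same separation as a single neuron: weights (−c, v_1, ..., v_n) together with
    # the constant input x_0 = 1, so the neuron fires (sum ≥ 0) exactly on the A side.
    w = (-c,) + v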
10
Q

Give a low-level explanation of the term “disjoint convex hulls”.

A

Disjoint convex hulls means that there is no overlap between the shapes that tightly fit around all points of each class. The two shapes are completely separate and do not intersect each other. In the context of linear separability, this means that there exists a hyperplane (such as a line or a plane) that can be used to separate the two classes without any points from one class falling into the other class.

11
Q

What is meant with “universal function representors” in the context of multilayer perceptrons?

A

In the context of MLPs, “universal function representors” refers to the ability of these networks to approximate arbitrary functions. This means that, given enough hidden neurons, an MLP can approximate essentially any function of interest, no matter how complex. This property is captured by the universal approximation theorem, which states that a feedforward network with a single hidden layer containing a finite number of neurons can approximate any continuous function to arbitrary accuracy. This makes MLPs a powerful tool for a wide range of tasks, such as function approximation, regression, and classification.
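One common formal statement, included here for reference (Cybenko's version for a sigmoidal activation σ; the notation is not from the original card): for every continuous function f on [0, 1]^n and every ε > 0 there exist a number of hidden units N, real coefficients v_i, b_i, and weight vectors w_i ∈ ℝ^n such that
|Σ_{i=1}^{N} v_i · σ(w_i · x + b_i) − f(x)| < ε for all x ∈ [0, 1]^n.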

12
Q

Explain how Rosenblatt designed the hidden layer of a multi-layer perceptron.

A

Rosenblatt’s construction is a specific way of designing the hidden layer of a multi-layer perceptron (MLP) so that it represents a target function F. The target function F maps input vectors from {0, 1}^n to {0, 1}. The idea is to create a hidden layer with exactly one pattern-matching neuron for every possible input vector x in {0, 1}^n; the output neuron then only has to combine the hidden neurons whose patterns F maps to 1 (see the sketch after the example below).

For example, if n=3, then the input space is {0, 1}^3 = {(0,0,0), (0,0,1), (0,1,0), (0,1,1), (1,0,0), (1,0,1), (1,1,0), (1,1,1)}. In this case, there would be 2^3 = 8 hidden neurons, each one “matching” a unique input vector.
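A minimal sketch of this construction in Python (an illustration under common assumptions not spelled out in the card: threshold units for the hidden layer, and an output neuron that sums exactly the hidden neurons whose patterns F maps to 1):

    from itertools import product

    def step(z):
        return 1 if z >= 0 else 0

    def rosenblatt_mlp(F, n):
        # One hidden neuron per pattern u in {0,1}^n: weight +1 where u_j = 1,
        # -1 where u_j = 0, and threshold equal to the number of ones in u,
        # so the neuron fires only when the input equals u exactly.
        patterns = list(product([0, 1], repeat=n))

        def forward(x):
            hidden = [step(sum((1 if uj else -1) * xj for uj, xj in zip(u, x)) - sum(u))
                      for u in patterns]
            # Output neuron: fires if the (single) matching pattern has F(u) = 1.
            return step(sum(h * F(u) for h, u in zip(hidden, patterns)) - 0.5)

        return forward

    # Example: XOR on n = 2 inputs, which a single perceptron cannot represent.
    xor = lambda u: u[0] ^ u[1]
    net = rosenblatt_mlp(xor, 2)
    print([net(x) for x in product([0, 1], repeat=2)])  # [0, 1, 1, 0]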

13
Q

What did Michael Lighthill publish, when was it and what was written in it?

A
  • He published Artificial Intelligence: A General Survey in 1973, also known as the Lighthill Report
  • It provided a critical look at the promises and accomplishments of AI in the preceding decades: “In no part of the field have the discoveries made so far produced the major impact that was then promised.”
  • It is widely cited as a reason for the withdrawal of major funding sources in the UK.
14
Q

When was the first AI winter, and what characterized it?

A
  • 1974–1980
  • DARPA reduced funding throughout academia in the US
  • The Lighthill Report decimated funding in the UK
  • The overall result was decreased research activity over the following years