Midterm Flashcards

(47 cards)

1
Q

Backpropagation Steps. Input and Output Params

A

(1) Plug the input vector in and run it forward through the network until the output values are found.
(2) Use the delta learning rule to modify the weight values based on the difference between the expected and actual output values.

Inputs are α, the input vectors, and the expected outputs.
(1) Run the inputs through forward propagation to get the actual output values o.
(2) Apply the found o to the formula ∆W_ij = α (c_i − o_i) o_i (1 − o_i) x_j.
(3) Add the found ∆W to the old W.

For a 2-layer network, ∆V_jk = ∑_i W_ij ∆W_ij (1 − h_j) x_k, and use h instead of x when computing ∆W.
Example: ∆V_2,2 = W_1,2 ∆W_1,2 (1 − h_2) x_2. The summation runs over every output i, so when there is only one output, i is always 1.
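
A minimal NumPy sketch of one backpropagation step for the 2-layer network above (the function names, layer sizes, and learning rate are illustrative assumptions, not the course's code):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backprop_step(V, W, x, c, alpha=0.5):
    # Forward propagation: hidden activations h, then outputs o.
    h = sigmoid(V @ x)
    o = sigmoid(W @ h)
    # Output-layer delta rule: dW_ij = alpha * (c_i - o_i) * o_i * (1 - o_i) * h_j
    dW = alpha * np.outer((c - o) * o * (1 - o), h)
    # Hidden-layer update: dV_jk = sum_i W_ij * dW_ij * (1 - h_j) * x_k
    dV = np.outer((W * dW).sum(axis=0) * (1 - h), x)
    return V + dV, W + dW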

2
Q

Delta Learning Rule

A

∆W_ij = α (c_i − o_i) o_i (1 − o_i) x_j, where c is the expected output vector, o the actual output vector, and x the input vector; c and o are indexed by i and x by j.

Each weight contributes some amount to the output. We want to change each weight so the output gets more accurate,
moving each weight individually down the error gradient using the learning rate α.
Since we want to move down the error gradient, use:
−α (∂E/∂W_ij) = −α (∂E/∂o_i)(∂o_i/∂W_ij)

For a 2-layer network:
Same process as above; in addition, use h instead of x for the output-layer weights W, and chain one step further back for V:
∆V_jk = −α ∑_i (∂E/∂o_i)(∂o_i/∂h_j)(∂h_j/∂V_jk)
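
As a worked check (assuming the mean squared error from card 3 and a sigmoid output unit), the chain rule reproduces the rule above:

E = ½ ∑_i (c_i − o_i)²,   o_i = σ(∑_j W_ij x_j),   σ′(z) = σ(z)(1 − σ(z))

∂E/∂o_i = −(c_i − o_i)
∂o_i/∂W_ij = o_i (1 − o_i) x_j

∆W_ij = −α (∂E/∂o_i)(∂o_i/∂W_ij) = α (c_i − o_i) o_i (1 − o_i) x_j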

3
Q

Mean Squared Error

A

E = ½ ∑_i (c_i − o_i)²

4
Q

output vector equation

A

o_i = step(∑_j W_ij h_j)

5
Q

McCulloch-Pitts Network: how to solve a next-step problem

A

Step function: X_i^(t+1) = step(∑_j W_ij X_j^(t) − U_i), where X_j is taken at time t.
For each node, multiply the weights of its incoming edges by the values of the source nodes, sum them, then subtract the node's threshold. If the result is more than 0, the node becomes 1; if less than 0, it becomes 0.
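
A minimal sketch of one synchronous update step (the weight matrix, thresholds, and starting state below are illustrative assumptions):

import numpy as np

def mcculloch_pitts_step(W, U, x):
    # x[j] is the state of node j at time t; W[i, j] is the weight of edge j -> i;
    # U[i] is node i's threshold. Returns the state vector at time t+1.
    return (W @ x - U > 0).astype(int)

# Example: two nodes that each excite the other with weight 1 and threshold 0.5.
W = np.array([[0.0, 1.0], [1.0, 0.0]])
U = np.array([0.5, 0.5])
x = np.array([1, 0])
print(mcculloch_pitts_step(W, U, x))   # -> [0 1]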

6
Q

Forward Propagation Steps. What are its input and output values?

A

(1) h = σ(V x)
(2) o = σ(W h)

Use the sigmoid function to squash the result of each weight matrix times its input vector.
Generates the network's output (prediction) for a given input.

7
Q

Genetic Algorithms 3 Crossover Types: What is Crossover?

A

(1) 1 point, (2) 2 Point, (3) Uniform
Crossover is swapping portions of two parents' genomes (segments of a vector, or subtrees in GP) to produce offspring.
1 2 3 4 5 6 7 8 9
A B C D E F G H I

1 Point:
Offspring 1: 1 2 3 4 E F G H I
Offspring 2: A B C D 5 6 7 8 9

2 Point:
Offspring 1: 1 2 3 4 E F G 8 9
Offspring 2: A B C D 5 6 7 H I

Uniform (Any Point):
Offspring 1: 1 2 C 4 E F 7 H 9
Offspring 2: A B 3 D 5 6 G 8 I
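
A minimal sketch of the three operators on list genomes (the 0.5 swap probability for uniform crossover is an illustrative assumption):

import random

def one_point(p1, p2):
    c = random.randrange(1, len(p1))               # single cut point
    return p1[:c] + p2[c:], p2[:c] + p1[c:]

def two_point(p1, p2):
    a, b = sorted(random.sample(range(1, len(p1)), 2))
    return (p1[:a] + p2[a:b] + p1[b:],
            p2[:a] + p1[a:b] + p2[b:])

def uniform(p1, p2, p=0.5):
    c1, c2 = list(p1), list(p2)
    for i in range(len(p1)):
        if random.random() < p:                    # swap this gene between children
            c1[i], c2[i] = c2[i], c1[i]
    return c1, c2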

8
Q

What is the 1/5th Success Rule

A

Helps balance exploration vs. exploitation by adapting the mutation strength.

If > 20% of mutations are good, increase the strength to increase exploration (toward complete-randomization mutation): steps are too small and we are exploiting the current optimum too much.

If < 20% of mutations are good, decrease the strength to get a more fine-tuned solution (toward random-walk-style "incremental changes"): steps are too large and we are exploring too much.
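
A minimal sketch of the adjustment, assuming successes are counted over a window of recent mutations and using an illustrative multiplicative factor:

def adjust_sigma(sigma, successes, trials, factor=1.1):
    # One-fifth rule: grow sigma when more than 1/5 of recent mutations
    # improved fitness, shrink it when fewer than 1/5 did.
    rate = successes / trials
    if rate > 0.2:
        return sigma * factor
    if rate < 0.2:
        return sigma / factor
    return sigma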

9
Q

What is a Mutation? What is a Tweak. Why use mutations?

A

The process of randomly altering an individual's representation (genotype). The alteration is called a tweak.

The operator injects diversity into the search process by making small random changes to candidate solutions.

10
Q

ES (Evolution Strategies) Style evolutionary loop. Tweak Operations? Selection Operations

A

Eval -> Resample -> Repeat
A group of parents is selected (truncation selection) and produces offspring (often by mutation only).
Full or partial replacement of the population can occur.
Emphasizes adaptive mutation and self-adaptation.

11
Q

GA (Genetic Algorithm) Evolutionary Loop. What are its 3 knobs? Tweak operation? Selection algorithm?

A

Eval -> Resample -> Repeat
- Selection: choose individuals for reproduction via tournament selection, fitness-proportionate selection, or SUS
- Breeding: 2 selected parents do crossover (+ mutation) to make new offspring
- Replacement: generational replacement (new replaces old) or merged with the old population
Iterative and individual-based.
Knobs: population size, selection pressure, mutation variance

12
Q

ES (Evolution Strategies) VS GA (Genetic Algo)

A

Both follow Eval -> Resample -> Repeat, are generational, and have elitist mechanisms available.

GA resampling: select parents iteratively and use crossover/mutation to produce children until the new population is filled.

ES resampling: done in bulk; parents are selected by truncation selection and usually use only mutation, then each produces offspring.

13
Q

Fitness Proportionate Selection

A

Similar to a roulette wheel: normalize the fitness values of all individuals, then a random number selects an individual with probability proportional to its normalized fitness.
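
A minimal sketch, assuming non-negative fitness values (the function and variable names are illustrative):

import random

def fitness_proportionate(population, fitnesses):
    # Spin a roulette wheel whose slice sizes are proportional to fitness.
    total = sum(fitnesses)
    spin = random.uniform(0, total)
    running = 0.0
    for individual, f in zip(population, fitnesses):
        running += f
        if spin <= running:
            return individual
    return population[-1]   # guard against floating-point round-off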

14
Q

Tournament Selection

A

A set number of individuals (the tournament size) are selected at random with replacement, and the best among them is chosen.

With replacement means an individual can be selected multiple times, making it more likely that fit individuals are selected often.
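
A minimal sketch (the higher-fitness-is-better convention and the names are illustrative assumptions):

import random

def tournament_select(population, fitness, t=2):
    # Draw t individuals uniformly with replacement; return the fittest.
    entrants = [random.choice(population) for _ in range(t)]
    return max(entrants, key=fitness)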

15
Q

Gradient Descent / Ascent

A

Ascent: x ← x + α∇f(x); descent: x ← x − α∇f(x)
Uses only the first derivative to update parameters.
Repeatedly add a portion of the slope: for ascent, if the slope is positive x increases, and if the slope is negative x decreases.
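
A minimal sketch of gradient ascent in one dimension (the example function, step size, and iteration count are illustrative assumptions):

def gradient_ascent(grad, x, alpha=0.1, steps=100):
    # Repeatedly move in the direction of the gradient.
    for _ in range(steps):
        x = x + alpha * grad(x)
    return x

# Maximize f(x) = -(x - 3)^2, whose gradient is -2(x - 3).
print(gradient_ascent(lambda x: -2 * (x - 3), x=0.0))   # -> approximately 3.0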

16
Q

Newtons Method

A

Uses the first and second derivative (the Hessian in n dimensions) to update parameters. Near a maximum, the 2nd derivative is negative.
x ← x − α (f′(x) / f″(x)),  or in n dimensions x ← x − α H(x)⁻¹ ∇f(x)
With α = 1: xₙ₊₁ = xₙ − f′(xₙ)/f″(xₙ)
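
A minimal sketch in one dimension, reusing the example function from the gradient-ascent card (f(x) = −(x − 3)², an illustrative assumption):

def newton_step(x, f1, f2, alpha=1.0):
    # One update using the first derivative f1(x) and second derivative f2(x).
    return x - alpha * f1(x) / f2(x)

# f(x) = -(x - 3)^2: f'(x) = -2(x - 3), f''(x) = -2.
x = newton_step(0.0, lambda x: -2 * (x - 3), lambda x: -2.0)
print(x)   # -> 3.0 in a single step, since f is quadratic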

17
Q

Newton’s Method vs Gradient Descent / Ascent

A

Newton's method converges to the optimum more rapidly and lessens overshoot because it adjusts the step size based on curvature.

18
Q

How would you perform genetic mutation on a variable-length list, and how can you change the length of the list? What crossover algorithm can do the same thing? How do you do a mutation based on gene duplication?

A

Delete a random subsequence of the list, or, inspired by gene duplication, select a subsequence of the list, make a COPY of it, and insert the copy somewhere else in the list. For crossover, you might pick a random index N no greater than the minimum length of the two lists and do one-point crossover at that point, trading the (possibly different-length) tails of the lists.
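
A minimal sketch of the duplication mutation and the variable-length one-point crossover (function names are illustrative):

import random

def duplicate_mutation(genome):
    # Copy a random subsequence and insert the copy at a random position.
    i, j = sorted(random.sample(range(len(genome) + 1), 2))
    copy = genome[i:j]
    k = random.randrange(len(genome) + 1)
    return genome[:k] + copy + genome[k:]

def var_length_one_point(p1, p2):
    # Cut both parents at the same index (within the shorter one) and trade tails.
    n = random.randrange(1, min(len(p1), len(p2)))
    return p1[:n] + p2[n:], p2[:n] + p1[n:]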

19
Q

How would you do tournament selection with REAL-VALUED tournament sizes >= 1.0?

A

Since T is not an integer, use the fractional part T − Floor[T] as a probability, then run a tournament of either the ceiling or the floor size accordingly:
for a real-valued tournament size T >= 1.0,
with probability T − Floor[T]
do a tournament of size Ceiling[T]
else
do a tournament of size Floor[T]
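
A minimal sketch, reusing the hypothetical tournament_select helper from the Tournament Selection card:

import math
import random

def real_valued_tournament(population, fitness, T):
    # Use the fractional part of T to pick between the two neighboring integer sizes.
    if random.random() < T - math.floor(T):
        t = math.ceil(T)
    else:
        t = math.floor(T)
    return tournament_select(population, fitness, t)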

20
Q

Simulated Annealing

A

Uses an annealing schedule (the temperature starts high and lowers over time). At high temperatures, the algorithm accepts both better and worse moves, allowing movement out of local optima. As the temperature decreases, it mostly accepts only improvements.

If the child is better, always replace; otherwise sometimes replace, with probability:
Prob(t, R, S) = e^((Quality(R) − Quality(S)) / t), where R is the tweaked child, S the current solution, and t the temperature.

Starts with high exploration, then moves toward pure exploitation. Knob = temperature schedule. Can be a global optimizer if the temperature starts high enough.
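
A minimal sketch for maximization, assuming a Gaussian tweak on a real-valued solution and a geometric cooling schedule (both illustrative choices):

import math
import random

def simulated_annealing(quality, s, temp=10.0, cooling=0.95, steps=1000, sigma=0.1):
    best = s
    for _ in range(steps):
        r = s + random.gauss(0, sigma)            # tweak the current solution
        if quality(r) > quality(s) or \
           random.random() < math.exp((quality(r) - quality(s)) / temp):
            s = r                                 # accept better, or worse with prob e^(dQ/t)
        if quality(s) > quality(best):
            best = s
        temp *= cooling                           # lower the temperature
    return best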

21
Q

Hill Climbing with random restarts

A

Hill climbing gets stuck in local optima a lot; random restarts overcome this by restarting from a new random starting point.
Hill Climbing:
Start with a sample and compute its fitness. Make a small tweak and compute the new fitness. If it is higher, the tweaked version replaces the original; repeat. Very low exploration and high exploitation. Knob = the time between restarts.
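
A minimal sketch, assuming a real-valued solution, a Gaussian tweak, and fixed restart/step counts (illustrative choices):

import random

def hill_climb_with_restarts(quality, new_solution, restarts=10, steps=200, sigma=0.1):
    best = None
    for _ in range(restarts):
        s = new_solution()                      # random restart point
        for _ in range(steps):
            r = s + random.gauss(0, sigma)      # small tweak
            if quality(r) > quality(s):
                s = r                           # climb only if the tweak improves fitness
        if best is None or quality(s) > quality(best):
            best = s
    return best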

22
Q

(1+1) Evolution Strategy. How is the mutation usually done?

A

Uses a single candidate solution and uses mutation to make 1 offspring at a time.
1. Start with candidate solution S and set an initial mutation strength for the Gaussian convolution operator.
2. Make one offspring R by applying Gaussian convolution to S.
3. Compare the fitness of R to S; update S only if R is more fit.
4. Optionally adjust the mutation strength using the 1/5th rule.
5. Repeat.
Knob: mutation strength. Because the mutation can produce anything, it is
similar to hill climbing but guaranteed to be a global optimizer (given enough time).

23
Q

(μ+λ) Evolution Strategy (ES). (1+λ).

A

A population of μ parents generates λ offspring. Both parents and children compete to form the next generation.
1. Generate the initial population (can be random); a larger λ means more exploration.
2. Compute the fitness of all individuals.
3. Combine parents with children and select the μ best individuals.
4. The parents each make an equal number of children, using Gaussian convolution for mutation.
5. Loop.
Knobs: μ, λ.
Similar to steepest-ascent hill climbing, but uses Gaussian convolution and is guaranteed to be a global optimizer.

24
Q

Gaussian Convolution

A

For each element of the vector to be convolved: with probability p, choose a random number n from a normal distribution with mean 0 and variance σ²,
and add n to that element, re-choosing n until the new value is greater than the min and less than the max.
Do this for every element v in the vector V.
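
A minimal sketch (the bounds, probability, and σ defaults are illustrative; the re-sampling loop enforces the min/max constraint):

import random

def gaussian_convolution(v, p=1.0, sigma=0.1, lo=0.0, hi=1.0):
    out = list(v)
    for i in range(len(out)):
        if random.random() < p:
            # Re-sample the noise until the mutated gene stays inside [lo, hi].
            while True:
                n = random.gauss(0, sigma)
                if lo <= out[i] + n <= hi:
                    break
            out[i] += n
    return out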

25
Q

Hamming Distance

A

For boolean vectors, to find distance: counts the number of positions at which two genomes differ, d(i, j) = ∑_k (i_k ⊕ j_k).

26
Q

Euclidean Distance

A

d(v, w) = √( ∑_k (v_k − w_k)² )

27
Q

Homologous Crossover

A

A crossover is homologous if crossing an individual with itself just gives you that individual back. One-point, two-point, and uniform crossover are good examples; Standard GP Subtree Crossover is not homologous.

28
Q

Standard GP Subtree Crossover

A

1. Select a random subtree from parent 1.
2. Select a random subtree from parent 2.
3. Swap them.
Non-homologous.

29
Q

Steepest Ascent Hill Climbing

A

Provide n, which is the number of tweaks per iteration.
1. Choose a random solution S.
2. Loop until time is up or a good-enough solution is reached:
3. Make R <- Tweak(S).
4. For n − 1 iterations, set W <- Tweak(S) and compare it to R, replacing R if W is better.
5. Finally compare R to S, replacing S with R if R is better.
6. After the loop ends, return S.
Basically just hill climbing, but sampling all around S each iteration instead of choosing only 1 new location.

30
Q

Steepest Ascent Hill Climbing With Replacement

A

Adds a global Best variable, allowing the best solution found so far to be stored there. Don't compare R to S afterwards; just replace S with R, and instead compare S to Best at the end of each loop to store the best so far.

31
Q

What is Truncation Selection?

A

Used in Evolution Strategies. (1) Select many starting samples. (2) Assess the fitness of all of them. (3) Delete all from the population except the top n samples.

32
Q

Evolutionary Programming

A

Similar to Evolution Strategies except for 2 things: (1) Only uses (μ+λ) with μ = λ, so half the population is removed and then replaced by children. (2) Can be applied to any representation, hence a mutation can be adding/deleting/modifying an edge or node.

33
Q

Epistasis

A

When genes are functionally intertwined.

34
Q

Stochastic Universal Sampling

A

A variant of Fitness Proportionate Selection, biased so that fit individuals are always picked at least once.
1. Compute the total fitness (sum over all individuals).
2. Step = Total / (number of individuals to select).
3. Pick a starting point in the range [0, Step].
4. Place pointers at Start + x·Step.
5. Take the respective individuals the pointers land on as the selected set.

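A minimal sketch, assuming non-negative fitnesses and n individuals to select (names are illustrative):

import random

def stochastic_universal_sampling(population, fitnesses, n):
    total = sum(fitnesses)
    step = total / n
    start = random.uniform(0, step)
    pointers = [start + i * step for i in range(n)]

    selected, running, i = [], 0.0, 0
    for p in pointers:
        # Advance along the fitness "wheel" until this pointer falls in individual i's slice.
        while running + fitnesses[i] < p:
            running += fitnesses[i]
            i += 1
        selected.append(population[i])
    return selected
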
35
Q

What does it mean to have a similar genotype?

A

Near each other in space with respect to a tweak operation.

36
Q

Encoding and Decoding

A

Phenotype -> genotype is encoding; the reverse is decoding.

37
Q

Hamming Cliff

A

Where, to make a small change in phenotype, you must make a large change to the genotype. Binary example: 7 -> 8 is the same as 0111 -> 1000 (every bit flips).

38
Q

Gray Code

A

Each adjacent phenotype value differs by only 1 bit in its encoding. To convert a binary encoding v into Gray code w: set w = v, then for i = 2 -> length, if v_(i−1) is true, flip w's respective bit (i.e., w_i = v_i ⊕ v_(i−1)).

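A minimal sketch of the binary-to-Gray conversion described above, on lists of bits with the most significant bit first (an illustrative convention):

def binary_to_gray(v):
    # w_1 = v_1; w_i = v_i XOR v_(i-1) for i >= 2.
    return [v[0]] + [v[i] ^ v[i - 1] for i in range(1, len(v))]

print(binary_to_gray([0, 1, 1, 1]))   # 7 -> [0, 1, 0, 0]
print(binary_to_gray([1, 0, 0, 0]))   # 8 -> [1, 1, 0, 0], only 1 bit away from Gray(7)
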
39
Q

Integer mutation when the numbers correspond to labels

A

Change the int value to another legal label whenever that gene passes the mutation-probability check.

40
Q

When mutating integers that represent a metric space

A

If the gene passes the probability check: do a random walk, repeatedly adding +1 or −1 to the value, and keep walking until a coin toss with probability b is less than a random value drawn from 0 -> 1.

41
Q

Point Mutations

A

Pick 1 or n random genes and mutate just those. Not a global operator; it can only move horizontally.

42
Q

Recombination, Genetic Algorithms, Lines, Genetic Programming

A

Similar to crossover, line recombination takes a linear combination of corresponding genes: it draws a line between the 2 parent points and picks new points on that line. Takes in 2 vectors v and w and a constant p; α and β are random values from −p to 1+p. The 2 offspring are calculated gene-by-gene by:
t_i = α·v_i + (1 − α)·w_i
s_i = β·w_i + (1 − β)·v_i

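A minimal sketch (whether α and β are drawn once per vector or once per gene varies by formulation; once per vector is assumed here, and bounds checking is omitted):

import random

def line_recombination(v, w, p=0.25):
    alpha = random.uniform(-p, 1 + p)
    beta = random.uniform(-p, 1 + p)
    # Children lie on (or slightly beyond) the line segment between the parents.
    t = [alpha * vi + (1 - alpha) * wi for vi, wi in zip(v, w)]
    s = [beta * wi + (1 - beta) * vi for vi, wi in zip(v, w)]
    return t, s
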
43
Q

How to draw a Mealy Machine

A

Label each internal state as a node. Label each edge with the external input ("in a") and the action ("do something"). Each edge goes from its starting internal state to its ending internal state.

44
Q

Moore Machine

A

Read as: if the internal state is (NODE) and the external input is (EDGE), change to internal state (NODE2) and do the action attached to (NODE2); in a Moore machine the action belongs to the node rather than the edge.

45
Q

Phenotype

A

How the individual operates during fitness assessment.

46
Q

Kohonen's Algorithm

A

Competitive learning in which the winning unit and its nearby neighbors are updated, as opposed to the winner only: ∆W_(i*)j = α (X_j − W_(i*)j), where i* is the winning unit.

47