Chapter 4 - Search in complex environments Flashcards
Define a local search
A local search is a search that starts from some initial state and moves to neighboring states without keeping track of paths. It does not keep track of previously reached states either, so there is no "reached" table.
As a result, it doesn't know whether it has explored the entire state space. It is not systematic, and therefore not complete.
However, it uses very little memory, since we don't need any data structures of significant size.
Also, it often works well in state spaces so large that a systematic search would be hopelessly slow.
Define n-queens
In the n-queens problem, the initial state places n queens on an n*n chess board, one queen in each column.
The goal is a state where no queen attacks any other.
For 8-queens, every state has 56 successors. Why? We can move one of the 8 queens, and each queen can move to one of the 7 other squares in its own column. Therefore, 8*7 = 56.
We can use a heuristic function whose value is the number of pairs of queens attacking each other in a given state.
What is heuristic of the n queens problem
The number of pairs of queens attacking each other in a given state, counted directly or indirectly: queens on the same line count as attacking pairs even if another queen sits between them.
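As a sketch of this heuristic in Python (the list encoding, where state[c] is the row of the queen in column c, is an assumption for illustration):

```python
def attacking_pairs(state):
    """h for n-queens: number of pairs of queens that attack each
    other. state[c] = row of the queen in column c."""
    n = len(state)
    h = 0
    for i in range(n):
        for j in range(i + 1, n):
            same_row = state[i] == state[j]
            same_diag = abs(state[i] - state[j]) == j - i
            if same_row or same_diag:
                h += 1
    return h
```

For example, four queens on one diagonal give C(4,2) = 6 attacking pairs, while a solved 4-queens board gives 0.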
Define hill climbing search
Hill climbing is an algorithm that, from the current state, considers the neighboring states and moves to the neighbor that is best according to an objective function.
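A minimal sketch of steepest-ascent hill climbing (the neighbors/value callables are an assumed interface, not the book's pseudocode):

```python
def hill_climbing(state, neighbors, value):
    """Steepest-ascent hill climbing: repeatedly move to the best
    neighbor; stop when no neighbor beats the current state."""
    while True:
        best = max(neighbors(state), key=value)
        if value(best) <= value(state):
            return state   # local (or global) maximum reached
        state = best
```

For instance, maximizing -(x-3)^2 over the integers with neighbors x-1 and x+1 climbs from 0 up to 3 and stops there.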
What is the objective function?
The objective function is a (usually mathematical) function that represents the scores of different states: it maps each state to a value that measures how "good" that state is.
What is greedy local search? name example of algo
A local search that makes a greedy choice at each step. Hill climbing is considered a greedy local search because it always picks the BEST neighbor. This greed makes hill climbing quite fast.
What is the problem with hill climbing?
There are a variety of ways it can become stuck.
1) Local maxima
2) Ridges
3) Plateaus (flat areas of the landscape)
What can we do to battle the problems with hill climbing?
One answer is to allow sideways moves: if no neighbor is strictly better, move to an equally good neighbor. To avoid wandering sideways forever, we impose a limit of, say, 100 consecutive sideways moves.
Allowing at most 100 consecutive sideways moves raises the 8-queens success rate from 14% to 94%, at the cost of going from an average of roughly 4 moves per successful instance to roughly 21.
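A sketch of the sideways-moves variant (the max_sideways limit and the callable interface are illustrative assumptions):

```python
def hill_climbing_sideways(state, neighbors, value, max_sideways=100):
    """Hill climbing that also accepts equal-valued (sideways) moves,
    up to max_sideways consecutive ones, to cross plateaus."""
    sideways = 0
    while True:
        best = max(neighbors(state), key=value)
        if value(best) < value(state):
            return state              # local maximum: no move helps
        if value(best) == value(state):
            sideways += 1
            if sideways > max_sideways:
                return state          # give up on this plateau
        else:
            sideways = 0              # strict improvement resets the count
        state = best
```

On a landscape with a flat shoulder (e.g. value = min(x, 5)), this climbs onto the plateau and then terminates after the sideways budget runs out instead of looping forever.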
What is stochastic hill climbing?
Stochastic hill climbing chooses at random among the uphill moves (those that lead to a better state). The selection probability can vary with the steepness of the move, so that steeper uphill moves are more likely to be chosen.
This randomness makes the variant useful in combination with random-restart hill climbing, because it explores a larger part of the state space.
What is random restart hill climbing?
Random-restart hill climbing conducts a series of hill-climbing searches from randomly generated initial states, stopping when a goal is found. It is complete with probability approaching 1, since it keeps restarting until a solution is found.
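A sketch of random-restart hill climbing (random_state and is_goal are assumed callables for illustration; the inner hill climb is repeated here so the snippet is self-contained):

```python
import random

def hill_climb(state, neighbors, value):
    # plain steepest-ascent hill climbing, used as the inner search
    while True:
        best = max(neighbors(state), key=value)
        if value(best) <= value(state):
            return state
        state = best

def random_restart(random_state, neighbors, value, is_goal):
    """Restart hill climbing from fresh random initial states until a
    run ends in a goal state."""
    while True:
        state = hill_climb(random_state(), neighbors, value)
        if is_goal(state):
            return state
```

On a two-hill landscape, runs starting on the wrong hill get stuck at the local maximum; a later random start on the right hill reaches the goal.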
What is the objective function in n-queens?
The heuristic function: the number of pairs of queens attacking each other in a given state.
How can we implement the heuristic function in n-queens?
The naive approach loops over each queen and checks it against every other queen for attacks.
This sucks: every evaluation costs O(n^2).
We would much rather keep a summary structure, such as counts of queens per row and per diagonal. Then, if we move queen x from row i to row j, we only update the counts for the affected lines.
Either way, the general idea is the same: for every state we consider, we need the number of attacks.
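A minimal sketch of such a summary structure, assuming the state[c] = row-of-queen-in-column-c encoding: k queens on one shared line contribute k(k-1)/2 attacking pairs, so counting queens per row and per diagonal is enough.

```python
from collections import Counter

def make_counts(state):
    """Count queens per row and per diagonal; state[c] is the row of
    the queen in column c."""
    rows, diag1, diag2 = Counter(), Counter(), Counter()
    for c, r in enumerate(state):
        rows[r] += 1
        diag1[r - c] += 1   # cells on one diagonal direction share r - c
        diag2[r + c] += 1   # cells on the other direction share r + c
    return rows, diag1, diag2

def heuristic(state):
    # k queens on one line form k*(k-1)//2 attacking pairs
    return sum(k * (k - 1) // 2
               for counts in make_counts(state)
               for k in counts.values())
```

With the counters kept alongside the state, moving one queen only touches three counters for the old square and three for the new one, which is the O(1) update the card describes (the incremental bookkeeping itself is left out of this sketch).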
why do we not solve the hill climbing problems by exhaustively traversing the state space?
Because it is ridiculously inefficient: the state spaces where local search is used are far too large to enumerate.
What is simulated annealing?
Simulated annealing combines some of the properties of a random walk with some of the properties of hill climbing.
For the standard analogy, switch from hill climbing to gradient descent (minimizing cost instead of maximizing value).
Analogy: a ping-pong ball on a bumpy surface that we want to end up in the global minimum. Gradient descent alone would leave it stuck in the first local minimum; by "shaking" the surface just hard enough, the ball can jump out of a local minimum without being shaken out of the global one.
The algorithm works like this:
We start from some initial state. Then we loop "forever", with the hope of finding the solution earlier.
Instead of picking the best move (like hill climbing would), we pick a neighbor at random. If the random choice improves our current state, it is always accepted. If it does not, we accept the move with a certain probability less than 1. This probability decreases exponentially with the "badness" of the move.
The probability of accepting a bad move also decreases as the temperature goes down, so bad moves are more likely at the start, when the temperature is high.
To be precise, we accept "bad" moves with probability e^(deltaE / T)
where deltaE = value(next) - value(current), the change in value from the current state to the candidate state, and
T is the temperature returned by the cooling schedule at the current time.
Explain detailed the simulated annealing algorithm
Simulated annealing picks the next state at random. If the random choice is better than the current state, it is always accepted.
If the random choice is worse, it will pick it based on probability. This probability is dependent on 2 things:
1) The schedule input and schedule function
2) The energy difference
The schedule function takes the time/step number as input and gives the “temperature” as output. This temperature should decrease with time.
The energy difference deltaE = value(next) - value(current), where next is the randomly chosen state.
Then the random choice will be used with the following probability: e^(deltaE / T)
This is between 0 and 1 because the probability is only used when the choice is "bad". In that case the next state is worse than the current one, so deltaE = next - current is negative, and the probability becomes 1/(e^(|deltaE|/T)), which is less than 1.
The algorithm terminates when the schedule function returns 0 based on the input t.
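The loop above can be sketched like this (the callable interfaces are assumptions; the acceptance rule follows e^(deltaE/T) with deltaE = value(next) - value(current)):

```python
import math
import random

def simulated_annealing(state, neighbors, value, schedule):
    """Pick a random neighbor each step; always accept improvements,
    accept worsening moves with probability e^(deltaE / T)."""
    t = 0
    while True:
        T = schedule(t)
        if T <= 0:
            return state                # schedule ran out: terminate
        nxt = random.choice(neighbors(state))
        dE = value(nxt) - value(state)  # negative for a "bad" move
        if dE > 0 or random.random() < math.exp(dE / T):
            state = nxt
        t += 1
```

With a linearly decreasing schedule, early iterations wander (high T accepts bad moves often) and late iterations behave almost greedily, settling near the maximum.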
What is local beam search?
Local beam search starts from k randomly generated states and expands each of them. It then pools all the resulting successors, selects the k best ones, and repeats the process from those.
This is better than random-restart search because useful information is passed between the k parallel threads of search.
Local beam search can suffer from a lack of diversity: the k states can quickly become concentrated in a small region of the state space. Stochastic beam search improves on this by operating like stochastic hill climbing: it chooses the k successors with probability proportional to their value, rather than always picking the best ones.
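A sketch of the basic (non-stochastic) version (the callable interfaces and the max_iters cutoff are assumptions):

```python
def local_beam_search(starts, neighbors, value, is_goal, max_iters=100):
    """Keep k states; each round, pool all their successors and keep
    the k best of the pool."""
    beam = list(starts)
    k = len(beam)
    for _ in range(max_iters):
        for s in beam:
            if is_goal(s):
                return s
        pool = {n for s in beam for n in neighbors(s)}       # all successors
        beam = sorted(pool, key=value, reverse=True)[:k]     # keep k best
    return max(beam, key=value)   # best state found if no goal reached
```

Note how information flows between the threads: a start near a good hill pulls the whole beam toward it within a round or two.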
What is an evolutionary algorithm?
An evolutionary algorithm is an algorithm motivated by natural selection in biology. Meaning:
there is some population of individuals (states), in which the fittest (those with the highest values) produce offspring that populate the next generation, a process called recombination.
There are many kinds of evolutionary algorithms, which depend largely on the representations and mechanisms they are based on.
What are the variables in an evolutionary algorithm?
The size of the population.
The representation of each individual. Sometimes the individuals are represented as a string (DNA).
The mixing number, rho. Rho represents how many parents are combined to form an offspring. Typically rho = 2 (2 parents).
The selection process for selecting the individuals who are to produce an offspring. We usually assign probabilities based on fitness score.
The recombination procedure. One common approach, given 2 parents, is to randomly select a crossover point that splits both strings at the same position. Then we create 2 children: one gets the first part of A and the second part of B; the other gets the first part of B and the second part of A.
The mutation rate.
Culling? Culling is the process of discarding offspring whose fitness is below a certain threshold. This can speed up the evolution.
How can we represent the n-queen problem with evolutionary algorithm?
We can consider one board configuration as an individual, represented by an n-length string of digits. For 8-queens, we use a string of 8 digits to represent a board.
Ex: 23748552
The j-th digit is the row number of the queen in column j. Since each queen stays in its own column, 8 digits are enough to describe 8 queens.
So we can use, for instance, a population of 4 individuals (8-queens boards). Then a fitness function rates each individual (state); we could use the number of non-attacking pairs of queens (the reverse of the attacking-pairs heuristic).
Say we have 4 individuals with fitness scores 24, 23, 20, 11. We convert these scores into selection probabilities proportional to fitness, so a higher score gives a higher chance of being chosen for reproduction: 31%, 29%, 26%, 14%.
The next phase is to select pairs of parents based on these probabilities. Individuals may be picked several times, and some may not be picked at all. Here, we create 2 pairs of individuals.
Then we find the crossover points. When we now create 2 new offspring from both pairs, they become like this:
The first kid gets the first part from A and second part from B.
The second kid gets the first part from B and the second part from A.
We could also add random mutation. Say each digit is changed with 10% probability, independently; the probability that all 8 digits mutate is then 0.1^8 = 10^(-8).
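The crossover and mutation steps can be sketched like this (string encoding with rows 1..n, as in the example above; the function names are my own):

```python
import random

def reproduce(a, b):
    """Single-point crossover: first part of a + second part of b."""
    c = random.randrange(1, len(a))   # crossover point, same for both
    return a[:c] + b[c:]

def mutate(child, rate=0.1):
    """Each digit is independently replaced by a random row (1..n)
    with probability `rate`."""
    n = len(child)
    return ''.join(str(random.randint(1, n)) if random.random() < rate else d
                   for d in child)
```

Since the crossover point is at least 1 and at most len-1, a child always keeps the first digit of parent A and the last digit of parent B.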
What is important regarding the usefulness of evolutionary algorithms?
The strings that represent individuals must contain blocks that mean something on their own. If the positions in the string are permuted so that contiguous blocks carry no useful meaning, then recombining blocks just produces meaningless combinations. Crossover only helps when blocks encode useful partial solutions that can be combined.
What is a schema?
A schema is a substring in which some of the positions are left unspecified, like this:
246*****
This schema describes all 8-queens states in which the first three queens are in the positions 2, 4 and 6 respectively. Strings that match this schema, meaning that they have those 3 numbers in common (at the same position) are called INSTANCES OF THE SCHEMA.
What is an instance of a schema?
An instance of a schema is a string that match the schema.
If the schema is 246*****, then the following is an instance of it: 24645714
It can be shown that IF the average fitness of the instances of a schema is above the population mean, THEN the number of instances of that schema will grow over time. What does this mean? It means that schemas with high average fitness tend to survive and spread through the population. This follows directly from the selection probabilities we assign in each generation.
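Instance checking itself is trivial; a sketch (the '*' wildcard convention is from the schema notation above):

```python
def is_instance(schema, s):
    """True if s matches the schema; '*' positions are unspecified."""
    return len(s) == len(schema) and all(
        c == '*' or c == d for c, d in zip(schema, s))
```

So 24645714 is an instance of 246*****, while any string whose first three digits differ from 2, 4, 6 is not.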
What can we say about the progress of steps in evolutionary algorithms?
The big steps happen early on, when the population is more diverse.
as the population become more uniform, smaller steps will be made.
Explain the actual algorithm that performs the evolutionary algorithm
The algorithm takes a population and a fitness function as input. The goal is to return an individual of good qualities.
The main part of the algorithm is to be REPEATED UNTIL INDIVIDUAL IS FIT ENOUGH (or time restriction runs out).
So, what’s inside the main part?
First of all, the input “population” is a list of individuals. Then we use this list, as well as the fitness function, to create a new list of “weights”. Weights contains corresponding fitness values for each individual in the population.
Then we create a new, empty list which holds the new population.
Then we go into a for loop over the population size. Inside this loop, we select 2 parents based on the probabilities given by the weights list. Ideally this is done by another function, WEIGHTED-RANDOM-CHOICES(population, weights, 2), which returns 2 parents.
Then we call REPRODUCE(parent1, parent2), which returns a child with the combined strings (this is the crossover stage).
Then, with a small mutation probability, we apply MUTATE(child) to change the child slightly.
Then we add the child to the NEW population.
Now we go out of the for loop.
We now set the old population list to be the new one. This will be of the same size as before, but with new children.
A note on the REPRODUCE(parent1, parent2) function:
It is basic substring + substring action based on some crossover point.
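The steps above can be put together as a sketch (function and parameter names are my own, not the book's pseudocode; `alphabet` is the set of symbols a mutation may write, and the code assumes at least one individual has nonzero fitness):

```python
import random

def reproduce(p1, p2):
    # single-point crossover: prefix of p1 + suffix of p2
    c = random.randrange(1, len(p1))
    return p1[:c] + p2[c:]

def genetic_algorithm(population, fitness, alphabet,
                      n_gens=100, mutation_rate=0.1):
    """Weight individuals by fitness, sample 2 parents per child,
    crossover, occasionally mutate one position, and repeat."""
    for _ in range(n_gens):
        weights = [fitness(ind) for ind in population]
        new_population = []
        for _ in range(len(population)):
            # fitness-proportional selection, with replacement
            p1, p2 = random.choices(population, weights=weights, k=2)
            child = reproduce(p1, p2)
            if random.random() < mutation_rate:
                i = random.randrange(len(child))
                child = child[:i] + random.choice(alphabet) + child[i + 1:]
            new_population.append(child)
        population = new_population
    return max(population, key=fitness)
```

On a toy problem (maximize the number of 1s in a bitstring), selection plus crossover steadily concentrates the population around high-fitness strings.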
What do we call the set of states that an agent considers possible?
The belief state.
What is the belief state?
The belief state is the set of states that the agent considers possible. In partially observable or non-deterministic environments the agent cannot be certain which single state it is actually in, or exactly which state an action will lead to, so it has to reason about this set of possibilities.
What can we say about the solution of a problem that operates in a partially observable, non deterministic environment?
The solution will no longer be a path of actions. It will be a conditional plan, also called a contingency plan: a more complicated, tree-like structure of the form "if this percept, then do that; if that other percept, then do this other thing".
This also changes the transition model. We can't use RESULT(s, a) anymore; we have to use RESULTS(s, a), which returns the set of all possible states that action a can lead to.
Say we are in a partially observable environment. How does this affect the agent compared to fully observable environment?
In a partially observable environment, we might only be able to see some part of the state. In a vacuum world, this could mean that we only have "access" to the tile we are currently on.
Therefore, we lack knowledge of the entire state.
So whenever we perform an action, we don't actually know what we are moving into. Say we choose to move right: we don't know whether there will be dirt there or not. Had we known the full state in advance, we could have computed the perfect route from the start; now we cannot, so we must create a contingency plan.
How does non-determinism affect the agent?
Non-determinism means that the agent doesn't know for sure which state it will transition to. For instance, if dirt could appear on a tile while we're moving to it, the environment would be non-deterministic.
What do we mean by “AND” nodes in a tree?
AND nodes represent how an action leads to uncertainty about the resulting state. They are branch points where the environment, not the agent, determines which of several possible outcomes occurs, so a plan must handle every branch of an AND node.
What makes up a solution for an AND-OR search problem?
The solution for an AND-OR search problem is a subtree of the complete search tree.
It has the following properties:
1) The solution subtree has a goal node at every leaf node
2) It specifies one action at each of the OR nodes.
3) Includes every outcome branch at each of its AND nodes
What is an OR node in AND-OR tree?
OR nodes represent states, where the agent chooses one of several possible actions. They are the decision points of the tree.
consider the AND-OR-SEARCH(problem) algorithm. What does it return?
It returns a contingency/conditional plan, or failure.
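A compact executable sketch of AND-OR search based on the cards above (the plan representation [action, {outcome: subplan}] and the problem interface with is_goal/actions/results are assumptions for illustration):

```python
def and_or_search(problem):
    """Returns a conditional plan, or None for failure.
    Plan format (assumed here): [] means 'done', otherwise
    [action, {outcome_state: subplan, ...}]."""
    return or_search(problem, problem.initial, [])

def or_search(problem, state, path):
    if problem.is_goal(state):
        return []                        # empty plan: goal reached
    if state in path:
        return None                      # cycle on this path: fail
    for action in problem.actions(state):
        plan = and_search(problem, problem.results(state, action),
                          [state] + path)
        if plan is not None:
            return [action, plan]        # one workable action suffices (OR)
    return None

def and_search(problem, states, path):
    plans = {}
    for s in states:                     # EVERY outcome needs a subplan (AND)
        subplan = or_search(problem, s, path)
        if subplan is None:
            return None
        plans[s] = subplan
    return plans
```

For a toy problem where action 'a' from state 1 may land in 2 or 3 (3 being the goal) and 'b' from 2 reaches 3, the returned contingency plan is ['a', {3: [], 2: ['b', {3: []}]}].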