Chapter 4 - Search in complex environments Flashcards

1
Q

Define a local search

A

A local search starts from some initial state and moves to neighboring states, without keeping track of paths. It does not keep track of previously reached states either, so there is no “reached table”.

As a result, a local search cannot tell whether it has explored the entire state space. Local searches are not systematic, and therefore not complete.

However, they use very little memory, since we don’t really need any data structures of significant size.

Also, they often work really well in scenarios where a regular, systematic search would be hopelessly slow.

2
Q

Define n-queens

A

n-queens: the initial state places n queens on an n×n chess board, each queen in its own column.

The goal is to reach a state where no queen attacks any other.

For 8-queens, every state has 56 successors: we can move any of the 8 queens to one of the 7 other squares in its column, so 8 × 7 = 56.

We can use a heuristic function whose value is the number of pairs of queens attacking each other in a given state.

3
Q

What is the heuristic of the n-queens problem?

A

The number of pairs of queens attacking each other, directly or indirectly, in a given state. If several queens share a line, every pair among them is counted.

4
Q

Define hill climbing search

A

Hill climbing is an algorithm that, from the current state, considers the neighboring states and moves to the neighbor that is best out of all the neighbors according to an objective function.
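A minimal steepest-ascent sketch in Python, assuming hypothetical helpers: `neighbors(s)` yields the successor states of s, and `value(s)` is the objective function:

```python
def hill_climbing(initial, neighbors, value):
    """Steepest-ascent hill climbing: keep moving to the best
    neighbor until no neighbor is better than the current state."""
    current = initial
    while True:
        successors = list(neighbors(current))
        if not successors:
            return current
        best = max(successors, key=value)
        if value(best) <= value(current):
            return current  # local maximum (or flat shoulder)
        current = best
```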

5
Q

What is the objective function?

A

The objective function is a (usually mathematical) function that represents the scores of different states: it maps each state to a value that measures how good that state is.

6
Q

What is greedy local search? Name an example algorithm

A

A local search that makes a greedy choice at every step. Hill climbing is a greedy local search because it always picks the BEST neighbor. This greed is what makes hill climbing quite fast.

7
Q

What is the problem with hill climbing?

A

There are a variety of ways it can become stuck.

1) Local maxima
2) Ridges
3) Plateaus (flat areas on the curve: shoulders and flat local maxima)

8
Q

What can we do to battle the problems with hill climbing?

A

One answer is to allow sideways moves: keep going to neighbors that are equally good as the current state when no neighbor is strictly better. However, to avoid moving sideways forever, we include a limit of, say, 100 consecutive sideways moves.

Allowing at most 100 consecutive sideways moves raises the 8-queens success rate from 14% to 94%, at the cost of going from roughly 4 moves per success on average to roughly 21.

9
Q

What is stochastic hill climbing?

A

Stochastic hill climbing chooses at random among the uphill moves (the actions that lead to a “higher”, better state). We can weight the probabilities by how good each move is (how much steeper it is than the others).

The randomness makes this hill-climbing variant useful inside random-restart hill climbing, because it explores a larger part of the state space.
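A sketch of the idea, using the same hypothetical `neighbors`/`value` helpers as in the hill-climbing sketch above; steeper uphill moves get proportionally higher probability:

```python
import random

def stochastic_hill_climbing(initial, neighbors, value, max_steps=10_000):
    """Choose randomly among the uphill moves, weighted by steepness."""
    current = initial
    for _ in range(max_steps):
        uphill = [n for n in neighbors(current) if value(n) > value(current)]
        if not uphill:
            return current  # no uphill move left: local maximum
        steepness = [value(n) - value(current) for n in uphill]
        current = random.choices(uphill, weights=steepness, k=1)[0]
    return current
```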

10
Q

What is random restart hill climbing?

A

Random-restart hill climbing conducts a series of hill-climbing searches from randomly generated initial states, stopping when a goal is found. It is complete (with probability approaching 1), because it keeps restarting until a solution is eventually found.
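A sketch, assuming hypothetical helpers: `random_state()` generates a random initial state, `hill_climb` is any hill-climbing variant, and `is_goal` tests for a solution:

```python
def random_restart_hill_climbing(random_state, hill_climb, is_goal):
    """Restart hill climbing from fresh random states until one run
    ends in a goal state (complete with probability approaching 1)."""
    while True:
        result = hill_climb(random_state())
        if is_goal(result):
            return result
```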

11
Q

What is the objective function in n-queens?

A

The heuristic function: the number of pairs of queens attacking each other in a given state (which the search tries to minimize).

12
Q

How can we implement the heuristic function in n-queens?

A

The naive approach loops through each pair of queens and checks for attacks. This is O(n²) per evaluation, which sucks.

We would much rather keep a summary structure (e.g. counts of queens per row and per diagonal). Then, if we move queen x from row i to row j, we would update only the affected counters.

Either way, the general idea is to check the queens for attacks for every candidate state.
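A sketch of the naive O(n²) pair count (the summary-structure version would instead keep counters per row and per diagonal and adjust only the affected ones on each move):

```python
def attacking_pairs(state):
    """h for n-queens: state[c] is the row of the queen in column c.
    Counts pairs of queens attacking each other (same row or diagonal)."""
    h = 0
    n = len(state)
    for i in range(n):
        for j in range(i + 1, n):
            same_row = state[i] == state[j]
            same_diagonal = abs(state[i] - state[j]) == j - i
            if same_row or same_diagonal:
                h += 1
    return h

print(attacking_pairs([0, 1, 2, 3]))  # 6: all four queens share a diagonal
```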

13
Q

Why do we not solve the hill climbing problems by exhaustively traversing the state space?

A

It is ridiculously inefficient

14
Q

What is simulated annealing?

A

Simulated annealing tries to combine some of the properties of a random walk with some of the properties of hill climbing.

Say we switch from hill climbing to gradient descent (minimizing instead of maximizing).

Analogy: a ping-pong ball that we want to end up in the global minimum. We achieve this by letting it roll downhill (gradient descent) while “shaking” the surface just enough that the ball jumps out of local minima and can progress along some other (hopefully better) path toward the global minimum.

The algorithm works like this:
we start from some initial state, then loop until the schedule runs out, with the hope of finding the solution earlier.

Instead of picking the best move (like hill climbing would do), we pick a neighboring state at random. IF the randomly chosen state improves our current state, the move is always accepted. If it does not immediately improve our state, we accept the move with a certain probability less than 1. This probability decreases exponentially with how “bad” the move looks.

The probability of accepting a bad move also decreases as time passes and the temperature drops. Bad moves are therefore more likely to happen at the start, when the shaking is strongest.

To be precise, we accept “bad” moves with probability e^(ΔE / T),
where ΔE is the change in value between the current state and the candidate next state,
and T is the temperature at the current time, given by a cooling schedule.

15
Q

Explain in detail the simulated annealing algorithm

A

Simulated annealing picks the next state at random. If the random choice is better than the current state, it is accepted no matter what.
If the random choice is worse, it is accepted based on a probability. This probability depends on 2 things:
1) The schedule input and schedule function
2) The energy difference

The schedule function takes the time/step number as input and gives the “temperature” as output. This temperature should decrease with time.
The energy difference ΔE is value(next) − value(current), for the randomly chosen next state.

The random choice is then accepted with probability e^(ΔE / T).

This is between 0 and 1 because it is only used when the choice is “bad”. If the choice is bad, the current state is better than the next state, so ΔE = value(next) − value(current) is a negative number. Because it is negative, the probability becomes 1 / e^(|ΔE| / T).

The algorithm terminates when the schedule function returns 0 for the input t.
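A sketch following this description; `random_neighbor` and `value` are assumed problem-specific helpers, and the linear schedule at the end is just one illustrative choice:

```python
import math
import random

def simulated_annealing(initial, random_neighbor, value, schedule):
    """Accept every improving move; accept a worsening move with
    probability e^(deltaE / T), where T = schedule(t) cools over time."""
    current = initial
    t = 0
    while True:
        T = schedule(t)
        if T <= 0:
            return current  # the schedule ran out: terminate
        candidate = random_neighbor(current)
        delta_e = value(candidate) - value(current)
        if delta_e > 0 or random.random() < math.exp(delta_e / T):
            current = candidate
        t += 1

# Example schedule: cools linearly and hits 0 (terminating) at t = 10000.
linear_schedule = lambda t: 1.0 - t / 10_000
```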

16
Q

What is local beam search?

A

Local beam search uses k random starting states and generates all successors of each. It then takes the complete pooled list of new states and selects the k best ones. From these k new states, it repeats the process.

This is better than random-restart search because local beam search passes useful information between the parallel searches.

Local beam search can suffer from a lack of diversity: the k states can quickly become concentrated in a small region of the state space. We can improve this with “stochastic beam search”, which operates like stochastic hill climbing: it chooses the k successors with probability proportional to their value, rather than simply picking the best ones.
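A sketch using the same hypothetical `random_state`/`neighbors`/`value` helpers as earlier; stochastic beam search would replace the `sorted(...)[:k]` line with a value-weighted random draw of k successors:

```python
def local_beam_search(k, random_state, neighbors, value, steps=1000):
    """Pool all successors of the current k states, keep the k best."""
    beam = [random_state() for _ in range(k)]
    for _ in range(steps):
        pool = [n for s in beam for n in neighbors(s)]
        if not pool:
            break
        beam = sorted(pool, key=value, reverse=True)[:k]
    return max(beam, key=value)
```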

17
Q

What is an evolutionary algorithm?

A

An evolutionary algorithm is an algorithm motivated by natural selection in biology. Meaning:
there is some population of individuals (states), in which the fittest (highest-valued) produce offspring that populate the next generation. This process is called recombination.

There are many variants of evolutionary algorithms. They differ largely in the representations and mechanisms they are based on.

18
Q

What are the variables in an evolutionary algorithm?

A

The size of the population.

The representation of each individual. Often the individuals are represented as a string (like DNA).

The mixing number, ρ. ρ represents how many parents are involved in producing an offspring. Typically ρ = 2 (2 parents).

The selection process for picking the individuals that are to produce offspring. We usually assign selection probabilities based on fitness score.

The recombination procedure. One common approach, given 2 parents, is to randomly select a crossover point. This point (the same for both strings) splits the strings, and we create 2 children: one gets the first part of A and the second part of B; the other gets the first part of B and the second part of A.

The mutation rate.

Culling: the process of discarding offspring whose fitness is below a certain threshold. This can speed up the evolution.

19
Q

How can we represent the n-queen problem with evolutionary algorithm?

A

We can consider one board configuration as an individual, represented by an n-character string of digits. For 8-queens, we use a string of 8 digits to represent a board.
Ex: 23748552
The j-th digit is the row of the queen in column j. Since each queen stays in its own column, 8 digits are enough to represent 8 queens.

So, we can use for instance 4 individuals (8-queens boards). THEN we rate each individual (state) with a fitness function. We could use the reverse of the heuristic, i.e. the number of NON-attacking pairs of queens, so that higher is better.

Say we have 4 individuals with fitness scores 24, 23, 20, 11. THEN we turn these fitness scores into selection probabilities proportional to their values, meaning a higher fitness score gives a higher probability of being chosen for reproduction. For instance: 31%, 29%, 26%, 14%.

The next phase is to select parents based on the probabilities we assigned, forming pairs. In this scheme an individual may be picked several times, and some may not be picked at all. Either way, we create 2 pairs of individuals.

Then we pick the crossover points. When we create 2 new offspring from each pair:
the first child gets the first part of A and the second part of B;
the second child gets the first part of B and the second part of A.

We could also add random mutation. Say each digit mutates independently with 10% probability; then the chance of mutating all 8 digits in a string is 10^(−8). See the sketch below.
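A sketch of exactly this crossover and per-digit mutation, on string individuals like “23748552” (the digit range 1..n is an illustrative assumption):

```python
import random

def reproduce(a, b):
    """Single-point crossover: split both parents at the same point."""
    c = random.randint(1, len(a) - 1)  # crossover point
    return a[:c] + b[c:], b[:c] + a[c:]

def mutate(individual, rate=0.1):
    """Each digit (a queen's row) mutates independently with probability
    `rate`; for rate=0.1 all 8 digits mutate with probability 1e-8."""
    n = len(individual)
    return "".join(
        str(random.randint(1, n)) if random.random() < rate else ch
        for ch in individual
    )
```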

20
Q

What is important regarding the usefulness of evolutionary algorithms?

A

The strings that represent individuals must contain blocks that mean something: contiguous parts of the string should correspond to meaningful components of a solution. If the string positions were randomly permuted, so that neighboring characters had nothing to do with each other, crossover would just produce meaningless combinations. The arrangement must matter, so that combining blocks can actually yield useful combinations.

21
Q

What is a schema?

A

A schema is a substring in which some of the positions are left unspecified, like this:

246*****
This schema describes all 8-queens states in which the first three queens are in rows 2, 4 and 6 respectively. Strings that match this schema, meaning they agree with it at every specified position, are called INSTANCES OF THE SCHEMA.

22
Q

What is an instance of a schema?

A

An instance of a schema is a string that matches the schema.
If the schema is 246*****, then the following is an instance of it: 24645714

It can be shown that IF the average fitness of the instances of a schema is above the mean, THEN the number of instances of that schema will grow over time. What does this mean? It means that if we want offspring to carry a specific schema, the average fitness of that schema must be above the mean. It also means that schemas with high fitness scores will survive. This follows directly from the fitness-proportional probabilities we assign in the selection stage.

23
Q

What can we say about the progress of steps in evolutionary algorithms?

A

The big steps happen early on, when the population is more diverse.

As the population becomes more uniform, smaller steps are made.

24
Q

Explain the actual algorithm that implements an evolutionary algorithm

A

The algorithm takes a population and a fitness function as input. The goal is to return an individual of good quality.

The main part of the algorithm is REPEATED UNTIL some individual is fit enough (or a time limit runs out).

So, what’s inside the main part?
First of all, the input “population” is a list of individuals. From this list and the fitness function we build a list of “weights”, containing the corresponding fitness value of each individual in the population.
Then we create a new, empty list which will hold the new population.
Then we run a for loop, once per member of the population. Inside this loop, we pick 2 parents based on the probabilities from the weights list. This is delegated to a helper called WEIGHTED-RANDOM-CHOICES(population, weights, 2), which returns 2 parents.
Then we call REPRODUCE(parent1, parent2), which returns a child with the combined strings (we’re now in the crossover stage).
Then we add a simple “if (probability threshold met) then MUTATE(child)” step.
Then we add the child to the NEW population.
Then we exit the for loop.
We now set the old population list to be the new one. It has the same size as before, but consists of the new children.

A note on the REPRODUCE(parent1, parent2) function:
it is basic substring + substring concatenation based on some crossover point.
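A sketch of this main loop, assuming a `reproduce(p1, p2)` that returns a single child and a `mutate(child)` like the ones sketched earlier; `fit_enough` and the generation cap stand in for “fit enough or out of time”:

```python
import random

def genetic_algorithm(population, fitness, reproduce, mutate,
                      mutation_rate=0.1, fit_enough=None, generations=1000):
    """Weighted parent selection, crossover, occasional mutation,
    and full generational replacement."""
    for _ in range(generations):
        weights = [fitness(ind) for ind in population]
        new_population = []
        for _ in range(len(population)):
            # WEIGHTED-RANDOM-CHOICES: fitter individuals are picked
            # more often, and the same one may be picked several times.
            parent1, parent2 = random.choices(population, weights=weights, k=2)
            child = reproduce(parent1, parent2)
            if random.random() < mutation_rate:
                child = mutate(child)
            new_population.append(child)
        population = new_population
        best = max(population, key=fitness)
        if fit_enough is not None and fitness(best) >= fit_enough:
            break
    return max(population, key=fitness)
```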

25
Q

What do we call the set of states that an agent considers it might reach by doing some action?

A

The belief state.

26
Q

What is the belief state?

A

The belief state is the set of states that an agent considers possible after performing an action. It reflects the fact that in a non-deterministic environment the agent cannot be sure which outcome the transition model will produce, so it has to track all the candidates.

27
Q

What can we say about the solution of a problem that operates in a partially observable, non-deterministic environment?

A

The solution will no longer be a path of actions. It will now be a conditional plan, also called a contingency plan: a more complicated solution with a tree-like structure of “if this percept then do that, or if that other percept then do this other thing”.

As noted, this changes the transition model. We can’t use RESULT(s, a) anymore. We have to use RESULTS(s, a), because it needs to return the set of all possible states that the action a can lead to.

28
Q

Say we are in a partially observable environment. How does this affect the agent compared to fully observable environment?

A

In a partially observable environment, we might only be able to see some part of the state. In a vacuum world, this could mean that we only have “access” to the part of the environment that is the tile we are currently on.

Therefore, we lack knowledge of the entire state.

Therefore, whenever we perform some action, we don’t actually know what we are moving into. Say we choose to move right: we don’t know whether there will be dirt there or not. Had we known in advance, we could have computed the perfect route from the initial state. Now we cannot do this, so we must create a contingency plan.

29
Q

How does non-determinism affect the agent?

A

Non-determinism means that the agent doesn’t know for sure which state it will transition to. For instance, if dirt could appear while we’re moving to a tile, the environment would be non-deterministic.

30
Q

What do we mean by “AND” nodes in a tree?

A

AND nodes represent how actions lead to uncertainty about the resulting state. We could say that AND nodes represent actions, but that isn’t very fitting, since a regular tree represents actions by its edges (which AND-OR trees can do as well). It is better to think of AND nodes as “splitters”: points where one action fans out into multiple possible outcome states, all of which the plan must handle.

31
Q

What makes up a solution for an AND-OR search problem?

A

The solution for an AND-OR search problem is a subtree of the complete search tree.

It has the following properties:
1) The solution subtree has a goal node at every leaf node
2) It specifies one action at each of the OR nodes.
3) Includes every outcome branch at each of its AND nodes

32
Q

What is an OR node in AND-OR tree?

A

OR nodes represent states. Therefore, we can say that OR nodes represent decision points as well.

33
Q

consider the AND-OR-SEARCH(problem) algorithm. What does it return?

A

It returns a contingency/conditional plan, or failure.

34
Q

explain the general points regarding the and-or-search

A

We call OR-SEARCH on each state; this lets us treat each state as a decision point. From each state, we loop through the ACTIONS(state) set, and for each action we call AND-SEARCH. We call AND-SEARCH because each action must be handled in a way that allows for multiple possible outcome states. We use the RESULTS transition model to get that belief state, and each of its states is then treated by OR-SEARCH. This process continues until the search backtracks.

The search backtracks when a cycle is found or a goal is found.

The key part to understand is the nature of the contingency plan.

35
Q

How does the solution of and-or search look?

A

A possible solution is this: [suck, {if s2 then goal, if s3 then [right, {if s4 then [suck, {if s5 then goal}]}]}]

Nested fuckery.

From the root node: perform an action. This action leads to one of several possible states (2 in this case). If s2 is the state reached, we have reached the goal. If s3 is the state reached by performing suck, then we use the further plan of first going right, and then checking states again.

36
Q

Elaborate on AND-OR trees

A

AND-OR trees are used to represent non-deterministic environments and their transitions.

They consist of AND nodes and OR nodes.

OR nodes are simply states. The logic here is that from a given state, we can perform action A or action B or action C or …

AND nodes come from the fact that there is uncertainty about the outcome of actions. AND nodes capture this fact.

When drawing an AND-OR tree, we use squares for OR nodes and circles for AND nodes. The OR nodes (squares) are the states. The circles (AND nodes) are points that ONLY signify that there are multiple possible outcome states. The AND nodes don’t represent anything other than a “splitter” that shows us where the uncertainty lies.

Now, an AND node doesn’t require multiple outcomes. It is common in AND-OR trees to use an AND node even when the specific action creates no uncertainty.

37
Q

What is the solution of an AND OR search problem?

A

A subtree of the complete search tree that has:

1) A goal node at every leaf
2) Specifies one action at each of its OR nodes
3) Includes every outcome branch at each of its AND nodes

38
Q

Elaborate on the AND-OR-SEARCH algorithm

A

AND-OR-SEARCH returns a contingency plan or failure.

It consists of 3 parts:
1) The main part
2) The OR-search part
3) The AND-search part

The main part is only responsible for calling OR-SEARCH with the original root state. This kickstarts the search.

OR-SEARCH is called for every state. It receives 3 inputs:
1) Problem
2) State
3) Path (the states already on the path from the root, used to detect cycles)

Because we call OR-SEARCH with each state, this is where we check for a goal node. If this specific node/state is the goal, we return an empty plan. The empty plan signifies that “when we reach this state, there is nothing else to do”. In a sense, we are done. No more contingency.
Since we call OR-SEARCH with a state, we also check for cycles, meaning we check whether or not the state has already been reached on the current path. If so, we return failure. Because of the recursion, failure is passed all the way back to the point where the node was first reached.

After checking both isGoal and isCycle, we continue with the OR-SEARCH body. This explores the possible moves from this specific state: we loop through the state’s action set, and for each action we run:
plan = AND-SEARCH(problem, RESULTS(state, action), path)

Notice the RESULTS: we send the set of possible outcome states to the AND-search.
When the recursion returns successfully, OR-SEARCH returns [action] + [plan].

So, what happens in AND-SEARCH?

AND-SEARCH receives the set of possible outcomes. Because each outcome is a state, we need to treat them as OR nodes. Therefore, we loop through the belief state and call:
plan_i = OR-SEARCH(problem, s_i, path)
IF no call returned failure, AND-SEARCH returns:
return [if s_1 then plan_1, if s_2 then plan_2, …]
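A sketch of this recursion, assuming a hypothetical `problem` object with `initial`, `is_goal(s)`, `actions(s)` and a non-deterministic `results(s, a)`; failure is represented as None, and the “if s_i then plan_i” part as a dict keyed by (hashable) states:

```python
def and_or_search(problem):
    """Returns a conditional plan, or None on failure."""
    return or_search(problem, problem.initial, [])

def or_search(problem, state, path):
    if problem.is_goal(state):
        return []          # empty plan: nothing left to do here
    if state in path:
        return None        # cycle: fail on this branch
    for action in problem.actions(state):
        plan = and_search(problem, problem.results(state, action),
                          [state] + path)
        if plan is not None:
            return [action] + plan
    return None

def and_search(problem, states, path):
    """Every possible outcome state needs a working sub-plan."""
    plans = {}
    for s in states:
        plan = or_search(problem, s, path)
        if plan is None:
            return None
        plans[s] = plan
    return [plans]         # reads as: {if s_1 then plan_1, ...}
```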

39
Q

Consider a vacuum cleaner in 2-tile grid world horizontally. Create a contingency plan

A

Let us say we have a vacuum cleaner in a world consisting of 2 tiles (laid out horizontally). The agent starts on the left tile. Both tiles are dirty. When the agent sucks/cleans, it may clean just the tile it is on, or it may actually clean both.
Some of the possible states:
s1 = both tiles are clean, agent is on the left tile
s2 = both tiles are clean, agent is on the right tile
s3 = both tiles are dirty, agent is on the left tile
s4 = both tiles are dirty, agent is on the right tile
s5 = left tile is clean, agent is on left tile, right tile dirty
s6 = left tile is clean, agent is on right tile, right tile dirty
s7 = right tile is clean, agent is on left tile, left tile is dirty
s8 = right tile is clean, agent is on right tile, left tile is dirty

Say the agent begins in s3. Then a contingency plan can be represented like this:

[suck, {s1: [], s5: [right, {s6: [suck, {s2: []}]}]}]

where {} indicates a belief state and [] indicates a plan.

We put [] around all plans, and {} around all belief states.
We use [] around all plans, and {} around all belief states.

40
Q

What is the “tell sign” of an environment that is partially observable?

A

We say that an environment is partially observable if the percepts the agent gets through its sensors are not enough to pin down the exact state.

41
Q

What is a conformant problem?

A

A conformant problem is a problem involving an agent that has no sensors.

Consider a vacuum cleaner that knows the layout, but is sensorless. It will not be able to determine its current state, but it can still operate. By performing a specific sequence of actions tailored to the layout, the agent can “coerce” the world into some goal state.

42
Q

What do we mean by “coercing” an agent into a state

A

Coercing means taking actions that drive the world into a reduced set of possible states, and eventually into a single known state, using only actions that are not based on percepts. This is common with sensorless agents. In the vacuum world, this would for instance be [Right, Suck, Left, Suck]. After this sequence of actions, it does not matter whether the agent is sensorless or not: we have coerced the world into one particular state.

Therefore, coercing is a strategy to use when we face partial (or no) observability.

43
Q

What is the “solution” or “solution form” of a sensorless agent?

A

A sequence of actions. A path. NOT a contingency plan.

The reason the solution is a sequence of actions, and not a contingency plan, is that a sensorless agent will never perceive anything. Therefore, it doesn’t make sense to branch on different percepts. With sensorless agents, we want to find a sequence of actions that makes a goal state NECESSARILY reached.

44
Q

Why could sensorless agents be useful?

A

They don’t rely on percepts to operate.

A sensorless plan can save both time and money. Often, we really don’t need anything more complicated.

45
Q

Consider the vacuum world. If it was to be sensorless, what would the initial belief state be?

A

The initial belief state contains every possible state, since we cannot know which one we are in. For a 1×2 grid there are 8 possible physical states (2 agent positions × 4 dirt configurations), so the initial belief state consists of all 8 states.

46
Q

What can we do to reduce the belief state of a sensorless agent? Use the vacuum world as example

A

Initially, the belief state is {1,2,3,4,5,6,7,8}. However, if we move RIGHT, we reduce the belief state to {2,4,6,8}.
This illustrates how we can actually gain knowledge without receiving any percepts: we manipulate what we know.

IF we perform [Right, Suck], we know that we are in the belief state {4,8}. Finally, after [Right, Suck, Left, Suck], we know we are in state 7.

47
Q

What can we say about sensorless problems in regards to search?

A

We are searching in the space of belief states rather than physical states.

A properly coded agent knows its own belief state. Therefore, as long as we are operating in the belief state, we can treat the problem as fully observable.

48
Q

Transform the chapter 3 problem into a sensorless problem

A

Recall:
States
Initial state
Actions
Goal test
Transition model
Action cost

States: the belief-state space contains every possible subset of the physical states. If P has N states, then the belief-state problem has up to 2^N states.

Initial state: typically the belief state containing every possible state of P. In some cases, though, the agent might know more than this.

Actions: this is problematic. Since we are in a belief state, we don’t know which physical state we are in, so which actions are available? If we assume all actions are safe in every state, the action set is simply the union over the states in the belief state. If not, the action set is the intersection.

Transition model: tricky. The successor belief state contains every state reachable by the action from any state in the current belief state, so it can be quite large.

Goal test: now we distinguish between POSSIBLY and NECESSARILY achieving the goal. We say that the agent possibly achieves the goal if some state s in the belief state satisfies the goal test. The agent necessarily achieves the goal if every state in the belief state satisfies the goal test. For sensorless agents we aim to achieve goals necessarily. See the sketch below.

Action cost: this is a problem. Since we cannot be sure which state we moved to, or which state we moved from, we cannot say with certainty what the cost is. That is, unless every action has the same cost in every state.
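A sketch of the belief-state transition and the possibly/necessarily goal tests, with `results(s, a)` and `is_goal(s)` as assumed physical-problem helpers:

```python
def belief_result(b, action, results):
    """Sensorless transition model: the union of the physical RESULTS
    over every state we might currently be in."""
    return frozenset(s2 for s in b for s2 in results(s, action))

def possibly_achieves(b, is_goal):
    """Some state in the belief state satisfies the goal test."""
    return any(is_goal(s) for s in b)

def necessarily_achieves(b, is_goal):
    """EVERY state in the belief state satisfies the goal test --
    what a sensorless agent aims for."""
    return all(is_goal(s) for s in b)
```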

49
Q

For a partially observable problem, how do we treat the percept?

A

We specify a PERCEPT(s) function that returns the percept received by the agent in a given state. We could also use PERCEPTS(s) if sensing itself is non-deterministic.

For instance, we could use PERCEPT(s) on an 8-puzzle game to view the content of a single tile, which can be enough information to build an internal representation of the board and then solve it as if it were fully observable.

In other words, PERCEPT(s) returns the information the agent receives about the current state. If the environment happened to be fully observable, we would have PERCEPT(s) = s for every state s.

50
Q

What is meant be PERCEPT(s) = null

A

This is the case for non-observable environments / sensorless agents.

51
Q

Consider a 1x2 grid vacuum world. Consider partial observability. How many states yield [L, dirty]?

What does this really mean?

A
Two: we could get this percept with both tiles dirty, or with only the left tile dirty.

This means that multiple states can often produce the same percept.

In partially observable problems, we can think of the transition between belief states as occurring in 3 stages:

1) The prediction stage
2) The possible-percepts stage
3) The update stage

The prediction stage:
The prediction stage computes the belief state resulting from some action, RESULT(b, a), where b is the parent belief state. We write b* = RESULT(b, a) to highlight the fact that we are estimating/predicting, before any sensing.

The possible-percepts stage:
The possible-percepts stage computes the set of percepts that could be observed in the predicted belief state.

The update stage:
In the update stage, we compute, for each possible percept, the belief state that would result from receiving that percept: the states of b* consistent with it.
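A sketch of the three stages, with `results(s, a)` and `percept(s)` as assumed physical-model helpers:

```python
def predict(b, action, results):
    """Stage 1: the predicted belief state b* = RESULT(b, a),
    before any sensing has happened."""
    return frozenset(s2 for s in b for s2 in results(s, action))

def possible_percepts(b_star, percept):
    """Stage 2: every percept that could be observed in b*."""
    return {percept(s) for s in b_star}

def update(b_star, o, percept):
    """Stage 3: for an actual percept o, keep only the states of b*
    consistent with observing o."""
    return frozenset(s for s in b_star if percept(s) == o)
```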

52
Q

What is the general point about this chapter?

A

We want to search in more complex environments: partially observable, non-deterministic, etc.

53
Q

Are there any cases where you dont care about the path in a search?

A

Yes, when we use the search to find some configuration, for instance a layout for an integrated circuit. In cases like this, we can make use of the fact that we don’t need to remember the path.

54
Q

Are there any advantages of the local search?

A

The local search has the main advantage of requiring very little memory.

Also, it often performs surprisingly well, even in very large state spaces.

55
Q

Elaborate on the objective function in regards to local search

A

An objective function is a function that measures how good a state is. In a local search, the objective function scores each state, and those scores are what the search tries to optimize.

In hill climbing (pictured as a 2D landscape), the objective function measures the height of a state. We say that the search uses the objective function to GUIDE the search. In the case of hill climbing, the objective function is used to test for a maximum; if we are not at a maximum, the search continues.

56
Q

What is meant by the statement that the “objective function will guide the search”

A

Illustrated with hill climbing: the function checks for a local maximum. If we are not at a maximum, the objective function signals this so the search can continue, etc.

Guiding refers to using the objective function to make decisions on whether to continue, where to move next, and so on.

57
Q

What is gradient descent?

A

Gradient descent is hill climbing with the objective reversed: we want to find the lowest point (minimize cost) instead of the highest.

58
Q

What is hill climbing really?

A

Hill climbing is an optimization algorithm. It has very little to do with literally climbing hills or curves; rather, it is a general approach to finding locally optimal states. It is based on the idea of exploring better and better states (according to some objective function) and stopping when we reach the local top. Therefore, a lot of emphasis is placed on 2 things:

1) the objective function

2) The strategy of determining what is good enough of a maxima, should we keep looking etc

59
Q

Say we consider basic hill climbing. Let us say we are at some state S and that there are multiple other states we can reach in one action. How do we progress?

A

We choose the neighboring state with the steepest ascent.

60
Q

What is a “complete state formulation”?

A

A complete-state formulation is a description of a state that includes all components of a solution state, but the components may or may not have the correct values.

61
Q

Is hill climbing greedy or not?

A

Yes, it is GREEDY, because simple hill climbing always grabs the best neighbor. This is actually one of the great things about hill climbing.

62
Q

What is the branching factor of 8-queens?

A

56

7*8=56.

63
Q

Can we use hill-climbing for n-queens?

A

Of course

64
Q

Recall what makes an environment non-deterministic

A

The agent will not know for sure what state it will transition to.

65
Q

Recall belief state

A

Belief state is the set of states an agent considers as possible states when considering some transition.

66
Q

What is the solution to a problem when considering partially observable, non-deterministic problems?

A

It is no longer a sequence of actions, but it will be a conditional plan.

67
Q

Do we make any changes to the transition model and “result function” when considering non-deterministic environments?

A

Instead of using RESULT, we now use RESULTS. RESULTS indicates that there is not a single state that we consider as the result of performing an action, but rather multiple possible new states.

68
Q

In regular search from chapter 3, solutions were sequences of actions. What can we say about the solution when considering a non-deterministic environment?

A

The solution will be a tree (a conditional plan). Every node represents a possible state, and from each of these nodes the plan branches in the same way.

69
Q

elaborate on and-or trees

A

AND-OR trees are search trees for constructing conditional plans.

AND nodes represent the actions together with their possible outcomes.

OR nodes represent the states.

So, if we consider an OR node as the initial node, its children correspond to the possible actions that we can perform from this specific state. Then, for each of the AND nodes representing those actions, the children are the belief state that results from taking the action.

The AND-OR tree will typically branch down until a leaf node is a goal node. OR, in the case of cycles, a branch will terminate with failure, indicating that that particular path is not going anywhere.

70
Q

What is a weakness of AND-OR search?

A

It does not check for repetition of states, which is bad for efficiency.

71
Q

Elaborate on cyclic solutions

A

A cyclic solution is a solution that includes cycles. This can be appropriate if the cause of the non-determinism is some sort of stochastic event, so that retrying an action will eventually work. In these cases, we never actually terminate with failure (if we allow cycles), because all leaf nodes will be goal states.

As a minimum requirement for cyclic solutions to work, every leaf node must be a goal state, and every leaf node must be reachable.

In terms of the conditional plan, it would look sort of like this:

[Suck, while State = 5 do Right, Suck]

This says that while the state remains the same, we try to move right. In other words, we keep trying until we actually manage to move right.

72
Q

Define what we mean by partial observability

A

Partial observability refers to the case where an agent cannot pin down the exact state from its percepts.

In the case of partial, or no, observation of the state, we might use actions to gather information.

73
Q

what is a conformant problem?

A

A conformant problem is a sensorless problem. the agent has no sensors at all.

74
Q

Is there anything good about sensorless agents?

A

They are surprisingly simple, which we can use in our favor. The point is, not all tasks require us to use sensors at all. Often it is enough to have some knowledge about the environment.

75
Q

What is the term for “forcing an agent into a reduced set of possible states”

A

We use “coerce”

76
Q

Define a sensorless problem as a search problem

A

The state space will be the belief-state space. If the physical state space has N states, then the belief-state space has up to 2^N states.

The initial state is typically the belief state consisting of all states of P.

Actions: this is a little tricky. If some actions are dangerous or illegal to perform in some states, we have to be restrictive; as a general rule, we then use the intersection of the actions that are legal in all the states of the belief state. However, if we know that every action is safe (or that illegal actions simply have no effect), we can use the union of all the states’ actions.

Transition model: we distinguish between the deterministic case and the non-deterministic case (where we take the union of RESULTS over the belief state).

77
Q

Describe with one sentence each, what a non-deterministic environment does to an agent, and what a partially observable environment does to an agent

A

If the environment is non-deterministic, the agent will not know which state it transitions to.

If the environment is partially observable, the agent does not know which state it is in, and thus struggles to know the transitions as well.

78
Q

In partially observable, or non-deterministic environments (or both), what is the solution to a problem?

A

A contingency plan. We need to use belief states.

The conditional plan will specify what to do given the possible outcomes. Thus, when the agent is released into the wild, it has the plan ready and knows what to do for every possible percept.

79
Q

IMP: Consider an agent operating in a deterministic, fully observable, known environment. If the environment then changes to a non-deterministic one, what do we need to do with our problem formulation?

What happens to the solution?

A

It is still very much a search problem, but we need to change the transition model. Since non-determinism leads to a transition model with uncertainties, we must reflect this. Therefore, it is no longer RESULT(s, a), but rather RESULTS(s, a).

This also means that the solution must be a conditional plan.

80
Q

When constructing conditional plans as a result of non-determinism, what is actually meant by the “if’s” etc? When do we perform the different possible paths?

A

The entire conditional plan is built in advance but made to be executed during run-time. The point of the IFs and WHILEs etc. is to tell the agent what to do at RUNTIME, depending on the percepts it actually receives.

81
Q

IMP: Elaborate deeply on the AND-OR search

A

It consists of 2 parts:

AND-SEARCH

OR-SEARCH

OR-SEARCH is called on a single state.
AND-SEARCH is called on “a set of states”: the set that results from applying a single action in a state, given by the transition model RESULTS(state, action). So, AND-SEARCH gets called with the arguments (problem, states, path).

The algorithm begins with OR-SEARCH, applied to the root node/state. OR-SEARCH loops through the action set. For each action, it runs AND-SEARCH with the states resulting from that single action.

IMPORTANT: OR-SEARCH returns either: the empty plan if GOAL, failure if a loop, or [action] + [plan].

AND-SEARCH returns either failure (if any sub-plan == failure) or the plan, consisting of nested fuckery of “if this else that”, etc.

82
Q

IMP: What is the solution of a sensorless problem?

A

It is a sequence of actions, NOT a conditional plan. This is because the agent is not able to perceive anything anyway. Therefore, a sensorless problem is concerned with finding a sequence of actions that coerces the belief state into a set containing only goal states.

83
Q

Define a sensorless problem as a search problem

A

Initial state: a belief state. If we know nothing, the belief state is the set of ALL possible states: a very large belief state (it cannot be larger).

Actions: slightly tricky. If we know that some actions are actually ILLEGAL to perform in certain states, we cannot risk using them at all; in that case we consider only the intersection of the actions that are legal in all states. However, if we know that every action is fine to perform anywhere, we can consider the union of all actions at every state.

Transition model: RESULTS(state, action), unioned over the states in the belief state.

Goal test: this is key. The ultimate goal of sensorless problems is to NECESSARILY achieve the goal (achieve it no matter what: coercion). We say that a goal is possibly achieved if the belief state we’re currently in contains a goal state, but this is not enough. We aim to necessarily achieve the goal.

Action cost: also tricky. We don’t know which state we are in, so the cost of an action is difficult to measure. We could assume that all actions cost the same regardless of the transition.
