Week 3 - Planning Flashcards
(22 cards)
summary (of slide 1 intuition)
planning is: given an initial state, a goal state, and a set of actions (with preconditions and effects), coming up with a plan (a sequence of actions) that takes you from the initial state to the goal state
to do this we must define the domain:
the objects involved
the relations between objects
the actions (listing their preconditions and effects)
we can then give a specific problem (from that domain) by:
passing an initial state
passing a goal state
a planning algorithm then takes all of this and generates a plan
what does PDDL stand for
Planning Domain Definition Language
what does PDDL let us do
write definitions of a planning domain and a planning problem
what are the 2 files in a PDDL model
domain file
problem file
what is the domain file
a general description of the world:
contains the predicates and actions
and the types of objects involved
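As a sketch, a domain file might look like this (the robot-world domain and all predicate/type names below are made up for illustration, not from the slides):

```pddl
(define (domain robot-world)
  (:requirements :strips :typing)
  (:types robot box location)                    ; types of objects involved
  (:predicates                                   ; relations between objects
    (robot-at ?r - robot ?l - location)
    (box-at ?b - box ?l - location)
    (holding ?r - robot ?b - box))
  (:action move                                  ; an action with precondition + effects
    :parameters (?r - robot ?from ?to - location)
    :precondition (robot-at ?r ?from)
    :effect (and (robot-at ?r ?to) (not (robot-at ?r ?from)))))
```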
what is the problem file
the specific instance (problem) that you are trying to solve:
contains the initial state (lists all the objects of our problem and all the relationships that are true among them)
and the goal state (lists all the relations we want to be true)
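A matching problem file sketch, assuming the same hypothetical robot-world domain (object names are illustrative):

```pddl
(define (problem move-box)
  (:domain robot-world)
  (:objects r1 - robot b1 - box depot dock - location)  ; all objects
  (:init (robot-at r1 depot) (box-at b1 depot))         ; relations true initially
  (:goal (box-at b1 dock)))                             ; relations we want true
```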
limitations of classical planning
actions are instantaneous (time not taken into account)
planner decides what actions to apply in what order ( some orders may not make sense )
plans are sequential (not simultaneous)
actions don’t define quantities or resources, so goals cannot express resource constraints (a resource could be some numerical value like fuel etc)
PDDL 2.1 introduced …
Numbers, resources, costs (“metrics”)
Time
what do fluents allow
checking that a resource is within a limit, and increasing / decreasing resources
what does the requirements tag do
the :requirements tag declares the features we are going to use, e.g. :typing and :fluents
what does typing feature do
typing allows the use of types (letting us create custom types for our objects)
fluents
allow us to define functions that map objects to numbers (usually used to manage resources)
where are metrics defined?
in the problem file, alongside the goal (as a :metric)
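Continuing the hypothetical robot-world sketch, a fuel fluent used in an action and in a metric might look like this (all names illustrative):

```pddl
; in the domain file:
(:functions (fuel ?r - robot))                          ; maps objects to numbers
(:action move
  :parameters (?r - robot ?from ?to - location)
  :precondition (and (robot-at ?r ?from)
                     (> (fuel ?r) 0))                   ; check resource is in limit
  :effect (and (robot-at ?r ?to) (not (robot-at ?r ?from))
               (decrease (fuel ?r) 1)))                 ; decrease the resource

; in the problem file, after the goal:
(:metric maximize (fuel r1))                            ; e.g. finish with most fuel
```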
why does the use of durative actions change the terminology from preconditions to conditions
precondition -> action is instantaneous -> the check occurs only once
condition -> durative action -> checks occur across a span of time
How do durative actions affect conditions and effects
conditions can be required to hold at different times:
ie a condition may need to hold at start
a condition may need to hold at end, or it may need to hold throughout (over all)
for effects though:
you can only have effects at start and effects at end (as a result of a durative action)
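A durative-action sketch, again using the hypothetical robot-world names, showing the condition tags and the two allowed effect tags:

```pddl
(:durative-action move
  :parameters (?r - robot ?from ?to - location)
  :duration (= ?duration 5)
  :condition (and (at start (robot-at ?r ?from))   ; must hold when the action begins
                  (over all (> (fuel ?r) 0)))      ; must hold throughout
  :effect (and (at start (not (robot-at ?r ?from))) ; effect at start
               (at end (robot-at ?r ?to))))         ; effect at end
```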
How does forward chaining work
start with the initial state S
find the applicable actions (actions whose preconditions are satisfied by S) and apply one to get a new successor state S', where S' = S + add effects - delete effects
keep applying actions to successor states until you reach a state that contains all your goal conditions
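The loop above can be sketched in Python as a breadth-first progression search; the tiny robot/box domain and the STRIPS-style set encoding below are made up for illustration:

```python
from collections import deque

# Each action is a (name, preconditions, add_effects, delete_effects) tuple;
# states are sets of facts. This hypothetical domain is not from the slides.
ACTIONS = [
    ("pickup",  {"robot-at-a", "box-at-a"},    {"holding-box"}, {"box-at-a"}),
    ("move-ab", {"robot-at-a"},                {"robot-at-b"},  {"robot-at-a"}),
    ("drop",    {"robot-at-b", "holding-box"}, {"box-at-b"},    {"holding-box"}),
]

def forward_chain(initial, goal, actions):
    """Breadth-first forward chaining from the initial state to the goal."""
    start = frozenset(initial)
    frontier = deque([(start, [])])          # (state, plan so far)
    visited = {start}
    while frontier:
        state, plan = frontier.popleft()
        if goal <= state:                    # state contains all goal facts
            return plan
        for name, pre, add, delete in actions:
            if pre <= state:                 # action is applicable in this state
                succ = frozenset((state - delete) | add)  # S' = S + add - delete
                if succ not in visited:
                    visited.add(succ)
                    frontier.append((succ, plan + [name]))
    return None                              # goal unreachable

print(forward_chain({"robot-at-a", "box-at-a"}, {"box-at-b"}, ACTIONS))
# → ['pickup', 'move-ab', 'drop']
```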
whats the problem with forward chaining
forward chaining (aka progression search) explores states blindly
at each step, it applies all applicable actions, but:
it doesn't know which actions are 'good' actions (ones that bring it closer to the goal), as we don't have a heuristic to guide it towards the goal
this leads to exploring redundant paths, wasting time
also we DON'T WANT TO hand-write a heuristic for each specific problem
what are domain-independent heuristics
heuristics like RPG that guide the planner towards a goal without needing knowledge of the specific domain (and, by extension, problem)
what does RPG do
finds a relaxed plan from S to G by ignoring the delete effects of every action, and uses the length of that relaxed plan as a heuristic
how does it work in depth (RPG)
uses fact layers and action layers
the first fact layer is the current state; RPG creates the first action layer, which contains all actions applicable in that state
fact layer f(n) determines action layer a(n+1), and f(n+1) = f(n) + all add effects of the actions in a(n+1), so the fact layers get bigger and bigger
step by step:
we start with an initial state and a goal state
we list all the facts of the initial state (fact layer 0)
list the first action layer (all actions applicable from fact layer 0)
generate the second fact layer (only applying add effects, never delete effects)
compute the next action layer and generate the next fact layer, until all goal facts are true
how do we get a sequence of actions once we have the goal
we start with goal set g(n), which contains all the problem goals
for each fact in g(n): if it was in the previous fact layer, we add it to g(n-1); otherwise we pick an action from the previous action layer that adds it, and add that action's preconditions to g(n-1)
the number of actions used during this process is the heuristic value
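The layer building and extraction can be sketched together in Python, reusing the same hypothetical STRIPS-style encoding as before; this is a simplified sketch (it may count an action once per fact it achieves, where a careful implementation counts each chosen action once per layer):

```python
# Each action is a (name, preconditions, add_effects, delete_effects) tuple.
# This hypothetical robot/box domain is for illustration only.
ACTIONS = [
    ("pickup",  {"robot-at-a", "box-at-a"},    {"holding-box"}, {"box-at-a"}),
    ("move-ab", {"robot-at-a"},                {"robot-at-b"},  {"robot-at-a"}),
    ("drop",    {"robot-at-b", "holding-box"}, {"box-at-b"},    {"holding-box"}),
]

def rpg_heuristic(state, goal, actions):
    """Relaxed planning graph heuristic: build layers ignoring deletes,
    then regress the goals through the layers, counting achiever actions."""
    fact_layers = [frozenset(state)]
    action_layers = []
    # Build fact/action layers until all goal facts appear (or growth stops).
    while not goal <= fact_layers[-1]:
        applicable = [a for a in actions if a[1] <= fact_layers[-1]]
        new_facts = fact_layers[-1]
        for a in applicable:
            new_facts = new_facts | a[2]     # add effects only, no deletes
        if new_facts == fact_layers[-1]:
            return None                      # goal unreachable even when relaxed
        action_layers.append(applicable)
        fact_layers.append(new_facts)
    # Extract a relaxed plan: regress goals back through the layers.
    goals = {len(fact_layers) - 1: set(goal)}
    count = 0
    for n in range(len(fact_layers) - 1, 0, -1):
        goals.setdefault(n - 1, set())
        for fact in goals[n]:
            if fact in fact_layers[n - 1]:
                goals[n - 1].add(fact)       # fact already true one layer earlier
            else:
                # pick an action from the previous action layer that adds it
                achiever = next(a for a in action_layers[n - 1] if fact in a[2])
                count += 1
                goals[n - 1] |= achiever[1]  # its preconditions become subgoals
    return count                             # heuristic = number of actions used

print(rpg_heuristic({"robot-at-a", "box-at-a"}, {"box-at-b"}, ACTIONS))  # → 3
```

Here the relaxed plan happens to have the same length as the real plan (3 actions); in general, ignoring deletes makes the heuristic optimistic.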