- still need
- iterative approximation or recursive definition: $V^*(s) = \max_{a'} Q(s, a')$, so
  $Q(s, a) = r(s, a) + \gamma \max_{a'} Q(\delta(s, a), a')$
- $\hat{Q}$, the learner's estimate of $Q$, is stored in a big table, which is initially filled with random values or zeros
- The agent starts in some state, $s$, chooses some action, $a$, and observes the resulting reward, $r$, and the new state, $s'$
- It then updates the table: $\hat{Q}(s, a) \leftarrow r + \gamma \max_{a'} \hat{Q}(s', a')$
- The agent doesn't need to know the functions $\delta$ or $r$; it just executes the action and observes $s'$ and $r$, so it is just sampling these functions at the current values of $s$ and $a$
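
The table update described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the algorithm as given in these notes: the four-state corridor, the reward of 100 for reaching the goal, the discount factor of 0.9, and the names `step`, `q_update` and `train` are all hypothetical. The learner never inspects the transition or reward function; it only calls `step` and uses the reward and new state that come back.

```python
import random

GAMMA = 0.9                    # discount factor (assumed value)
ACTIONS = ("left", "right")    # hypothetical action set
GOAL = 3                       # hypothetical corridor: states 0..3, state 3 is the goal


def step(state, action):
    """Hypothetical deterministic environment; returns (reward, next_state)."""
    if action == "left":
        next_state = max(0, state - 1)
    else:
        next_state = min(GOAL, state + 1)
    reward = 100 if next_state == GOAL else 0
    return reward, next_state


def q_update(q, s, a, r, s_next):
    """Overwrite one table entry from a single observed (s, a, r, s') sample."""
    q[(s, a)] = r + GAMMA * max(q[(s_next, a2)] for a2 in ACTIONS)


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    # The table starts filled with zeros.
    q = {(s, a): 0.0 for s in range(GOAL + 1) for a in ACTIONS}
    for _ in range(episodes):
        s = rng.randrange(GOAL)          # start in some non-goal state
        while s != GOAL:                 # an episode ends at the goal
            a = rng.choice(ACTIONS)      # choose some action (here: at random)
            r, s_next = step(s, a)       # execute it and observe r and s'
            q_update(q, s, a, r, s_next)
            s = s_next
    return q
```

Calling `train()` returns the filled table. In this assumed corridor, the entries for moving right from states 2, 1 and 0 should settle at 100, 90 and 81, each one the discounted value of the entry one step closer to the goal.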
Patricia Riddle
Fri May 15 13:00:36 NZST 1998