- We still need a way to learn $Q$: iterative approximation based on a recursive definition
- $V^*(s) = \max_{a'} Q(s, a')$, so $Q(s, a) = r(s, a) + \gamma \max_{a'} Q(\delta(s, a), a')$
- $\hat{Q}$, the learner's estimate of $Q$, is stored in
a big table, which is initially filled with random values or zeros
- The agent starts in some state, $s$, chooses some action,
$a$, and observes the resulting reward, $r = r(s, a)$, and the new state, $s' = \delta(s, a)$
- It then updates the table: $\hat{Q}(s, a) \leftarrow r + \gamma \max_{a'} \hat{Q}(s', a')$
- The agent doesn't need to know the functions $\delta$ or $r$; it just executes
the action and observes $s'$ and $r$, and so is just sampling
these functions at the current values of $s$ and $a$
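
The table update described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four-state chain world, the reward of 100 for reaching the goal, the discount factor of 0.9, the random action choice, and the names `step` and `q_learning` are all hypothetical and not taken from these notes. The world is deterministic, so each update simply overwrites the table entry.

```python
import random
from collections import defaultdict

GAMMA = 0.9                  # discount factor (assumed value)
ACTIONS = ["left", "right"]  # hypothetical action set
GOAL = 3                     # hypothetical goal state in a chain 0..3


def step(s, a):
    """Hypothetical environment. The learner never inspects this code;
    it only sees the reward and next state returned for the current s, a."""
    s_next = min(s + 1, GOAL) if a == "right" else max(s - 1, 0)
    r = 100 if s_next == GOAL else 0
    return r, s_next


def q_learning(episodes=200, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)  # the table of estimates, initially all zero
    for _ in range(episodes):
        s = rng.randrange(GOAL)  # start in some non-goal state
        while s != GOAL:
            a = rng.choice(ACTIONS)   # choose some action
            r, s_next = step(s, a)    # observe reward and new state
            # Update the table entry for (s, a) from the observed sample.
            q[(s, a)] = r + GAMMA * max(q[(s_next, b)] for b in ACTIONS)
            s = s_next
    return q


if __name__ == "__main__":
    table = q_learning()
    for s in range(GOAL):
        print(s, {a: round(table[(s, a)], 1) for a in ACTIONS})
```

With enough episodes the entries for moving right from states 2, 1, and 0 should settle near 100, 90, and 81, each one a factor of 0.9 smaller than the best entry of the state it leads to. The learner only ever calls `step` at the state and action it is currently in, which is the sampling idea in the last bullet.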
Patricia Riddle
Fri May 15 13:00:36 NZST 1998