Next:
Grid-world
Up:
Reinforcement Learning
Previous:
Learning Task
Optimal Policy
We require the agent to learn the optimal policy,
whose value function is
or for simplicity
Patricia Jean Riddle
Wed Jun 23 13:06:34 NZST 1999