 
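- We still need a way to learn $Q$
  - iterative approximation or recursive definition: $V^*(s) = \max_{a'} Q(s, a')$, so
    $Q(s, a) = r(s, a) + \gamma \max_{a'} Q(\delta(s, a), a')$
- $\hat{Q}$, the learner's estimate of $Q$, is stored in a big table, which is initially filled with random values or zeros
- The agent starts in some state, $s$, chooses some action, $a$, and observes the resulting reward, $r$, and the new state, $s'$
- It then updates the table: $\hat{Q}(s, a) \leftarrow r + \gamma \max_{a'} \hat{Q}(s', a')$
- The agent doesn't need to know the functions $\delta$ or $r$; it just executes the action and observes $s'$ and $r$, so it is simply sampling these functions at the current values of $s$ and $a$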
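The table update described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the course's own code: the five-state corridor, the `step` function, the reward of 100 for reaching the goal, and the discount factor 0.9 are all hypothetical choices made for the example. The learner never inspects how `step` works; it only calls it and observes the reward and the new state.

```python
import random

GAMMA = 0.9            # assumed discount factor
N_STATES = 5           # assumed toy world: states 0..4 in a corridor
GOAL = N_STATES - 1    # state 4 is an absorbing goal state
ACTIONS = (-1, +1)     # move left / move right


def step(s, a):
    """Hypothetical environment, playing the role of r and delta.

    Returns (reward, next_state). The learner treats this as a black box.
    """
    s_next = min(max(s + a, 0), GOAL)           # walls at both ends
    reward = 100 if s_next == GOAL and s != GOAL else 0
    return reward, s_next


def q_learning(episodes=500, seed=0):
    rng = random.Random(seed)
    # The "big table": one entry per (state, action), initially zero.
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s = rng.randrange(GOAL)                 # start in a random non-goal state
        while s != GOAL:
            a = rng.choice(ACTIONS)             # choose some action
            r, s_next = step(s, a)              # observe reward and new state
            # Deterministic update: Q(s,a) <- r + gamma * max_a' Q(s',a')
            q[(s, a)] = r + GAMMA * max(q[(s_next, b)] for b in ACTIONS)
            s = s_next
    return q


if __name__ == "__main__":
    table = q_learning()
    for s in range(N_STATES):
        print(s, [round(table[(s, a)], 2) for a in ACTIONS])
```

Under these assumptions the entries for moving right should settle at 100 next to the goal and shrink by a factor of 0.9 for each extra step away from it (90, 81, 72.9). Actions are chosen uniformly at random here purely to keep the sketch short; any strategy that keeps visiting every state-action pair would serve the same purpose.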
 
Patricia Riddle 
Fri May 15 13:00:36 NZST 1998