 
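-  Making the weight update in the nth iteration depend partially on the
update during the (n-1)th iteration:
 -  $\Delta w_{ji}(n) = \eta \, \delta_j \, x_{ji} + \alpha \, \Delta w_{ji}(n-1)$,
where the momentum is represented by $\alpha$, with $0 \le \alpha < 1$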
  -  the plain gradient search trajectory is analogous to a momentumless
ball rolling down the error surface; the effect of the momentum term $\alpha$
is to add momentum that keeps the ball rolling in the same
direction from one iteration to the next
  -  with momentum, the ball can roll through small local minima or
along flat regions in the surface where it would otherwise stop
 -  It also gradually increases the step size in regions
where the gradient is unchanging, thereby speeding convergence
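
The step-size growth described above can be illustrated with a minimal sketch. It assumes a single scalar weight and writes the update as "minus learning rate times gradient, plus momentum times the previous update"; the function name momentum_updates and the values eta = 0.1 and alpha = 0.9 are hypothetical choices for illustration, not taken from these notes. With a constant gradient g, the updates form a geometric series whose magnitude grows toward eta * g / (1 - alpha).

```python
def momentum_updates(gradients, eta=0.1, alpha=0.9):
    """Return the weight update applied at each iteration.

    gradients: sequence of error-gradient values for one weight
    eta:       learning rate (illustrative value)
    alpha:     momentum term, assumed to satisfy 0 <= alpha < 1
    """
    updates = []
    previous = 0.0  # no update exists before the first iteration
    for g in gradients:
        # current update = plain gradient step + fraction of the last update
        delta = -eta * g + alpha * previous
        updates.append(delta)
        previous = delta
    return updates


# A region where the gradient is unchanging (constant value 1.0).
steps = momentum_updates([1.0] * 200, eta=0.1, alpha=0.9)

# The step magnitude grows from eta * g toward eta * g / (1 - alpha).
limit = -0.1 * 1.0 / (1.0 - 0.9)
print(steps[0], steps[1], steps[-1], limit)
```

Setting alpha to 0 in this sketch recovers the plain gradient step, so every update has the same size eta * g; with alpha = 0.9 the limiting step is ten times larger, which is the speed-up the last bullet refers to.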
 
 
Patricia Riddle 
Fri May 15 13:00:36 NZST 1998