- Adding momentum makes the weight update in the nth iteration depend partially on the
update during the (n-1)th iteration
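- $\Delta w_{ji}(n) = \eta \, \delta_j \, x_{ji} + \alpha \, \Delta w_{ji}(n-1)$, where the momentum is represented by $\alpha$, with $0 \le \alpha < 1$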
- Without momentum, the gradient search trajectory is analogous to a momentumless
ball rolling down the error surface; the effect of the momentum term $\alpha$ is to keep the ball rolling in the same
direction from one iteration to the next
- With momentum, the ball can roll through small local minima or
along flat regions in the surface, where it would stop without momentum
- Momentum also causes a gradual increase in the step size in regions
where the gradient is unchanging, thereby speeding convergence
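
The step-size effect described above can be illustrated with a minimal sketch. It is not taken from the original notes: the function names `momentum_descent` and `flat_slope`, the single scalar weight, and the chosen values of `eta` and `alpha` are all assumptions made for illustration. The update is written as the negative gradient scaled by the learning rate plus `alpha` times the previous update, which is the usual gradient-descent reading of the momentum rule. On a surface with a constant gradient, the step should grow from `eta * g` toward `eta * g / (1 - alpha)`.

```python
def momentum_descent(grad, w0, eta=0.1, alpha=0.9, steps=200):
    """Gradient descent on one weight with a momentum term.

    grad  : function returning dE/dw at w (hypothetical, supplied by the caller)
    eta   : learning rate
    alpha : momentum, assumed to satisfy 0 <= alpha < 1
    Returns a list of (w, delta_w) pairs, one per iteration.
    """
    w = w0
    delta_w = 0.0  # there is no previous update before the first iteration
    history = []
    for _ in range(steps):
        # Current update = gradient step + fraction of the previous update.
        delta_w = -eta * grad(w) + alpha * delta_w
        w += delta_w
        history.append((w, delta_w))
    return history


def flat_slope(w):
    """Illustrative error surface with an unchanging gradient of 1."""
    return 1.0


# alpha = 0 recovers plain gradient descent; alpha = 0.9 adds momentum.
plain = momentum_descent(flat_slope, 0.0, alpha=0.0)
heavy = momentum_descent(flat_slope, 0.0, alpha=0.9)

print("final step without momentum:", plain[-1][1])
print("final step with momentum:   ", heavy[-1][1])
```

With these assumed settings the step without momentum stays at `-eta`, while the step with momentum approaches `-eta / (1 - alpha)`, ten times larger, so the weight travels much further in the same number of iterations.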
Patricia Riddle
Fri May 15 13:00:36 NZST 1998