- Train until the error E on the training examples falls below some threshold.
- Why does overfitting tend to occur in later iterations?
- Initially the weights are set to small random numbers. As training
proceeds the weights change to reduce error over the training data, and
the complexity of the decision surface increases; given enough iterations
the network can overfit.
- Weight decay: decrease each weight by a small factor on each
iteration. The intuition is to keep weight values small, biasing learning
against complex decision surfaces (do complex decision surfaces need to
have large weights?). A sketch of this update follows after this list.
- Stop training when the error on the validation set is lowest (see the
early-stopping sketch after this list).
- Keep the current ANN weights and the best-performing weights seen so
far, as measured by error over the validation set.
- Training is terminated once the current weights reach a
significantly higher error over the validation set than the best weights.
- Care must be taken to avoid stopping too soon!
- If the dataset is too small to hold out a validation set, k-fold
cross-validation can be used (remember, use it only to determine the
number of training iterations!), and then train over the whole dataset;
the same applies to decision trees. See the cross-validation sketch
after this list.
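
A minimal sketch of the weight-decay update described above, in Python. The
learning rate, decay factor, and the gradient passed in are assumed values,
not part of the original notes; the point is simply that every weight is
shrunk by a small factor on each iteration, keeping the weights small.

    import numpy as np

    def update_with_weight_decay(weights, gradient, learning_rate=0.1, decay=0.001):
        """One update step: the usual gradient step, then shrink every
        weight by a small factor so that large weights are penalised."""
        weights = weights - learning_rate * gradient   # standard backprop step
        weights = (1.0 - decay) * weights              # weight decay keeps weights small
        return weights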
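A minimal sketch of early stopping against a validation set, assuming
stand-in callables train_one_epoch and validation_error for a real ANN. It
keeps the best-performing weights seen so far and stops once the current
weights have been worse than the best for some number of epochs (the
patience value is an assumption, one way to avoid stopping too soon).

    import copy

    def train_with_early_stopping(network, train_one_epoch, validation_error,
                                  max_epochs=1000, patience=20):
        """Keep the best weights seen so far (by validation error) and stop
        once the current weights stay worse for `patience` epochs."""
        best_error = float("inf")
        best_network = copy.deepcopy(network)
        epochs_since_best = 0
        for epoch in range(max_epochs):
            train_one_epoch(network)            # one pass over the training data
            err = validation_error(network)     # error on the held-out validation set
            if err < best_error:
                best_error = err
                best_network = copy.deepcopy(network)   # remember best weights so far
                epochs_since_best = 0
            else:
                epochs_since_best += 1
                if epochs_since_best >= patience:       # avoid stopping too soon
                    break
        return best_network, best_error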
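A sketch of using k-fold cross-validation only to pick the number of
training epochs, then retraining on the whole dataset. The helpers
make_network, train_epochs, and error_on are hypothetical stand-ins, and
data is assumed to be a NumPy array indexable by fold indices.

    import numpy as np

    def choose_epochs_by_kfold(data, candidate_epochs, make_network,
                               train_epochs, error_on, k=10):
        """Estimate the best epoch count by k-fold CV, then train on all data."""
        folds = np.array_split(np.random.permutation(len(data)), k)
        mean_errors = []
        for n_epochs in candidate_epochs:
            fold_errors = []
            for i in range(k):
                train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
                net = make_network()
                train_epochs(net, data[train_idx], n_epochs)
                fold_errors.append(error_on(net, data[folds[i]]))
            mean_errors.append(np.mean(fold_errors))
        best = candidate_epochs[int(np.argmin(mean_errors))]
        final_net = make_network()              # CV only chose the epoch count;
        train_epochs(final_net, data, best)     # final training uses the whole dataset
        return final_net, best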
Patricia Riddle
Fri May 15 13:00:36 NZST 1998