- practical method for learning real-valued and vector-valued
functions over continuous and discrete-valued attributes
- robust to noise in the training data
- the Backpropagation (Backprop) algorithm is the most common network learning method
- hypothesis space: all functions that can be represented by
assigning weights to fixed network of interconnected units
- feedforward networks containing 3 layers can approximate any
function to arbitrary accuracy, given a sufficient number of units in
each layer
- networks of practical size are capable of representing a rich
space of highly nonlinear functions
- Backprop searches the space of possible hypotheses using
gradient descent (GD), iteratively reducing the network's error on the
training data (see the first sketch after this list).
- GD converges to a local minimum (not necessarily the global minimum)
of the training error with respect to the network weights.
- Backprop has the ability to invent new features that are not
explicit in the input
- hidden units of multilayer networks learn to represent
intermediate features (e.g., face recognition)
- Overfitting the training data is an important issue (arguably
encouraged by focusing too narrowly on training-set accuracy).
- Cross-validation can be used to estimate an appropriate stopping
point for gradient descent (see the second sketch after this list).
- Many other algorithms and extensions.
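
As noted in the gradient-descent bullet above, here is a minimal sketch of
Backprop-style gradient descent for a small two-layer sigmoid network. The
2-4-1 architecture, learning rate, iteration count, and toy XOR data are
illustrative assumptions, not details from these notes.

    import numpy as np

    # Minimal two-layer (one hidden layer) sigmoid network trained by
    # batch gradient descent.  Architecture, learning rate, and the toy
    # XOR data set are assumptions made for illustration only.

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # inputs
    y = np.array([[0], [1], [1], [0]], dtype=float)              # targets

    W1 = rng.normal(scale=0.5, size=(2, 4))   # input  -> hidden weights
    b1 = np.zeros((1, 4))
    W2 = rng.normal(scale=0.5, size=(4, 1))   # hidden -> output weights
    b2 = np.zeros((1, 1))
    eta = 0.5                                  # learning rate

    for epoch in range(20000):
        # forward pass
        h = sigmoid(X @ W1 + b1)               # hidden-unit activations
        o = sigmoid(h @ W2 + b2)               # network output

        # backward pass: propagate error terms toward the inputs
        delta_o = (o - y) * o * (1 - o)            # output-layer error term
        delta_h = (delta_o @ W2.T) * h * (1 - h)   # hidden-layer error term

        # gradient-descent weight updates (reduce squared error)
        W2 -= eta * h.T @ delta_o
        b2 -= eta * delta_o.sum(axis=0, keepdims=True)
        W1 -= eta * X.T @ delta_h
        b1 -= eta * delta_h.sum(axis=0, keepdims=True)

    # after training, outputs should typically approach the XOR targets 0, 1, 1, 0
    o = sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)
    print(np.round(o.ravel(), 2))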
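
And as referenced in the cross-validation bullet, a sketch of estimating the
stopping point for gradient descent with a held-out validation set. The
train_step and validation_error callables, the patience threshold, and the
epoch limit are hypothetical placeholders, not part of these notes.

    def early_stopping_train(train_step, validation_error,
                             max_epochs=1000, patience=20):
        """Run gradient descent, keeping the weights with the lowest
        validation error; stop once no improvement has been seen for
        `patience` epochs (an estimate of the appropriate stopping point)."""
        best_error = float("inf")
        best_weights = None
        epochs_without_improvement = 0

        for epoch in range(max_epochs):
            weights = train_step()              # one GD pass over the training set
            error = validation_error(weights)   # error on the held-out examples

            if error < best_error:
                best_error = error
                best_weights = weights
                epochs_without_improvement = 0
            else:
                epochs_without_improvement += 1
                if epochs_without_improvement >= patience:
                    break                        # validation error stopped improving

        return best_weights, best_error

Returning the weights from the epoch with the lowest validation error, rather
than the final weights, is what makes this an early-stopping rule instead of a
fixed iteration budget.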