where $x_u$ is a training instance and the kernel function $K_u(d(x_u, x))$
decreases as the distance $d(x_u, x)$ increases
$k$ is a user-defined constant that specifies the number of
kernel functions
the kernel functions are usually Gaussian, centered at point $x_u$
with some variance $\sigma_u^2$
a common Gaussian kernel function: $K_u(d(x_u, x)) = e^{-d^2(x_u, x) / (2\sigma_u^2)}$
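
A minimal sketch of such a Gaussian kernel, assuming Euclidean distance and the conventional form exp(-d^2 / (2 sigma^2)). The names `euclidean_distance` and `gaussian_kernel` are illustrative and do not come from these notes.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def gaussian_kernel(center, x, sigma):
    """Gaussian kernel centered at `center` with variance sigma**2.

    Assumed form: exp(-d(center, x)**2 / (2 * sigma**2)).
    """
    d = euclidean_distance(center, x)
    return math.exp(-(d ** 2) / (2.0 * sigma ** 2))
```

The kernel equals 1 at its center and falls toward 0 as the query point moves away; a larger `sigma` gives a wider kernel.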
it has been shown that this form can approximate any function with
arbitrarily small error, provided a sufficiently large number $k$ of
kernels is used and the width of each kernel can be specified separately
can be viewed as a two-layer ANN in which each layer is trained separately
how to choose the kernel functions?
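one kernel for each training instance: fits the training data well, but is slow and will overfit
$k$ less than the size of the training set, with centers chosen (1) uniformly spaced,
(2) randomly, or (3) by a clustering algorithm (e.g. the EM algorithm);
the target values have no effect on where the centers are placed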
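
A rough sketch of the first two ways of picking fewer centers than training instances. It assumes that "uniformly" means evenly spaced over the range of a one-dimensional input and that "randomly" means a random subset of the training instances. The function names are hypothetical. Neither function looks at the target values.

```python
import random

def uniform_centers_1d(xs, k):
    """Return k evenly spaced centers across the range of 1-D inputs xs."""
    lo, hi = min(xs), max(xs)
    if k == 1:
        return [(lo + hi) / 2.0]
    step = (hi - lo) / (k - 1)
    return [lo + i * step for i in range(k)]

def random_centers(instances, k, seed=None):
    """Return k distinct training instances chosen at random as centers."""
    rng = random.Random(seed)
    return rng.sample(instances, k)
```

Passing a fixed `seed` makes the random choice repeatable, which helps when comparing different values of `k`.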
trains much more efficiently than a regular ANN, because only one
layer is trained at a time
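
A sketch of training one layer at a time, under stated assumptions: the prediction is a bias weight plus a weighted sum of Gaussian kernel outputs, the centers and a single shared `sigma` are fixed beforehand, and the output weights are then fitted by simple gradient descent on squared error. All names, the learning rate, and the epoch count are illustrative choices, not taken from these notes.

```python
import math

def gaussian(center, x, sigma):
    """Gaussian kernel value for point x, centered at `center`."""
    d2 = sum((c - xi) ** 2 for c, xi in zip(center, x))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def hidden_layer(centers, sigma, x):
    """First layer: a constant bias input followed by one kernel output per center."""
    return [1.0] + [gaussian(c, x, sigma) for c in centers]

def predict(weights, centers, sigma, x):
    """Second layer: linear combination of the first-layer outputs."""
    return sum(w * h for w, h in zip(weights, hidden_layer(centers, sigma, x)))

def train_output_weights(centers, sigma, xs, ys, rate=0.1, epochs=2000):
    """Fit only the output weights; the centers and sigma stay fixed."""
    # Kernel outputs never change during this stage, so compute them once.
    features = [hidden_layer(centers, sigma, x) for x in xs]
    weights = [0.0] * (len(centers) + 1)
    for _ in range(epochs):
        for h, y in zip(features, ys):
            error = y - sum(w * hi for w, hi in zip(weights, h))
            weights = [w + rate * error * hi for w, hi in zip(weights, h)]
    return weights
```

Because the kernel layer is fixed, the second stage is an ordinary linear fitting problem, which is why it is cheap compared with training all layers of a network together.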
Patricia Jean Riddle
Wed Jun 23 13:06:34 NZST 1999