- distance-weighted k-nearest neighbor is a highly effective
algorithm for many practical problems
- robust to noisy data when the training set is sufficiently large
- its inductive bias is the assumption that the classification of an
instance will be most similar to the classification of other
instances that are nearby in Euclidean distance
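The classifier described above can be sketched as follows (a minimal illustration; the function name and the toy data are mine, not from the notes):

```python
import math
from collections import defaultdict

def distance_weighted_knn(query, examples, k=3):
    """Classify `query` from labeled `examples` = [(point, label), ...].

    Each of the k nearest neighbors votes with weight 1/d^2, so closer
    neighbors count more; an exact match (d == 0) decides outright.
    """
    # Sort training examples by Euclidean distance to the query.
    by_dist = sorted(examples, key=lambda ex: math.dist(query, ex[0]))
    votes = defaultdict(float)
    for point, label in by_dist[:k]:
        d = math.dist(query, point)
        if d == 0.0:                 # exact match: return its label
            return label
        votes[label] += 1.0 / d**2   # inverse-square distance weighting
    return max(votes, key=votes.get)

# Two small clusters around (0,0) and (5,5)
train = [((0, 0), "a"), ((0.5, 0.4), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((4.6, 5.2), "b"), ((5, 4), "b")]
print(distance_weighted_knn((0.8, 0.3), train, k=3))  # -> a
```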
- because distance is calculated over all attributes, irrelevant
attributes are a problem - the curse of dimensionality
- some approaches weight each attribute differently to overcome this,
which amounts to stretching the axes of the Euclidean space - the
weights can be determined automatically using cross-validation
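Stretching the space is just a per-attribute weight inside the distance calculation; a weight of zero removes an irrelevant attribute entirely. A minimal sketch (the function name is mine; in practice the weights would be chosen by cross-validation, as the notes say):

```python
import math

def weighted_dist(x, y, weights):
    """Euclidean distance with one stretch factor per attribute.

    A weight of 0 drops that attribute entirely; a weight > 1
    stretches its axis so differences on it matter more.
    """
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))

x, y = (1.0, 2.0, 10.0), (1.0, 5.0, -3.0)
print(weighted_dist(x, y, (1, 1, 1)))  # plain Euclidean distance
print(weighted_dist(x, y, (1, 1, 0)))  # third (irrelevant) attribute ignored -> 3.0
```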
- alternatively, eliminate the least relevant attributes entirely -
leave-one-out cross-validation has been used for this, and it is
ideal for IBL because there is no model to retrain for each
held-out instance
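The reason leave-one-out cross-validation is so cheap for an instance-based learner can be seen directly in code: for each held-out example we just search the remaining examples, with no retraining step. A sketch scoring a subset of attributes by 1-NN error (function name and toy data are illustrative, not from the notes):

```python
import math

def loocv_error(examples, attrs):
    """Leave-one-out error of 1-NN using only the attributes in `attrs`.

    No retraining is needed: each held-out example is classified by
    searching the remaining examples, which is why LOOCV suits IBL.
    """
    def dist(x, y):
        return math.dist([x[i] for i in attrs], [y[i] for i in attrs])

    errors = 0
    for i, (x, label) in enumerate(examples):
        rest = examples[:i] + examples[i + 1:]
        _, nearest_label = min(rest, key=lambda ex: dist(x, ex[0]))
        errors += nearest_label != label
    return errors / len(examples)

# Attribute 0 separates the classes; attribute 1 is random noise.
data = [((0.0, 7.1), "a"), ((0.2, 1.3), "a"), ((0.1, 9.9), "a"),
        ((5.0, 2.2), "b"), ((5.3, 8.4), "b"), ((4.9, 0.5), "b")]
print(loocv_error(data, attrs=[0]))  # relevant attribute only -> 0.0
print(loocv_error(data, attrs=[1]))  # noise attribute only: high error
```

Scoring each candidate attribute subset this way and keeping the subset with the lowest error implements the elimination idea above.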
- an axis could also be stretched locally, but that introduces more
degrees of freedom and hence a greater risk of overfitting, so it
is much less common
- efficient indexing of the stored instances, so that nearest
neighbors can be found without scanning the whole training set, can
be done with kd-trees
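A kd-tree splits the instances on one coordinate axis per level, so a nearest-neighbor search can prune whole subtrees that cannot contain a closer point. A compact sketch (my own dict-based representation, returning the single nearest point):

```python
import math

def build_kd(points, depth=0):
    """Build a kd-tree: split on one coordinate axis per level."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {"point": points[mid], "axis": axis,
            "left": build_kd(points[:mid], depth + 1),
            "right": build_kd(points[mid + 1:], depth + 1)}

def nearest(node, query, best=None):
    """Return the stored point nearest to `query`, pruning subtrees
    that cannot contain anything closer than the current best."""
    if node is None:
        return best
    if best is None or math.dist(query, node["point"]) < math.dist(query, best):
        best = node["point"]
    diff = query[node["axis"]] - node["point"][node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than `best`.
    if abs(diff) < math.dist(query, best):
        best = nearest(far, query, best)
    return best

tree = build_kd([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])
print(nearest(tree, (9, 2)))  # -> (8, 1)
```

Building the tree costs O(n log n), and a search typically touches only O(log n) nodes, compared with O(n) for a linear scan over the training set.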
Patricia Jean Riddle
Wed Jun 23 13:06:34 NZST 1999