- What is the most probable classification of the new instance
given the training data?
- We could just apply the MAP hypothesis, but we can do better!
- Intuition: Assume three hypotheses $h_1$, $h_2$, $h_3$ whose
posterior probabilities are .4, .3 and .3 respectively. Thus $h_1$ is
the MAP hypothesis. Suppose we have a new instance $x$ which is
classified positive by $h_1$ and negative by $h_2$ and $h_3$.
Taking all hypotheses into account, the probability that $x$ is
positive is .4 and the probability it is negative is .6. The most
probable classification (negative) is different from the
classification given by the MAP hypothesis!
- We want to combine the predictions of all hypotheses weighted by
their posterior probabilities.
- $P(v_j \mid D) = \sum_{h_i \in H} P(v_j \mid h_i)\, P(h_i \mid D)$, where
$v_j$ is from the set of classifications $V$
- Bayes Optimal Classification:
$v_{OB} = \operatorname{argmax}_{v_j \in V} \sum_{h_i \in H} P(v_j \mid h_i)\, P(h_i \mid D)$
(see the sketch after this list)
- No other learner using the same hypothesis space and same prior
knowledge can outperform this method on average. It maximizes the
probability that the new instance is classified correctly.
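
As a concrete illustration, the following is a minimal Python sketch of Bayes
optimal classification. It assumes each hypothesis deterministically assigns a
label, so $P(v_j \mid h_i)$ is 1 or 0, and that the posteriors $P(h_i \mid D)$
are already known; the function and variable names are illustrative only, not
part of the notes. It reproduces the .4/.6 example above.

    # A sketch of Bayes optimal classification, assuming hypotheses are
    # functions mapping an instance to a class label and that their
    # posterior probabilities P(h_i | D) are given.

    def bayes_optimal(instance, hypotheses, posteriors, classes):
        """Return the class v_j maximizing sum_i P(v_j | h_i) P(h_i | D).

        P(v_j | h_i) is taken to be 1 if h_i classifies the instance as
        v_j and 0 otherwise (deterministic hypotheses), as in the example.
        """
        scores = {v: 0.0 for v in classes}
        for h, p in zip(hypotheses, posteriors):
            scores[h(instance)] += p      # weight each vote by P(h_i | D)
        return max(scores, key=scores.get), scores

    # The three-hypothesis example from above: posteriors .4, .3, .3,
    # where h1 says positive and h2, h3 say negative.
    h1 = lambda x: "positive"
    h2 = lambda x: "negative"
    h3 = lambda x: "negative"

    label, scores = bayes_optimal(None, [h1, h2, h3], [0.4, 0.3, 0.3],
                                  ["positive", "negative"])
    print(scores)   # {'positive': 0.4, 'negative': 0.6}
    print(label)    # 'negative' -- differs from the MAP hypothesis h1

Summing the posterior weights of the hypotheses voting for each label gives
.4 for positive and .6 for negative, so the Bayes optimal classification is
negative even though the MAP hypothesis alone would say positive.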
Patricia Riddle
Fri May 15 13:00:36 NZST 1998