- Learning - we want the best hypothesis h from some space H,
given the observed training data D. Best can be defined as most
probable given the data plus any initial knowledge about prior
probabilities of the various hypotheses in H.
- This is a direct method!!! (no search)
- P(h) - the initial probability that hypothesis h holds
before we observe the training data - the prior probability - if we
have no prior knowledge we assign the same initial probability to them
all (it is trickier than this!!)
- P(D) - the prior probability that training data D will be observed
given no knowledge about which hypothesis holds
- P(D|h) - the probability of observing data D given that
hypothesis h holds
- P(h|D) - the probability that h holds given the training
data D - the posterior probability
- Bayes Theorem - P(h|D) = P(D|h)P(h) / P(D)
- the posterior probability P(h|D) increases with P(h) and P(D|h) and decreases
with P(D) - this last is not true of many other scoring
functions!
- so we want a maximum a posteriori hypothesis (MAP) -
h_MAP = argmax_{h in H} P(h|D) = argmax_{h in H} P(D|h)P(h) -
we can drop P(D) from the argmax because it is the same for every
hypothesis (see the sketch after this list)
- if we assume every hypothesis is equally likely a priori then we
want the maximum likelihood hypothesis (ML) - h_ML = argmax_{h in H} P(D|h)
- Bayes theorem is more general than Machine Learning!!
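A minimal Python sketch of the MAP and ML calculations above, over a made-up
three-hypothesis space; the prior and likelihood numbers are illustrative
assumptions, not values from these notes:

# Hypothetical priors P(h) and likelihoods P(D|h) over a toy
# hypothesis space {h1, h2, h3}; the numbers are illustrative only.
priors = {"h1": 0.3, "h2": 0.5, "h3": 0.2}
likelihoods = {"h1": 0.8, "h2": 0.3, "h3": 0.9}

# P(D): probability of the data, marginalised over all hypotheses.
p_data = sum(likelihoods[h] * priors[h] for h in priors)

# Bayes theorem: P(h|D) = P(D|h) * P(h) / P(D)
posteriors = {h: likelihoods[h] * priors[h] / p_data for h in priors}

# MAP hypothesis: argmax over h of P(D|h) * P(h); P(D) is the same
# for every h, so it drops out of the argmax.
h_map = max(priors, key=lambda h: likelihoods[h] * priors[h])

# ML hypothesis: argmax over h of P(D|h) alone - equal to the MAP
# hypothesis whenever the prior is uniform.
h_ml = max(priors, key=lambda h: likelihoods[h])

print(posteriors)  # h1: ~0.421, h2: ~0.263, h3: ~0.316
print(h_map)       # h1 - its larger prior outweighs h3's higher likelihood
print(h_ml)        # h3 - the prior is ignored

Note that the MAP and ML hypotheses differ here: h3 fits the data best, but
once the prior is taken into account h1 wins, which is exactly the effect of
the P(h) term in Bayes theorem.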
Patricia Riddle
Fri May 15 13:00:36 NZST 1998