- The data is generated by a probability distribution that is a mixture of $k$ distinct Normal distributions.
- Each instance is generated by a two-step process:
  - One of the $k$ distributions is chosen at random.
  - A single random instance $x_i$ is generated according to the selected distribution.
- To simplify our discussion, we will assume that the Normal distributions are chosen at each step with uniform probability, that each of the $k$ Normal distributions has the same variance $\sigma^2$, and that $\sigma^2$ is known.
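As a concrete illustration of this two-step generative process, here is a minimal Python sketch; the means, variance, and names below are illustrative choices, not values from the notes:

    import numpy as np

    rng = np.random.default_rng(0)
    mu = np.array([-2.0, 0.0, 3.0])   # illustrative true means of k = 3 Normals
    sigma = 1.0                       # common standard deviation, assumed known

    def generate_instance():
        j = rng.integers(len(mu))         # step 1: choose one of the k Normals uniformly
        return rng.normal(mu[j], sigma)   # step 2: draw a single instance from it

    data = np.array([generate_instance() for _ in range(500)])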
- The learning task is to output a hypothesis $h = \langle \mu_1, \dots, \mu_k \rangle$ which describes the means of the $k$ distributions.
- We would like to find the maximum likelihood hypothesis, i.e., the $h$ that maximizes $p(D \mid h)$.
- Finding the mean of a single Normal distribution is a special case in which the maximum likelihood mean minimizes the sum of squared errors (see the formula below).
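For a single Normal distribution with known variance this is a standard result: writing $m$ for the number of observed instances, the maximum likelihood mean is

\[
\mu_{ML} \;=\; \arg\min_{\mu} \sum_{i=1}^{m} (x_i - \mu)^2 \;=\; \frac{1}{m} \sum_{i=1}^{m} x_i .
\]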
- But we have $k$ different Normal distributions, and we cannot observe which instances were generated by which distribution. This is a prototypical example of a problem involving hidden variables!
- So each instance can be seen as $\langle x_i, z_{i1}, \dots, z_{ik} \rangle$, where $x_i$ is the observed value of the $i$th instance and $z_{ij}$ has the value 1 if $x_i$ was created by the $j$th Normal distribution and 0 otherwise.
- Note that if the hidden values $z_{i1}, \dots, z_{ik}$ were observed, we could use the sum of squared errors formula above instead of EM.
- In a nutshell, EM repeatedly re-estimates the expected values of the $z_{ij}$ given its current hypothesis $\langle \mu_1, \dots, \mu_k \rangle$, then recalculates the maximum likelihood hypothesis using these expected values for the hidden variables.
- This instance of the EM algorithm is:
  - Step 1: Calculate the expected value $E[z_{ij}]$ of each hidden variable $z_{ij}$, assuming the current hypothesis $h = \langle \mu_1, \dots, \mu_k \rangle$ holds.
  - Step 2: Calculate a new maximum likelihood hypothesis $h' = \langle \mu_1', \dots, \mu_k' \rangle$, assuming the value taken on by each hidden variable $z_{ij}$ is its expected value $E[z_{ij}]$ calculated in Step 1. Then replace the hypothesis $h$ by the new hypothesis $h'$ and iterate.
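Written out for this mixture, the two steps take the standard form

\[
E[z_{ij}] \;=\; \frac{p(x = x_i \mid \mu = \mu_j)}{\sum_{n=1}^{k} p(x = x_i \mid \mu = \mu_n)}
          \;=\; \frac{e^{-(x_i - \mu_j)^2 / 2\sigma^2}}{\sum_{n=1}^{k} e^{-(x_i - \mu_n)^2 / 2\sigma^2}}
\]

for Step 1, and

\[
\mu_j' \;\leftarrow\; \frac{\sum_{i=1}^{m} E[z_{ij}]\, x_i}{\sum_{i=1}^{m} E[z_{ij}]}
\]

for Step 2, again writing $m$ for the number of instances.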
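A minimal Python sketch of this loop under the stated assumptions (uniform mixing probabilities, shared known variance); the function and variable names are illustrative rather than from the notes:

    import numpy as np

    def em_k_means(x, k, sigma2, n_iters=50, seed=0):
        # x: 1-D array of observed values x_i; k: number of Normals;
        # sigma2: the known common variance sigma^2.
        rng = np.random.default_rng(seed)
        mu = rng.choice(x, size=k, replace=False)   # initial hypothesis h
        for _ in range(n_iters):
            # Step 1 (E-step): E[z_ij] proportional to exp(-(x_i - mu_j)^2 / 2 sigma^2)
            w = np.exp(-((x[:, None] - mu[None, :]) ** 2) / (2.0 * sigma2))
            ez = w / w.sum(axis=1, keepdims=True)   # normalize over the k distributions
            # Step 2 (M-step): mu_j' = sum_i E[z_ij] x_i / sum_i E[z_ij]
            mu = (ez * x[:, None]).sum(axis=0) / ez.sum(axis=0)
        return mu

    # e.g., on the sample drawn earlier: em_k_means(data, k=3, sigma2=1.0)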
Patricia Riddle
Fri May 15 13:00:36 NZST 1998