Maximum Likelihood for Predicting Probabilities

we wish to learn an ANN whose output is the probability that the target is 1 (assuming a Boolean target)
(P(x_i) dropped because it is independent of h)
generalization of the Binomial distribution: not all coins have identical probabilities
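
The likelihood these notes describe can be written out as a sketch. The notation is an assumption, since the original formulas did not survive: m training pairs (x_i, d_i) with d_i in {0, 1}, and h(x_i) for the network's output on x_i. Each example acts as its own coin with bias h(x_i):

```latex
\begin{align*}
P(D \mid h) &= \prod_{i=1}^{m} P(x_i, d_i \mid h)
             = \prod_{i=1}^{m} P(d_i \mid h, x_i)\, P(x_i) \\
P(d_i \mid h, x_i) &= h(x_i)^{d_i}\, \bigl(1 - h(x_i)\bigr)^{1 - d_i} \\
h_{ML} &= \arg\max_{h} \prod_{i=1}^{m} h(x_i)^{d_i}\, \bigl(1 - h(x_i)\bigr)^{1 - d_i}
\end{align*}
```

If every h(x_i) were the same value p, the product would reduce to the usual Binomial form p^k (1 - p)^(m - k), with k the number of examples whose target is 1.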
standard trick of using the log of the likelihood (the log is monotonic, so the maximizing hypothesis is unchanged)
similar to the earlier entropy formula, so called cross entropy
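
A minimal Python sketch of the log-likelihood described above. It assumes the network outputs h(x_i) and the Boolean targets d_i are already available as plain lists. The function names, the example values, and the `eps` clipping constant are illustrative choices, not part of the original notes. The quantity computed is the sum over examples of d_i ln h(x_i) + (1 - d_i) ln(1 - h(x_i)), whose negative is the cross entropy.

```python
import math


def likelihood(outputs, targets):
    """Product form: each example contributes h if d == 1, else (1 - h)."""
    p = 1.0
    for h, d in zip(outputs, targets):
        p *= h if d == 1 else (1.0 - h)
    return p


def log_likelihood(outputs, targets, eps=1e-12):
    """Sum of d*ln(h) + (1 - d)*ln(1 - h); its negative is the cross entropy.

    eps is an assumed safeguard that keeps h away from exactly 0 or 1,
    where the logarithm would be undefined.
    """
    total = 0.0
    for h, d in zip(outputs, targets):
        h = min(max(h, eps), 1.0 - eps)
        total += d * math.log(h) + (1 - d) * math.log(1.0 - h)
    return total


# Hypothetical network outputs and Boolean targets for three examples.
outputs = [0.9, 0.2, 0.8]
targets = [1, 0, 1]

ll = log_likelihood(outputs, targets)
# The log of the product form should agree with the summed log form.
print(ll, math.log(likelihood(outputs, targets)))
```

Working with the sum rather than the product avoids numerical underflow when there are many examples, and it is the form that gets differentiated when deriving a gradient rule.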

Patricia Riddle
Fri May 15 13:00:36 NZST 1998