- Entropy (from information theory)
- measures the impurity of an arbitrary collection of examples
- For a boolean classification,
  $Entropy(S) = -p_\oplus \log_2 p_\oplus - p_\ominus \log_2 p_\ominus$,
  where $p_\oplus$ is the proportion of positive examples in $S$ and $p_\ominus$ is the proportion of negative examples in $S$.
- In all calculations involving entropy we define $0 \log 0$ to be $0$.
- If all members of $S$ belong to the same class, $Entropy(S) = 0$.
- If $S$ contains an equal number of positive and negative instances, $Entropy(S) = 1$ (both special cases are checked in the sketch below).
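A minimal Python sketch of the two-class case (not part of the original notes; the function name `boolean_entropy` is illustrative), confirming both special cases:

```python
import math

def boolean_entropy(p_pos):
    """Entropy of a boolean-labelled sample, given the proportion
    p_pos of positive examples (the negative proportion is 1 - p_pos)."""
    entropy = 0.0
    for p in (p_pos, 1.0 - p_pos):
        if p > 0.0:          # by convention, 0 log 0 = 0
            entropy -= p * math.log2(p)
    return entropy

print(boolean_entropy(1.0))  # all members in one class -> 0.0
print(boolean_entropy(0.5))  # equal positives/negatives -> 1.0
```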
- Entropy specifies the minimum number of bits of information
needed to encode the classification of an arbitrary member of $S$.
- More generally, if the target attribute can take on $c$ different values,
  $Entropy(S) = \sum_{i=1}^{c} -p_i \log_2 p_i$,
  where $p_i$ is the proportion of $S$ belonging to class $i$.
- For example, if there are 4 classes and the set is split evenly, 2
bits are needed to encode the classification of an arbitrary member
of $S$. If the split is less even, an average message length of less
than 2 bits suffices (see the sketch below).
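A short Python sketch of the general $c$-class formula (again illustrative, assuming the class proportions sum to 1), checking the 4-class example:

```python
import math

def entropy(proportions):
    """General entropy over c classes; `proportions` holds the
    class proportions p_i, which are assumed to sum to 1."""
    return -sum(p * math.log2(p) for p in proportions if p > 0.0)

# Four classes split evenly: 2 bits per classification.
print(entropy([0.25, 0.25, 0.25, 0.25]))   # -> 2.0

# A less even split needs fewer than 2 bits on average.
print(entropy([0.5, 0.25, 0.125, 0.125]))  # -> 1.75
```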