- Entropy (from information theory) measures the impurity of an arbitrary collection of examples $S$.
- For a boolean classification,
  $Entropy(S) = -p_{+}\log_2 p_{+} - p_{-}\log_2 p_{-}$
  where $p_{+}$ is the proportion of positive examples in $S$ and $p_{-}$ is the proportion of negative examples in $S$.
- In all calculations involving entropy we define $0\log 0$ to be 0.
- If all members of $S$ are in the same class, $Entropy(S) = 0$.
- If there is an equal number of positive and negative instances in $S$, then $Entropy(S) = 1$. (Both cases are checked in the sketch below.)
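A minimal Python sketch of the boolean entropy calculation above (the function name boolean_entropy is illustrative, not from the notes), using the convention that $0\log 0 = 0$ and verifying the two special cases:

    import math

    def boolean_entropy(p_pos):
        # Entropy of a boolean classification, given the proportion
        # p_pos of positive examples; the negative proportion is 1 - p_pos.
        total = 0.0
        for p in (p_pos, 1.0 - p_pos):
            if p > 0:                     # convention: 0 log 0 = 0
                total -= p * math.log2(p)
        return total

    print(boolean_entropy(1.0))  # all members in the same class -> 0.0
    print(boolean_entropy(0.5))  # equal positives and negatives -> 1.0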
- Entropy specifies the minimum number of bits of information needed to encode the classification of an arbitrary member of $S$.
- More generally, for $c$ classes,
  $Entropy(S) = \sum_{i=1}^{c} -p_i \log_2 p_i$
  where $p_i$ is the proportion of $S$ belonging to class $i$.
- For example, if there are 4 classes and the set is split evenly, 2 bits will be needed to encode the classification of an arbitrary member of $S$. If the split is less even, an average message length of less than 2 bits can be used, as the sketch below illustrates.
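A sketch of the general formula (again in Python, with an illustrative function name) that reproduces this example: an even 4-way split costs 2 bits, while an uneven split averages fewer:

    import math

    def entropy(proportions):
        # Sum over classes of -p_i * log2(p_i); classes with p_i = 0
        # are skipped so that 0 log 0 contributes 0, by convention.
        return -sum(p * math.log2(p) for p in proportions if p > 0)

    print(entropy([0.25] * 4))                  # even 4-way split   -> 2.0
    print(entropy([0.5, 0.25, 0.125, 0.125]))   # less even split    -> 1.75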
Patricia Riddle
Fri May 15 13:00:36 NZST 1998