
ENTROPY MEASURES HOMOGENEITY OF EXAMPLES

To define information gain, we begin by defining a measure called entropy. Entropy measures the impurity of a collection of examples.

Given a collection S, containing positive and negative examples of some target concept, the entropy of S relative to this Boolean classification is

    Entropy(S) = − p+ log2(p+) − p− log2(p−)

where,

p+ is the proportion of positive examples in S

p− is the proportion of negative examples in S.

(In all calculations involving entropy we define 0 log2 0 to be 0.)


Example:

Suppose S is a collection of 14 examples of some Boolean concept, including 9 positive and 5 negative examples (summarized by the notation [9+, 5−]). Then the entropy of S relative to this Boolean classification is

    Entropy([9+, 5−]) = − (9/14) log2(9/14) − (5/14) log2(5/14) = 0.940

  • The entropy is 0 if all members of S belong to the same class
  • The entropy is 1 when the collection contains an equal number of positive and negative examples
  • If the collection contains unequal numbers of positive and negative examples, the entropy is between 0 and 1
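As a quick check of these properties, here is a minimal Python sketch of the entropy formula above (the function name entropy and its two-count signature are our own choices, not part of any standard library):

```python
import math

def entropy(pos, neg):
    """Entropy of a collection with pos positive and neg negative examples."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        if count > 0:              # treat 0 * log2(0) as 0
            p = count / total
            result -= p * math.log2(p)
    return result

print(entropy(9, 5))    # ~0.940: the [9+, 5-] collection above
print(entropy(7, 7))    # 1.0: equal numbers of positive and negative examples
print(entropy(14, 0))   # 0.0: all members belong to the same class
```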

INFORMATION GAIN MEASURES THE EXPECTED REDUCTION IN ENTROPY

  • Information gain is the expected reduction in entropy caused by partitioning the examples according to a given attribute.
  • The information gain, Gain(S, A), of an attribute A relative to a collection of examples S, is defined as

    Gain(S, A) = Entropy(S) − Σv ∈ Values(A) (|Sv| / |S|) Entropy(Sv)

where Values(A) is the set of all possible values for attribute A, and Sv is the subset of S for which attribute A has value v.
Example: Information gain

Let Values(Wind) = {Weak, Strong}

S       = [9+, 5−]
SWeak   = [6+, 2−]
SStrong = [3+, 3−]

Information gain of attribute Wind:

Gain(S, Wind) = Entropy(S) − (8/14) Entropy(SWeak) − (6/14) Entropy(SStrong)
              = 0.940 − (8/14)(0.811) − (6/14)(1.00)
              = 0.048
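The same arithmetic can be checked in a few lines of Python (a sketch with our own helper names, redefining the entropy function from above):

```python
import math

def entropy(pos, neg):
    total = pos + neg
    return -sum(c / total * math.log2(c / total) for c in (pos, neg) if c > 0)

# Gain(S, Wind) = Entropy(S) - (8/14) Entropy(SWeak) - (6/14) Entropy(SStrong)
s        = entropy(9, 5)   # whole collection, [9+, 5-] -> ~0.940
s_weak   = entropy(6, 2)   # Wind = Weak,   [6+, 2-] -> ~0.811
s_strong = entropy(3, 3)   # Wind = Strong, [3+, 3-] -> 1.00

gain = s - (8 / 14) * s_weak - (6 / 14) * s_strong
print(f"{gain:.3f}")       # 0.048
```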

An Illustrative Example

  • To illustrate the operation of ID3, consider the learning task represented by the training examples in the table below.
  • Here the target attribute is PlayTennis, which can have the values Yes or No for different days.
  • Consider the first step of the algorithm, in which the topmost node of the decision tree is created.

Day   Outlook    Temperature  Humidity  Wind    PlayTennis
D1    Sunny      Hot          High      Weak    No
D2    Sunny      Hot          High      Strong  No
D3    Overcast   Hot          High      Weak    Yes
D4    Rain       Mild         High      Weak    Yes
D5    Rain       Cool         Normal    Weak    Yes
D6    Rain       Cool         Normal    Strong  No
D7    Overcast   Cool         Normal    Strong  Yes
D8    Sunny      Mild         High      Weak    No
D9    Sunny      Cool         Normal    Weak    Yes
D10   Rain       Mild         Normal    Weak    Yes
D11   Sunny      Mild         Normal    Strong  Yes
D12   Overcast   Mild         High      Strong  Yes
D13   Overcast   Hot          Normal    Weak    Yes
D14   Rain       Mild         High      Strong  No

  • ID3 determines the information gain for each candidate attribute (i.e., Outlook, Temperature, Humidity, and Wind), then selects the one with highest information gain.

  • The information gain values for all four attributes are:

Gain(S, Outlook)     = 0.246
Gain(S, Humidity)    = 0.151
Gain(S, Wind)        = 0.048
Gain(S, Temperature) = 0.029
  • According to the information gain measure, the Outlook attribute provides the best prediction of the target attribute, PlayTennis, over the training examples. Therefore, Outlook is selected as the decision attribute for the root node, and branches are created below the root for each of its possible values (i.e., Sunny, Overcast, and Rain).
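To see where these four numbers come from, here is a small self-contained Python sketch that recomputes them directly from the table (the names DATA, entropy, and gain are ours; differences in the third decimal place come from intermediate rounding in the figures quoted above):

```python
import math
from collections import Counter

# The PlayTennis training examples from the table above:
# (Outlook, Temperature, Humidity, Wind, PlayTennis)
DATA = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]
ATTRS = ["Outlook", "Temperature", "Humidity", "Wind"]

def entropy(examples):
    """Entropy of the examples with respect to the PlayTennis label."""
    counts = Counter(row[-1] for row in examples)
    total = len(examples)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def gain(examples, attr_index):
    """Information gain of the attribute at attr_index over the examples."""
    total = len(examples)
    g = entropy(examples)
    for value in {row[attr_index] for row in examples}:
        subset = [row for row in examples if row[attr_index] == value]
        g -= len(subset) / total * entropy(subset)
    return g

for i, name in enumerate(ATTRS):
    print(f"Gain(S, {name}) = {gain(DATA, i):.3f}")
# Outlook has the highest gain, so ID3 selects it for the root node.
```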




The process of selecting a new attribute is now repeated for each non-terminal descendant node, this time using only the training examples associated with that node. For example, consider the branch Outlook = Rain:

SRain = {D4, D5, D6, D10, D14}

Gain(SRain, Humidity)    = 0.970 − (2/5)(1.0) − (3/5)(0.918) = 0.019
Gain(SRain, Temperature) = 0.970 − (0/5)(0.0) − (3/5)(0.918) − (2/5)(1.0) = 0.019
Gain(SRain, Wind)        = 0.970 − (3/5)(0.0) − (2/5)(0.0) = 0.970

Because Wind has by far the highest information gain over SRain, it is selected as the decision attribute for the Rain branch.
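The same branch computation, sketched in Python (the class counts are read off the table for the five Rain examples; small deviations such as 0.020 vs. 0.019 are again due to intermediate rounding):

```python
import math

def entropy(pos, neg):
    total = pos + neg
    return -sum(c / total * math.log2(c / total) for c in (pos, neg) if c > 0)

s_rain = entropy(3, 2)                       # SRain = [3+, 2-] -> ~0.970

# Humidity:    High = [1+, 1-] (D4, D14),  Normal = [2+, 1-] (D5, D6, D10)
g_humidity = s_rain - (2/5) * entropy(1, 1) - (3/5) * entropy(2, 1)

# Temperature: Mild = [2+, 1-] (D4, D10, D14),  Cool = [1+, 1-] (D5, D6)
g_temperature = s_rain - (3/5) * entropy(2, 1) - (2/5) * entropy(1, 1)

# Wind:        Weak = [3+, 0-] (D4, D5, D10),  Strong = [0+, 2-] (D6, D14)
g_wind = s_rain - (3/5) * entropy(3, 0) - (2/5) * entropy(0, 2)

print(f"{g_humidity:.3f} {g_temperature:.3f} {g_wind:.3f}")
# 0.020 0.020 0.971 -> Wind wins by a wide margin
```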


 




