Saturday, January 21, 2017


Introduction to Decision Trees

A Decision Tree is a classifier in the form of a tree structure with two types of nodes:
     • Decision node: specifies a choice or test of some attribute, with one branch for each outcome.
     • Leaf node: indicates the classification of an example.

The problem that occurs in constructing a decision tree is:
  • Given a set of training examples, what type of decision tree should be generated?
One proposal to overcome this problem is to prefer the smallest tree which is consistent with the data (i.e., a bias).


A possible method to do this is to search the space of decision trees for the smallest decision tree that fits the data.

Let's see an example of constructing the decision tree for playing tennis:

For this we have:

Attributes and their values:
    1. Outlook: Sunny, Overcast, Rain
    2. Humidity: High, Normal
    3. Wind: Strong, Weak
    4. Temperature: Hot, Mild, Cold


Target concept: Play Tennis (yes, no)
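
In code, this attribute space can be written down directly (a minimal sketch in Python; the names come straight from the lists above):

```python
# Attribute space for the Play Tennis example, as listed above.
ATTRIBUTES = {
    "Outlook": ["Sunny", "Overcast", "Rain"],
    "Humidity": ["High", "Normal"],
    "Wind": ["Strong", "Weak"],
    "Temperature": ["Hot", "Mild", "Cold"],
}

# Target concept: is tennis played?
TARGET_VALUES = ["yes", "no"]
```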


The Decision Tree for playing Tennis:


In this decision tree:


If the outlook is sunny, the temperature is hot, the humidity is high, and the wind is weak, then the chances are that there will be no game of tennis.
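
Since the original figure is not reproduced here, the tree can be sketched as nested Python dictionaries (a minimal sketch; the structure is reconstructed from the description above and the disjunction given below):

```python
# The Play Tennis decision tree: decision nodes are dicts keyed by
# an attribute, and leaves are strings ("yes" / "no").
tennis_tree = {
    "Outlook": {
        "Sunny": {"Humidity": {"High": "no", "Normal": "yes"}},
        "Overcast": "yes",
        "Rain": {"Wind": {"Weak": "yes", "Strong": "no"}},
    }
}

def classify(tree, example):
    """Walk from the root, following the branch for the example's value."""
    while isinstance(tree, dict):
        attribute = next(iter(tree))        # the decision node's test
        tree = tree[attribute][example[attribute]]
    return tree                             # a leaf node: the classification

# The example above: sunny, hot, high humidity, weak wind -> "no".
print(classify(tennis_tree, {"Outlook": "Sunny", "Temperature": "Hot",
                             "Humidity": "High", "Wind": "Weak"}))
```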




A decision tree is a representation of a disjunction of conjunctions:
As in the case above, if we want to classify the target concept as YES, then its disjunction of conjunctions will be:
(Outlook = Sunny ∧ Humidity = Normal)  ∨  (Outlook = Overcast)  ∨  (Outlook = Rain ∧ Wind = Weak)
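
This formula translates directly into code (a sketch; `plays_tennis` is a hypothetical helper name):

```python
def plays_tennis(e):
    """Direct translation of the disjunction of conjunctions above."""
    return ((e["Outlook"] == "Sunny" and e["Humidity"] == "Normal")
            or e["Outlook"] == "Overcast"
            or (e["Outlook"] == "Rain" and e["Wind"] == "Weak"))
```

Each conjunction corresponds to one path from the root to a YES leaf, and the disjunction joins those paths together.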


To construct a good decision tree, at each node we either:
1. Stop, and
      i)  return a value of the target feature, or
      ii) return a distribution over the target feature's values; or
2. Choose a test (e.g., an input feature) to split on.
      i) For each value of the test, build a subtree from those examples with that value of the test.

We can use the top-down induction of decision trees algorithm, also known as the ID3 algorithm, for constructing a good decision tree (a sketch in code is given after these steps). This algorithm proceeds as follows:
1. We start with A ← the "best" decision attribute for the next node.
2. Then we assign A as the decision attribute of that node.
3. For each value of A, we create a new descendant.
4. Then we sort the training examples to the leaf nodes according to the attribute value of each branch.
5. If all the training examples are properly classified (i.e., they have the same value of the target attribute), stop; else, iterate over the new leaf nodes.
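
Here is a minimal sketch of ID3 in Python. It follows the steps above, choosing the "best" attribute by information gain (the standard ID3 criterion); the function names and the tiny sample dataset are illustrative assumptions, not the article's full table:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(examples, attribute, target):
    """Reduction in entropy obtained by splitting the examples on attribute."""
    base = entropy([e[target] for e in examples])
    total = len(examples)
    remainder = 0.0
    for value in {e[attribute] for e in examples}:
        subset = [e[target] for e in examples if e[attribute] == value]
        remainder += len(subset) / total * entropy(subset)
    return base - remainder

def id3(examples, attributes, target):
    labels = [e[target] for e in examples]
    # Step 5: if all examples have the same target value, stop with a leaf.
    if len(set(labels)) == 1:
        return labels[0]
    # No attributes left: return the majority target value.
    if not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Steps 1-2: A <- the "best" decision attribute for this node.
    best = max(attributes, key=lambda a: information_gain(examples, a, target))
    tree = {best: {}}
    # Steps 3-4: one descendant per value of A, with the matching examples.
    for value in {e[best] for e in examples}:
        subset = [e for e in examples if e[best] == value]
        rest = [a for a in attributes if a != best]
        tree[best][value] = id3(subset, rest, target)
    return tree

# A few illustrative training examples, consistent with the tree above.
examples = [
    {"Outlook": "Sunny", "Humidity": "High", "Wind": "Weak", "PlayTennis": "no"},
    {"Outlook": "Sunny", "Humidity": "Normal", "Wind": "Weak", "PlayTennis": "yes"},
    {"Outlook": "Overcast", "Humidity": "High", "Wind": "Strong", "PlayTennis": "yes"},
    {"Outlook": "Rain", "Humidity": "High", "Wind": "Weak", "PlayTennis": "yes"},
    {"Outlook": "Rain", "Humidity": "Normal", "Wind": "Strong", "PlayTennis": "no"},
    {"Outlook": "Rain", "Humidity": "Normal", "Wind": "Weak", "PlayTennis": "yes"},
]
print(id3(examples, ["Outlook", "Humidity", "Wind"], "PlayTennis"))
```

On this small sample the algorithm recovers the same structure as the tree described earlier: Outlook at the root, Humidity under Sunny, and Wind under Rain.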


While constructing the decision tree we come across issues regarding choices like:
1. When to stop. We stop:
     i)   when there are no more input features,
     ii)  when all the examples are classified, or
     iii) when too few examples remain to make an informative split.
2. Which test to split on (see the sketch after this list):
     i)  the split which gives the smallest error;
     ii) with a multi-valued feature, whether to split on all of its values or to split the values into two halves.
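
For choice 2(i), "the split which gives the smallest error", here is a minimal sketch, assuming each branch predicts its majority class (`split_error` is a hypothetical helper name):

```python
from collections import Counter

def split_error(examples, attribute, target):
    """Misclassified count if each branch predicts its majority class."""
    error = 0
    for value in {e[attribute] for e in examples}:
        labels = [e[target] for e in examples if e[attribute] == value]
        error += len(labels) - Counter(labels).most_common(1)[0][1]
    return error

# Pick the attribute whose split leaves the smallest error, e.g.:
# best = min(["Outlook", "Humidity", "Wind"],
#            key=lambda a: split_error(examples, a, "PlayTennis"))
```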


Given below is the video in which I have explained decision trees:

                                      

Hope you enjoyed reading this article. In the next article I will be discussing more about decision trees. Till then, enjoy learning!!!




