Introduction to Decision Trees
A Decision Tree is a classifier in the form of a tree structure with two types of nodes:
• Decision node: Specifies a choice or test of some attribute, with one branch for each outcome
• Leaf node: Indicates the classification of an example.
The central problem in constructing a decision tree is:
- Given a set of training examples, what decision tree should be generated?
One possible method is to search the space of decision trees for the smallest tree that fits the data.
Let's look at an example of constructing a decision tree for deciding whether to play tennis:
For this we have:
Attributes and their values:
1. Outlook: Sunny, Overcast, Rain
2. Humidity: High, Normal
3. Wind: Strong, Weak
4. Temperature: Hot, Mild, Cold
Target concept: Play Tennis - Yes, No
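Before building the tree, it helps to fix a representation for the training data. Below is a minimal Python sketch, assuming each example is a dict mapping attribute names to values; the particular rows shown are hypothetical, not taken from the article:

```python
# A few hypothetical PlayTennis training examples, each a dict of
# attribute -> value, plus the target concept "PlayTennis".
examples = [
    {"Outlook": "Sunny",    "Temperature": "Hot",  "Humidity": "High",   "Wind": "Weak",   "PlayTennis": "No"},
    {"Outlook": "Overcast", "Temperature": "Hot",  "Humidity": "High",   "Wind": "Weak",   "PlayTennis": "Yes"},
    {"Outlook": "Rain",     "Temperature": "Mild", "Humidity": "High",   "Wind": "Weak",   "PlayTennis": "Yes"},
    {"Outlook": "Rain",     "Temperature": "Cold", "Humidity": "Normal", "Wind": "Strong", "PlayTennis": "No"},
]

# The attributes and their possible values, as listed above.
attributes = {
    "Outlook": ["Sunny", "Overcast", "Rain"],
    "Humidity": ["High", "Normal"],
    "Wind": ["Strong", "Weak"],
    "Temperature": ["Hot", "Mild", "Cold"],
}
```

This dict-per-example layout keeps the later tree-building code simple, since any attribute's value can be looked up by name.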
The decision tree for playing tennis:
In this decision tree, if the outlook is sunny, the temperature is hot, the humidity is high and the wind is weak, then the tree predicts that tennis will not be played.
A decision tree is a representation of a disjunction of conjunctions:
In the tree above, if we want to classify the target concept as YES, the corresponding disjunction of conjunctions is:
(Outlook=Sunny ∧ Humidity=Normal) ∨ (Outlook=Overcast) ∨ (Outlook=Rain ∧ Wind=Weak)
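This disjunction of conjunctions can be written directly as a classifier. A minimal Python sketch (the function name is my own choice):

```python
def play_tennis(outlook, humidity, wind):
    """Return "Yes" when the example satisfies the disjunction of
    conjunctions above, and "No" otherwise. Temperature does not
    appear in the rule, so it is not needed here."""
    return "Yes" if (
        (outlook == "Sunny" and humidity == "Normal")
        or outlook == "Overcast"
        or (outlook == "Rain" and wind == "Weak")
    ) else "No"

# An overcast day is classified YES regardless of the other attributes.
print(play_tennis("Overcast", "High", "Strong"))  # Yes
```

Each conjunction corresponds to one path from the root to a YES leaf, and the `or` between them expresses that any one path suffices.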
To construct a good decision tree, at each node we either:
1. Stop, and
i) return a value of the target feature, or
ii) return a distribution over the target feature values; or
2. Choose a test (e.g. an input feature) to split on.
i) For each value of the test, build a subtree on those examples with that value of the test.
We can use the Top-Down Induction of Decision Trees algorithm, or ID3 algorithm, to construct a good decision tree. This algorithm proceeds as follows:
1. We set A ← the "best" decision attribute for the next node.
2. Then we assign A as the decision attribute of the node.
3. For each value of A, we create a new descendant.
4. Then we sort the training examples to the leaf nodes according to the attribute value of the branch.
5. If all the training examples are properly classified (i.e. have the same value of the target attribute), stop; else iterate over the new leaf nodes.
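The steps above can be sketched in Python. This is a minimal version of ID3 that picks the "best" attribute by information gain (the criterion ID3 commonly uses); the function names, the tree encoding, and the tiny demo dataset are my own assumptions, not part of the article:

```python
from collections import Counter
from math import log2

def entropy(examples, target="PlayTennis"):
    """Entropy of the target labels in a list of example dicts."""
    counts = Counter(e[target] for e in examples)
    total = len(examples)
    return -sum((c / total) * log2(c / total) for c in counts.values())

def information_gain(examples, attribute, target="PlayTennis"):
    """Reduction in entropy from splitting on `attribute`."""
    total = len(examples)
    remainder = 0.0
    for value in {e[attribute] for e in examples}:
        subset = [e for e in examples if e[attribute] == value]
        remainder += len(subset) / total * entropy(subset, target)
    return entropy(examples, target) - remainder

def id3(examples, attributes, target="PlayTennis"):
    """Build a tree recursively: a leaf is a label string, an internal
    node is a pair (attribute, {value: subtree})."""
    labels = [e[target] for e in examples]
    if len(set(labels)) == 1:           # all examples classified alike -> leaf
        return labels[0]
    if not attributes:                  # no attributes left -> majority leaf
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(examples, a, target))
    branches = {}
    for value in {e[best] for e in examples}:
        subset = [e for e in examples if e[best] == value]
        rest = [a for a in attributes if a != best]
        branches[value] = id3(subset, rest, target)
    return (best, branches)

# A hypothetical five-example subset of the PlayTennis data:
demo = [
    {"Outlook": "Sunny",    "Humidity": "High",   "Wind": "Weak",   "PlayTennis": "No"},
    {"Outlook": "Sunny",    "Humidity": "Normal", "Wind": "Weak",   "PlayTennis": "Yes"},
    {"Outlook": "Overcast", "Humidity": "High",   "Wind": "Strong", "PlayTennis": "Yes"},
    {"Outlook": "Rain",     "Humidity": "High",   "Wind": "Weak",   "PlayTennis": "Yes"},
    {"Outlook": "Rain",     "Humidity": "Normal", "Wind": "Strong", "PlayTennis": "No"},
]
tree = id3(demo, ["Outlook", "Humidity", "Wind"])  # root splits on Outlook
```

Note how the two stopping cases in `id3` correspond to step 5 of the algorithm, and the recursion over `branches` corresponds to steps 3 and 4.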
While constructing the decision tree, we come across issues regarding choices like:
1. When to stop. We stop when:
i) there are no more input features;
ii) all the examples are classified; or
iii) there are too few examples to make an informative split.
2. Which test to split on. We can choose:
i) the split that gives the smallest error;
ii) for a multi-valued feature, whether to split on all of its values, or to split the values into two halves.
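To make the "smallest error" criterion concrete, here is a minimal Python sketch that counts how many training examples a split would misclassify if each branch simply predicted its majority label; the function name and the tiny demo dataset are hypothetical:

```python
from collections import Counter

def split_error(examples, attribute, target="PlayTennis"):
    """Number of training examples misclassified if each branch of a
    split on `attribute` predicts its majority label."""
    errors = 0
    for value in {e[attribute] for e in examples}:
        labels = [e[target] for e in examples if e[attribute] == value]
        majority = Counter(labels).most_common(1)[0][1]
        errors += len(labels) - majority
    return errors

# A hypothetical four-example dataset where splitting on Outlook
# separates the labels perfectly, while the other attributes do not:
demo = [
    {"Outlook": "Sunny",    "Humidity": "High",   "Wind": "Weak",   "PlayTennis": "No"},
    {"Outlook": "Sunny",    "Humidity": "High",   "Wind": "Strong", "PlayTennis": "No"},
    {"Outlook": "Overcast", "Humidity": "High",   "Wind": "Weak",   "PlayTennis": "Yes"},
    {"Outlook": "Rain",     "Humidity": "Normal", "Wind": "Weak",   "PlayTennis": "Yes"},
]
best = min(["Outlook", "Humidity", "Wind"], key=lambda a: split_error(demo, a))
```

Here `best` comes out as `"Outlook"`, since that split leaves zero misclassified examples while the others each leave one.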
Hope you enjoyed reading this article. In the next article I will discuss more about decision trees. Till then, enjoy learning!