Hello everyone,
I am (name) from LearnVern.
Welcome to the Machine Learning course.
This tutorial is a continuation of the previous one.
So, let's see ahead.
Today we will learn about the Decision Tree practically, that is, how we can do classification with the Decision Tree algorithm.
So, let's begin.
First I will import the relevant library, so here I will import pandas: import pandas as pd.
And I will introduce you to the dataset.
I will click over here, and you can see that the runtime is still connecting, so we will let it connect. It is connected now.
Here I have a file, golf dot csv, so I will upload it now.
OK, this is the file golf dot csv; now I will read this file.
For that, I will write data is equal to pd dot read underscore csv.
And here I will paste its path: copy path, and here I have pasted the path.
Now let's see what the dataset looks like.
Here you can see the dataset. It has a days column, with day 1, day 2 and so on; next we have outlook; next we have temperature, spelled with a missing t; then we have humidity; next is wind; then play ball, which holds yes or no; and lastly we have one extra unnamed column containing only NA.
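The reading step can be sketched as below. This is only a minimal sketch: the rows are illustrative stand-ins (only the first one, sunny, hot, high, weak, no, is mentioned in this lesson), the header spelling "Temperaure" is a guess at the misspelling, and the trailing commas are an assumption about how the empty extra column arises.

```python
import io

import pandas as pd

# Stand-in for golf.csv: illustrative rows, NOT the real file contents.
# The trailing comma on every line produces an empty, unnamed last column.
csv_text = """Day,Outlook,Temperaure,Humidity,Wind,Play ball,
D1,Sunny,Hot,High,Weak,No,
D2,Overcast,Hot,High,Weak,Yes,
D3,Rain,Mild,Normal,Strong,No,
"""

# With the real file you would pass the pasted path instead of StringIO.
data = pd.read_csv(io.StringIO(csv_text))

print(data.head())
print(list(data.columns))  # the last name is generated by pandas
```

pandas gives a header cell that is empty a generated name starting with "Unnamed", and fills its values with NaN, which matches the extra NA column described above.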
So this is our dataset, and here we need to understand it.
The days column is just the day label; if you want, you can consider it as an input, however it is not going to affect the result much.
So we have 4 columns as input: outlook as X1, temperature as X2, humidity as X3 and wind as X4. The play ball column holds the decision, whether to play or not.
So here it is sunny and hot, humidity is high and wind is weak, and the decision made was no, we will not play.
So the dataset is of this type.
With the help of this data our algorithm will learn and create a decision tree, and this decision tree will then be able to make a decision for similar inputs that come in the future.
So let's do some preprocessing. We have already done data wrangling previously, so here also we will do a little data wrangling and then deal with the data properly.
OK, let's go.
First we will delete this last unnamed column, because it is of no use to us.
So here I will delete it with del: del, then data of, with the name of this column in the square brackets. After deleting, just below it, I will check with data dot head,
to confirm that it is deleted.
So here you can see that it has been deleted, so this is a good thing, now let's move ahead.
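A minimal sketch of the delete step, assuming the empty column is called "Unnamed: 6" (a hypothetical name; use whatever name your own data dot columns shows):

```python
import numpy as np
import pandas as pd

# Tiny illustrative frame; "Unnamed: 6" is a hypothetical column name.
data = pd.DataFrame({
    "Day": ["D1", "D2"],
    "Outlook": ["Sunny", "Overcast"],
    "Unnamed: 6": [np.nan, np.nan],
})

del data["Unnamed: 6"]  # removes the column in place

print(data.head())      # check that the column is gone
```

data.drop(columns=["Unnamed: 6"]) would do the same job, but it returns a new DataFrame instead of changing data in place.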
I can do one more thing with the columns here: I will correct the names first.
With data dot columns we get the list of the columns, so I will copy this list and assign it back to data dot columns.
What will happen with this?
Through data dot columns we will be able to change the spelling and give the proper name, temperature. And play ball I will join together as one word, because many times it is difficult to deal with spaces. This way we will be able to deal with it very easily.
Now you can see that we have proper data with the corrections.
Temperature and playball both came properly.
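The renaming step could look like the sketch below. The old names "Day", "Temperaure" and "Play ball" are assumptions about how the file spells them; only the corrected names Temperature and Playball come from this lesson.

```python
import pandas as pd

# One illustrative row; the original header spellings are assumed.
data = pd.DataFrame(
    [["D1", "Sunny", "Hot", "High", "Weak", "No"]],
    columns=["Day", "Outlook", "Temperaure", "Humidity", "Wind", "Play ball"],
)
print(list(data.columns))

# Assign a full list of names back; it must have one name per column.
data.columns = ["Day", "Outlook", "Temperature", "Humidity", "Wind", "Playball"]
print(list(data.columns))
```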
Let's move ahead. In our previous algorithm itself I was telling you about values like sunny, overcast and rain, the ones you are able to see here. Let me show you.
I will write data of Outlook, with Outlook in quotes.
And here, dot unique: I will run the unique function.
So, it is sunny, overcast and rain.
How many types of categories or groups do I have in the output? One is sunny, one is overcast and one is rain; three types, right?
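A small sketch of the unique check, on an illustrative Outlook column rather than the real file:

```python
import pandas as pd

# Illustrative values only.
data = pd.DataFrame({"Outlook": ["Sunny", "Sunny", "Overcast", "Rain", "Rain"]})

# unique() lists each distinct value once, in order of first appearance.
categories = data["Outlook"].unique()
print(categories)
print(len(categories), "categories")
```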
So, I had told you earlier that this categorical data needs to be converted to numerical.
To convert it into numerical, I will create a temporary variable called out.
Out is equal to; first is sunny, OK, so I will copy the values directly and then edit.
So, I have made a dictionary, and I will edit in that.
So, for Sunny we will give zero,
and for overcast we will give 1,
and for rain we will give 2.
This way.
What we will do next: data of Outlook is equal to data of Outlook again, and here we will use the dot map function. In this map function we will use a lambda function, so lambda x colon, and we will substitute out of x.
For instance, if x is Sunny, then out of x is out of Sunny, which gives us zero.
In this way, whatever values we have in out will replace the categories with the numbers 0, 1 and 2.
Instead of sunny, it will be zero.
Instead of overcast it will be one, and instead of rain it will be two.
After this I will show you once by running data dot head.
So, you can see the data has been changed.
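The dictionary and map step described above can be sketched like this, again on illustrative rows:

```python
import pandas as pd

# Illustrative values only.
data = pd.DataFrame({"Outlook": ["Sunny", "Overcast", "Rain", "Sunny"]})

# Temporary dictionary: category -> number.
out = {"Sunny": 0, "Overcast": 1, "Rain": 2}

# For every value x in the column, look up out[x].
data["Outlook"] = data["Outlook"].map(lambda x: out[x])

print(data.head())
```

Passing the dictionary directly, data["Outlook"].map(out), gives the same result for values that are in the dictionary; the lambda form is the one dictated in this lesson.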
In the same way, we will change the temperature column, so we will add one more cell and write data of Temperature over here.
In data of Temperature, we are able to see how it is showing.
And if I use the map function here,
wait, I think we have executed this twice, and that is why this error is coming.
So, the Temperature column is already converted, and this step is done.
You do not have to worry about this error.
Now, I will add one more code cell separately, and check with the data dot head.
So, here you can see it has already been converted.
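The error itself is not shown here, but a likely reason is sketched below: once a column has been converted, its values are numbers, so running the same map cell again looks up keys that are not in the dictionary. The sketch uses the Outlook dictionary, because the Temperature dictionary is not spelled out in this lesson.

```python
import pandas as pd

# Illustrative values only.
data = pd.DataFrame({"Outlook": ["Sunny", "Overcast", "Rain"]})
out = {"Sunny": 0, "Overcast": 1, "Rain": 2}

# First run: works, the column now holds 0, 1, 2.
data["Outlook"] = data["Outlook"].map(lambda x: out[x])

# Second run of the same line: out[0] does not exist, so it fails.
try:
    data["Outlook"].map(lambda x: out[x])
    rerun_error = None
except KeyError as err:
    rerun_error = err

print("second run raised:", repr(rerun_error))
print(data["Outlook"].tolist())  # the first conversion is still intact
```

So an error of this kind does not damage the data; the column simply was already converted.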
Ok! Let's move ahead and we will go for humidity and then proceed ahead.
Next we have data of Humidity, and after that dot unique.
So in humidity we have only two values: high and normal.
So let us treat this also.
Data of Humidity, and we will use the automatic suggestion, that's better, is equal to data of Humidity dot map, and in that we will write lambda x colon 0 if x double equals High, else 1.
I hope you are able to understand me.
If the value of x is high then we keep it as 0, otherwise we keep it as one. Because we have only two values, this much is enough. Let's go.
OK, now we will check again with data dot head. This is also done!
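A sketch of the two-value conversion for Humidity, on illustrative rows:

```python
import pandas as pd

# Illustrative values only.
data = pd.DataFrame({"Humidity": ["High", "Normal", "High"]})

# Two categories only, so a conditional is enough: High -> 0, anything else -> 1.
data["Humidity"] = data["Humidity"].map(lambda x: 0 if x == "High" else 1)

print(data.head())
```

Note that the else branch sends every value other than "High" to 1, so this shortcut is only safe after unique has confirmed that there really are just two values.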
Lastly we have wind. Here also I believe there are only two values, but we will still check.
Data of Wind dot unique. Sometimes the suggestions take time to appear, and then we have to type.
So, in this also we have two.
So here too we will use the above function: copy and paste it again, and here I will replace this with Wind, with a capital W.
Here also we will put Wind.
Wind is weak and strong, so here I will write Weak and then run it.
So this is also done.
Now the Playball column is remaining.
It is also done the same way, so I will copy and paste it.
We will put Playball, so data of Playball dot map. In play ball also we have yes or no, so for no we will give zero and for yes it will become one.
I hope you are able to understand me.
Now the playball is also changed.
OK, so this was a very long process. Later I will tell you about some tools, some automatic ways, through which you can convert the data into numerical form. This was the manual way; because we are just beginning, we need to learn the manual way also.
So, up to now it's done.
We have got all the data ready, in numerical format.
For play ball, we know that 0 means no and 1 means yes.
In the case of wind, we know that zero means weak and one means strong.
So, whenever needed, we can match these against the dictionary to see what their actual meanings are.
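The Wind and Playball steps, together with looking up what 0 and 1 mean, can be sketched as follows. The rows are illustrative, and the two lookup dictionaries wind_meaning and play_meaning are hypothetical helpers, not something typed in this lesson.

```python
import pandas as pd

# Illustrative values only.
data = pd.DataFrame({
    "Wind": ["Weak", "Strong", "Weak"],
    "Playball": ["No", "No", "Yes"],
})

data["Wind"] = data["Wind"].map(lambda x: 0 if x == "Weak" else 1)
data["Playball"] = data["Playball"].map(lambda x: 0 if x == "No" else 1)
print(data.head())

# Hypothetical reverse lookups, to read the numbers back as words.
wind_meaning = {0: "Weak", 1: "Strong"}
play_meaning = {0: "No", 1: "Yes"}
decoded = data["Playball"].map(play_meaning)
print(decoded.tolist())
```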
Ok let's move ahead,
From sklearn dot tree, import DecisionTreeClassifier. We have a regressor as well as a classifier, but right now we are just looking at the classifier.
So, we have imported this now.
Now, we will create a model.
You can call it dt model or just model.
Created a model.
So model is equal to DecisionTreeClassifier, and here I have kept everything as default and have not passed any parameters.
Now, I have to give this model data for training.
With the model dot fit method I can give it data for training. First I have to pass X, the input data, which I will take through data dot iloc. Inside the square brackets, the part before the comma is for rows, and a colon there means all the rows from start to end. The part after the comma is for columns: we have to leave out only the first column, so we start at one, and we do not want the last column, so we write 5 as the stop and it takes columns up to 4. So out of 0, 1, 2, 3, 4 we need 1, 2, 3, 4: the slice starts from one and stops before 5. That is X. And as this is supervised learning, we also have to give y.
Again data dot iloc, and here colon comma 5, OK!
So we passed colon comma five, OK?
Here we take just one column, as it is the output.
So, this is the way we are training the model.
So, we are giving the model the ability to learn.
So, let's move ahead.
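The model creation and training can be sketched as below. The encoded rows are illustrative stand-ins, and the numbers used for Temperature are an assumption, since that mapping is not spelled out in this lesson.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Illustrative, already-encoded rows (not the real golf.csv contents).
data = pd.DataFrame({
    "Day":         ["D1", "D2", "D3", "D4", "D5", "D6"],
    "Outlook":     [0, 0, 1, 2, 2, 1],
    "Temperature": [0, 0, 0, 1, 2, 2],   # assumed encoding
    "Humidity":    [0, 0, 0, 0, 1, 1],
    "Wind":        [0, 1, 0, 0, 1, 1],
    "Playball":    [0, 0, 1, 1, 0, 1],
})

X = data.iloc[:, 1:5]  # all rows, columns 1 to 4 (the stop index 5 is excluded)
y = data.iloc[:, 5]    # all rows, column 5 only: the output

model = DecisionTreeClassifier()  # everything left at its default
model.fit(X, y)

print(list(X.columns))
print(model.predict(X.iloc[[0]]))  # prediction for the first row
```

Selecting by name, for example data[["Outlook", "Temperature", "Humidity", "Wind"]], picks the same inputs and does not depend on column positions.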
So, we are done with fit; now I will plot the tree and show you.
We have trained the model, so let's see how you can represent it as a tree.
From sklearn import tree.
So we will import tree from sklearn.
After importing the tree,
I will have to import one more library: from matplotlib import pyplot as plt.
So, this also I have imported.
I think I will not need any other libraries apart from these two.
So, plt dot figure, and here we will give the figure size, so figsize is equal to 12 by 8. And here I will draw the tree with tree dot plot underscore tree, and pass the model to it.
So, I have passed the learned model over here.
In figsize I should have given a single argument, a tuple, so I am changing this into a single argument.
And enter.
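A sketch of the plotting step, with a small model trained on illustrative numbers so that it runs on its own. The Agg backend line is an assumption for running outside a notebook; inside a notebook you can leave it out.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; omit in a notebook

from matplotlib import pyplot as plt
from sklearn import tree
from sklearn.tree import DecisionTreeClassifier

# Illustrative encoded inputs (4 features) and outputs.
X = [[0, 0, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0],
     [2, 1, 0, 0], [2, 2, 1, 1], [1, 2, 1, 1]]
y = [0, 0, 1, 1, 0, 1]
model = DecisionTreeClassifier().fit(X, y)

plt.figure(figsize=(12, 8))        # figsize takes ONE argument: a (width, height) tuple
artists = tree.plot_tree(model)    # draws the nodes on the current figure
print(len(artists), "boxes drawn")
```

Writing figsize=12, 8 without the parentheses is what causes the problem mentioned above, because 8 is then read as a separate argument.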
So, here you can see X2. This means it has taken the feature X2.
Here is X0.
First, the root node is formed with X2, because when the Gini was computed for each feature, the split on X2 gave the best result, that is, the lowest weighted Gini impurity.
Then the Gini was computed again at this level, and here the split on X0 was the best.
So, the feature whose split gives the lowest Gini impurity is chosen first.
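To make the Gini comparison concrete, here is a small hand computation. It is only a sketch on made-up labels, and the helper names gini and weighted_gini are hypothetical; scikit-learn does this internally with threshold splits, using Gini as its default criterion.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of one node: 1 minus the sum of squared class shares."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini(feature, labels):
    """Impurity after splitting on a feature: size-weighted average over the groups."""
    n = len(labels)
    total = 0.0
    for value in set(feature):
        group = [lab for f, lab in zip(feature, labels) if f == value]
        total += len(group) / n * gini(group)
    return total


# Made-up example: four days, two candidate features.
labels = ["No", "No", "Yes", "Yes"]
feature_a = [0, 0, 1, 1]   # separates No from Yes perfectly
feature_b = [0, 1, 0, 1]   # mixes No and Yes in both groups

print(gini(labels))                      # impurity before any split
print(weighted_gini(feature_a, labels))  # lower, so this split is preferred
print(weighted_gini(feature_b, labels))
```

The feature with the lower weighted value after the split is the better choice for the node, which is why feature_a would be picked here.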
Now what will we do next?
You can see these X2, X0, and if you want you can give them names, because they are not easily understandable.
To give them names, we will have to import one more thing. From sklearn dot tree there is a function called export underscore graphviz, and we will use it.
And here we will provide feature names. OK.
Feature names is equal to a list of the feature names we have here: I have outlook, after outlook we have temperature, and after temperature we have humidity and wind.
Here we will write the output also, which is either yes or no.
Here dot is equal to export underscore graphviz. In that I will give decision tree, which is model, our trained model. Then out file, which will save a file: we have the golf data, so out file is equal to golf dot dot, saving it in DOT format. And for the feature names parameter we pass our feature names list, so the two have the same name.
So we are putting the feature names here.
After adding the feature names, is anything else remaining?
We are done with feature names, so we will pass the class names also: class underscore names, and in class names I will put the output list. OK?
So, we will execute this much.
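The export step could look like the sketch below. The training numbers are illustrative, and the order of the class names, No first and then Yes, is an assumption that follows the 0 for no, 1 for yes encoding used earlier.

```python
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# Illustrative encoded inputs and outputs, so the sketch runs on its own.
X = [[0, 0, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0],
     [2, 1, 0, 0], [2, 2, 1, 1], [1, 2, 1, 1]]
y = [0, 0, 1, 1, 0, 1]
model = DecisionTreeClassifier().fit(X, y)

feature_names = ["Outlook", "Temperature", "Humidity", "Wind"]
output = ["No", "Yes"]  # must follow the order of model.classes_: 0, then 1

# Writes the tree in DOT format; with out_file set, nothing is returned.
export_graphviz(
    model,
    out_file="golf.dot",
    feature_names=feature_names,
    class_names=output,
)

with open("golf.dot") as handle:
    dot_text = handle.read()
print(dot_text[:200])
```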
And now I will run the dot command on this file.
Exclamation mark, dot, hyphen T, and we are going to produce a PNG file. Here I will refresh, and a file should have been created.
So, golf dot dot.
This is the file that is formed.
So here, golf dot dot, hyphen o, tree dot png.
An image of the tree should be formed with its help.
Here, my T after the hyphen is written as a small letter. Let me make it a capital letter.
My file name is also misspelled over here. I'm sorry for that, let me correct it.
So here tree dot png has been created.
And here you can see our tree dot png.
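In a notebook cell the command dictated above is written as !dot -Tpng golf.dot -o tree.png. The same call from plain Python might be sketched as follows, assuming Graphviz is installed; the tiny hand-written DOT file is only a stand-in so that the sketch runs on its own.

```python
import shutil
import subprocess
from pathlib import Path

# Stand-in DOT file; in the lesson, golf.dot comes from export_graphviz.
Path("golf.dot").write_text("digraph Tree { root -> leaf_no; root -> leaf_yes; }")

# -Tpng (capital T) selects PNG output, -o names the output file.
command = ["dot", "-Tpng", "golf.dot", "-o", "tree.png"]

if shutil.which("dot"):
    subprocess.run(command, check=True)
    print("tree.png created:", Path("tree.png").exists())
else:
    print("Graphviz 'dot' not found, skipped:", " ".join(command))
```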
In this you can see the names, such as humidity, and in class we are also getting yes or no.
So, it is also showing us the output; the tree is formed in this way.
So, this is the way the decision tree is formed.
I hope you have all understood it.
Try to practice.
So friends, we will stop today’s session here and we will continue further in the next session.
Thank You very much.
If you have any questions or comments related to this course,
then you can click on the discussion button below this video and post them there.
In this way you can discuss this course with many other learners like you.