Hello I am (name) from LearnVern. (6 seconds gap)
In our machine learning course so far we have studied how decision trees work, and we know that a single decision tree can suffer from high variance; you will study what variance and bias are in the optimization videos.
Now, today we will see the next level of decision tree classification, where we will learn how to use many decision trees together with an ensemble technique and create a random forest out of them.
So this is known as random forest.
Now, suppose we have trees such as tree1, tree2, and so on, up to n number of trees. (read slow, typing)
So how will these trees help us?
What are we going to do in these trees?
We take the output of the first tree, the output of the second, the output of the third, and so on, and merge them all together; from this combination a final output is produced, which acts as a strong learner. Each individual tree is a weak learner because its variance is too high, and the output that comes from merging or combining them will be more accurate than any individual output.
So this is the way random forest works. It works for both regression and classification.
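To give a rough idea of the merging step before we touch scikit-learn, here is a minimal sketch (the three tree predictions are made-up values, just for illustration): for classification the forest takes a majority vote over the individual tree outputs.

    # Hypothetical outputs of three weak trees for one sample; the forest
    # returns the class that the majority of trees voted for.
    from collections import Counter

    tree_outputs = [0, 1, 1]                                   # tree1, tree2, tree3
    final_output = Counter(tree_outputs).most_common(1)[0][0]
    print(final_output)                                        # 1, the majority class

For regression the idea is the same, except the individual outputs are averaged instead of voted on.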
So let's implement this and see. For this we will be using the iris dataset, which we all know about; we will implement a random forest on it and see.
So now I am proceeding ahead and here: from sklearn import datasets. (read slowly, typing)
And from here, iris is equal to datasets dot load iris, so we load the dataset from here. (read slowly, typing)
Now we want the data; for that, x is equal to iris dot data.
In the same way y is equal to iris dot target. (read slow, typing)
So, both our x and y are set.
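Putting the steps typed so far together, a minimal sketch looks like this (assuming scikit-learn is installed):

    from sklearn import datasets

    iris = datasets.load_iris()   # built-in iris dataset
    x = iris.data                 # feature matrix (sepal/petal measurements)
    y = iris.target               # class labels 0, 1, 2 for the three species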
Now, on this we will apply random forest.
So, to apply random forest we will import it from sklearn's ensemble module.
So, from sklearn dot ensemble, we will import the random forest classifier. (read slow, typing)
So, here you can see in the suggestions we have RandomForestClassifier, RandomForestRegressor, as well as RandomTreesEmbedding, but we need it for classification, so we import RandomForestClassifier.
Now, we will create an object of random forest.
rf is equal to RandomForestClassifier, so we have made this object.
And at the time of creating the object you can pass parameters also, let me show you.
Here you can pass the number of estimators; by default n_estimators is 100 and the criterion is gini, which you can change if you want. For the beginning we can also set a random state, and the rest of the things we will keep as they are.
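As a sketch, creating the object with these parameters spelled out looks like this; n_estimators=100 and criterion='gini' are the library defaults, and random_state=0 is just an example value for reproducibility.

    from sklearn.ensemble import RandomForestClassifier

    rf = RandomForestClassifier(n_estimators=100,   # number of trees in the forest
                                criterion='gini',   # split quality measure
                                random_state=0)     # example seed for reproducibility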
Now I have both x and y, so here rf dot fit, and pass both x and y.
So, here I have got a random forest model trained.
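Continuing with the rf object created above and the x and y from the data-loading sketch, the training call is simply:

    rf.fit(x, y)   # fits all the trees on the full iris data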
Now, I did not split my data earlier, so here I could either pass a particular value to predict on, or we can split out a certain amount of data and then do testing on it.
From sklearn dot model selection import train test split. (read slow, typing)
And from here we will divide our data a little bit.
So, x underscore tr for train, x underscore te for test, y underscore tr for train and y underscore te for test, and here we will pass the data to train test split.
So, here I pass x and y, and then we have to mention the test size as 0.20; here also we can set a random state, but for now I am leaving it as default.
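A sketch of the split, continuing with the same x and y; test_size=0.20 keeps 20% of the rows aside for testing.

    from sklearn.model_selection import train_test_split

    x_tr, x_te, y_tr, y_te = train_test_split(x, y, test_size=0.20)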
So, now for the test data: rf dot predict, and pass x underscore te. (read slow, typing)
So, you can see the prediction from here.
And we also have y underscore te, which contains our actual observed values.
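As a sketch, the prediction step on the held-out rows looks like this, continuing with the same objects:

    print(rf.predict(x_te))   # predicted class labels for the test rows
    print(y_te)               # actual observed labels, for comparison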
So, to check the accuracy.
From sklearn dot metrics import accuracy score. (read slow, typing)
Through this we can evaluate the accuracy score.
And here we will have to pass both the actual and predicted values to this accuracy score, so y underscore te, this is our actual.
And the predictions we can store in y underscore pr, so that we can pass that as a variable.
And here we will pass y underscore pr.
Now, you can see accuracy is one.
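A sketch of the evaluation step, continuing from above; the predictions are stored in y_pr and compared against the actual labels y_te.

    from sklearn.metrics import accuracy_score

    y_pr = rf.predict(x_te)             # store the predictions
    print(accuracy_score(y_te, y_pr))   # prints 1.0 here, for the reason explained below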
You can understand over here that the accuracy is one because I did the splitting after training, so the model had already seen the test data during training.
So, you can try this, or take it as an assignment: split the data first, train on the training part, and then check the accuracy.
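A rough sketch of that assignment, assuming the same iris data: split first, train only on the training part, and evaluate on the unseen test part. The exact accuracy will depend on the random split.

    from sklearn import datasets
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score

    iris = datasets.load_iris()
    x_tr, x_te, y_tr, y_te = train_test_split(iris.data, iris.target,
                                              test_size=0.20, random_state=0)

    rf = RandomForestClassifier(random_state=0)
    rf.fit(x_tr, y_tr)                              # train on the training split only
    print(accuracy_score(y_te, rf.predict(x_te)))   # evaluate on unseen data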
So, this way we can apply random forest, which performs better as compared to individual trees.
So, friends, we will end our session here for today.
Now we will continue in the next session.
Thank you very much.
Keep Learning and remain motivated.
If you have any questions or comments related to this course, then you can click on the discussion button below this video and post them there.
So, in this way you can discuss this course with many other learners like you.
Share a personalized message with your friends.