Namaskar, I am (name) from LearnVern. (6 seconds pause, music)
Continuing our series on machine learning, in today's tutorial we will look at underfitting and overfitting. You have heard about these concepts many times during our discussions on algorithms, and we have touched on them now and then. Now we will understand properly what overfitting and underfitting are.
When we talk about overfitting and underfitting, "over" means more and "under" means less. That itself gives you a hint: in one case something is happening too much, and in the other too little. Let me tell you what this "more" and "less" refer to. In overfitting, our model or algorithm tries to pass through every data point in the training set, and that is where the problem arises. Our training data contains some noise and inaccuracies, and an overfitted model learns that noise and those inaccuracies too. As a result, if you pass it data taken from the training set for prediction, you will get the correct output, but if you pass it test data or any new sample, the accuracy will be very low. That is what happens in overfitting.
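The gap between training accuracy and test accuracy described above can be seen in a small experiment. This is only a minimal sketch: the cosine curve, the noise level of 0.1, the sample sizes and the degree of 15 are all assumptions chosen for illustration, and the fit uses NumPy's Polynomial.fit rather than any particular model from the course.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(0)

def make_data(n):
    # Noisy samples of an assumed "true" curve on [0, 1] (illustrative only).
    x = rng.uniform(0.0, 1.0, n)
    y = np.cos(1.5 * np.pi * x) + rng.normal(0.0, 0.1, n)
    return x, y

x_train, y_train = make_data(30)    # small training set
x_test, y_test = make_data(200)     # fresh data the model has never seen

# A degree-15 polynomial has enough freedom to chase the noise in 30 points.
model = Polynomial.fit(x_train, y_train, deg=15)

train_mse = float(np.mean((model(x_train) - y_train) ** 2))
test_mse = float(np.mean((model(x_test) - y_test) ** 2))
print(f"train MSE = {train_mse:.4f}, test MSE = {test_mse:.4f}")
```

The training error comes out small because the curve bends to fit the noisy points, while the error on the new samples is larger. That difference is the sign of overfitting.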
Now let us talk about underfitting. Here the model is not able to learn the data well enough to identify the trend or to build a good mapping function between input and output. This is a limitation of the model itself, and it causes a lot of difficulty: even on the training data the accuracy will be poor, and the model will not generalize properly to the test data either, so the outputs will not be correct. So underfitting and overfitting are both problems for us. For this reason we should find a fit that is optimal, which we can call the best fit. Let us refer to the sklearn documentation and try to understand this.
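Underfitting can be sketched in the same way. In the example below a straight line (degree 1) is compared with a degree-4 polynomial on data from an assumed cosine curve. The curve, the noise level and the helper names sample and mse are hypothetical and used only for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(1)
noise_std = 0.1  # assumed noise level, for illustration only

def sample(n):
    # Noisy samples of an assumed curved "true" function on [0, 1].
    x = rng.uniform(0.0, 1.0, n)
    y = np.cos(1.5 * np.pi * x) + rng.normal(0.0, noise_std, n)
    return x, y

def mse(model, x, y):
    # Mean squared error of the model's predictions on (x, y).
    return float(np.mean((model(x) - y) ** 2))

x_train, y_train = sample(30)
x_test, y_test = sample(200)

line = Polynomial.fit(x_train, y_train, deg=1)    # too simple for a curved trend
curve = Polynomial.fit(x_train, y_train, deg=4)   # flexible enough to follow it

for name, model in [("degree 1", line), ("degree 4", curve)]:
    print(f"{name}: train MSE = {mse(model, x_train, y_train):.4f}, "
          f"test MSE = {mse(model, x_test, y_test):.4f}")
```

The straight line has a high error even on the data it was trained on, which is what separates underfitting from overfitting. The more flexible curve brings both errors down.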
Here you can see the official sklearn documentation. In the first plot a polynomial function of degree 1 has been fitted. The blue line is our model, the orange line is the true function, and the samples are our data points. This first example is underfitting, because the predictions are wrong on the plotted training data, and so they will obviously be wrong on the test data points as well.
This blue line cannot give a good output at all, meaning the accuracy will be very poor. When the degree is increased to 4, you will see that the blue line takes the form of a curve and matches the true function well, so we can say that this is the optimal fit. When the degree is increased further, to 15, you can see that overfitting has happened: the model is trying to cover every data point. Because of this you will get good accuracy on the training set, but whenever new data is given the output will not be correct and the accuracy will remain low. So let us execute it once and see, for the same example, how underfitting, optimal fitting and overfitting take place.
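A minimal sketch of such a run is given below. It is modelled on the sklearn underfitting vs. overfitting example but is not a copy of it: the cosine true function, the 30 samples, the noise level, the random seed and the 10 folds are assumptions made here, and the plotting is left out.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
n_samples = 30  # assumed sample count

# Assumed data: noisy samples of a cosine curve on [0, 1], sorted by x.
X = np.sort(rng.uniform(0.0, 1.0, n_samples))
y = np.cos(1.5 * np.pi * X) + rng.normal(0.0, 0.1, n_samples)

results = {}
for degree in (1, 4, 15):
    # Polynomial features followed by ordinary linear regression.
    model = Pipeline([
        ("poly", PolynomialFeatures(degree=degree, include_bias=False)),
        ("linreg", LinearRegression()),
    ])
    # sklearn reports the negated MSE, so flip the sign to get the error.
    scores = cross_val_score(
        model, X[:, np.newaxis], y,
        scoring="neg_mean_squared_error", cv=10,
    )
    results[degree] = (-scores.mean(), scores.std())
    print(f"degree {degree:2d}: MSE = {-scores.mean():.2e} (+/- {scores.std():.2e})")
```

Because the scores come from cross-validation, each one is measured on data the model did not see during fitting, so the degree with the lowest MSE is the one that generalizes best on this data.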
So here I am executing it, and see, we have also calculated the mean squared error (MSE), using scores.mean() and scores.std(). Let us look at the output: the same plots are displayed for degree 1, degree 4 and degree 15. So tell me, which one is better? The middle one, with degree 4, is the best. In this way, through cross-validation, you can check which model, that is, which mapping function, gives the best fit, and then choose that model and train it. So friends, let us conclude here. Today's session ends here, and we will see the parts ahead in the next session. Keep learning, remain motivated, thank you.