I am (Name) from LearnVern…
In our Machine Learning tutorial, we previously covered Logistic Regression, and today we will begin with K Nearest Neighbours.
So, as the name suggests, it is about nearest neighbours: when new data comes in, we check which existing data points are nearest to it, and those are its nearest neighbours.
So, let's understand this in depth as to what this algorithm is.
Here, you can see in this diagram that I have some shapes, such as a square, a circle and a triangle, and along with them I also have an input circle.
So, which shape over here is this input circle nearest to?
Is it related to the square? No.
Is it related to the circle? The answer is yes.
Is it related to the triangle? No.
So, here you can see, its nearest neighbour is a circle.
So, the class it belongs to will also be circle.
This way, the class of the nearest neighbour becomes the class of the new input data that we have.
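To make the diagram concrete, here is a minimal sketch of this "copy the class of the single nearest neighbour" idea. The coordinates are made up for illustration (the diagram gives no numbers), and `nearest_neighbour_label` is a hypothetical helper name, not something from a library.

```python
import math

# Hypothetical 2D positions for the shapes in the diagram (invented values).
training_points = [
    ((1.0, 5.0), "square"),
    ((4.0, 4.0), "circle"),
    ((7.0, 1.0), "triangle"),
]


def nearest_neighbour_label(new_point, points):
    """Return the label of the single stored point closest to new_point."""
    closest_point, closest_label = min(
        points, key=lambda item: math.dist(new_point, item[0])
    )
    return closest_label


input_circle = (4.5, 3.5)  # the new input, placed near the stored circle
print(nearest_neighbour_label(input_circle, training_points))  # prints "circle"
```

With these assumed positions the input sits closest to the stored circle, so it receives the class circle, just as in the diagram.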
Now, let's learn in more detail.
So, KNN or K Nearest Neighbours is a supervised machine learning algorithm.
With its help, we can do both classification and regression.
So, both are possible.
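Classification is what the diagram showed. For regression, one common approach (assumed here, the lecture does not spell it out) is to predict the average target value of the K nearest points. A small sketch with invented data and a hypothetical function name `knn_regress`:

```python
import math


def knn_regress(new_point, points, targets, k):
    """Predict a number as the mean target of the k nearest stored points."""
    # Sort the indices of the stored points by distance to the new point.
    order = sorted(range(len(points)), key=lambda i: math.dist(new_point, points[i]))
    nearest = order[:k]
    return sum(targets[i] for i in nearest) / k


# Toy data, invented for illustration: one feature and one numeric target.
points = [(1.0,), (2.0,), (3.0,), (10.0,)]
targets = [10.0, 20.0, 30.0, 100.0]

# The 3 nearest points to 2.1 are 2.0, 3.0 and 1.0, so the prediction is their mean.
print(knn_regress((2.1,), points, targets, k=3))  # prints 20.0
```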
Now, this algorithm is a non-parametric algorithm.
Non-parametric means that it does not make any assumptions about how the features of the data, such as X1 and X2, are distributed or how they relate to the label.
Whenever it has to make a prediction, it works directly from the stored data by calculating distances, and only then gives its prediction.
Whereas many algorithms, such as the Logistic Regression we saw earlier, assume a fixed form for that relationship, learn a fixed set of parameters for it, and then predict from those parameters.
So, this is not like them, and that is why we call it non-parametric.
Also, this is known as a lazy learner.
Now you might be wondering how it can be lazy?
But it is lazy!
Because when you gave it the data for training, it did not actually train on it; it simply stored it.
Only when you give it test data as input to make predictions does it start doing any work. That's why it is called lazy.
And its mechanism is also of that sort, as it has to find the nearest neighbours.
So, only when new data comes in can it find the nearest neighbours.
So, this is the reason this algorithm is known as a lazy learner.
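One way to see this laziness is to count how much work is done at each stage. The class below is a hypothetical illustration (its name and its `distance_calls` counter are assumptions made for this sketch): `fit` only stores the data, and every distance is computed inside `predict`.

```python
import math


class LazyNearestNeighbour:
    """Hypothetical 1-nearest-neighbour class, written only to show laziness."""

    def __init__(self):
        self.points = []
        self.labels = []
        self.distance_calls = 0  # counts how many distances have been computed

    def fit(self, points, labels):
        # "Training" only stores the data; nothing is calculated here.
        self.points = list(points)
        self.labels = list(labels)
        return self

    def predict(self, new_point):
        # All the real work happens now, when a query arrives.
        best_index = None
        best_distance = math.inf
        for i, stored_point in enumerate(self.points):
            d = math.dist(new_point, stored_point)
            self.distance_calls += 1
            if d < best_distance:
                best_index, best_distance = i, d
        return self.labels[best_index]


model = LazyNearestNeighbour().fit([(0, 0), (5, 5), (9, 1)], ["a", "b", "c"])
print(model.distance_calls)    # 0: no distances computed during fit
print(model.predict((4, 6)))   # "b": (5, 5) is the closest stored point
print(model.distance_calls)    # 3: one distance per stored point
```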
Now, we will learn about the different names that this algorithm is popularly known by:
K Nearest Neighbour.
Memory Based Reasoning.
Example Based Reasoning.
Instance Based Reasoning.
So, these are the different names by which it is known, so don't get confused, as they all refer to the same thing.
Now, moving ahead, what is this K?
So, K here means the number of neighbours.
Now, how can we decide the value of K, that is, how many neighbours we have to check? A common rule of thumb for that is:
K is equal to the square root of N, where 'N' is the number of records.
For example, if I have 10 thousand records, then 100 into 100 is equal to 10 thousand, so the square root is 100 and K is 100.
So, K is the value that tells us how many neighbours to check. Suppose the value of K comes out as 5, then check the 5 nearest neighbours.
Among them, whichever class appears the maximum number of times is taken as the class of the input.
So, this is how the K value is decided.
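As a quick check of the arithmetic, here is a small sketch of the square-root rule. The function name `rule_of_thumb_k` is hypothetical, and rounding to the nearest whole number is an assumption, since K has to be an integer and the square root usually is not.

```python
import math


def rule_of_thumb_k(n_records):
    """Return K as the square root of N, rounded to a whole number (at least 1)."""
    return max(1, round(math.sqrt(n_records)))


print(rule_of_thumb_k(10_000))  # 100, since 100 * 100 = 10,000
print(rule_of_thumb_k(25))      # 5
print(rule_of_thumb_k(1_000))   # 32, since the square root of 1,000 is about 31.6
```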
There are many other ways to decide the value of K, which we will also learn with our upcoming algorithms, as K is also there in unsupervised algorithms.
Till then we will move ahead with this concept only.
Now, we will understand step by step what actually happens in KNN.
- So, the first step is to select K, the number of neighbours.
- We saw one method just now: if we take the square root of N, we get the value of K, so that is selected.
- Second, calculate the Euclidean distance from the new data point to the existing data points.
- Next, take the K nearest neighbours according to that distance, that means if K is 5, take out the 5 closest points separately.
- Next, among those 5, figure out which category is in the majority, so “count the number of data points in each category”.
- Here, each category means each label or class, so the one with the highest count will be the class or label of the input.
- The next step is to assign the new data point to the category that comes up the maximum number of times.
So, in this way our machine learning model is completely ready.
Now it is ready to give predictions.
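The steps above can be sketched in a few lines of Python. This is only an illustration under assumptions: the data points are invented, `knn_classify` is a hypothetical name, and K is passed in directly as 5 rather than derived from the number of records.

```python
import math
from collections import Counter


def knn_classify(new_point, points, labels, k):
    """Classify new_point by majority vote among its k nearest stored points."""
    # Euclidean distance from the new point to every stored point.
    distances = [(math.dist(new_point, p), label) for p, label in zip(points, labels)]
    # Keep only the k closest points.
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # Count the number of data points in each category.
    votes = Counter(nearest_labels)
    # Assign the new point to the category with the highest count.
    return votes.most_common(1)[0][0]


# Toy data, invented for illustration: three circles and four squares.
stored_points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7), (7, 7)]
stored_labels = ["circle", "circle", "circle", "square", "square", "square", "square"]

# The first step, choosing K, is done by hand here: K = 5.
print(knn_classify((2, 2), stored_points, stored_labels, k=5))  # circle (3 votes to 2)
print(knn_classify((6, 5), stored_points, stored_labels, k=5))  # square (4 votes to 1)
```

With two classes, an odd K such as 5 means the vote cannot end in a tie.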
Let's move ahead…
Now we will learn through an example.
Here you can see that we have an input dataset and we have to make predictions.
So, we have an input dataset of cats and dogs in image format. The model will get a new input image of a cat or a dog, and it has to predict whether it is a cat or a dog.
So, when a new input is received, we have to measure its similarity to the stored images and find its K nearest neighbours. Okay, so, after finding the similarity, we pick the most similar ones, and the class that matches the most among them is finalised and predicted as the output.
So, this was an example of KNN.
Now, we will look at the applications of KNN…
- KNN is helpful in pattern recognition; if you want to do theft protection, for example, then this pattern recognition is helpful.
- Pattern recognition is useful in banking too, as transactions follow patterns that can be analysed.
- It is even used in recommendation systems for OTT platforms or e-commerce.
So, KNN is used in several places.
We will also look at KNN practically.
But, let's conclude here for today, and we will continue with its other parts in the next session.
Till then keep learning and remain motivated.
If you have any questions or comments related to this course, then you can click on the discussion button below this video and post them there.
So, in this way you can discuss this course with many other learners like you.