
K-Means Clustering


Hello,

I am (name) from LearnVern.

In the previous tutorial of this Machine Learning course we studied distance-based learning, that is, the distance metrics that we use inside clustering.

And in that, we saw that K-means and hierarchical clustering are different algorithms with which we can do clustering, or grouping. (FP-Growth is sometimes listed alongside these, but it is an association-rule mining algorithm, not a clustering algorithm.)

So, let us begin with K-means clustering.

Now, its name is K-means. In statistics we have studied the mean, which means taking the average, and that is the work K-means does. Here k is the number of means. So if I say 2-means clustering, it will find two centre points and start forming the clusters on that basis, and if I say 3-means, it will choose three centre points and form the clusters on that basis.

This is an iterative algorithm, meaning it applies the same logic to the data again and again until its stopping condition is met. At that point the learning of the algorithm is complete.

So, let's understand this in more detail.

So, K-means clustering is a part of unsupervised learning: the dataset that we have here is not labelled, and we have to do the grouping within that data itself.

It can be applied to data that is numerical and continuous.

This is a fast algorithm,

and it is also very easy to understand, as I will show here.

We can use it in banking and insurance fraud detection, and it can also be used in image segmentation and customer segmentation.

Now, I will explain how it works.

So, let us go here and take some data points: I take 2, then 3, then 50, and then 51.

So, you can see I have 4 data points.

We assume only these 4, because if we had 4 lakh (400,000) points it would be really complex to work through by hand.

So, I have these 4 data points and I want to run K-means on them.

To perform K-means, I first need to know the number of groups that we have to create.

For now I assume k is 2, meaning we have to create 2 clusters. To create 2 clusters, I take any two of these data points and make them the two centre points; these are called centroids.

So, let's do it in this way…

Here I took 2 as one centroid and 3 as the other, so C1 is centroid 1 and C2 is centroid 2.
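
In the walkthrough the two starting centroids are picked by hand. As a small Python sketch of the same idea, assuming we instead let the program pick any k of the data points at random (the names `points`, `k`, and `initial_centroids`, and the fixed seed, are only for illustration and are not from the lecture):

```python
import random

# The four data points from the walkthrough.
points = [2, 3, 50, 51]
k = 2  # number of clusters we want

# Assumption: a fixed seed, only so that the sketch is repeatable.
rng = random.Random(0)

# Pick k distinct data points to act as the starting centroids.
initial_centroids = rng.sample(points, k)
print(initial_centroids)
```

Any two distinct points are a valid start. The lecture simply uses 2 and 3, and the rest of the walkthrough follows that choice.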

Now, on the basis of 2, I will find the distance of every data point.

So, the centroids 2 and 3 go here.

Let's find the distance here: equals, this minus this, Enter. You can see the distance found is 0, and in the same way we will get the distance for every point.

Maybe there is some mistake here, let me just check… The formula is working for A5, but not for A6.

What has happened is that, on filling down, the centroid reference has moved to C5, so we have to freeze C4. I freeze it by putting a dollar before the C, and then a dollar before the 4.

Ok!

So, this is the actual distance. In the same way, for the second centroid we subtract 3 and again put the dollars: select this, minus this, a dollar here, and a dollar here also.

In this way.

So, we have got the correct values.

We also want the absolute value of these, so we remove the negative sign by ignoring it. And what are C1 and C2 here?

For C1 we have 2, and for C2 we have 3, and all the calculations are based on them.

Now, we will move ahead.

To take the absolute value, we put the entire expression inside brackets and apply ABS: equals, ABS, open bracket, the expression, close bracket.

In this way its absolute value is found, and in the same way we get the absolute values for every point.

So, we took the absolute values.

Let us find the absolute values here also…

So, here we have got the absolute values.
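
The same distance table can be sketched in Python. This is only an illustration of the spreadsheet step, assuming the data points 2, 3, 50, 51 and the starting centroids 2 and 3 from above; the variable names are made up for the example:

```python
# Data points and starting centroids from the walkthrough.
points = [2, 3, 50, 51]
c1, c2 = 2, 3

# Absolute distance of every point to each centroid,
# the same thing the ABS(...) formula does in the spreadsheet.
dist_to_c1 = [abs(x - c1) for x in points]
dist_to_c2 = [abs(x - c2) for x in points]

for x, d1, d2 in zip(points, dist_to_c1, dist_to_c2):
    print(x, d1, d2)
# prints:
# 2 0 1
# 3 1 0
# 50 48 47
# 51 49 48
```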

Now we will see which of these values is smaller; these values are actually distances.

We take C1 and C2 again over here. For the point 2, between 0 and 1, the smaller one is 0, so 2 goes into the C1 cluster.

After that, for 3 the distances are 1 and 0, so it is closer to C2 and goes there. The next value, 50, is closer to C2 (47 against 48), so it goes there, and the next value, 51, is also closer to C2 (48 against 49), so it goes there too.

So, enter.

In this way we get our segregation.
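
This assignment step can be sketched in Python as well. It is a minimal illustration under the same assumptions (points 2, 3, 50, 51 and centroids 2 and 3); `nearest_centroid` is a hypothetical helper written for this example:

```python
points = [2, 3, 50, 51]
centroids = [2, 3]  # C1 and C2 from the walkthrough

def nearest_centroid(x, centroids):
    """Return the index of the centroid closest to x (ties go to the lower index)."""
    distances = [abs(x - c) for c in centroids]
    return distances.index(min(distances))

# Put every point into the cluster of its nearest centroid.
clusters = [[] for _ in centroids]
for x in points:
    clusters[nearest_centroid(x, centroids)].append(x)

print(clusters)  # prints [[2], [3, 50, 51]]
```

So with these starting centroids, the first cluster holds only 2, and 3, 50, and 51 all land in the second one.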

Now we have to create new centroids. The old centroids are of no further use, so we leave them as they are…

Ok!

So, from these clusters we will find the new centroids.

How are we going to do that?

This cluster contains only 2, so if I click Sum over here, we get 2.

In the same way, if we click Sum over here, we get 104 (that is, 3 + 50 + 51).

So, we got these two sums.

2 becomes the centre value of the first cluster because it has only one element.

And what do we do with 104? We divide it by the number of elements, 3, and that gives us the average, about 34.67.

So, this is our second average.

So, we have these two averages.

These now become our new centres: 2 and 34.67.

So, make one centroid 2 and the other 34.67.
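
The centroid update can be sketched in Python as follows, assuming the clusters after the first assignment are [2] and [3, 50, 51]. For the second cluster the mean works out to 104 / 3, which is about 34.67. The names here are only for illustration:

```python
# Clusters after the first assignment step.
clusters = [[2], [3, 50, 51]]

# The new centroid of each cluster is the mean of its members:
# the sum divided by the number of elements.
new_centroids = [sum(c) / len(c) for c in clusters]

print([round(c, 2) for c in new_centroids])  # prints [2.0, 34.67]
```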

Now, with these, find the distances again.

We will repeat these iterations until we get the same values inside the clusters.

Here we have the new centroids, so we take the difference between each original value and the new centroid.

So, this minus this value, and Enter.

Then we freeze the centroid cell by putting a dollar here, and here.

And here too we wrap it in the ABS function… in this way.

We fill this down to the eighth row, and do the same thing here also.

So, we remove this and type: equals, ABS, open bracket, this minus this, close bracket, Enter. After that we freeze this.

Dollar, dollar, so it is frozen.

And after this, we fill it down to the end.

Now, you will see how to form the clusters.

So, we have the new C1 and C2.

Those are 2 and about 34.67 (the mean of 3, 50, and 51); these are our C1 and C2.

Now see: 2 is closer to C1, so it goes here; next, 3 is also closer to C1, so it goes here; and then 50 and 51 go over here, to C2.

So, this is how it's going to be.

So, this is our new cluster.

Here, 2 and 3 form one cluster, and 50 and 51 form the second cluster.

So, we got two clusters here.

And we got them on the basis of distance.

Again, take their averages: for the first cluster it comes to 2.5, and for the second it comes to 50.5.

Ok!

Now, if you perform one more iteration, you will see that you get exactly the same clusters: one of 2 and 3, another of 50 and 51.
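
As a quick check of this last step, here is a small Python sketch that takes the clusters [2, 3] and [50, 51], computes their averages, and assigns the points once more. The variable names are assumptions made for the example:

```python
points = [2, 3, 50, 51]
clusters = [[2, 3], [50, 51]]

# Averages of the two clusters become the centroids.
centroids = [sum(c) / len(c) for c in clusters]
print(centroids)  # prints [2.5, 50.5]

# One more assignment step with these centroids.
next_clusters = [[] for _ in centroids]
for x in points:
    nearest = min(range(len(centroids)), key=lambda j: abs(x - centroids[j]))
    next_clusters[nearest].append(x)

print(next_clusters == clusters)  # prints True
```

The clusters do not change, which is the signal to stop.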

When the clusters repeat, you stop there, because it means the algorithm has formed the groups and created their clusters.

Otherwise, you continue the process.
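
Putting the steps together, the whole loop could be sketched in Python like this. It is a minimal one-dimensional sketch, not a production implementation: the function name `kmeans_1d`, the `max_iter` safety limit, and the rule that an empty cluster keeps its old centroid are all assumptions added for the example.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """Run a basic 1-D K-means loop until the clusters stop changing."""
    centroids = list(centroids)
    previous = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for x in points:
            distances = [abs(x - c) for c in centroids]
            clusters[distances.index(min(distances))].append(x)
        # Stop when this iteration repeats the previous clusters.
        if clusters == previous:
            break
        previous = clusters
        # Update step: each centroid becomes the mean of its cluster.
        # Assumption: an empty cluster keeps its old centroid.
        centroids = [
            sum(c) / len(c) if c else old
            for c, old in zip(clusters, centroids)
        ]
    return clusters, centroids

clusters, centroids = kmeans_1d([2, 3, 50, 51], [2, 3])
print(clusters)   # prints [[2, 3], [50, 51]]
print(centroids)  # prints [2.5, 50.5]
```

Starting from centroids 2 and 3, the loop settles on the same two clusters as the walkthrough, with centres 2.5 and 50.5.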

So, this was the simple process with which you can run the K-means algorithm.

So, friends, let's conclude here for today.

We will stop today's session here, and we will see its further parts in the upcoming sessions.

So, keep learning and remain motivated.

Thank you.
