Hello! I am (name) from LearnVern.
Today we continue our machine learning tutorial series.
In today's tutorial we will look at the FP Growth algorithm.
FP Growth stands for frequent pattern growth.
We have already understood this algorithm conceptually,
and now we will understand it in more detail with the help of a project use case and a dataset.
So first of all, let me introduce you to the dataset.
This is a dataset of transactions: customers bought items from a particular shop, and this is the record of those purchases.
Here a single record means that some customer bought those items together,
so every record shows what one customer bought.
So let us first import some libraries.
First, import numpy as np; this is the first library required.
The second library we will import is pandas as pd. Let us begin with these two.
Now let me load the dataset: dataset is equal to pd.read_csv.
Here we will load our dataset:
I will copy the file path and paste it inside these brackets, and our dataset will load.
The dataset we are loading becomes a DataFrame.
You can see it here with dataset.head(),
so here we can see that it has become a DataFrame.
Here we will do one more thing: after the path we write a comma and header=None, because this dataset has no header row, which means the column names are missing.
So pandas will take the column names as 0, 1, 2, 3, and so on.
Now you can see the first customer, whose data is at index 0.
He has bought this many items, OK.
The second one has bought only three items, this one has bought only one, this one two, and this one 1, 2, 3, 4, 5, five items.
So for each customer we get the list of items bought, and the remaining columns show up as NaN.
And this is how our dataset has been loaded.
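The loading step can be sketched as follows. Since the file path is not given in the session, this sketch reads a small made-up CSV from memory instead; the item rows are placeholders, not the tutorial's data. The first row is kept as the longest so that the column count is inferred from it.

```python
import io

import pandas as pd

# Hypothetical stand-in for the transactions file: no header row,
# one basket per line, rows of different lengths.
csv_text = (
    "shrimp,almonds,avocado,green tea\n"
    "burgers,meatballs,eggs\n"
    "chutney\n"
    "turkey,avocado\n"
)

# header=None tells pandas the first line is data, not column names,
# so the columns are labelled 0, 1, 2, ...
dataset = pd.read_csv(io.StringIO(csv_text), header=None)

print(dataset.head())   # shorter rows are padded with NaN
print(dataset.shape)    # (number of rows, widest row length)
```

With the real file, the in-memory text would be replaced by the copied file path.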
Understood so far?
Now a little preprocessing is required on this dataset: we have to put it into a list, because the algorithm expects a list of transactions.
So I will create an empty list named transactions, and we will insert all the data into this list.
First let us see the shape of our data: dataset.shape.
The shape shows at most 20 items per row, and 7501, seven thousand five hundred and one, rows, that is transactions, in total.
Now what we will do is write for i in range 0 to 7501, so the loop runs over all the rows, and inside it I will call transactions.append.
Here I will append the data of each row to transactions.
And what I append is itself a list, so we get a list inside a list, with every value converted to a string: str of dataset.values[i, j].
Here i is the row and j is the column,
and j comes from a second loop, for j in range, and that range is 0 to 20.
So j in range 0 to 20, alright.
Now let us see what format our transactions are in.
Let me display transactions.
You should be able to see all the transactions now.
Wherever a value was missing, nan is displayed, and for the rest we see frozen smoothie and all the other products.
So here we have the data as transactions.
This is what we wanted: each transaction's data separately.
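The preprocessing loop described above can be sketched like this. The small DataFrame is a made-up stand-in for the loaded dataset, and the loop bounds come from dataset.shape rather than the hard-coded 7501 and 20 used in the session.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the loaded dataset (3 rows, 3 columns).
dataset = pd.DataFrame([
    ["shrimp", "almonds", "avocado"],
    ["burgers", "eggs", np.nan],
    ["chutney", np.nan, np.nan],
])

n_rows, n_cols = dataset.shape

transactions = []
for i in range(n_rows):          # i is the row
    # one inner list per row; j is the column, every value becomes a string
    transactions.append([str(dataset.values[i, j]) for j in range(n_cols)])

print(transactions)
```

Note that str() turns a missing value into the string 'nan', which is why nan appears as if it were a product.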
Okay guys! I hope you are following me.
Now what we will do is a pip install.
Here we will install the FP Growth algorithm,
and the package we will install for that is pyfpgrowth, spelled p-y-f-p-g-r-o-w-t-h.
This pyfpgrowth package gives us a ready-made implementation of the FP Growth algorithm.
Here, it has been installed.
Now import it: import pyfpgrowth. So I have imported pyfpgrowth.
And if you want to give it a short name, add “as fpg”; we will import it with this name.
Now inside fpg, if you add a dot, you will see find_frequent_patterns and generate_association_rules.
These are the two functions we are going to use.
Let's use them one by one.
First we want to find patterns, because associations are built from patterns.
To find the patterns we use fpg.find_frequent_patterns.
This function requires the transactions,
so we will pass it our transaction data,
which is simply named transactions, and we also have to decide the support threshold,
that is, the minimum number of transactions a pattern must appear in.
For this dataset we will give a support threshold of two, and now we will execute it.
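To make the support threshold concrete, here is a minimal sketch that counts itemsets by brute force and keeps those appearing in at least two transactions. This is not FP Growth and not how pyfpgrowth works internally; it only illustrates what the threshold filters. The baskets are made up for the illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets, not taken from the tutorial's dataset.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "eggs"],
    ["bread", "butter", "milk"],
]
support_threshold = 2  # absolute count, as in the session

# Count every non-empty itemset once per transaction that contains it.
counts = Counter()
for transaction in transactions:
    items = sorted(set(transaction))
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            counts[itemset] += 1

# Keep only the itemsets that reach the support threshold.
frequent = {itemset: n for itemset, n in counts.items() if n >= support_threshold}

for itemset, n in sorted(frequent.items()):
    print(itemset, n)
```

Brute-force enumeration like this grows exponentially with basket size, which is one reason a dedicated algorithm is used on the real data.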
Now we will get the patterns.
It is executing on the complete dataset, and after it finishes,
we will form the rules, which are our association rules, alright.
Here I will write a comment, association rules, for the next cell; for now it is still finding patterns, so let it find them.
And a comment here: # patterns.
While it is finding the patterns,
I will write the next step, which generates the rules, so it starts with fpg dot.
What comes next is generate_association_rules,
and into it we pass the patterns identified above,
patterns.
For the confidence, I will give 70%, or let us say 60%.
So a confidence threshold of 60%, written as 0.6, has been given.
In this way the rules will be generated, and then let's see these rules.
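The two calls from the session can be put together as in the sketch below. The baskets are made up, and the try/except guard is an addition of this sketch so that it does not crash when pyfpgrowth is not installed; in that case the two dictionaries simply stay empty.

```python
try:
    import pyfpgrowth as fpg
except ImportError:
    fpg = None  # package missing; `pip install pyfpgrowth` is needed first

# Hypothetical baskets, not taken from the tutorial's dataset.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "eggs"],
    ["bread", "butter", "milk"],
]

patterns, rules = {}, {}
if fpg is not None:
    # support threshold of 2 (an absolute count), as in the session
    patterns = fpg.find_frequent_patterns(transactions, 2)
    # confidence threshold of 60%, passed as the fraction 0.6
    rules = fpg.generate_association_rules(patterns, 0.6)

# Each rule maps a tuple of bought items to (items bought with them, confidence).
for antecedent, (consequent, confidence) in rules.items():
    print(antecedent, "->", consequent, round(confidence, 2))
```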
When we display these rules, we will understand which items we should keep together if we are a seller.
If you are a seller and you manage the outlet, how should you place things together? That is what we will see here now.
It's taking some time; let it take some time, it doesn't matter.
The patterns come back in a dictionary format,
and here we will get the rules and then look at them.
FP Growth is generally faster than Apriori,
because it uses a tree-like structure, the FP-tree, and does not scan the complete dataset again and again.
Apriori repeatedly scans the complete dataset, which takes a lot of time and a lot of computation.
So here FP Growth is the better choice.
My suggestion would be to use FP Growth in place of Apriori whenever you want to do a market basket analysis.
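The tree-like structure mentioned above can be sketched as follows: the data is read twice, once to count items and once to insert each basket into a prefix tree, so baskets that share items share nodes. This is a simplified illustration with made-up baskets and hypothetical names (Node, build_fp_tree); it covers only the tree construction, not the pattern mining, and it is not the pyfpgrowth implementation.

```python
from collections import Counter


class Node:
    """One item on a path of the tree, with how many baskets passed through it."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, support_threshold):
    # Pass 1: count in how many baskets each item occurs, drop the rare ones.
    item_counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in item_counts.items() if c >= support_threshold}

    # Pass 2: insert each basket, most frequent items first, sharing prefixes.
    root = Node(None)
    for t in transactions:
        ordered = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
            child.count += 1
            node = child
    return root


def count_nodes(node):
    return sum(1 + count_nodes(child) for child in node.children.values())


def show(node, depth=0):
    for child in sorted(node.children.values(), key=lambda n: n.item):
        print("  " * depth + f"{child.item}: {child.count}")
        show(child, depth + 1)


# Hypothetical baskets, not taken from the tutorial's dataset.
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "eggs"],
    ["bread", "butter", "milk"],
]
root = build_fp_tree(transactions, support_threshold=2)
show(root)
```

Here nine frequent item occurrences are stored in five nodes, and later steps work on this compact tree instead of going back to the raw data.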
Let's see which rules are displayed. Here you can see the rules, OK, clear!
Let's look at the rules starting from the first one, as soon as this finishes.
OK, let's see it from the beginning,
so let's begin from here.
See, nan is also being treated as an item here; it could have been removed.
nan and water spray are repeated many times, then shrimp and water spray; whenever these were bought, nan shows up as well.
So let's discard these rules.
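One way to remove the missing values, as suggested above, is to skip them while building the transactions list. This is a sketch of that variation, not what was run in the session; the small DataFrame is again a made-up stand-in for the loaded dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the loaded dataset.
dataset = pd.DataFrame([
    ["shrimp", "almonds", "avocado"],
    ["burgers", "eggs", np.nan],
    ["chutney", np.nan, np.nan],
])

# Keep only real items: pd.notna() is False for the NaN padding.
transactions = [
    [str(item) for item in row if pd.notna(item)]
    for row in dataset.values
]

print(transactions)
```

With the padding gone, nan can no longer appear in a pattern or a rule.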
But watch here: low fat yogurt and napkins.
Whenever these two items were bought, spaghetti was also bought.
Here we can see which items were purchased and, given them, how likely it is that the other item is bought too, OK.
This is what we are getting here.
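That likelihood is the confidence of a rule: the number of baskets containing both sides of the rule, divided by the number of baskets containing the items on the left side. A minimal sketch, using made-up baskets rather than the numbers from the session:

```python
# Hypothetical baskets; the counts are invented for the illustration.
transactions = [
    {"low fat yogurt", "napkins", "spaghetti"},
    {"low fat yogurt", "napkins", "spaghetti"},
    {"low fat yogurt", "napkins"},
    {"napkins", "spaghetti"},
    {"spaghetti"},
]


def support_count(itemset):
    """Number of baskets that contain every item of the itemset."""
    return sum(1 for basket in transactions if itemset <= basket)


antecedent = {"low fat yogurt", "napkins"}   # items already in the basket
consequent = {"spaghetti"}                   # item we expect to be added

# confidence(A -> B) = support(A and B together) / support(A)
confidence = support_count(antecedent | consequent) / support_count(antecedent)

print(round(confidence, 2))
print(confidence >= 0.6)  # would this rule pass a 60% threshold?
```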
Like here, someone bought herb and pepper and napkins, and then ground beef was also purchased.
In this way we get combinations, the association rules, and on the basis of these rules we can place the items next to each other in the outlet where they are sold.
If this were an online website, the same rules would help you with product positioning and advertising there.
This is how Apriori and FP Growth help us do market basket analysis: they find association rules that tell us which product to place with another and which recommendations to make.
We can also design offers and strategies for customers.
You should apply this to more datasets and practice.
Friends, let's conclude for today.
We will stop today's session here and cover the remaining parts in the next session.
So keep learning, stay motivated, thank you.