Hello! I am (name) from Learnvern. (6 seconds pause; music)
Welcome to the machine learning course. This tutorial continues from the previous session.
So let us move ahead.
Today we will look at association rule mining, also called association rule learning.
I have already prepared a dataset here, so let us see what type of dataset it is.
It is a transactional dataset.
I have made this a dummy dataset so that it is easy to compute on.
Here you can see that dataset is a list that contains multiple lists.
What this means is that the first customer bought all these things, the second customer bought all these things, the third customer bought these, and so on. Customer by customer, we are keeping a record of who bought what. That is this dataset.
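As a rough sketch of what such a list of lists can look like: the item names below are assumptions, since the transcript does not spell out the full baskets shown on screen. They were chosen only to be consistent with the items mentioned later in this session.

```python
# Hypothetical dummy transactions: one inner list per customer.
# The exact items are an assumption, not the lists shown in the video.
dataset = [
    ["Milk", "Onion", "Nutmeg", "Kidney Beans", "Eggs", "Yogurt"],
    ["Dill", "Onion", "Nutmeg", "Kidney Beans", "Eggs", "Yogurt"],
    ["Milk", "Apple", "Kidney Beans", "Eggs"],
    ["Milk", "Unicorn", "Corn", "Kidney Beans", "Yogurt"],
    ["Corn", "Onion", "Onion", "Kidney Beans", "Ice cream", "Eggs"],
]

# Each inner list is one customer's basket, and baskets can differ in length.
for customer_number, basket in enumerate(dataset, start=1):
    print(f"Customer {customer_number} bought {len(basket)} items: {basket}")
```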
Well, we have the dataset, so now let us move ahead and import our libraries.
Import pandas as pd. After this we will use a library called mlxtend, m-l-x-t-e-n-d. (read slow, typing)
This mlxtend library will help us encode the data. For example, here the first customer bought milk but the second customer did not, so how do we represent that? We have to bring the data into a uniform shape.
This customer bought only four things, while the customer above has bought six or seven things, so we have to bring in some kind of uniformity.
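To make the idea of uniformity concrete, here is a minimal sketch in plain Python of what a transaction encoder does conceptually. It is not the mlxtend implementation, and the baskets are hypothetical stand-ins for the dataset on screen.

```python
# Hypothetical baskets of different lengths (assumed, not taken from the video).
dataset = [
    ["Milk", "Onion", "Nutmeg", "Kidney Beans", "Eggs", "Yogurt"],
    ["Dill", "Onion", "Nutmeg", "Kidney Beans", "Eggs", "Yogurt"],
    ["Milk", "Apple", "Kidney Beans", "Eggs"],
    ["Milk", "Unicorn", "Corn", "Kidney Beans", "Yogurt"],
    ["Corn", "Onion", "Onion", "Kidney Beans", "Ice cream", "Eggs"],
]

# Collect every distinct item once, sorted so the column order is stable.
items = sorted({item for basket in dataset for item in basket})

# One row per customer and one True/False flag per item,
# so every row ends up with the same number of columns.
encoded = [[item in basket for item in items] for basket in dataset]

print(items)
for row in encoded:
    print(row)
```

Every row now has one entry per item, regardless of how many things that customer bought.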
Let us see how that comes about: from mlxtend dot preprocessing import TransactionEncoder. (read slow, typing)
Here we are using a transaction encoder. OK, clear!
Now, let's make an object of the transaction encoder: te is equal to TransactionEncoder().
Now let us create te underscore arry. We will call te dot fit, and what will we fit? We will fit our dataset, which is a list.
Let me put my dataset here. So, I have put the dataset here.
Let us see what happens to te_arry. After fitting, we will transform right here in the same statement, and in this way you can chain the calls like a pipeline.
Until now, I was showing you fit and predict as separate steps, but fit and predict, or several other functions, can be run one after another in a single statement, and that works. It really works.
So we call transform, and the input is the same dataset, OK.
Now look at te underscore arry. So this is our te_arry, fine.
Now I will turn this te_arry into a dataframe, because it does not look good as a raw array, right?
Here we will make df, a dataframe: pd dot DataFrame, with a capital D and a capital F, and into this dataframe we pass te_arry.
Now you can see our df is ready, and we can view it nicely.
There is one more thing: when I made the dataframe, I did not pass the column names.
Let me pass the column names too, so that it becomes easier for us.
So here, columns equals te dot columns underscore (te.columns_).
So this is how the dataframe will be made.
See now: you can see that Apple is False.
So the first customer didn't buy apples, didn't buy corn, didn't buy dill, bought eggs, didn't buy ice cream, and bought kidney beans.
Whatever was bought is shown as True, and whatever was not bought is shown as False.
So our data is displayed as True and False values, and at the same time every row has the same number of columns with the corresponding data.
So this is the dataframe we created.
Now, from mlxtend dot frequent underscore patterns we will import apriori.
So here we have imported apriori.
Now we will pass this dataframe to the apriori algorithm.
So apriori, and into it we pass df, the dataframe. (read slowly, typing)
We will also define min underscore support. How much will the minimum support be?
It will be 0.6, that is, 60% support.
See, these are the itemsets with at least 60% support.
Itemsets such as item number 3 or item number 5 are single items, so they are not combinations.
What we want is an association, which means we want at least two items together, like this one, this one, and this one.
Among these there is a strong association: we can see 60% to 80% support, right?
Now I will use the same command and extend it to display the column names: use underscore colnames equals True.
This way you will understand it even better.
Now see, if you want to read off the associations here, you can say that eggs and kidney beans have 80% support, while eggs and onion and all the ones after that have 60% support.
If you want to keep items together, keep eggs and kidney beans together, keep eggs and onions together, and keep kidney beans and milk together, because customers who buy one of these are frequently buying the other as well, for example onions along with kidney beans.
This is the evidence that we get here.
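To see where a figure like 80% comes from, support for a pair can be counted by hand: it is the share of transactions that contain both items. The following is a plain-Python sketch on a hypothetical dataset, not the apriori algorithm itself, since it simply counts every pair without any pruning.

```python
from collections import Counter
from itertools import combinations

# Hypothetical dummy transactions (assumed, not the exact lists from the video).
dataset = [
    ["Milk", "Onion", "Nutmeg", "Kidney Beans", "Eggs", "Yogurt"],
    ["Dill", "Onion", "Nutmeg", "Kidney Beans", "Eggs", "Yogurt"],
    ["Milk", "Apple", "Kidney Beans", "Eggs"],
    ["Milk", "Unicorn", "Corn", "Kidney Beans", "Yogurt"],
    ["Corn", "Onion", "Onion", "Kidney Beans", "Ice cream", "Eggs"],
]

min_support = 0.6
n_transactions = len(dataset)

# Count in how many baskets each pair of items appears together.
# set() removes repeated items inside one basket before pairing.
pair_counts = Counter()
for basket in dataset:
    for pair in combinations(sorted(set(basket)), 2):
        pair_counts[pair] += 1

# support(pair) = baskets containing both items / total baskets
frequent_pairs = {
    pair: count / n_transactions
    for pair, count in pair_counts.items()
    if count / n_transactions >= min_support
}

for pair, support in sorted(frequent_pairs.items()):
    print(pair, support)
```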
In this way we can use the apriori algorithm. It is very helpful for recommendations, so you can use it.
You can choose some other dataset, apply it there, and let us know how it goes.
We will stop today's session here and see the next parts in the next session, so keep learning and remain motivated.
If you have any queries or comments, click the discussion button below the video and post there. This way, you will be able to connect with fellow learners and discuss the course. Also, our team will try to solve your query.