Hello, everyone
I am Kushal
From LearnVern.( 6 seconds pause ; music )
In our previous session, we learned what Data Wrangling, or Data Preprocessing, is.
So, now we will move ahead and see,
How this Data Wrangling is actually performed.
But to do this,
We require certain libraries.
So we will first see What are these Libraries?
How to import them?
And to execute Data Wrangling, we need Data
So we will see,
how we can import data and load it.
So we are going to cover all these things.
Now, first we will see
What actually is a Library? (Repeat)
We have libraries in our schools and colleges, and we might even have a small library at home.
So whenever we have any doubt or there is lack of clarity in certain concepts then, we go into a particular section of that Library and start reading books, and after completing one or two books, we achieve the clarity on that concept.
Similarly, here in software also whenever we are doing programming, we require a lot of functions.
And at that moment, do we start writing those functions?
No! We don't sit and write those functions ourselves; if we did, it would take years just to write the functions.
So, after years of effort by other people, standard sets of features are made available to us in the form of libraries.
You can see some examples of these libraries on your screen:
NumPy (pronounce : num pie),
Matplotlib (pronounce : mat plot lib),
Pandas,
Keras,
Scikit-learn (pronounce : sai kit learn),
TensorFlow (pronounce : tensor flow).
Suppose that, for your software, you have to work with a lot of numbers or numerical data.
For that, you can start with NumPy or pandas.
They help in handling your numerical data.
What's more, they will help you with your EDA and also with Data Wrangling.
Alright ! Okay !
Next, if you want to do any visualisation, then you can go with Matplotlib.
And if you want to implement a lot of algorithms, test them, and compute their metrics, then scikit-learn is a fantastic library for that.
Further, if you are interested in deep learning, then Keras and TensorFlow will help you a lot.
All these libraries provide you with ready-made functionality, and through them you can perform tasks that demonstrate your programming skills, analytical abilities, and machine learning capabilities.
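To give a feel for what "handling numerical data" means, here is a minimal sketch. The numbers are made up for illustration, and it assumes NumPy and pandas are already installed.

```python
import numpy as np
import pandas as pd

# Hypothetical quantities, made up for illustration only.
quantities = np.array([3, 5, 4])

# NumPy works on the whole array at once, with no explicit loop.
average_quantity = quantities.mean()

# pandas wraps the same numbers in a labelled table (a DataFrame).
table = pd.DataFrame({"quantity": quantities})
largest_quantity = table["quantity"].max()

print(average_quantity, largest_quantity)
```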
(02/57)
So, now let's start and see how a library can be set up.
For that, you can start with either of two tools, that is, Jupyter notebook or Google Colab.
Here, I am using a Jupyter notebook.
In this, I will show you how you can set up a library.
Here,
Step 1 is installation; if a library is not installed, we cannot even import it.
So,
Step 2 is import. (typing)
Now, I will show you how installation and import is done.
When you want to install a library, you have the command
'pip install pandas'.
In a notebook, we put an exclamation mark in front of pip.
pip is Python's package installer.
install is the command we give to pip, telling it to install something,
and what it has to install is written last,
that is pandas; pandas is the name of the library.
Any library name can go there, like keras, tensorflow, or numpy.
So in place of pandas, you can write the name of any other library.
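Before installing, it can help to check whether a library is already present. The following is only a sketch: the helper name is_installed is hypothetical, and it relies on the standard library's importlib.util.find_spec. Note that scikit-learn is imported under the name sklearn, so the import name and the pip name can differ.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if Python can find a module with this import name."""
    return importlib.util.find_spec(module_name) is not None


# Import names to check; scikit-learn is imported as "sklearn".
for module_name in ["pandas", "numpy", "sklearn"]:
    if is_installed(module_name):
        print(f"{module_name} is already installed")
    else:
        # In a notebook cell you would run: !pip install <package name>
        print(f"{module_name} is missing; install it with pip first")
```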
(04/26)
Let's see, how we can do the Installation.
So: exclamation mark, pip install, P A N D A S, pandas. (typing)
Control + enter.
Here, you can press control+enter as well as
Shift+enter.
With shift +enter a new cell also gets created.
So, the star mark over here tells you that the cell is still executing.
As soon as it has executed, it gets a number; here it was the second cell to be executed.
Now, here you can see "Requirement already satisfied", which means pandas is already installed.
So if it is installed,
The next step will be import.
So to import it we write
import, then the name of the library, then as and a short name, which we call an alias. The import keyword and the library's name are compulsory; the as part is optional.
So, here you will see import pandas as pd.
Here I will show you on the notebook
(Writing program ; be a little slow)
import pandas, this is right. (Typing)
import pandas as pd, this is also right. (Typing)
(done)
So, now you might be wondering. Why did I do it in two ways?
Because whenever the name of a library is long, we shorten it in this way.
So, in future when you have to use this library then you don't need to write pandas completely, you can just type pd and you will get all the suggestions of different functions of pandas in front of you.
So by giving an alias name you can shorten the name and use all the functions by using that alias.
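The two import styles can be compared side by side in a short sketch, assuming pandas is installed. The Series values are arbitrary.

```python
import pandas
import pandas as pd

# Both names refer to the same library; "pd" is just a shorter label.
same_library = pd is pandas

# With the alias, every call starts with "pd." instead of "pandas."
series = pd.Series([1, 2, 3])

print(same_library, series.sum())
```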
So, this is all about installation.
Is everything clear so far?
After installation, next is about data sets.
So, if you want to load data sets, the pandas library is well suited for that.
Now, to load a file, there is one function in pandas, that is,
read underscore c s v (read_csv); by using it you can load a file.
So, what is the syntax?
(Writing program ; be a little slow)
data equals pd dot read underscore c s v, and in the brackets you write the path of the data, that is, data = pd.read_csv(path). In this way, you can read it.
(done)
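The syntax just described can be tried with a throwaway file. This is only a sketch: the file name example.csv, the header names, and the rows are all hypothetical, and a comma is assumed as the separator.

```python
import os
import tempfile

import pandas as pd

# Create a small throwaway CSV so the example runs anywhere.
# The file name, header names, and rows are hypothetical.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "example.csv")
with open(path, "w") as f:
    f.write("s_no,name,quantity,day\n1,Asha,2,4\n2,Vikram,6,1\n")

# The core call from the lesson: pass the path of the file to read_csv.
data = pd.read_csv(path)

print(data)
```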
Now, let's try to execute this.
So, here I am opening a file in front of you for the demo.
So, here there is a file called Data dot csv, so now I will edit this file to show you the content in it.
So, here you have the content
1 kunal 3 5. (speak digits individually)
1 Rehan 5 7.
1 Rohan 4 7.
Meaning,
here we have serial number, name, quantity, and then day.
These are the columns we have.
So, now I want to load this file itself.
So how can I do that?
First, I will take the path of this,
For that I will go to the properties of it, and from here I will copy its path…
Copy.
Now, I will use this path over here,
(Writing program ; be a little slow)
Data is equal to pd dot read csv.
(Pause 15 seconds ; Typing)
I am reading data dot c s v, now.
(done)
Let me execute this,
So, here you can see I was using a single backslash in the path; I have to double every backslash, only then will it be correct.
On Windows, paths use backslashes, and in a Python string each one has to be written as a double backslash; on Linux, paths use a single forward slash, so nothing needs doubling.
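Here is a small sketch of the ways a Windows path can be written in Python. The folder C:\Users\demo is hypothetical, not the path used in the video. A raw string (prefix r) is an alternative to doubling each backslash.

```python
from pathlib import PureWindowsPath

# A hypothetical Windows location, written three equivalent ways.
escaped = "C:\\Users\\demo\\Data.csv"   # each backslash doubled
raw = r"C:\Users\demo\Data.csv"         # raw string: backslashes kept as typed
forward = "C:/Users/demo/Data.csv"      # forward slashes also work on Windows

# The first two are exactly the same string.
print(escaped == raw)

# As Windows paths, the forward-slash form points to the same file.
print(PureWindowsPath(forward) == PureWindowsPath(raw))
```

Any of these three strings could be passed to pd.read_csv on a Windows machine.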
Now, let me show you the data.
Just by typing data,
we can see the data, and it has picked everything up correctly, with serial number, name, quantity and day.
So, this is how you can load a file in c s v format, or in t s v format if you tell read_csv that the separator is a tab.
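For a tab-separated file, read_csv needs the sep argument. The sketch below reuses the demo rows as in-memory text; the header names are an assumption, since the actual header line of the file is not shown.

```python
import io

import pandas as pd

# The demo rows, written as tab-separated text. The header names and
# the choice of tab as the separator are assumptions for this sketch.
tsv_text = (
    "s_no\tname\tquantity\tday\n"
    "1\tkunal\t3\t5\n"
    "1\tRehan\t5\t7\n"
    "1\tRohan\t4\t7\n"
)

# sep="\t" tells read_csv that columns are split on tabs, not commas.
data = pd.read_csv(io.StringIO(tsv_text), sep="\t")

print(data)
```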
Understood , Perfect!
Now, suppose you want to load a built-in dataset.
So, how can you do that?
For that, first we will have to import a library.
(slow)
For example, from sklearn import datasets. (typing )
So, from sk learn import datasets.
(done)
datasets is the package where a lot of built-in, ready-made data is available.
Here, I will show you,
(Writing program ; be a little slow)
if I type datasets dot here and press Tab, then L O A D, load, I get load boston, load breast cancer, load diabetes.
We have so many sample datasets available.
Ok, so, from so many data samples, we will select iris. (pronounce ; ai rish)
Now, after choosing iris let me execute this.
So, now you will see all iris data has been loaded.
So datasets, which is a built in or pre built package, from there also we can load data.
Now, we will put this data in a variable.
So data is equal to, and then I will copy the code from here.
Now, here you will see that the dataset is stored in the variable data.
There are many other things inside data; here we will type data dot target, so we will see all the targets.
(Done)
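The built-in dataset steps can be put together as follows. This is a sketch assuming scikit-learn is installed; the variable names features and targets are chosen here for readability.

```python
from sklearn import datasets

# load_iris returns a dictionary-like object holding the dataset.
data = datasets.load_iris()

features = data.data    # measurements, one row per flower
targets = data.target   # numeric class label for each row

print(features.shape)
print(data.target_names)
```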
So, we saw two different ways to load data.
One is pd dot read underscore c s v, by which we can load the data from a file, and
the second method is from sklearn import datasets, where data is equal to datasets dot load underscore iris.
In these two ways we can load the data.
Now, as we know how to load the data, we know how to import libraries, and we even know that inside these libraries there are different functions that can be used in data analysis also.
But before analysing the data, we have to make it fit for analysis.
How can we do that?
One step in that is handling the missing values.
So, in our upcoming session we will see,
How we can handle missing values.
So keep watching and remain motivated.
Thank you.