
Distribution Models - Part 1


Hello! I am (name) from LearnVern

Welcome to the Machine Learning course.

This tutorial continues from the previous session.

So, let’s begin

Today we are going to talk about distribution models,

which means we will try to understand how statistics and probability

help us in machine learning.

To start with, I’ll take an example,

Suppose that you are a professor at a college or university

Recently exams were conducted

And you are supposed to check exam papers

So, while checking papers you have done marking on this basis.

Let’s take a demo here,

We have a serial number here, or we can call it a roll number.

After the roll number we have the scores.

And the way the marking was done, for example:

The first roll no. got 23 marks

2nd got 34

3rd roll no. got 54 and so on

But the 4th one didn’t get marks, okay

And the 5th one got 28 and the 6th roll no. got 45 marks.

This basically, you will see, is our data

And in this data if you see there is a missing value

This is possible, it's normal, we can forget sometimes and these cases can emerge.

So, I am selecting this data and we have various options here

I go to the ‘insert’ option and click on ‘chart’.

After clicking on ‘chart’ we wait a bit and see

So, as you can see, a chart appears here

Now, what are these charts showing?

The chart is basically plotting the scores against the roll numbers,

that is, how much the 1st one scored, and so on.

So, this is a line chart.

Now, we convert this line chart to a column chart.

So, here you can see, a column chart appears,

in which, as you can see, at one place there is a distortion, why?

Because we don’t have value here, so there is a distortion.
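The same table can be sketched in code. This is only an illustration, not the spreadsheet used in the video: the dictionary name `scores` and the use of `None` for the unscored paper are assumptions, and the text bars stand in for the column chart.

```python
# Scores keyed by roll number, as read out in the lecture.
# None marks the 4th paper, which has no score recorded.
scores = {1: 23, 2: 34, 3: 54, 4: None, 5: 28, 6: 45}

# Roll numbers whose score is missing; these cause the gap in the chart.
missing = [roll for roll, score in scores.items() if score is None]

# Crude text column chart, so the gap is visible without a plotting library.
for roll, score in scores.items():
    bar = "#" * (score // 2) if score is not None else "(missing)"
    print(f"roll {roll}: {bar}")

print("missing roll numbers:", missing)
```

Listing the missing entries before plotting makes it easier to tell a genuine gap in the data from a feature of the distribution.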

So, in this way, from the data points and their plot, we can determine what type of distribution this is.

In simple terms, we can determine how the data is distributed.

So, let’s work on this and build more understanding of what a distribution is and in how many ways we can work with data distributions.

We first need to understand and then we can utilise it.

So, let’s begin,

Firstly, I will tell you what the common data types are.

I am not talking about programming data types,

The data types, I am talking about, are basically the types of data,

which means, one type we have is discrete..and the second one is continuous.

So, I am talking about discrete and continuous data types. Ok

If we give example of discrete data, discrete means,

If I have any item which can take value as, 1, 2, 3. So, we have only three allowed values as 1, 2, and 3.

I will call them discrete values, but at the same time if I take another example, like,

I went to buy cooking oil. With cooking oil, it is not necessary to take a quantity in whole numbers, i.e. 1 litre or 2 litres.

In this case, you can take 1 litre, 1.5 litres, or 2 litres, or even 2.3 is ok.

You get whatever quantity you want, so we call it continuous data.

So, we need to know first whether a given value in our data is discrete or continuous.

A dataset can contain both types.

I have given you examples, and even you can think of more examples like,

Day or night, it is discrete data, if we give 1 value to day, and 0 to night. So, either it will be day or night. Right?

Another is evening, then we will give value 1 to day, 2 to evening, and 3 to night. So, it is 1,2,3 these are also discrete.

So, discrete means it will have a fixed number of values. Continuous data, for example floating-point values, can take any value within a range.

For example, the price of any item, or litre or kg, these types of things are continuous.

Now, we start our discussion with the Bernoulli distribution, spelled B E R N O U L L I.

What is this Bernoulli distribution? Ok

Let’s take the example of cricket.

In a cricket match, you tell me:

What will be its result?

Either win or lose, isn’t it?

Can it be different from winning or losing?

Ok, there is a tie, but after a match tie, they make them play again, and after playing again, there is win or lose.

So, if we see in normal conditions, there is win or lose.

Or there can be multiple scenarios where there is success or failure. Only two things are there.

So, in Bernoulli distribution, the output of the data is either yes or no. Yes or no will be its output.

Or we can represent it in binary terms 1 or 0, binary means- two.

So, in Bernoulli also, we have only two items.

Now, if I ask what the probability of a win is: the total number of possible outcomes is two, win or loss. Isn't it?

And the number of winning outcomes is 1, so, treating both outcomes as equally likely, the answer is 1 divided by 2 (1/2),

which is equal to 0.5. So, 0.5 is the probability of winning.

So, what is the probability of loss? It will be 1 minus the probability of win, right? That is 1 minus 0.5, which is 0.5. So this is the probability of a loss.

So, what is happening here, it can be either win or loss.

And what are the probabilities of both? The probabilities of both are coming out to be 0.5, is it not?

There is a 50-50% probability.

So, if we want to write its mathematical function, we write it like this.

We also call it the probability mass function.

P(X = x) = 1 - p when x = 0,

and P(X = x) = p when x = 1.

So, only two things can happen.

That is the whole condition. Nothing more.

Here p is the probability of winning (x = 1), and the second case, losing (x = 0), has probability 1 - p, as I calculated above.

So, which distribution is this? It is a Bernoulli distribution.
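The probability mass function described here can be written as a small function. This is a minimal sketch; the function name `bernoulli_pmf` and the variable names are assumptions made for illustration.

```python
def bernoulli_pmf(x: int, p: float) -> float:
    """Return P(X = x) for a Bernoulli variable with success probability p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if x == 1:        # success (win)
        return p
    if x == 0:        # failure (loss)
        return 1.0 - p
    return 0.0        # no other outcome is possible

# The cricket example from the lecture: win and loss are equally likely.
p_win = bernoulli_pmf(1, 0.5)
p_loss = bernoulli_pmf(0, 0.5)
print("P(win) =", p_win, "P(loss) =", p_loss)
```

The two probabilities always add up to 1, whatever value of p is chosen.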

So after Bernoulli, we can take an example,

Let’s set up a sheet with a column for team 1 and a column for status.

So we have team 1 and status, and we also have matches here.

We have match 1, match 2, and so on.

Suppose we have 8 matches.

Now, for win and lose, we could generate random entries with = RAND,

but for now we simply type win or lose ourselves.

I put win or lose here at random: lose in some rows and win in the rest.

Now take a frequency count of this; the output of this distribution is either win or lose.

Make a table here for the frequency count; I highlight it with a colour.

In the frequency count, I have one row for win and another for loss.

The frequency count of win comes out to be 1, 2, 3, 4, so it is 4, and loss is also 4.

Since 4 of the 8 matches are wins, the other 4 are losses.

So, its distribution will look like this.

I put a chart here; you can see how its distribution looks. Understood?

So this is Bernoulli distribution, where we have only two possibilities or outputs, either success or failure.
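The frequency count from the sheet can be reproduced with `collections.Counter`. The lecture only fixes the totals (8 matches, 4 wins, 4 losses); the order of results in the list below is an assumption.

```python
from collections import Counter

# Eight match results; the order is illustrative, only the totals come from the lecture.
results = ["win", "lose", "win", "lose", "lose", "win", "lose", "win"]

# Frequency count of each outcome.
counts = Counter(results)

# Relative frequency: the share of matches with each outcome.
relative = {outcome: n / len(results) for outcome, n in counts.items()}

for outcome in ("win", "lose"):
    print(outcome, counts[outcome], relative[outcome])
```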

Now, we move forward, and understand the next distribution which is Uniform distribution, ok

I give it the name Uniform distribution.

So, what does this uniform distribution mean?

We take an example here, you must have played carom sometime, or we take a better example of a snake and ladders game where we need to roll the dice. We take that example.

In that we have many options as, 1,2,3,4,5,6

Now, for example of rolling dice we have 1,2,3,4,5,6, which means we have 6 possibilities.

So, when we have a total of 6 possibilities, we will find out what is the probability of getting 1 or 2 or 3. We will find out the probability for each one.

What is the probability? Yes, just tell me.

Is it the case that the probability of getting 1 is more, or 2 have more chances of coming?

When you keep on rolling the dice, can it happen that the number which came in the present roll will not come in the next roll? This is not the case.

It can also happen that the same number 1 is coming again and again by chance or 4 is coming again and again.

So, it is a possibility, anything can happen here.

So, in this example of rolling a dice, any number is equally likely to come up on every roll; we call the outcomes “equally likely”. It is not that some have less chance and some have more.

Also, the result of one roll does not affect the next.

So, we call it a uniform distribution. Uniform Distribution. ok

Now, what is its function?

For a continuous uniform distribution between two limits a and b, we write it as f(x) = 1 / (b - a). ok

Now, what are a and b here?

The condition is -infinity < a <= x <= b < +infinity.

That is, a is greater than minus infinity and smaller than or equal to x.

In a similar way, b is greater than or equal to x, but it is less than plus infinity.

So, this is the formula it has, I’ll highlight it. Ok

Now, why do we call it uniform?

Suppose you throw the dice 4 times; on each throw, what is the probability of getting 1?

The possible outcomes are 1,2,3,4,5,6, so the probability of getting 1 is 1 divided by 6.

This 1 divided by 6 is the probability of each face.

If I plot a graph of it, we’ll plot a graph, look, it is purely uniform, so, we call it Uniform distribution. Okay?
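Both views of the uniform distribution can be sketched in a few lines: the dice, where each of six faces has probability 1/6, and the density 1 / (b - a) on an interval. The names `dice_pmf` and `uniform_pdf` are assumptions made for this illustration.

```python
from fractions import Fraction

# Discrete case: a fair dice, every face equally likely.
faces = [1, 2, 3, 4, 5, 6]
dice_pmf = {face: Fraction(1, len(faces)) for face in faces}

def uniform_pdf(x: float, a: float, b: float) -> float:
    """Density of the continuous uniform distribution on [a, b]."""
    if not a < b:
        raise ValueError("a must be smaller than b")
    # Constant height 1 / (b - a) inside the interval, zero outside it.
    return 1.0 / (b - a) if a <= x <= b else 0.0

for face, prob in dice_pmf.items():
    print(face, prob)
print(uniform_pdf(1.5, 1.0, 3.0))
```

Using `Fraction` keeps 1/6 exact, so the six probabilities add up to exactly 1.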

Now this is the second one after Bernoulli distribution.

Now, let’s move ahead, the next distribution is- Binomial distribution. B I N O M I A L.

For the first one, the Bernoulli distribution,

we took the example of cricket: win or loss.

So, let’s go back to the example of cricket. What are we doing here?

We are tossing a coin.

Now, what will happen in this? Either you will win the toss or lose it. Isn’t it?

So, we represent this as a scenario in which we get either head or tail in the toss.

These are the two possibilities. We can call head as success… and tail as failure. Ok

So, here also there is success and failure.

Now see, here only two possibilities are there- success or failure, head or tail.

You can take a number of examples.

Where there are two outputs possible, we call it binomial.

Bi, B I means two, so we call it binomial.

So, in this case, the output can be equally likely or even not equally likely to happen. It is not a compulsion.

It means that the chances of occurrences of success and failure events are not necessarily equal.

For example, as you can see, I am not a very strong person. Suppose I challenge a bodybuilder to a fight.

Then my chances of winning will be very low; maybe the probability is 0.2.

Then what will be the chances of my failure? It will be equal to 1 minus 0.2.

That comes out to be 0.8. So, the chances of my failure will be 0.8.

So, it is not equally likely.

So, in binomial, it is not necessary that the probability of both events is equal.

It is ok for one outcome to have a low probability and the other a high one. Ok

But whatever trial you are doing, it is not affected by the outputs of previous trials. It is independent.

For example, if I have five balls. Like 1, 2, 3, 4, 5

if I pick up one ball from here and put it somewhere else.

Now, you don’t have five balls, so the probability you will be calculating will be affected. Right?

But in this case, it is not affected because it is independent. Ok

So, you need to keep it in mind that every trial is independent from previous trials and it will not be related with previous ones.

Also, there are only two possible outputs not more than that i.e., win or lose, success or fail, head or tail. These outputs can come.

After this, we will conduct N number of trials, and in each trial, the probability of success or probability of win or probability of failure, will remain the same across N trials. It will not change.

So, this is binomial distribution.

Its formula is P(X = x) = n! / ((n - x)! x!) * p^x * q^(n - x).

Here n is the number of trials, x is the number of successes, p is the probability of success and q = 1 - p is the probability of failure.

So, this is basically the formula for the binomial distribution. OK
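The formula can be checked with a short function. This is a minimal sketch: the name `binomial_pmf` is an assumption, `math.comb(n, x)` computes n! / ((n - x)! x!), and the values n = 5 and p = 0.2 are chosen only for illustration (0.2 echoes the bodybuilder example).

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    q = 1.0 - p                      # probability of failure in one trial
    return comb(n, x) * p**x * q**(n - x)

# Five independent trials, each with success probability 0.2.
n, p = 5, 0.2
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

for x, prob in enumerate(pmf):
    print(x, round(prob, 4))
```

The probabilities over x = 0 to n add up to 1, which is a quick sanity check on the formula.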

Now, what type of graph does the binomial distribution have?

I’ll draw the frequency graph of this distribution.

Let me take some data, I’ll take some values here i.e., 1,2,3,4,5,6,7,8,9,10. OK

So, in this case, if you see, what is the probability of success and what is the probability of failure?

If we can see, it is not equal in this case.

Means, probability of success is not equal to probability of failure.

So, our data would look something like this.

Suppose the values are 0.5, 1, 2, 3, 3, 2, 1, 0.5, 0.5 and 0.4.

I’ll convert this distribution into a chart.

What can we understand here? As we can see, the probability of success is not equal to the probability of failure.

We get this type of lopsided graph when the probabilities of success and failure are not the same.

But what if the probabilities are the same?

The first column stays as it is. But in the second column, if the first value is 0.5, the values move symmetrically; that is, the first and last rows are the same.

If the 2nd row is 0.8 then the 9th row is also 0.8.

For the 3rd and 8th rows it will be 1; for the 4th and 7th rows it will be 2; for the 5th and 6th rows it will be 3.

If we look at a graph of this, the probabilities of failure and success are the same and the graph is symmetrical.

So, this is the Binomial distribution.
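The link between equal probabilities and a symmetric graph can be checked numerically. This sketch uses real binomial probabilities rather than the illustrative values typed into the sheet; n = 9 is an assumption chosen so that there are ten bars, and the helper names are hypothetical.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1.0 - p)**(n - x)

def is_symmetric(values, tol=1e-12):
    """True when the list reads the same from both ends, within tol."""
    return all(abs(a - b) <= tol for a, b in zip(values, reversed(values)))

n = 9  # ten possible counts of success: 0 to 9
fair = [binomial_pmf(x, n, 0.5) for x in range(n + 1)]    # p equals q
skewed = [binomial_pmf(x, n, 0.2) for x in range(n + 1)]  # p differs from q

print("p = 0.5 symmetric:", is_symmetric(fair))
print("p = 0.2 symmetric:", is_symmetric(skewed))
```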

Next, we will move to Normal distribution.

What is the normal distribution?

Why was it named as a normal distribution?

It is because many of the situations and events happening around us follow this distribution.

For example, if we are praising someone, the number of times we praise someone will eventually follow the normal distribution.

I will explain what this actually means.

Many natural phenomena around us behave in this same way. You will understand from its characteristics.

The normal distribution has equal mean, median and mode. They coincide at the same place. This is the first characteristic.

Now, the second thing is that the normal distribution is bell shaped. It is very popular.

It has a bell shaped curve, and it is symmetric about the mean.

Let’s move ahead.

Now, here you can check that the total area under the curve is 1, for the standard normal distribution and for every other normal distribution.

So, these are the properties of normal distribution.
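These properties can be checked numerically with the standard library. This is a sketch under stated assumptions: the helper `area_under_pdf` is hypothetical, it approximates the area with the trapezoidal rule, and the range -8 to 8 is chosen because almost none of the standard normal curve lies outside it.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard = NormalDist(mu=0.0, sigma=1.0)

def area_under_pdf(dist, lo, hi, steps=10_000):
    """Approximate the area under dist.pdf between lo and hi (trapezoidal rule)."""
    width = (hi - lo) / steps
    total = 0.5 * (dist.pdf(lo) + dist.pdf(hi))
    for i in range(1, steps):
        total += dist.pdf(lo + i * width)
    return total * width

area = area_under_pdf(standard, -8.0, 8.0)

# Symmetry about the mean: half of the probability lies to the left of it.
left_of_mean = standard.cdf(standard.mean)

print("area:", round(area, 6), "left of mean:", left_of_mean)
```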

Suppose we have days here: Monday, Tuesday, Wednesday, Thursday, Friday, Saturday and Sunday. Hence we have a total of 7 days.

And here we have values of 2, 4, 6, 8, 6, 4, 2.

You can see this distribution. I am adding a chart to this.

As you can see, how this distribution is.

And the first characteristic I told you about was mean, median and mode. We can find them.

We take the average of these values for the mean, which is 4.57.

If we want to find the mode, it comes out as 2.

Next is the median, which comes out to be 4.

So, as you can see, mean, median and mode nearly coincide. Basically, the mode can be 4 also, as 2, 4 and 6 have each come two times. So both the mode and the median can be 4.

So, as you can see, these values are close to one another.

I have taken an example in this way: 4, 4 and 4.57. So, this approximates a normal distribution.
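The three statistics for the values 2, 4, 6, 8, 6, 4, 2 can be recomputed with Python's `statistics` module. This is a small sketch; `multimode` is used because three values tie for most frequent, which is why the mode can be read as 2 or as 4.

```python
from statistics import mean, median, multimode

# Daily values from the lecture, Monday to Sunday.
values = [2, 4, 6, 8, 6, 4, 2]

avg = mean(values)          # 32 / 7
mid = median(values)        # middle value after sorting
modes = multimode(values)   # every value tied for the highest frequency

print("mean:", round(avg, 2))
print("median:", mid)
print("modes:", modes)
```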

If you see the curve here, it is a bell shaped curve as it looks like the shape of a bell. OK

We can select another type of chart but that is not required as we can see the shape of the bell clearly.

Now, I will explain another distribution that is Poisson distribution, P O I S S O N.

To understand this distribution, we take an example,

Suppose you work in a KPO or BPO and provide customer service.

While providing customer services you record the number of emergency calls.

Let’s say it is 30.

Likewise, the number of hot issues, let’s say it is 23.

Similarly, many things can be. Another one can be the number of late reporting. Let’s say it is 50 today.

As you can see, there are different types of events, and each count records how many times an event happened.

It means 30 emergency calls were received and 23 hot issues were reported. But these two are not correlated. Similarly, late reporting and the number of hot issues are not correlated.

So, here, one successful event does not influence another.

And this is the first property of the Poisson distribution: we count the number of events that take place, and those events are independent of each other.

Now, if we look at the number of emergency calls over the long term, the average rate must stay the same. For example, whether we look at 4 or 5 days or any shorter or longer period, we will receive about 30 calls a day on average; sometimes we may get 29, sometimes 27 or 31.

So, this will be near 30.

If you keep reducing the interval, the probability of an event in that interval approaches zero.

For example, if I take 1 minute, there may be no calls, but there are about 3 calls in 1 hour.

So, in 10 hours of office work there are 30 calls. But if we take a one-minute or one-second interval, the probability of getting a call will be very small, close to zero.

So, these are the properties of Poisson’s distribution.
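The shrinking-interval property can be illustrated with the Poisson probability mass function, P(X = k) = e^(-lambda) * lambda^k / k!, which the lecture does not write out. This is a sketch under assumptions: the function name `poisson_pmf` is hypothetical, and the rate of 3 calls per hour is taken from the call-centre example.

```python
from math import exp, factorial

def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k events when the average count is lam."""
    return exp(-lam) * lam**k / factorial(k)

calls_per_hour = 3.0                       # 30 calls over 10 office hours
calls_per_minute = calls_per_hour / 60     # same rate, shorter interval

# Probability of seeing no call, and of seeing at least one call, in one minute.
p_no_call_in_minute = poisson_pmf(0, calls_per_minute)
p_some_call_in_minute = 1.0 - p_no_call_in_minute

# Probability of exactly 3 calls in one hour.
p_three_in_hour = poisson_pmf(3, calls_per_hour)

print("P(at least one call in a minute):", round(p_some_call_in_minute, 4))
print("P(exactly 3 calls in an hour):", round(p_three_in_hour, 4))
```

Shrinking the interval from an hour to a minute shrinks the expected count in proportion, so the chance of any call in that minute becomes small without ever reaching zero.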

Here as you have already seen it is all about frequency. We are talking about frequency of data.

If I try to plot this data, I insert a chart here.

So, here is the plot of Poisson’s distribution. OK

So, this was all about different types of distribution.

I hope you understood it well.

They have mathematical formulas; some I have explained already, and the rest you can read up on.

We will use these learnings further. Its understanding is very important.

Now, we will stop this session for now and we will continue in the next session.

If you have any queries or comments, click the discussion button below the video and post there. This way, you will be able to connect to fellow learners and discuss the course. Also, Our Team will try to solve your query.

So, keep learning and remain motivated.

Thank You very much.
