Normal Distribution

Now, we will cover the second and most important distribution, which is the normal distribution.

The normal distribution is the most widely used distribution.

The normal distribution is symmetrical, which means the mean, median and mode all lie at its centre, and the left and right halves of the distribution are mirror images of each other.

So, what is one property of the normal distribution? For a probability density function, the area under the curve gives the probability. As we saw in the earlier cases, if I take the area under the curve over some range of values, that area is the probability that the variable falls in that range.

Since the normal distribution is a probability distribution, the total area under the curve is always 1, or 100%, and because it is symmetrical, half of that area lies on each side of the mean.

That is one reason why we use the normal distribution so widely: its probabilities can be found easily from the area under the curve.
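
To make the idea of "area equals probability" concrete, here is a minimal Python sketch. It assumes a normal distribution with mean 0 and standard deviation 1 (values picked only for illustration, not from the lesson) and uses `statistics.NormalDist` from the Python standard library. The helper `area_under_pdf` is a hypothetical name for a simple trapezoidal approximation.

```python
from statistics import NormalDist

# Assumed parameters, chosen only for illustration.
dist = NormalDist(mu=0.0, sigma=1.0)

def area_under_pdf(dist, a, b, steps=10_000):
    """Approximate the area under the density between a and b (trapezoidal rule)."""
    width = (b - a) / steps
    total = 0.5 * (dist.pdf(a) + dist.pdf(b))
    for i in range(1, steps):
        total += dist.pdf(a + i * width)
    return total * width

# Area between -1 and 1, compared with the exact probability from the CDF.
approx = area_under_pdf(dist, -1.0, 1.0)
exact = dist.cdf(1.0) - dist.cdf(-1.0)

# Over a very wide range the area approaches the total probability of 1.
total_area = area_under_pdf(dist, -10.0, 10.0)

# By symmetry, half of the probability lies below the mean.
left_half = dist.cdf(0.0)

print(approx, exact, total_area, left_half)
```

The approximate area and the CDF difference should agree to several decimal places, and the area over the wide range should be very close to 1.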

We also call the normal distribution the Gaussian distribution or the bell curve.

Why? Because if you look at its curve, it is bell-shaped, and because of that shape we also call it the bell curve.

Let’s see the probability density function of the normal distribution, which is: $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$.

We have already discussed in detail the values that appear here. f(x) is the normal probability density function, x is the value of the random variable, μ (mu) is the mean, σ (sigma) is the standard deviation and σ² is the variance.

So, if you put values of x, μ and σ into this formula, you get the density of the normal distribution at x.
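
As a rough illustration of putting values into the formula, the sketch below writes the density out term by term in Python and compares it with `statistics.NormalDist.pdf` from the standard library. The function name `normal_pdf` and the numbers used (mean 170, standard deviation 10) are assumptions made for this example and do not come from the lesson.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma at x."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))  # 1 / (sigma * sqrt(2 * pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)        # -(x - mu)^2 / (2 * sigma^2)
    return coefficient * math.exp(exponent)

# Hypothetical values, used only to exercise the formula.
mu, sigma = 170.0, 10.0
for x in (150.0, 170.0, 190.0):
    by_hand = normal_pdf(x, mu, sigma)
    by_library = NormalDist(mu, sigma).pdf(x)
    print(x, by_hand, by_library)
```

The density is highest at x = μ and takes equal values at points the same distance on either side of the mean, which is the symmetry described earlier.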

Now, we will see a few examples of normal distribution.

The normal distribution is one that we see in a lot of places in our day-to-day life.

For example, if we plot the heights of many different men, we find that male heights approximately form a normal distribution.

Now, if I look at the blood pressure curve, it shows that blood pressure mostly lies between 80 and 120.

Correct? In the curve shown, most of the values cluster around 80, which means the median and mean sit at about 80.

Values bunching around a central point and thinning out evenly on either side is the reason why it forms a normal distribution.

In the same way, if I plot shoe sizes against their frequency, that would also be approximately a normal distribution curve.

A normal distribution can be defined by two values. In the formula that we saw, one important value was the mean, which we denote by μ, and the other was σ, which is the standard deviation.

So, if we know these two values, the normal distribution is completely defined.

We call the mean the location parameter and the standard deviation the scale parameter.

So, let's take the mean first. If I change the mean of my distribution several times, how is the curve affected? Let's see.

This is a plot that I have drawn.

It shows normal distributions for different means.

On the x-axis I have plotted the score in some exam, such as the CAT or SAT.

The probability density is on the y-axis.

So, if you see here, I've drawn three curves.

One is shown in blue, where the mean is 900.

One is shown in red, where the mean has increased to 1100, and one is shown in yellow, with a mean of 1300.

As you can see, as the mean increases, the curve moves to the right-hand side.

So, this means that the mean defines where my curve’s peak lies.

That is, if I increase the mean, the curve shifts to the right.

If I decrease the mean, it shifts to the left.

So, the mean is an important parameter that defines the normal distribution.
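
The shift can be checked numerically with a small Python sketch. It uses the three means from the plot (900, 1100 and 1300). The lesson does not state the standard deviation used for these curves, so the value of 100 below is purely an assumption, and `peak_location` is a hypothetical helper name.

```python
from statistics import NormalDist

SIGMA = 100.0                 # assumed; the lesson does not state this value
means = [900.0, 1100.0, 1300.0]
scores = range(400, 1801)     # integer exam scores to evaluate the density on

def peak_location(mu, sigma=SIGMA):
    """Return the score on the grid where the density is highest."""
    dist = NormalDist(mu, sigma)
    return max(scores, key=dist.pdf)

# Map each mean to the score at which its curve peaks.
peaks = {mu: peak_location(mu) for mu in means}
print(peaks)
```

Each curve peaks exactly at its own mean, so raising the mean moves the whole curve to the right without changing its shape.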

Second, let's see how the standard deviation affects the normal distribution.

Now, in the same scenario, I change the standard deviation and keep the mean the same.

In the blue curve, the standard deviation is the lowest, which is 80.

In the red curve, the standard deviation is 120.

And in the yellow curve, the standard deviation is 190.

So, as the standard deviation increases, the curve gets wider: the larger the standard deviation, the wider the curve.

This means the values are widely spread, that is, they lie farther away from the mean.

And if you look at the blue curve, its peak is narrow and tall rather than flat.

This means that with a smaller standard deviation, the values lie closer to the mean.

So, the standard deviation basically tells us whether our curve is narrow and tall or wide and flat.
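
Here is a short Python sketch of the same comparison, using the three standard deviations from the plot (80, 120 and 190). The common mean of 1100 and the band of 100 points around it are assumptions made for this example, and `describe` is a hypothetical helper name.

```python
from statistics import NormalDist

MEAN = 1100.0                 # assumed; the lesson only says the mean stays the same
sigmas = [80.0, 120.0, 190.0]

def describe(sigma, mean=MEAN, band=100.0):
    """Return the peak height and the share of values within `band` of the mean."""
    dist = NormalDist(mean, sigma)
    peak_height = dist.pdf(mean)
    share_near_mean = dist.cdf(mean + band) - dist.cdf(mean - band)
    return peak_height, share_near_mean

results = [describe(s) for s in sigmas]
for sigma, (height, share) in zip(sigmas, results):
    print(f"sigma={sigma}: peak height={height:.5f}, share within 100 of mean={share:.3f}")
```

As the standard deviation grows, the peak gets lower and a smaller share of the values falls close to the mean, which is what the wider, flatter curves show.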

So, in this way, we have seen different properties of the normal distribution.

There is one more very important property of the normal distribution.

We call it the 1-2-3 rule.

We have already learnt it under another name, the empirical rule.

So, what is the meaning of this rule? Suppose my data follows a normal distribution.

Then, with only the mean and the standard deviation, I can tell where most of my values lie in that distribution. Simply put, the 1-2-3 rule can be applied to any normally distributed data.

What is the 1-2-3 rule? Suppose I have a graph of probability density versus x.

In it, μ is my mean.

If I go one standard deviation to the right and one to the left of the mean, there is about a 68% probability that a value lies within that range.

In the same way, suppose I go two standard deviations to the right of the mean and two standard deviations to the left.

Since the variable follows a normal distribution, I can simply say there is about a 95% probability that a value lies within that range.

In the same way, the range within three standard deviations of the mean holds about 99.7% of the probability. So, this is a very important property, which we also call the 1-2-3 rule, and it is very useful in data science.

Why? Because we often need to know how much probability a range of values holds, and if the data follows a normal distribution, we can simply apply this rule to find it.
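
The three percentages can be reproduced with a few lines of Python. This is a minimal sketch using `statistics.NormalDist` from the standard library. The function name `share_within` is hypothetical, and the second set of parameters (mean 1100, standard deviation 120) is only an assumed example to show that the result does not depend on the particular mean or standard deviation.

```python
from statistics import NormalDist

def share_within(k, mu=0.0, sigma=1.0):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

for k in (1, 2, 3):
    standard = share_within(k)
    shifted = share_within(k, mu=1100.0, sigma=120.0)  # assumed example parameters
    print(k, round(100 * standard, 2), round(100 * shifted, 2))
```

The exact values are roughly 68.27%, 95.45% and 99.73%, which the rule rounds to 68, 95 and 99.7.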

So, now that we have seen a lot of different properties, let's sum them up.

What exactly is the normal distribution? First property: the normal distribution is a symmetrical distribution whose mean, median and mode lie at the centre.

Second property: the total area under the curve is always 1.

This means that 100% of the probability lies under the curve of our normal distribution.

Third property: the mean defines the centre of the curve, that is, where its peak lies.

If the mean is larger, the peak lies farther to the right; if the mean is smaller, it lies farther to the left.

Fourth property: the standard deviation tells me whether the curve will be narrow and tall or wide and flat.

Fifth property, which is very important: if any data is normally distributed, then we can apply the 1-2-3 rule to it, also known as the 68-95-99.7 percent rule.

Here we have basically covered what normal distributions are.

If you have any comments or questions related to this course then you can click on the discussion button below this video and you can post them over there.

In this way, you can connect with other learners like you and you can discuss with them.
