
Measures of Variability - Mean Absolute Deviation

Overview

In this chapter, we will learn about Measures of Variability.

First, What is Variability? “Variability describes how far apart data points lie from each other and from the centre of the distribution.”

So, suppose I have a particular data set, and in that data set I have defined a centre.

Variability tells me how far all my data points are from that centre.

What did we see in Central Tendency? We saw where most of the data points lie.

Okay, but what will variability tell us? How far the data points lie from the centre, and also how far they lie from each other.

This is why a measure of dispersion is important: it tells us whether the data points are clustered around the centre or how widely they are spread, okay?

Suppose I have low variability.

Low variability is ideal because it means that “you can better predict information about the population based on the sample data”.

If you have low variability, it means that the data is clustered together. If you have high variability, it means that the values are less consistent: they lie far apart from each other, and we cannot make predictions as easily.

Let's take an example: suppose we measure the amount of time spent on the phone in three different groups of people.

Okay.

So, in a day, how long can one person use the phone? We have checked three different groups for this.

Okay.

Suppose the first sample is of high school students, the second sample is of college students, and the third sample is of adults in full-time employment, which means corporate employees.

In that scenario, I have drawn a curve for each of the three samples: the blue curve is my sample “A”, and “B” is represented in green.

The yellow curve is my sample “C”.

Each curve plots probability versus minutes used on the phone, okay.

In that I've got three different curves.

Now you can see that all three curves are not alike.

But there is one thing that is common in all three curves.

What is that? Their average. If you look at the mean, which lies at the centre, it is the same in all three curves.

It is 195 minutes, so on average each person has used the phone for about 3 hours 15 minutes, which I get to know from the mean.

But all three have different spreads.

And what do we see in them? “A” is a highly variable curve, and “C” has the lowest variability.

So, in this way, from the spread of the different curves, we can get to know which one of my samples is better.

Which sample is better and which sample is not that good for the predictions? Okay.
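To make the idea concrete, here is a minimal sketch with three small made-up samples. The numbers are hypothetical and are not the data behind the curves in the lesson; they are chosen only so that every sample has a mean of 195 minutes while the spread differs. The helper `value_range` is also just an illustrative name.

```python
from statistics import mean

# Hypothetical samples (minutes on the phone per day). Each one is built
# to have a mean of 195, but with a different spread.
sample_a = [75, 135, 195, 255, 315]   # widely spread
sample_b = [135, 165, 195, 225, 255]  # moderately spread
sample_c = [175, 185, 195, 205, 215]  # tightly clustered


def value_range(values):
    """Return the difference between the largest and smallest value."""
    return max(values) - min(values)


for name, sample in [("A", sample_a), ("B", sample_b), ("C", sample_c)]:
    print(name, "mean =", mean(sample), "range =", value_range(sample))
```

The mean alone cannot tell these samples apart; a measure of spread, here the range, can.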

Now, we will come to the ‘different types of measures of variability’.

As we saw in the last chapter, there are four measures of variability.

They are mean absolute deviation, variance, standard deviation and range.

We will see about all four one by one.

First is my Mean Absolute Deviation.

Mean Absolute Deviation is made up of three different terms.

We have already seen what Mean means. Absolute is about what happens to any particular value when you apply mod to it.

If you apply mod (the modulus, or absolute value) to a value, the result is always positive, and that is called its absolute value.

Suppose you apply mod to the value -4.

Then my converted value becomes 4, which is plus 4.
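In Python, the "mod" described here corresponds to the built-in `abs()` function. A small sketch, using the same value of -4:

```python
value = -4
absolute_value = abs(value)  # abs() drops the sign and keeps the magnitude
print(absolute_value)  # 4
```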

Deviation, which we were seeing right now, is: How much is the difference? How much is the dispersion? How much is the Variability?

Combining all these three terms together, what will be the definition? “Mean Absolute Deviation of a data set is the average distance between each data point and the mean”.

So, we take the absolute difference of each data point from the mean, and then we average those differences.

That average is called our Mean Absolute Deviation.

And it's all about the variability in the data set.

How can it be calculated? It can be calculated like this. You first calculate the mean, then you subtract it from each data point, then you take the mod of each difference, add these up, and divide the sum by the number of data points.
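Written as a formula, the steps above look like this. The notation is an assumption added here for illustration: $x_i$ is the $i$-th data point, $\bar{x}$ is the mean, and $n$ is the number of data points.

```latex
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
\qquad
\text{MAD} = \frac{1}{n} \sum_{i=1}^{n} \left| x_i - \bar{x} \right|
```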

Let's describe this formula through one example.

Suppose I have data of how many likes you have got on the six pictures that you have uploaded on Instagram.

Suppose the first picture got 10 likes.

The second picture got 15 likes.

The third picture got 15 likes.

The fourth picture got 17 likes.

The fifth picture got 18 likes and the sixth picture got 21 likes.

So, what will be the first step that you will take, to find mean absolute deviation? First of all we will find its ‘Mean’. How will we arrive at the Mean?

The mean would be the sum of likes divided by the total number of pictures, which is 96 divided by 6, which is 16.
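This first step can be checked with a short sketch. The variable names are illustrative; the like counts are the ones from the example.

```python
likes = [10, 15, 15, 17, 18, 21]  # likes on the six pictures

total_likes = sum(likes)               # 10 + 15 + 15 + 17 + 18 + 21
mean_likes = total_likes / len(likes)  # sum divided by number of pictures

print(total_likes, mean_likes)  # 96 16.0
```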

My second step would be, I will calculate the difference of all the data points from the mean.

Basically, what does this mean? How will the distance from the mean be calculated? 16 was my mean and 10 is the data point, so I take the difference between 10 and 16 and then take its absolute value, which is 6.

In the same way, I calculated the other values.

If the data point is 15, then 15 minus 16 is equal to -1.

I took its mod, and the value that I got was 1.

Next, I added up all those distances from the mean: 6 + 1 + 1 + 1 + 2 + 5.

From that, my final sum was 16.
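The distances and their sum can be reproduced with a short sketch (variable names are illustrative):

```python
likes = [10, 15, 15, 17, 18, 21]
mean_likes = sum(likes) / len(likes)  # 16.0

# Distance of each data point from the mean, with the sign dropped.
absolute_deviations = [abs(x - mean_likes) for x in likes]
total_deviation = sum(absolute_deviations)

print(absolute_deviations)  # [6.0, 1.0, 1.0, 1.0, 2.0, 5.0]
print(total_deviation)      # 16.0
```

Note that the sum of the distances happens to equal the mean (16) in this example; that is a coincidence of these numbers, not a general rule.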

Now my last step to calculate the Mean Absolute Deviation will be to take the total sum that we have calculated and divide it by the total number of data points, which is 6. That comes to 2.67 likes.

So, through the mean absolute deviation, we can say that on average every picture is about 2.67 likes, or roughly three likes, away from the mean.
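Putting the whole calculation into one place, a minimal sketch might look like the following. The function name `mean_absolute_deviation` is a hypothetical helper written for this example, not a function from any library.

```python
def mean_absolute_deviation(values):
    """Average absolute distance of each value from the mean of the values."""
    if not values:
        raise ValueError("values must contain at least one data point")
    center = sum(values) / len(values)
    return sum(abs(v - center) for v in values) / len(values)


likes = [10, 15, 15, 17, 18, 21]
mad = mean_absolute_deviation(likes)
print(round(mad, 2))  # 2.67
```

The empty-input check is a design choice made here so that the function fails with a clear message instead of a division-by-zero error.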

So, if you have any comment or question related to this course, you can post by clicking on the discussion button given at the end of the video.

In this way, you can connect and discuss with more learners like you.
