In this chapter we will see what a continuous probability distribution is and what its different types are.
A continuous random variable is a random variable that can take an infinite number of values in any interval.
Suppose I have the interval from two to three; within that interval I can get an infinite number of values: 2.1 can come, 2.12 can come, 2.31 can come.
So, any of the infinitely many values that lie in a given interval can occur, and such a variable is called a continuous random variable.
In the discrete case, the random variable took a countable and finite number of values, like x = 1, 2, 3, 4 for the number of red balls.
For a continuous random variable, the probability of taking any one exact value is extremely low; we call it almost zero.
What do we do in this case? We find the probability distribution over an interval, so that we can get information about that particular interval.
In case of continuous probability distribution, instead of using probability mass function, we use probability density function.
Now, let's take an example. Consider the time each employee of a company takes to go from home to office; we call this the commute time.
That is, if you work at a company, the time you take to travel from home to office is your commute time.
Suppose I want to find the probability that a particular person X reaches his office in exactly 35 minutes.
Since he could reach the office in 35.1 minutes, or in 34.9 minutes, the probability that he reaches in exactly 35 minutes is zero.
But if I define the same thing over a small interval, say I want to find the probability that he reaches his office in 35 to 40 minutes, then I can define that probability, and this is what a continuous probability distribution describes.
So, one very important property of a continuous probability distribution is that the probability of an exact value is almost zero, but we can find the probability over any interval.
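This property can be checked numerically. Here is a minimal Python sketch, assuming a hypothetical commute-time model of Normal(mean 35, sd 5) minutes; the lecture does not specify a distribution, so these parameters are illustrative only:

```python
import math

def normal_cdf(x, mean, sd):
    """CDF of a normal distribution, computed via the error function."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

# Hypothetical model: commute time ~ Normal(mean=35, sd=5) minutes.
mean, sd = 35.0, 5.0

# P(X = exactly 35) is the width-zero interval P(35 <= X <= 35): zero.
p_exact = normal_cdf(35, mean, sd) - normal_cdf(35, mean, sd)

# P(35 <= X <= 40), over an interval, is a genuine positive probability.
p_interval = normal_cdf(40, mean, sd) - normal_cdf(35, mean, sd)

print(p_exact)                  # 0.0
print(round(p_interval, 4))     # 0.3413
```

Under this assumed model, the chance of arriving somewhere between 35 and 40 minutes is about 34%, while the chance of arriving in exactly 35.000... minutes is zero.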
Now, we have different types of continuous probability distribution like uniform distribution, exponential distribution, normal distribution, log normal distribution.
We'll learn in detail about all these in the coming chapters.
But before understanding continuous probability distributions further, we will learn about a very important function which we call the cumulative distribution function.
So far, all the probabilities we have seen dealt with exact values, finite values, or a single given interval, as we just saw in the continuous case; but many useful scenarios arise when we ask about "less than or equal to" a value.
So let us first see what the cumulative distribution function is in the case of discrete random variables.
And then we will see how they are defined in the case of continuous random variables.
The cumulative probability of a discrete random variable X at a value x is the probability that X takes a value less than or equal to x, obtained by summing the probabilities of all values up to and including x.
Suppose X takes the values 0 to 4, which means I have five discrete values, and corresponding to each I know its probability.
Now I have to find the cumulative probability.
What I will do is start with the probability of 0 as the first cumulative value.
Then I add each current probability to the cumulative value before it, and the values I get form my cumulative distribution function.
We must always remember that the cumulative distribution function is an increasing function.
Why? Because we keep adding the current value to the previous total.
Because of this, the final cumulative value, the total, is equal to one.
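The running-sum construction can be sketched in Python; the PMF values below are made up for illustration (they are not from the lecture) and sum to 1:

```python
# Hypothetical PMF for a discrete random variable X taking values 0..4.
pmf = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.25, 4: 0.15}

cdf = {}
running = 0.0
for x in sorted(pmf):
    running += pmf[x]       # add the current probability to the previous total
    cdf[x] = round(running, 2)

print(cdf)  # {0: 0.1, 1: 0.3, 2: 0.6, 3: 0.85, 4: 1.0}
```

Notice that the cumulative values never decrease, and the last one is exactly 1, which is the property described above.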
Now, in this scenario, I have taken the discrete random variable on the x axis.
On the y axis I have taken the cumulative probabilities.
So we can easily plot it as a bar graph.
Why? Because discrete values are easy to draw as separate bars in a bar chart.
Now let me talk about continuous random variables.
Take the commute time example which we just talked about.
That was my continuous random variable, where I had defined intervals.
Suppose it takes 20 to 25 minutes for an employee to reach the office, and I have data that its probability is 0.15.
In the same way, the probability of 25 to 30 minutes is 0.2, and so on.
Now if I want the cumulative probability, I cannot attach it to a whole interval; so what we do is use the boundary points: 20 to 25 and 25 to 30 share the common value 25.
We will find the cumulative probability at such points.
How is it calculated? At 25 the cumulative probability is simply 0.15, the first value.
Next, I want to calculate the cumulative probability at 30.
How do we calculate that? I take the current probability, 0.2, and add the previous cumulative value, 0.15, which gives 0.35 as the cumulative probability at 30.
In the same way, for 35 I add the current value 0.3 to the previous cumulative value 0.35, and I get 0.65 as the cumulative probability at 35.
In this way, we keep adding the current value to the previous total and draw a cumulative probability curve.
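The same calculation can be written as a short Python sketch, using only the three interval probabilities given above (0.15, 0.2, 0.3); any further intervals are left out:

```python
# Interval probabilities from the commute-time example:
# (20-25 min, 0.15), (25-30 min, 0.20), (30-35 min, 0.30).
intervals = [((20, 25), 0.15), ((25, 30), 0.20), ((30, 35), 0.30)]

cumulative = {}
total = 0.0
for (lo, hi), p in intervals:
    total += p                  # current probability plus the previous total
    cumulative[hi] = round(total, 2)

print(cumulative)  # {25: 0.15, 30: 0.35, 35: 0.65}
```

The cumulative probability is attached to the right endpoint of each interval, exactly as in the 25, 30, 35 walkthrough above.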
In this case you will see that rather than being a bar chart it is a continuous curve in which I have plotted the respective values.
So we have now found cumulative probabilities in both the discrete and the continuous case.
And we have also defined what the cumulative distribution function is.
We have understood that in the discrete case we can show it with a bar chart, but in the continuous case we draw a continuous curve, not a bar chart.
Why? Because the cumulative probability at 21 will be different from that at 21.1.
In this way, all the different decimal values have their own cumulative probabilities, and we can calculate the probability of any particular interval.
Now suppose I am given two graphs: one is a probability density function graph and one is a cumulative distribution function graph.
And I have been asked to find probabilities from the two graphs.
Reading the cumulative distribution function is very easy: you look up the exact value on the graph and read off the result.
Suppose I am asked what the cumulative probability of 28 minutes is.
That means, if my x is 28, what is my cumulative probability? It is the y value corresponding to that x, which in this particular curve is 0.28.
So that is my cumulative probability.
Now what happens in the case of the probability density function? Here we use the area under the curve: the area enclosed under the graph over an interval gives us the probability for that interval.
So if somebody asks me for the probability that x is between 20 and 28, what we do is draw one line at 20 and one line at 28 on our curve.
We find the area under the curve between those lines.
Whatever value comes out, say in this case 0.28, that area under the curve is the probability for the interval.
In the cumulative case, by contrast, we simply read off the value at the exact point.
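Finding the area under a PDF can be sketched with simple numerical integration. The Normal(35, 5) density below is an assumed model, not the curve from the lecture's graph, so the resulting number will differ from the 0.28 read off there:

```python
import math

def pdf(x, mean=35.0, sd=5.0):
    """Density of a hypothetical Normal(35, 5) commute-time model."""
    return math.exp(-((x - mean) ** 2) / (2 * sd * sd)) / (sd * math.sqrt(2 * math.pi))

def area_under_pdf(a, b, n=10_000):
    """Approximate the area under the PDF between a and b (trapezoid rule)."""
    h = (b - a) / n
    total = 0.5 * (pdf(a) + pdf(b)) + sum(pdf(a + i * h) for i in range(1, n))
    return total * h

# P(20 <= X <= 28) under this assumed model: the area between x=20 and x=28.
p = area_under_pdf(20, 28)
print(round(p, 4))
```

This is exactly the "draw two lines and take the area between them" procedure, done numerically instead of visually.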
Now we have learned about both curves, the PDF and the CDF.
PDF is basically probability density function and CDF is cumulative distribution function.
But which of the two is better, and which one do we use more in real life? Let's see.
PDFs are the more commonly used of the two in real life.
There is a very simple reason for it.
In a PDF we can easily see the pattern of the distribution. For example, suppose I draw both the PDF and the CDF of a uniform distribution.
The PDF shows a flat, uniform value, from which I can tell immediately that this is a uniform distribution graph.
But what do I see in the CDF? Only an increasing curve, from which I cannot tell whether it is a symmetric distribution or a uniform distribution.
Let's see this with one more example.
If I draw the PDF and CDF of a symmetric distribution, then on the left you will easily see that the curve is symmetric about 0.
That means it looks the same to the left and to the right of 0.
But the CDF of the symmetric distribution is again just an increasing curve, which looks almost the same as the one for my uniform distribution.
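The point about patterns can be illustrated numerically: a uniform and a symmetric (triangular) distribution on [-1, 1] have clearly different PDFs, while both CDFs just increase from 0 to 1. Both distributions are chosen here purely for illustration:

```python
def uniform_pdf(x):
    """Flat density on [-1, 1]: the pattern is obvious from the PDF."""
    return 0.5 if -1 <= x <= 1 else 0.0

def uniform_cdf(x):
    """Linearly increasing from 0 to 1."""
    return min(max((x + 1) / 2, 0.0), 1.0)

def triangular_pdf(x):
    """Symmetric peak at 0: the symmetry is obvious from the PDF."""
    return max(1 - abs(x), 0.0)

def triangular_cdf(x):
    """S-shaped, but still just increasing from 0 to 1."""
    x = min(max(x, -1.0), 1.0)
    return (x + 1) ** 2 / 2 if x <= 0 else 1 - (1 - x) ** 2 / 2

for x in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(x, uniform_pdf(x), triangular_pdf(x),
          round(uniform_cdf(x), 3), round(triangular_cdf(x), 3))
```

The printed PDF columns look completely different (a constant 0.5 versus a peak at 0), while both CDF columns simply climb from 0 to 1, which is why the PDF is the more informative plot for spotting the shape of a distribution.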
So, in real-life scenarios we use the probability density function, and in this way we perform different kinds of data analysis.
So, in this chapter we covered two very important topics: first, what a continuous probability distribution is, and second, how it differs from the cumulative distribution function. If you have any comments or questions related to this course, you can click on the discussion button below this video and post them there.
In this way, you can connect with other learners like you and you can discuss with them.