This is our paired two-sample mean test. Until now we had taken only one sample.
We were examining the data of one sample from one population.
But suppose you have to perform a test where you have two different samples, let's say a morning sample and an evening sample.
If you have two different samples like this and you have to compare them, then we have to perform a two-sample mean test.
These tests are of two types.
One is paired and one is unpaired.
First of all, we will see what paired means.
Paired means that we measure the same subject twice. Let's assume we have a new medicine that we have to test, which means we have to perform a clinical trial and test the new medicine on our patients.
So, what we will do is create two samples for this medicine.
In which way? First, we record a few people before they have taken the new medicine, and then we record the same people after they have taken it, which means we compare the sample of people who have taken the medicine with the same people when they had not taken it.
So, in this way we have two different samples taken from the same subjects.
Then we will perform a two-sample mean test.
Let's take one more example.
Let's say that Virat Kohli has performed really well in his second innings.
So, I have to compare whether Virat Kohli's performance was the same in the first and second innings, or different, or which performance was better.
What happened in this case?
We have taken the data of one person in the first innings and the second innings.
In one test we are comparing how Virat Kohli's performance was.
In this situation we use a two-sample mean test, and since both sets of data come from one subject, we also call it a paired sample mean test.
Later we will illustrate in Excel how easy this is when I have the data of two samples and I have to compare their means.
First, let's see the formula for the paired two-sample mean test, which we also call the paired t test.
Look at this formula.
In the numerator there is x d bar, which is the mean of the differences between the paired values; this is the same as the difference between the two sample means.
In the denominator I have SD divided by the square root of n.
SD is the standard deviation of those paired differences, and n is my sample size, that is, the number of pairs.
Since it is paired, we have only one sample size value for both samples.
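Written out, the formula described here can be sketched as follows. The symbols d_i (the difference for the i-th pair) and s_d (the quantity called SD in the lecture) are notation chosen for this sketch, and the n - 1 divisor assumes the usual sample standard deviation.

```latex
t = \frac{\bar{x}_d}{s_d / \sqrt{n}}, \qquad
d_i = x_{\text{before},i} - x_{\text{after},i}, \qquad
\bar{x}_d = \frac{1}{n}\sum_{i=1}^{n} d_i, \qquad
s_d = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(d_i - \bar{x}_d\right)^2}
```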
How do we use this formula? Let's first see how we create our null and alternate hypotheses, and then we will come to the conclusion, that is, our result.
So, the example goes this way.
There is a researcher who has to check his new diabetes pill.
Which means he has to check whether that diabetes pill has any effect on patients or not.
So, what he did is he chose 10 patients randomly.
He tested those patients before the pill and after the pill, which means if you look at this table, we have 10 patients' data in the before column and the same 10 patients' data in the after column.
So, this means we measured those patients before and after the diabetes pill.
And we have noted that data.
Now, since we have two samples but the population is one, we will perform the paired two-sample mean test.
The first thing we have to do is create the null and alternate hypotheses.
My null hypothesis in any paired two-sample mean test is that the population mean before is equal to the population mean after the treatment.
This means we are saying that when the patient took the pill, it had no effect on him.
The alternate hypothesis is that mu before is not equal to mu after.
This means the pill has some effect or the other.
If that is the case, this treatment can be considered for the patient.
Now, since the alternate hypothesis uses the not-equal-to sign, we call it a two-tailed test.
So, as the first step we have created the null and alternate hypotheses.
What is the second step? We find the t statistic, which means we find the values that go into the formula for t.
If you look at the formula carefully, I need the mean of the differences and the standard deviation of the differences.
So, what we do is create one column that holds the difference between the before and after values.
In this table I have created that difference, like 9-7=2 and 10-6=4.
In this way, I have found all the differences.
After finding the differences we calculate x d bar.
x d bar is the mean of these differences, which is the same as the difference between the two sample means.
We have already calculated the differences.
If I add them up and divide by the total number of observations, I get the value of x d bar, which comes out as 1.7.
What is SD over here?
SD is the standard deviation of the difference column, not of the before values or the after values on their own.
Its formula is this; we put the values into it, and the value of SD that we get is 1.49.
Now we know x d bar and we have also calculated SD's value.
The value of n is 10, because our sample size, the number of patients, is 10.
We put all the values into t, and we got a value which comes out as 3.61.
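The arithmetic of this step can be checked with a short Python sketch. It first plugs in the summary values quoted in the lecture (1.7, 1.49 and 10), then shows the same calculation from raw pairs. The function names are hypothetical, and the four before/after pairs are made up for illustration (only 9 and 7, and 10 and 6, are quoted in the lecture); they are not the researcher's full table.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (mean of differences, SD of differences, t) for paired data."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [b - a for b, a in zip(before, after)]  # the difference column
    n = len(diffs)
    d_bar = statistics.mean(diffs)   # x d bar
    s_d = statistics.stdev(diffs)    # sample SD of the differences (n - 1 divisor)
    t = d_bar / (s_d / math.sqrt(n))
    return d_bar, s_d, t


def t_from_summary(d_bar, s_d, n):
    """Same formula, starting from already computed summary values."""
    return d_bar / (s_d / math.sqrt(n))


# Summary values quoted in the lecture: x d bar = 1.7, SD = 1.49, n = 10.
t_lecture = t_from_summary(1.7, 1.49, 10)
print(round(t_lecture, 2))  # 3.61

# Hypothetical illustration data, NOT the lecture's table.
before = [9, 10, 8, 7]
after = [7, 6, 7, 6]
d_bar, s_d, t = paired_t_statistic(before, after)
print(d_bar, round(s_d, 3), round(t, 3))
```

The sketch takes the standard deviation of the difference column only, which is the point that is easy to get wrong when doing this by hand.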
Now we have found t calculated, and if you have followed the pattern of all the previous tests that we have performed, you know the next step.
Once we have the calculated value, to compare it we first have to find its tabular or critical value.
For this particular test, we find t tabular from the t distribution.
Here one very interesting concept comes in, which is known as the degrees of freedom.
So, what happens in the case of the t test? If I subtract one from the total number of observations, I get the degrees of freedom, and if I look up the value in the t table corresponding to the degrees of freedom and to alpha, I get the t tabular value.
In the z score table that you have seen, we were finding a value corresponding to alpha only.
What happens in the t table? Whichever value corresponds to both alpha and the degrees of freedom, we assign to t tabular or t critical.
So, in this particular case my degrees of freedom is 10 - 1 = 9.
The alpha value is 0.05 and this is a two-tailed test, so we find the value that satisfies all these conditions.
And in this particular case t tabular comes out as 2.262.
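The table lookup can also be reproduced numerically. The following is a minimal sketch that uses only the Python standard library: it integrates the t distribution's density with Simpson's rule and then searches for the critical value by bisection. The function names are hypothetical, and in practice a library routine such as `scipy.stats.t.ppf` would normally be used instead of this hand-rolled version.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and the symmetry of t."""
    a = abs(x)
    if a == 0:
        return 0.5
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x > 0 else 0.5 - area


def t_critical_two_tailed(alpha, df):
    """Find c such that P(|T| > c) = alpha, by bisection on the CDF."""
    target = 1 - alpha / 2  # alpha is split across the two tails
    lo, hi = 0.0, 100.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


n = 10                      # number of patients in the lecture example
df = n - 1                  # degrees of freedom
t_tabular = t_critical_two_tailed(0.05, df)
print(df, round(t_tabular, 3))  # 9 2.262
```

Because the test is two-tailed, alpha is divided by two before the lookup, which is why the search targets a cumulative probability of 0.975 and not 0.95.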
The t calculated that we had found earlier was 3.61.
Whenever we have these two values, tabular and calculated, we make a decision based on whether the critical value is smaller or greater than our calculated value.
In this particular scenario the critical value is smaller than the calculated value.
So the calculated value lies in the rejection region.
This means that our final decision is to reject the null hypothesis.
What was our null hypothesis? That both means are the same.
So, since we are rejecting it, we can say that the pill has a significant effect on diabetic patients.
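The decision rule used here can be written as a tiny Python sketch. The function name is hypothetical; the two numbers are the calculated and tabular values from the lecture example. The absolute value is an assumption that follows from the test being two-tailed: a large negative t falls in the rejection region just as a large positive one does.

```python
def paired_t_decision(t_calculated, t_critical):
    """Two-tailed rule: reject H0 when |t calculated| exceeds t critical."""
    if abs(t_calculated) > t_critical:
        return "reject H0"
    return "fail to reject H0"


# Values from the lecture example: t calculated = 3.61, t tabular = 2.262.
decision = paired_t_decision(3.61, 2.262)
print(decision)  # reject H0
```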
So, in this way, by using the formula, we can perform the two-sample mean test, which generally becomes a little hectic because there are a lot of mathematical calculations.
So, for these kinds of situations, in the industry we generally perform these tests in Excel or in other statistical tools like Python or R.
How are these tests performed in Excel? We will see it now.
So, let's see how we perform it in Excel when I have these two columns of before and after data.
Here, in the current scenario, I have the data of 32 patients in the before column.
If you look at the after-taking-medicine data, there too I have 32 values.
So, this means that for these two samples we can perform the paired two-sample mean test.
In this particular case, first of all, what is my null hypothesis? That both means are equal, which means mu before is equal to mu after.
The alternate hypothesis is that mu before is not equal to mu after.
Now we don't have to calculate anything by hand.
To perform the two-sample mean test, we go to the Data Analysis section.
There we choose t-Test: Paired Two Sample for Means.
When I click on it, it asks me for two ranges.
Variable 1 Range and Variable 2 Range.
For the Variable 1 Range, I can either select only the values, or I can select the range together with its label.
If your selection starts from the header cell, such as C1, then you tick Labels; if you are selecting only the values, there is no need to tick Labels.
Now we select the Variable 2 Range, which runs from C2 to C33.
We have entered both ranges, and now it asks for the hypothesized mean difference.
Our null hypothesis is that there is no difference between the two.
So, we take the hypothesized mean difference as zero.
Now here you have three output options.
You can take the output on the same sheet, on a new worksheet, or in a new workbook.
Normally, to perform our analysis, we see the values on the same sheet.
For that it asks for an output range.
You can select any empty cell or group of cells for this range.
The output will be written starting there.
After entering all the data, when I press OK, you can see that I have all my values: the results for variable 1 and variable 2, under the heading t-Test: Paired Two Sample for Means.
It shows the mean, the variance, and the number of observations of both samples, so we can cross-verify whether we have selected the total number of values correctly.
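For readers who want to cross-check the Excel output in Python, here is a rough sketch that reproduces a few of the same rows (Mean, Variance, Observations, Hypothesized Mean Difference, df, t Stat) with the standard library. The function name and the dictionary layout are hypothetical, and the four data pairs are made up for illustration; they are not the 32-patient sheet used in the video.

```python
import math
import statistics


def paired_t_summary(var1, var2, hypothesized_diff=0.0):
    """Compute a few rows similar to Excel's paired t-test output."""
    if len(var1) != len(var2):
        raise ValueError("both variables must have the same number of observations")
    diffs = [a - b for a, b in zip(var1, var2)]
    n = len(diffs)
    # t = (mean difference - hypothesized difference) / (SD of differences / sqrt(n))
    t_stat = (statistics.mean(diffs) - hypothesized_diff) / (
        statistics.stdev(diffs) / math.sqrt(n)
    )
    return {
        "Mean": (statistics.mean(var1), statistics.mean(var2)),
        "Variance": (statistics.variance(var1), statistics.variance(var2)),
        "Observations": (n, n),
        "Hypothesized Mean Difference": hypothesized_diff,
        "df": n - 1,
        "t Stat": t_stat,
    }


# Hypothetical illustration data, NOT the data shown in the video.
before = [9, 10, 8, 7]
after = [7, 6, 7, 6]
summary = paired_t_summary(before, after)
for row, value in summary.items():
    print(row, value)
```

Checking the Observations row first, as the lecture suggests, is a quick way to catch a range that accidentally included the header or missed a row.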
Here you have the hypothesized mean difference as 0. When we calculated by hand, we found a t statistic by putting all the values into the formula.
We also found a t critical value.
Since it is a two-tailed test, we found the t critical value for the two-tailed test.
What happened this time is that you only gave Excel one data set.
You went into the Data Analysis tool, entered the values with a few clicks, and got the answer.
If you look here, the t Stat comes out as 0.51 and the t Critical two-tail comes out as about 2.0.
So, in this scenario, my t critical value is greater than my t stat value.
Since the t critical value is larger, our calculated value lies in the acceptance region.
Lying in the acceptance region means that we fail to reject our null hypothesis.
So, we can say that there is no significant difference between the readings before taking the medicine and after taking the medicine.
So, you have seen how we do in Excel these calculations that are very difficult to do manually.
With the help of Excel's simple tool, we can perform them.
If you have any comments or questions related to this course then you can click on the discussion button below this video and you can post them over there.
In this way, you can connect with other learners like you and you can discuss with them.