In this module we will see the ways in which we can perform different t tests.
To understand the t test, let's first see what the t distribution is. The t distribution is similar to the normal distribution, which we have already learned about, where the mean, mode and median lie at the centre.
Both distributions are symmetrical and bell shaped around the centre, so what is the difference between the normal distribution and the t distribution? Compared with the standard normal distribution, the t distribution has a lower peak and heavier tails, which means it is more spread out and more of its probability lies far from the centre.
Each t distribution has a degrees of freedom value attached to it, which is related to the sample size.
For a one sample test, the degrees of freedom are the sample size minus one, n minus 1.
The degrees of freedom control the shape of the curve, that is, how far it spreads out along the axis.
This means that if the sample size is small, the t distribution is flatter with heavier tails, and as the sample size increases the degrees of freedom also increase and the t distribution looks more and more like the standard normal distribution.
Once the sample size goes beyond about 30, the t distribution is approximately equal to the standard normal distribution, that is, the Z distribution.
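To make the "heavier tails" idea concrete, here is a small sketch that compares the probability of landing more than 2 units from the centre under t distributions with different degrees of freedom and under the standard normal distribution. The cutoff of 2 and the list of degrees of freedom are arbitrary choices for illustration, not values from the lesson.

```python
from scipy import stats

# Arbitrary cutoff chosen for illustration: how much probability lies beyond +/- 2?
cutoff = 2.0

# Two-sided tail probability under the standard normal (Z) distribution.
normal_tail = 2 * stats.norm.sf(cutoff)

# Same tail probability under t distributions with increasing degrees of freedom.
tail_by_df = {}
for df in (2, 5, 10, 30, 100):
    tail_by_df[df] = 2 * stats.t.sf(cutoff, df)
    print(f"df={df:>3}: P(|T| > {cutoff}) = {tail_by_df[df]:.4f}")

print(f"normal: P(|Z| > {cutoff}) = {normal_tail:.4f}")
```

The tail probability shrinks as the degrees of freedom grow and moves towards the normal value, which is the convergence described above.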
Earlier, when we worked with the Z distribution, we found a Z score and used the Z table.
Now that we are working with the t distribution for the t test, we will use the t table instead.
So a very basic question someone may ask you is: between the t test and the Z test, which one are you going to use?
For this there is a very simple rule of thumb.
First we ask whether the population parameters, such as the mean and the standard deviation, are known or unknown.
If we do not know the population standard deviation, we use the t test.
If we do have the population standard deviation, we ask one more question: is the sample size below 30 or above 30?
If it is below 30, we use the t distribution.
If it is more than 30, we use the Z test.
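The rule of thumb above can be written as a tiny helper. This is only a sketch of the rule as stated in this lesson; the function name `choose_test` is made up for illustration, and treating a sample size of exactly 30 as "large" is an assumption, since the lesson only speaks of below 30 and more than 30.

```python
def choose_test(sigma_known: bool, n: int) -> str:
    """Pick between a t test and a Z test using the lesson's rule of thumb.

    sigma_known: True if the population standard deviation is known.
    n: sample size.
    """
    if not sigma_known:
        # Population standard deviation unknown: use the t test.
        return "t test"
    # Population standard deviation known: the sample size decides.
    # Assumption: n == 30 is treated as a large sample.
    return "t test" if n < 30 else "Z test"


print(choose_test(sigma_known=False, n=100))  # t test
print(choose_test(sigma_known=True, n=12))    # t test
print(choose_test(sigma_known=True, n=32))    # Z test
```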
We know the z score formula, z = (x - μ) / σ, where μ is the population mean and σ is the population standard deviation.
Since in the t test we generally work with a sample, the formula becomes t = (x̄ - μ) / (s / √n), where x̄ is the sample mean, s is the sample standard deviation and n is the sample size. What is s / √n? It is the standard error of the sample mean, that is, the standard deviation of its sampling distribution.
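The two formulas can be sketched as plain Python functions. The function names and the numbers in the example calls are made up for illustration only.

```python
import math


def z_score(x, mu, sigma):
    """z = (x - mu) / sigma, using the population mean and standard deviation."""
    return (x - mu) / sigma


def standard_error(s, n):
    """Standard error of the sample mean: s / sqrt(n)."""
    return s / math.sqrt(n)


def t_statistic(x_bar, mu, s, n):
    """t = (x_bar - mu) / (s / sqrt(n)), using the sample standard deviation."""
    return (x_bar - mu) / standard_error(s, n)


# Made-up numbers, chosen only so the arithmetic is easy to follow.
print(z_score(x=70, mu=60, sigma=5))            # (70 - 60) / 5 = 2.0
print(t_statistic(x_bar=52, mu=50, s=4, n=16))  # (52 - 50) / (4 / 4) = 2.0
```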
Since we use the t table here, we have a new term, which is degrees of freedom.
From the t table we find a critical value, which we also call the t score.
How do we find it? Look at the table.
The first column, on the left side, lists the degrees of freedom.
The other columns correspond to the significance levels for the one tailed test and the two tailed test.
Let's assume that our degrees of freedom are eight and that the significance level for a one tailed test is 0.05.
We find the row for 8 degrees of freedom and the column for 0.05 under the one tailed test.
The cell where the two intersect
gives the critical t value, which is 1.860.
In this way we find the exact t score from the t table and use it to perform our hypothesis testing.
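A t table lookup can also be reproduced in code. The sketch below uses `scipy.stats.t.ppf`, the inverse of the cumulative distribution function, for 8 degrees of freedom and a significance level of 0.05. The two tailed value is included only for comparison; the variable names are illustrative.

```python
from scipy import stats

df = 8        # degrees of freedom
alpha = 0.05  # significance level

# One tailed: all of alpha sits in one tail.
t_crit_one_tailed = stats.t.ppf(1 - alpha, df)

# Two tailed: alpha is split equally between the two tails.
t_crit_two_tailed = stats.t.ppf(1 - alpha / 2, df)

print(f"one tailed: {t_crit_one_tailed:.3f}")  # 1.860
print(f"two tailed: {t_crit_two_tailed:.3f}")  # 2.306
```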
Let's take two more examples with different sample sizes. In the first case the sample size is n = 25.
This one is simple.
This value is below 30,
so we can use the t test and the t table here.
Suppose alpha is 5%.
For the two tailed test, the degrees of freedom are 25 minus 1, which is 24, so in the t table we look up 24 degrees of freedom against alpha = 0.05 for the two tailed test, and that gives us the critical value.
Now take a scenario where n = 32.
In that case you can apply the t test, or you can even apply the Z test.
Why? Because n is greater than 30.
The degrees of freedom are 32 minus 1, which is 31, and whatever value the table gives for alpha = 0.05 in a two tailed test is your critical value.
If you wish, you can read this critical value directly from the t table, because for degrees of freedom this large it is already close to the Z value.
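As a rough check of these two scenarios, the following sketch computes the two tailed critical values for n = 25 and n = 32 at alpha = 0.05 and prints the Z critical value next to them. The variable names are illustrative.

```python
from scipy import stats

alpha = 0.05  # two tailed significance level

# Z critical value for a two tailed test, for comparison.
z_crit = stats.norm.ppf(1 - alpha / 2)

# t critical values for the two sample sizes discussed above.
critical = {}
for n in (25, 32):
    df = n - 1
    critical[n] = stats.t.ppf(1 - alpha / 2, df)
    print(f"n={n}, df={df}: t critical = {critical[n]:.3f}")

print(f"Z critical = {z_crit:.3f}")
```

The t critical value for n = 32 is closer to the Z value than the one for n = 25, which is why either test is considered acceptable once n is above 30.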
Now we have seen what the t distribution is, but on what basis do we perform the different t tests, and why do we need them?
We use them whenever we have two groups to compare.
They can be from the same population or they can be from two different populations.
What we mean by the same population or different populations, we will understand through each test.
We want to know whether the means of those two groups are the same or different. Depending on the method, there are five types of tests available to us: the one sample mean test, the paired two sample mean test, the unpaired two sample mean test, the two sample proportion test and A/B testing.
How do we perform these tests? How do we set up the null and alternate hypotheses? How do we make a decision from the result? We will study all of these in detail.
So now you must be thinking: you have given us the names of the different tests,
but how and in which situations can we use them?
First we will look at the one sample mean test. As the name suggests, you have data from only one sample.
You have collected a sample from one population and you want to check whether the sample mean is consistent with the population mean or not. In that case we perform the one sample mean test.
Its formula is the one we just saw: t = (x̄ - μ) / (s / √n), where x̄ is the sample mean and μ is the population mean,
s is the standard deviation of the sample and n is the sample size.
Let's take the one sample mean test.
I will explain it to you through an example.
We understand things easily through examples.
How do we put in the values? How do we create the null hypothesis and the alternate hypothesis? Let's take one simple example.
Let's assume that we have a number of potato farms.
We have to find out about their yield.
Is the yield of these farms better than the standard yield of potato, which is μ = 20?
So from all the farms we randomly took a sample of 12 farms, and the yields of those 12 farms are the values of x.
From them we calculate the sample mean and the sample standard deviation.
The question is whether the yield of the 12 sampled farms is significantly better than the standard yield. In other words, we have to compare the sample mean with the population mean, and in this kind of scenario we use the one sample mean test.
So what is the first thing we do? We create one null and one alternate hypothesis.
The null hypothesis is that the mean yield μ is equal to 20.
The alternate hypothesis is that μ is greater than 20.
Why are we taking 20 here? Because that is the population's standard yield.
Now let's assume that the sample mean and the sample standard deviation have already been worked out.
We skip that calculation and use the results directly: x̄ = 20.175 and s = 3.0211.
We also have the population mean, μ = 20, and n = 12,
which means the sample contains 12 farms.
We now have everything we need, so we only have to put the values into the t test formula.
Putting in the values of x̄, μ, s and √n gives a value of t, which we call t calculated.
Here t calculated is about 0.2006.
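The calculation of t calculated can be checked with a few lines of Python, using only the summary values quoted in the lesson (sample mean 20.175, sample standard deviation 3.0211, population mean 20, sample size 12). The variable names are illustrative.

```python
import math

x_bar = 20.175  # sample mean from the lesson
mu = 20         # standard (population) yield
s = 3.0211      # sample standard deviation from the lesson
n = 12          # number of farms in the sample

# Standard error of the sample mean.
standard_error = s / math.sqrt(n)

# One sample t statistic.
t_calculated = (x_bar - mu) / standard_error

print(f"standard error = {standard_error:.4f}")
print(f"t calculated   = {t_calculated:.4f}")
```

The result agrees with the 0.2006 quoted in the lesson up to rounding in the last digit.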
Whenever we have to make a decision, as in all the earlier tests, we also find a critical value.
For a one sample test the degrees of freedom are n minus 1, which is 11 in this case.
We take alpha = 0.05, which means we accept a 5% chance of wrongly rejecting the null hypothesis.
Looking at the t table, we take the row for 11 degrees of freedom and the column for alpha = 0.05 for a one tailed test, and the value there is our t critical, which is 1.796.
Now is the time to make the decision.
We compare t calculated with t critical.
Here t calculated (0.2006) is lower than t critical (1.796), which means our value lies in the acceptance region and not in the rejection region.
So our conclusion is that we fail to reject the null hypothesis.
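The decision step can be sketched as follows, assuming a right tailed test at alpha = 0.05 with 11 degrees of freedom and the t calculated value from the lesson. The p value line is an extra, shown only to illustrate that both routes lead to the same decision; the variable names are illustrative.

```python
from scipy import stats

t_calculated = 0.2006  # value quoted in the lesson
df = 11                # n - 1 for the 12 farms
alpha = 0.05           # significance level

# Critical value for a right tailed test (alternate hypothesis: mean > 20).
t_critical = stats.t.ppf(1 - alpha, df)

# Equivalent p value route: probability of a t value at least this large.
p_value = stats.t.sf(t_calculated, df)

# Reject only if t calculated falls in the rejection region.
reject_null = bool(t_calculated > t_critical)

print(f"t critical = {t_critical:.3f}")
print(f"p value    = {p_value:.3f}")
print("reject the null hypothesis" if reject_null
      else "fail to reject the null hypothesis")
```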
So, in the p value method we found the p value and compared it with alpha.
In the t value method we found t critical and t calculated.
In the critical value method for the Z test, we found a Z score for a particular critical point.
All of these tests rest on the same basis: we find one calculated value and one critical value.
We compare them and decide whether our value lies in the acceptance region or the rejection region.
That covers the one sample mean test.
Let's see one more example of the one sample mean test.
Say that in some country, for instance India, the average birth weight of babies born in the cities is 2.9 kg.
We treat this as our population value.
That is, newborn babies in all the cities of India weigh 2.9 kg on average.
Now we take a sample of babies born in villages and perform a one sample mean test to find out whether their average weight is equal to the city average or not. In this way you can create different hypotheses yourself and work out which test to use in which situation.
Now, one very interesting thing: we have seen in theory how to perform this test using the t table.
We also used graphs of the distribution to locate the acceptance and rejection regions.
On top of that, we did a lot of calculation by hand.
In industry there are many advanced tools that we can simply use instead.
All the tests we are learning can be performed in one line of code, or in Excel.
Look at the code we have written in Python: we simply import ttest_1samp from SciPy, which is a Python library.
This function is already available in the scipy.stats module.
The Python community has already written the code for this test.
All you have to do is import it in your Jupyter Notebook.
It returns the test statistic and the p value.
From that result we either reject the null hypothesis or fail to reject it.
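The code shown in the video is not part of this transcript, so the following is only a sketch of what such a call might look like. The yield values are illustrative numbers chosen to be consistent with the summary statistics quoted earlier (mean 20.175, standard deviation about 3.02); they are an assumption, not the lesson's data. The `alternative="greater"` argument requests a right tailed test and needs a reasonably recent SciPy (1.6 or later).

```python
from scipy.stats import ttest_1samp

# Illustrative yields for 12 farms (assumed values, not taken from the lesson).
yields = [21.5, 24.5, 18.5, 17.2, 14.5, 23.2,
          22.1, 20.5, 19.4, 18.1, 24.1, 18.5]

population_mean = 20  # standard yield
alpha = 0.05          # significance level

# Right tailed one sample t test: is the mean yield greater than 20?
result = ttest_1samp(yields, popmean=population_mean, alternative="greater")

print(f"t statistic = {result.statistic:.4f}")
print(f"p value     = {result.pvalue:.4f}")

if result.pvalue < alpha:
    print("reject the null hypothesis")
else:
    print("fail to reject the null hypothesis")
```

With these values the p value is well above 0.05, so the sketch ends with the same decision as the hand calculation: fail to reject the null hypothesis.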
In the same way, we can run these tests very easily in Excel sheets.
How do we use it in Excel? We will see that in the next test.
If you have any comments or questions related to this course then you can click on the discussion button below this video and you can post them over there.
In this way, you can connect with other learners like you and you can discuss with them.