Now the next important chi-square test that comes up is called goodness of fit.
As its name suggests, it is about how good a fit is.
It simply tells us, for any variable, how well its observed distribution matches the distribution we expect.
Which means, if you have taken any sample, is it representative of the entire population or not.
To understand this let’s take a very interesting example.
We all eat toffees, right? Suppose we have many packets of candies.
We can say candies or toffees, whichever you like.
Suppose I have many candy packets; in every packet I have 100 candies and, in every packet, I have 5 different flavours.
Let’s say we have apple toffee, orange toffee, strawberry and so on.
There are five different flavours available in every packet.
Now what we did is, from all the toffee packets, we picked any 10 packets and made them our sample.
From this sample I have to find out whether every packet has the same number of toffees of each flavour.
Which means, if in every packet you've got 100 toffees and in every packet you have 5 different flavours, then our expectation is that there should be an equal number of each flavour in every packet, which means if in the first packet I have 20 orange toffees,
then in the second packet also I should have 20 orange toffees.
So, to perform this analysis, what we will do is perform the goodness of fit test, where we have 100 toffees in one packet and in every packet there are five different flavours available.
I expect 20 pieces of every flavour.
So, if I divide 100 by 5, then our expectation is that every packet should have 20 pieces of each flavour.
So, in this way, my proportion of every flavour would be simply 20 upon 100, which means 0.2.
With this we can create our null hypothesis, which will be p1 is equal to p2 is equal to p3 is equal to p4 is equal to p5, and that will be equal to 0.2, which means in every packet all flavours must have the same proportion.
What will be the alternative hypothesis? At least one p i is not equal to 0.2; that is, there is some proportion which is not equal to the others.
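Written out in symbols, with p_i standing for the true proportion of flavour i (the index notation here is just shorthand assumed for this sketch), the two hypotheses look like this:

```latex
H_0:\; p_1 = p_2 = p_3 = p_4 = p_5 = 0.2
\qquad
H_1:\; p_i \neq 0.2 \ \text{for at least one } i \in \{1, \dots, 5\}
```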
So, we have created the null and alternative hypotheses.
Now what is our expectation? That every packet should have 20 pieces of each flavour.
We have data given for the observed values, in which I have 180 candies of apple flavour, 250 candies of lime flavour, 120 candies of cherry flavour, 225 candies of orange flavour and 225 candies of grape flavour as well.
So, we do have the observed values; now I want the expected values.
How much would be the expected value for us? We have expected that in every packet there should be 20 candies of every flavour.
In total I have 10 packets.
So, the expected count for each flavour in my sample will be 10 multiplied by 20.
That means in our sample 200 candies should be of each flavour.
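The arithmetic for the expected counts can be reproduced in a few lines of Python. This is only a minimal sketch of the calculation just described; the variable names are illustrative and the flavour names are taken from the observed data.

```python
# Expected counts under the null hypothesis of equal flavour proportions.
# Variable names are illustrative; the numbers come from the example.
candies_per_packet = 100
flavours = ["apple", "lime", "cherry", "orange", "grape"]
packets_in_sample = 10

expected_per_packet = candies_per_packet // len(flavours)       # 100 / 5 = 20
expected_proportion = expected_per_packet / candies_per_packet  # 20 / 100 = 0.2
expected_per_flavour = packets_in_sample * expected_per_packet  # 10 * 20 = 200

# Every flavour gets the same expected count in the 10-packet sample.
expected_counts = {flavour: expected_per_flavour for flavour in flavours}
print(expected_counts)
```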
Now we have got both the expected and the observed values.
What will I simply do? I will calculate the observed value minus the expected value.
I will square that difference and divide the squared value by the expected value.
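In symbols, writing O_i for the observed count and E_i for the expected count of flavour i (this notation is assumed here for illustration), the steps just described add up to the usual chi-square statistic:

```latex
\chi^2 = \sum_{i=1}^{5} \frac{(O_i - E_i)^2}{E_i},
\qquad \text{e.g. for apple: } \frac{(180 - 200)^2}{200} = 2
```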
So, when we did all these calculations, the values that come in my last column are 2, 12.5, 32, 3.125 and 3.125.
I summed them all, and the chi-square value that I got was 52.75.
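The same column of values and their sum can be checked with a short Python sketch. The dictionary layout and names are assumptions made for this illustration; the counts are the ones from the example.

```python
# Observed counts from the 10-packet sample and the expected count per flavour.
observed = {"apple": 180, "lime": 250, "cherry": 120, "orange": 225, "grape": 225}
expected = 200

# (observed - expected)^2 / expected for each flavour.
contributions = {
    flavour: (count - expected) ** 2 / expected
    for flavour, count in observed.items()
}

# The chi-square statistic is the sum of the per-flavour contributions.
chi_square = sum(contributions.values())

for flavour, value in contributions.items():
    print(f"{flavour:>7}: {value}")
print("chi-square =", chi_square)
```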
We have calculated chi-square value.
Normally, when we performed the independence test, my degrees of freedom depended on both rows and columns.
Because there we had data classified by two variables.
So, we considered rows as well as columns so that the degrees of freedom represent the entire data, but in this situation we are focusing only on the number of flavours.
So, what will be our degrees of freedom? Number of flavours minus 1.
Which means df is equal to 5 - 1, and that is equal to 4.
Now, corresponding to 4 degrees of freedom and a significance level alpha equal to 5%, if I look at the chi-square table, then I will get one value, which we call the chi-square critical or tabular value.
That value comes out as 9.488.
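If no table is at hand, the critical value can also be approximated numerically. The sketch below is one possible way to do it with only the standard library, not the method used in the lecture: it relies on the closed-form upper-tail probability that the chi-square distribution has when the degrees of freedom are even, and searches for the point where that probability equals alpha. The function names are hypothetical. With SciPy installed, `scipy.stats.chi2.ppf(0.95, 4)` should give the same number.

```python
import math

def chi2_survival_even_df(x, df):
    """P(X > x) for a chi-square variable X, valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers even, positive df")
    half = x / 2.0
    # For even df: exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * terms

def chi2_critical_even_df(alpha, df, upper=1000.0, tol=1e-9):
    """Bisection search for the x whose upper-tail probability equals alpha."""
    lo, hi = 0.0, upper
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_survival_even_df(mid, df) > alpha:
            lo = mid  # tail still too large, critical value lies to the right
        else:
            hi = mid
    return (lo + hi) / 2.0

df = 5 - 1  # number of flavours minus 1
critical_value = chi2_critical_even_df(alpha=0.05, df=df)
print(round(critical_value, 3))  # agrees with the table value 9.488
```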
In this situation, if you see, the critical chi-square value is much smaller than the chi-square value that we had calculated: 9.488 against 52.75.
So, in this way, we can simply say that we reject the null hypothesis.
This means that the flavours of the candies are not available in equal proportions.
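The decision itself is just a comparison, and it can be sketched as below. The p-value line is an extra check that is not part of the lecture; it uses the fact that for 4 degrees of freedom the upper-tail probability of the chi-square distribution is exp(-x/2) * (1 + x/2). The variable names are illustrative.

```python
import math

chi_square = 52.75      # calculated statistic from the example
critical_value = 9.488  # table value for df = 4, alpha = 0.05
alpha = 0.05

# Decision rule: reject H0 when the statistic exceeds the critical value.
reject_null = chi_square > critical_value

# Equivalent check via the p-value (closed form valid for df = 4 only).
p_value = math.exp(-chi_square / 2) * (1 + chi_square / 2)

print("reject H0:", reject_null)
print("p-value below alpha:", p_value < alpha)
```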
So, this easily shows us how we can perform the goodness of fit test by using chi-square.
In this module we have seen how the chi-square test is performed in both ways, for independence and for goodness of fit.
In the next module we will see what the ANOVA test is.
If you have any comments or questions related to this course, then click on the Discussion button given below this video and post it there.
In this way, you can connect with other learners like you and you can discuss with them.