Chi Square Test

What is the Chi Square Test?

You may have many questions about the chi square test, starting with how to define it. This section explains the test in simple terms, so that the next time someone asks you what the chi square test means, you can explain it straightforwardly. Let us begin with the definition.


In statistics, the chi square test is a test used by researchers to check for relationships between categorical variables in the same population.


It measures how actual observed data compare with expected values. The data used to calculate the statistic must be random, raw and mutually exclusive. The data must also be drawn from independent variables, and the sample must be large enough.


Chi Square Method

There are two basic types of chi square test.

  1. The test of independence: This test asks questions about relationships, such as “Is there a relationship between gender and SAT scores?”

  2. The goodness-of-fit test: This test asks how well observed data match an expected distribution, for example “If a coin is tossed 100 times, are the results consistent with the 50 heads and 50 tails expected from a fair coin?”


Table of Chi Square Test

Critical values of the chi square distribution. Each column heading is a significance level (upper-tail probability), and df is the number of degrees of freedom.

df      0.01     0.05     0.10    0.005    0.025
 1      6.63    3.841    2.706    7.879    5.024
 2      9.21    5.991    4.605   10.597    7.378
 3     11.34    7.815    6.251   12.838    9.348
 4    13.277    9.488    7.779   14.860   11.143
 5    15.086   11.071    9.236   16.750   12.833
 6    16.812   12.592   10.645   18.548   14.449
 7    18.475   14.067   12.017   20.278   16.013
 8    20.090   15.507   13.362   21.955   17.535
 9    21.666   16.919   14.684   23.589   19.023
10    23.209   18.307   15.987   25.188   20.483
11    24.725   19.675   17.275   26.757   21.920
12    26.217   21.026   18.549   28.299   23.337
13    27.688   22.362   19.812   29.819   24.736
14    29.141   23.685   21.064   31.319   26.119
15    30.578   24.996   22.307   32.801   27.488
16    32.000   26.296   23.542   34.267   28.845
17    33.409   27.587   24.769   35.718   30.191
18    34.805   28.869   25.989   37.156   31.526
19    36.191   30.144   27.204   38.582   32.852
20    37.566   31.410   28.412   39.997   34.170
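The critical values in the table can also be computed rather than looked up. The following is a minimal sketch using only the Python standard library. The function names `chi2_sf` and `chi2_critical` are illustrative and do not come from this article, and the code assumes a whole number of degrees of freedom. It evaluates the upper-tail probability with a standard recurrence and then finds the critical value by bisection.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi square variable.

    Assumes df is a positive integer.
    """
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 1:
        a = 0.5                        # start from df = 1
        q = math.erfc(math.sqrt(z))
    else:
        a = 1.0                        # start from df = 2
        q = math.exp(-z)
    # Each step raises the degrees of freedom by 2.
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


def chi2_critical(alpha, df):
    """Value x with P(X > x) = alpha, found by bisection."""
    lo, hi = 0.0, 500.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > alpha:
            lo = mid                   # tail still too heavy: move right
        else:
            hi = mid
    return (lo + hi) / 2.0


for df in (1, 2, 10, 20):
    print(df, round(chi2_critical(0.05, df), 3))
```

The printed values should agree with the 0.05 column of the table up to rounding.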


Chi Square Distribution Formula

With the chi square test table given above and the chi square distribution formula, you can find the answers to such questions.


The chi square distribution formula can be written as:


\[\chi_{c}^{2} = \sum \frac{(O_{i} - E_{i})^{2}}{E_{i}}\]


Where c is the number of degrees of freedom of the chi square test, O_i is the i-th observed value and E_i is the corresponding expected value.
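As a small illustration of the formula, the sketch below applies it to the coin question from the goodness-of-fit test. The observed counts (58 heads and 42 tails) are hypothetical values invented for this example, and the function name `chi_square_statistic` is illustrative only.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [58, 42]      # hypothetical counts of heads and tails
expected = [50, 50]      # a fair coin tossed 100 times

statistic = chi_square_statistic(observed, expected)
df = len(observed) - 1   # goodness of fit: number of categories minus one
critical_value = 3.841   # from the table: df = 1, significance level 0.05

print(f"chi square = {statistic:.2f}, df = {df}")
if statistic > critical_value:
    print("The counts differ significantly from a fair coin.")
else:
    print("No significant evidence against a fair coin.")
```

With these made-up counts the statistic stays below the critical value, so the data would not contradict a fair coin at the 0.05 level.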


Chi Square Test Example

Example: Consider a random poll of 2,000 voters, both male and female. The respondents were classified by gender and by whether they were Republican, Democrat, or Independent. The grid therefore has three columns labeled Republican, Democrat, and Independent, and two rows labeled male and female. The data from the 2,000 respondents are as follows:

          Republican   Democrat   Independent   Total
Male             400        300           100     800
Female           500        600           100   1,200
Total            900        900           200   2,000


Solution: The first step in calculating the chi squared statistic is to find the expected frequencies, one for each "cell" in the grid. Since there are two categories of gender and three categories of political view, there are six expected frequencies in total. The formula for the expected frequency is:


E(r, c) = \[\frac{n(r)\times n(c)}{n}\]


Where r is the row, c is the column, n(r) and n(c) are the corresponding row and column totals, and n is the total sample size.
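The expected-frequency formula can be sketched in Python as follows. The observed counts are the ones used in the calculations of this example (400, 300, 100 for males and 500, 600, 100 for females). The function name `expected_frequencies` is illustrative only.

```python
def expected_frequencies(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


observed = [
    [400, 300, 100],  # male:   Republican, Democrat, Independent
    [500, 600, 100],  # female: Republican, Democrat, Independent
]

expected = expected_frequencies(observed)
for row in expected:
    print(row)
```

Each printed row corresponds to one gender, with the cells in the same column order as the grid.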

The expected frequencies in this example are:


E(1, 1) = \[\frac{900\times 800}{2000}\] = 360


E(1, 2) = \[\frac{900\times 800}{2000}\] = 360


E(1, 3) = \[\frac{200\times 800}{2000}\] = 80


E(2, 1) = \[\frac{900\times 1200}{2000}\] = 540


E(2, 2) = \[\frac{900\times 1200}{2000}\] = 540


E(2, 3) = \[\frac{200\times 1200}{2000}\] = 120


These expected values are then used to calculate the chi squared statistic with the following chi square distribution formula:


\[\chi^{2} = \sum \frac{[O(r, c) - E(r, c)]^{2}}{E(r, c)}\]


Where O(r, c) is the observed value in row r and column c.

The contribution of each cell to the statistic in this example is:


Cell (1, 1): \[\frac{[400 - 360]^{2}}{360}\] = 4.44


Cell (1, 2): \[\frac{[300 - 360]^{2}}{360}\] = 10


Cell (1, 3): \[\frac{[100 - 80]^{2}}{80}\] = 5


Cell (2, 1): \[\frac{[500 - 540]^{2}}{540}\] = 2.96


Cell (2, 2): \[\frac{[600 - 540]^{2}}{540}\] = 6.67


Cell (2, 3): \[\frac{[100 - 120]^{2}}{120}\] = 3.33


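Summing these values gives a chi squared statistic of 32.41. We then look up the chi square test table at the degrees of freedom for our setup to see whether the result is statistically significant. Here the degrees of freedom are (2 - 1) × (3 - 1) = 2, and the critical value for 2 degrees of freedom at the 0.05 level is 5.991. Since 32.41 is greater than 5.991, the result is statistically significant: the poll gives evidence of a relationship between gender and political affiliation.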
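The whole worked example can be reproduced with a short Python sketch. It uses the observed counts from this example and the critical value 5.991 from the table (2 degrees of freedom, 0.05 level). The variable names are illustrative, and the closed-form p-value in the last step holds only for 2 degrees of freedom.

```python
import math

observed = [
    [400, 300, 100],  # male:   Republican, Democrat, Independent
    [500, 600, 100],  # female: Republican, Democrat, Independent
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Sum (O - E)^2 / E over all six cells, with E = row total * column total / n.
statistic = 0.0
for r, row in enumerate(observed):
    for c, o in enumerate(row):
        e = row_totals[r] * col_totals[c] / n
        statistic += (o - e) ** 2 / e

# Test of independence: (rows - 1) * (columns - 1) degrees of freedom.
df = (len(observed) - 1) * (len(observed[0]) - 1)
critical_value = 5.991  # from the table: df = 2, significance level 0.05

# For df = 2 only, the upper-tail probability is exactly exp(-x / 2).
p_value = math.exp(-statistic / 2)

print(f"chi square = {statistic:.2f}, df = {df}, p = {p_value:.2e}")
print("significant" if statistic > critical_value else "not significant")
```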

FAQ (Frequently Asked Questions)

Question 1) What are the Properties of the Chi Square Test?

Answer 1) The chi square test is a tool for testing relationships between categorical variables in the same population. It measures how actual observed data compare with expected values.


Some requirements of the test and properties of the chi square distribution are listed below:

  • The data must be raw, random and mutually exclusive. 

  • The data must consist of independent variables.

  • The sample drawn must be large enough.  

  • The variance is equal to two times the number of degrees of freedom.

  • The mean of the distribution is equal to the number of degrees of freedom.

  • As the degrees of freedom increase, the chi square distribution curve approaches the normal distribution.
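The mean and variance properties can be checked with a quick simulation. The sketch below relies on the standard fact that a chi square variable with k degrees of freedom is the sum of k squared independent standard normal values. The choice of 4 degrees of freedom, the sample size and the random seed are arbitrary assumptions made for this illustration.

```python
import random
import statistics

random.seed(0)           # fixed seed so the run is repeatable

df = 4                   # degrees of freedom chosen for the illustration
n_samples = 100_000

# Each sample is the sum of df squared standard normal draws.
samples = [
    sum(random.gauss(0, 1) ** 2 for _ in range(df))
    for _ in range(n_samples)
]

sample_mean = statistics.fmean(samples)
sample_variance = statistics.variance(samples)

print(f"mean     = {sample_mean:.3f} (theory: {df})")
print(f"variance = {sample_variance:.3f} (theory: {2 * df})")
```

The simulated mean and variance should land close to 4 and 8, though not exactly on them, since they are estimates from random draws.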

Question 2) What are a Few Real Life Examples of the Chi Square Test?

Answer 2) The chi-squared distribution arises from estimates of the variance of a normal distribution. It also approximates the distribution of the test statistics used in tests of goodness of fit and of independence of discrete classifications.


Analysis of variance (for normally distributed data) uses the F distribution, which is the distribution of a ratio of two independent chi-square variables, each divided by its degrees of freedom. So even where the chi-square distribution is not used directly, it is a stepping stone to one that is.


Neither the normal distribution nor any other distribution can really be considered a real life example in itself, as distributions are only models that are sometimes reasonable approximations.