8.1 - The Chi-Square Test of Independence

How do we test the independence of two categorical variables? We use the Chi-Square Test of Independence.
As with all prior statistical tests, we need to define null and alternative hypotheses. Also, as we have learned, the null hypothesis is what is assumed to be true until we have evidence against it. In this lesson, we are interested in researching if two categorical variables are related or associated (i.e., dependent). Therefore, until we have evidence to suggest that they are, we must assume that they are not. This is the motivation behind the hypotheses for the Chi-Square Test of Independence:

\(H_0\): The two categorical variables are independent.

\(H_a\): The two categorical variables are dependent.
Note! There are several ways to phrase these hypotheses. Instead of using the words "independent" and "dependent" one could say "there is no relationship between the two categorical variables" versus "there is a relationship between the two categorical variables." Or "there is no association between the two categorical variables" versus "there is an association between the two variables." The important part is that the null hypothesis refers to the two categorical variables not being related while the alternative is trying to show that they are related.
Once we have gathered our data, we summarize the data in the two-way contingency table. This table represents the observed counts and is called the Observed Counts Table or simply the Observed Table. The contingency table on the introduction page to this lesson represented the observed counts of the party affiliation and opinion for those surveyed.
The question becomes, "How would this table look if the two variables were not related?" That is, under the null hypothesis that the two variables are independent, what would we expect our data to look like?
Consider the following table:
|  | Success | Failure | Total |
| --- | --- | --- | --- |
| Group 1 | A | B | A+B |
| Group 2 | C | D | C+D |
| Total | A+C | B+D | A+B+C+D |
The total count is \(A+B+C+D\). Let's focus on one cell, say Group 1 and Success with observed count A. If we go back to our probability lesson, let \(G_1\) denote the event 'Group 1' and \(S\) denote the event 'Success.' Then,
\(P(G_1)=\dfrac{A+B}{A+B+C+D}\) and \(P(S)=\dfrac{A+C}{A+B+C+D}\).
Recall that if two events are independent, then their intersection is the product of their respective probabilities. In other words, if \(G_1\) and \(S\) are independent, then...
\begin{align} P(G_1\cap S)&=P(G_1)P(S)\\&=\left(\dfrac{A+B}{A+B+C+D}\right)\left(\dfrac{A+C}{A+B+C+D}\right)\\[10pt] &=\dfrac{(A+B)(A+C)}{(A+B+C+D)^2}\end{align}
If we consider counts instead of probabilities, then we get the expected count by multiplying the probability by the total count. In other words...
\begin{align} \text{Expected count for cell with A} &=P(G_1)P(S)\times(\text{total count}) \\ &= \left(\dfrac{(A+B)(A+C)}{(A+B+C+D)^2}\right)(A+B+C+D)\\[10pt]&=\mathbf{\dfrac{(A+B)(A+C)}{A+B+C+D}} \end{align}
This is the count we would expect to see if the two variables were independent (i.e. assuming the null hypothesis is true).
The expected count for each cell under the null hypothesis is:
\(E=\dfrac{\text{(row total)}(\text{column total})}{\text{total sample size}}\)
To demonstrate, we will use the Party Affiliation and Opinion on Tax Reform example.
Observed Table:
|  | favor | indifferent | opposed | total |
| --- | --- | --- | --- | --- |
| democrat | 138 | 83 | 64 | 285 |
| republican | 64 | 67 | 84 | 215 |
| total | 202 | 150 | 148 | 500 |
Find the expected counts for all of the cells.
We need to find what is called the Expected Counts Table or simply the Expected Table. This table displays what the counts would be for our sample data if there were no association between the variables.
Calculating Expected Counts from Observed Counts
|  | favor | indifferent | opposed | total |
| --- | --- | --- | --- | --- |
| democrat | \(\frac{285(202)}{500}=115.14\) | \(\frac{285(150)}{500}=85.5\) | \(\frac{285(148)}{500}=84.36\) | 285 |
| republican | \(\frac{215(202)}{500}=86.86\) | \(\frac{215(150)}{500}=64.5\) | \(\frac{215(148)}{500}=63.64\) | 215 |
| total | 202 | 150 | 148 | 500 |
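As a quick software check (a sketch, not part of the original lesson), the entire expected table can be generated in R from the observed counts; `outer()` forms every (row total)(column total) product in one step:

```r
# Observed counts from the party affiliation and tax reform table
observed <- matrix(c(138, 83, 64,
                     64, 67, 84),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(c("democrat", "republican"),
                                   c("favor", "indifferent", "opposed")))

# E = (row total)(column total) / (total sample size), for every cell at once
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
expected
# democrat   115.14  85.50  84.36
# republican  86.86  64.50  63.64
```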
To better understand what these expected counts represent, first recall that the expected counts table is designed to reflect what the sample data counts would be if the two variables were independent. Taking what we know of independent events, we would be saying that the sample counts should show similarity in opinions of tax reform between democrats and republicans. If you find the proportion of each cell by taking a cell's expected count divided by its row total, you will discover that in the expected table each opinion proportion is the same for democrats and republicans. That is, from the expected counts, 0.404 of the democrats and 0.404 of the republicans favor the bill; 0.3 of the democrats and 0.3 of the republicans are indifferent; and 0.296 of the democrats and 0.296 of the republicans are opposed.
The statistical question becomes, "Are the observed counts so different from the expected counts that we can conclude a relationship exists between the two variables?" To conduct this test we compute a Chi-Square test statistic where we compare each cell's observed count to its respective expected count.
In a summary table, we have \(r\times c=rc\) cells. Let \(O_1, O_2, …, O_{rc}\) denote the observed counts for each cell and \(E_1, E_2, …, E_{rc}\) denote the respective expected counts for each cell.
The Chi-Square test statistic is calculated as follows:
\(\chi^{2*}=\frac{(O_1-E_1)^2}{E_1}+\frac{(O_2-E_2)^2}{E_2}+...+\frac{(O_{rc}-E_{rc})^2}{E_{rc}}=\overset{rc}{ \underset{i=1}{\sum}}\frac{(O_i-E_i)^2}{E_i}\)
Under the null hypothesis and certain conditions (discussed below), the test statistic follows a Chi-Square distribution with degrees of freedom equal to \((r-1)(c-1)\), where \(r\) is the number of rows and \(c\) is the number of columns. We leave out the mathematical details of why this test statistic is used and why it follows a Chi-Square distribution.
As we have done with other statistical tests, we make our decision by either comparing the value of the test statistic to a critical value (rejection region approach) or by finding the probability of getting this test statistic value or one more extreme (p-value approach).
The critical value for our Chi-Square test is \(\chi^2_{\alpha}\) with degrees of freedom \(=(r - 1)(c - 1)\), while the p-value is found by \(P(\chi^2>\chi^{2*})\) with degrees of freedom \(=(r - 1)(c - 1)\).
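Both approaches are one-liners in statistical software. Here is a minimal R sketch (not from the lesson itself); `df`, `alpha`, and `chisq_star` are placeholders to fill in from your own table:

```r
df    <- 2             # (r - 1)(c - 1), e.g., for a 2 x 3 table
alpha <- 0.05
chisq_star <- 22.152   # illustrative test statistic (see the worked example below)

# Rejection region approach: reject H0 when the statistic exceeds the critical value
qchisq(alpha, df, lower.tail = FALSE)

# p-value approach: P(chi-square > chisq_star)
pchisq(chisq_star, df, lower.tail = FALSE)
```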
Let's apply the Chi-Square Test of Independence to our example where we have a random sample of 500 U.S. adults who are questioned regarding their political affiliation and opinion on a tax reform bill. We will test if political affiliation and opinion on a tax reform bill are dependent at a 5% level of significance. Calculate the test statistic.
The contingency table (political_affiliation.csv) is given below. Each cell contains the observed count and the expected count in parentheses. For example, there were 138 democrats who favored the tax bill. The expected count under the null hypothesis is 115.14. Therefore, the cell is displayed as 138 (115.14).
|  | favor | indifferent | opposed | total |
| --- | --- | --- | --- | --- |
| democrat | 138 (115.14) | 83 (85.50) | 64 (84.36) | 285 |
| republican | 64 (86.86) | 67 (64.50) | 84 (63.64) | 215 |
| total | 202 | 150 | 148 | 500 |
Calculating the test statistic by hand:
\begin{multline} \chi^{2*}=\dfrac{(138−115.14)^2}{115.14}+\dfrac{(83−85.50)^2}{85.50}+\dfrac{(64−84.36)^2}{84.36}+\\ \dfrac{(64−86.86)^2}{86.86}+\dfrac{(67−64.50)^2}{64.50}+\dfrac{(84−63.64)^2}{63.64}=22.152\end{multline}
...with degrees of freedom equal to \((2 - 1)(3 - 1) = 2\).
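The same statistic and p-value can be reproduced with R's built-in `chisq.test()` (a sketch, using the observed table above):

```r
observed <- matrix(c(138, 83, 64,
                     64, 67, 84),
                   nrow = 2, byrow = TRUE)

test <- chisq.test(observed)  # no continuity correction for tables larger than 2x2
test$statistic  # X-squared = 22.152
test$parameter  # df = 2
test$p.value    # about 1.5e-05, which Minitab rounds to 0.000
```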
To perform the Chi-Square test in Minitab...
Note! If you have the observed counts in a table, you can copy/paste them into Minitab. For instance, you can copy the entire observed counts table (excluding the totals!) for our example and paste these into Minitab starting with the first empty cell of a column.
The following is the Minitab output for this example.
|  | favor | indifferent | opposed | All |
| --- | --- | --- | --- | --- |
| 1 (democrat) | 138, 115.14, 4.5386 | 83, 85.50, 0.0731 | 64, 84.36, 4.9138 | 285 |
| 2 (republican) | 64, 86.86, 6.0163 | 67, 64.50, 0.0969 | 84, 63.64, 6.5137 | 215 |
| All | 202 | 150 | 148 | 500 |

Cell contents: observed count, expected count, contribution to Chi-Square.

Pearson Chi-Sq = 4.5386 + 0.0731 + 4.9138 + 6.0163 + 0.0969 + 6.5137 = 22.152, DF = 2, P-Value = 0.000
The Chi-Square test statistic is 22.152, calculated by summing all of the individual cells' Chi-Square contributions:

\(4.5386 + 0.0731 + 4.9138 + 6.0163 + 0.0969 + 6.5137 = 22.152\)

The p-value is found by \(P(\chi^2>22.152)\) with degrees of freedom \(=(2-1)(3-1) = 2\).
Minitab calculates this p-value to be less than 0.001 and reports it as 0.000. Given that this p-value of 0.000 is less than the alpha of 0.05, we reject the null hypothesis that political affiliation and opinion on a tax reform bill are independent. We conclude that there is evidence that the two variables are dependent (i.e., that there is an association between the two variables).
Exercise caution when there are small expected counts. Minitab will give a count of the number of cells that have expected frequencies less than five. Some statisticians hesitate to use the Chi-Square test if more than 20% of the cells have expected frequencies below five, especially if the p-value is small and these cells give a large contribution to the total Chi-Square value.
The operations manager of a company that manufactures tires wants to determine whether there are any differences in the quality of work among the three daily shifts. She randomly selects 496 tires and carefully inspects them. Each tire is classified as perfect, satisfactory, or defective, and the shift that produced it is also recorded. The two categorical variables of interest are the shift and the condition of the tire produced. The data (shift_quality.txt) can be summarized by the accompanying two-way table. Does the data provide sufficient evidence at the 5% significance level to infer that there are differences in quality among the three shifts?
|  | Perfect | Satisfactory | Defective | Total |
| --- | --- | --- | --- | --- |
| Shift 1 | 106 | 124 | 1 | 231 |
| Shift 2 | 67 | 85 | 1 | 153 |
| Shift 3 | 37 | 72 | 3 | 112 |
| Total | 210 | 281 | 5 | 496 |
|  | C1 | C2 | C3 | Total |
| --- | --- | --- | --- | --- |
| 1 | 106 (97.80) | 124 (130.87) | 1 (2.33) | 231 |
| 2 | 67 (64.78) | 85 (86.68) | 1 (1.54) | 153 |
| 3 | 37 (47.42) | 72 (63.45) | 3 (1.13) | 112 |
| Total | 210 | 281 | 5 | 496 |

Cell contents: observed count (expected count).

Chi-Sq = 8.647, DF = 4, P-Value = 0.071
Note that there are 3 cells with expected counts less than 5.0.
In the above example, we don't have a significant result at the 5% significance level since the p-value (0.071) is greater than 0.05. Even if we did have a significant result, we still could not trust it, because there are 3 cells (33.3% of them) with expected counts < 5.0.
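For reference, here is a minimal R check of this example; `chisq.test()` even issues a warning about the small expected counts, mirroring Minitab's note:

```r
shifts <- matrix(c(106, 124, 1,
                    67,  85, 1,
                    37,  72, 3),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(paste("Shift", 1:3),
                                 c("Perfect", "Satisfactory", "Defective")))

test <- chisq.test(shifts)
# Warning: Chi-squared approximation may be incorrect
test$statistic  # about 8.65 (Minitab reports 8.647), df = 4
test$p.value    # about 0.071
test$expected   # shows the three cells with expected counts below 5
```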
Sometimes researchers will categorize quantitative data (e.g., take height measurements and categorize as 'below average,' 'average,' and 'above average.') Doing so results in a loss of information - one cannot do the reverse of taking the categories and reproducing the raw quantitative measurements. Instead of categorizing, the data should be analyzed using quantitative methods.
A food services manager for a baseball park wants to know if there is a relationship between gender (male or female) and the preferred condiment on a hot dog. The following table summarizes the results. Test the hypothesis with a significance level of 10%.
| Gender | Ketchup | Mustard | Relish | Total |
| --- | --- | --- | --- | --- |
| Male | 15 | 23 | 10 | 48 |
| Female | 25 | 19 | 8 | 52 |
| Total | 40 | 42 | 18 | 100 |
The hypotheses are:

\(H_0\): Gender and preferred condiment are independent.

\(H_a\): Gender and preferred condiment are dependent.

We need the expected counts table:
| Gender | Ketchup | Mustard | Relish | Total |
| --- | --- | --- | --- | --- |
| Male | 15 (19.2) | 23 (20.16) | 10 (8.64) | 48 |
| Female | 25 (20.8) | 19 (21.84) | 8 (9.36) | 52 |
| Total | 40 | 42 | 18 | 100 |
None of the expected counts in the table are less than 5. Therefore, we can proceed with the Chi-Square test.
The test statistic is:
\(\chi^{2*}=\frac{(15-19.2)^2}{19.2}+\frac{(23-20.16)^2}{20.16}+...+\frac{(8-9.36)^2}{9.36}=2.95\)
The p-value is found by \(P(\chi^2>\chi^{2*})=P(\chi^2>2.95)\) with (3-1)(2-1)=2 degrees of freedom. Using a table or software, we find the p-value to be 0.2288.
With a p-value greater than 10%, we can conclude that there is not enough evidence in the data to suggest that gender and preferred condiment are related.
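A quick R check of this example (a sketch; the matrix is the observed table above):

```r
condiments <- matrix(c(15, 23, 10,
                       25, 19,  8),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(c("Male", "Female"),
                                     c("Ketchup", "Mustard", "Relish")))

test <- chisq.test(condiments)
test$statistic  # X-squared = 2.95 with df = 2
test$p.value    # about 0.23; greater than 0.10, so we fail to reject H0
```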
By Jim Frost
Chi-squared tests of independence determine whether a relationship exists between two categorical variables. Do the values of one categorical variable depend on the value of the other categorical variable? If the two variables are independent, knowing the value of one variable provides no information about the value of the other variable.

I've previously written about Pearson's chi-square test of independence using a fun Star Trek example. Are the uniform colors related to the chances of dying? You can test the notion that the infamous red shirts have a higher likelihood of dying. In that post, I focused on the purpose of the test, applied it to this example, and interpreted the results.
In this post, I’ll take a bit of a different approach. I’ll show you the nuts and bolts of how to calculate the expected values, chi-square value, and degrees of freedom. Then you’ll learn how to use the chi-squared distribution in conjunction with the degrees of freedom to calculate the p-value.
I've used the same approach to explain how other tests, such as t-tests and F-tests, work.
Of course, you’ll usually just let your statistical software perform all calculations. However, understanding the underlying methodology helps you fully comprehend the analysis.
For the Star Trek example, uniform color and status are the two categorical variables. The contingency table below shows the combination of variable values, frequencies, and percentages.
|  | Blue | Gold | Red | Total |
| --- | --- | --- | --- | --- |
| Dead | 7 | 9 | 24 | 40 |
| Alive | 129 | 46 | 215 | 390 |
| Total | 136 | 55 | 239 | N = 430 |
| Fatality rate | 5.15% | 16.36% | 10.04% |  |
If uniform color and fatalities were unrelated, we'd expect the fatality rates to be roughly equal across colors. However, our fatality rates are not equal. Gold has the highest fatality rate at 16.36%, while Blue has the lowest at 5.15%. Red is in the middle at 10.04%. Does this inequality in our sample suggest that the fatality rates are different in the population? Does a relationship exist between uniform color and fatalities?
Thanks to random sampling error, our sample’s fatality rates don’t exactly equal the population’s rates. If the population rates are equal, we’d likely still see differences in our sample. So, the question becomes, after factoring in sampling error, are the fatality rates in our sample different enough to conclude that they’re different in the population? In other words, we want to be confident that the observed differences represent a relationship in the population rather than merely random fluctuations in the sample. That’s where Pearson’s chi-squared test for independence comes in!
The two hypotheses for the chi-squared test of independence are the following:

Null: The variables are independent. No relationship exists.

Alternative: The variables are dependent. A relationship exists.
Related posts: Hypothesis Testing Overview and Guide to Data Types
The chi-squared test of independence compares our sample data in the contingency table to the distribution of values we’d expect if the null hypothesis is correct. Let’s construct the contingency table we’d expect to see if the null hypothesis is true for our population.
For chi-squared tests, the term “expected frequencies” refers to the values we’d expect to see if the null hypothesis is true. To calculate the expected frequency for a specific combination of categorical variables (e.g., blue shirts who died), multiply the column total (Blue) by the row total (Dead), and divide by the sample size.
Row total × Column total / Sample size = Expected value for one table cell
To calculate the expected frequency for the Dead/Blue cell in our dataset, do the following:
40 * 136 / 430 = 12.65
If the null hypothesis is true, we’d expect to see 12.65 fatalities for wearers of the Blue uniforms in our sample. Of course, we can’t have a fraction of a death, but that doesn’t affect the results.
I’ll calculate the expected values for all six cells that represent the combinations of the three uniform colors and two statuses. I’ll also include the observed values in our sample. Expected values are in parentheses.
|  | Blue | Gold | Red | Total |
| --- | --- | --- | --- | --- |
| Dead | 7 (12.65) | 9 (5.12) | 24 (22.23) | 40 |
| Alive | 129 (123.35) | 46 (49.88) | 215 (216.77) | 390 |
| Expected % Dead | 9.3% | 9.3% | 9.3% |  |
In this table, notice how the column percentages for the expected dead are all 9.3%. This equality occurs when the null hypothesis is valid, which is the condition that the expected values represent.
Using this table, we can also compare the values we observe in our sample to the frequencies we’d expect if the null hypothesis that the variables are not related is correct.
For example, the observed frequency for Blue/Dead is less than the expected value (7 < 12.65). In our sample, deaths of those in blue uniforms occurred less frequently than we’d expect if the variables are independent. On the other hand, the observed frequency for Gold/Dead is greater than the expected value (9 > 5.12). Meanwhile, the observed frequency for Red/Dead approximately equals the expected value. This interpretation matches what we concluded by assessing the column percentages in the first contingency table.
Pearson’s chi-squared test works by mathematically comparing observed frequencies to the expected values and boiling all those differences down into one number. Let’s see how it does that!
Related post: Using Contingency Tables to Calculate Probabilities

Most hypothesis tests calculate a test statistic. For example, t-tests use t-values and F-tests use F-values as their test statistics. These statistical tests compare your observed sample data to what you would expect if the null hypothesis is true. The calculations reduce your sample data down to one value that represents how different your data are from the null. Learn more about Test Statistics.
For chi-squared tests, the test statistic is, unsurprisingly, chi-squared, or χ².
The chi-squared calculations involve a familiar concept in statistics—the sum of the squared differences between the observed and expected values. This concept is similar to how regression models assess goodness-of-fit using the sum of the squared differences.
Here's the formula for chi-squared:

χ² = Σ (O − E)² / E

Let's walk through it!
To calculate the chi-squared statistic, take the difference between a pair of observed (O) and expected values (E), square the difference, and divide that squared difference by the expected value. Repeat this process for all cells in your contingency table and sum those values. The resulting value is χ². We'll calculate it for our example data shortly!
Notice several important considerations about chi-squared values:
Zero represents the null hypothesis. If all your observed frequencies equal the expected frequencies exactly, the chi-squared value for each cell equals zero, and the overall chi-squared statistic equals zero. Zero indicates your sample data exactly match what you’d expect if the null hypothesis is correct.
Squaring the differences ensures both that cell values must be non-negative and that larger differences are weighted more than smaller differences. A cell can never subtract from the chi-squared value.
Larger values represent a greater difference between your sample data and the null hypothesis. Chi-squared tests are one-tailed tests rather than the more familiar two-tailed tests. The test determines whether the entire set of differences exceeds a significance threshold. If your χ² passes the limit, your results are statistically significant! You can reject the null hypothesis and conclude that the variables are dependent: a relationship exists.
Related post: One-tailed and Two-tailed Hypothesis Tests
Let’s calculate the chi-squared statistic for our example data! To do that, I’ll rearrange the contingency table, making it easier to illustrate how to calculate the sum of the squared differences.
The first two columns indicate the combination of categorical variable values. The next two are the observed and expected values that we calculated before. The last column is the squared difference divided by the expected value for each row. The bottom line sums those values.
Our chi-squared test statistic is 6.17. Ok, great. What does that mean? Larger values indicate a more substantial divergence between our observed data and the null hypothesis. However, the number by itself is not useful because we don’t know if it’s unusually large. We need to place it into a broader context to determine whether it is an extreme value.
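To make the arithmetic concrete, here is a small R sketch (my reconstruction, not code from the original post) that reproduces the per-cell contributions and their sum:

```r
# Observed counts: rows = Dead/Alive, columns = Blue/Gold/Red
observed <- matrix(c(  7,   9,  24,
                     129,  46, 215),
                   nrow = 2, byrow = TRUE)

# Expected values under independence: (row total)(column total) / N
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Per-cell (O - E)^2 / E, then the sum across all six cells
contributions <- (observed - expected)^2 / expected
sum(contributions)  # about 6.17
```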
One chi-squared test produces a single chi-squared value. However, imagine performing the following process.
First, assume the null hypothesis is valid for the population. At the population level, there is no relationship between the two categorical variables. Now, we’ll repeat our study many times by drawing many random samples from this population using the same design and sample size. Next, we perform the chi-squared test of independence on all the samples and plot the distribution of the chi-squared values. This distribution is known as a sampling distribution, which is a type of probability distribution.
If we follow this procedure, we create a graph that displays the distribution of chi-squared values for a population where the null hypothesis is true. We use sampling distributions to calculate probabilities for how unlikely our sample statistic is if the null hypothesis is correct. Chi-squared tests use the chi-square distribution.
Fortunately, we don’t need to collect many random samples to create this graph! Statisticians understand the properties of chi-squared distributions so we can estimate the sampling distribution using the details of our design.
Our goal is to determine whether our sample chi-squared value is so rare that it justifies rejecting the null hypothesis for the entire population. The chi-squared distribution provides the context for making that determination. We’ll calculate the probability of obtaining a chi-squared value that is at least as high as the value that our study found (6.17).
This probability has a name—the P-value! A low probability indicates that our sample data are unlikely when the null hypothesis is true.
Alternatively, you can use a chi-square table to determine whether our study's chi-square test statistic exceeds the critical value.

Related posts: Sampling Distributions, Understanding Probability Distributions, and Interpreting P-values
For chi-squared tests, the degrees of freedom define the shape of the chi-squared distribution for a design. Chi-square tests use this distribution to calculate p-values. The graph below displays several chi-square distributions with differing degrees of freedom.
For a table with r rows and c columns, the method for calculating degrees of freedom for a chi-square test is (r − 1)(c − 1). For our example, we have two rows and three columns: (2 − 1)(3 − 1) = 2 df.
Read my post about degrees of freedom to learn about this concept along with a more intuitive way of understanding degrees of freedom in chi-squared tests of independence.
Below is the chi-squared distribution for our study’s design.
The distribution curve displays the likelihood of chi-squared values for a population where there is no relationship between uniform color and status at the population level. I shaded the region that corresponds to chi-square values greater than or equal to our study’s value (6.17). When the null hypothesis is correct, chi-square values fall in this area approximately 4.6% of the time, which is the p-value (0.046). With a significance level of 0.05, our sample data are unusual enough to reject the null hypothesis.
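That probability can be computed directly from the chi-squared distribution (an R one-liner, using the statistic and degrees of freedom above):

```r
pchisq(6.17, df = 2, lower.tail = FALSE)  # about 0.046
```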
The sample evidence suggests that a relationship between the variables exists in the population. While this test doesn't indicate red shirts have a higher chance of dying, there is something else going on with red shirts. Read my other post about chi-squared tests to learn about that!
When you have smaller sample sizes, you might need to use Fisher's exact test instead of the chi-square version. To learn more, read my post, Fisher's Exact Test: Using and Interpreting.

Learn more about How to Find the P Value.

You can also read about the chi-square goodness of fit test, which assesses the distribution of outcomes for a categorical or discrete variable.

Pearson's chi-squared test for independence doesn't tell you the effect size. To understand the strength of the relationship, you'd need to use something like Cramér's V, which is a measure of association like Pearson's correlation, except for categorical variables. That's the topic of a future post!
November 15, 2021 at 1:56 pm
Jim – I want to start by saying that I love your site. It has helped me out greatly during many occasions. In this particular example I am interested in understanding the logic around the math for the expected values. For example, can you explain how I should interpret scaling the total number dead by the total number blue?
From there I get that we divide by the total number of people to get the number of blue deaths expected within the group of 430 people. Is this a formula that is well known for contingency tables or did you apply that strictly for this scenario?
Hopefully this question made sense?
Either way, thanks for contributing to the community!
November 16, 2021 at 11:48 am
I’m so glad to hear that my site has been helpful!
I’m not 100% sure what you’re asking, so I’m not sure if I’m answering your question. To start, the formulas are the standard ones for the chi-squared test of independence, which you use in conjunction with contingency tables. You’d use the same methods and formulas for other datasets.
The portion you’re asking about is how to calculate the expected number for blue deaths if there is no association between uniform color and deaths (i.e., the null hypothesis of the test is true). So, the interpretation of the value is: If there is no relationship between uniform color and deaths, we’d expect 12.6 fatalities among those wearing blue uniforms. The test as a whole compares these expected values (for all table cells) to the observed values to determine whether the data support rejecting the null hypothesis and concluding that there is a relationship between the variables.
April 22, 2021 at 7:38 am
I teach AP Stat and am planning on using your example. However, in checking conditions I would like to be able to give background on the origin of the data. I went to your link and found that this data was collected for the TV episodes. Are those the episodes just for the original series?
April 23, 2021 at 11:21 pm
That’s great you’re teaching an AP Stats class! 🙂
Yes, the data I use are from the original TV series that aired from 1966-69.
July 5, 2020 at 12:34 pm
Thank you for your gracious reply. I’m especially happy because it meant that I actually understood! You’ve done a great service with this blog; I plan to return regularly! Thank you.
July 5, 2020 at 5:43 pm
I was thinking exactly that after fixing the comment. It would make a perfect comprehension test. Read this article and find the two incorrect letters! You passed! 🙂
July 4, 2020 at 9:13 am
I very much appreciate your clear explanations. I’m a “50 something” trying to finish a PhD in Library Science and my brain needs the help!
One question, please?
You write above:
Larger values represent a greater difference between your sample data and the null hypothesis. Chi-squared tests are one-tailed tests rather than the more familiar two-tailed tests. The test determines whether the entire set of differences exceeds a significance threshold. If your χ2 passes the limit, your results are statistically significant! You can reject the null hypothesis and conclude that the variables are independent.
I thought that rejecting the null hypothesis allowed you to conclude the opposite. If the null hypothesis is
Null: The variables are independent. No relationship exists.
Then rejecting the Null hypothesis means rejecting that the variables are independent, not concluding that the variables are independent.
This is, please, an honest question (not being "that guy"; I'm not smart enough!).
Again, thank you for your work!! I’m going to check to see if you cover Kendall’s W, as it’s central to a paper I’m reading!
July 4, 2020 at 3:08 pm
First, I definitely welcome all questions! And, especially in this case, because you caught a typo! You're correct about what rejecting the null hypothesis means for this test. I've updated the text to say "and conclude that the variables are dependent." I double-checked the rest of the article, and all the other text about the conclusions based on significance is correct. Just a brain malfunction on my part! I'm grateful you caught that, as that little slip changes the entire meaning!
Alas, I don’t cover Kendall’s W–at least not yet. I plan to add that down the road.
April 28, 2020 at 7:28 pm
Thanks Jim. Your explanations are so effective, yet easy to understand!
April 26, 2020 at 8:54 pm
Thank you Jim. Great post and reply. I have a question which is an extension of Michael’s question.
In general, it seems like one could build any test statistic. Find the distribution of your statistic under the null (say using bootstrap), and that will give you a p-value for your dataset.
Are chi-squared, t, or F-statistics special in some way? Or do we continue to use them simply because people have used them historically?
April 27, 2020 at 12:31 am
Originally, hypothesis tests that used these distributions were easier to calculate. You could calculate the test statistic using a simple formula and then look it up in a table. Later, it got even easier when the computer could both calculate the test statistic and tell you its p-value. It’s really the ease of calculation that made them special along with the theories behind them.
Now, we have such powerful computers that they can easily construct very large sets of bootstrap samples. That would’ve been difficult earlier. So, a large part of the answer is that bootstrapping really wasn’t feasible earlier and so the use of the chi-squared, t, and F distributions became the norm. The historically accepted standards.
It's possible that over time bootstrap methods will be used more. I haven't done extensive research into how efficient they are compared to using the various distributions, but what I have done indicates they are at least roughly on par. If you haven't, I'd suggest reading my post about bootstrapping for more information.
Thanks for asking the great question!
January 31, 2020 at 1:29 am
Nice explanation
January 30, 2020 at 4:21 am
This has started my year, so far so good, Thank you Jim.
January 29, 2020 at 1:32 am
great lesson thanks
January 28, 2020 at 9:24 pm
Thank you, Jim. I will read and work through this lesson today, at 3 o'clock Brasilia time.
January 28, 2020 at 4:40 am
Thank You Sir
January 27, 2020 at 8:49 am
Great post, thanks for writing it. I am looking forward to the Cramer’s V post!
As a person just starting to dive into statistics, I am curious why we so often square the differences in calculations. It seems squaring a difference will put too much weight on large differences. For example, in the chi-square test, what if we used the absolute value of the observed and expected differences? Just something I have been wondering about.
January 28, 2020 at 11:43 pm
Hi Michael,
There are several ways of looking at your question. In some cases, if you just want to know how far observations are from the mean for a dataset, you would be justified in using the mean absolute deviation rather than the standard deviation, which incorporates squared deviations but then takes the square root.
However, in other cases, the squared deviations are built into the underlying analysis, such as in linear regression, where squaring penalizes larger errors, which helps force them to be smaller. Otherwise, the regression line would not "consider" larger errors to be much worse than smaller errors. Here's an article about it in the regression context.
Or, if you're working with the normal distribution and using it to calculate probabilities or whatnot, that distribution has the mean and standard deviation as parameters. And the standard deviation incorporates squared differences. You could not work with the normal distribution using mean absolute deviations (MAD).
In a similar vein for chi-squared tests, you have to realize that the chi-squared distribution is based on squared differences. So, if you wanted to do a similar analysis but with the mean absolute deviation (MAD), you'd have to devise an entirely new test statistic and sampling distribution for it! You couldn't just use the chi-squared distribution because that is specifically for these differences that use squaring. Same thing for F-tests, which use ratios of variances, and variances are of course based on squared differences. Again, to use MAD for something like ANOVA, you'd need to come up with a new test statistic and sampling distribution!
But, the general reason is that squaring weights large differences more heavily, and that fits in with the rationale that, given a distribution of values, outlier values should be weighted more because they are relatively unlikely to occur, so when they do, it's noteworthy. It makes those large differences between the expected and the observed more "odd." And, some analyses use an underlying sampling distribution that is based on a test statistic calculated using squared differences in some fashion.
January 27, 2020 at 2:08 am
Thank you Jim.
January 27, 2020 at 1:12 am
Great lesson Jim! You’re putting it a very simple ways for non-statisticians. Thanks for sharing the knowledge!
January 26, 2020 at 8:32 pm
Thanks for sharing, Jim!
Published on May 31, 2022 by Shaun Turney. Revised on June 21, 2023.
The chi-square (Χ²) distribution table is a reference table that lists chi-square critical values. A chi-square critical value is a threshold for statistical significance for certain hypothesis tests and defines confidence intervals for certain parameters.

Chi-square critical values are calculated from chi-square distributions. They're difficult to calculate by hand, which is why most people use a reference table or statistical software instead.
You will need a chi-square critical value if you want to perform a chi-square test (such as a goodness of fit test, a test of independence, or McNemar's test) or calculate a confidence interval for a population variance or standard deviation.
Use the table below to find the chi-square critical value for your chi-square test or confidence interval, or download the chi-square distribution table (PDF).

The table provides the right-tail probabilities. If you need the left-tail probabilities, you'll need to make a small additional calculation.
To find the chi-square critical value for your hypothesis test or confidence interval, follow the three steps below.
The security team wants to use a chi-square goodness of fit test to test the null hypothesis (H0) that the building's four entrances are used equally often by the population.
There isn’t just one chi-square distribution —there are many, and their shapes differ depending on a parameter called “degrees of freedom” (also referred to as df or k ). Each row of the chi-square distribution table represents a chi-square distribution with a different df.
You need to use the distribution with the correct df for your test or confidence interval. The table below gives equations to calculate df for several common procedures:
| Test or procedure | Degrees of freedom (df) equation |
| --- | --- |
| Test of a single variance; confidence interval for a variance or standard deviation | df = sample size − 1 |
| Chi-square goodness of fit test | df = number of groups − 1 |
| Chi-square test of independence | df = (number of variable 1 groups − 1) × (number of variable 2 groups − 1) |
| McNemar's test | df = 1 |
df = number of groups − 1 = 4 − 1 = 3
The columns of the chi-square distribution table indicate the significance level of the critical value. By convention, the significance level (α) is almost always .05, so the column for .05 is highlighted in the table.
In rare situations, you may want to increase α to decrease your Type II error rate or decrease α to decrease your Type I error rate.
To calculate a confidence interval, choose the significance level based on your desired confidence level:
α = 1 − confidence level
The most common confidence level is 95% (.95), which corresponds to α = .05.
You now have the two numbers you need to find your critical value in the chi-square distribution table: the degrees of freedom (df = 3) and the significance level (α = .05).
The security team can now compare this chi-square critical value to the Pearson’s chi-square they calculated for their sample. If the critical value is larger than the sample’s chi-square, they can reject the null hypothesis.
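Software gives the same threshold as the table; for example, in R (using df = 3 from the four-entrance example and α = .05):

```r
qchisq(p = .05, df = 3, lower.tail = FALSE)  # 7.815, the critical value
```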
The table provided here gives the right-tail probabilities. You should use this table for most chi-square tests, including the chi-square goodness of fit test, the chi-square test of independence, and McNemar's test.
If you want to perform a two-tailed or left-tailed test, you’ll need to make a small additional calculation.
The most common left-tailed test is the test of a single variance when determining whether a population’s variance or standard deviation is less than a certain value.
To find the critical value for a left-tailed probability in the table above, simply use the table column for 1 − α.
You pride yourself on making every cookie the same size, so you decide to randomly sample 25 of your cookies to see if their standard deviation is less than 0.2 inches.
This is a left-tailed test because you want to know if the standard deviation is less than a certain value. You look up the left-tailed probability in the right-tailed table by subtracting your significance level from one: 1 − α = 1 − .05 = .95.
The critical value for df = 25 − 1 = 24 and α = .95 is 13.848.
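The same lookup can be done in R; note lower.tail = TRUE (the default) because this is a left-tailed probability:

```r
qchisq(p = .05, df = 24, lower.tail = TRUE)  # 13.848
```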
The most common two-tailed test is the test of a single variance when determining whether a population's variance or standard deviation is equal to a certain value.
They find in a medical textbook that the standard deviation of head diameter of six-month-old babies is 1 inch, but they want to confirm this number themselves. They randomly sample 20 six-month-old babies and measure their heads.
This is a two-tailed test because they want to know if the standard deviation is equal to a certain value. They should look up the two critical values in the columns for α/2 = .025 and 1 − α/2 = .975.
The critical value for df = 20 − 1 = 19 and α = .025 is 32.852. The critical value for df = 19 and α = .975 is 8.907.
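Both two-tailed critical values can be confirmed in R:

```r
qchisq(p = .025, df = 19, lower.tail = FALSE)  # 32.852, the upper critical value
qchisq(p = .975, df = 19, lower.tail = FALSE)  #  8.907, the lower critical value
```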
You can use the qchisq() function to find a chi-square critical value in R.
For example, to calculate the chi-square critical value for a test with df = 22 and α = .05:
qchisq(p = .05, df = 22, lower.tail = FALSE)
You can use the CHISQ.INV.RT() function to find a chi-square critical value in Excel.
For example, to calculate the chi-square critical value for a test with df = 22 and α = .05, click any blank cell and type:
=CHISQ.INV.RT(0.05,22)
A chi-square distribution is a continuous probability distribution. The shape of a chi-square distribution depends on its degrees of freedom, k. The mean of a chi-square distribution is equal to its degrees of freedom (k) and the variance is 2k. The range is 0 to ∞.
Turney, S. (2023, June 21). Chi-Square (Χ²) Table | Examples & Downloadable Table. Scribbr. Retrieved August 21, 2024, from https://www.scribbr.com/statistics/chi-square-distribution-table/