1. Consider the following table:
Preferred flavor of Ice Cream | |||||
Sex | Chocolate | Vanilla | Strawberry | Coffee | Total |
male | 25 | 15 | 15 | 29 | 84 |
female | 35 | 20 | 10 | 20 | 85 |
Total | 60 | 35 | 25 | 49 | 169 |
2. If you were going to perform a chi-square test on this table, what would your degrees of freedom be?
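You calculate degrees of freedom by multiplying the number of rows minus one by the number of columns minus one. The table has 2 rows (male, female) and 4 columns (the four flavors); the totals do not count as a row or a column: df = (rows - 1) × (cols - 1) = (2 - 1) × (4 - 1) = 3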
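As a quick check of this arithmetic, here is a minimal Python sketch. The function name `degrees_of_freedom` and the list-of-rows layout are illustrative assumptions, not part of the original exercise.

```python
# Observed counts from the ice cream table.
# Rows: male, female. Columns: chocolate, vanilla, strawberry, coffee.
# The Total row and Total column are deliberately left out.
observed = [
    [25, 15, 15, 29],
    [35, 20, 10, 20],
]


def degrees_of_freedom(table):
    """Return (rows - 1) * (cols - 1) for a rectangular table of counts."""
    n_rows = len(table)
    n_cols = len(table[0])
    return (n_rows - 1) * (n_cols - 1)


df = degrees_of_freedom(observed)
print(df)  # 3
```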
3. What would the expected number be for the male vanilla cell? (How many men would you expect to prefer vanilla?)
There are two ways to calculate the expecteds. For both, you need to add a few more numbers to the table. You need the marginal counts and percentages, often called "the marginals." The counts are already present in the "Total" row and column. The marginal percentages have been added in the table below. You can see that 20.71% of all people prefer vanilla and 49.70% of all people are male.
The first way to calculate the expecteds uses the marginal percentages. If sex is not related to flavor preference, you would expect the same percentages of males to prefer the same flavors as females. Since overall 20.71% prefer vanilla, you would expect 20.71% of males to prefer vanilla. Now, 20.71% of 84 people is 17.3964, so this is how many males you would expect to prefer vanilla. You would also expect 20.71% of the females to prefer vanilla, which is 17.6035. So you'd expect 17.3964 males and 17.6035 females to prefer vanilla.
Preferred flavor of Ice Cream | |||||
Sex | Chocolate | Vanilla | Strawberry | Coffee | Total |
male | 25 | 15 | 15 | 29 | 84 (49.70%) |
female | 35 | 20 | 10 | 20 | 85 (50.30%) |
Total | 60 (35.50%) | 35 (20.71%) | 25 (14.79%) | 49 (28.99%) | 169 (100%) |
The second method of calculating the expecteds is a bit more straightforward and is likely to result in slightly more accurate results because it reduces rounding error. To calculate the expecteds for the male vanilla cell, multiply the number of cases in the male row by the number in the vanilla column and divide the result by the total number in the table. Thus 84 × 35 ÷ 169 = 17.3964, the number of males you would expect to prefer vanilla.
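The row total × column total ÷ grand total rule can be sketched in a few lines of Python. This is only an illustration under the assumption that the table is stored as a list of rows without the totals; the name `expected_counts` is hypothetical.

```python
# Observed counts; columns are chocolate, vanilla, strawberry, coffee.
observed = [
    [25, 15, 15, 29],  # male
    [35, 20, 10, 20],  # female
]


def expected_counts(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
print(round(expected[0][1], 4))  # male vanilla: 17.3964
print(round(expected[1][1], 4))  # female vanilla: 17.6036
```

Without intermediate rounding, the female vanilla cell comes out as 17.6036 rather than the 17.6035 obtained from the rounded 20.71%, a small instance of the rounding error mentioned above.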
4. What is the contribution of the female strawberry cell to the value of chi-squared?
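Each cell contributes an amount to chi-squared: contribution = (observed - expected)² ÷ expected
The expected for the female strawberry cell is 85 × 25 ÷ 169 = 12.57396, and the observed count is 10. Therefore, the female strawberry cell's contribution to chi-squared is: (10 - 12.57396)² ÷ 12.57396 ≈ .52691 (computed from the unrounded expected value)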
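The contribution of a single cell, the squared difference between observed and expected divided by the expected, might be checked like this. The helper name `cell_contribution` is an assumption made for the example.

```python
def cell_contribution(observed_count, expected_count):
    """One cell's share of chi-squared: (observed - expected)^2 / expected."""
    return (observed_count - expected_count) ** 2 / expected_count


# Female strawberry cell: observed 10, row total 85, column total 25, grand total 169.
expected_female_strawberry = 85 * 25 / 169
contribution = cell_contribution(10, expected_female_strawberry)

print(round(expected_female_strawberry, 5))  # 12.57396
print(round(contribution, 5))  # 0.52691
```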
5. What is the value of chi-squared for this table?
This table shows the observed and expected counts for each cell in the table.
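Preferred flavor of Ice Cream | ||||
Sex | Chocolate | Vanilla | Strawberry | Coffee |
male, observed | 25 | 15 | 15 | 29 |
male, expected | 29.82249 | 17.39645 | 12.42604 | 24.35503 |
female, observed | 35 | 20 | 10 | 20 |
female, expected | 30.17751 | 17.60355 | 12.57396 | 24.64497 |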
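This table shows, for each cell, the difference between observed and expected (the first row for each sex) and the cell's contribution to chi-squared (the second row for each sex). The cell contributions are calculated according to the formula in the answer to question 4.
Preferred flavor of Ice Cream | ||||
Sex | Chocolate | Vanilla | Strawberry | Coffee |
male, observed - expected | -4.82249 | -2.39645 | 2.57396 | 4.64497 |
male, contribution | .77983 | .33012 | .53318 | .88588 |
female, observed - expected | 4.82249 | 2.39645 | -2.57396 | -4.64497 |
female, contribution | .77065 | .32624 | .52691 | .87546 |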
The value of chi-squared for the table is the sum of the values in each cell:
.77983 + .33012 + .53318 + .88588 + .77065 + .32624 + .52691 + .87546 = 5.02827
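The whole sum can be reproduced in one pass over the table. The sketch below assumes the same list-of-rows layout as the original table without its totals; `chi_squared` is an illustrative name, not a library function.

```python
observed = [
    [25, 15, 15, 29],  # male
    [35, 20, 10, 20],  # female
]


def chi_squared(table):
    """Sum (observed - expected)^2 / expected over every cell of the table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            total += (count - expected) ** 2 / expected
    return total


chi2 = chi_squared(observed)
print(round(chi2, 4))  # approximately 5.0283
```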
6. What is the null hypothesis?
HØ : "Sex is not related to preference of ice cream flavor."
HALT : "Sex is related to preference of ice cream flavor."
7. Is the row variable independent of the column variable?
The 95% critical value for chi-squared (the value at the .05 probability level) with 3 degrees of freedom is 7.815. Since chi-squared for this data, 5.02827, is less than the critical value of 7.815, you fail to reject the null hypothesis. The data shown in this table could have come from a population in which sex is not related to preferred flavor of ice cream. In other words, the data does not convince you that sex is related to preferred flavor. So the answer to the question is "Yes, as far as this test can tell: there is no convincing evidence against independence, so you treat the row variable as independent of the column variable."
8. Do you reject the null hypothesis?
Since chi-squared for this data, 5.02827, is less than the critical value of 7.815, you fail to reject the null hypothesis.
9. How certain are you of your answer?
Critical values of chi-squared | |||||
df | p = .3 | p = .2 | p = .1 | p = .05 | p = .02 |
1 | 1.074 | 1.642 | 2.706 | 3.841 | 5.412 |
2 | 2.408 | 3.219 | 4.605 | 5.991 | 7.824 |
3 | 3.665 | 4.642 | 6.251 | 7.815 | 9.837 |
4 | 4.878 | 5.989 | 7.779 | 9.488 | 11.668 |
Because your value of chi-squared is between 4.642 and 6.251, the probability of getting a value of chi-squared as large as the one you obtained would be between 10% and 20%, even if the row variable is independent of the column variable.
10. How do you know that is how certain you are?
You know because you have examined the table of critical values of chi-squared and you found that a value of chi-squared of 5.02827 with three degrees of freedom has a probability between 0.1 and 0.2.
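The table lookup described here can be sketched as a small search over the tabled critical values. The dictionary below copies only the values shown in the table of critical values; the function name `probability_bracket` and the convention of returning `None` for a bound that falls off the table are assumptions made for this illustration.

```python
# Critical values copied from the table: {df: {tail probability: critical value}}
CRITICAL_VALUES = {
    1: {0.3: 1.074, 0.2: 1.642, 0.1: 2.706, 0.05: 3.841, 0.02: 5.412},
    2: {0.3: 2.408, 0.2: 3.219, 0.1: 4.605, 0.05: 5.991, 0.02: 7.824},
    3: {0.3: 3.665, 0.2: 4.642, 0.1: 6.251, 0.05: 7.815, 0.02: 9.837},
    4: {0.3: 4.878, 0.2: 5.989, 0.1: 7.779, 0.05: 9.488, 0.02: 11.668},
}


def probability_bracket(chi_squared, df):
    """Return (lower, upper) tabled probabilities that bracket the tail probability.

    A bound is None when chi_squared lies beyond that end of the table.
    """
    lower, upper = None, None
    for prob, critical in CRITICAL_VALUES[df].items():
        if chi_squared >= critical:
            # A value at least this large is no more likely than prob.
            upper = prob if upper is None else min(upper, prob)
        else:
            # A value smaller than the critical value is more likely than prob.
            lower = prob if lower is None else max(lower, prob)
    return lower, upper


print(probability_bracket(5.02827, 3))  # (0.1, 0.2)
```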
11. Which method would be more appropriate for interpreting this table - percentage down, compare across or percentage across, compare down? Why?
In this table, the independent variable is sex. Since you want to compare men with women, you would use percentage across, compare down. In other words, you would work with row percents and you would compare the percentages of men and women that fall into a column.
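Preferred flavor of Ice Cream | |||||
Sex | Chocolate | Vanilla | Strawberry | Coffee | Total |
male | 25 (29.762%) | 15 (17.857%) | 15 (17.857%) | 29 (34.524%) | 84 (100%) |
female | 35 (41.176%) | 20 (23.529%) | 10 (11.765%) | 20 (23.529%) | 85 (100%) |
Total | 60 (35.503%) | 35 (20.710%) | 25 (14.793%) | 49 (28.994%) | 169 (100%) |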
You would say: 41.176% of women prefer chocolate compared to only 29.762% of men. 34.524% of men prefer coffee compared to only 23.529% of women. 17.857% of men prefer strawberry, compared to only 11.765% of women. For women, the most preferred flavor was chocolate, while for men, the most preferred flavor was coffee. Overall, the least popular flavor was strawberry.
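The row percents quoted above ("percentage across") can be recomputed with a short sketch. The dictionary layout and the name `row_percentages` are illustrative assumptions.

```python
FLAVORS = ["chocolate", "vanilla", "strawberry", "coffee"]

# Observed counts by sex, in the same column order as FLAVORS.
observed = {
    "male": [25, 15, 15, 29],
    "female": [35, 20, 10, 20],
}


def row_percentages(counts):
    """Express each count as a percentage of its own row total."""
    total = sum(counts)
    return [100 * count / total for count in counts]


percents = {sex: row_percentages(counts) for sex, counts in observed.items()}

# Compare down each column: men versus women for the same flavor.
for j, flavor in enumerate(FLAVORS):
    male_pct = round(percents["male"][j], 3)
    female_pct = round(percents["female"][j], 3)
    print(f"{flavor}: male {male_pct}%, female {female_pct}%")
```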
12. In a large survey done in the USA on a regular basis (the General Social Survey), people were asked "Taken all together, how would you say things are these days -- would you say that you are very happy, pretty happy, or not too happy?" The results of this question are shown here:
count | Reported Happiness vs. Marital Status | |||
Marital Status | ||||
Happiness | Married | Divorced | Single | Total |
Very | 347 | 97 | 58 | 502 |
Less than Very | 467 | 256 | 220 | 943 |
Total | 814 | 353 | 278 | 1445 |
13. What is the null hypothesis?
HØ : "People's marital status is not related to how happy they say they are."
14. If you were going to perform a chi-square test on this table, what would your degrees of freedom be?
The table has 2 rows (the happiness categories) and 3 columns (the marital statuses), so the degrees of freedom would be (2 - 1) × (3 - 1) = 2.
15. What is the alternate hypothesis?
HALT : "People's marital status is related to how happy they say they are."
16. What is the value of chi-square for this table?
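Reported Happiness vs. Marital Status | |||
Marital Status | |||
Happiness | Married | Divorced | Single |
Very, observed | 347 | 97 | 58 |
Very, expected | 282.7875 | 122.6339 | 96.5785 |
Very, contribution | 14.5807 | 5.3582 | 15.4103 |
Less than Very, observed | 467 | 256 | 220 |
Less than Very, expected | 531.2125 | 230.3661 | 181.4215 |
Less than Very, contribution | 7.76194 | 2.8524 | 8.20358 |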
chi-squared = 14.5807 + 5.3582 + 15.4103 + 7.76194 + 2.8524 + 8.20358 = 54.16713
17. Do you reject the null hypothesis?
Yes, you do. The 95% critical value of chi-squared for 2 degrees of freedom is 5.991. The value you obtained is much larger than this.
18. How certain are you of your answer?
The probability of getting a value of chi-squared as large as the one you see for this table is less than .001 when the null hypothesis is true. The chance of making a Type I error if you reject the null hypothesis is less than one in a thousand (0.1%).
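For a rough check of this probability without a table, the sketch below recomputes chi-squared for the happiness table and then uses the fact that, for 2 degrees of freedom only, the upper-tail probability of chi-squared has the closed form exp(-x ÷ 2). The variable names are illustrative, and this shortcut does not apply to other degrees of freedom.

```python
import math

# Observed counts; columns are married, divorced, single.
observed = [
    [347, 97, 58],    # very happy
    [467, 256, 220],  # less than very happy
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Sum of (observed - expected)^2 / expected over all six cells.
chi2 = sum(
    (count - row_totals[i] * col_totals[j] / grand_total) ** 2
    / (row_totals[i] * col_totals[j] / grand_total)
    for i, row in enumerate(observed)
    for j, count in enumerate(row)
)

# Valid for df = 2 only: P(chi-squared >= x) = exp(-x / 2).
p_value = math.exp(-chi2 / 2)

print(round(chi2, 2))   # 54.17
print(p_value < 0.001)  # True
```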