Example 30.3: Unbalanced ANOVA for Two-Way Design with Interaction
This example uses data from Kutner (1974, p. 98) to
illustrate a two-way analysis of variance.
The original data source is Afifi and Azen (1972, p. 166).
The following statements produce Output 30.3.1.
/*---------------------------------------------------------*/
/* Note: Kutner's 24 for drug 2, disease 1 changed to 34. */
/*---------------------------------------------------------*/
title 'Unbalanced Two-Way Analysis of Variance';
data a;
   input drug disease @;
   do i=1 to 6;
      input y @;
      output;
   end;
   datalines;
1 1 42 44 36 13 19 22
1 2 33 . 26 . 33 21
1 3 31 -3 . 25 25 24
2 1 28 . 23 34 42 13
2 2 . 34 33 31 . 36
2 3 3 26 28 32 4 16
3 1 . . 1 29 . 19
3 2 . 11 9 7 1 -6
3 3 21 1 . 9 3 .
4 1 24 . 9 22 -2 15
4 2 27 12 12 -5 16 15
4 3 22 7 25 5 12 .
;

proc glm;
   class drug disease;
   model y=drug disease drug*disease / ss1 ss2 ss3 ss4;
run;
Output 30.3.1: Unbalanced ANOVA for Two-Way Design with Interaction

Unbalanced Two-Way Analysis of Variance

Class Level Information

Class     Levels   Values
drug           4   1 2 3 4
disease        3   1 2 3

Number of observations   72

NOTE: Due to missing values, only 58 observations can be used in this analysis.

Unbalanced Two-Way Analysis of Variance

The GLM Procedure

Dependent Variable: y

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model             11      4259.338506    387.212591      3.51   0.0013
Error             46      5080.816667    110.452536
Corrected Total   57      9340.155172

R-Square   Coeff Var   Root MSE     y Mean
0.456024    55.66750   10.50964   18.87931

Source         DF     Type I SS   Mean Square   F Value   Pr > F
drug            3   3133.238506   1044.412835      9.46   <.0001
disease         2    418.833741    209.416870      1.90   0.1617
drug*disease    6    707.266259    117.877710      1.07   0.3958

Source         DF    Type II SS   Mean Square   F Value   Pr > F
drug            3   3063.432863   1021.144288      9.25   <.0001
disease         2    418.833741    209.416870      1.90   0.1617
drug*disease    6    707.266259    117.877710      1.07   0.3958

Source         DF   Type III SS   Mean Square   F Value   Pr > F
drug            3   2997.471860    999.157287      9.05   <.0001
disease         2    415.873046    207.936523      1.88   0.1637
drug*disease    6    707.266259    117.877710      1.07   0.3958

Source         DF    Type IV SS   Mean Square   F Value   Pr > F
drug            3   2997.471860    999.157287      9.05   <.0001
disease         2    415.873046    207.936523      1.88   0.1637
drug*disease    6    707.266259    117.877710      1.07   0.3958

Note the differences between the four types of sums of squares.
The Type I sum of squares for drug essentially tests for
differences between the expected values of the arithmetic mean
response for different drugs, unadjusted for the effect of disease.
By contrast, the Type II sum of squares for drug measures the
differences between arithmetic means for each drug after adjusting for
disease. The Type III sum of squares measures the differences
between predicted drug means over a balanced drug×disease
population; that is, between the LS-means for drug.
Finally, the Type IV sum of squares is the same as the Type III sum
of squares in this case, since there are data for every drug-by-disease
combination.
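Because Type I sums of squares are sequential, the adjustment that each effect receives depends on the order of the effects in the MODEL statement. The following sketch is not part of the original example; it assumes the data set a created above and simply lists disease before drug. In this ordering, the Type I sum of squares for drug is adjusted for disease, so it should agree with the Type II value for drug in Output 30.3.1 (3063.432863), while the Type I sum of squares for disease becomes the unadjusted one.

```sas
/* Illustrative sketch: same model as above, with disease listed first. */
/* Type I SS are sequential, so drug is now adjusted for disease.       */
proc glm data=a;
   class drug disease;
   model y=disease drug disease*drug / ss1;
run;
quit;
```

Submitting this step ends any earlier interactive PROC GLM session, so rerun the original PROC GLM statements before submitting follow-up statements for that analysis.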
No matter which sum of squares you prefer to use, this analysis shows
a significant difference among the four drugs, while the disease effect
and the drug-by-disease interaction are not significant.
As the previous discussion indicates, Type III sums of squares correspond to
differences between LS-means, so you can follow up the Type III tests with
a multiple comparisons analysis of the drug LS-means.
Since the GLM procedure is interactive, you can accomplish this by
submitting the following statements after the previous ones that
performed the ANOVA.
lsmeans drug / pdiff=all adjust=tukey;
run;
Both the LS-means themselves and a matrix of adjusted p-values for
pairwise differences between them are displayed; see Output 30.3.2.
Output 30.3.2: LS-Means for Unbalanced ANOVA

Unbalanced Two-Way Analysis of Variance

The GLM Procedure
Least Squares Means
Adjustment for Multiple Comparisons: Tukey-Kramer

drug      y LSMEAN   LSMEAN Number
1       25.9944444               1
2       26.5555556               2
3        9.7444444               3
4       13.5444444               4

Unbalanced Two-Way Analysis of Variance

The GLM Procedure
Least Squares Means
Adjustment for Multiple Comparisons: Tukey-Kramer

Least Squares Means for effect drug
Pr > |t| for H0: LSMean(i)=LSMean(j)

Dependent Variable: y

i/j        1        2        3        4
1              0.9989   0.0016   0.0107
2     0.9989            0.0011   0.0071
3     0.0016   0.0011            0.7870
4     0.0107   0.0071   0.7870

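For this model, which includes the interaction and has data in every cell, each drug LS-mean in Output 30.3.2 is the unweighted average of that drug's three cell means. For drug 1, the cell means are 176/6 = 29.3333, 113/4 = 28.25, and 102/5 = 20.4, and their average is 25.9944, which matches the first LS-mean. The following sketch reproduces this calculation for all four drugs; the data set name cellmeans and the variable name cellmean are hypothetical names introduced only for illustration.

```sas
/* Step 1: one row per drug-by-disease cell, holding the cell mean of y. */
/* Missing values of y are excluded from each mean.                      */
proc means data=a noprint nway;
   class drug disease;
   var y;
   output out=cellmeans mean=cellmean;
run;

/* Step 2: unweighted average of the three cell means for each drug. */
proc means data=cellmeans mean maxdec=4;
   class drug;
   var cellmean;
run;
```

The arithmetic mean for drug 1, by contrast, weights each cell by its sample size: 391/15 = 26.0667.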
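The multiple comparisons analysis shows that drugs 1 and 2 have very
similar effects (adjusted p = 0.9989), and that drugs 3 and 4 are also
not significantly different from each other (adjusted p = 0.7870).
Each of the four comparisons between a drug in the first pair and a drug
in the second pair is significant, with adjusted p-values of 0.0107 or less.
Evidently, the main contribution to the
significant drug effect is the difference between the 1/2 pair and
the 3/4 pair.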
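One way to examine the 1/2 pair against the 3/4 pair directly is a single-degree-of-freedom contrast. The following is a minimal sketch and not part of the original example. It assumes the data set a created above and the default level ordering, in which disease varies fastest within drug for the drug*disease parameters. The interaction coefficients are written out explicitly so that the contrast compares averages of cell means, in the same way that the LS-means do.

```sas
/* Illustrative sketch: drugs 1 and 2 versus drugs 3 and 4,              */
/* averaged over the three diseases.                                     */
/* Each drug coefficient equals the sum of that drug's three             */
/* drug*disease coefficients, which keeps the contrast estimable.        */
proc glm data=a;
   class drug disease;
   model y=drug disease drug*disease / ss3;
   contrast 'drugs 1,2 vs 3,4'
            drug          3  3  -3 -3
            drug*disease  1 1 1  1 1 1  -1 -1 -1  -1 -1 -1;
run;
quit;
```

Because this contrast was suggested by the data, its p-value is best read as descriptive rather than as a planned test.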
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.