The GLM Procedure |
PROC GLM provides both univariate and multivariate tests for repeated measures for one response. For an overall reference on univariate repeated measures, refer to Winer (1971). The multivariate approach is covered in Cole and Grizzle (1966). For a discussion of the relative merits of the two approaches, see LaTour and Miniard (1983).
Another approach to analysis of repeated measures is via general mixed models. This approach can handle balanced as well as unbalanced or missing within-subject data, and it offers more options for modeling the within-subject covariance. The main drawback of the mixed models approach is that it generally requires iteration and, thus, may be less computationally efficient. For further details on this approach, see Chapter 41, "The MIXED Procedure," and Wolfinger and Chang (1995).
For example, consider the data set old, in univariate (long) form:

   SUBJ  GROUP  TIME   Y

      1      1     1  15
      1      1     2  19
      1      1     3  25
      2      1     1  21
      2      1     2  18
      2      1     3  17
      1      2     1  14
      1      2     2  12
      1      2     3  16
      2      2     1  11
      2      2     2  20
      .      .     .   .
     10      3     1  14
     10      3     2  18
     10      3     3  16
There are three observations for each subject, corresponding to measurements taken at times 1, 2, and 3. These data could be analyzed using the following statements:
   proc glm data=old;
      class group subj time;
      model y=group subj(group) time group*time;
      test h=group e=subj(group);
   run;
However, this analysis assumes that a subject's measurements are uncorrelated across time. A repeated measures analysis does not make this assumption. It uses the data set new, in multivariate (wide) form:
   GROUP  Y1  Y2  Y3

       1  15  19  25
       1  21  18  17
       2  14  12  16
       2  11  20  21
       .   .   .   .
       3  14  18  16
In the data set new, the three measurements for a subject are all in one observation. For example, the measurements for subject 1 for times 1, 2, and 3 are 15, 19, and 25. For these data, the statements for a repeated measures analysis (assuming default options) are
   proc glm data=new;
      class group;
      model y1-y3=group / nouni;
      repeated time;
   run;
To convert the univariate form of repeated measures data to the multivariate form, you can use a program like the following:
   proc sort data=old;
      by group subj;
   run;

   data new(keep=y1-y3 group);
      array yy(3) y1-y3;
      do time=1 to 3;
         set old;
         by group subj;
         yy(time)=y;
         if last.subj then return;
      end;
   run;
Alternatively, you could use PROC TRANSPOSE to achieve the same results with a program like this one:
   proc sort data=old;
      by group subj;
   run;

   proc transpose out=new(rename=(_1=y1 _2=y2 _3=y3));
      by group subj;
      id time;
   run;
Refer to the discussions in SAS Language Reference: Concepts for more information on rearrangement of data sets.
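If you prepare data outside SAS, the same long-to-wide rearrangement can be sketched in Python. The records and names below are invented for illustration; they mirror the first two subjects of the data set old.

```python
# Hypothetical long-format records (subj, group, time, y),
# mirroring the univariate data set "old".
old = [
    (1, 1, 1, 15), (1, 1, 2, 19), (1, 1, 3, 25),
    (2, 1, 1, 21), (2, 1, 2, 18), (2, 1, 3, 17),
]

# Pivot to one row per (group, subj) with columns y1-y3, the same
# rearrangement the DATA step and PROC TRANSPOSE perform.
wide = {}
for subj, group, time, y in old:
    wide.setdefault((group, subj), [None] * 3)[time - 1] = y

new = [{"group": g, "y1": ys[0], "y2": ys[1], "y3": ys[2]}
       for (g, s), ys in sorted(wide.items())]
print(new[0])  # first multivariate-form observation
```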
Repeated measures analyses are distinguished from MANOVA because of interest in testing hypotheses about the within-subject effects and the within-subject-by-between-subject interactions.
For tests that involve only between-subjects effects, both the multivariate and univariate approaches give rise to the same tests. These tests are provided for all effects in the MODEL statement, as well as for any CONTRASTs specified. The ANOVA table for these tests is labeled "Tests of Hypotheses for Between Subjects Effects" on the PROC GLM results. These tests are constructed by first adding together the dependent variables in the model. Then an analysis of variance is performed on the sum divided by the square root of the number of dependent variables. For example, the statements
   model y1-y3=group;
   repeated time;
give a one-way analysis of variance using (y1 + y2 + y3)/sqrt(3) as the dependent variable for performing tests of hypotheses on the between-subject effect GROUP. Tests for between-subject effects are equivalent to tests of the hypothesis LβM = 0, where M is simply a vector of 1s.
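The construction of this dependent variable can be illustrated numerically. The following Python sketch uses invented responses; it is not PROC GLM's code, only the arithmetic described above.

```python
import numpy as np

# Invented responses for three subjects (columns y1-y3).
Y = np.array([[15., 19., 25.],
              [21., 18., 17.],
              [14., 12., 16.]])

# M is a vector of 1s; the between-subjects dependent variable is
# Y*M scaled by 1/sqrt(k), i.e. (y1 + y2 + y3) / sqrt(3).
M = np.ones(3)
z = Y @ M / np.sqrt(3)
print(z)
```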
For within-subject effects and for within-subject-by-between-subject interaction effects, the univariate and multivariate approaches yield different tests. These tests are provided for the within-subject effects and for the interactions between these effects and the other effects in the MODEL statement, as well as for any CONTRASTs specified. The univariate tests are displayed in a table labeled "Univariate Tests of Hypotheses for Within Subject Effects." Results for multivariate tests are displayed in a table labeled "Repeated Measures Analysis of Variance."
The multivariate tests provided for within-subjects effects and interactions involving these effects are Wilks' Lambda, Pillai's Trace, Hotelling-Lawley Trace, and Roy's maximum root. For further details on these four statistics, see the "Multivariate Tests" section in Chapter 3, "Introduction to Regression Procedures." As an example, the statements
   model y1-y3=group;
   repeated time;
produce multivariate tests for the within-subject effect TIME and the interaction TIME*GROUP.
The multivariate tests for within-subject effects are produced by testing the hypothesis LβM = 0, where the L matrix is the usual matrix corresponding to Type I, Type II, Type III, or Type IV hypothesis tests, and the M matrix is one of several matrices depending on the transformation that you specify in the REPEATED statement. The only assumption required for valid tests is that the dependent variables in the model have a multivariate normal distribution with a common covariance matrix across the between-subject effects.
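The machinery behind these tests can be sketched numerically. The following Python fragment (invented data; a simplified version of the computation, not PROC GLM's code) forms the transformed variables YM, builds hypothesis and error SSCP matrices for the group effect on the transformed variables (the TIME*GROUP interaction test), and computes Wilks' Lambda as det(E)/det(H+E).

```python
import numpy as np

# Invented data: four subjects, responses y1-y3, two groups.
Y = np.array([[15., 19., 25.], [21., 18., 17.],
              [14., 12., 16.], [11., 20., 21.]])
g = np.array([1, 1, 2, 2])

# Orthonormal within-subject contrasts (columns orthogonal to 1s).
M = np.linalg.qr(np.array([[1., 1.], [-1., 0.], [0., -1.]]))[0]
Z = Y @ M                       # transformed within-subject variables

# Hypothesis (H) and error (E) SSCP matrices for the group effect on
# the transformed variables, i.e. the TIME*GROUP interaction test.
grand = Z.mean(axis=0)
H = np.zeros((2, 2))
E = np.zeros((2, 2))
for lev in np.unique(g):
    Zi = Z[g == lev]
    mi = Zi.mean(axis=0)
    E += (Zi - mi).T @ (Zi - mi)
    H += len(Zi) * np.outer(mi - grand, mi - grand)

wilks = np.linalg.det(E) / np.linalg.det(H + E)   # Wilks' Lambda
print(wilks)
```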
The univariate tests for within-subject effects and interactions involving these effects require some assumptions for the probabilities provided by the ordinary F-tests to be correct. Specifically, these tests require certain patterns of covariance matrices, known as Type H covariances (Huynh and Feldt 1970). Data with these patterns in the covariance matrices are said to satisfy the Huynh-Feldt condition. You can test this assumption (and the Huynh-Feldt condition) by applying a sphericity test (Anderson 1958) to any set of variables defined by an orthogonal contrast transformation. Such a set of variables is known as a set of orthogonal components. When you use the PRINTE option in the REPEATED statement, this sphericity test is applied both to the transformed variables defined by the REPEATED statement and to a set of orthogonal components if the specified transformation is not orthogonal. It is the test applied to the orthogonal components that is important in determining whether your data have Type H covariance structure. When there are only two levels of the within-subject effect, there is only one transformed variable, and a sphericity test is not needed. The sphericity test is labeled "Test for Sphericity" on the output.
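As an illustration of the sphericity statistic applied to orthogonal components, the following Python sketch computes Mauchly's W. It omits the chi-square approximation that PROC GLM reports; the covariance and contrast matrices are invented. When the covariance of the orthogonal components is spherical (proportional to the identity), W equals 1.

```python
import numpy as np

def mauchly_w(S, C):
    """Mauchly's sphericity statistic for a k x k covariance S and an
    orthonormal contrast matrix C (k x (k-1))."""
    Sc = C.T @ S @ C            # covariance of the orthogonal components
    p = Sc.shape[0]
    return np.linalg.det(Sc) / (np.trace(Sc) / p) ** p

# Orthonormal contrasts for k = 3 repeated measurements.
C = np.linalg.qr(np.array([[1., 1.], [-1., 0.], [0., -1.]]))[0]
w = mauchly_w(np.eye(3), C)     # spherical covariance gives W = 1
print(w)
```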
If your data satisfy the preceding assumptions, use the usual F-tests to test univariate hypotheses for the within-subject effects and associated interactions.
If your data do not satisfy the assumption of Type H covariance, an adjustment to the numerator and denominator degrees of freedom can be used. Two such adjustments, based on a degrees-of-freedom adjustment factor known as ε (epsilon) (Box 1954), are provided in PROC GLM. Both adjustments estimate ε and then multiply the numerator and denominator degrees of freedom by this estimate before determining significance levels for the F-tests. Significance levels associated with the adjusted tests are labeled "Adj Pr > F" on the output.

The first adjustment, initially proposed for use in data analysis by Greenhouse and Geisser (1959), is labeled "Greenhouse-Geisser Epsilon" and represents the maximum-likelihood estimate of Box's ε factor. Significance levels associated with its adjusted F-tests are labeled "G-G" on the output. Huynh and Feldt (1976) have shown that the G-G estimate tends to be biased downward (that is, too conservative), especially for small samples, and they have proposed an alternative estimator that is constructed using unbiased estimators of the numerator and denominator of Box's ε. Huynh and Feldt's estimator is labeled "Huynh-Feldt Epsilon" on the PROC GLM output, and the significance levels associated with its adjusted F-tests are labeled "H-F." Although ε must be in the range of 0 to 1, the H-F estimator can be outside this range. When the H-F estimator is greater than 1, a value of 1 is used in all calculations for probabilities, and the H-F probabilities are not adjusted.

In summary, if your data do not meet the assumption of Type H covariance, use the adjusted F-tests. However, when you strongly suspect that your data may not have Type H covariance, all of these univariate tests should be interpreted cautiously; in such cases, consider using the multivariate tests instead.
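Box's ε can be computed directly from a covariance matrix. The following Python sketch of the Greenhouse-Geisser formula, tr(Sc)² / ((k-1) tr(Sc²)) where Sc is the covariance of the orthonormal components, uses invented covariance matrices; it is an illustration of the formula, not PROC GLM's implementation.

```python
import numpy as np

def gg_epsilon(S, C):
    """Greenhouse-Geisser estimate of Box's epsilon from a k x k
    covariance S and an orthonormal contrast matrix C (k x (k-1))."""
    Sc = C.T @ S @ C
    p = Sc.shape[0]             # k - 1
    return np.trace(Sc) ** 2 / (p * np.trace(Sc @ Sc))

# Orthonormal contrasts for k = 3 repeated measurements.
C = np.linalg.qr(np.array([[1., 1.], [-1., 0.], [0., -1.]]))[0]

e1 = gg_epsilon(np.eye(3), C)             # spherical: epsilon = 1
e2 = gg_epsilon(np.diag([4., 1., 1.]), C) # nonspherical: epsilon = 0.8
print(e1, e2)
```

Epsilon always lies between 1/(k-1) and 1; the closer it is to 1, the milder the departure from the Type H structure.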
The univariate sums of squares for hypotheses involving within-subject effects can be easily calculated from the H and E matrices corresponding to the multivariate tests described in the "Multivariate Analysis of Variance" section. If the M matrix is orthogonal, the univariate sum of squares is calculated as the trace (sum of diagonal elements) of the appropriate H matrix; if it is not orthogonal, PROC GLM calculates the trace of the H matrix that results from an orthogonal M-matrix transformation. The appropriate error term for the univariate F-tests is constructed in a similar way from the error SSCP matrix and is labeled Error(factorname), where factorname indicates the M matrix that is used in the transformation.
When the design specifies more than one repeated measures factor, PROC GLM computes the M matrix for a given effect as the direct (Kronecker) product of the M matrices defined by the REPEATED statement if the factor is involved in the effect or as a vector of 1s if the factor is not involved. The test for the main effect of a repeated-measures factor is constructed using an L matrix that corresponds to a test that the mean of the observation is zero. Thus, the main effect test for repeated measures is a test that the means of the variables defined by the M matrix are all equal to zero, while interactions involving repeated-measures effects are tests that the between-subjects factors involved in the interaction have no effect on the means of the transformed variables defined by the M matrix. In addition, you can specify other L matrices to test hypotheses of interest by using the CONTRAST statement, since hypotheses defined by CONTRAST statements are also tested in the REPEATED analysis. To see which combinations of the original variables the transformed variables represent, you can specify the PRINTM option in the REPEATED statement. This option displays the transpose of M, which is labeled as M in the PROC GLM results. The tests produced are the same for any choice of transformation (M) matrix specified in the REPEATED statement; however, depending on the nature of the repeated measurements being studied, a particular choice of transformation matrix, coupled with the CANONICAL or SUMMARY option, can provide additional insight into the data being studied.
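The Kronecker-product construction can be illustrated as follows. The contrast matrices below are hypothetical stand-ins for those a REPEATED statement would generate for two repeated factors, A with three levels and B with two.

```python
import numpy as np

# Illustrative contrast matrices for two repeated factors.
MA = np.array([[-1., 1., 0.],
               [-1., 0., 1.]])          # contrasts for A (2 x 3)
MB = np.array([[-1., 1.]])              # contrast for B  (1 x 2)
onesA = np.ones((1, 3))
onesB = np.ones((1, 2))

# A factor enters an effect through its contrast matrix; a factor not
# involved in the effect enters through a vector of 1s.
M_A  = np.kron(MA, onesB)   # main effect of A
M_B  = np.kron(onesA, MB)   # main effect of B
M_AB = np.kron(MA, MB)      # A*B within-subject interaction
print(M_A.shape, M_B.shape, M_AB.shape)
```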
The following sections describe the transformations available in the REPEATED statement, provide an example of the M matrix that each one produces, and give guidelines for its use. As in the PROC GLM output, each displayed matrix is labeled M, although it is actually M', the transpose of the M matrix used in the computations.
For example, suppose five drug treatments are administered to each subject, with the first level representing a placebo, and the responses recorded as d1-d5. The statements

   proc glm;
      model d1-d5= / nouni;
      repeated drug 5 contrast(1) / summary printm;
   run;

produce the following M matrix, in which each drug is compared with the first level (the placebo):

   1  -1   0   0   0
   1   0  -1   0   0
   1   0   0  -1   0
   1   0   0   0  -1
When you examine the analysis of variance tables produced by the SUMMARY option, you can tell which of the drugs differed significantly from the placebo.
As another example, suppose the response is measured at dose levels 1, 2, 5, 10, and 20 (variables r1-r5) for subjects in several groups. The statements

   proc glm;
      class group;
      model r1-r5=group / nouni;
      repeated dose 5 (1 2 5 10 20) polynomial / summary printm;
   run;

produce an M matrix whose rows contain the orthogonal polynomial coefficients (linear, quadratic, cubic, and quartic) for the unequally spaced dose levels 1, 2, 5, 10, and 20.
The SUMMARY option in this example provides univariate ANOVAs for the variables defined by the rows of this M matrix. In this case, they represent the linear, quadratic, cubic, and quartic trends for dose and are labeled dose_1, dose_2, dose_3, and dose_4, respectively.
As a third example, suppose four treatments (variables resp1-resp4) are administered to subjects of each sex. The statements

   proc glm;
      class sex;
      model resp1-resp4=sex / nouni;
      repeated trtmnt 4 helmert / canon printm;
   run;

produce the following M matrix, in which each treatment level is compared with the mean of the subsequent levels:

   1  -0.33333  -0.33333  -0.33333
   0   1.00000  -0.50000  -0.50000
   0   0.00000   1.00000  -1.00000
If, in the drug example, you instead specify

   repeated drug 5 mean / printm;

the following M matrix is produced, in which each drug level (except the last) is compared with the mean of the other levels:

    1.00  -0.25  -0.25  -0.25  -0.25
   -0.25   1.00  -0.25  -0.25  -0.25
   -0.25  -0.25   1.00  -0.25  -0.25
   -0.25  -0.25  -0.25   1.00  -0.25
As with the CONTRAST transformation, if you want to omit a level other than the last, you can specify it in parentheses after the keyword MEAN in the REPEATED statement.
As a final example, suppose test scores t1-t4 are recorded for students in several schools. The statements

   proc glm;
      class school;
      model t1-t4=school / nouni;
      repeated method 4 profile / summary nom printm;
   run;

produce the following M matrix, in which each measurement is compared with the one that follows it:

   1  -1   0   0
   0   1  -1   0
   0   0   1  -1
To determine the point at which an improvement in test scores takes place, you can examine the analyses of variance for the transformed variables representing the differences between adjacent tests. These analyses are requested by the SUMMARY option in the REPEATED statement, and the variables are labeled METHOD.1, METHOD.2, and METHOD.3.
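The effect of the PROFILE transformation on a single subject's scores can be sketched as follows; the scores are invented for illustration.

```python
import numpy as np

# PROFILE M matrix for four levels: each measurement minus the next.
M = np.array([[ 1., -1.,  0.,  0.],
              [ 0.,  1., -1.,  0.],
              [ 0.,  0.,  1., -1.]])

t = np.array([10., 12., 15., 15.])   # one subject's scores t1-t4
print(M @ t)                         # adjacent differences
```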
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.