Example 41.4: Known G and R
This animal breeding example from Henderson (1984, p. 53) considers
multiple traits. The data are artificial and consist of
measurements of two traits on three animals, but the second trait of
the third animal is missing. Assuming an additive genetic model,
you can use PROC MIXED to predict the breeding value of both traits
on all three animals and also to predict the second trait of the
third animal. The data are as follows:
data h;
input Trait Animal Y;
datalines;
1 1 6
1 2 8
1 3 7
2 1 9
2 2 5
2 3 .
;
Both G and R are known.
In order to read G into PROC MIXED using the GDATA= option in
the RANDOM statement, perform the following DATA step:
data g;
input Row Col1-Col6;
datalines;
1 2 1 1 2 1 1
2 1 2 .5 1 2 .5
3 1 .5 2 1 .5 2
4 2 1 1 3 1.5 1.5
5 1 2 .5 1.5 3 .75
6 1 .5 2 1.5 .75 3
;
The preceding data are in the dense representation for a GDATA= data
set. You can also construct a data set with the sparse
representation using Row, Col, and Value
variables, although this would require 21 observations instead of 6
for this example.
The PROC MIXED code is as follows:
proc mixed data=h mmeq mmeqsol;
class Trait Animal;
model Y = Trait / noint s outp=predicted;
random Trait*Animal / type=un gdata=g g gi s;
repeated / type=un sub=Animal r ri;
parms (4) (1) (5) / noiter;
run;
proc print data=predicted;
run;
The MMEQ and MMEQSOL options request the mixed model equations and
their solution. The variables Trait and Animal are classification
variables, and Trait defines the entire X matrix for the
fixed-effects portion of the model, since the intercept is omitted
with the NOINT option. The fixed-effects solution vector and
predicted values are also requested using the S and OUTP= options,
respectively.
The random effect Trait*Animal leads to a Z matrix
with six columns, the first five corresponding to the identity
matrix and the last consisting of 0s. An unstructured G matrix
is specified using the TYPE=UN option, and it is read into PROC MIXED from a
SAS data set using the GDATA=G specification. The G and GI options
request the display of G and G-1, respectively. The S
option requests that the random-effects solution vector be displayed.
Note that the preceding R matrix is block diagonal if the data
are sorted by animals. The REPEATED statement exploits this fact by
requesting R to have unstructured 2×2 blocks
corresponding to animals, which are the subjects. The R and RI
options request that the estimated 2×2 blocks for the first
animal and its inverse be displayed. The PARMS statement lists the
parameters of this 2×2 matrix. Note that the parameters from
G are not specified in the PARMS statement because they have
already been assigned using the GDATA= option in the RANDOM
statement. The NOITER option prevents PROC MIXED from computing
residual (restricted) maximum likelihood estimates; instead, the
known values are used for inferences.
The results from this analysis are shown in Output 41.4.1.
Output 41.4.1: Known G and R
Model Information |
Data Set |
WORK.H |
Dependent Variable |
Y |
Covariance Structure |
Unstructured |
Subject Effect |
Animal |
Estimation Method |
REML |
Residual Variance Method |
None |
Fixed Effects SE Method |
Model-Based |
Degrees of Freedom Method |
Containment |
|
The "Unstructured" covariance structure applies to both
G and R here.
Class Level Information |
Class |
Levels |
Values |
Trait |
2 |
1 2 |
Animal |
3 |
1 2 3 |
|
The levels of Trait and Animal have been specified
correctly.
Dimensions |
Covariance Parameters |
3 |
Columns in X |
2 |
Columns in Z |
6 |
Subjects |
1 |
Max Obs Per Subject |
6 |
Observations Used |
5 |
Observations Not Used |
1 |
Total Observations |
6 |
|
The three covariance parameters indicated here correspond to those
from the R matrix. Those from G are considered fixed and
known because of the GDATA= option.
Parameter Search |
CovP1 |
CovP2 |
CovP3 |
Res Log Like |
-2 Res Log Like |
4.0000 |
1.0000 |
5.0000 |
-7.3731 |
14.7463 |
|
The preceding table results from the PARMS statement.
Estimated R Matrix for Subject 1 |
Row |
Col1 |
Col2 |
1 |
4.0000 |
1.0000 |
2 |
1.0000 |
5.0000 |
|
The block of R corresponding to the first animal is
shown in the "Estimated R Matrix" table.
Estimated Inv(R) Matrix for Subject 1 |
Row |
Col1 |
Col2 |
1 |
0.2632 |
-0.05263 |
2 |
-0.05263 |
0.2105 |
|
The inverse of the block of R corresponding to the first animal
is shown in the preceding table.
Estimated G Matrix |
Row |
Effect |
Trait |
Animal |
Col1 |
Col2 |
Col3 |
Col4 |
Col5 |
Col6 |
1 |
Trait*Animal |
1 |
1 |
2.0000 |
1.0000 |
1.0000 |
2.0000 |
1.0000 |
1.0000 |
2 |
Trait*Animal |
1 |
2 |
1.0000 |
2.0000 |
0.5000 |
1.0000 |
2.0000 |
0.5000 |
3 |
Trait*Animal |
1 |
3 |
1.0000 |
0.5000 |
2.0000 |
1.0000 |
0.5000 |
2.0000 |
4 |
Trait*Animal |
2 |
1 |
2.0000 |
1.0000 |
1.0000 |
3.0000 |
1.5000 |
1.5000 |
5 |
Trait*Animal |
2 |
2 |
1.0000 |
2.0000 |
0.5000 |
1.5000 |
3.0000 |
0.7500 |
6 |
Trait*Animal |
2 |
3 |
1.0000 |
0.5000 |
2.0000 |
1.5000 |
0.7500 |
3.0000 |
|
The preceding table lists the G matrix as specified in
the GDATA= data set.
Estimated Inv(G) Matrix |
Row |
Effect |
Trait |
Animal |
Col1 |
Col2 |
Col3 |
Col4 |
Col5 |
Col6 |
1 |
Trait*Animal |
1 |
1 |
2.5000 |
-1.0000 |
-1.0000 |
-1.6667 |
0.6667 |
0.6667 |
2 |
Trait*Animal |
1 |
2 |
-1.0000 |
2.0000 |
|
0.6667 |
-1.3333 |
|
3 |
Trait*Animal |
1 |
3 |
-1.0000 |
|
2.0000 |
0.6667 |
|
-1.3333 |
4 |
Trait*Animal |
2 |
1 |
-1.6667 |
0.6667 |
0.6667 |
1.6667 |
-0.6667 |
-0.6667 |
5 |
Trait*Animal |
2 |
2 |
0.6667 |
-1.3333 |
|
-0.6667 |
1.3333 |
|
6 |
Trait*Animal |
2 |
3 |
0.6667 |
|
-1.3333 |
-0.6667 |
|
1.3333 |
|
The preceding table lists G-1. The blank values correspond to
zeros.
Covariance Parameter Estimates |
Cov Parm |
Subject |
Estimate |
UN(1,1) |
Animal |
4.0000 |
UN(2,1) |
Animal |
1.0000 |
UN(2,2) |
Animal |
5.0000 |
|
The parameters from R are listed again.
Fit Statistics |
Res Log Likelihood |
-7.4 |
Akaike's Information Criterion |
-10.4 |
Schwarz's Bayesian Criterion |
-10.1 |
-2 Res Log Likelihood |
14.7 |
|
You can use this model-fitting information to compare
this model with others.
Mixed Model Equations |
Row |
Effect |
Trait |
Animal |
Col1 |
Col2 |
Col3 |
Col4 |
Col5 |
Col6 |
Col7 |
Col8 |
Col9 |
1 |
Trait |
1 |
|
0.7763 |
-0.1053 |
0.2632 |
0.2632 |
0.2500 |
-0.05263 |
-0.05263 |
|
4.6974 |
2 |
Trait |
2 |
|
-0.1053 |
0.4211 |
-0.05263 |
-0.05263 |
|
0.2105 |
0.2105 |
|
2.2105 |
3 |
Trait*Animal |
1 |
1 |
0.2632 |
-0.05263 |
2.7632 |
-1.0000 |
-1.0000 |
-1.7193 |
0.6667 |
0.6667 |
1.1053 |
4 |
Trait*Animal |
1 |
2 |
0.2632 |
-0.05263 |
-1.0000 |
2.2632 |
|
0.6667 |
-1.3860 |
|
1.8421 |
5 |
Trait*Animal |
1 |
3 |
0.2500 |
|
-1.0000 |
|
2.2500 |
0.6667 |
|
-1.3333 |
1.7500 |
6 |
Trait*Animal |
2 |
1 |
-0.05263 |
0.2105 |
-1.7193 |
0.6667 |
0.6667 |
1.8772 |
-0.6667 |
-0.6667 |
1.5789 |
7 |
Trait*Animal |
2 |
2 |
-0.05263 |
0.2105 |
0.6667 |
-1.3860 |
|
-0.6667 |
1.5439 |
|
0.6316 |
8 |
Trait*Animal |
2 |
3 |
|
|
0.6667 |
|
-1.3333 |
-0.6667 |
|
1.3333 |
|
|
The coefficients of the mixed model equations agree with Henderson
(1984, p. 55).
Mixed Model Equations Solution |
Row |
Effect |
Trait |
Animal |
Col1 |
Col2 |
Col3 |
Col4 |
Col5 |
Col6 |
Col7 |
Col8 |
Col9 |
1 |
Trait |
1 |
|
2.5508 |
1.5685 |
-1.3047 |
-1.1775 |
-1.1701 |
-1.3002 |
-1.1821 |
-1.1678 |
6.9909 |
2 |
Trait |
2 |
|
1.5685 |
4.5539 |
-1.4112 |
-1.3534 |
-0.9410 |
-2.1592 |
-2.1055 |
-1.3149 |
6.9959 |
3 |
Trait*Animal |
1 |
1 |
-1.3047 |
-1.4112 |
1.8282 |
1.0652 |
1.0206 |
1.8010 |
1.0925 |
1.0070 |
0.05450 |
4 |
Trait*Animal |
1 |
2 |
-1.1775 |
-1.3534 |
1.0652 |
1.7589 |
0.7085 |
1.0900 |
1.7341 |
0.7209 |
-0.04955 |
5 |
Trait*Animal |
1 |
3 |
-1.1701 |
-0.9410 |
1.0206 |
0.7085 |
1.7812 |
1.0095 |
0.7197 |
1.7756 |
0.02230 |
6 |
Trait*Animal |
2 |
1 |
-1.3002 |
-2.1592 |
1.8010 |
1.0900 |
1.0095 |
2.7518 |
1.6392 |
1.4849 |
0.2651 |
7 |
Trait*Animal |
2 |
2 |
-1.1821 |
-2.1055 |
1.0925 |
1.7341 |
0.7197 |
1.6392 |
2.6874 |
0.9930 |
-0.2601 |
8 |
Trait*Animal |
2 |
3 |
-1.1678 |
-1.3149 |
1.0070 |
0.7209 |
1.7756 |
1.4849 |
0.9930 |
2.7645 |
0.1276 |
|
The solution to the mixed model equations also matches that given by
Henderson (1984, p. 55).
Solution for Fixed Effects |
Effect |
Trait |
Estimate |
Standard Error |
DF |
t Value |
Pr > |t| |
Trait |
1 |
6.9909 |
1.5971 |
3 |
4.38 |
0.0221 |
Trait |
2 |
6.9959 |
2.1340 |
3 |
3.28 |
0.0465 |
|
The estimates for the two traits are nearly identical, but the
standard error of the second one is larger because of the missing
observation.
Solution for Random Effects |
Effect |
Trait |
Animal |
Estimate |
Std Err Pred |
DF |
t Value |
Pr > |t| |
Trait*Animal |
1 |
1 |
0.05450 |
1.3521 |
0 |
0.04 |
. |
Trait*Animal |
1 |
2 |
-0.04955 |
1.3262 |
0 |
-0.04 |
. |
Trait*Animal |
1 |
3 |
0.02230 |
1.3346 |
0 |
0.02 |
. |
Trait*Animal |
2 |
1 |
0.2651 |
1.6589 |
0 |
0.16 |
. |
Trait*Animal |
2 |
2 |
-0.2601 |
1.6393 |
0 |
-0.16 |
. |
Trait*Animal |
2 |
3 |
0.1276 |
1.6627 |
0 |
0.08 |
. |
|
The Estimate column lists the best linear unbiased
predictions (BLUPs) of the breeding values of both traits for all
three animals. The p-values are missing because the default
containment method for computing degrees of freedom results in zero
degrees of freedom for the random effects parameter tests.
Type 3 Tests of Fixed Effects |
Effect |
Num DF |
Den DF |
F Value |
Pr > F |
Trait |
2 |
3 |
10.59 |
0.0437 |
|
The two estimated traits are significantly different from zero at
the 5% level.
Obs |
Trait |
Animal |
Y |
Pred |
StdErrPred |
DF |
Alpha |
Lower |
Upper |
Resid |
1 |
1 |
1 |
6 |
7.04542 |
1.33027 |
0 |
0.05 |
. |
. |
-1.04542 |
2 |
1 |
2 |
8 |
6.94137 |
1.39806 |
0 |
0.05 |
. |
. |
1.05863 |
3 |
1 |
3 |
7 |
7.01321 |
1.41129 |
0 |
0.05 |
. |
. |
-0.01321 |
4 |
2 |
1 |
9 |
7.26094 |
1.72839 |
0 |
0.05 |
. |
. |
1.73906 |
5 |
2 |
2 |
5 |
6.73576 |
1.74077 |
0 |
0.05 |
. |
. |
-1.73576 |
6 |
2 |
3 |
. |
7.12015 |
3.11701 |
0 |
0.05 |
. |
. |
. |
|
The preceding table contains the predicted values of the
observations based on the trait and breeding value estimates, that
is, the fixed and random effects. The predicted values are not the
predictions of future records in the sense that they do not contain
a component corresponding to a new observational error. Refer to
Henderson (1984) for information on predicting future records. The
L95 and U95 columns usually contain confidence limits for the
predicted values; they are missing here because the random-effects
parameter degrees of freedom equals 0.
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.