Output Data Sets
OUT= Data Set Created by the OUTPUT Statement
The OUTPUT statement produces an output
data set that contains the following:
- all original data from the SAS data set input to PROC GLM
- the new variables corresponding to the diagnostic
measures specified with statistics keywords in the
OUTPUT statement (PREDICTED=, RESIDUAL=, and so on).
With multiple dependent variables, you can specify, for any
diagnostic measure, one name for each dependent variable. The names
are matched to the dependent variables in the order in which the
dependent variables occur in the MODEL statement.
For example, suppose that the input data set A
contains the variables y1, y2, y3, x1, and x2.
Then you can use the following statements:
   proc glm data=A;
      model y1 y2 y3=x1;
      output out=out p=y1hat y2hat y3hat
                     r=y1resid lclm=y1lcl uclm=y1ucl;
   run;
The output data set out contains y1, y2, y3, x1, x2,
y1hat, y2hat, y3hat, y1resid, y1lcl, and y1ucl.
The variable x2 is output even though it is not used by PROC GLM.
Although predicted values are generated for
all three dependent variables, residuals and confidence
limits for the mean are output for only the first
dependent variable, because only one name is given
for each of those keywords.
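To obtain residuals for every dependent variable, one name per dependent variable can be listed after the R= keyword, in MODEL statement order. The following is a minimal sketch that reuses data set A from the example above; the output data set name out2 and the variable names y2resid and y3resid are hypothetical choices made for illustration.

```sas
/* Sketch: one residual name per dependent variable, listed in */
/* the same order as the dependent variables in MODEL.         */
/* out2, y2resid, and y3resid are illustrative names only.     */
proc glm data=A;
   model y1 y2 y3 = x1;
   output out=out2 p=y1hat y2hat y3hat
                   r=y1resid y2resid y3resid;
run;
quit;
```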
When any independent variable in the analysis (including all class
variables) is missing for an observation, then all new variables that
correspond to diagnostic measures are missing for the observation in
the output data set.
When a dependent variable in the analysis is missing for
an observation, then some new variables that correspond
to diagnostic measures are missing for the observation
in the output data set, and some are still available.
Specifically, in this case, the new variables that
correspond to COOKD, COVRATIO, DFFITS, PRESS, R, RSTUDENT,
STDR, and STUDENT are missing in the output data set.
The variables corresponding to H, LCL, LCLM, P, STDI,
STDP, UCL, and UCLM are not missing.
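Because P is still available when the dependent variable is missing, this behavior can be used to obtain predicted values for new observations. The following is a small sketch under assumed data: the data set scoreme, its values, and the output names are all hypothetical.

```sas
/* Hypothetical data: the last observation has a missing      */
/* response but a nonmissing regressor.                       */
data scoreme;
   input y1 x1;
   datalines;
1.2 1
2.3 2
2.9 3
4.1 4
.   5
;
run;

/* The last observation does not contribute to the fit.       */
/* Per the rules above, y1hat should be nonmissing for it     */
/* and y1resid should be missing.                             */
proc glm data=scoreme;
   model y1 = x1;
   output out=scored p=y1hat r=y1resid;
run;
quit;

proc print data=scored;
run;
```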
OUT= Data Set Created by the LSMEANS Statement
The OUT= option in the LSMEANS statement
produces an output data set that contains the following:
- the unformatted values of each classification variable
specified in any effect in the LSMEANS statement
- a new variable, LSMEAN, which contains
the LS-mean for the specified
levels of the classification variables
- a new variable, STDERR, which contains the
standard error of the LS-mean
The variances and covariances among the LS-means are also output
when the COV option is specified along with the OUT= option.
In this case, only one effect can be specified
in the LSMEANS statement, and the following
variables are included in the output data set:
- new variables, COV1, COV2, ..., COVn,
where n is the number of levels of the
effect specified in the LSMEANS statement.
These variables contain the covariances of each LS-mean
with each other LS-mean.
- a new variable, NUMBER, which provides an
index for each observation to identify the
covariances that correspond to that observation.
The covariances for the observation with NUMBER
equal to n can be found in the variable COVn.
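As an illustration of the OUT= and COV options together, the following sketch fits a one-way model and saves the LS-means with their covariances. The data set trial, the variables trt and y, and the output data set name lsm are assumptions made for this example.

```sas
/* Hypothetical one-way layout with three treatment levels.   */
data trial;
   input trt $ y @@;
   datalines;
a 10 a 12 b 15 b 14 c 20 c 19
;
run;

/* COV requires a single effect in the LSMEANS statement.     */
/* With three levels of trt, the data set lsm is expected to  */
/* hold COV1-COV3 alongside LSMEAN, STDERR, and NUMBER.       */
proc glm data=trial;
   class trt;
   model y = trt;
   lsmeans trt / out=lsm cov;
run;
quit;

proc print data=lsm;
run;
```

In the printed data set, the covariance between the LS-means in the observations with NUMBER equal to 1 and 2 would be read from COV2 in the first observation (or, equivalently, COV1 in the second).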
OUTSTAT= Data Set
The OUTSTAT= option in the PROC GLM statement
produces an output data set that contains the following:
- the BY variables, if any
- _TYPE_, a new character variable.
_TYPE_ may take the values 'SS1', 'SS2', 'SS3', 'SS4',
or 'CONTRAST', corresponding to the various types of sums
of squares generated, or the values 'CANCORR', 'STRUCTUR',
or 'SCORE', if a canonical analysis is performed through
the MANOVA statement and no M= matrix is specified.
- _SOURCE_, a new character variable.
For each observation in the data set, _SOURCE_ contains
the name of the model effect or contrast label from which
the corresponding statistics are generated.
- _NAME_, a new character variable.
For each observation in the data set, _NAME_ contains
the name of one of the dependent variables in the model
or, in the case of canonical statistics, the name of one
of the canonical variables (CAN1, CAN2, and so forth).
- four new numeric variables: SS, DF,
F, and PROB,
containing sums of squares, degrees of freedom, F
values, and probabilities, respectively, for each model
or contrast sum of squares generated in the analysis.
For observations resulting from canonical analyses,
these variables have missing values.
- if there is more than one dependent variable,
then variables with the same names as the
dependent variables represent the following:
  - for _TYPE_=SS1, SS2, SS3, SS4, or CONTRAST,
    the crossproducts of the hypothesis matrices
  - for _TYPE_=CANCORR, canonical correlations
    for each variable
  - for _TYPE_=STRUCTUR, coefficients of the total
    structure matrix
  - for _TYPE_=SCORE, raw canonical
    score coefficients
The output data set can be used to perform special hypothesis
tests (for example, with the IML procedure in SAS/IML software), to
reformat output, to produce canonical variates (through the
SCORE procedure), or to rotate structure matrices (through the FACTOR procedure).
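As a simple case of reformatting output, the following sketch saves the OUTSTAT= data set and prints only the Type III rows. It assumes the data set A described earlier (variables y1, y2, y3, x1, and x2); the output data set name stats is hypothetical.

```sas
/* Save the model statistics to a data set named stats.       */
proc glm data=A outstat=stats;
   model y1 y2 y3 = x1 x2;
run;
quit;

/* Keep only the Type III sums of squares and show the        */
/* variables described in the list above.                     */
proc print data=stats;
   where _TYPE_ = 'SS3';
   var _NAME_ _SOURCE_ _TYPE_ DF SS F PROB;
run;
```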
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.