Output Data Sets
OUT= Data Set
The OUT= data set contains all the variables in the original data
set plus new variables containing the canonical variable scores.
The N= option determines the number of new variables.
The OUT= data set is not created if N=0.
The names of the new variables are formed by concatenating
the value given by the PREFIX= option (or the prefix CAN if the
PREFIX= option is not specified) and the numbers 1, 2, 3, and so on.
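
The naming rule can be illustrated outside SAS with a short Python sketch. The function name `canonical_names` is hypothetical and is not part of any SAS interface; it only mirrors the rule stated above (prefix followed by 1, 2, 3, and so on, with CAN as the default prefix).

```python
def canonical_names(n, prefix="CAN"):
    """Return the names of n canonical variables: prefix + 1, 2, 3, ...

    Hypothetical helper that mirrors the naming rule described in the text;
    it is not a SAS function.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    return [f"{prefix}{i}" for i in range(1, n + 1)]


print(canonical_names(3))                # ['CAN1', 'CAN2', 'CAN3']
print(canonical_names(2, prefix="ACE"))  # ['ACE1', 'ACE2']
print(canonical_names(0))                # [] (no new variables when N=0)
```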
The OUT= data set can be used as input to
PROC CLUSTER or PROC FASTCLUS.
The cluster analysis should be performed on the
canonical variables, not on the original variables.
OUTSTAT= Data Set
The OUTSTAT= data set is a TYPE=ACE data set containing
the following variables:
- the BY variables, if any
- two new character variables, _TYPE_ and _NAME_
- the variables analyzed, that is, those in the VAR statement,
or, if there is no VAR statement, all numeric variables not
listed in any other statement
Each observation in the new data set contains some type
of statistic as indicated by the _TYPE_ variable.
The values of the _TYPE_ variable are as follows:
- MEAN: mean of each variable
- STD: standard deviation of each variable
- N: number of observations on which the analysis is based.
  This value is the same for each variable.
- SUMWGT: sum of the weights if a WEIGHT statement is used.
  This value is the same for each variable.
- COV: covariances between each variable and the
  variable named by the _NAME_ variable.
  The number of observations with _TYPE_=COV is equal to
  the number of variables being analyzed.
- ACE: estimated within-cluster covariances between each
  variable and the variable named by the _NAME_ variable.
  The number of observations with _TYPE_=ACE is equal
  to the number of variables being analyzed.
- EIGENVAL: eigenvalues of INV(ACE)*(COV-ACE).
  If the N= option requests fewer than the maximum number of
  canonical variables, only the specified number of eigenvalues
  are produced, with missing values filling out the observation.
- SCORE: standardized canonical coefficients. The _NAME_ variable
  contains the name of the corresponding canonical variable as
  constructed from the PREFIX= option. The number of observations with
  _TYPE_=SCORE equals the number of canonical variables
  computed.
  To obtain the canonical variable scores, these coefficients should be
  multiplied by the standardized data.
- RAWSCORE: raw canonical coefficients.
  To obtain the canonical variable scores, these coefficients should be
  multiplied by the raw (centered) data.
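
To make the EIGENVAL notation concrete, the following Python sketch computes the eigenvalues of INV(ACE)*(COV-ACE) for the two-variable case. It is an illustration of the matrix expression only, under the assumption of 2x2 matrices with made-up values; the function name `eigenvalues_2x2` and its `n` argument are hypothetical and do not reproduce the procedure's own numerical method.

```python
import math


def eigenvalues_2x2(ace, cov, n=None):
    """Eigenvalues of INV(ACE)*(COV-ACE) for 2x2 matrices, largest first.

    If n is less than 2, the remaining positions are filled with None,
    mimicking the missing values described for the EIGENVAL observation.
    Hypothetical helper for illustration only.
    """
    (a, b), (c, d) = ace
    det = a * d - b * c
    if det == 0:
        raise ValueError("ACE must be nonsingular")
    # Inverse of the 2x2 matrix ACE.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    # COV - ACE, element by element.
    diff = [[cov[i][j] - ace[i][j] for j in range(2)] for i in range(2)]
    # Matrix product INV(ACE) * (COV - ACE).
    m = [[sum(inv[i][k] * diff[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
    # Roots of the characteristic polynomial of a 2x2 matrix.
    trace = m[0][0] + m[1][1]
    det_m = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(max(trace * trace - 4 * det_m, 0.0))
    values = [(trace + root) / 2, (trace - root) / 2]
    if n is None:
        n = 2
    return [values[i] if i < n else None for i in range(2)]


# Made-up matrices: identity within-cluster covariance, diagonal total covariance.
ace = [[1, 0], [0, 1]]
cov = [[5, 0], [0, 2]]
print(eigenvalues_2x2(ace, cov))       # [4.0, 1.0]
print(eigenvalues_2x2(ace, cov, n=1))  # [4.0, None]
```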
The OUTSTAT= data set can be used in the following ways:
- to initialize another execution of PROC ACECLUS
- to compute canonical variable scores with the SCORE procedure
- as input to the FACTOR procedure, specifying METHOD=SCORE,
to rotate the canonical variables
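
The scoring arithmetic described for the SCORE and RAWSCORE observations can be sketched in Python as follows. This is not SAS code and does not call the SCORE procedure; it assumes the OUTSTAT= observations have been read into a list of dictionaries keyed by _TYPE_, _NAME_, and the analyzed variables. The function name `canonical_scores`, the variable names `x` and `y`, and all numeric values are hypothetical.

```python
def canonical_scores(outstat_rows, variables, observation, kind="SCORE"):
    """Compute canonical variable scores for one observation.

    kind="SCORE":    coefficients are applied to standardized data,
                     (value - mean) / std.
    kind="RAWSCORE": coefficients are applied to centered data,
                     value - mean.
    Hypothetical helper; the row layout imitates the OUTSTAT= data set.
    """
    by_type = {}
    for row in outstat_rows:
        by_type.setdefault(row["_TYPE_"], []).append(row)
    mean = by_type["MEAN"][0]
    std = by_type["STD"][0]
    scores = {}
    for coef in by_type[kind]:
        total = 0.0
        for v in variables:
            centered = observation[v] - mean[v]
            value = centered / std[v] if kind == "SCORE" else centered
            total += coef[v] * value
        # _NAME_ holds the canonical variable name, for example CAN1.
        scores[coef["_NAME_"]] = total
    return scores


# Made-up rows in the shape of an OUTSTAT= data set with two variables.
rows = [
    {"_TYPE_": "MEAN", "_NAME_": "", "x": 1.0, "y": 2.0},
    {"_TYPE_": "STD", "_NAME_": "", "x": 2.0, "y": 4.0},
    {"_TYPE_": "SCORE", "_NAME_": "CAN1", "x": 1.0, "y": 1.0},
    {"_TYPE_": "SCORE", "_NAME_": "CAN2", "x": 1.0, "y": -1.0},
    {"_TYPE_": "RAWSCORE", "_NAME_": "CAN1", "x": 0.5, "y": 0.25},
    {"_TYPE_": "RAWSCORE", "_NAME_": "CAN2", "x": 0.5, "y": -0.25},
]
obs = {"x": 3.0, "y": 10.0}
print(canonical_scores(rows, ["x", "y"], obs, kind="SCORE"))
print(canonical_scores(rows, ["x", "y"], obs, kind="RAWSCORE"))
```

In this made-up example the raw coefficients were chosen as the standardized coefficients divided by the standard deviations, so both calls return the same scores; that choice is an assumption of the example, not a statement about how the procedure derives either set of coefficients.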
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.