The VARCLUS Procedure |
The VARCLUS procedure divides a set of numeric variables into either disjoint or hierarchical clusters. Associated with each cluster is a linear combination of the variables in the cluster, which may be either the first principal component or the centroid component. The first principal component is a weighted average of the variables that explains as much variance as possible. See Chapter 52, "The PRINCOMP Procedure," for further details. Centroid components (the CENTROID option) are unweighted averages of either the standardized variables (the default) or the raw variables (if you specify the COV option).
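To make the two kinds of cluster component concrete, the following minimal sketch compares them for a single cluster of three standardized variables. It is an illustration in Python, not PROC VARCLUS itself: the correlation matrix `R`, the helper names, and the use of power iteration to find the first principal component are all assumptions introduced for this example.

```python
import math

def first_principal_component(matrix, iterations=500):
    """Power iteration: return (largest eigenvalue, unit-length eigenvector)."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    mv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v[i] * mv[i] for i in range(n))
    return eigenvalue, v

def explained_by_component(matrix, weights):
    """Variance of the variables explained by the component c = sum_j w_j * x_j.

    For each variable x_i this is cov(x_i, c)^2 / var(c); the total is the
    sum over the variables in the cluster.
    """
    n = len(matrix)
    cov_with_c = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    var_c = sum(weights[i] * cov_with_c[i] for i in range(n))
    return sum(x * x for x in cov_with_c) / var_c

# Hypothetical correlation matrix for one cluster of three variables.
R = [
    [1.0, 0.8, 0.6],
    [0.8, 1.0, 0.5],
    [0.6, 0.5, 1.0],
]

eigenvalue, pc_weights = first_principal_component(R)
centroid_weights = [1.0 / 3.0] * 3  # unweighted average of the variables

pc_explained = explained_by_component(R, pc_weights)
centroid_explained = explained_by_component(R, centroid_weights)

print("first principal component explains:", round(pc_explained, 4))
print("centroid component explains:       ", round(centroid_explained, 4))
```

Under these assumptions the first principal component explains an amount of variance equal to the largest eigenvalue of `R`, and the centroid component can never explain more than that. The centroid component trades a small loss in explained variance for weights that are easier to interpret.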
PROC VARCLUS tries to maximize the sum across clusters of the variance of the original variables that is explained by the cluster components. Either the correlation or the covariance matrix can be analyzed. If correlations are used, all variables are treated as equally important. If covariances are used, variables with larger variances have more importance in the analysis.
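The criterion described above, and the effect of choosing correlations or covariances, can be sketched as follows. This is a hedged illustration only: it evaluates the criterion for two fixed candidate partitions and does not reproduce the search that PROC VARCLUS performs. The matrices, the function names, and the restriction to clusters of one or two variables (so that the largest eigenvalue has a closed form) are assumptions made for this example.

```python
import math

def largest_eigenvalue(matrix, cluster):
    """Variance explained by the first principal component of a cluster.

    Restricted to clusters of one or two variables, where the largest
    eigenvalue of the cluster's submatrix has a closed form.
    """
    if len(cluster) == 1:
        i = cluster[0]
        return matrix[i][i]
    i, j = cluster
    a, b, c = matrix[i][i], matrix[i][j], matrix[j][j]
    return (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b * b)

def share_explained(matrix, partition):
    """Sum across clusters of explained variance, as a share of total variance."""
    explained = sum(largest_eigenvalue(matrix, cluster) for cluster in partition)
    total = sum(matrix[i][i] for i in range(len(matrix)))
    return explained / total

# Hypothetical correlation matrix for four variables.
R = [
    [1.0, 0.8, 0.2, 0.2],
    [0.8, 1.0, 0.2, 0.2],
    [0.2, 0.2, 1.0, 0.6],
    [0.2, 0.2, 0.6, 1.0],
]

# Covariance matrix with the same correlations, but variable 0 has
# standard deviation 10 and the others have standard deviation 1.
sd = [10.0, 1.0, 1.0, 1.0]
S = [[R[i][j] * sd[i] * sd[j] for j in range(4)] for i in range(4)]

partition_a = [(0, 1), (2, 3)]  # groups the strongly correlated pairs
partition_b = [(0, 2), (1, 3)]  # groups weakly correlated pairs

corr_a = share_explained(R, partition_a)
corr_b = share_explained(R, partition_b)
cov_a = share_explained(S, partition_a)
cov_b = share_explained(S, partition_b)

print("correlations:", round(corr_a, 4), round(corr_b, 4))
print("covariances: ", round(cov_a, 4), round(cov_b, 4))
```

With correlations, the two partitions differ sharply in the share of variance they explain. With covariances, variable 0 accounts for most of the total variance, so any partition that represents it well scores close to 1 and the grouping of the remaining variables has little effect on the criterion.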
PROC VARCLUS creates an output data set that can be used with the SCORE procedure to compute component scores for each cluster. A second output data set can be used by the TREE procedure to draw a tree diagram of hierarchical clusters.
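The arithmetic behind component scores can be sketched as shown below. This is not the SCORE procedure or the layout of its input data set. It assumes that each cluster has one row of scoring coefficients, that a variable outside the cluster has a coefficient of zero, and that the coefficients apply to standardized variables, as they would when the correlation matrix is analyzed. All numbers and names are hypothetical.

```python
def cluster_scores(row, means, stds, coefficients):
    """Return one component score per cluster for a single observation."""
    # Standardize each variable with its mean and standard deviation.
    z = [(x - m) / s for x, m, s in zip(row, means, stds)]
    # Each cluster score is a linear combination of the standardized values.
    return [sum(c * zj for c, zj in zip(coef, z)) for coef in coefficients]

# Hypothetical statistics for four variables.
means = [10.0, 20.0, 5.0, 0.0]
stds = [2.0, 4.0, 1.0, 0.5]

# Hypothetical scoring coefficients: cluster 1 uses variables 0 and 1,
# cluster 2 uses variables 2 and 3.
coefficients = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
]

observation = [12.0, 24.0, 4.0, 0.5]
scores = cluster_scores(observation, means, stds, coefficients)
print(scores)
```

Because the clusters are disjoint, each variable contributes to exactly one cluster score in this sketch.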
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.