Example 59.1: Standardization of Variables in Cluster Analysis
To illustrate the effect of standardization in cluster analysis,
this example uses the
Fish data set described in the "Getting Started" section of
Chapter 27, "The FASTCLUS Procedure." The numbers are
measurements taken on 159 fish caught off the coast of Finland; this
data set is available from the Data Archive of the Journal of
Statistics Education. The complete data set is displayed in
Chapter 60, "The STEPDISC Procedure."
The species (Bream, Parkki, Pike, Perch, Roach, Smelt, and Whitefish),
weight, three different length measurements (measured from the nose of
the fish to the beginning of its tail, the notch of its tail, and the
end of its tail), height, and width of each fish are recorded. The
height and width are recorded as percentages of the third length
variable.
Several new variables are created in the Fish data set:
Weight3, Height, Width, and logLengthRatio.
The weight of a fish indicates its size: a heavier Tuna tends to
be larger than a lighter Tuna. To get a one-dimensional measure of
the size of a fish, take the cube root of the weight (Weight3).
The variables Height, Width, Length1, Length2, and Length3
are divided by Weight3 in order to adjust for dimensionality.
The logLengthRatio variable, the logarithm of the ratio of Length3 to
Length1, measures the tail length.
Because the new variables Weight3 through logLengthRatio depend on
the variable Weight, observations with missing or nonpositive values for
Weight are not added to the data set. Consequently,
there are 157 observations in the SAS data set Fish.
Before you perform a cluster analysis on coordinate data, it
is necessary to consider scaling or transforming the variables
since variables with large variances tend to have a larger effect
on the resulting clusters than those with small variances.
This example uses three different approaches to
standardize or transform the data prior to the cluster analysis.
The first approach uses several standardization methods provided in the STDIZE procedure.
However, since standardization is not always appropriate prior to the
clustering (refer to Milligan and Cooper, 1987, for a Monte Carlo
study on various methods of variable standardization),
the second approach performs the cluster analysis with no
standardization.
The third approach invokes the ACECLUS procedure to transform the data by using an
estimate of the within-cluster covariance matrix.
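The remark about large variances can be made concrete. PROC FASTCLUS assigns observations by Euclidean distance, so the squared distance between two observations x and y over the p analysis variables is a plain sum of squared differences, and a variable measured on a wide scale contributes most of that sum. PROC STDIZE replaces each value by (value minus location) divided by scale. The symbols l_j and s_j (location and scale of variable j) are notation introduced here for illustration only:

```latex
d^2(x, y) = \sum_{j=1}^{p} (x_j - y_j)^2
\qquad\longrightarrow\qquad
d_{*}^2(x, y)
  = \sum_{j=1}^{p} \left( \frac{x_j - l_j}{s_j} - \frac{y_j - l_j}{s_j} \right)^2
  = \sum_{j=1}^{p} \frac{(x_j - y_j)^2}{s_j^2}
```

The location l_j cancels, so only the scale measure selected by METHOD= can change the distances. This is consistent with Table 59.4, where METHOD=MEAN and METHOD=MEDIAN, which shift location without rescaling, give the same result as the untransformed data.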
The clustering is performed by the FASTCLUS procedure to find
seven clusters. Note that the variables Length2 and Length3 are
eliminated from this analysis because both are significantly
and highly correlated with the variable Length1: the correlation
coefficients are 0.9958 and 0.9604, respectively.
An output data set is created, and the FREQ procedure is invoked
to compare the clusters with the species classification.
The following PROC FORMAT step and DATA step create the Fish data set:
/*--- format that maps species codes to species names ---*/
proc format;
   value specfmt
      1='Bream'
      2='Roach'
      3='Whitefish'
      4='Parkki'
      5='Perch'
      6='Pike'
      7='Smelt';
run;

data Fish (drop=HtPct WidthPct);
   title 'Fish Measurement Data';
   input Species Weight Length1 Length2 Length3 HtPct
         WidthPct @@;
   /* skip fish whose weight is missing or not positive */
   if Weight <= 0 or Weight=. then delete;
   /* cube root of weight: a one-dimensional measure of size */
   Weight3=Weight**(1/3);
   /* HtPct and WidthPct are percentages of Length3; divide by size */
   Height=HtPct*Length3/(Weight3*100);
   Width=WidthPct*Length3/(Weight3*100);
   Length1=Length1/Weight3;
   Length2=Length2/Weight3;
   Length3=Length3/Weight3;
   logLengthRatio=log(Length3/Length1);
   format Species specfmt.;
   /* specfmt2. applies the specfmt format with a width of 2 */
   symbol = put(Species, specfmt2.);
   datalines;
1 242.0 23.2 25.4 30.0 38.4 13.4
1 290.0 24.0 26.3 31.2 40.0 13.8
1 340.0 23.9 26.5 31.1 39.8 15.1
1 363.0 26.3 29.0 33.5 38.0 13.3
... [155 more records]
;
run;
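Before choosing a standardization, it can help to look at how different the scales of the analysis variables are and to check the correlations among the length variables. The following is a minimal sketch, assuming the complete Fish data set has been created by the preceding DATA step (all records, not only the four shown). The length variables in Fish have already been divided by Weight3, and the text does not say whether the quoted correlation coefficients were computed before or after that rescaling, so this check is not guaranteed to reproduce them.

```sas
/* Spread of each analysis variable: large differences in Std Dev or  */
/* Range suggest that some form of standardization is worth a look    */
proc means data=Fish n mean std min max range;
   var Length1 logLengthRatio Height Width Weight3;
run;

/* Correlation of Length2 and Length3 with Length1 */
proc corr data=Fish nosimple;
   var Length1;
   with Length2 Length3;
run;
```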
The following macro, Std, standardizes the Fish data.
The macro reads a single argument, mtd, which selects
the METHOD= specification to be used in PROC STDIZE.
/*--- macro for standardization ---*/
%macro Std(mtd);
   title2 "Data is standardized by PROC STDIZE with METHOD= &mtd";
   proc stdize data=fish out=sdzout method=&mtd;
      var Length1 logLengthRatio Height Width Weight3;
   run;
%mend Std;
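As a quick check of what the macro produces, one call can be followed by PROC MEANS on the sdzout data set. This is a sketch that assumes the Fish data set and the Std macro defined above. With METHOD=RANGE, PROC STDIZE subtracts the minimum and divides by the range, so each standardized variable should run from 0 to 1.

```sas
%Std(RANGE);

/* After METHOD=RANGE each variable should show Minimum=0 and Maximum=1 */
proc means data=sdzout min max;
   var Length1 logLengthRatio Height Width Weight3;
run;
```

The same check with another METHOD= value shows which location and scale that method removes, for example a mean of 0 and a standard deviation of 1 for METHOD=STD.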
The following macro, FastFreq,
includes a PROC FASTCLUS statement for performing cluster analysis and
a PROC FREQ statement for cross-tabulating species with the
cluster membership information that is derived from the previous PROC FASTCLUS
statement.
The macro reads a single argument, ds, which selects the input
data set to be used in PROC FASTCLUS.
/*--- macro for clustering and cross-tabulating ---*/
/*--- cluster membership with species ---*/
%macro FastFreq(ds);
   proc fastclus data=&ds out=clust maxclusters=7 maxiter=100 noprint;
      var Length1 logLengthRatio Height Width Weight3;
   run;
   proc freq data=clust;
      tables species*cluster;
   run;
%mend FastFreq;
The following analysis (labeled `Approach 1')
applies 18 different methods of standardization,
each followed by clustering. Since there is a large amount of output
from this approach, only results from METHOD=STD, METHOD=RANGE, METHOD=AGK(.14), and
METHOD=SPACING(.14) are shown. The following
statements produce Output 59.1.1 through Output 59.1.4.
/**********************************************************/
/* */
/* Approach 1: data is standardized by PROC STDIZE */
/* */
/**********************************************************/
%Std(MEAN);
%FastFreq(sdzout);
%Std(MEDIAN);
%FastFreq(sdzout);
%Std(SUM);
%FastFreq(sdzout);
%Std(EUCLEN);
%FastFreq(sdzout);
%Std(USTD);
%FastFreq(sdzout);
%Std(STD);
%FastFreq(sdzout);
%Std(RANGE);
%FastFreq(sdzout);
%Std(MIDRANGE);
%FastFreq(sdzout);
%Std(MAXABS);
%FastFreq(sdzout);
%Std(IQR);
%FastFreq(sdzout);
%Std(MAD);
%FastFreq(sdzout);
%Std(AGK(.14));
%FastFreq(sdzout);
%Std(SPACING(.14));
%FastFreq(sdzout);
%Std(ABW(5));
%FastFreq(sdzout);
%Std(AWAVE(5));
%FastFreq(sdzout);
%Std(L(1));
%FastFreq(sdzout);
%Std(L(1.5));
%FastFreq(sdzout);
%Std(L(2));
%FastFreq(sdzout);
Output 59.1.1: Data is standardized by PROC STDIZE with METHOD=STD
Fish Measurement Data
Data is standardized by PROC STDIZE with METHOD= STD
Table of Species by CLUSTER
Cell contents: Frequency, Percent, Row Pct, Col Pct
Species | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | Cluster 5 | Cluster 6 | Cluster 7 | Total |
Bream | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 34 21.66 100.00 100.00 | 0 0.00 0.00 0.00 | 34 21.66 |
Roach | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 19 12.10 100.00 38.00 | 19 12.10 |
Whitefish | 0 0.00 0.00 0.00 | 2 1.27 33.33 10.53 | 0 0.00 0.00 0.00 | 1 0.64 16.67 7.69 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 3 1.91 50.00 6.00 | 6 3.82 |
Parkki | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 11 7.01 100.00 100.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 11 7.01 |
Perch | 0 0.00 0.00 0.00 | 17 10.83 30.36 89.47 | 0 0.00 0.00 0.00 | 12 7.64 21.43 92.31 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 27 17.20 48.21 54.00 | 56 35.67 |
Pike | 17 10.83 100.00 100.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 17 10.83 |
Smelt | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 13 8.28 92.86 100.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 1 0.64 7.14 2.00 | 14 8.92 |
Total | 17 10.83 | 19 12.10 | 13 8.28 | 13 8.28 | 11 7.01 | 34 21.66 | 50 31.85 | 157 100.00 |
Output 59.1.2: Data is standardized by PROC STDIZE with METHOD=RANGE
Fish Measurement Data
Data is standardized by PROC STDIZE with METHOD= RANGE
Table of Species by CLUSTER
Cell contents: Frequency, Percent, Row Pct, Col Pct
Species | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | Cluster 5 | Cluster 6 | Cluster 7 | Total |
Bream | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 34 21.66 100.00 100.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 34 21.66 |
Roach | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 19 12.10 100.00 61.29 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 19 12.10 |
Whitefish | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 3 1.91 50.00 9.68 | 3 1.91 50.00 13.04 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 6 3.82 |
Parkki | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 11 7.01 100.00 100.00 | 0 0.00 0.00 0.00 | 11 7.01 |
Perch | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 9 5.73 16.07 29.03 | 20 12.74 35.71 86.96 | 0 0.00 0.00 0.00 | 27 17.20 48.21 100.00 | 56 35.67 |
Pike | 17 10.83 100.00 100.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 17 10.83 |
Smelt | 0 0.00 0.00 0.00 | 14 8.92 100.00 100.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 14 8.92 |
Total | 17 10.83 | 14 8.92 | 34 21.66 | 31 19.75 | 23 14.65 | 11 7.01 | 27 17.20 | 157 100.00 |
Output 59.1.3: Data is standardized by PROC STDIZE with METHOD=AGK(.14)
Fish Measurement Data
Data is standardized by PROC STDIZE with METHOD= AGK(.14)
Table of Species by CLUSTER
Cell contents: Frequency, Percent, Row Pct, Col Pct
Species | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | Cluster 5 | Cluster 6 | Cluster 7 | Total |
Bream | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 34 21.66 100.00 100.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 34 21.66 |
Roach | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 17 10.83 89.47 73.91 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 2 1.27 10.53 5.71 | 19 12.10 |
Whitefish | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 3 1.91 50.00 13.04 | 0 0.00 0.00 0.00 | 3 1.91 50.00 13.04 | 0 0.00 0.00 0.00 | 6 3.82 |
Parkki | 11 7.01 100.00 100.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 11 7.01 |
Perch | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 3 1.91 5.36 13.04 | 0 0.00 0.00 0.00 | 20 12.74 35.71 86.96 | 33 21.02 58.93 94.29 | 56 35.67 |
Pike | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 17 10.83 100.00 100.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 17 10.83 |
Smelt | 0 0.00 0.00 0.00 | 14 8.92 100.00 100.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 14 8.92 |
Total | 11 7.01 | 14 8.92 | 34 21.66 | 23 14.65 | 17 10.83 | 23 14.65 | 35 22.29 | 157 100.00 |
Output 59.1.4: Data is standardized by PROC STDIZE with METHOD=SPACING(.14)
Fish Measurement Data
Data is standardized by PROC STDIZE with METHOD= SPACING(.14)
Table of Species by CLUSTER
Cell contents: Frequency, Percent, Row Pct, Col Pct
Species | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | Cluster 5 | Cluster 6 | Cluster 7 | Total |
Bream | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 34 21.66 100.00 100.00 | 34 21.66 |
Roach | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 17 10.83 89.47 85.00 | 0 0.00 0.00 0.00 | 2 1.27 10.53 5.26 | 0 0.00 0.00 0.00 | 19 12.10 |
Whitefish | 3 1.91 50.00 13.04 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 3 1.91 50.00 15.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 6 3.82 |
Parkki | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 11 7.01 100.00 100.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 11 7.01 |
Perch | 20 12.74 35.71 86.96 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 36 22.93 64.29 94.74 | 0 0.00 0.00 0.00 | 56 35.67 |
Pike | 0 0.00 0.00 0.00 | 17 10.83 100.00 100.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 17 10.83 |
Smelt | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 14 8.92 100.00 100.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 14 8.92 |
Total | 23 14.65 | 17 10.83 | 11 7.01 | 20 12.74 | 14 8.92 | 38 24.20 | 34 21.66 | 157 100.00 |
The following analysis (labeled `Approach 2') applies the cluster
analysis directly to the original data. The following statements produce Output 59.1.5.
/**********************************************************/
/* */
/* Approach 2: data is untransformed */
/* */
/**********************************************************/
title2 'Data is untransformed';
%FastFreq(fish);
Output 59.1.5: Untransformed Data
Fish Measurement Data
Data is untransformed
Table of Species by CLUSTER
Cell contents: Frequency, Percent, Row Pct, Col Pct
Species | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | Cluster 5 | Cluster 6 | Cluster 7 | Total |
Bream | 13 8.28 38.24 44.83 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 21 13.38 61.76 47.73 | 34 21.66 |
Roach | 3 1.91 15.79 10.34 | 4 2.55 21.05 25.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 12 7.64 63.16 30.77 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 19 12.10 |
Whitefish | 3 1.91 50.00 10.34 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 3 1.91 50.00 6.82 | 6 3.82 |
Parkki | 2 1.27 18.18 6.90 | 3 1.91 27.27 18.75 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 6 3.82 54.55 15.38 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 11 7.01 |
Perch | 8 5.10 14.29 27.59 | 9 5.73 16.07 56.25 | 0 0.00 0.00 0.00 | 1 0.64 1.79 6.67 | 20 12.74 35.71 51.28 | 0 0.00 0.00 0.00 | 18 11.46 32.14 40.91 | 56 35.67 |
Pike | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 10 6.37 58.82 100.00 | 0 0.00 0.00 0.00 | 1 0.64 5.88 2.56 | 4 2.55 23.53 100.00 | 2 1.27 11.76 4.55 | 17 10.83 |
Smelt | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 14 8.92 100.00 93.33 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 14 8.92 |
Total | 29 18.47 | 16 10.19 | 10 6.37 | 15 9.55 | 39 24.84 | 4 2.55 | 44 28.03 | 157 100.00 |
The following analysis (labeled `Approach 3') transforms the original
data with
the ACECLUS procedure and creates a TYPE=ACE output data set that is used as an input
data set for the cluster analysis.
The following statements produce Output 59.1.6.
/**********************************************************/
/* */
/* Approach 3: data is transformed by PROC ACECLUS */
/* */
/**********************************************************/
title2 'Data is transformed by PROC ACECLUS';
proc aceclus data=fish out=ace p=.02 noprint;
var Length1 logLengthRatio Height Width Weight3;
run;
%FastFreq(ace);
Output 59.1.6: Data is transformed by PROC ACECLUS
Fish Measurement Data
Data is transformed by PROC ACECLUS
Table of Species by CLUSTER
Cell contents: Frequency, Percent, Row Pct, Col Pct
Species | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | Cluster 5 | Cluster 6 | Cluster 7 | Total |
Bream | 13 8.28 38.24 44.83 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 21 13.38 61.76 47.73 | 34 21.66 |
Roach | 3 1.91 15.79 10.34 | 4 2.55 21.05 25.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 12 7.64 63.16 30.77 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 19 12.10 |
Whitefish | 3 1.91 50.00 10.34 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 3 1.91 50.00 6.82 | 6 3.82 |
Parkki | 2 1.27 18.18 6.90 | 3 1.91 27.27 18.75 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 6 3.82 54.55 15.38 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 11 7.01 |
Perch | 8 5.10 14.29 27.59 | 9 5.73 16.07 56.25 | 0 0.00 0.00 0.00 | 1 0.64 1.79 6.67 | 20 12.74 35.71 51.28 | 0 0.00 0.00 0.00 | 18 11.46 32.14 40.91 | 56 35.67 |
Pike | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 10 6.37 58.82 100.00 | 0 0.00 0.00 0.00 | 1 0.64 5.88 2.56 | 4 2.55 23.53 100.00 | 2 1.27 11.76 4.55 | 17 10.83 |
Smelt | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 14 8.92 100.00 93.33 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 0 0.00 0.00 0.00 | 14 8.92 |
Total | 29 18.47 | 16 10.19 | 10 6.37 | 15 9.55 | 39 24.84 | 4 2.55 | 44 28.03 | 157 100.00 |
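Output 59.1.6 is identical to Output 59.1.5. One possible explanation, offered here as an observation rather than something the example states, is that the FastFreq macro names the original variables in its VAR statement, and the OUT= data set from PROC ACECLUS keeps those variables unchanged alongside the canonical variables that it adds. The following is a minimal sketch of clustering on the canonical variables instead. It assumes the default PREFIX=Can naming, so that the five canonical variables are Can1 through Can5, and the data set name clustAce is hypothetical. Its results are not part of this example's output.

```sas
/* Sketch only: cluster on the canonical variables created by PROC ACECLUS. */
/* Assumes the default PREFIX=Can, giving variables Can1-Can5.              */
proc fastclus data=ace out=clustAce maxclusters=7 maxiter=100 noprint;
   var Can1-Can5;
run;

/* Cross-tabulate species with the resulting cluster membership */
proc freq data=clustAce;
   tables Species*Cluster;
run;
```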
Table 59.4 summarizes the classification results.
In this table, the first column
gives the standardization method, the second column gives
the number of distinct clusters into which the seven species are classified, and the
third column gives the total number of observations that are
misclassified.
Table 59.4: Summary of Clustering Results
Method of Standardization | Number of Clusters | Misclassification |
MEAN | 5 | 71 |
MEDIAN | 5 | 71 |
SUM | 6 | 51 |
EUCLEN | 6 | 45 |
USTD | 6 | 45 |
STD | 5 | 33 |
RANGE | 7 | 32 |
MIDRANGE | 7 | 32 |
MAXABS | 7 | 26 |
IQR | 5 | 28 |
MAD | 4 | 35 |
ABW(5) | 6 | 34 |
AWAVE(5) | 6 | 29 |
AGK(.14) | 7 | 28 |
SPACING(.14) | 7 | 25 |
L(1) | 6 | 41 |
L(1.5) | 5 | 33 |
L(2) | 5 | 33 |
untransformed | 5 | 71 |
PROC ACECLUS | 5 | 71 |
Consider the results displayed in Output 59.1.1.
In that analysis, the method of standardization is STD,
and the number of clusters and the number of misclassifications
are computed as shown in Table 59.5.
In Output 59.1.1, the Bream species is classified as cluster 6 since
all 34 Bream fish are categorized into cluster 6 with no
misclassification. A similar pattern is seen with the Roach, Parkki,
and Pike species. All but one of the 14 Smelt are categorized into
cluster 3, so Smelt is classified as cluster 3 with 1 misclassification.
For the Whitefish species, two fish are categorized into cluster 2,
one fish is categorized into cluster 4, and three fish are
categorized into cluster 7. Because cluster 7 holds the largest share
of this species, it is recorded in Table 59.5
as being classified as cluster 7 with 3 misclassifications. A similar
pattern is seen with the Perch species: it is classified as cluster 7
with 29 misclassifications.
In summary, when the standardization method is STD, seven species of
fish are classified into only 5 clusters and the total number of
misclassified observations is 33.
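The counting rule described above can be sketched in a few steps: assign each species to the cluster that holds most of its fish, count every other fish of that species as misclassified, and then count the distinct assigned clusters. The following is an illustrative sketch, not part of the original example. The data set names StdCells and Assigned are hypothetical, and the data lines are the nonzero cells of Output 59.1.1 entered by hand.

```sas
/* Nonzero cells of Output 59.1.1 (METHOD=STD): species, cluster, frequency */
data StdCells;
   input Species :$9. Cluster Count;
   datalines;
Bream     6 34
Roach     7 19
Whitefish 2  2
Whitefish 4  1
Whitefish 7  3
Parkki    5 11
Perch     2 17
Perch     4 12
Perch     7 27
Pike      1 17
Smelt     3 13
Smelt     7  1
;
run;

/* Put the largest cell of each species first */
proc sort data=StdCells;
   by Species descending Count;
run;

data Assigned;
   set StdCells;
   by Species;
   retain AssignedCluster Misclassified;
   if first.Species then do;
      AssignedCluster = Cluster;   /* cluster with the most fish */
      Misclassified   = 0;
   end;
   else Misclassified + Count;     /* all other cells are misclassified */
   if last.Species then output;
   keep Species AssignedCluster Misclassified;
run;

/* Number of distinct assigned clusters and total misclassification */
proc sql;
   select count(distinct AssignedCluster) as NumberOfClusters,
          sum(Misclassified)              as Misclassification
   from Assigned;
quit;
```

For these cells the query returns 5 clusters and 33 misclassifications, matching the STD row of Table 59.4. When two cells of a species tie for the largest count (as happens for Whitefish in some of the other outputs), this sketch picks whichever the sort places first, so the cluster count for those cases may need to be resolved by hand.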
The result of this analysis demonstrates that when variables are standardized by the STDIZE
procedure with the methods RANGE, MIDRANGE, MAXABS, AGK(.14),
or SPACING(.14), the FASTCLUS procedure produces the correct number of clusters (seven)
and, in general, fewer misclassifications than it does when other standardization methods are
used.
The SPACING method attains
the best result (25 misclassifications), probably because the variables Length1 and
Height both exhibit marked groupings (bimodality) in their distributions.
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.