Specifies the variables whose values define the subgroup combinations
for the analysis.
CLASS variable(s) </ options>;
- variable(s)
- specifies one or more variables that the
procedure uses to group the data. Variables in a CLASS statement are referred
to as class variables. Class variables are numeric or character.
Class variables can have continuous values, but they typically have a few
discrete values that define levels of the variable. You do not have to sort
the data by class variables.
Interaction: |
Use the TYPES statement
and the WAYS statement to control which class variables PROC MEANS uses
to group the data. |
Tip: |
To reduce the number of
class variable levels, use a FORMAT statement to combine variable values.
When a format combines several internal values into one formatted value, PROC
MEANS outputs the lowest internal value. |
See also: |
Using Class Variables |
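The Tip above can be illustrated with a minimal sketch. It assumes the SASHELP.CLASS sample data set that ships with SAS; the format name AGEGRP is hypothetical and was chosen only for this illustration.

```sas
/* Hypothetical format that collapses six ages into two levels */
proc format;
   value agegrp 11-13 = 'Younger'
                14-16 = 'Older';
run;

/* SEX and AGE are class variables; the input does not need to be sorted */
proc means data=sashelp.class n mean;
   class sex age;
   format age agegrp.;   /* AGE is grouped by its formatted values */
   var height;
run;
```

Without the FORMAT statement, each distinct age would form its own level.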
-
ASCENDING
- specifies to sort the class variable levels
in ascending order.
-
DESCENDING
- specifies to sort the class variable levels
in descending order.
Alias: |
DESCEND |
Interaction: |
PROC MEANS issues
a warning message if you specify both ASCENDING and DESCENDING and ignores
both options. |
-
EXCLUSIVE
- excludes from the analysis all combinations
of the class variables that are not found in the preloaded range of user-defined
formats.
-
GROUPINTERNAL
- specifies not to apply formats to the class
variables when PROC MEANS groups the values to create combinations of class
variables.
Interaction: |
If you specify the
PRELOADFMT option, PROC MEANS ignores this option and uses the formatted values. |
Tip: |
This option saves computer
resources when the numeric class variables contain discrete values. |
See also: |
Computer Resources |
-
MISSING
- considers missing values as valid values
for the class variable levels. Special missing values that represent numeric
values (the letters A through Z and the underscore (_) character) are each
considered as a separate value.
- MLF
- enables PROC MEANS to use the primary and secondary format
labels for a given range or overlapping ranges to create subgroup combinations
when a multilabel format is assigned to a class variable.
Requirement: |
You must use PROC FORMAT and the
MULTILABEL option in the VALUE statement to create a multilabel format. |
Interaction: |
If you use the OUTPUT statement
with MLF, the class variable contains a character string that corresponds
to the formatted value. Because the formatted value becomes the internal value,
the length of this variable is the number of characters in the longest format
label. |
Interaction: |
Using MLF with ORDER=FREQ may not
produce the order that you expect for the formatted values. |
Tip: |
If you omit MLF, PROC MEANS uses the primary
format labels, which corresponds to using the first external format value,
to determine the subgroup combinations. |
See also: |
The MULTILABEL option in the VALUE
statement of the FORMAT procedure. |
Featured in: |
Using Multi-label Value Formats with Class Variables |
Note: When the formatted values overlap, one internal class variable
value maps to more than one class variable subgroup combination. Therefore,
the sum of the N statistics for all subgroups is greater than the number of observations
in the data set (the overall N statistic).
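As a rough illustration of MLF, the following sketch defines a multilabel format with overlapping ranges and assigns it to a class variable. It assumes the SASHELP.CLASS sample data set; the format name AGEFMT and its labels are hypothetical.

```sas
/* MULTILABEL allows one internal value to map to several labels */
proc format;
   value agefmt (multilabel)
      11-13 = '11 to 13'
      14-16 = '14 to 16'
      11-16 = 'All ages';
run;

/* MLF tells PROC MEANS to use the secondary labels as well as the primary ones */
proc means data=sashelp.class n mean;
   class age / mlf;
   format age agefmt.;
   var height;
run;
```

Because every age in the 11-16 range matches two labels here, each observation contributes to two levels, which is the overlap effect that the Note describes.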
-
ORDER=DATA | FORMATTED | FREQ | UNFORMATTED
- specifies the order to group the levels
of the class variables in the output, where
- DATA
- orders values according to their order in
the input data set.
Interaction: |
If you use PRELOADFMT,
the order for the values of each class variable matches the order that PROC
FORMAT uses to store the values of the associated user-defined format. If
you use the CLASSDATA= option in the PROC statement, PROC MEANS uses the order
of the unique values of each class variable in the CLASSDATA= data set to
order the output levels. If you use both options, PROC MEANS first uses the
user-defined formats to order the output. If you omit EXCLUSIVE in the PROC
statement, PROC MEANS appends after the user-defined format and the CLASSDATA=
values the unique values of the class variables in the input data set based
on the order that they are encountered. |
Tip: |
By default, PROC FORMAT
stores a format definition in sorted order. Use the NOTSORTED option to store
the values or ranges of a user-defined format in the order that you define
them. |
Featured in: |
Computing Output Statistics with Missing Class Variable Values |
- FORMATTED
- orders values by their ascending formatted
values. This order depends on your operating environment.
- FREQ
- orders values by descending frequency count
so that levels with the most observations are listed first.
Interaction: |
For multiway combinations
of the class variables, PROC MEANS determines the order of a level from the
individual class variable frequencies. |
Interaction: |
Use the ASCENDING
option to order values by ascending frequency count. |
Featured in: |
Using Multi-label Value Formats with Class Variables |
- UNFORMATTED
- orders values by their unformatted values,
which yields the same order as PROC SORT. This order depends on your operating
environment. This sort sequence is particularly useful for displaying dates
chronologically.
Default: |
UNFORMATTED |
Tip: |
By default, all orders except
FREQ are ascending. For descending orders, use the DESCENDING option. |
See also: |
Ordering the Class Values |
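A short sketch of how ORDER= combines with the ASCENDING option, assuming the SASHELP.CLASS sample data set:

```sas
/* ORDER=FREQ lists the most frequent level first by default;  */
/* adding ASCENDING reverses that, so the rarest age is first. */
proc means data=sashelp.class n mean;
   class age / order=freq ascending;
   var height;
run;
```

Dropping ASCENDING restores the default descending frequency order for ORDER=FREQ.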
-
PRELOADFMT
- specifies that all formats are preloaded
for the class variables.
Requirement: |
PRELOADFMT has no
effect unless you specify either COMPLETETYPES, EXCLUSIVE, or ORDER=DATA and
you assign formats to the class variables. |
Interaction: |
To limit PROC MEANS
output to the combinations of formatted class variable values present in the
input data set, use the EXCLUSIVE option in the CLASS statement. |
Interaction: |
To include all ranges
and values of the user-defined formats in the output, even when the frequency
is zero, use COMPLETETYPES in the PROC statement. |
Featured in: |
Using Preloaded Formats with Class Variables |
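The interaction between PRELOADFMT and COMPLETETYPES might look like the following sketch. It assumes the SASHELP.CLASS sample data set, in which SEX takes the values 'F' and 'M'; the format $SEXFMT and its 'U' value are hypothetical.

```sas
/* Hypothetical format with one value ('U') that does not occur in the data */
proc format;
   value $sexfmt 'F' = 'Female'
                 'M' = 'Male'
                 'U' = 'Unknown';
run;

/* COMPLETETYPES in the PROC statement plus PRELOADFMT in the CLASS */
/* statement requests a row for every format value, even if N is 0. */
proc means data=sashelp.class completetypes n mean;
   class sex / preloadfmt;
   format sex $sexfmt.;
   var height;
run;
```

Replacing COMPLETETYPES with the EXCLUSIVE option in the CLASS statement would instead restrict the output to levels that are defined in the format.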
Using the BY statement is similar to using the CLASS statement
and the NWAY option in that PROC MEANS summarizes each BY group as an independent
subset of the input data. Therefore, no overall summarization of the input
data is available. However, unlike the CLASS statement, the BY statement requires
that you previously sort BY variables.
When you use the NWAY option, PROC MEANS may not have
enough memory to summarize all the class variables. In that case, you can
move some class variables to the BY statement. For maximum benefit, move the
class variables that are already sorted or that have the greatest
number of unique values.
You can use the CLASS and BY statements together to
analyze the data by the levels of class variables within BY groups. See Using the BY Statement with Class Variables.
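The following sketch shows one way to combine the two statements. It assumes the SASHELP.CLASS sample data set; the output data set name WORK.CLASS_SORTED is hypothetical.

```sas
/* BY variables must be sorted first; class variables need not be */
proc sort data=sashelp.class out=work.class_sorted;
   by sex;
run;

/* Each BY group (SEX) is summarized independently; */
/* AGE levels are analyzed within each BY group.    */
proc means data=work.class_sorted n mean;
   by sex;
   class age;
   var height;
run;
```

No summary across both values of SEX is produced, because each BY group is treated as an independent subset.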
By default, if an observation contains a missing value for any
class variable, PROC MEANS excludes that observation from the analysis. If
you specify the MISSING option in the PROC statement, the procedure considers
missing values as valid levels for the combination of class variables.
Specifying the MISSING option in the CLASS statement
allows you to control the acceptance of missing values for individual class
variables.
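A minimal sketch of per-variable control over missing values follows. The data set WORK.SCORES and its variables are hypothetical and exist only for this illustration.

```sas
/* Small illustrative data set; a period marks a missing value */
data work.scores;
   input team $ region $ score;
   datalines;
A East 10
A .    12
.  West 9
B West 14
;
run;

/* Two CLASS statements: missing REGION values form a valid level, */
/* but observations with a missing TEAM are still excluded.        */
proc means data=work.scores n mean;
   class team;
   class region / missing;
   var score;
run;
```

Specifying MISSING in the PROC statement instead would treat missing values as valid levels for both class variables.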
The total number of unique class values
that PROC MEANS allows depends
on the amount of computer memory that is available. See
Computational Resources for more information.
The GROUPINTERNAL option can improve computer performance
because the grouping process is based on the internal values of the class
variables. If a numeric class variable is not assigned a format and you do
not specify GROUPINTERNAL, PROC MEANS uses the default format to format numeric
values as character strings. Then PROC MEANS groups these numeric variables
by their character values, which takes additional time and computer memory.
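As a brief sketch, GROUPINTERNAL is requested in the CLASS statement. This example assumes the SASHELP.CLASS sample data set, where AGE is a numeric variable with a few discrete values and no assigned format.

```sas
/* Group on the internal numeric values of AGE instead of */
/* converting each value to a character string first.     */
proc means data=sashelp.class n mean;
   class age / groupinternal;
   var height;
run;
```

Any saving is likely to be noticeable only on much larger inputs than this sample data set.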
Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.