The MEANS Procedure
Missing Values
PROC MEANS tabulates the number of missing values. Before tabulating them, PROC MEANS excludes two kinds of observations: when you use the FREQ statement, observations with nonpositive frequencies; and when you use the WEIGHT statement, observations with missing weights and, if you specify the EXCLNPWGT option, observations with nonpositive weights. To report the number of missing values in the procedure output, use the NMISS statistical keyword in the PROC statement.
Column Width for the Output
The N Obs Statistic
In the output data set, the value of N Obs is stored in the _FREQ_ variable. Use the NONOBS option in the PROC statement to suppress this information in the displayed output.
Output Data Set
Note: By default the statistics in the output data set automatically
inherit the analysis variable's format and label. However, statistics computed
for N, NMISS, SUMWGT, USS, CSS, VAR, CV, T, PROBT, SKEWNESS, and KURTOSIS
do not inherit the analysis variable's format because this format may be invalid
for these statistics. Use the NOINHERIT option in the OUTPUT statement to
prevent the other statistics from inheriting the format and label attributes.
The output data set can contain these variables:
The value of _TYPE_ indicates which combination of the class variables PROC MEANS uses to compute the statistics. The character value of _TYPE_ is a series of zeros and ones, where each value of one indicates an active class variable in the type. For example, with three class variables, PROC MEANS represents type 1 as 001, type 5 as 101, and so on.
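The mapping from a type number to its string of zeros and ones is ordinary zero-padded binary notation. As a minimal illustration (in Python rather than SAS; the function name `type_to_bits` is hypothetical and not part of any SAS interface), the examples above can be reproduced like this:

```python
def type_to_bits(type_number: int, n_class_vars: int) -> str:
    """Return the zero-padded binary string for a _TYPE_ number.

    Hypothetical helper for illustration only; it mirrors the
    description in the text, not any SAS API.
    """
    if not 0 <= type_number < 2 ** n_class_vars:
        raise ValueError("type_number is out of range for this many class variables")
    # One binary digit per class variable, padded on the left with zeros.
    return format(type_number, f"0{n_class_vars}b")


# With three class variables there are 2**3 = 8 possible types.
for t in range(8):
    print(t, type_to_bits(t, 3))
```

Each digit equal to one marks an active class variable, so type 5 with three class variables (101) uses the first and third class variables.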
Usually, the output data set contains one observation per level per type. However, if you omit statistical keywords in the OUTPUT statement, the output data set contains five observations per level (six if you specify a WEIGHT variable). Therefore, the total number of observations in the output data set is equal to the sum of the levels for all the types you request multiplied by 1, 5, or 6, whichever is applicable.
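The counting rule in the previous paragraph can be sketched as a small calculation. This is an illustration in Python, not SAS; the function names and the level counts in the example are made up for the purpose of the sketch.

```python
def observations_per_level(has_stat_keywords: bool, has_weight: bool = False) -> int:
    """Multiplier described in the text: 1, 5, or 6 observations per level."""
    if has_stat_keywords:
        return 1                      # one observation per level per type
    return 6 if has_weight else 5     # default statistics, with or without WEIGHT


def total_output_observations(levels_per_type: dict, has_stat_keywords: bool,
                              has_weight: bool = False) -> int:
    """Sum the levels over all requested types, then apply the multiplier."""
    multiplier = observations_per_level(has_stat_keywords, has_weight)
    return sum(levels_per_type.values()) * multiplier


# Hypothetical level counts for the four types of two class variables.
levels = {0: 1, 1: 3, 2: 2, 3: 6}
print(total_output_observations(levels, has_stat_keywords=True))
print(total_output_observations(levels, has_stat_keywords=False))
print(total_output_observations(levels, has_stat_keywords=False, has_weight=True))
```

With 12 levels in total, the three calls correspond to the multipliers 1, 5, and 6.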
If you omit the CLASS statement (_TYPE_ = 0), there is always exactly one level of output per BY group. If you use a CLASS statement, then the number of levels for each type you request has an upper bound equal to the number of observations in the input data set. By default, PROC MEANS generates all possible types. In this case the total number of levels for each BY group has an upper bound equal to
$m \cdot (2^k - 1) \cdot n + 1$
where $k$ is the number of class variables, $n$ is the number of observations for the given BY group in the input data set, and $m$ is 1, 5, or 6.
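To get a feel for how quickly this bound grows, here is a rough sketch in Python (not SAS). It assumes the bound takes the form m * (2^k - 1) * n + 1, with k class variables, n observations in the BY group, and m equal to 1, 5, or 6; the function name is hypothetical.

```python
def level_upper_bound(k: int, n: int, m: int = 1) -> int:
    """Upper bound m * (2**k - 1) * n + 1, under the assumption stated above.

    Reasoning: k class variables give 2**k types. Type 0 always has exactly
    one level; each of the other 2**k - 1 types has at most n levels.
    """
    if m not in (1, 5, 6):
        raise ValueError("m must be 1, 5, or 6")
    return m * (2 ** k - 1) * n + 1


# The bound grows exponentially with the number of class variables.
for k in (1, 2, 3):
    print(k, level_upper_bound(k, n=10))
```

In practice the actual number of levels is usually far below this bound, because many observations share the same class values.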
PROC MEANS determines the actual number of levels for a given type from the number of unique combinations of values of the active class variables. A single level is composed of all input observations whose formatted class values match.
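The idea of counting levels as unique combinations can be sketched as follows. This is an illustration in Python, not how PROC MEANS is implemented. It assumes the class values are already formatted, that the leftmost digit of the _TYPE_ string corresponds to the first class variable, and that no class values are missing; the function name and the sample rows are hypothetical.

```python
def count_levels(rows, type_number: int) -> int:
    """Count the levels of one type as unique combinations of active class values.

    rows: list of tuples, one tuple of formatted class values per observation.
    """
    if not rows:
        return 0
    k = len(rows[0])
    # Assumption: the leftmost digit of the k-digit string maps to the
    # first class variable.
    bits = format(type_number, f"0{k}b")
    active = [i for i, bit in enumerate(bits) if bit == "1"]
    # Observations that agree on every active class value share one level.
    return len({tuple(row[i] for i in active) for row in rows})


# Hypothetical input with two class variables.
rows = [("A", "x"), ("A", "y"), ("B", "x"), ("A", "x")]
for t in range(4):
    print(t, count_levels(rows, t))
```

For type 0 no class variable is active, so every observation falls into the same single level, which matches the case with no CLASS statement.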
The Effect of Class Variables on the OUTPUT Data Set shows the values of _TYPE_ and the number of observations in the data set when you specify one, two, and three class variables.
The Effect of Class Variables on the OUTPUT Data Set
Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.