The MEANS Procedure
Using Class Variables
When you use a WAYS statement, PROC MEANS generates types that correspond to every possible unique combination of n class variables chosen from the complete set of class variables. For example,

   proc means;
      class a b c d e;
      ways 2 3;
   run;

is equivalent to

   proc means;
      class a b c d e;
      types a*b a*c a*d a*e b*c b*d b*e c*d c*e d*e
            a*b*c a*b*d a*b*e a*c*d a*c*e a*d*e
            b*c*d b*c*e b*d*e c*d*e;
   run;

If you omit the TYPES statement and the WAYS statement, PROC MEANS uses all class variables to subgroup the data (the NWAY type) for displayed output and computes all types (2^k, where k is the number of class variables) for the output data set.
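The equivalence between WAYS and an explicit TYPES list can be checked on a small case. The sketch below is illustrative only: it assumes the SASHELP.CARS sample data set is available, and the output data set names BY_WAYS and BY_TYPES are made up for the example.

```sas
/* Two-way types requested with WAYS */
proc means data=sashelp.cars noprint;
   class origin type drivetrain;
   var msrp;
   ways 2;
   output out=by_ways mean=avg_msrp;
run;

/* The same two-way types written out explicitly with TYPES */
proc means data=sashelp.cars noprint;
   class origin type drivetrain;
   var msrp;
   types origin*type origin*drivetrain type*drivetrain;
   output out=by_types mean=avg_msrp;
run;

/* PROC COMPARE should report no differences if the two requests are equivalent */
proc compare base=by_ways compare=by_types;
run;
```

With three class variables, WAYS 2 requests the three possible pairs, so the two output data sets are expected to contain the same _TYPE_ values and levels.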
   data pets;
      input Pet $ Gender $;
      datalines;
   dog m
   dog f
   dog f
   dog f
   cat m
   cat m
   cat f
   ;

   proc means data=pets order=freq;
      class pet gender;
   run;

In the output that these statements produce, PROC MEANS does not list male cats before female cats, even though male cats are the more frequent group among cats. Instead, it determines the order of gender for all types over the entire data set. PROC MEANS found more observations for female pets (f=4, m=3), so f is listed before m for every pet.
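One way to see the counts that drive this ordering is to tabulate each class variable on its own. The following is a minimal sketch that re-creates the PETS data set from the example and uses PROC FREQ (not part of the original example) to show the one-way frequencies.

```sas
data pets;
   input Pet $ Gender $;
   datalines;
dog m
dog f
dog f
dog f
cat m
cat m
cat f
;

/* One-way frequencies, most frequent value first */
proc freq data=pets order=freq;
   tables pet gender / nocum nopercent;
run;
```

The Gender table shows f with 4 observations and m with 3, which is the order that ORDER=FREQ applies to Gender within every level of Pet.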
Computational Resources
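When class variables are involved, PROC MEANS must keep a copy of each unique value of each class variable in memory. You can estimate the memory required to group the class variables by calculating

$$N_{c_1}(L_{c_1} + K) + N_{c_2}(L_{c_2} + K) + \cdots + N_{c_n}(L_{c_n} + K)$$

where
$N_{c_i}$ is the number of unique values for the class variable $c_i$
$L_{c_i}$ is the combined unformatted and formatted length of $c_i$
$K$ is some constant on the order of 32 bytes (64 for 64-bit architectures).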
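As a rough worked example, the estimate can be computed in a DATA step. The sketch assumes the estimate is the sum, over the class variables, of the number of unique values times the combined length plus the constant. All of the input numbers, and the names NUNIQ and VLEN, are hypothetical.

```sas
data _null_;
   /* Hypothetical inputs for three class variables */
   array nuniq[3] _temporary_ (50 12 2);   /* unique values per class variable */
   array vlen[3]  _temporary_ (16 8 2);    /* combined unformatted + formatted length, in bytes */
   k = 32;                                 /* assumed per-value constant; about 64 on 64-bit hosts */

   total = 0;
   do i = 1 to dim(nuniq);
      total = total + nuniq[i] * (vlen[i] + k);
   end;

   put 'Estimated bytes needed to hold class values: ' total;
run;
```

Because the constant is only described as being "on the order of" 32 bytes, treat the result as an order-of-magnitude figure.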
Each unique combination of class variable values for a given type forms a level in that type (see TYPES Statement). You can estimate the maximum potential space requirements for all levels of a given type, when all combinations actually exist in the data (a complete type), by calculating

$$W \times N_{c_1} \times N_{c_2} \times \cdots \times N_{c_n}$$

where
$W$ is a constant based on the number of variables analyzed and the number of statistics calculated (unless you request QMETHOD=OS to compute the quantiles)
$N_{c_1}, \ldots, N_{c_n}$ are the number of unique levels for the active class variables of the given type.
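The gap between a complete type and the levels that actually occur can be measured directly. The sketch below assumes that the estimate is a per-level constant multiplied by the product of the unique-level counts. It uses the SASHELP.CARS sample data set, and the value of the macro variable W is a placeholder, not a documented figure.

```sas
%let w = 200;   /* hypothetical per-level cost in bytes; the real value depends on the VAR list and statistics */

/* Upper bound: every combination of the three class variables exists */
proc sql noprint;
   select count(distinct origin) * count(distinct type) * count(distinct drivetrain)
      into :max_levels trimmed
      from sashelp.cars;
quit;

/* Levels that actually occur in the three-way (NWAY) type */
proc means data=sashelp.cars noprint nway;
   class origin type drivetrain;
   var msrp;
   output out=levels n=n_obs;
run;

proc sql noprint;
   select count(*) into :actual_levels trimmed from levels;
quit;

%put Complete type: &max_levels levels, about %sysevalf(&w * &max_levels) bytes;
%put Levels present in the data: &actual_levels;
```

When many combinations never occur in the data, the actual requirement can be well below the complete-type bound.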
If PROC MEANS must write partially complete primary types to disk while it processes input data, then one or more merge passes may be required to combine type levels in memory with those on disk. In addition, if you use an order other than DATA for any class variable, PROC MEANS groups the completed type on disk. For this reason, the peak disk space requirements can be more than twice the memory requirements for a given type.
When PROC MEANS uses a temporary work file, you will receive the following note in the SAS log:

   Processing on disk occurred during summarization.
   Peak disk usage was approximately nnn Mbytes.
   Adjusting SUMSIZE may improve performance.

In most cases processing ends normally.
When you specify class variables in a CLASS statement, the amount of data-dependent memory that PROC MEANS uses before it writes to a utility file is controlled by SUMSIZE=, which is available both as a SAS system option and as a PROC MEANS option. Like the system option SORTSIZE=, SUMSIZE= sets the memory threshold at which disk-based operations begin. For best results, set SUMSIZE= to less than the amount of real memory that is likely to be available for the task. For efficiency reasons, PROC MEANS may internally round up the value of SUMSIZE=. SUMSIZE= has no effect unless you specify class variables.
If PROC MEANS reports that there is insufficient memory, increase SUMSIZE=. A SUMSIZE= value greater than MEMSIZE= will have no effect. Therefore, you may also need to increase MEMSIZE=. If PROC MEANS reports insufficient disk space, increase the WORK space allocation. See the SAS documentation for your operating environment for more information on how to adjust your computation resource parameters.
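The two ways of setting SUMSIZE= look like this. It is a minimal sketch: the memory values are arbitrary examples and SASHELP.CARS is an assumed sample data set.

```sas
/* Show the current limits before changing anything */
proc options option=sumsize;
run;
proc options option=memsize;
run;

/* Session-wide setting through the system option */
options sumsize=128M;

/* Setting for a single step through the PROC MEANS option */
proc means data=sashelp.cars sumsize=256M;
   class origin type;
   var msrp;
run;
```

MEMSIZE= cannot be changed with an OPTIONS statement. It is set when SAS starts, for example on the command line or in the configuration file, so raising it means restarting the session.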
Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.