Chapter Contents |
Previous |
Next |
The TABULATE Procedure |
Missing Values |
If . . . | PROC TABULATE, by default, . . . | To override the default . . . |
---|---|---|
an observation contains a missing value for an analysis variable | excludes that observation from the calculation of statistics (except N and NMISS) for that particular variable | no alternative |
an observation contains a missing value for a class variable | excludes that observation from the table (table note 1) | use MISSING in the PROC TABULATE statement, or MISSING in the CLASS statement |
there are no data for a category | does not show the category in the table | use PRINTMISS in the TABLE statement, or use CLASSDATA= in the PROC TABULATE statement |
every observation that contributes to a table cell contains a missing value for an analysis variable | displays a missing value for any statistics (except N and NMISS) in that cell | use MISSTEXT= in the TABLE statement |
there are no data for a formatted value | does not display that formatted value in the table | use PRELOADFMT in the CLASS statement with PRINTMISS in the TABLE statement, or use CLASSDATA= in the PROC TABULATE statement, or add dummy observations to the input data set so that it contains data for each formatted value |
a FREQ variable value is missing or is less than 1 | does not use that observation to calculate statistics | no alternative |
a WEIGHT variable value is missing or 0 | uses a value of 0 | no alternative |
TABLE NOTE 1: The CLASS statement applies to all TABLE statements
in a PROC TABULATE step. Therefore, if you define a variable as a class variable,
PROC TABULATE omits observations that have missing values for that variable
even if you do not use the variable in a TABLE statement.
This section presents a series of PROC TABULATE steps that illustrate how PROC TABULATE treats missing values. The following program creates the data set and formats that are used in this section and prints the data set. The data set COMPREV contains no missing values (see The Data Set COMPREV ).
proc format; value cntryfmt 1='United States' 2='Japan'; value compfmt 1='Supercomputer' 2='Mainframe' 3='Midrange' 4='Workstation' 5='Personal Computer' 6='Laptop'; run;
data comprev; input Country Computer Rev90 Rev91 Rev92; datalines; 1 1 788.8 877.6 944.9 1 2 12538.1 9855.6 8527.9 1 3 9815.8 6340.3 8680.3 1 4 3147.2 3474.1 3722.4 1 5 18660.9 18428.0 23531.1 2 1 469.9 495.6 448.4 2 2 5697.6 6242.4 5382.3 2 3 5392.1 5668.3 4845.9 2 4 1511.6 1875.5 1924.5 2 5 4746.0 4600.8 4363.7 ;
proc print data=comprev noobs; format country cntryfmt. computer compfmt.; title 'The Data Set COMPREV'; run;
proc tabulate data=comprev; class country computer; var rev90 rev91 rev92; table computer*country,rev90 rev91 rev92 / rts=32; format country cntryfmt. computer compfmt.; title 'Revenues from Computer Sales'; title2 'for 1990 to 1992'; run;Computer Sales Data: No Missing Values
data compmiss; set comprev; if _n_=8 then computer=.; run;
proc tabulate data=compmiss; class country computer; var rev90 rev91 rev92; table computer*country,rev90 rev91 rev92 / rts=32; format country cntryfmt. computer compfmt.; title 'Revenues from Computer Sales'; title2 'for 1990 to 1992'; run;Computer Sales Data: Midrange, Japan, Deleted
proc tabulate data=compmiss missing; class country computer; var rev90 rev91 rev92; table computer*country,rev90 rev91 rev92 / rts=32; format country cntryfmt. computer compfmt.; title 'Revenues from Computer Sales'; title2 'for 1990 to 1992'; run;Computer Sales Data: Missing Value for COMP
proc format; value misscomp 1='Supercomputer' 2='Mainframe' 3='Midrange' 4='Workstation' 5='Personal Computer' 6='Laptop' .='No type given'; run;
proc tabulate data=compmiss missing; class country computer; var rev90 rev91 rev92; table computer*country,rev90 rev91 rev92 / rts=32; format country cntryfmt. computer misscomp.; title 'Revenues for Computer Sales'; title2 'for 1990 to 1992'; run;Computer Sales Data: Text Supplied for Missing COMP Value
No type given
and for
United States
or for
Midrange
and
for
Japan
because there are no data in these categories. If you want the
table to represent all possible categories, use the PRINTMISS option in the
TABLE statement, as shown in the following program (see Computer Sales Data: Missing Statistics Values ):
proc tabulate data=compmiss missing; class country computer; var rev90 rev91 rev92; table computer*country,rev90 rev91 rev92 / rts=32 printmiss; format country cntryfmt. computer misscomp.; title 'Revenues for Computer Sales'; title2 'for 1990 to 1992'; run;Computer Sales Data: Missing Statistics Values
proc tabulate data=compmiss missing; class country computer; var rev90 rev91 rev92; table computer*country,rev90 rev91 rev92 / rts=32 printmiss misstext='NO DATA!'; format country cntryfmt. computer misscomp.; title 'Revenues for Computer Sales'; title2 'for 1990 to 1992'; run;Computer Sales Data: Text Supplied for Missing Statistics Values
If you want to include headings for all possible values of COMP (perhaps to make it easier to compare the output with tables that are created later when you do have data for laptops), you have three different ways to create such a table:
The following program adds the PRELOADFMT option to a CLASS statement that contains the relevant variable.
The results are shown in Computer Sales Data: All Possible COMP Valued Included .
proc tabulate data=compmiss missing; class country; class computer / preloadfmt; var rev90 rev91 rev92; table computer*country,rev90 rev91 rev92 / rts=32 printmiss misstext='NO DATA!'; format country cntryfmt. computer compfmt.; title 'Revenues for Computer Sales'; title2 'for 1990 to 1992'; run;Computer Sales Data: All Possible COMP Valued Included
Understanding the Order of Headings with ORDER=DATA |
For this technique to work, the first value of the first class variable must occur in the data with all possible values of all the other class variables. If this criterion is not met, the order of the headings may surprise you.
The following program creates a simple data set in which
the observations are ordered first by the values of Animal, then by the values
of Food. The ORDER= option in the PROC TABULATE statement orders the heading
for the class variables by the order of their appearance in the data set (see Ordering the Headings of Class Variables ).
Although
bones
is the first value for Food in the group of observations where
Animal=
dog
, all other values for Food appear before
bones
in the
data set because
bones
never appears when Animal=
cat
. Therefore,
the header for
bones
in the table in Ordering the Headings of Class Variables is not in alphabetic order.
In other words, PROC TABULATE maintains for subsequent categories the order that was established by earlier categories. If you want to reestablish the order of Food for each value of Animal, use BY-group processing. PROC TABULATE creates a separate table for each BY group, so that the ordering can differ from one BY group to the next.
data foodpref; input Animal $ Food $; datalines; cat fish cat meat cat milk dog bones dog fish dog meat ; proc tabulate data=foodpref format=9. order=data; class animal food; table animal*food; run;
Ordering the Headings of Class Variables
Chapter Contents |
Previous |
Next |
Top of Page |
Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.