The TABULATE Procedure
Statistics Available in PROC TABULATE
Descriptive statistic keywords | |
COLPCTN | PCTSUM |
COLPCTSUM | RANGE |
CSS | REPPCTN |
CV | REPPCTSUM |
MAX | ROWPCTN |
MEAN | ROWPCTSUM |
MIN | STDDEV|STD |
N | STDERR |
NMISS | SUM |
PAGEPCTN | SUMWGT |
PAGEPCTSUM | USS |
PCTN | VAR |
Quantile statistic keywords | |
MEDIAN|P50 | Q3|P75 |
P1 | P90 |
P5 | P95 |
P10 | P99 |
Q1|P25 | QRANGE |
Hypothesis testing keyword | |
PROBT | T |
The keywords, the formulas that are used to calculate them, and the data requirements are explained in Keywords and Formulas.
Formatting Class Variables
User-defined formats are particularly useful for grouping values into fewer categories. For example, if you have a class variable, Age, with values ranging from 1 to 99, you could create a user-defined format that groups the ages so that your tables contain a manageable number of categories. The following PROC FORMAT step creates a format that condenses all possible values of age into six groups of values.
    proc format;
       value agefmt  0-29='Under 30'
                    30-39='30-39'
                    40-49='40-49'
                    50-59='50-59'
                    60-69='60-69'
                    other='70 or over';
    run;
For information on creating user-defined formats, see The FORMAT Procedure.
By default, PROC TABULATE includes in a table only those formats for which the frequency count is not zero and for which values are not missing. To include missing values for all class variables in the output, use the MISSING option in the PROC TABULATE statement, and to include missing values for selected class variables, use the MISSING option in a CLASS statement. To include formats for which the frequency count is zero, use the PRELOADFMT option in a CLASS statement and the PRINTMISS option in the TABLE statement, or use the CLASSDATA= option in the PROC TABULATE statement.
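As a minimal sketch of the zero-frequency case described above, the following step preloads a format and prints every level, including levels that no observation falls into. The format name AGEGRP, the data set WORK.PEOPLE, and its values are hypothetical and chosen only for illustration. The sketch uses explicit ranges rather than OTHER so that every label is an explicitly defined level.

```sas
/* Hypothetical format with explicit ranges for each age group. */
proc format;
   value agegrp low-29  = 'Under 30'
                30-39   = '30-39'
                40-49   = '40-49'
                50-59   = '50-59'
                60-69   = '60-69'
                70-high = '70 or over';
run;

/* Hypothetical data: no observation falls in 50-59 or 60-69. */
data work.people;
   input age @@;
   datalines;
25 34 34 47 71 28 45
;
run;

proc tabulate data=work.people;
   /* PRELOADFMT loads all format labels as class levels.     */
   /* It has an effect only together with PRINTMISS,          */
   /* ORDER=DATA, or EXCLUSIVE.                               */
   class age / preloadfmt;
   format age agegrp.;
   /* PRINTMISS prints levels that have no observations.      */
   /* MISSTEXT= supplies the text for cells that would        */
   /* otherwise print as missing.                             */
   table age, n / printmiss misstext='0';
run;
```

Adding the MISSING option to the CLASS statement would additionally keep observations whose Age is missing.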
Formatting Values in Tables
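PROC TABULATE determines the format to use for a particular cell based on the following order of precedence for formats, from lowest to highest:
1. If no other formats are specified, PROC TABULATE uses the default format (12.2).
2. The FORMAT= option in the PROC TABULATE statement changes the default format. If no format modifiers affect a cell, PROC TABULATE uses this format for the value in that cell.
3. A format modifier in the page dimension applies to the values in all the table cells on the page unless you specify a modifier for a cell in the row or column dimension.
4. A format modifier in the row dimension applies to the values in all the table cells in the row unless you specify a format modifier for a cell in the column dimension.
5. A format modifier in the column dimension applies to the values in all the table cells in the column.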
For more information about formatting table cells, see "Formatting Values in Table Cells" in Chapter 5, "Controlling the Table's Appearance" in SAS Guide to TABULATE Processing.
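The sketch below shows how a format modifier on one statistic interacts with the FORMAT= option in the PROC TABULATE statement. The data set WORK.ENERGY_DEMO and its values are hypothetical stand-ins for the ENERGY data used later in this section.

```sas
/* Hypothetical stand-in data: Division, Type, Expenditures. */
data work.energy_demo;
   input division type expenditures;
   datalines;
1 1 1200
1 2 850
2 1 640
2 2 975
;
run;

/* FORMAT=8.2 becomes the default format for every cell. */
proc tabulate data=work.energy_demo format=8.2;
   class division type;
   var expenditures;
   table division,
         type*expenditures*
            (sum*f=dollar10.2   /* modifier overrides FORMAT= for SUM cells */
             mean);             /* no modifier: MEAN cells use 8.2          */
run;
```

The SUM cells should print with DOLLAR10.2, and the MEAN cells should fall back to the 8.2 default.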
How Using BY-group Processing Differs from Using the Page Dimension
The following table, Contrasting the BY Statement and the Page Dimension, compares the two methods.
Issue | PROC TABULATE with a BY statement | PROC TABULATE with a page dimension in the TABLE statement |
---|---|---|
Order of observations in the input data set | The observations in the input data set must be sorted by the BY variables. (See table note 1.) | Sorting is unnecessary. |
One report summarizing all BY groups | You cannot create one report for all the BY groups. | Use ALL in the page dimension to create a report for all classes. (See Summarizing Information with the Universal Class Variable ALL.) |
Percentages | The percentages in the tables are percentages of the total for that BY group. You cannot calculate percentages for a BY group compared to the totals for all BY groups because PROC TABULATE prepares the individual reports separately. Data for the report for one BY group are not available to the report for another BY group. | You can use denominator definitions to control the meaning of PCTN (see Calculating Percentages). |
Titles | You can use the #BYVAL, #BYVAR, and #BYLINE specifications in TITLE statements to customize the titles for each BY group (see Creating Titles That Contain BY-Group Information). | The BOX= option in the TABLE statement customizes the page headers, but you must use the same title on each page. |
Ordering class variables | ORDER=DATA and ORDER=FREQ order each BY group independently. | The order of class variables is the same on every page. |
Obtaining uniform headings | You may need to insert dummy observations into BY groups that do not have all classes represented. | The PRINTMISS option ensures that each page of the table has uniform headings. |
Multiple ranges with the same format | PROC TABULATE produces a table for each range. | PROC TABULATE combines observations from the two ranges. |
TABLE NOTE 1: You can use the BY statement without sorting the data set if the data set has an index for the BY variable.
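The two approaches contrasted above can be sketched side by side. The data set WORK.ENERGY_DEMO, its values, and the title text are hypothetical and serve only to make the example self-contained.

```sas
/* Hypothetical stand-in data: Division, Type, Expenditures. */
data work.energy_demo;
   input division type expenditures;
   datalines;
2 1 640
1 1 1200
2 2 975
1 2 850
;
run;

/* BY-group processing: the data must be sorted (or indexed) */
/* by the BY variable, and each Division gets its own table. */
proc sort data=work.energy_demo out=work.energy_sorted;
   by division;
run;

options nobyline;   /* suppress the default BY line */
title 'Customers in Division #byval(division)';
proc tabulate data=work.energy_sorted;
   by division;
   class type;
   table type, n;
run;
options byline;

/* Page dimension: no sort is needed, ALL adds a summary page, */
/* and BOX=_PAGE_ writes the page-dimension text in the box.   */
title 'Customers by Division';
proc tabulate data=work.energy_demo;
   class division type;
   table division all, type, n / box=_page_;
run;
title;
```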
Calculating Percentages
REPPCTN and REPPCTSUM statistics: print the percentage of the value in a single table cell in relation to the total of the values in the report.
COLPCTN and COLPCTSUM statistics: print the percentage of the value in a single table cell in relation to the total of the values in the column.
ROWPCTN and ROWPCTSUM statistics: print the percentage of the value in a single table cell in relation to the total of the values in the row.
PAGEPCTN and PAGEPCTSUM statistics: print the percentage of the value in a single table cell in relation to the total of the values in the page.
These statistics calculate the most commonly used percentages. See Calculating Various Percentage Statistics for an example.
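As a minimal sketch of these built-in percentage statistics, the following step requests the frequency count together with row, column, and report percentages, with no denominator definitions. The data set WORK.ENERGY_DEMO and its values are hypothetical.

```sas
/* Hypothetical stand-in data: one observation per customer. */
data work.energy_demo;
   input division type @@;
   datalines;
1 1  1 1  1 2  2 1  2 2  2 2  2 2
;
run;

proc tabulate data=work.energy_demo format=8.2;
   class division type;
   /* One row per Division, and for each Type the count plus */
   /* its share of the row, the column, and the whole report. */
   table division,
         type*(n*f=8.
               rowpctn='% of row'
               colpctn='% of column'
               reppctn='% of report');
run;
```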
PCTN and PCTSUM statistics can be used to calculate these same percentages. They allow you to manually define denominators. PCTN and PCTSUM statistics print the percentage of the value in a single table cell in relation to the value (used in the denominator of the calculation of the percentage) in another table cell or to the total of the values in a group of cells. By default, PROC TABULATE summarizes the values in all N cells (for PCTN) or all SUM cells (for PCTSUM) and uses the summarized value for the denominator. You can control the value that PROC TABULATE uses for the denominator with a denominator definition.
You place a denominator definition in angle brackets (< and >) next to the PCTN or PCTSUM statistic. The denominator definition specifies which categories to sum for the denominator.
This section illustrates how to specify denominator definitions in a simple table. Using Denominator Definitions to Display Basic Frequency Counts and Percentages illustrates how to specify denominator definitions in a table that is composed of multiple subtables. For more examples of denominator definitions, see "How Percentages Are Calculated" in Chapter 3, "Details of TABULATE Processing" in SAS Guide to TABULATE Processing.
    proc tabulate data=energy;
       class division type;
       table division*
               (n='Number of customers'
                pctn<type>='% of row'           /* [1] */
                pctn<division>='% of column'    /* [2] */
                pctn='% of all customers'),     /* [3] */
             type/rts=50;
       title 'Number of Users in Each Division';
    run;
The TABLE statement creates a row for each value of Division and a column for each value of Type. Within each row, the TABLE statement nests four statistics: N and three different calculations of PCTN (see Three Different Uses of the PCTN Statistic with Frequency Counts Highlighted). Each occurrence of PCTN uses a different denominator definition.
Three Different Uses of the PCTN Statistic with Frequency Counts Highlighted
[1] <type> sums the frequency counts for all occurrences of Type within the same value of Division. Thus, for Division=1, the denominator is 6 + 6, or 12.
[2] <division> sums the frequency counts for all occurrences of Division within the same value of Type. Thus, for Type=1, the denominator is 6 + 3 + 8 + 5, or 22.
[3] With no denominator definition, PCTN uses the default denominator: the sum of the frequency counts in all N cells of the table.
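The arithmetic behind the first two denominators can be checked by hand. The DATA step below is only a sketch of that calculation for the cell Division=1, Type=1, using the frequency counts quoted above. The variable names are hypothetical, and the step does not call PROC TABULATE.

```sas
data _null_;
   cell_n          = 6;               /* N for Division=1, Type=1           */
   row_denominator = 6 + 6;           /* <type>: all Types in Division=1    */
   col_denominator = 6 + 3 + 8 + 5;   /* <division>: all Divisions, Type=1  */

   pct_of_row    = 100 * cell_n / row_denominator;
   pct_of_column = 100 * cell_n / col_denominator;

   /* Write both percentages to the SAS log. */
   put pct_of_row= 8.2 pct_of_column= 8.2;
run;
```

The cell is 50 percent of its row (6 of 12) and about 27.27 percent of its column (6 of 22).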
    proc tabulate data=energy format=8.2;
       class division type;
       var expenditures;
       table division*
               (sum='Expenditures'*f=dollar10.2
                pctsum<type>='% of row'           /* [1] */
                pctsum<division>='% of column'    /* [2] */
                pctsum='% of all customers'),     /* [3] */
             type*expenditures/rts=40;
       title 'Expenditures in Each Division';
    run;
The TABLE statement creates a row for each value of Division and a column for each value of Type. Because Type is crossed with Expenditures, the value in each cell is the sum of the values of Expenditures for all observations that contribute to the cell. Within each row, the TABLE statement nests four statistics: SUM and three different calculations of PCTSUM (see Three Different Uses of the PCTSUM Statistic with Sums Highlighted). Each occurrence of PCTSUM uses a different denominator definition.
Three Different Uses of the PCTSUM Statistic with Sums Highlighted
[1] <type> sums the values of Expenditures for all occurrences of Type within the same value of Division. Thus, for Division=1, the denominator is $7,477 + $5,129, or $12,606.
[2] <division> sums the values of Expenditures for all occurrences of Division within the same value of Type. Thus, for Type=1, the denominator is $7,477 + $19,379 + $5,476 + $13,959, or $46,291.
[3] With no denominator definition, PCTSUM uses the default denominator: the sum of the values in all SUM cells of the table.
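In a table of this shape, the built-in ROWPCTSUM and COLPCTSUM statistics are expected to agree with the <type> and <division> denominator definitions, because Type is the only class variable in the column dimension and Division is the only one in the row dimension. The sketch below places both forms next to each other so that they can be compared. The data set WORK.ENERGY_DEMO and its values are hypothetical, not the ENERGY data.

```sas
/* Hypothetical stand-in data: Division, Type, Expenditures. */
data work.energy_demo;
   input division type expenditures;
   datalines;
1 1 1200
1 2 850
2 1 640
2 2 975
;
run;

proc tabulate data=work.energy_demo format=8.2;
   class division type;
   var expenditures;
   table division,
         type*expenditures*
            (sum*f=dollar10.2
             pctsum<type>='PCTSUM<type>'           /* explicit row denominator    */
             rowpctsum='ROWPCTSUM'                 /* built-in row percentage     */
             pctsum<division>='PCTSUM<division>'   /* explicit column denominator */
             colpctsum='COLPCTSUM');               /* built-in column percentage  */
run;
```

With these made-up values, the row denominator for Division=1 is 1200 + 850 = 2050, and the column denominator for Type=1 is 1200 + 640 = 1840.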
Using Style Elements in PROC TABULATE
Region | Style |
---|---|
column headings | Header |
continuation message | Aftercaption |
box | Header |
page dimension text | Beforecaption |
row headings | Rowheader |
data cells | Data |
table | Table |
Detailed information on STYLE= is provided in the documentation for individual statements.
To set the style element for | Use STYLE in this statement |
---|---|
data cells | PROC TABULATE |
page dimension text, continuation messages, and class variable name headings | CLASS |
class level value headings | CLASSLEV |
keyword headings | KEYWORD |
the entire table | TABLE |
analysis variable name headings | VAR |
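As a sketch of where each STYLE= specification goes, the following step sets a style attribute in every statement listed in the table above. The data set WORK.ENERGY_DEMO, the output file name, and the particular attribute values are hypothetical. The attribute names used here (BACKGROUND=, FOREGROUND=, FONT_WEIGHT=, FONT_STYLE=, BORDERWIDTH=) are standard ODS style attributes, but the choice of attributes is only illustrative.

```sas
/* Hypothetical stand-in data: Division, Type, Expenditures. */
data work.energy_demo;
   input division type expenditures;
   datalines;
1 1 1200
1 2 850
2 1 640
2 2 975
;
run;

ods html body='energy_demo.html';   /* hypothetical output file */

proc tabulate data=work.energy_demo
              style=[font_weight=bold];                /* data cells                     */
   class division type / style=[background=yellow];    /* class variable name headings   */
   classlev division type / style=[font_style=italic]; /* class level value headings     */
   var expenditures / style=[foreground=blue];         /* analysis variable name heading */
   keyword sum / style=[font_style=italic];            /* keyword heading                */
   table division,
         type*expenditures*sum
         / style=[borderwidth=2];                      /* the table itself               */
run;

ods html close;
```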
Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.