The UNIVARIATE Procedure |
Interaction: | When you use the HISTOGRAM, PROBPLOT, or QQPLOT statement, PROC UNIVARIATE creates comparative histograms, comparative probability plots, or comparative quantile-quantile plots. |
Featured in: | Creating a Two-Way Comparative Histogram |
CLASS
variable-1<(variable-option(s))>
<variable-2<(variable-option(s))>>
</ KEYLEVEL='value1'|('value1' 'value2')>; |
Required Arguments |
Class variables can be numeric or character. Class variables can have continuous values, but they typically have a few discrete values that define levels of the variable. You do not have to sort the data by class variables. PROC UNIVARIATE uses the formatted values of the class variables to determine the classification levels.
You can use the HISTOGRAM, PROBPLOT, or QQPLOT statement with the CLASS statement to create one-way and two-way comparative plots. When you use one class variable, PROC UNIVARIATE displays an array of component plots (stacked or side-by-side), one for each level of the classification variable. When you use two class variables, PROC UNIVARIATE displays a matrix of component plots, one for each combination of levels of the classification variables. The observations in a given level are referred to collectively as a cell.
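As a minimal sketch of the two layouts described above, the following steps request a one-way and then a two-way comparative histogram. The data set WORK.SCORES and the variables Gender, Term, and Score are hypothetical and are used only for illustration.

```sas
/* Hypothetical data: two character class variables and one analysis variable */
data work.scores;
   input Gender $ Term $ Score @@;
   datalines;
F Fall 72  F Fall 80  F Spring 85  F Spring 91
M Fall 68  M Fall 77  M Spring 82  M Spring 88
;
run;

/* One-way comparative histogram: one component plot per level of Gender */
proc univariate data=work.scores;
   class Gender;
   histogram Score;
run;

/* Two-way comparative histogram: rows are levels of Gender, */
/* columns are levels of Term, and each combination is a cell */
proc univariate data=work.scores;
   class Gender Term;
   histogram Score;
run;
```

The input data is not sorted by the class variables in either step, which is consistent with the statement that sorting is not required.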
Restriction: | The length of a character class variable cannot exceed 16 characters. |
Interaction: | When you create a one-way comparative plot, the observations in the input data set are sorted by the formatted values (levels) of the variable. PROC UNIVARIATE creates a separate plot for the analysis variable values in each level, and arranges these component plots in an array to form the comparative plot with uniform horizontal and vertical axes.
When you create a two-way comparative plot, the observations in the input data set are cross-classified according to the values (levels) of these variables. PROC UNIVARIATE creates a separate plot for the analysis variable values in each cell of the cross-classification and arranges these component plots in a matrix to form the comparative plot with uniform horizontal and vertical axes. The levels of variable-1 are the labels for the rows of the matrix, and the levels of variable-2 are the labels for the columns of the matrix. |
Interaction: | If you associate a label with a variable, PROC UNIVARIATE displays the variable label in the comparative plot, parallel to the column (or row) labels. |
Tip: | Use the MISSING option to treat missing values as valid levels. |
Tip: | To reduce the number of classification levels, use a FORMAT statement to combine variable values. |
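The tip about combining values with a FORMAT statement might look like the sketch below. The format name AGEGRP, the data set WORK.PATIENTS, and the variables Age and Weight are assumptions made for this example. Because the classification levels come from formatted values, the three formatted labels become the three levels.

```sas
/* Hypothetical format that collapses a numeric variable into three levels. */
/* Each label is shorter than 16 characters.                                */
proc format;
   value agegrp low-<18 = 'Under 18'
                18-<65  = '18 to 64'
                65-high = '65 and over';
run;

/* Hypothetical data */
data work.patients;
   input Age Weight @@;
   datalines;
12 40  16 55  25 70  40 82  58 77  67 74  80 68
;
run;

/* The FORMAT statement makes PROC UNIVARIATE classify by the formatted */
/* values, so the plot has three component histograms instead of seven. */
proc univariate data=work.patients;
   class Age;
   format Age agegrp.;
   histogram Weight;
run;
```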
Options |
If you specify only one class variable and use a HISTOGRAM statement, KEYLEVEL='value' identifies the key cell as the level for which variable is equal to value. By default, PROC UNIVARIATE sorts the levels in the order that is determined by the ORDER= option. Then, the key cell is the first occurrence of a level in this order. The cells display in order from top to bottom or left to right. Consequently, the key cell appears at the top (or left). When you specify a different key cell with the KEYLEVEL= option, this cell appears at the top (or left).
With the PROBPLOT and QQPLOT statements, the key cell determines the uniform axis scaling.
If you specify two class variables, use KEYLEVEL=('value1' 'value2') to identify the key cell as the level for which variable-n is equal to value-n. By default, PROC UNIVARIATE sorts the levels of the first variable in the order that is determined by its ORDER= option and, within each of these levels, it sorts the levels of the second variable in the order that is determined by its ORDER= option. Then, the default key cell is the first occurrence of a combination of levels for the two variables in this order. The cells display in the order of variable-1 from top to bottom and in the order of variable-2 from left to right. Consequently, the default key cell appears at the upper left corner. When you specify a different key cell with the KEYLEVEL= option, this cell appears at the upper left corner.
Restriction: | The length of the KEYLEVEL= value cannot exceed 16 characters. You must specify a formatted value. |
Requirement: | This option is ignored unless you specify a HISTOGRAM, PROBPLOT, or QQPLOT statement. |
See also: | the ORDER= option |
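A short sketch of both forms of KEYLEVEL= follows. It assumes a hypothetical data set WORK.SCORES with character class variables Gender and Term and an analysis variable Score. None of these names come from the original text.

```sas
/* Hypothetical data */
data work.scores;
   input Gender $ Term $ Score @@;
   datalines;
F Fall 72  F Fall 80  F Spring 85  F Spring 91
M Fall 68  M Fall 77  M Spring 82  M Spring 88
;
run;

/* One class variable: the cell for Gender='M' becomes the key cell */
/* and appears at the top (or left) of the comparative histogram.   */
proc univariate data=work.scores;
   class Gender / keylevel='M';
   histogram Score;
run;

/* Two class variables: the key cell is Gender='M' and Term='Spring' */
/* and appears at the upper left corner.                             */
proc univariate data=work.scores;
   class Gender Term / keylevel=('M' 'Spring');
   histogram Score;
run;
```

The quoted values are formatted values of the class variables, as the restriction above requires.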
Default: | If you omit MISSING, PROC UNIVARIATE excludes the observations with a missing class variable value from the analysis. |
Requirement: | Enclose this option in parentheses after the class variable. |
See also: | SAS Language Reference: Concepts for a discussion of missing values that have special meaning. |
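The MISSING option can be sketched as follows, assuming a hypothetical data set WORK.VISITS in which some observations have a missing value for the class variable Region. The option is enclosed in parentheses after the class variable, as required.

```sas
/* Hypothetical data: in list input, a period is read as a missing value, */
/* so the last two observations have a missing Region.                    */
data work.visits;
   input Region $ Minutes @@;
   datalines;
East 12  East 15  West 9  West 11  . 14  . 10
;
run;

/* Without MISSING, the two observations with a missing Region are excluded. */
/* With MISSING, the missing value is treated as a valid level.              */
proc univariate data=work.visits;
   class Region (missing);
   histogram Minutes;
run;
```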
Interaction: | When you use a HISTOGRAM, PROBPLOT, or QQPLOT statement and ORDER=DATA, PROC UNIVARIATE displays the rows (columns) of the comparative plot from top to bottom (left to right) in the order that the class variable values first appear in the input data set. |
Interaction: | When you use a HISTOGRAM, PROBPLOT, or QQPLOT statement and ORDER=FORMATTED, PROC UNIVARIATE displays the rows (columns) of the comparative plot from top to bottom (left to right) in increasing order of the formatted class variable values. For example, a numeric class variable DAY (with values 1, 2, and 3) has a user-defined format that assigns Wednesday to the value 1, Thursday to the value 2, and Friday to the value 3. The rows of the comparative plot will appear in alphabetical order (Friday, Thursday, Wednesday) from top to bottom. |
Interaction: | When you use a HISTOGRAM, PROBPLOT, or QQPLOT statement and ORDER=FREQ, PROC UNIVARIATE displays the rows (columns) of the comparative plot from top to bottom (left to right) in order of decreasing frequency count for the class variable values. |
If two or more distinct internal values have the same formatted value, PROC UNIVARIATE determines the order by the internal value that occurs first in the input data set.
Interaction: | When you use a HISTOGRAM, PROBPLOT, or QQPLOT statement and ORDER=INTERNAL, PROC UNIVARIATE displays the rows (columns) of the comparative plot from top to bottom (left to right) in increasing order of the internal (unformatted) values of the class variable. The first class variable is used to label the rows of the comparative plots (top to bottom). The second class variable is used to label the columns of the comparative plots (left to right). For example, a numeric class variable DAY (with values 1, 2, and 3) has a user-defined format that assigns Wednesday to the value 1, Thursday to the value 2, and Friday to the value 3. The rows of the comparative plot will appear in day-of-the-week order (Wednesday, Thursday, Friday) from top to bottom. |
Default: | INTERNAL |
Requirement: | Enclose this option in parentheses after the class variable. |
Interaction: | When you use a HISTOGRAM, PROBPLOT, or QQPLOT statement and ORDER=INTERNAL, PROC UNIVARIATE constructs the levels of the class variables by using the formatted values of the variables. The formatted values of the first class variable are used to label the rows of the comparative plots (top to bottom). The formatted values of a second class variable are used to label the columns of the comparative plots (left to right).
PROC UNIVARIATE determines the layout of a two-way comparative plot by using the order for the first class variable to obtain the order of the rows from top to bottom. Then it applies the order for the second class variable to the observations that correspond to the first row to obtain the order of the columns from left to right. If any columns remain unordered (that is, the categories are unbalanced), PROC UNIVARIATE applies the order for the second class variable to the observations in the second row, and so on, until all the columns have been ordered. |
Featured in: | Creating a Two-Way Comparative Histogram |
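The DAY example used in the ORDER= descriptions can be sketched as below. The format name DAYFMT, the data set WORK.CALLS, and the analysis variable Minutes are hypothetical. The row orders noted in the comments are the ones that the descriptions above give for this format.

```sas
/* User-defined format from the example: 1=Wednesday, 2=Thursday, 3=Friday */
proc format;
   value dayfmt 1='Wednesday' 2='Thursday' 3='Friday';
run;

/* Hypothetical data */
data work.calls;
   input Day Minutes @@;
   datalines;
1 12  1 15  2 9  2 11  3 14  3 10
;
run;

/* ORDER=FORMATTED: rows in alphabetical order of the formatted values */
/* (Friday, Thursday, Wednesday) from top to bottom.                   */
proc univariate data=work.calls;
   class Day (order=formatted);
   format Day dayfmt.;
   histogram Minutes;
run;

/* ORDER=INTERNAL (the default): rows in order of the internal values */
/* 1, 2, 3 (Wednesday, Thursday, Friday) from top to bottom.          */
proc univariate data=work.calls;
   class Day (order=internal);
   format Day dayfmt.;
   histogram Minutes;
run;
```

In both steps the row labels are the formatted values. Only the ordering of the rows differs.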
Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.