SAS Procedures Guide |
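In the following notation, summation is over observations that contain nonmissing values of the analyzed variable and, except where shown, over nonmissing weights and frequencies of one or more:
$x_i$ is the nonmissing value of the analyzed variable for observation $i$.
$f_i$ is the frequency that is associated with $x_i$ if you use a FREQ statement. If you omit the FREQ statement, then $f_i = 1$ for all $i$.
$w_i$ is the weight that is associated with $x_i$ if you use a WEIGHT statement. By default, the base procedures treat a negative weight as if it is equal to zero. However, if you use the EXCLNPWGT option in the PROC statement, the procedure also excludes those values of $x_i$ with nonpositive weights. Note that most SAS/STAT procedures, such as PROC TTEST and PROC GLM, exclude values with nonpositive weights by default.
If you omit the WEIGHT statement, then $w_i = 1$ for all $i$.
$n$ is the number of nonmissing values of $x_i$.
$\bar{x}$ is the mean, $\sum w_i x_i / \sum w_i$.
$s^2$ is the variance,
$$s^2 = \frac{1}{d} \sum w_i (x_i - \bar{x})^2$$
where $d$ is the variance divisor (the VARDEF= option) that you specify in the PROC statement. Valid values are as follows: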
When VARDEF= | $d$ equals . . . |
---|---|
N | $n$ |
DF | $n - 1$ |
WEIGHT | $\sum_i w_i$ |
WDF | $\sum_i w_i - 1$ |
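As a rough illustration of how the variance divisor changes the result, here is a minimal Python sketch (not SAS code). It assumes the divisors $n$, $n-1$, $\sum w_i$, and $\sum w_i - 1$ for VARDEF=N, DF, WEIGHT, and WDF, and it ignores FREQ frequencies. The function name `variance` and its parameters are hypothetical.

```python
def variance(values, weights=None, vardef="DF"):
    """Weighted variance with a selectable divisor d (illustrative sketch)."""
    if weights is None:
        weights = [1.0] * len(values)  # no WEIGHT statement: w_i = 1
    # Negative weights are treated as zero, as described for the base procedures.
    weights = [max(w, 0.0) for w in weights]
    n = len(values)
    sum_w = sum(weights)
    mean = sum(w * x for w, x in zip(weights, values)) / sum_w
    # Weighted corrected sum of squares.
    css = sum(w * (x - mean) ** 2 for w, x in zip(weights, values))
    # Assumed mapping from the VARDEF= value to the divisor d.
    divisors = {"N": n, "DF": n - 1, "WEIGHT": sum_w, "WDF": sum_w - 1}
    return css / divisors[vardef.upper()]


data = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(data, vardef="N"))   # divisor n
print(variance(data, vardef="DF"))  # divisor n - 1
```

With unit weights, VARDEF=WEIGHT coincides with VARDEF=N and VARDEF=WDF coincides with VARDEF=DF, because $\sum w_i = n$.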
The standard keywords and formulas for each statistic follow. Some formulas use keywords to designate the corresponding statistic.
Statistic | PROC MEANS and SUMMARY | PROC UNIVARIATE | PROC TABULATE | PROC REPORT | PROC CORR | PROC SQL | |
---|---|---|---|---|---|---|---|
Number of missing values | X | X | X | X | X | ||
Number of nonmissing values | X | X | X | X | X | X | |
Number of observations | X | X | X | ||||
Sum of weights | X | X | X | X | X | X | |
Mean | X | X | X | X | X | X | |
Sum | X | X | X | X | X | X | |
Extreme values | X | X | |||||
Minimum | X | X | X | X | X | X | |
Maximum | X | X | X | X | X | X | |
Range | X | X | X | X | X | ||
Uncorrected sum of squares | X | X | X | X | X | X | |
Corrected sum of squares | X | X | X | X | X | X | |
Variance | X | X | X | X | X | X | |
Covariance | X | ||||||
Standard deviation | X | X | X | X | X | X | |
Standard error of the mean | X | X | X | X | X | ||
Coefficient of variation | X | X | X | X | X | ||
Skewness | X | X | X | ||||
Kurtosis | X | X | X | ||||
Confidence Limits | |||||||
of the mean | X | X | |||||
of the variance | X | ||||||
of quantiles | X | ||||||
Median | X | X | X | X | |||
Mode | X | ||||||
Percentiles/Deciles/Quartiles | X | X | X | ||||
t test | |||||||
for mean=0 | X | X | X | X | X | ||
for mean= | X | ||||||
Nonparametric tests for location | X | ||||||
Tests for normality | X | ||||||
Correlation coefficients | X | ||||||
Cronbach's alpha | X |
Descriptive Statistics |
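The kurtosis (keyword KURTOSIS or KURT) measures the heaviness of the tails. When VARDEF=DF, the kurtosis is computed as
$$c_{4n} \sum z_i^4 - \frac{3(n-1)^2}{(n-2)(n-3)}$$
where $z_i$ is the standardized value $(x_i - \bar{x})/s$ and $c_{4n}$ is $\frac{n(n+1)}{(n-1)(n-2)(n-3)}$. The weighted kurtosis is computed as
$$c_{4n} \sum \left(\frac{x_i - \bar{x}_w}{\hat{\sigma}_i}\right)^4 - \frac{3(n-1)^2}{(n-2)(n-3)} = c_{4n} \sum w_i^2 \left(\frac{x_i - \bar{x}_w}{s_w}\right)^4 - \frac{3(n-1)^2}{(n-2)(n-3)}$$
When VARDEF=N, the kurtosis is computed as
$$\frac{1}{n} \sum z_i^4 - 3$$
and the weighted kurtosis is computed as
$$\frac{1}{n} \sum \left(\frac{x_i - \bar{x}_w}{\hat{\sigma}_i}\right)^4 - 3 = \frac{1}{n} \sum w_i^2 \left(\frac{x_i - \bar{x}_w}{s_w}\right)^4 - 3$$
where $\hat{\sigma}_i^2$ is $s_w^2 / w_i$, and $\bar{x}_w$ and $s_w$ are the weighted mean and the weighted standard deviation. The formula is invariant under the transformation $w_i^* = z w_i$, $z > 0$. When you use VARDEF=WDF or VARDEF=WEIGHT, the kurtosis is set to missing.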
Note: PROC MEANS and PROC TABULATE do not compute weighted kurtosis.
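The unweighted kurtosis can be sketched in Python as follows. This is an illustration, not SAS code. It assumes the VARDEF=DF form $c_{4n} \sum z_i^4 - 3(n-1)^2/((n-2)(n-3))$ with $c_{4n} = n(n+1)/((n-1)(n-2)(n-3))$ and the VARDEF=N form $\frac{1}{n}\sum z_i^4 - 3$. The function name `kurtosis` is hypothetical, and `None` stands in for a SAS missing value.

```python
import math


def kurtosis(values, vardef="DF"):
    """Unweighted kurtosis for VARDEF=DF or VARDEF=N (illustrative sketch)."""
    n = len(values)
    mean = sum(values) / n
    css = sum((x - mean) ** 2 for x in values)
    if vardef == "DF":
        s = math.sqrt(css / (n - 1))
        z4 = sum(((x - mean) / s) ** 4 for x in values)
        c4n = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
        return c4n * z4 - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    if vardef == "N":
        s = math.sqrt(css / n)
        z4 = sum(((x - mean) / s) ** 4 for x in values)
        return z4 / n - 3
    # VARDEF=WDF or VARDEF=WEIGHT: the statistic is set to missing.
    return None


print(kurtosis([1, 2, 3, 4, 5], "DF"))
print(kurtosis([1, 2, 3, 4, 5], "N"))
```

The VARDEF=DF form needs at least four observations, because the factor $(n-3)$ appears in the denominator.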
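The skewness (keyword SKEWNESS or SKEW) measures the tendency of the deviations to be larger in one direction than in the other. When VARDEF=DF, the skewness is computed as
$$c_{3n} \sum z_i^3$$
where $z_i$ is the standardized value $(x_i - \bar{x})/s$ and $c_{3n}$ is $\frac{n}{(n-1)(n-2)}$. The weighted skewness is computed as
$$c_{3n} \sum \left(\frac{x_i - \bar{x}_w}{\hat{\sigma}_i}\right)^3 = c_{3n} \sum w_i^{3/2} \left(\frac{x_i - \bar{x}_w}{s_w}\right)^3$$
When VARDEF=N, the skewness is computed as
$$\frac{1}{n} \sum z_i^3$$
and the weighted skewness is computed as
$$\frac{1}{n} \sum \left(\frac{x_i - \bar{x}_w}{\hat{\sigma}_i}\right)^3 = \frac{1}{n} \sum w_i^{3/2} \left(\frac{x_i - \bar{x}_w}{s_w}\right)^3$$
where $\hat{\sigma}_i^2$ is $s_w^2 / w_i$, and $\bar{x}_w$ and $s_w$ are the weighted mean and the weighted standard deviation. The formula is invariant under the transformation $w_i^* = z w_i$, $z > 0$. When you use VARDEF=WDF or VARDEF=WEIGHT, the skewness is set to missing.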
Note: PROC MEANS and PROC TABULATE do not compute weighted skewness.
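The following Python sketch (not SAS code) illustrates the weighted skewness and checks numerically that it does not change when every weight is multiplied by the same positive constant. It assumes the form $c \sum w_i^{3/2} ((x_i - \bar{x}_w)/s_w)^3$, with $c = n/((n-1)(n-2))$ for VARDEF=DF and $c = 1/n$ for VARDEF=N. The function name `weighted_skewness` is hypothetical, and `None` stands in for a SAS missing value.

```python
import math


def weighted_skewness(values, weights=None, vardef="DF"):
    """Weighted skewness for VARDEF=DF or VARDEF=N (illustrative sketch)."""
    n = len(values)
    if weights is None:
        weights = [1.0] * n  # unweighted case: w_i = 1
    sum_w = sum(weights)
    mean = sum(w * x for w, x in zip(weights, values)) / sum_w
    css = sum(w * (x - mean) ** 2 for w, x in zip(weights, values))
    if vardef == "DF":
        s = math.sqrt(css / (n - 1))
        scale = n / ((n - 1) * (n - 2))
    elif vardef == "N":
        s = math.sqrt(css / n)
        scale = 1 / n
    else:
        return None  # VARDEF=WDF or VARDEF=WEIGHT: set to missing
    return scale * sum(
        w ** 1.5 * ((x - mean) / s) ** 3 for w, x in zip(weights, values)
    )


x = [1, 2, 3, 10]
w = [1, 2, 1, 3]
print(weighted_skewness(x))                      # unweighted
print(weighted_skewness(x, w))                   # weighted
print(weighted_skewness(x, [4 * v for v in w]))  # same value up to rounding
```

Scaling the weights by $z$ multiplies both $w_i^{3/2}$ and $s_w^3$ by $z^{3/2}$, so the two factors cancel.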
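The standard error of the mean (keyword STDERR or STDMEAN) is computed as
$$\frac{s}{\sqrt{\sum w_i}}$$
when VARDEF=DF, which is the default. Otherwise, STDERR is set to missing.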
Percentile and Related Statistics |
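Let $n$ be the number of nonmissing values for a variable, and let $x_1, x_2, \ldots, x_n$ represent the ordered values of the variable. Let the $t$th percentile be $y$, set $p = t/100$, and let
$$np = j + g \quad \text{when PCTLDEF=1, 2, 3, or 5}$$
$$(n+1)p = j + g \quad \text{when PCTLDEF=4}$$
where $j$ is the integer part and $g$ is the fractional part. Here, PCTLDEF= specifies the method that the procedure uses to compute the $t$th percentile, as shown in the table that follows.
When you use the WEIGHT statement, the $t$th percentile is computed as
$$y = \begin{cases} \frac{1}{2}(x_i + x_{i+1}) & \text{if } \sum_{j=1}^{i} w_j = pW \\ x_{i+1} & \text{if } \sum_{j=1}^{i} w_j < pW < \sum_{j=1}^{i+1} w_j \end{cases}$$
where $w_j$ is the weight associated with $x_j$ and $W = \sum_{i=1}^{n} w_i$ is the sum of the weights. When the observations have identical weights, the weighted percentiles are the same as the unweighted percentiles with PCTLDEF=5.
PCTLDEF= | Description | Formula | Condition |
---|---|---|---|
1 | weighted average at $x_{np}$ | $y = (1-g)x_j + g x_{j+1}$ | where $x_0$ is taken to be $x_1$ |
2 | observation numbered closest to $np$ | $y = x_i$ | if $g \ne \frac{1}{2}$ |
 | | $y = x_j$ | if $g = \frac{1}{2}$ and $j$ is even |
 | | $y = x_{j+1}$ | if $g = \frac{1}{2}$ and $j$ is odd |
 | | | where $i$ is the integer part of $np + \frac{1}{2}$ |
3 | empirical distribution function | $y = x_j$ | if $g = 0$ |
 | | $y = x_{j+1}$ | if $g > 0$ |
4 | weighted average aimed at $x_{(n+1)p}$ | $y = (1-g)x_j + g x_{j+1}$ | where $x_{n+1}$ is taken to be $x_n$ |
5 | empirical distribution function with averaging | $y = \frac{1}{2}(x_j + x_{j+1})$ | if $g = 0$ |
 | | $y = x_{j+1}$ | if $g > 0$ |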
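The five percentile definitions and the weighted rule can be sketched in Python as follows. This is an illustration under stated assumptions, not SAS code. It assumes $np = j + g$ for PCTLDEF=1, 2, 3, and 5, and $(n+1)p = j + g$ for PCTLDEF=4, with $j$ the integer part and $g$ the fractional part. The function names `percentile` and `weighted_percentile` are hypothetical. Clamping out-of-range indexes to the first or last ordered value is a simplification, and the exact floating-point comparisons used here are only safe for simple inputs.

```python
import math


def percentile(values, t, pctldef=5):
    """t-th percentile under PCTLDEF=1..5 (illustrative sketch)."""
    x = sorted(values)
    n = len(x)
    # n*t/100 is computed in this order to limit rounding error.
    target = (n + 1) * t / 100 if pctldef == 4 else n * t / 100
    j = math.floor(target)  # integer part
    g = target - j          # fractional part

    def at(k):
        # 1-based lookup; out-of-range indexes are clamped (assumption).
        return x[min(max(k, 1), n) - 1]

    if pctldef in (1, 4):
        return (1 - g) * at(j) + g * at(j + 1)
    if pctldef == 2:
        if g != 0.5:
            return at(math.floor(n * t / 100 + 0.5))
        return at(j) if j % 2 == 0 else at(j + 1)
    if pctldef == 3:
        return at(j) if g == 0 else at(j + 1)
    if pctldef == 5:
        return (at(j) + at(j + 1)) / 2 if g == 0 else at(j + 1)
    raise ValueError("pctldef must be 1, 2, 3, 4, or 5")


def weighted_percentile(values, weights, t):
    """Weighted t-th percentile following the cumulative-weight rule."""
    pairs = sorted(zip(values, weights))
    target = sum(w for _, w in pairs) * t / 100  # p * W
    cum = 0.0
    for k, (x, w) in enumerate(pairs):
        cum += w
        if cum == target and k + 1 < len(pairs):
            return (x + pairs[k + 1][0]) / 2  # cumulative weight equals pW
        if cum > target:
            return x  # first value whose cumulative weight exceeds pW
    return pairs[-1][0]


data = [10, 20, 30, 40, 50]
for k in range(1, 6):
    print(k, percentile(data, 50, k))
print(weighted_percentile(data, [1, 1, 1, 1, 1], 50))
```

With five values and $t = 50$, $np = 2.5$, so the definitions disagree: definition 1 interpolates between the second and third values, definition 2 picks the second value, and definitions 3, 4, and 5 return the third value.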
Hypothesis Testing Statistics |
The Student's $t$ statistic (keyword T) for testing the null hypothesis that the population mean is equal to $\mu_0$ is computed as
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{\sum w_i}}$$
By default, $\mu_0$ is equal to zero. You can use the MU0= option in the PROC UNIVARIATE statement to specify $\mu_0$. You must use VARDEF=DF, which is the default variance divisor; otherwise, T is set to missing.
By default, when you use a WEIGHT statement, the procedure counts the values with nonpositive weights in the degrees of freedom. Use the EXCLNPWGT option in the PROC statement to exclude values with nonpositive weights. Most SAS/STAT procedures, such as PROC TTEST and PROC GLM, automatically exclude values with nonpositive weights.
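To make the effect of nonpositive weights on the degrees of freedom concrete, here is a minimal Python sketch (not SAS code). It assumes the statistic $t = (\bar{x} - \mu_0) / (s / \sqrt{\sum w_i})$ with VARDEF=DF. The function name `t_statistic` and the `exclnpwgt` flag, which only mimics the EXCLNPWGT option, are hypothetical.

```python
import math


def t_statistic(values, weights=None, mu0=0.0, exclnpwgt=False):
    """Return (t, degrees of freedom) under VARDEF=DF (illustrative sketch)."""
    if weights is None:
        weights = [1.0] * len(values)
    pairs = list(zip(values, weights))
    if exclnpwgt:
        # Drop observations whose weight is zero or negative.
        pairs = [(x, w) for x, w in pairs if w > 0]
    else:
        # Keep them in the count, but treat negative weights as zero.
        pairs = [(x, max(w, 0.0)) for x, w in pairs]
    n = len(pairs)
    sum_w = sum(w for _, w in pairs)
    mean = sum(w * x for x, w in pairs) / sum_w
    css = sum(w * (x - mean) ** 2 for x, w in pairs)
    s = math.sqrt(css / (n - 1))      # VARDEF=DF divisor
    stderr = s / math.sqrt(sum_w)     # standard error of the mean
    return (mean - mu0) / stderr, n - 1


x = [1, 2, 3, 4, 5, 100]
w = [1, 1, 1, 1, 1, 0]
print(t_statistic(x, w))                  # zero-weight value counted in df
print(t_statistic(x, w, exclnpwgt=True))  # zero-weight value excluded
```

The zero-weight observation does not change the mean, but keeping it raises $n$, which changes both the variance divisor and the degrees of freedom.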
Confidence Limits for the Mean |
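A two-sided $100(1-\alpha)$ percent confidence interval for the mean (keyword CLM) has lower and upper limits
$$\bar{x} \pm t_{(1-\alpha/2;\, n-1)} \frac{s}{\sqrt{n}}$$
where $s$ is $\sqrt{\frac{1}{n-1} \sum (x_i - \bar{x})^2}$, $t_{(1-\alpha/2;\, n-1)}$ is the $(1-\alpha/2)$ critical value of the Student's t statistic with $n-1$ degrees of freedom, and $\alpha$ is the value of the ALPHA= option, which by default is 0.05. Unless you use VARDEF=DF, which is the default variance divisor, CLM is set to missing.
The one-sided lower confidence limit (keyword LCLM) is
$$\bar{x} - t_{(1-\alpha;\, n-1)} \frac{s}{\sqrt{n}}$$
Unless you use VARDEF=DF, which is the default variance divisor, LCLM is set to missing.
The one-sided upper confidence limit (keyword UCLM) is
$$\bar{x} + t_{(1-\alpha;\, n-1)} \frac{s}{\sqrt{n}}$$
Unless you use VARDEF=DF, which is the default variance divisor, UCLM is set to missing.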
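A small Python sketch (not SAS code) of the confidence limits for the mean follows. It assumes limits of the form $\bar{x} \pm t \, s / \sqrt{n}$ with $s$ computed using the $n-1$ divisor. Because the Python standard library has no Student's t quantile function, the critical value is passed in by the caller. The function name `mean_confidence_limits` and the parameter `t_crit` are hypothetical.

```python
import math


def mean_confidence_limits(values, t_crit):
    """Return (lower, upper) limits for the mean (illustrative sketch).

    t_crit is the Student's t critical value with n - 1 degrees of freedom,
    looked up by the caller for the chosen alpha.
    """
    n = len(values)
    mean = sum(values) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    half_width = t_crit * s / math.sqrt(n)
    return mean - half_width, mean + half_width


# For n = 5 and alpha = 0.05, the two-sided critical value with
# 4 degrees of freedom is approximately 2.776.
lower, upper = mean_confidence_limits([1, 2, 3, 4, 5], t_crit=2.776)
print(lower, upper)
```

For a one-sided limit, pass the $(1-\alpha)$ critical value instead of the $(1-\alpha/2)$ value and keep only the lower or the upper element of the result.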
Using Weights |
Data Requirements for Summarization Procedures |
Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.