The CORR Procedure
Missing Values
If you specify the NOMISS option, PROC CORR uses listwise deletion when a value of the BY, FREQ, VAR, WEIGHT, or WITH statement variable is missing. PROC CORR excludes all observations with missing values from the analysis. Therefore, the number of observations for each pair of variables is identical. The PARTIAL statement always excludes the observations with missing values by automatically invoking NOMISS. Listwise deletion is needed to correctly calculate Cronbach's coefficient alpha when data are missing. If a data set contains missing values, use the NOMISS option whenever you specify ALPHA.
There are two reasons to specify NOMISS and, thus, to avoid pairwise deletion. First, NOMISS is computationally more efficient, so you use fewer computer resources. Second, if you use the correlations as input to regression or other statistical procedures, a pairwise-missing correlation matrix leads to several statistical difficulties. Pairwise correlation matrices may not be nonnegative definite, and the pattern of missing values may bias the results.
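As a minimal sketch of the difference between the two deletion methods, the following steps run PROC CORR twice on a small made-up data set. The data set name work.demo and all of its values are assumptions for illustration only; they do not come from this chapter.

```sas
/* Hypothetical data: each variable is missing in at least one row */
data work.demo;
   input Weight Oxygen Runtime;
   datalines;
70.5 44.6 11.4
.    45.3 10.1
85.8 .     8.7
68.2 59.6  8.2
89.5 44.8  .
77.5 45.0 11.6
.    50.0  9.9
;
run;

/* No NOMISS: pairwise deletion. Each pair of variables uses every  */
/* observation in which both values are nonmissing, so the number   */
/* of observations can differ from pair to pair.                    */
proc corr data=work.demo;
   var Weight Oxygen Runtime;
run;

/* NOMISS: listwise deletion. Any observation with a missing VAR    */
/* value is dropped, so every pair uses the same observations.      */
proc corr data=work.demo nomiss;
   var Weight Oxygen Runtime;
run;
```

With these values, the first run uses four or five observations depending on the pair, while the NOMISS run uses the same three complete observations for every pair.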
Procedure Output
When you specify the CSSCP, SSCP, or COV option, the appropriate sum-of-squares and crossproducts and covariance matrix appears at the top of the correlation report. If the data set contains missing values, PROC CORR prints additional statistics for each pair of variables. These statistics, calculated from the observations with nonmissing row and column variable values, may include
For each pair of variables, PROC CORR always prints the correlation coefficients, the number of observations used to calculate the coefficient, and the significance probability. When you specify the ALPHA option, PROC CORR prints Cronbach's coefficient alpha, the correlation between the variable and the total of the remaining variables, and Cronbach's coefficient alpha using the remaining variables for the raw variables and the standardized variables.
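The options described above can be combined in a single PROC CORR statement. The sketch below assumes a hypothetical set of four survey items (work.survey, Item1 through Item4) with a few missing responses; the names and values are invented for illustration.

```sas
/* Hypothetical survey items; '.' marks a missing response */
data work.survey;
   input Item1-Item4;
   datalines;
4 5 4 3
3 . 3 2
5 5 4 5
2 3 . 2
4 4 5 4
1 2 2 1
3 3 4 3
;
run;

/* COV adds the covariance matrix to the report.                   */
/* ALPHA requests Cronbach's coefficient alpha; NOMISS is included */
/* because the data contain missing values.                        */
proc corr data=work.survey cov alpha nomiss;
   var Item1-Item4;
run;
```

Because NOMISS is in effect, the two incomplete observations are excluded from every statistic in the report.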
Output Data Sets
The statement
   proc corr nocorr cov outp=b(type=cov);
specifies the output data set type as COV.
PROC CORR does not print the output data set. Use PROC PRINT, PROC REPORT, or another SAS reporting tool to print the output data set.
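As a sketch of this two-step pattern, the following code creates a TYPE=COV output data set and then prints it. It assumes that the SASHELP.CLASS sample data set (with numeric variables Height, Weight, and Age) is available in your installation; the output data set name work.b is arbitrary.

```sas
/* Create the output data set. NOCORR suppresses the correlations, */
/* and TYPE=COV marks the data set as a covariance data set.       */
proc corr data=sashelp.class nocorr cov outp=work.b(type=cov);
   var Height Weight Age;
run;

/* PROC CORR does not print work.b, so print it explicitly */
proc print data=work.b;
run;
```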
The output data set includes the following variables
You can use a combination of the _TYPE_ and _NAME_ variables to identify the contents of an observation. The _NAME_ variable indicates which row of the correlation matrix the observation corresponds to. The values of the _TYPE_ variable are
When you specify the SSCP option, the OUTP= data set includes an additional observation that contains intercept values. When you specify the ALPHA option, the OUTP= data set also includes observations with the following _TYPE_ values:
When you use a PARTIAL statement, the previous statistics are calculated for the variables after partialling. If PROC CORR computes Pearson correlation statistics, MEAN equals zero and STD equals the partial standard deviation associated with the partial variance for the OUTP=, OUTK=, or OUTS= data set. Otherwise, PROC CORR assigns missing values to MEAN and STD.
The output titled "OUTP= Data Set with Pearson Partial Correlations" lists the observations in an OUTP= data set when the COV option and PARTIAL statement are used to compute Pearson partial correlations. The _TYPE_ variable identifies COV, MEAN, STD, N, and CORR as the statistical values for the variables Weight, Oxygen, and Runtime. MEAN always equals 0, and STD is the partial standard deviation.
OUTP= Data Set with Pearson Partial Correlations
Pearson Correlation Statistics Using the PARTIAL Statement        1
Output Data Set from PROC CORR

_TYPE_    _NAME_      Weight     Oxygen    Runtime

COV       Weight     72.4374   -12.7511     2.0677
COV       Oxygen    -12.7511    27.0165    -5.5937
COV       Runtime     2.0677    -5.5937     1.9451
MEAN                  0.0000     0.0000     0.0000
STD                   8.5110     5.1977     1.3947
N                    28.0000    28.0000    28.0000
CORR      Weight      1.0000    -0.2882     0.1742
CORR      Oxygen     -0.2882     1.0000    -0.7716
CORR      Runtime     0.1742    -0.7716     1.0000
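A data set with the same layout could be produced along the following lines. This is only a sketch: the input data behind the listing above are not shown in this section, so the data set work.fitness, its values, and the choice of Age as the PARTIAL variable are all assumptions, and the code will not reproduce the numbers in the listing.

```sas
/* Hypothetical fitness measurements (invented values) */
data work.fitness;
   input Age Weight Oxygen Runtime;
   datalines;
44 89.5 44.6 11.4
40 75.1 45.3 10.1
44 85.8 54.3  8.7
42 68.2 59.6  8.2
38 89.0 49.9  9.2
47 77.5 44.8 11.6
51 67.3 45.1 11.1
54 83.1 51.9 10.3
;
run;

/* COV plus a PARTIAL statement: the OUTP= data set holds partial  */
/* covariances and correlations after partialling out Age.         */
proc corr data=work.fitness cov outp=work.partial_out;
   var Weight Oxygen Runtime;
   partial Age;
run;

/* Use _TYPE_ to select rows: here only the MEAN and STD rows */
proc print data=work.partial_out;
   where _TYPE_ in ('MEAN', 'STD');
run;
```

Selecting on _TYPE_ in this way makes it easy to check that the MEAN row contains zeros and that the STD row holds the partial standard deviations.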
Copyright 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.