Introduction to Regression Procedures
General Regression: The REG Procedure
The REG procedure is a general-purpose procedure for regression that
- handles multiple regression models
- provides nine model-selection methods
- allows interactive changes both in the
model and in the data used to fit the model
- allows linear equality restrictions on parameters
- tests linear hypotheses and multivariate hypotheses
- produces collinearity diagnostics, influence
diagnostics, and partial regression leverage plots
- saves estimates, predicted values, residuals,
confidence limits, and other diagnostic
statistics in output SAS data sets
- generates plots of data and of various statistics
- "paints" or highlights scatter plots to identify
particular observations or groups of observations
- uses, optionally, correlations or crossproducts for input
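
As a rough illustration of the last item, least-squares estimates depend on the data only through crossproducts, so a fit can be recovered from a crossproducts matrix without the raw observations. The Python sketch below is not PROC REG code. The function names (solve, fit_from_crossproducts) and the four-point data set are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: columns are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_from_crossproducts(
    xtx: List[List[float]], xty: List[float], yty: float
) -> Tuple[List[float], float]:
    """Return (coefficients, error sum of squares) from crossproducts only."""
    beta = solve(xtx, xty)  # normal equations: (X'X) beta = X'y
    sse = yty - sum(b * c for b, c in zip(beta, xty))  # y'y - beta'X'y
    return beta, sse


# Hypothetical data in which y = 1 + 2x holds exactly.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
design = [[1.0, xi] for xi in x]  # intercept column plus x

# From here on, only the crossproducts are needed, not the observations.
xtx = [[sum(r[j] * r[k] for r in design) for k in range(2)] for j in range(2)]
xty = [sum(r[j] * yi for r, yi in zip(design, y)) for j in range(2)]
yty = sum(yi * yi for yi in y)

beta, sse = fit_from_crossproducts(xtx, xty, yty)
print("intercept, slope:", beta)
print("error sum of squares:", sse)
```

For this made-up data the recovered intercept and slope are approximately 1 and 2, and the error sum of squares is approximately 0.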
Model-selection Methods in PROC REG
The nine methods of model selection implemented in PROC REG are
- NONE: no selection.
This method is the default and uses the full model given
in the MODEL statement to fit the linear regression.
- FORWARD: forward selection.
This method starts with no variables in
the model and adds variables one by one to the model.
At each step, the variable added is the
one that maximizes the fit of the model.
You can also specify groups of variables to
treat as a unit during the selection process.
An option enables you to specify the criterion for inclusion.
- BACKWARD: backward elimination.
This method starts with a full model and
eliminates variables one by one from the model.
At each step, the variable with the smallest
contribution to the model is deleted.
You can also specify groups of variables to
treat as a unit during the selection process.
An option enables you to specify the criterion for exclusion.
- STEPWISE: stepwise regression, forward and backward.
This method is a modification of the forward-selection method in
that variables already in the model do not necessarily stay there.
You can also specify groups of variables to
treat as a unit during the selection process.
Again, options enable you to specify criteria for
entry into the model and for remaining in the model.
- MAXR: maximum R² improvement.
This method tries to find the best one-variable model,
the best two-variable model, and so on.
The MAXR method differs from the STEPWISE method in that it evaluates
many more models: MAXR considers all switches before making any switch.
The STEPWISE method may remove the "worst" variable
without considering what the "best" remaining
variable might accomplish, whereas MAXR does consider it.
Consequently, MAXR typically takes
much longer to run than STEPWISE.
- MINR: minimum R² improvement.
This method closely resembles MAXR, but the switch chosen
is the one that produces the smallest increase in R².
- RSQUARE: finds a specified number of models having the
highest R² in each of a range of model sizes.
- CP: finds a specified number of models with the
lowest Cp within a range of model sizes.
- ADJRSQ: finds a specified number of models having the
highest adjusted R² within a range of model sizes.
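
The FORWARD description above can be sketched in Python. This is a simplified illustration under stated assumptions, not PROC REG's implementation: the gain in R² serves as the inclusion criterion, and the hypothetical min_gain threshold stands in for whatever criterion the inclusion option specifies. The function names and the eight-observation data set are also made up for the example.

```python
from typing import Dict, List, Sequence, Tuple


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: columns are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def r_squared(columns: Sequence[Sequence[float]], y: Sequence[float]) -> float:
    """R-square of a least-squares fit of y on an intercept plus the columns."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(r[j] * r[k] for r in design) for k in range(p)] for j in range(p)]
    xty = [sum(r[j] * yi for r, yi in zip(design, y)) for j in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse / sst


def forward_select(
    data: Dict[str, List[float]], y: List[float], min_gain: float = 0.01
) -> Tuple[List[str], float]:
    """Start with no variables; add the best one per step until gains are small."""
    selected: List[str] = []
    current = 0.0  # R-square of the intercept-only model
    while len(selected) < len(data):
        best_name, best_r2 = None, current
        for name in data:
            if name in selected:
                continue
            r2 = r_squared([data[v] for v in selected + [name]], y)
            if r2 > best_r2:
                best_name, best_r2 = name, r2
        # Assumed stopping rule: quit when the best candidate gains too little.
        if best_name is None or best_r2 - current < min_gain:
            break
        selected.append(best_name)
        current = best_r2
    return selected, current


# Hypothetical data: y = 2 + 3*x1 - x3 exactly; x2 is unrelated to y.
data = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "x2": [1.0, 1.0, -1.0, -1.0, -1.0, -1.0, 1.0, 1.0],
    "x3": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
}
y = [2.0 + 3.0 * a - c for a, c in zip(data["x1"], data["x3"])]

selected, r2 = forward_select(data, y)
print("variables in entry order:", selected)
print("final R-square:", r2)
```

With this made-up data, x1 enters first, x3 enters second, and x2 is never added because it contributes no gain. A backward-elimination sketch would run the same loop in reverse, starting from all variables and dropping the one whose removal costs the least.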
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.