Reweighting Observations in an Analysis
Reweighting observations is an interactive feature
of PROC REG that enables you to change the weights of
observations used in computing the regression equation.
Observations can also be deleted from the analysis (not
from the data set) by changing their weights to zero.
The Class data (in the "Getting Started" section) are used to illustrate
some of the features of the REWEIGHT statement.
First, the full model is fit, and the residuals
are displayed in Figure 55.45.
proc reg data=Class;
model Weight=Age Height / p;
id Name;
run;
The REG Procedure
Model: MODEL1
Dependent Variable: Weight

Output Statistics

Obs  Name      Dep Var Weight  Predicted Value  Residual
  1  Alfred          112.5000         124.8686  -12.3686
  2  Alice            84.0000          78.6273    5.3727
  3  Barbara          98.0000         110.2812  -12.2812
  4  Carol           102.5000         102.5670   -0.0670
  5  Henry           102.5000         105.0849   -2.5849
  6  James            83.0000          80.2266    2.7734
  7  Jane             84.5000          89.2191   -4.7191
  8  Janet           112.5000         102.7663    9.7337
  9  Jeffrey          84.0000         100.2095  -16.2095
 10  John             99.5000          86.3415   13.1585
 11  Joyce            50.5000          57.3660   -6.8660
 12  Judy             90.0000         107.9625  -17.9625
 13  Louise           77.0000          76.6295    0.3705
 14  Mary            112.0000         117.1544   -5.1544
 15  Philip          150.0000         138.2164   11.7836
 16  Robert          128.0000         107.2043   20.7957
 17  Ronald          133.0000         118.9529   14.0471
 18  Thomas           85.0000          79.6676    5.3324
 19  William         112.0000         117.1544   -5.1544

Sum of Residuals                          0
Sum of Squared Residuals         2120.09974
Predicted Residual SS (PRESS)    3272.72186
Figure 55.45: Full Model for CLASS Data, Residuals Shown
Upon examining the data and residuals, you realize that
observation 17 (Ronald) was mistakenly included in the analysis.
Also, you would like to examine the effect of
reweighting to 0.5 those observations with residuals
that have absolute values greater than or equal to 17.
reweight obs.=17;
reweight r. le -17 or r. ge 17 / weight=0.5;
print p;
run;
At this point, a message appears in the log that tells you which
observations have been reweighted and what the new weights are.
Figure 55.46 is produced.
The REG Procedure
Model: MODEL1.2
Dependent Variable: Weight

Output Statistics

Obs  Name      Weight Variable  Dep Var Weight  Predicted Value  Residual
  1  Alfred             1.0000        112.5000         121.6250   -9.1250
  2  Alice              1.0000         84.0000          79.9296    4.0704
  3  Barbara            1.0000         98.0000         107.5484   -9.5484
  4  Carol              1.0000        102.5000         102.1663    0.3337
  5  Henry              1.0000        102.5000         104.3632   -1.8632
  6  James              1.0000         83.0000          79.9762    3.0238
  7  Jane               1.0000         84.5000          87.8225   -3.3225
  8  Janet              1.0000        112.5000         103.6889    8.8111
  9  Jeffrey            1.0000         84.0000          98.7606  -14.7606
 10  John               1.0000         99.5000          85.3117   14.1883
 11  Joyce              1.0000         50.5000          58.6811   -8.1811
 12  Judy               0.5000         90.0000         106.8740  -16.8740
 13  Louise             1.0000         77.0000          76.8377    0.1623
 14  Mary               1.0000        112.0000         116.2429   -4.2429
 15  Philip             1.0000        150.0000         135.9688   14.0312
 16  Robert             0.5000        128.0000         103.5150   24.4850
 17  Ronald                  0        133.0000         117.8121   15.1879
 18  Thomas             1.0000         85.0000          78.1398    6.8602
 19  William            1.0000        112.0000         116.2429   -4.2429

Sum of Residuals                          0
Sum of Squared Residuals         1500.61194
Predicted Residual SS (PRESS)    2287.57621

NOTE: The above statistics use observation weights or frequencies.
Figure 55.46: Model with Reweighted Observations
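The NOTE at the bottom of this listing indicates that the summary statistics take the observation weights into account. As a quick check, and assuming that the reported Sum of Squared Residuals is the weighted sum of each weight times its squared residual (an interpretation inferred from the NOTE, not stated in this section), the following sketch recomputes that sum from the Weight Variable and Residual columns of the MODEL1.2 listing. The data set name WeightedCheck and the variable names Wt, Resid, and WtdSq are hypothetical.

```sas
/* Weights and residuals copied from the MODEL1.2 listing,   */
/* observations 1 through 19, as pairs of (weight, residual). */
data WeightedCheck;
   input Wt Resid @@;          /* @@ reads several pairs per line */
   WtdSq = Wt * Resid**2;      /* weighted squared residual       */
   datalines;
1   -9.1250   1    4.0704   1   -9.5484   1    0.3337   1   -1.8632
1    3.0238   1   -3.3225   1    8.8111   1  -14.7606   1   14.1883
1   -8.1811   0.5 -16.8740  1    0.1623   1   -4.2429   1   14.0312
0.5 24.4850   0   15.1879   1    6.8602   1   -4.2429
;

/* Sum of the weighted squared residuals, shown to two decimals */
proc means data=WeightedCheck sum maxdec=2;
   var WtdSq;
run;
```

The sum should come out at about 1500.61, which agrees with the reported value of 1500.61194 up to rounding in the displayed residuals. Observation 17 contributes nothing to the sum, since its weight is 0.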
The first REWEIGHT statement excludes observation 17,
and the second REWEIGHT statement reweights observations 12 and 16 to 0.5.
Note that the model label has changed from MODEL1 to MODEL1.2,
since two REWEIGHT statements have been used.
An important feature to note from this example is that
the model is not refit until the PRINT statement is submitted.
REWEIGHT statements by themselves do not cause the model to be refit,
so that multiple REWEIGHT statements can be applied to a subsequent model.
Consequently, both conditions in this example were evaluated
against the residuals of the original full model.
Since the intent here is to reweight observations
with large residuals, the observation that was
mistakenly included in the analysis should be deleted first; then the
model should be fit for the remaining observations, and only then should
the observations with large residuals be reweighted.
To accomplish this, use the REFIT statement.
These statements produce Figure 55.47:
reweight allobs / weight=1.0;
reweight obs.=17;
refit;
reweight r. le -17 or r. ge 17 / weight=.5;
print;
run;
The REG Procedure
Model: MODEL1.5
Dependent Variable: Weight

Output Statistics

Obs  Name      Weight Variable  Dep Var Weight  Predicted Value  Residual
  1  Alfred             1.0000        112.5000         120.9716   -8.4716
  2  Alice              1.0000         84.0000          79.5342    4.4658
  3  Barbara            1.0000         98.0000         107.0746   -9.0746
  4  Carol              1.0000        102.5000         101.5681    0.9319
  5  Henry              1.0000        102.5000         103.7588   -1.2588
  6  James              1.0000         83.0000          79.7204    3.2796
  7  Jane               1.0000         84.5000          87.5443   -3.0443
  8  Janet              1.0000        112.5000         102.9467    9.5533
  9  Jeffrey            1.0000         84.0000          98.3117  -14.3117
 10  John               1.0000         99.5000          85.0407   14.4593
 11  Joyce              1.0000         50.5000          58.6253   -8.1253
 12  Judy               1.0000         90.0000         106.2625  -16.2625
 13  Louise             1.0000         77.0000          76.5908    0.4092
 14  Mary               1.0000        112.0000         115.4651   -3.4651
 15  Philip             1.0000        150.0000         134.9953   15.0047
 16  Robert             0.5000        128.0000         103.1923   24.8077
 17  Ronald                  0        133.0000         117.0299   15.9701
 18  Thomas             1.0000         85.0000          78.0288    6.9712
 19  William            1.0000        112.0000         115.4651   -3.4651

Sum of Residuals                          0
Sum of Squared Residuals         1637.81879
Predicted Residual SS (PRESS)    2473.87984

NOTE: The above statistics use observation weights or frequencies.
Figure 55.47: Observations Excluded from Analysis, Model Refitted and Observations Reweighted
Notice that this results in a slightly different
model than the previous set of statements:
only observation 16 is reweighted to 0.5, because the residual condition
is now evaluated after the model has been refit without observation 17.
Also note that the model label is now MODEL1.5, since five REWEIGHT
statements have been used for this model.
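For comparison, the same weights can be supplied non-interactively through a WEIGHT statement. The following is a minimal sketch, assuming that the observations in Class are stored in the order shown in the listings; the data set ClassW and the variable W are hypothetical names introduced here. The weights (0 for observation 17 and 0.5 for observation 16) are hard-coded from the MODEL1.5 listing rather than selected by a residual condition, so the sketch reproduces the end state of the interactive session, not the selection process.

```sas
/* Copy of Class with an explicit weight variable W that     */
/* mirrors the Weight Variable column shown for MODEL1.5.    */
data ClassW;
   set Class;
   W = 1;
   if _n_ = 17 then W = 0;          /* observation 17 (Ronald): excluded     */
   else if _n_ = 16 then W = 0.5;   /* observation 16 (Robert): downweighted */
run;

proc reg data=ClassW;
   model Weight=Age Height / p;
   weight W;                        /* zero-weight observations are excluded */
   id Name;
run;
quit;
```

Under these assumptions the fit should match the MODEL1.5 results, although the model is labeled MODEL1 because no REWEIGHT statements are involved. The interactive approach remains the more convenient one when the observations to reweight are not known until the residuals have been examined.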
Another important feature of the REWEIGHT statement is the ability to
nullify the effect of the previous REWEIGHT statement or of all
REWEIGHT statements.
First, assume that you have several REWEIGHT
statements in effect and you want to restore
the original weights of all the observations.
The following REWEIGHT statement accomplishes
this and produces Figure 55.48:
reweight allobs / reset;
print;
run;
The REG Procedure
Model: MODEL1.6
Dependent Variable: Weight

Output Statistics

Obs  Name      Dep Var Weight  Predicted Value  Residual
  1  Alfred          112.5000         124.8686  -12.3686
  2  Alice            84.0000          78.6273    5.3727
  3  Barbara          98.0000         110.2812  -12.2812
  4  Carol           102.5000         102.5670   -0.0670
  5  Henry           102.5000         105.0849   -2.5849
  6  James            83.0000          80.2266    2.7734
  7  Jane             84.5000          89.2191   -4.7191
  8  Janet           112.5000         102.7663    9.7337
  9  Jeffrey          84.0000         100.2095  -16.2095
 10  John             99.5000          86.3415   13.1585
 11  Joyce            50.5000          57.3660   -6.8660
 12  Judy             90.0000         107.9625  -17.9625
 13  Louise           77.0000          76.6295    0.3705
 14  Mary            112.0000         117.1544   -5.1544
 15  Philip          150.0000         138.2164   11.7836
 16  Robert          128.0000         107.2043   20.7957
 17  Ronald          133.0000         118.9529   14.0471
 18  Thomas           85.0000          79.6676    5.3324
 19  William         112.0000         117.1544   -5.1544

Sum of Residuals                          0
Sum of Squared Residuals         2120.09974
Predicted Residual SS (PRESS)    3272.72186
Figure 55.48: Restoring Weights of All Observations
The resulting model is identical to the original
model specified at the beginning of this section.
Notice that the model label is now MODEL1.6.
Note also that the Weight Variable column does not appear, since all
observations have been restored to a weight of 1.
Now suppose you want to undo only the changes
made by the most recent REWEIGHT statement.
Use REWEIGHT UNDO for this.
The following statements produce Figure 55.49:
reweight r. le -12 or r. ge 12 / weight=.75;
reweight r. le -17 or r. ge 17 / weight=.5;
reweight undo;
print;
run;
The REG Procedure
Model: MODEL1.9
Dependent Variable: Weight

Output Statistics

Obs  Name      Weight Variable  Dep Var Weight  Predicted Value  Residual
  1  Alfred             0.7500        112.5000         125.1152  -12.6152
  2  Alice              1.0000         84.0000          78.7691    5.2309
  3  Barbara            0.7500         98.0000         110.3236  -12.3236
  4  Carol              1.0000        102.5000         102.8836   -0.3836
  5  Henry              1.0000        102.5000         105.3936   -2.8936
  6  James              1.0000         83.0000          80.1133    2.8867
  7  Jane               1.0000         84.5000          89.0776   -4.5776
  8  Janet              1.0000        112.5000         103.3322    9.1678
  9  Jeffrey            0.7500         84.0000         100.2835  -16.2835
 10  John               0.7500         99.5000          86.2090   13.2910
 11  Joyce              1.0000         50.5000          57.0745   -6.5745
 12  Judy               0.7500         90.0000         108.2622  -18.2622
 13  Louise             1.0000         77.0000          76.5275    0.4725
 14  Mary               1.0000        112.0000         117.6752   -5.6752
 15  Philip             1.0000        150.0000         138.9211   11.0789
 16  Robert             0.7500        128.0000         107.0063   20.9937
 17  Ronald             0.7500        133.0000         119.4681   13.5319
 18  Thomas             1.0000         85.0000          79.3061    5.6939
 19  William            1.0000        112.0000         117.6752   -5.6752

Sum of Residuals                          0
Sum of Squared Residuals         1694.87114
Predicted Residual SS (PRESS)    2547.22751

NOTE: The above statistics use observation weights or frequencies.
Figure 55.49: Example of UNDO in REWEIGHT Statement
The resulting model reflects changes made only by the first
REWEIGHT statement since the third REWEIGHT statement
negates the effect of the second REWEIGHT statement.
Observations 1, 3, 9, 10, 12, 16, and
17 have their weights changed to 0.75. Note that the label MODEL1.9
reflects the use of nine REWEIGHT statements for the current model.
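The list of reweighted observations can be checked against the residuals of the model that was in effect when the first of these REWEIGHT statements was submitted, which here is the restored full model (MODEL1.6). The following sketch applies the same condition in a DATA step to the residuals copied from that listing; the data set name Selected and the variable names Obs and Resid are hypothetical.

```sas
/* Observation numbers and residuals copied from the MODEL1.6 */
/* listing (identical to the original full model).            */
data Selected;
   input Obs Resid @@;
   if Resid le -12 or Resid ge 12;   /* subsetting IF: keep rows that match */
   datalines;
 1 -12.3686   2   5.3727   3 -12.2812   4  -0.0670   5  -2.5849
 6   2.7734   7  -4.7191   8   9.7337   9 -16.2095  10  13.1585
11  -6.8660  12 -17.9625  13   0.3705  14  -5.1544  15  11.7836
16  20.7957  17  14.0471  18   5.3324  19  -5.1544
;

proc print data=Selected noobs;
run;
```

The rows that remain are observations 1, 3, 9, 10, 12, 16, and 17, which are the observations shown with a weight of 0.75. Observation 15 (residual 11.7836) falls just short of the cutoff.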
Now suppose you want to reset the observations selected by
the most recent REWEIGHT statement to their original weights.
Use the REWEIGHT statement with the RESET option to do this.
The following statements produce Figure 55.50:
reweight r. le -12 or r. ge 12 / weight=.75;
reweight r. le -17 or r. ge 17 / weight=.5;
reweight / reset;
print;
run;
The REG Procedure
Model: MODEL1.12
Dependent Variable: Weight

Output Statistics

Obs  Name      Weight Variable  Dep Var Weight  Predicted Value  Residual
  1  Alfred             0.7500        112.5000         126.0076  -13.5076
  2  Alice              1.0000         84.0000          77.8727    6.1273
  3  Barbara            0.7500         98.0000         111.2805  -13.2805
  4  Carol              1.0000        102.5000         102.4703    0.0297
  5  Henry              1.0000        102.5000         105.1278   -2.6278
  6  James              1.0000         83.0000          80.2290    2.7710
  7  Jane               1.0000         84.5000          89.7199   -5.2199
  8  Janet              1.0000        112.5000         102.0122   10.4878
  9  Jeffrey            0.7500         84.0000         100.6507  -16.6507
 10  John               0.7500         99.5000          86.6828   12.8172
 11  Joyce              1.0000         50.5000          56.7703   -6.2703
 12  Judy               1.0000         90.0000         108.1649  -18.1649
 13  Louise             1.0000         77.0000          76.4327    0.5673
 14  Mary               1.0000        112.0000         117.1975   -5.1975
 15  Philip             1.0000        150.0000         138.7581   11.2419
 16  Robert             1.0000        128.0000         108.7016   19.2984
 17  Ronald             0.7500        133.0000         119.0957   13.9043
 18  Thomas             1.0000         85.0000          80.3076    4.6924
 19  William            1.0000        112.0000         117.1975   -5.1975

Sum of Residuals                          0
Sum of Squared Residuals         1879.08980
Predicted Residual SS (PRESS)    2959.57279

NOTE: The above statistics use observation weights or frequencies.
Figure 55.50: REWEIGHT Statement with RESET option
Note that observations that meet the condition of
the second REWEIGHT statement (residuals with an
absolute value greater than or equal to 17) now
have weights reset to their original value of 1.
Observations 1, 3, 9, 10, and 17 keep weights of 0.75,
whereas observations 12 and 16 are back at a weight of 1.
Notice how the last three examples show three
ways to change weights back to a previous value.
In the first example, ALLOBS and the RESET option are used to
change weights for all observations back to their original values.
In the second example, the UNDO option is used to negate the
effect of the most recent REWEIGHT statement, so that the observations
selected by that statement revert to the weights assigned
by the earlier REWEIGHT statement.
In the third example, the RESET option is used to
change weights for observations selected in a previous
REWEIGHT statement back to their original values. Finally, note that
the label MODEL1.12 indicates that twelve REWEIGHT statements have
been applied to the original model.
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.