The LOESS Procedure

Example 38.2: Sulfate Deposits in the USA for 1990

The following data set contains measurements, in grams per square meter, of sulfate (SO4) deposits during 1990 at 179 sites throughout the 48 contiguous states.

   data SO4;
      input Latitude Longitude SO4 @@;
   datalines;
   32.45833  87.24222 1.403 34.28778  85.96889 2.103
   33.07139 109.86472 0.299 36.07167 112.15500 0.304
   31.95056 112.80000 0.263 33.60500  92.09722 1.950
   .
   .      more data lines
   .
   42.92889 109.78667 0.182 43.22278 109.99111 0.161
   43.87333 104.19222 0.306 44.91722 110.42028 0.210
   45.07611  72.67556 2.646
   ;

The following statements produce the two scatter plots of the SO4 data shown in Output 38.2.1 and Output 38.2.2:

 
   symbol1 color=black value=dot ;  
   proc gplot data=SO4;
      plot Latitude*Longitude/hreverse;
   run;
 
   proc g3d data=SO4; 
      format SO4 f4.1;
      scatter Longitude*Latitude=SO4 / 
            shape='balloon' 
            size=0.35
            rotate=80
            tilt=60;
   run;

Output 38.2.1: Locations of Sulfate Measurements

Output 38.2.2: Scatter Plot of SO4 Data

From these scatter plots, it is clear that the largest concentrations are in the northeastern United States. These plots also indicate that a nonparametric surface, such as a loess fit, is appropriate for these data.

The sulfate measurements are irregularly spaced. The following statements create a SAS data set containing a regular one-degree grid of points (21 latitudes by 45 longitudes, 945 points in all) that will be used in the SCORE statement:

 
   data PredPoints;
      do Latitude = 26 to 46 by 1;
         do Longitude = 79 to 123 by 1;
            output;
         end;
      end;
   run;

The following statements fit loess models for two values of the smoothing parameter and save the results in output data sets:

 
   proc loess data=SO4;
      ods Output ScoreResults=ScoreOut
                 OutputStatistics=StatOut;
      model SO4=Latitude Longitude/smooth=0.15 0.4 residual;
      score data=PredPoints;
   run;

Notice that even though there are two predictors in the model, the SCALE= option is not appropriate here because both predictors (Latitude and Longitude) are measured in the same units (degrees) and are therefore identically scaled.
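
For contrast, the following statements are a minimal sketch of a case in which scaling the predictors would be worth considering. The data set Sites and the variable Elevation (in meters) are hypothetical and are not part of this example. The sketch assumes that the SCALE=SD option in the MODEL statement is available in your release:

   /* Hypothetical data set Sites: Latitude is in degrees and    */
   /* Elevation is in meters, so the two predictors are not      */
   /* identically scaled.                                        */
   proc loess data=Sites;
      model SO4=Latitude Elevation / smooth=0.4 scale=sd;
   run;

Without scaling, the predictor with the larger numeric range would tend to dominate the distances used to select each local neighborhood.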

Output 38.2.3 shows scatter plots of the fit residuals versus each of the predictors for the two smoothing parameters specified. A loess fit of the residuals is also shown on these scatter plots and is obtained using PROC LOESS with the StatOut data set generated by the previous PROC LOESS step.

   proc loess data=StatOut;
      by SmoothingParameter;
      ods output OutputStatistics=ResidLatOut;
      model residual=Latitude;                
   run; 
   proc loess data=StatOut;
      by SmoothingParameter;
      ods output OutputStatistics=ResidLongOut;
      model residual=Longitude;                
   run;
   proc sort data=ResidLatOut;
      by SmoothingParameter Latitude;
   run; 
   proc sort data=ResidLongOut;
      by SmoothingParameter Longitude;
   run;

   goptions nodisplay; 
   symbol1 color=black value=dot ;  
   symbol2 color=black interpol=join value=none; 
   %let opts = vaxis=axis1 overlay vref=0 lv=2; 
   axis1 label = (angle=90 rotate=0);

   proc gplot data=ResidLatOut;
      by smoothingParameter;
      plot (DepVar Pred) * Latitude / &opts name='lat';
   run; 

   proc gplot data=ResidLongOut;
      by smoothingParameter;
      plot (DepVar Pred) * Longitude / &opts name='long';
   run; 

   goptions display;
   proc greplay nofs tc=sashelp.templt template=l2r2;
       igout gseg;
       treplay 1:long 2:long1 3:lat 4:lat1;
   run; quit ;

Output 38.2.3: Scatter Plots of Loess Fit Residuals

The scatter plots in Output 38.2.3 reveal that, with smoothing parameter 0.4, there is considerable information in the data that is not being captured by the loess model. By contrast, the residuals for the more localized smoothing parameter 0.15 show a better fit.
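
As a supplementary numeric check, the following statements sketch one way to summarize the residuals in the StatOut data set separately for each smoothing parameter. This step is an illustrative addition and is not part of the original example. It assumes that StatOut contains the variables SmoothingParameter and Residual, as used in the preceding PROC LOESS steps:

   /* Summarize the fit residuals for each smoothing parameter.  */
   /* USS is the uncorrected sum of squares of the residuals.    */
   proc means data=StatOut n mean std uss;
      class SmoothingParameter;
      var Residual;
   run;

A smaller smoothing parameter generally yields smaller residuals, so these summaries should be read alongside the residual plots rather than used on their own to choose a smoothing parameter.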

The ScoreOut data set contains the model predictions at the grid defined in the PredPoints data set. The following statements request a fitted surface and a contour plot of this surface with a smoothing parameter of 0.15:

   proc g3d data=ScoreOut(where= (smoothingParameter=0.15));
      format Latitude f4.0; 
      format Longitude f4.0;
      format p_SO4 f4.1;
      plot Longitude*Latitude=p_SO4/tilt=60 rotate=80;
   run;

   proc gcontour data=ScoreOut(where= (smoothingParameter=0.15)); 
      format latitude f4.0; 
      format longitude f4.0; 
      format p_SO4 f4.1;
      plot Latitude*Longitude = p_SO4/hreverse;
   run;

Output 38.2.4: LOESS Fit of SO4 Data

Output 38.2.5: Contour Plot of LOESS Fit of SO4 Data


Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.