The TPSPLINE Procedure
You can use PROC TPSPLINE to fit a surface that reflects the general trend and that reveals underlying features of the data.
   data so4;
      input latitude longitude so4 @@;
      datalines;
   32.45833  87.24222 1.403  34.28778  85.96889 2.103  33.07139 109.86472 0.299
   36.07167 112.15500 0.304  31.95056 112.80000 0.263  33.60500  92.09722 1.950
   34.17944  93.09861 2.168  36.08389  92.58694 1.578
   . . . 162 additional observations . . .
   45.82278  91.87444 0.984  41.34028 106.19083 0.335  42.73389 108.85000 0.236
   42.49472 108.82917 0.313  42.92889 109.78667 0.182  43.22278 109.99111 0.161
   43.87333 104.19222 0.306  44.91722 110.42028 0.210  45.07611  72.67556 2.646
   ;

   data pred;
      do latitude = 25 to 47 by 1;
         do longitude = 68 to 124 by 1;
            output;
         end;
      end;
   run;
The preceding statements create the SAS data set so4 and the data set pred in order to make predictions on a regular grid. The following statements fit a surface for SO4 deposition. The ODS OUTPUT statement creates a data set called GCV to contain the GCV values for LOGNLAMBDA in the range from -6 to 1.
   proc tpspline data=so4;
      ods output GCVFunction=gcv;
      model so4 = (latitude longitude) / lognlambda=(-6 to 1 by 0.1);
      score data=pred out=prediction1;
   run;
Partial output from these statements is displayed in Output 64.3.1.
Output 64.3.1: Partial Output from PROC TPSPLINE for Data Set SO4

   symbol1 interpol=join value=none;
   title "GCV Function";
   proc gplot data=gcv;
      plot gcv*lognlambda / frame cframe=ligr vaxis=axis1 haxis=axis2;
   run;
Output 64.3.2 displays the GCV function plotted against LOGNLAMBDA, that is, n*lambda on the log10 scale. The GCV function has two minima. PROC TPSPLINE locates the global minimum at LOGNLAMBDA=0.277005; the plot also shows a local minimum near -2.56. Note that the TPSPLINE procedure does not always find the global minimum, although it did in this case.
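The two minima can also be read directly from the gcv data set rather than from the plot. The following DATA step is a minimal sketch, not part of the original example: it assumes that gcv holds one row per grid point in increasing LOGNLAMBDA order with the variables lognlambda and gcv (the names used in the PLOT statement above), and the data set name gcv_minima and the variables min_lognlambda and min_gcv are hypothetical. Because the grid step is 0.1, the reported locations are accurate only to the nearest grid point.

```sas
/* Sketch: flag grid points whose GCV value is lower than both neighbors. */
/* Assumes gcv is ordered by lognlambda, as produced by the grid search.  */
data gcv_minima;
   set gcv;
   /* Call the LAG functions unconditionally so their queues stay aligned. */
   prev_gcv  = lag(gcv);
   prev2_gcv = lag2(gcv);
   prev_log  = lag(lognlambda);
   /* The previous row is a local minimum if it lies below both neighbors. */
   if _n_ > 2 and prev_gcv < prev2_gcv and prev_gcv < gcv then do;
      min_lognlambda = prev_log;
      min_gcv        = prev_gcv;
      output;
   end;
   keep min_lognlambda min_gcv;
run;

proc print data=gcv_minima;
run;
```

A value found this way can be passed to the LOGNLAMBDA0= option to refit the model at that smoothing parameter.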
Output 64.3.2: GCV Function of SO4 Data Set

   proc tpspline data=so4;
      model so4 = (latitude longitude) / lognlambda0=-2.56;
      score data=pred out=prediction2;
   run;

Output 64.3.3: Output from PROC TPSPLINE for Data Set SO4 with LOGNLAMBDA=-2.56
The estimate based on LOGNLAMBDA=-2.56 has a larger value for the degrees of freedom, and it has a much smaller standard deviation.
However, a smaller standard deviation in nonparametric regression does not necessarily mean that the estimate is good: a small value of the smoothing parameter always produces an estimate closer to the data and, therefore, a smaller standard deviation.
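This trade-off can be seen in the penalized least squares criterion that a thin-plate smoothing spline minimizes. The form below is the usual one for two predictors with a second-derivative roughness penalty; the symbols S, J_2, f, u, and v are notation assumed here for illustration and do not appear in the example itself.

```latex
S_\lambda(f) = \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - f(x_i)\bigr)^2 + \lambda\, J_2(f),
\qquad
J_2(f) = \iint \left( f_{uu}^2 + 2 f_{uv}^2 + f_{vv}^2 \right) du\, dv
```

As lambda decreases, the penalty term carries less weight, so the minimizer moves toward interpolating the data: the residuals shrink and the degrees of freedom grow, regardless of whether the extra detail reflects real structure.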
The following statements produce two contour plots of the estimates using the GCONTOUR procedure. In the final step, the plots are placed into a single graphic with the GREPLAY procedure.
   title "TPSPLINE fit with lognlambda=0.277";
   proc gcontour data=prediction1 gout=grafcat;
      plot latitude*longitude = P_so4 /
           name="tpscon1" legend=legend1
           vaxis=axis1 haxis=axis2 cframe=ligr hreverse;
   run;

   title "TPSPLINE fit with lognlambda=-2.56";
   proc gcontour data=prediction2 gout=grafcat;
      plot latitude*longitude = P_so4 /
           name="tpscon2" legend=legend1
           vaxis=axis1 haxis=axis2 cframe=ligr hreverse;
   run;

   title;
   proc greplay igout=grafcat tc=sashelp.templt template=v2 nofs;
      treplay 1:tpscon1 2:tpscon2;
   quit;
   run;
Compare the two estimates by examining the contour plots of both estimates (Output 64.3.4).
Output 64.3.4: Contour Plot of TPSPLINE Estimates with Different Lambdas
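A numerical comparison can supplement the contour plots. The following statements are a sketch under stated assumptions rather than part of the original example: they assume that prediction1 and prediction2 each contain latitude, longitude, and the predicted value P_so4 (the variable plotted above), and the names p1, p2, compare, p_global, p_local, and diff are hypothetical.

```sas
/* Sort both score data sets by grid coordinates so they can be merged. */
proc sort data=prediction1 out=p1;
   by latitude longitude;
run;

proc sort data=prediction2 out=p2;
   by latitude longitude;
run;

/* Put the two predictions side by side and take their difference. */
data compare;
   merge p1(rename=(P_so4=p_global))
         p2(rename=(P_so4=p_local));
   by latitude longitude;
   diff = p_local - p_global;
run;

/* Summarize how far the two fitted surfaces are from each other. */
proc means data=compare n mean std min max;
   var diff;
run;
```

Large values of diff mark grid points where the two smoothing parameters lead to noticeably different surfaces, which is where the contour plots deserve the closest look.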
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.