The SURVEYREG Procedure
Each state is divided into several regions, and each region is used as a stratum. Within each stratum, a simple random sample with replacement is drawn. A total of 19 farms is selected for the stratified simple random sample. The sample size and population size within each stratum are displayed in Table 62.3.
Table 62.3: Number of Farms in Each Stratum
Stratum | State | Region | Farms in Population | Farms in Sample |
1 | Iowa | 1 | 100 | 3 |
2 | Iowa | 2 | 50 | 5 |
3 | Iowa | 3 | 15 | 3 |
4 | Nebraska | 1 | 30 | 6 |
5 | Nebraska | 2 | 40 | 2 |
Total | | | 235 | 19 |
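Three models are considered to represent the data:
   Model I: the same intercept and the same slope for both states
   Model II: the same intercept but different slopes for the two states
   Model III: different intercepts and different slopes for the two states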
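Written as equations, the three models might be sketched as follows. The symbols (alpha for an intercept, beta for a slope, epsilon for the error term, and the state index s) are notation assumed here for illustration; the model descriptions follow the titles used in the PROC SURVEYREG steps of this example.

```latex
\begin{align*}
\text{Model I:}   \quad & \text{CornYield} = \alpha + \beta \,\text{FarmArea} + \varepsilon \\
\text{Model II:}  \quad & \text{CornYield} = \alpha + \beta_{s} \,\text{FarmArea} + \varepsilon,
                          \qquad s \in \{\text{Iowa}, \text{Nebraska}\} \\
\text{Model III:} \quad & \text{CornYield} = \alpha_{s} + \beta_{s} \,\text{FarmArea} + \varepsilon,
                          \qquad s \in \{\text{Iowa}, \text{Nebraska}\}
\end{align*}
```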
Data from the stratified sample are saved in the SAS data set Farms.
   data Farms;
      input State $ Region FarmArea CornYield Weight;
      datalines;
   Iowa     1 100  54 33.333
   Iowa     1  83  25 33.333
   Iowa     1  25  10 33.333
   Iowa     2 120  83 10.000
   Iowa     2  50  35 10.000
   Iowa     2 110  65 10.000
   Iowa     2  60  35 10.000
   Iowa     2  45  20 10.000
   Iowa     3  23   5  5.000
   Iowa     3  10   8  5.000
   Iowa     3 350 125  5.000
   Nebraska 1 130  20  5.000
   Nebraska 1 245  25  5.000
   Nebraska 1 150  33  5.000
   Nebraska 1 263  50  5.000
   Nebraska 1 320  47  5.000
   Nebraska 1 204  25  5.000
   Nebraska 2  80  11 20.000
   Nebraska 2  48   8 20.000
   ;
In the data set Farms, the variable Weight represents the sampling weight. In this example, the sampling weight is proportional to the reciprocal of the sampling rate within each stratum from which a farm is selected. The information on population size in each stratum is saved in the SAS data set TotalInStrata.
   data TotalInStrata;
      input State $ Region _TOTAL_;
      datalines;
   Iowa     1 100
   Iowa     2  50
   Iowa     3  15
   Nebraska 1  30
   Nebraska 2  40
   ;
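As a quick check on the Weight values, the ratio of stratum population size to stratum sample size can be computed from the two data sets. The following is a minimal sketch; the data set name StratumWeights and the variables PopSize, SampleSize, and CheckWeight are names assumed here for illustration and are not part of the original example.

```sas
/* Sketch: recompute the sampling weight N_h / n_h for each stratum.   */
/* StratumWeights, PopSize, SampleSize, and CheckWeight are            */
/* hypothetical names introduced only for this check.                  */
proc sql;
   create table StratumWeights as
   select t.State,
          t.Region,
          t._TOTAL_            as PopSize,      /* farms in the stratum   */
          count(*)             as SampleSize,   /* sampled farms          */
          t._TOTAL_ / count(*) as CheckWeight format=8.3
   from Farms as f
        inner join TotalInStrata as t
        on f.State = t.State and f.Region = t.Region
   group by t.State, t.Region, t._TOTAL_;
quit;

proc print data=StratumWeights noobs;
run;
```

For the first stratum, for example, the ratio is 100/3, or about 33.333, which agrees with the Weight value recorded for the three farms in Iowa region 1.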
Using the sample data from the data set Farms and the stratum population totals from the data set TotalInStrata, you can fit Model I by using PROC SURVEYREG.
   title1 'Analysis of Farm Area and Corn Yield';
   title2 'Model I: Same Intercept and Slope';
   proc surveyreg data=Farms total=TotalInStrata;
      strata State Region / list;
      model CornYield = FarmArea / covb;
      weight Weight;
   run;
Output 62.4.1: Data Summary and Stratum Information Fitting Model I
Alternatively, you can assume that the linear relationship between corn yield (CornYield) and farm area (FarmArea) is different among the states. Therefore, you consider fitting Model II.
In order to analyze the data using Model II, you create auxiliary variables FarmAreaNE and FarmAreaIA to represent farm area in different states:
   FarmAreaNE = FarmArea if the farm is in Nebraska, and 0 otherwise
   FarmAreaIA = FarmArea if the farm is in Iowa, and 0 otherwise
The following statements create these variables in a new data set called FarmsByState.
   title1 'Analysis of Farm Area and Corn Yield';
   title2 'Model II: Same Intercept, Different Slopes';
   data FarmsByState;
      set Farms;
      if State='Iowa' then do;
         FarmAreaIA=FarmArea;
         FarmAreaNE=0;
      end;
      else do;
         FarmAreaIA=0;
         FarmAreaNE=FarmArea;
      end;
   run;
The following statements perform the regression using the new data set FarmsByState. The analysis uses the auxiliary variables FarmAreaIA and FarmAreaNE as the regressors.
   proc surveyreg data=FarmsByState total=TotalInStrata;
      strata State Region;
      model CornYield = FarmAreaIA FarmAreaNE / covb;
      weight Weight;
   run;
Output 62.4.3: Regression Results from Fitting Model II
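One way to examine whether the two slopes in Model II actually differ is to add a CONTRAST statement to the same PROC SURVEYREG step. This is a sketch that goes slightly beyond the original example: the label 'Equal Slopes' is an arbitrary name chosen here, and the contrast tests the hypothesis that the coefficients of FarmAreaIA and FarmAreaNE are equal.

```sas
/* Sketch: test H0 that the Iowa and Nebraska slopes are equal.        */
/* The label 'Equal Slopes' is a hypothetical name for this contrast.  */
proc surveyreg data=FarmsByState total=TotalInStrata;
   strata State Region;
   model CornYield = FarmAreaIA FarmAreaNE / covb;
   contrast 'Equal Slopes' FarmAreaIA 1 FarmAreaNE -1;
   weight Weight;
run;
```

A small p-value for this contrast would favor separate slopes over the single slope of Model I.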
For Model III, different intercepts are used for the linear relationship in the two states. The following statements illustrate the use of the NOINT option in the MODEL statement together with the CLASS statement to fit Model III.
   title1 'Analysis of Farm Area and Corn Yield';
   title2 'Model III: Different Intercepts and Slopes';
   proc surveyreg data=FarmsByState total=TotalInStrata;
      strata State Region;
      class State;
      model CornYield = State FarmAreaIA FarmAreaNE / noint covb solution;
      weight Weight;
   run;
The MODEL statement includes the classification effect State as a regressor. Because the NOINT option suppresses the overall intercept, the parameter estimates for the effect State represent the intercepts in the two states.
Output 62.4.4: Regression Results for Fitting Model III
Output 62.4.4 displays the regression results for fitting Model III, including the data summary, parameter estimates, and covariance matrix of the regression coefficients. The estimated covariance matrix shows a lack of correlation between the regression coefficients from different states. This suggests that Model III might be the best choice for building a model for farm area and corn yield in these two states.
However, some statistics, such as the Weighted Mean of CornYield, remain the same under all three regression models. These estimators do not depend on the particular model you use.
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.