The SURVEYSELECT Procedure
This example illustrates replicated sampling, which selects multiple samples from the survey population according to the same design. You can use replicated sampling to provide a simple method of variance estimation, or to evaluate variable nonsampling errors such as interviewer differences. Refer to Kish (1965), Kish (1987), and Kalton (1983) for information on replicated sampling.
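As a sketch of the simple variance estimator that replicated sampling makes possible, suppose each of the k replicates yields its own estimate of the same population quantity. The notation below is an assumption introduced here for illustration and is not taken from the cited references. It also assumes that the replicates are selected independently under the same design.

```latex
\hat{\theta} = \frac{1}{k}\sum_{r=1}^{k}\hat{\theta}_r ,
\qquad
\widehat{\mathrm{Var}}\bigl(\hat{\theta}\bigr)
  = \frac{1}{k(k-1)}\sum_{r=1}^{k}\bigl(\hat{\theta}_r - \hat{\theta}\bigr)^{2}
```

Here \hat{\theta}_r is the estimate computed from replicate r alone, and \hat{\theta} is the average of the replicate estimates. With only a few replicates, the variance estimate has few degrees of freedom (k - 1) and is itself imprecise.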
This design includes 4 replicates, each with a sample size of 50 customers. The sampling frame is stratified by State and sorted by Type and Usage within strata. Customers are selected by sequential random sampling with equal probability within strata. The following PROC SURVEYSELECT statements select a probability sample of customers from the Customers data set using this design.

   title1 'Customer Satisfaction Survey';
   title2 'Replicated Sampling';
   proc surveyselect data=Customers
         method=seq rep=4 n=(8 12 20 10)
         seed=40070 out=SampleRep;
      strata State;
      control Type Usage;
   run;

The STRATA statement names the stratification variable State. The CONTROL statement names the control variables Type and Usage. In the PROC SURVEYSELECT statement, the METHOD=SEQ option requests sequential random sampling. The REP=4 option specifies 4 replicates of this sample. The N=(8 12 20 10) option specifies the stratum sample sizes for each replicate. The N= option lists the stratum sample sizes in the same order as the strata appear in the Customers data set, which has been sorted by State. The sample size of 8 customers corresponds to the first stratum, State = 'AL'. The sample size of 12 corresponds to the next stratum, State = 'FL', and so on. The four stratum sample sizes sum to 50, the sample size of each replicate. The SEED=40070 option specifies 40070 as the initial seed for random number generation.
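Because the N= list is matched to strata by position, it can be worth confirming the stratum order in the frame and the resulting counts in the sample. The following is a minimal sketch of one way to do that, assuming the Customers and SampleRep data sets exist as described above. It is an added check, not part of the original example.

```sas
/* Strata in the sampling frame, listed in sorted order of State. */
/* The N= values are assigned to strata in this order.            */
proc freq data=Customers;
   tables State / nocum nopercent;
run;

/* Number of sampled customers in each stratum and replicate. */
proc freq data=SampleRep;
   tables State*Replicate / norow nocol nopercent;
run;
```

If the design was applied as described, each Replicate column of the second table should contain the counts 8, 12, 20, and 10, in stratum order.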
Output 63.1.1 displays the output from PROC SURVEYSELECT, which summarizes the sample selection. A total of 200 customers is selected in 4 replicates of 50 customers each. PROC SURVEYSELECT selects each replicate using sequential random sampling within strata determined by State. The sampling frame Customers is sorted by the control variables Type and Usage within strata, according to hierarchic serpentine sorting. The output data set SampleRep contains the sample.
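Output 63.1.1: Sample Selection Summary

The following PROC PRINT statements display the customers selected from the first stratum in the output data set SampleRep.

   title1 'Customer Satisfaction Survey';
   title2 'Sample Selected by Replicated Design';
   title3 '(First Stratum)';
   proc print data=SampleRep;
      where State = 'AL';
   run;
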
Output 63.1.2 displays the 32 sample customers of the first stratum (State = 'AL'), 8 from each of the 4 replicates, from the output data set SampleRep, which includes the entire sample of 200 customers. The variable SelectionProb contains the selection probability, and SamplingWeight contains the sampling weight. Since customers are selected with equal probability within strata in this design, all customers in the same stratum have the same selection probability. These selection probabilities and sampling weights apply to a single replicate, and the variable Replicate contains the sample replicate number.
Output 63.1.2: Customer Sample (First Stratum)
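As an illustration of how the replicates might be used for simple variance estimation, the following sketch computes a weighted mean within each replicate and then summarizes the spread of the four replicate means. It assumes that Usage is a numeric variable, which the example does not state. The data set name RepMeans and the variable name RepMean are hypothetical names introduced here.

```sas
/* Weighted mean of Usage within each replicate.               */
/* Assumption: Usage is numeric. RepMeans and RepMean are      */
/* hypothetical names chosen for this sketch.                  */
proc means data=SampleRep noprint nway;
   class Replicate;
   var Usage;
   weight SamplingWeight;
   output out=RepMeans mean=RepMean;
run;

/* The mean of the replicate means is the combined estimate.   */
/* STDERR is the standard deviation of the replicate means     */
/* divided by the square root of the number of replicates.     */
proc means data=RepMeans n mean stderr;
   var RepMean;
run;
```

Because SamplingWeight applies to a single replicate, an analysis that pools all 200 customers into one sample would need each weight divided by the number of replicates (4 here). The within-replicate means above do not need this adjustment.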
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.