STAT 410 96-2 Assignment 9 Solutions
- I intended these to be separate estimates. The formulas involved are
6.44 and 6.45 (with estimates plugged in for the population variances and
covariances in 6.45) for the ratio estimate and standard error and 7.48 using 7.56
for the regression estimate with 7.58 for
its standard error. Using plots, or computing
correlation coefficients, I expect to
discover that the correlation between government income and total
employment income is negative so that ratio estimates are very poor.
The regression estimates will not be much better than the usual
N ybar_st unless the correlations are over about 0.3 or so.
- In this question I intended you to use combined estimates. For
ratio estimates the formulas are 6.48 and 6.51 plugging in estimates for
the needed variances and correlations. For regression use the formulas in the middle of page 202
to estimate the slope and then 7.61 to estimate the variance of the
estimate.
- There are a total of 9 possible samples. For each sample the separate
regression estimate of the slope is simply the difference of the two y values divided
by the difference of the corresponding x values. The pooled slope is estimated
by the simple formula for bc' on page 202. We are trying to estimate Y=40. I enumerated
the samples in an obvious order and so I am not typing that here.
b1 | b2 | bc | sep est | comb est |
1.50 | 5.00 | 2.20 | 48.50 | 42.90 |
1.50 | 2.00 | 1.85 | 36.50 | 37.27 |
1.50 | 0.50 | 1.00 | 44.00 | 42.00 |
1.67 | 5.00 | 2.00 | 49.33 | 42.00 |
1.67 | 2.00 | 1.83 | 37.33 | 37.50 |
1.67 | 0.50 | 1.31 | 44.83 | 43.04 |
2.00 | 5.00 | 3.50 | 48.00 | 40.50 |
2.00 | 2.00 | 2.00 | 36.00 | 36.00 |
2.00 | 0.50 | 0.80 | 43.50 | 45.90 |
This leads to biases of 3.11 and 0.79 and mean squared errors (the average of the squares
of estimate - 40) of 34.58 and 10.05. This is not the answer in Cochran. Notice
that the bias component and the variance are both bigger for the separate estimates,
emphasizing the danger of these estimates in small samples. The bias of 3.11 is
squared to compare it to the variance or, alternatively, you can compare the bias to the
standard error. You see that the bias of about 3 is a substantial fraction of a standard
error (around 5), so the bias is unacceptably large in this tiny problem.
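The bias and mean squared error calculation can be checked directly from the table. The following is only a sketch: the names `sep`, `comb` and `bias_and_mse` are mine, and the inputs are the rounded table entries, so the results may differ slightly from the figures quoted above.

```python
# Estimates copied from the table (rounded to two decimals there).
TRUE_TOTAL = 40.0
sep = [48.50, 36.50, 44.00, 49.33, 37.33, 44.83, 48.00, 36.00, 43.50]
comb = [42.90, 37.27, 42.00, 42.00, 37.50, 43.04, 40.50, 36.00, 45.90]


def bias_and_mse(estimates, target):
    """Return (bias, MSE, variance) over equally likely samples."""
    n = len(estimates)
    mean = sum(estimates) / n
    bias = mean - target
    # MSE is the average squared distance from the true value.
    mse = sum((e - target) ** 2 for e in estimates) / n
    # MSE = variance + bias^2, so the variance is what is left over.
    variance = mse - bias ** 2
    return bias, mse, variance


for name, ests in (("separate", sep), ("combined", comb)):
    bias, mse, var = bias_and_mse(ests, TRUE_TOTAL)
    print(f"{name}: bias={bias:.2f} mse={mse:.2f} "
          f"var={var:.2f} se={var ** 0.5:.2f}")
```

The variance is obtained by subtracting the squared bias from the mean squared error, which is the comparison of bias to standard error described above.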
- In this question you are to find 4 variances:
- for systematic there are ten possible values of the sample mean: 22.3, 18.2
and so on. You subtract 4155/200 from these, square, add and divide by 10.
- for an SRS the variance is (23601/199)(19/20)/10.
- for a one per column sample you must work out the value of formula 5.30 which is
(sum of S_h^2) (19/20)/200. To compute the sum you go to 5.32 where the left hand side is
23601 and the last term on the right hand side is a sum of 10 terms beginning
(410/20 - 4155/200)^2 multiplied by Nh=20. Finally you solve for
sum of S_h^2 remembering that Nh-1 is 19.
- this is like the last one but you group two columns together to
form a stratum. Now the Nh's are 40 and so on.
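As a rough check on the arithmetic, here is a sketch of two of the steps: the SRS variance as written above, and the "solve for sum of S_h^2" step from the sum of squares decomposition. The function names are mine, and I am assuming equal-sized strata. The stratum totals are not typed in here, so the second function is left for you to call with the actual column totals.

```python
def srs_variance(total_ss, N, n):
    """Variance of the sample mean under SRS: (S^2 / n) * (1 - n/N)."""
    S2 = total_ss / (N - 1)          # e.g. 23601 / 199
    return S2 / n * (1 - n / N)


def sum_of_stratum_variances(total_ss, stratum_totals, stratum_size):
    """Solve total SS = (Nh - 1) * sum(S_h^2) + sum(Nh * (Ybar_h - Ybar)^2)
    for sum(S_h^2), assuming every stratum has the same size Nh."""
    N = stratum_size * len(stratum_totals)
    grand_mean = sum(stratum_totals) / N
    between = sum(
        stratum_size * (t / stratum_size - grand_mean) ** 2
        for t in stratum_totals
    )
    return (total_ss - between) / (stratum_size - 1)


# SRS variance with the numbers quoted above: N = 200, n = 10.
print(round(srs_variance(23601, 200, 10), 4))

# First of the ten between-column terms, before multiplying by Nh = 20.
print((410 / 20 - 4155 / 200) ** 2)
```

For the two-columns-per-stratum case the same function would be called with the five paired totals and a stratum size of 40.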
- The value of P is 81/360, N is 360 and n=360/8=45. From this you work out the
variance of the usual p from an SRS of 45. To get the variance for a systematic
sample you have to consider each of the 8 possible samples and figure out how many
of the sample houses are in the list given. For instance if we sample houses 1, 9,
17, ... I get 7 houses in the list: 33, 41, 89, 313, 321, 329 and 337. This leads to
p=7/45. You get all 8 values of p, subtract 81/360, square, sum and divide by 8.
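The recipe above can be sketched as follows. The function names are mine, and I am assuming the usual finite-population formula PQ/n times (N-n)/(N-1) for the variance of p under SRS. Only the seven houses quoted for the first sample are typed in, so the example checks just that one sample; to get the systematic variance you would pass the full list from the question.

```python
from fractions import Fraction


def srs_variance_of_p(P, N, n):
    """Variance of the sample proportion under SRS without replacement."""
    return P * (1 - P) / n * Fraction(N - n, N - 1)


def systematic_proportions(listed, N, k):
    """Sample proportion for each of the k systematic samples
    (houses start, start + k, start + 2k, ... for start = 1..k)."""
    n = N // k
    listed = set(listed)
    return [
        Fraction(sum(1 for h in range(start, N + 1, k) if h in listed), n)
        for start in range(1, k + 1)
    ]


def systematic_variance(props, P):
    """Average squared deviation of the k sample proportions from P."""
    return sum((p - P) ** 2 for p in props) / len(props)


P = Fraction(81, 360)
print(float(srs_variance_of_p(P, N=360, n=45)))

# Only the houses quoted above for the sample 1, 9, 17, ...;
# this is NOT the full list from the question.
first_sample_hits = [33, 41, 89, 313, 321, 329, 337]
p_first = systematic_proportions(first_sample_hits, N=360, k=8)[0]
print(p_first)  # proportion for the first systematic sample
```

With the complete list, `systematic_variance(systematic_proportions(full_list, 360, 8), P)` would give the number to compare with the SRS variance.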
- The idea here is that with an average household size of 5 you may well get long stretches
in the sample where all the people sampled are children or conversely all are male heads
of household. This high correlation means that systematic sampling probably has
a large variance for estimating proportion of males or proportion of children. For Polish
origin the clumping of neighbourhoods with large contiguous groups of the same ethnic
background is exactly the situation where systematic sampling works best. Cochran
says that for proportion of males the situation is intermediate; this is because
the children are, in terms of sex, essentially in random order within families --
a simple fact of life. The adults are in periodic or at least quasi-periodic order
however, with the males listed first.