STAT 410 96-2 Assignment 2 Solutions

I found S^2 to be 13.466... There are 20 possible samples of size 3 and a corresponding value of s^2 for each, as in the following table. I used Excel to compute all these values.

Sample	s^2	Sample	s^2	Sample	s^2	Sample	s^2
8 3 1	39/3	8 3 11	49/3	8 3 4	21/3	8 3 7	21/3
8 1 11	79/3	8 1 4	37/3	8 1 7	43/3	8 11 4	37/3
8 11 7	13/3	8 4 7	13/3	3 1 11	84/3	3 1 4	7/3
3 1 7	28/3	3 11 4	57/3	3 11 7	48/3	3 4 7	13/3
1 11 4	79/3	1 11 7	76/3	1 4 7	27/3	11 4 7	37/3
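
If you would rather not use Excel, the table can be checked with a short script. This is only a sketch: it assumes the population is the six values 8, 3, 1, 11, 4, 7 that appear in the table, and the names `variance`, `S2` and `mean_s2` are my own. It enumerates all 20 samples of size 3 and compares the average of the s^2 values with S^2.

```python
from fractions import Fraction
from itertools import combinations

# Assumption: the population is the six values appearing in the table.
population = [8, 3, 1, 11, 4, 7]

def variance(values):
    """Variance with divisor len(values) - 1, kept as an exact fraction."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((v - mean) ** 2 for v in values) / (n - 1)

S2 = variance(population)

# One s^2 value for each of the 20 possible samples of size 3.
sample_variances = [variance(s) for s in combinations(population, 3)]
mean_s2 = sum(sample_variances) / len(sample_variances)

print(len(sample_variances), float(S2), float(mean_s2))
```

Exact fractions are used so that the comparison of the average s^2 with S^2 is not blurred by rounding.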

There are 36 possible results of 2 draws with replacement, though many of them give the same value of y bar; e.g., 8 followed by 3 gives the same value of y bar as 3 followed by 8. Here is part of my table of values of (y bar - Y bar)^2 with the corresponding probabilities. You multiply each value by its probability, add, and check that the answer agrees with the formulas given. The point is that V(ybar) is an expected value, which you compute by taking value times probability and adding.

Sample	Probability x 36	(y bar -Y bar)^2
8 8	1	(8-34/6)^2
8 3	2	(11/2-34/6)^2
8 1	2
8 11	2
8 4	2
8 7	2
3 3	1
3 1	2
3 11	2
3 4	2
3 7	2
1 1	1
1 11	2
1 4	2
1 7	2
11 11	1
11 4	2
11 7	2
4 4	1
4 7	2
7 7	1
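
The multiply-and-add step can be sketched as follows. The population is again assumed to be 8, 3, 1, 11, 4, 7, and the formula used for comparison, sigma^2/n with sigma^2 computed with divisor N, is my assumption about which formula the question intends.

```python
from fractions import Fraction
from itertools import product

population = [8, 3, 1, 11, 4, 7]  # assumed population, read off the tables
N = len(population)
Ybar = Fraction(sum(population), N)  # 34/6

# All 36 equally likely ordered results of 2 draws with replacement.
draws = list(product(population, repeat=2))
prob = Fraction(1, len(draws))

# V(ybar) = sum over outcomes of probability x (ybar - Ybar)^2.
V_ybar = sum(prob * (Fraction(a + b, 2) - Ybar) ** 2 for a, b in draws)

# Assumed comparison formula: sigma^2 / n, with sigma^2 using divisor N.
sigma2 = sum((y - Ybar) ** 2 for y in population) / N
print(V_ybar, sigma2 / 2)
```

Working with the 36 ordered draws, each of probability 1/36, is equivalent to the 21-row table with weights 1 and 2.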

The estimated population total is about 51,473 and 10% of that is 5147. You are therefore asked to compute the probability that an estimate comes within 5147 of the corresponding parameter value. You convert 5147 to standard units by estimating the standard error of the population total using formula 2.22 in the text and dividing 5147 by this figure. The result is about 1.55. Thus the probability asked for is the probability that a standard normal is between -1.55 and 1.55, which is about 88%, or roughly 0.9, as Cochran gives. Note that this probability is not estimated very precisely because you have had to plug in a guess for 10% of the population total.
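
The normal probability in the last step can be checked without tables. This is a small sketch; `prob_within` is a name I made up, and it uses the standard identity P(-z < Z < z) = erf(z/sqrt(2)).

```python
import math

def prob_within(z):
    """P(-z < Z < z) for a standard normal Z."""
    return math.erf(z / math.sqrt(2))

p = prob_within(1.55)
print(round(p, 3))
```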
You need to see that a sample size of 12 makes 2 standard errors less than $200 while a sample size of 11 does not. You are given the value of S^2 for the population of N=36 shelves today and will have to assume that the value of S^2 on the future occasion when the shelves are resampled will be about the same. Then you just plug n=12 and n=11 into formula 2.13 and compare the result with the desired value for 1 SE of $100.
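
The plug-in comparison can be sketched as below. Two things here are assumptions on my part: the formula N*sqrt((1 - n/N) S^2/n) for the standard error of an estimated total under simple random sampling (I have not checked that this is exactly formula 2.13), and the value `S2 = 130.0`, which is a made-up placeholder and not the value given in the problem. Substitute the real S^2 before drawing any conclusion.

```python
import math

def se_total(N, n, S2):
    """SE of the estimated total under SRS without replacement (assumed formula)."""
    return N * math.sqrt((1 - n / N) * S2 / n)

def smallest_n(N, S2, max_se):
    """Smallest sample size whose SE for the total is at most max_se."""
    for n in range(1, N + 1):
        if se_total(N, n, S2) <= max_se:
            return n
    return N  # not reached: a full census (n = N) has SE 0

N = 36
S2 = 130.0  # hypothetical placeholder, NOT the value from the problem
for n in (11, 12):
    print(n, round(se_total(N, n, S2), 1))
print(smallest_n(N, S2, max_se=100))
```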
When you can carry out separate surveys of owners and renters you are able to choose two separate sample sizes, say r (renters) and s (owners), which will add up to the total sample size n. The variance of the estimated difference is about 15/r+15/s, so the standard error is no more than 1 exactly when 15/r+15/s is no more than 1. There are many solutions, so it's up to you to see that you should take the solution which minimizes n=r+s. This gives r=s=30. When you cannot separate out owners and renters in advance you will have to take a SRS of n from the total population. You will then get a value of r approximately equal to 0.25n and s equal to roughly 0.75n, leading to the condition 15/(0.25n) + 15/(0.75n) = 1, giving n=80. Note that the actual sample sizes achieved will be random and there is a substantial probability that the number of renters will actually be less than 0.25n = 20, in which case the standard error of the difference will be larger than 1.
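
Both sample size calculations can be checked by brute force. This sketch takes the expression 15/r + 15/s from the text at face value; the function name `bound_term` and the search range of 1 to 200 are my own choices.

```python
from fractions import Fraction

def bound_term(r, s):
    """The expression 15/r + 15/s from the text, as an exact fraction."""
    return Fraction(15, r) + Fraction(15, s)

# Separate surveys: smallest total r + s with 15/r + 15/s <= 1.
candidates = [
    (r + s, r, s)
    for r in range(1, 201)
    for s in range(1, 201)
    if bound_term(r, s) <= 1
]
total, r, s = min(candidates)
print(total, r, s)

# Single SRS: r is about 0.25 n and s about 0.75 n, so the condition
# 15/(0.25 n) + 15/(0.75 n) = 1 gives n = 15/0.25 + 15/0.75.
n_srs = Fraction(15) / Fraction(1, 4) + Fraction(15) / Fraction(3, 4)
print(n_srs)
```

The comparison of totals 60 and 80 shows what it costs not to be able to sample owners and renters separately.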
Let M be the number of distinct units in the sample. Then any student at this level should be able to compute P_i=Prob(M=i) and get Cochran's answer. Now given M=i the estimate ybar' is the sample mean for a SRS of size i. Thus its conditional mean is Ybar for all values of i and so its unconditional mean is Ybar. Hence you may compute its variance by averaging the conditional variances of ybar' given M=i over values of i. These conditional variances are given by formula 2.8 with n replaced by i. Multiplying the answer in 2.8 by the corresponding P_i and adding gives the formula in Cochran's question in the sentence beginning `One way ...'. The approximation in the previous formula is obtained by multiplying out (2N-1)(N-1)/(6N^2) and discarding the term 1/(6N^2) since this is smaller than terms with only 1/N in them. Finally the inequality V(ybar') < V(ybar) must be verified by comparing the formula for V(ybar') with 2.8 with n=3.
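
If you want to convince yourself of the formula before doing the algebra, you can enumerate all N^3 ordered results of 3 draws with replacement for a small population. In this sketch the population values are purely illustrative (any numbers would do), and the probabilities 1/N^2, 3(N-1)/N^2 and (N-1)(N-2)/N^2 used in the check are what I get for P_1, P_2, P_3.

```python
from fractions import Fraction
from itertools import product

population = [8, 3, 1, 11, 4, 7]  # illustrative values only
N = len(population)
Ybar = Fraction(sum(population), N)
S2 = sum((y - Ybar) ** 2 for y in population) / (N - 1)

# Draw unit indices rather than values, so units with equal y stay distinct.
outcomes = list(product(range(N), repeat=3))
counts = {1: 0, 2: 0, 3: 0}
mean_sum = Fraction(0)
sq_sum = Fraction(0)
for draw in outcomes:
    distinct = set(draw)  # the M distinct units in this sample
    counts[len(distinct)] += 1
    est = Fraction(sum(population[i] for i in distinct), len(distinct))  # ybar'
    mean_sum += est
    sq_sum += (est - Ybar) ** 2

P = {i: Fraction(k, len(outcomes)) for i, k in counts.items()}
E_est = mean_sum / len(outcomes)
V_est = sq_sum / len(outcomes)
formula = Fraction((2 * N - 1) * (N - 1), 6 * N ** 2) * S2
print(P, E_est == Ybar, V_est == formula)
```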
I will eventually distribute a detailed answer to this question. It can be done as follows: write the estimate as ybar + R where R is a random variable which is equal to c if we get y_1 but not y_N (event A), -c if we get y_N but not y_1 (event B) and 0 otherwise (event C). The expected value of our estimate is E(ybar)+E(R) = Ybar + [cP(A) -cP(B) +0P(C)] which is just Ybar because P(A)=P(B). Next var(ybar+R) = var(ybar) + var(R) + 2 cov(ybar,R). You have a formula, 2.8, for var(ybar) and you already know that E(R)=0 so var(R) = E(R^2) =c^2 P(A) + (-c)^2 P(B). Now you actually need to work out P(A) and P(B) and see that these are N-2 choose n-1 over N choose n. Finally computation of cov(ybar, R) = E(R x ybar) requires you to average separately over the events A and B. On the event A, for example ybar is (y_1 + total of n-1 chosen from the set {y_2,...y_(N-1)} ) divided by n. On the same event R is just c. So E(R x ybar | A ) is c(y_1+ (n-1) Ybar2)/n where Ybar2 is the mean of the N-2 numbers {y_2,...y_(N-1)}. Now you can finish the algebra, particularly since the Ybar2 values cancel out.
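
Until the detailed answer is available, the pieces of this argument can be checked numerically by listing every sample. This is a sketch under made-up inputs: the population `y`, the sample size `n = 3` and the constant `c = 2` are hypothetical and not taken from the question. The covariance it is checked against, c P(A)(y_1 - y_N)/n, is what I get after the Ybar2 terms cancel.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

y = [1, 4, 5, 7, 9, 12]  # hypothetical population, y_1 first and y_N last
N, n, c = len(y), 3, 2   # hypothetical sample size and constant c
Ybar = Fraction(sum(y), N)

samples = list(combinations(range(N), n))  # all SRS of size n, equally likely
p = Fraction(1, len(samples))

def R(sample):
    """c on event A, -c on event B, 0 otherwise."""
    has_first = 0 in sample
    has_last = (N - 1) in sample
    if has_first and not has_last:
        return c
    if has_last and not has_first:
        return -c
    return 0

def ybar(sample):
    return Fraction(sum(y[i] for i in sample), n)

P_A = sum(p for s in samples if R(s) == c)
E_est = sum(p * (ybar(s) + R(s)) for s in samples)
var_est = sum(p * (ybar(s) + R(s) - Ybar) ** 2 for s in samples)
var_ybar = sum(p * (ybar(s) - Ybar) ** 2 for s in samples)
var_R = sum(p * R(s) ** 2 for s in samples)
cov = sum(p * R(s) * ybar(s) for s in samples)  # E(R x ybar), since E(R) = 0
print(P_A, E_est == Ybar, var_est == var_ybar + var_R + 2 * cov)
```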
The hard part is the computation of the mean and variance of this estimate. But E(y_1 + 6 ybar2 + y_8) = E(y_1) + 6 E(ybar2) + E(y_8). Remember that y_1 and y_8 are parameters; they do not change from sample to sample. Moreover, E(ybar2) = Ybar2 and 6Ybar2 = y_2+...+y_7. Thus E(y_1 + 6 ybar2 + y_8) = Y and so the estimate in the problem is unbiased. Next var(y_1/8+ 6ybar2/8+ y_8/8) = var(3ybar2/4) because y_1 and y_8 are constants. Thus var(ybarst) = 9var(ybar2)/16 and the latter is the value of 2.8 for N=6, n=2 and the population {y_2,...y_7}. The rest of this problem was plugging in numbers.
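
The unbiasedness and the factor 9/16 can be checked by listing all 15 samples of size 2 from the middle six units. The eight population values below are hypothetical (the problem's data are not repeated here), and I am assuming that formula 2.8 is the usual SRS variance (1 - n/N) S^2/n.

```python
from fractions import Fraction
from itertools import combinations

y = [1, 3, 4, 6, 7, 9, 10, 20]  # hypothetical population of 8 units
middle = y[1:7]                  # y_2, ..., y_7
Ybar = Fraction(sum(y), 8)

# ybarst = (y_1 + 6 ybar2 + y_8) / 8 for each SRS of size 2 from the middle six.
samples = list(combinations(middle, 2))
estimates = [
    Fraction(y[0], 8) + Fraction(6, 8) * Fraction(a + b, 2) + Fraction(y[7], 8)
    for a, b in samples
]
E_est = sum(estimates) / len(estimates)
V_est = sum((e - Ybar) ** 2 for e in estimates) / len(estimates)

# Assumed form of formula 2.8 with N = 6, n = 2 for the population {y_2, ..., y_7}.
Ybar2 = Fraction(sum(middle), 6)
S2_mid = sum((v - Ybar2) ** 2 for v in middle) / 5
var_ybar2 = (1 - Fraction(2, 6)) * S2_mid / 2
print(E_est == Ybar, V_est == Fraction(9, 16) * var_ybar2)
```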

The questions.