STAT 410 96-2 Assignment 2 Solutions
- I found S^2 to be 13.466... There are 20 possible samples of size 3 and 20
corresponding values of s^2, listed in the following table. I used Excel to compute
all these values. The 20 values of s^2 average to 13.466..., the same as S^2.
Sample | s^2 | Sample | s^2 | Sample | s^2 | Sample | s^2 |
8 3 1 | 39/3 | 8 3 11 | 49/3 | 8 3 4 | 21/3 | 8 3 7 | 21/3 |
8 1 11 | 79/3 | 8 1 4 | 37/3 | 8 1 7 | 43/3 | 8 11 4 | 37/3 |
8 11 7 | 13/3 | 8 4 7 | 13/3 | 3 1 11 | 84/3 | 3 1 4 | 7/3 |
3 1 7 | 28/3 | 3 11 4 | 57/3 | 3 11 7 | 48/3 | 3 4 7 | 13/3 |
1 11 4 | 79/3 | 1 11 7 | 76/3 | 1 4 7 | 27/3 | 11 4 7 | 37/3 |
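If you would rather not use Excel, the table above can be checked with a short script. This is only a sketch: it assumes the population is the six values 8, 3, 1, 11, 4, 7 that appear in the samples, and the names `population` and `sample_variance` are mine, not from the text.

```python
from fractions import Fraction
from itertools import combinations

# Assumed population: the six values that appear in the table of samples.
population = [8, 3, 1, 11, 4, 7]

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by (count - 1)."""
    count = len(values)
    mean = Fraction(sum(values), count)
    return sum((v - mean) ** 2 for v in values) / (count - 1)

# Population S^2 uses the same divisor, N - 1.
S2 = sample_variance(population)

# All 20 samples of size 3 drawn without replacement, with their s^2 values.
samples = list(combinations(population, 3))
s2_values = [sample_variance(s) for s in samples]
mean_s2 = sum(s2_values) / len(s2_values)

for s, s2 in zip(samples, s2_values):
    print(s, s2)
print("S^2 =", float(S2), " average s^2 =", float(mean_s2))
```

Exact fractions are used so that the comparison between S^2 and the average of the s^2 values is not blurred by rounding.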
- There are 36 possible results of 2 draws with replacement, though some of them
give the same value of y bar; e.g., 8 followed by 3 gives the same value as 3 followed by 8.
Here is part of my table of values of (y bar - Y bar)^2 with the corresponding
probabilities. You multiply and add and check that the answer is the same
as the formulas given. The point is that V(ybar) is an expected value,
which you compute by taking value times probability and adding.
Sample | Probability x 36 | (y bar -Y bar)^2 |
8 8 | 1 | (8-34/6)^2 |
8 3 | 2 | (11/2-34/6)^2 |
8 1 | 2 |
8 11 | 2 |
8 4 | 2 |
8 7 | 2 |
3 3 | 1 |
3 1 | 2 |
3 11 | 2 |
3 4 | 2 |
3 7 | 2 |
1 1 | 1 |
1 11 | 2 |
1 4 | 2 |
1 7 | 2 |
11 11 | 1 |
11 4 | 2 |
11 7 | 2 |
4 4 | 1 |
4 7 | 2 |
7 7 | 1 |
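The multiply-and-add step can be sketched as follows. I am assuming here that the formula to compare against is the usual with-replacement result sigma^2/n, where sigma^2 divides by N rather than N-1; check that this is the one your text gives. The variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

population = [8, 3, 1, 11, 4, 7]
N = len(population)
n = 2  # two draws with replacement
Ybar = Fraction(sum(population), N)

# Each of the 36 ordered pairs has probability 1/36, so the expected value
# of (ybar - Ybar)^2 is the plain average over all ordered pairs.
var_by_enumeration = sum(
    (Fraction(a + b, n) - Ybar) ** 2
    for a, b in product(population, repeat=n)
) / N ** n

# Assumed comparison formula: sigma^2 / n, with sigma^2 using divisor N.
sigma2 = sum((y - Ybar) ** 2 for y in population) / N
var_by_formula = sigma2 / n

print(var_by_enumeration, var_by_formula)
```

Averaging over ordered pairs is the same as weighting each unordered pair in the table by 1 or 2 out of 36.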
- The estimated population total is about 51,473 and 10% of
that is 5147. You are therefore asked to compute the
probability that an estimate comes within 5147 of the corresponding
parameter value. You convert 5147 to standard units by estimating
the standard error of the estimated population total using formula 2.22 in
the text and dividing 5147 by this figure. The result is about
1.55. Thus the probability asked for is the probability that a
standard normal is between -1.55 and 1.55, which is about 88%, or
roughly 0.9, as Cochran gives. Note that this probability is not
estimated very precisely because you have had to plug in a guess
for 10% of the population total.
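The last step, going from 1.55 standard units to a probability, can be done without tables. A minimal sketch using only the standard library; the helper name `standard_normal_cdf` is mine.

```python
import math

def standard_normal_cdf(z):
    """Standard normal CDF written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

z = 1.55  # 5147 divided by the estimated standard error, as in the answer above
prob = standard_normal_cdf(z) - standard_normal_cdf(-z)
print(round(prob, 2))
```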
- You need to see that a sample size of 12 makes 2 standard errors
less than $200 while a sample size of 11 does not. You are given the
value of S^2 for the population of N=36 shelves today and will have to
assume that the value of S^2 on the future occasion when the shelves
are resampled will be about the same. Then you just plug 12 and 11
for n into 2.13 and compare to the desired value for 1 SE of $100.
- When you can carry out separate surveys of owners and renters
you are able to choose two separate sample sizes, say r and s, which
will add up to the total sample size n. The variance of the
difference (the square of its standard error) is about 15/r+15/s and
you must choose this to be no more
than 1. There are many solutions, so it's up to you to see that you should
take the solution which minimizes n=r+s. This gives r=s=30. When
you cannot separate out owners and renters in advance you will have to
take a SRS of n from the total population. You will then get a value
of r approximately equal to 0.25n and s equal to roughly 0.75n leading
to the condition 15/(0.25n) + 15/(0.75n) = 1 giving n=80. Note
that the actual sample sizes achieved will be random and there is
a substantial probability that the number of renters will actually
be less than 0.25n = 20 in which case the standard error
of the difference will be larger than 1.
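Both sample sizes can be confirmed by a small search over the condition 15/r + 15/s <= 1. The last part of the sketch puts a number on the "substantial probability" by treating the number of renters in the SRS as binomial with n = 80 and p = 0.25; that is my simplifying assumption (it ignores the finite population correction), not something stated in the problem.

```python
from fractions import Fraction
from math import comb

# Separate surveys: smallest r + s with 15/r + 15/s <= 1.
feasible = [
    (r, s)
    for r in range(1, 200)
    for s in range(1, 200)
    if Fraction(15, r) + Fraction(15, s) <= 1
]
best = min(feasible, key=lambda pair: pair[0] + pair[1])

# Single SRS: 15/(0.25 n) + 15/(0.75 n) = 1 rearranges to n = 15/0.25 + 15/0.75.
n_srs = Fraction(15) / Fraction(1, 4) + Fraction(15) / Fraction(3, 4)

# Assumed binomial model for the number of renters in the SRS.
n, p = 80, 0.25
p_fewer_than_20 = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(20))

print(best, n_srs, round(p_fewer_than_20, 2))
```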
- Let M be the number of distinct units in the sample. Then
any student at this level should be able to compute P_i=Prob(M=i)
and get Cochran's answer. Now given M=i the estimate ybar' is the
sample mean for a SRS of size i. Thus its conditional mean is Ybar
for all values of i and so its unconditional mean is Ybar.
Hence you may compute its variance by averaging the conditional variances
of ybar' given M=i over values of i. These conditional variances are
given by formula 2.8 with n replaced by i. Multiplying the answer in
2.8 by the corresponding P_i and adding gives the formula in Cochran's
question in the sentence beginning `One way ...'. The approximation in
the previous formula is obtained by multiplying out (2N-1)(N-1)/(6N^2)
and discarding the term 1/(6N^2) since this is smaller than terms with only
1/N in them. Finally the inequality V(ybar') < V(ybar) must be verified
by comparing the formula for V(ybar') with 2.8 with n=3.
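Here is a sketch that checks the algebra exactly for a range of N. It assumes 3 draws with replacement, takes formula 2.8 to be (1/i - 1/N) S^2 for a SRS of size i, and uses P_i values that I worked out by counting ordered triples; none of these are quoted from Cochran, so check them against your own derivation. The last assertion compares with (N-1)/(3N), the coefficient for the ordinary mean of 3 draws with replacement, which is my reading of what V(ybar) means in the inequality.

```python
from fractions import Fraction

def distinct_count_probs(N):
    """P(M = i), i = 1, 2, 3, for 3 draws with replacement from N units."""
    total = N ** 3  # number of equally likely ordered triples
    return {
        1: Fraction(N, total),                       # all three draws equal
        2: Fraction(3 * N * (N - 1), total),         # exactly two distinct units
        3: Fraction(N * (N - 1) * (N - 2), total),   # all draws distinct
    }

def variance_factor(N):
    """Coefficient of S^2 in V(ybar'): sum over i of P_i * (1/i - 1/N)."""
    probs = distinct_count_probs(N)
    return sum(p * (Fraction(1, i) - Fraction(1, N)) for i, p in probs.items())

for N in range(3, 60):
    exact = Fraction((2 * N - 1) * (N - 1), 6 * N * N)
    assert sum(distinct_count_probs(N).values()) == 1
    assert variance_factor(N) == exact
    # The approximation 1/3 - 1/(2N) drops exactly the term 1/(6 N^2).
    assert exact - (Fraction(1, 3) - Fraction(1, 2 * N)) == Fraction(1, 6 * N * N)
    # Assumed comparison: ordinary mean of 3 draws with replacement.
    assert exact < Fraction(N - 1, 3 * N)
```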
- I will eventually distribute a detailed answer to this question.
It can be done as follows: write the estimate as ybar + R where R
is a random variable which is equal to c if we get y_1 but not y_N
(event A), -c if we get y_N but not y_1 (event B) and 0 otherwise
(event C). The expected value
of our estimate is E(ybar)+E(R) = Ybar + [cP(A) -cP(B) +0P(C)] which
is just Ybar because P(A)=P(B). Next var(ybar+R) = var(ybar) + var(R)
+ 2 cov(ybar,R). You have a formula, 2.8, for var(ybar) and you already
know that E(R)=0 so var(R) = E(R^2) =c^2 P(A) + (-c)^2 P(B). Now
you actually need to work out P(A) and P(B) and see that these are
N-2 choose n-1 over N choose n. Finally computation of cov(ybar, R)
= E(R x ybar) requires you to average separately over the events A and
B. On the event A, for example ybar is (y_1 + total of n-1 chosen from
the set {y_2,...y_(N-1)} ) divided by n.
On the same event R is just c. So E(R x ybar | A ) is c(y_1+ (n-1) Ybar2)/n
where Ybar2 is the mean of the N-2 numbers {y_2,...y_(N-1)}. Now you
can finish the algebra, particularly since the Ybar2 values cancel out.
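Until the detailed answer is available, you can check your algebra by brute force. The sketch below enumerates every sample for a small made-up case and compares the exact mean and variance of ybar + R with the pieces described above. The population, n = 3 and c = 1/2 are hypothetical choices of mine, and the covariance term is my own completion of the algebra, so treat the comparison as a check rather than as the official answer.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

y = [1, 3, 4, 7, 8, 11]   # hypothetical population; y[0] is y_1, y[-1] is y_N
N, n = len(y), 3          # hypothetical sample size
c = Fraction(1, 2)        # hypothetical constant

def estimate(idx):
    """ybar + R for the sample whose unit indices are idx."""
    ybar = Fraction(sum(y[i] for i in idx), n)
    has_first = 0 in idx
    has_last = (N - 1) in idx
    if has_first and not has_last:   # event A
        return ybar + c
    if has_last and not has_first:   # event B
        return ybar - c
    return ybar                      # event C

samples = list(combinations(range(N), n))
estimates = [estimate(s) for s in samples]
mean_est = sum(estimates) / len(samples)
var_est = sum((e - mean_est) ** 2 for e in estimates) / len(samples)

Ybar = Fraction(sum(y), N)
S2 = sum((v - Ybar) ** 2 for v in y) / (N - 1)
P_A = Fraction(comb(N - 2, n - 1), comb(N, n))   # equals P(B)
var_ybar = S2 / n * Fraction(N - n, N)           # formula 2.8, as I read it
var_R = 2 * c ** 2 * P_A
cov = c * P_A * Fraction(y[0] - y[-1], n)        # Ybar2 terms have cancelled
var_formula = var_ybar + var_R + 2 * cov

print(mean_est == Ybar, var_est == var_formula)
```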
- The hard part is the computation of the mean and variance of this
estimate. But E(y_1 + 6 ybar2 + y_8) = E(y_1) + 6 E(ybar2) + E(y_8). Remember
that y_1 and y_8 are parameters; they do not change from sample to sample.
Moreover, E(ybar2) = Ybar2 and 6Ybar2 = y_2+...+y_7. Thus E(y_1 + 6 ybar2 +
y_8) = Y and so the estimate in the problem is unbiased. Next
var(y_1/8+ 6ybar2/8+ y_8/8) = var(3ybar2/4) because y_1 and y_8 are
constants. Thus var(ybarst) = 9var(ybar2)/16 and the latter is the
value of 2.8 for N=6, n=2 and the population {y_2,...y_7}. The rest
of this problem was plugging in numbers.
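The mean and variance argument can be checked by listing all 15 samples of size 2 from the middle six units. The eight values below are made up for illustration (the actual numbers are in the question), and I am taking formula 2.8 to be (S^2/n)(N-n)/N.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical population of 8 values; y_1 and y_8 are always in the sample.
y = [2, 5, 6, 8, 9, 11, 12, 20]
middle = y[1:7]   # y_2, ..., y_7
n2 = 2            # units sampled from the middle six

# Estimate of the population mean: (y_1 + 6 * ybar2 + y_8) / 8.
estimates = [
    (y[0] + 6 * Fraction(sum(s), n2) + y[7]) / 8
    for s in combinations(middle, n2)
]
mean_est = sum(estimates) / len(estimates)
var_est = sum((e - mean_est) ** 2 for e in estimates) / len(estimates)

Ybar = Fraction(sum(y), 8)
Ybar2 = Fraction(sum(middle), 6)
S2_middle = sum((v - Ybar2) ** 2 for v in middle) / 5
# 9/16 times formula 2.8 with N = 6, n = 2, applied to the middle six values.
var_formula = Fraction(9, 16) * S2_middle / n2 * Fraction(6 - n2, 6)

print(mean_est == Ybar, var_est == var_formula)
```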