
STAT 350: Lecture 29

Power and Sample Size Calculations

Definition: The power function of a test procedure in a model with parameters $\theta$ is $P_\theta(\text{Reject})$, the probability that the test rejects, viewed as a function of $\theta$.

Definition: The non-central t distribution with non-centrality parameter $\delta$ and degrees of freedom $\nu$ is the distribution of

$$T = \frac{Z+\delta}{\sqrt{U/\nu}}$$

where Z is $N(0,1)$, U is $\chi^2_\nu$, and Z and U are independent. When $\delta = 0$ we get the usual, or central, t distribution.
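The definition can be checked by simulation. The sketch below is only an illustration: it draws T = (Z + delta) / sqrt(U / df) directly from independent Z and U and compares the sample mean with the known mean of the non-central t. The function names, the seed, and the choice delta = 2, df = 14 are assumptions made for this example.

```python
import math
import random

def draw_noncentral_t(delta, df, rng):
    """One draw of T = (Z + delta) / sqrt(U / df), with Z ~ N(0,1) and U ~ chi^2_df independent."""
    z = rng.gauss(0.0, 1.0)
    u = rng.gammavariate(df / 2.0, 2.0)  # chi-square(df) is Gamma(shape=df/2, scale=2)
    return (z + delta) / math.sqrt(u / df)

def noncentral_t_mean(delta, df):
    """Theoretical mean of the non-central t distribution (finite for df > 1)."""
    return delta * math.sqrt(df / 2.0) * math.gamma((df - 1) / 2.0) / math.gamma(df / 2.0)

rng = random.Random(350)   # fixed seed so the run is reproducible
delta, df = 2.0, 14        # illustrative values, not taken from the notes at this point
draws = [draw_noncentral_t(delta, df, rng) for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)
print(sample_mean, noncentral_t_mean(delta, df))
```

With delta = 0 the theoretical mean is 0, which matches the central t distribution.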

Fact: If a is a vector of length p and $c$ is some scalar then

$$\frac{a^T\hat\beta - c}{\hat\sigma\sqrt{a^T(X^TX)^{-1}a}}$$

has a non-central t distribution with non-centrality parameter

$$\delta = \frac{a^T\beta - c}{\sigma\sqrt{a^T(X^TX)^{-1}a}}$$

Power of two-sided t tests is read from Table B 5. It is normally computed before the experiment, based on assumptions about tex2html_wrap_inline133 , $\sigma$ and $X^TX$.
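As a rough illustration of how the non-centrality parameter depends on the design, the sketch below evaluates (a^T beta - c) / (sigma * sqrt(a^T (X^T X)^{-1} a)), the standard form for a t test of a^T beta = c. The design matrix, the coefficient values, sigma, and the helper names `solve` and `noncentrality` are all hypothetical and chosen only for this example.

```python
import math

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def noncentrality(X, a, beta, sigma, c=0.0):
    """delta = (a^T beta - c) / (sigma * sqrt(a^T (X^T X)^{-1} a))."""
    p = len(a)
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    v = solve(XtX, a)                           # v = (X^T X)^{-1} a
    quad = sum(ai * vi for ai, vi in zip(a, v))  # a^T (X^T X)^{-1} a
    effect = sum(ai * bi for ai, bi in zip(a, beta)) - c
    return effect / (sigma * math.sqrt(quad))

# Hypothetical straight-line design: intercept plus one covariate at 5 points.
X = [[1.0, x] for x in (0.0, 1.0, 2.0, 3.0, 4.0)]
a = [0.0, 1.0]                                  # picks out the slope
delta_1 = noncentrality(X, a, beta=[1.0, 0.5], sigma=1.0)
delta_4 = noncentrality(X * 4, a, beta=[1.0, 0.5], sigma=1.0)  # 4 replicates of the design
print(delta_1, delta_4)
```

Stacking 4 copies of the design multiplies X^T X by 4, so the non-centrality parameter doubles.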

Sample Size determination

Before an experiment is run it is sensible, if the experiment is costly, to try to work out whether or not it is worth doing. You will only do the experiment if the probabilities of Type I and Type II errors are both reasonably low. The simplest case arises when you prespecify a level, say tex2html_wrap_inline147 , and an acceptable probability of Type II error, tex2html_wrap_inline149 , say 0.10, for testing a null hypothesis like tex2html_wrap_inline151 . Then you need to specify

F tests

Simplest example: regression through the origin (no intercept term.)

Suppose now that the null hypothesis is false.

FACT:

If W is a $MVN(\mu, I)$ random vector and Q is idempotent with rank p then $W^TQW$ has a non-central $\chi^2$ distribution with non-centrality parameter

$$\delta^2 = \mu^TQ\mu$$

and p degrees of freedom. This is the same distribution as that of

$$(Z_1+\delta)^2 + Z_2^2 + \cdots + Z_p^2$$

where the $Z_i$ are iid standard normals. An ordinary $\chi^2$ variable is called central and has $\delta = 0$.

FACT:

If U and V are independent $\chi^2$ variables with degrees of freedom $\nu_1$ and $\nu_2$, V is central and U is non-central with non-centrality parameter $\delta^2$ then

$$F = \frac{U/\nu_1}{V/\nu_2}$$

is said to have a non-central F distribution with non-centrality parameter $\delta^2$ and degrees of freedom $\nu_1$ and $\nu_2$.
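The two facts can be illustrated by simulation. The sketch below assumes the usual shift representation: a non-central chi-square on p degrees of freedom is (Z_1 + delta)^2 + Z_2^2 + ... + Z_p^2, and a non-central F is the ratio of such a variable to an independent central chi-square, each divided by its degrees of freedom. The function names, seed, and parameter values are assumptions made for this example. The sample means are compared with the standard values p + delta^2 and nu2 (nu1 + delta^2) / (nu1 (nu2 - 2)).

```python
import random

def draw_noncentral_chi2(delta, p, rng):
    """(Z_1 + delta)^2 + Z_2^2 + ... + Z_p^2 with iid standard normal Z_i."""
    total = (rng.gauss(0.0, 1.0) + delta) ** 2
    for _ in range(p - 1):
        total += rng.gauss(0.0, 1.0) ** 2
    return total

def draw_noncentral_f(delta, nu1, nu2, rng):
    """(U / nu1) / (V / nu2): U non-central chi-square, V central chi-square, independent."""
    u = draw_noncentral_chi2(delta, nu1, rng)
    v = rng.gammavariate(nu2 / 2.0, 2.0)  # central chi-square(nu2) as Gamma(nu2/2, scale 2)
    return (u / nu1) / (v / nu2)

rng = random.Random(29)                   # fixed seed for reproducibility
delta, nu1, nu2, n = 2.0, 3, 20, 100_000  # illustrative values only
mean_u = sum(draw_noncentral_chi2(delta, nu1, rng) for _ in range(n)) / n
mean_f = sum(draw_noncentral_f(delta, nu1, nu2, rng) for _ in range(n)) / n
expected_u = nu1 + delta ** 2
expected_f = nu2 * (nu1 + delta ** 2) / (nu1 * (nu2 - 2))
print(mean_u, expected_u, mean_f, expected_f)
```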

POWER CALCULATIONS

SAMPLE SIZE CALCULATIONS

Examples

POWER of t test: SAND and FIBRE example. See Lecture 11

Consider fitting the model

displaymath289

Compute power of t test of tex2html_wrap_inline293 for the alternative tex2html_wrap_inline295 . (This is roughly the fitted value. In practice, however, this value needs to be specified before collecting data so you just have to guess or use experience with previous related data sets or work out a value which would make a difference big enough to matter compared to the straight line.)

Need to assume a value for $\sigma$. I take 2.5, a nice round number near the fitted value. Again, in practice, you will have to make this number up in some reasonable way.

Finally tex2html_wrap_inline299 and $a^T(X^TX)^{-1}a$ have to be computed. For the design actually used this is tex2html_wrap_inline303 . Now the non-centrality parameter $\delta$ is 2. The power of a two-sided t test at level 0.05 with 18-4=14 degrees of freedom is 0.46 (from Table B 5, page 1346).
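The tabled power can be approximated without the table. The sketch below is one possible numerical approach, not the method behind Table B 5: it conditions on U in T = (Z + delta) / sqrt(U / df), integrates the normal tail probabilities against the chi-square density with a midpoint rule, and finds the critical value by bisection on the central case. The function names, grid size, and integration range are assumptions of this example, and the integration assumes df >= 2.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF

def reject_prob(delta, df, crit, steps=4000):
    """P(|T| > crit) for T = (Z + delta) / sqrt(U / df); assumes df >= 2."""
    upper = df + 10.0 * math.sqrt(2.0 * df) + 40.0  # chi-square mass above this is negligible
    h = upper / steps
    log_norm = -(df / 2.0) * math.log(2.0) - math.lgamma(df / 2.0)
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h  # midpoint rule over the chi-square density of U
        dens = math.exp((df / 2.0 - 1.0) * math.log(u) - u / 2.0 + log_norm)
        s = crit * math.sqrt(u / df)
        # Given U = u: |T| > crit  <=>  Z > s - delta  or  Z < -s - delta
        total += (PHI(delta - s) + PHI(-delta - s)) * dens * h
    return total

def t_critical(df, alpha=0.05):
    """Two-sided critical value: solve reject_prob(0, df, c) = alpha by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(40):
        mid = (lo + hi) / 2.0
        if reject_prob(0.0, df, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

crit = t_critical(14)               # 14 error degrees of freedom, level 0.05
power = reject_prob(2.0, 14, crit)  # non-centrality parameter 2
print(round(crit, 3), round(power, 2))
```

The result should land close to the tabled value of 0.46, up to the rounding in the table.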

Notice that you need to specify tex2html_wrap_inline277 , tex2html_wrap_inline313 (or even tex2html_wrap_inline315 and $\sigma$) and the design!

SAMPLE SIZE NEEDED using t test: SAND and FIBRE example.

Now, for the same assumed values of the parameters, how many replicates of the basic design (using 9 combinations of sand and fibre contents) would I need to get a power of 0.95? The matrix $X^TX$ for m replicates of the design actually used is m times the same matrix for 1 replicate. This means that $a^T(X^TX)^{-1}a$ will be 1/m times the same quantity for 1 replicate. Thus the value of the non-centrality parameter $\delta$ for m replicates will be $\sqrt{m}$ times the value for our design, which was 2. With m replicates the degrees of freedom for the t-test will be 18m-6. We now need to find a value of m so that in the row in Table B 5 across from 18m-6 degrees of freedom and the column corresponding to

$$\delta = 2\sqrt{m}$$

we find 0.95. To simplify, we try just assuming that the solution m is quite large and use the last line of the table. We get a non-centrality parameter $\delta$ between 3 and 4, say about 3.75. Now set $2\sqrt{m} = 3.75$ and solve to find m=3.52, which would have to be rounded up to 4, meaning a total sample size of $4 \times 18 = 72$. For this value of m the non-centrality parameter is actually 4 (not the target of 3.75, because of rounding) and the power is 0.98. Notice that for this value of m the degrees of freedom for error is 66, which is so far down the table that the powers are not much different from the $\infty$ line.
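The "last line of the table" shortcut can be sketched in code by replacing the non-central t with a normal approximation, which is what infinite degrees of freedom amounts to. The function names below are hypothetical. The only inputs taken from the notes are the single-replicate non-centrality parameter 2, the level 0.05, the target power 0.95, and 18 observations per replicate.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def power_large_df(delta, alpha=0.05):
    """Two-sided power when the error degrees of freedom are treated as infinite."""
    z = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return STD_NORMAL.cdf(delta - z) + STD_NORMAL.cdf(-delta - z)

def replicates_needed(delta_one, target_power, alpha=0.05):
    """Smallest m with power >= target when the non-centrality is delta_one * sqrt(m)."""
    m = 1
    while power_large_df(delta_one * math.sqrt(m), alpha) < target_power:
        m += 1
    return m

m = replicates_needed(2.0, 0.95)                 # delta = 2 for one replicate
total_n = 18 * m                                 # 18 observations per replicate
achieved = power_large_df(2.0 * math.sqrt(m))
print(m, total_n, round(achieved, 2))
```

Because m must be a whole number, the achieved power overshoots the target, as in the hand calculation.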

POWER of F test: SAND and FIBRE example.

Now consider the power of the test that all the higher order terms are 0 in the model

displaymath367

that is the power of the F test of tex2html_wrap_inline371 .

You will need to specify the non-centrality parameter for this F test. In general the non-centrality parameter for an F test based on $\nu_1$ numerator degrees of freedom is given by

displaymath379

This quantity needs to be worked out algebraically for each separate case; however, some general points can be made.





Richard Lockhart
Thu Mar 13 23:01:20 PST 1997