Stat403/650

More Complex Experimental Designs

In this section we are interested in recognizing and understanding three other types of experimental designs. These are designs with subsampling, split plot designs, and repeated measures designs.

Chapter 8 of Carl Schwarz' notes deals with subsampling and "pseudo-replication" in experiments. Chapter 10 covers split-plot experimental designs.

Some main ideas: In subsampling designs, the observational units are a random subsample from each of the experimental units to which treatments are assigned. In split plot designs, one treatment factor is applied to large experimental units while another treatment factor is assigned to smaller experimental units within the larger units (whole plots). In repeated measures designs, observations are made repeatedly over time on the same unit, often under different conditions.

Learning goals: Recognize these design elements when you see them and understand why the analysis may need to be different to accommodate these features.

Elements of these design features often show up in ecological experiments, environmental impact studies, resource management experiments, and many other types of studies. Analyses can be somewhat complex, involving mixed effects models, time series, and other methods.

Example: Experimental design with subsampling.

The sea urchin grazing experiment of Andrew and Underwood (Andrew, N.L., and Underwood, A.J. 1993. Density-dependent foraging in the sea urchin Centrostephanus rodgersii on shallow subtidal reefs in New South Wales, Australia. Marine Ecology Progress Series 99: 89-98) is described in the book Experimental Design and Data Analysis for Biologists by G.P. Quinn and M.J. Keough (2002), Cambridge University Press. The data were obtained from the book website maintained by the publisher. Since the columns of data are separated by commas rather than spaces, the data are read in with "read.csv", the "comma separated values" variation of the read.table function. (Alternatively, one can use "read.table" with the option sep=",".) A brief description of the data variables is found here.
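As a small illustration of the two ways of reading comma-separated data, the sketch below reads the same text with read.csv and with read.table plus the sep option. The three rows of values and the treatment labels are made up for illustration (only the column names TREAT, PATCH and ALGAE come from the analysis in these notes); they are not the urchin data.

```r
# Made-up comma-separated text standing in for a data file (NOT the urchin data)
csv_text <- "TREAT,PATCH,ALGAE
a,1,10
a,1,25
b,2,40"

# read.csv assumes commas as the separator
a <- read.csv(text = csv_text, header = TRUE)

# read.table needs to be told the separator explicitly
b <- read.table(text = csv_text, header = TRUE, sep = ",")

# Both calls should give the same data frame
print(all.equal(a, b))
```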

In the experiment, the density of sea urchins in the subtidal study region was manipulated to study the effects of grazing on the percentage cover of filamentous algae. The four treatment levels (no urchins, 33% of original, 66% of original, and 100% of original density) were assigned completely at random to 16 patches, so that each treatment was replicated 4 times. Then the response (algae percent cover) was measured in a simple random subsample of five quadrats in each of the patches. Thus, the experimental units are patches and the observational units are quadrats, giving 16 x 5 = 80 observations in all.
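The layout just described can be sketched in R as follows. This is only an illustration of the design structure (random assignment of treatments to patches, then quadrats within patches); the object names, treatment labels and seed are assumptions, and no real data are involved.

```r
set.seed(403)  # arbitrary seed so the randomization can be reproduced

treatments <- c("0%", "33%", "66%", "100%")

# Completely randomized assignment: each treatment goes to 4 of the 16 patches
patches <- data.frame(patch = 1:16,
                      treat = sample(rep(treatments, each = 4)))
print(table(patches$treat))

# Five quadrats (observational units) within every patch (experimental unit)
quadrats <- expand.grid(patch = 1:16, quadrat = 1:5)
layout <- merge(patches, quadrats, by = "patch")

# Number of observational units
print(nrow(layout))
```

The treatment column is constant within a patch, which is why the patches, not the quadrats, are the replicates for comparing treatments.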

> urchins <- read.csv(file="http://www.stat.sfu.ca/~thompson/stat403-650/data/andrew.csv",header=TRUE)

> anova(lm(ALGAE~TREAT + as.factor(PATCH),data=urchins))

Analysis of Variance Table

Response: ALGAE
                 Df  Sum Sq Mean Sq F value    Pr(>F)
TREAT             3 14429.1  4809.7 16.1075 6.579e-08 ***
as.factor(PATCH) 12 21241.9  1770.2  5.9282 8.323e-07 ***
Residuals        64 19110.4   298.6
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The analysis of variance table above has the right mean squares and degrees of freedom, but the wrong F value for testing treatments, because it uses the residual (subsampling) mean square in the denominator. This gives a highly inflated F value and an unrealistically small p-value, because the variance among nearby quadrats within a patch is small compared to the variability between patches. Treatments were assigned to patches, so the patches are the replicates, and the correct F value is obtained by dividing the mean square for treatments by the mean square for patches, on 3 and 12 degrees of freedom. It is calculated below together with the correct p-value. With a p-value of about 0.09, the experiment has not shown strong evidence for an effect of sea urchin density on algae cover, even though the pattern of the results is suggestive of such an effect.

> 4809.7/1770.2
[1] 2.717038
> pf(2.717038,3,12,lower.tail=F)
[1] 0.09126678
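The hand calculation above can also be obtained directly from R. The sketch below uses simulated data with the same layout as the urchin experiment (4 treatments, 4 patches each, 5 quadrats per patch), since it is meant to run without the data file; the data frame sim, its variable names, and the variance settings are all assumptions made for illustration. It computes the F ratio from the table as above, and then again using an Error() term in "aov", which puts treatments in the patch stratum.

```r
set.seed(1)  # arbitrary seed; the data below are simulated, NOT the urchin data

# 16 patches x 5 quadrats; patches 1-4 get the first treatment, and so on
sim <- expand.grid(quadrat = 1:5, patch = factor(1:16))
sim$treat <- factor(rep(c("t0", "t33", "t66", "t100"), each = 20))

# Response = mean + patch-to-patch variation + quadrat-to-quadrat variation
patch_effect <- rnorm(16, sd = 20)
sim$y <- 30 + patch_effect[as.integer(sim$patch)] + rnorm(80, sd = 10)

# Hand calculation: MS(treat) / MS(patch) from the ordinary anova table
tab <- anova(lm(y ~ treat + patch, data = sim))
F_hand <- tab["treat", "Mean Sq"] / tab["patch", "Mean Sq"]
p_hand <- pf(F_hand, tab["treat", "Df"], tab["patch", "Df"], lower.tail = FALSE)

# Same test from aov with patches declared as an error stratum
fit <- aov(y ~ treat + Error(patch), data = sim)
patch_stratum <- summary(fit)[["Error: patch"]][[1]]
F_aov <- patch_stratum[1, "F value"]

print(c(F_hand = F_hand, F_aov = F_aov, p_hand = p_hand))
```

In a balanced design such as this one the two F values agree, and the same test also results from a one-way analysis of the 16 patch means.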

-----------------------------------------------------------------------
Below, the analysis is repeated with a transformation of the response variable to get more equal variances.

> anova(lm(log(ALGAE+1)~TREAT + as.factor(PATCH),data=urchins))
Analysis of Variance Table

Response: log(ALGAE + 1)
                 Df  Sum Sq Mean Sq F value    Pr(>F)
TREAT             3  72.737  24.246 19.6046 3.972e-09 ***
as.factor(PATCH) 12 100.229   8.352  6.7536 1.145e-07 ***
Residuals        64  79.151   1.237
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

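As before, the F value for treatments is recomputed with the mean square for patches in the denominator, and the p-value is the upper tail probability (lower.tail=F). The conclusion is unchanged, with a p-value of about 0.08.

> 24.246/8.352
[1] 2.903017
> pf(2.903,3,12,lower.tail=F)
[1] 0.07857484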

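> # treatment density as a percentage of the original density;
> # this assumes the rows of the data are ordered in blocks of 20 by treatment
> treat <- c(rep(0,20),rep(33,20),rep(66,20),rep(100,20))
> plot(treat,urchins$ALGAE)

> # log(ALGAE+1), as in the analysis above, so that zero cover is not lost
> plot(treat,log(urchins$ALGAE+1))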

Example: Split-plot experiment

This split plot example is also described in Venables and Ripley (William N. Venables and Brian D. Ripley. Modern Applied Statistics with S. Fourth Edition. Springer, 2002).
The source of the oats data is Yates, F. (1935) Complex experiments, Journal of the Royal Statistical Society Supplement 2, 181-247. This example is also given in the book Yates, F. (1970) Experimental design: Selected papers of Frank Yates, C.B.E, F.R.S. London: Griffin.

The R help page ("?oats") gives the following description of the data set:

    The yield of oats from a split-plot field trial using three
     varieties and four levels of manurial treatment. The experiment
     was laid out in 6 blocks of 3 main plots, each split into 4
     sub-plots. The varieties were applied to the main plots and the
     manurial treatments to the sub-plots.

library(MASS)
oats
?oats

The example on the help page for the data set, based on the description in Venables and Ripley, gives a sequence of R commands using the "aov" analysis of variance function, which allows specification of an error structure through an Error() term. More recent discussions of split plot analyses in R generally prefer the linear mixed effects function "lme" in the package "nlme".
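A minimal sketch of the two approaches, assuming the MASS and nlme packages are installed (both come with standard R distributions). The model formulas follow the split-plot structure in the help page description (blocks B, varieties V on main plots, manure levels N on sub-plots); they are one reasonable specification and not necessarily the exact commands given by Venables and Ripley.

```r
library(MASS)   # provides the oats data (columns B, V, N, Y)
library(nlme)   # provides lme()

# Classical analysis: Error(B/V) creates a block stratum and a
# main-plot (B:V) stratum; sub-plots form the within stratum
oats_aov <- aov(Y ~ N * V + Error(B/V), data = oats)
print(summary(oats_aov))

# Mixed model version: random intercepts for blocks and for
# main plots (variety within block)
oats_lme <- lme(Y ~ N * V, random = ~ 1 | B/V, data = oats)
print(anova(oats_lme))
```

In both fits the variety effect is tested against main-plot variation (10 denominator degrees of freedom), while the manure effect and the interaction are tested against sub-plot variation (45 denominator degrees of freedom).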