Example 43.4: Fisher Test with Permutation Resampling
These data, from Brown and Fears (1981), are the results from an
80-week carcinogenesis bioassay with female mice. Six tissue
sites are examined at necropsy; 1 indicates the presence of a tumor
and 0 the absence. A frequency variable Freq gives the number of mice
with each tumor pattern. A control and four doses of a drug (in parts
per million) make up the levels of the grouping variable Dose.
data a;
   input Liver Lung Lymph Cardio Pitui Ovary Freq Dose$;
   datalines;
1 0 0 0 0 0 8 CTRL
0 1 0 0 0 0 7 CTRL
0 0 1 0 0 0 6 CTRL
0 0 0 1 0 0 1 CTRL
0 0 0 0 0 1 2 CTRL
1 1 0 0 0 0 4 CTRL
1 0 1 0 0 0 1 CTRL
1 0 0 0 0 1 1 CTRL
0 1 1 0 0 0 1 CTRL
0 0 0 0 0 0 18 CTRL
1 0 0 0 0 0 9 4PPM
0 1 0 0 0 0 4 4PPM
0 0 1 0 0 0 7 4PPM
0 0 0 1 0 0 1 4PPM
0 0 0 0 1 0 2 4PPM
0 0 0 0 0 1 1 4PPM
1 1 0 0 0 0 4 4PPM
1 0 1 0 0 0 3 4PPM
1 0 0 0 1 0 1 4PPM
0 1 1 0 0 0 1 4PPM
0 1 0 1 0 0 1 4PPM
1 0 1 1 0 0 1 4PPM
0 0 0 0 0 0 15 4PPM
1 0 0 0 0 0 8 8PPM
0 1 0 0 0 0 3 8PPM
0 0 1 0 0 0 6 8PPM
0 0 0 1 0 0 3 8PPM
1 1 0 0 0 0 1 8PPM
1 0 1 0 0 0 2 8PPM
1 0 0 1 0 0 1 8PPM
1 0 0 0 1 0 1 8PPM
1 1 0 1 0 0 2 8PPM
1 1 0 0 0 1 2 8PPM
0 0 0 0 0 0 19 8PPM
1 0 0 0 0 0 4 16PPM
0 1 0 0 0 0 2 16PPM
0 0 1 0 0 0 9 16PPM
0 0 0 0 1 0 1 16PPM
0 0 0 0 0 1 1 16PPM
1 1 0 0 0 0 4 16PPM
1 0 1 0 0 0 1 16PPM
0 1 1 0 0 0 1 16PPM
0 1 0 1 0 0 1 16PPM
0 1 0 0 0 1 1 16PPM
0 0 1 1 0 0 1 16PPM
0 0 1 0 1 0 1 16PPM
1 1 1 0 0 0 2 16PPM
0 0 0 0 0 0 14 16PPM
1 0 0 0 0 0 8 50PPM
0 1 0 0 0 0 4 50PPM
0 0 1 0 0 0 8 50PPM
0 0 0 1 0 0 1 50PPM
0 0 0 0 0 1 4 50PPM
1 1 0 0 0 0 3 50PPM
1 0 1 0 0 0 1 50PPM
0 1 1 0 0 0 1 50PPM
0 1 0 0 1 1 1 50PPM
0 0 0 0 0 0 19 50PPM
;
proc multtest data=a order=data notables out=p
              permutation nsample=1000 seed=764511;
   test fisher(Liver Lung Lymph Cardio Pitui Ovary /
               lowertailed);
   class Dose;
   freq Freq;
run;
proc print data=p;
run;
In the PROC MULTTEST statement, the ORDER=DATA option is required to
keep the levels of Dose in the order in which they appear in the data set.
Without this option, the levels are sorted by their formatted value,
resulting in an alphabetic ordering. The NOTABLES option suppresses
the display of summary statistics, and the OUT=P option requests an
output data set containing p-values. The PERMUTATION option
specifies permutation resampling, NSAMPLE=1000 requests 1000
samples, and SEED=764511 provides a starting value for the
random number generator. You should specify a seed if you need to
duplicate resampling results.
The TEST statement requests a lower-tailed Fisher exact test for each
of the six tissue sites. The Fisher test is appropriate for comparing a
treatment with a control, but testing six sites at four doses produces
24 p-values, so the chance of at least one spurious significance can be
well above the nominal level of any single test.
Brown and Fears (1981) use a multivariate permutation to evaluate the
entire collection of tests. PROC MULTTEST adjusts the p-values by
simulation.
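To make the lower-tailed test concrete, the following DATA step is a minimal sketch that recomputes one raw p-value by hand. It assumes the lower tail is the hypergeometric probability that the control group shows no more tumors than it actually did, given the pooled tumor total of the two groups. The data set name fisher_check and the variables x, m, y, n, and p_lower are illustrative names, not part of the example. The counts are the Pitui tallies from the data above: 0 of 49 mice in CTRL and 3 of 50 mice in 4PPM.

```sas
data fisher_check;
   x = 0;    /* control mice with a pituitary tumor */
   m = 49;   /* control mice in total               */
   y = 3;    /* 4PPM mice with a pituitary tumor    */
   n = 50;   /* 4PPM mice in total                  */
   /* P(X <= x): m of the m+n mice are drawn, x+y of them carry a tumor */
   p_lower = cdf('HYPER', x, m + n, x + y, m);
   put p_lower= 8.5;
run;
```

With x = 0 the probability reduces to C(50,3)/C(99,3), which is about 0.12496. This agrees with the raw p-value that PROC MULTTEST reports for Pitui, CTRL vs. 4PPM.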
The treatments make up the levels of the grouping variable
Dose, listed in the CLASS statement. Since no CONTRAST statement
is specified, PROC MULTTEST uses the default pairwise contrasts with
the first level of Dose. The FREQ statement is used since
this is summary data containing frequency counts of occurrences.
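If you prefer to state the comparisons explicitly, the default contrasts can be written out with CONTRAST statements. The following sketch assumes that the coefficients are matched to the levels of Dose in ORDER=DATA order (CTRL, 4PPM, 8PPM, 16PPM, 50PPM). It should request the same four comparisons as the default, although it has not been checked that the resampled values match digit for digit.

```sas
proc multtest data=a order=data notables
              permutation nsample=1000 seed=764511;
   test fisher(Liver Lung Lymph Cardio Pitui Ovary / lowertailed);
   class Dose;
   /* one coefficient per Dose level: CTRL 4PPM 8PPM 16PPM 50PPM */
   contrast 'CTRL vs. 4PPM'  1 -1  0  0  0;
   contrast 'CTRL vs. 8PPM'  1  0 -1  0  0;
   contrast 'CTRL vs. 16PPM' 1  0  0 -1  0;
   contrast 'CTRL vs. 50PPM' 1  0  0  0 -1;
   freq Freq;
run;
```

Writing the contrasts out is mainly useful when only a subset of the comparisons is of interest, since you can then drop the statements you do not need.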
The results from this analysis are listed in Output 43.4.1.
Output 43.4.1: Fisher Test with Permutation Resampling
Model Information
Test for discrete variables: | Fisher
Tails for discrete tests:    | Lower-tailed
Strata adjustment?           | No
P-value adjustment:          | Permutation
Number of resamples:         | 1000
Seed:                        | 764511

The preceding table lists the PROC MULTTEST specifications.
Contrast Coefficients
               | Dose
Contrast       | CTRL | 4PPM | 8PPM | 16PPM | 50PPM
CTRL vs. 4PPM  |    1 |   -1 |    0 |     0 |     0
CTRL vs. 8PPM  |    1 |    0 |   -1 |     0 |     0
CTRL vs. 16PPM |    1 |    0 |    0 |    -1 |     0
CTRL vs. 50PPM |    1 |    0 |    0 |     0 |    -1

The preceding table lists the default contrasts for the Fisher
test. Note that each dose is compared with the control.
p-Values
Variable | Contrast       | Raw    | Permutation
Liver    | CTRL vs. 4PPM  | 0.2828 | 0.9640
Liver    | CTRL vs. 8PPM  | 0.3069 | 0.9770
Liver    | CTRL vs. 16PPM | 0.7102 | 1.0000
Liver    | CTRL vs. 50PPM | 0.7718 | 1.0000
Lung     | CTRL vs. 4PPM  | 0.7818 | 1.0000
Lung     | CTRL vs. 8PPM  | 0.8858 | 1.0000
Lung     | CTRL vs. 16PPM | 0.5469 | 1.0000
Lung     | CTRL vs. 50PPM | 0.8498 | 1.0000
Lymph    | CTRL vs. 4PPM  | 0.2423 | 0.9320
Lymph    | CTRL vs. 8PPM  | 0.5898 | 1.0000
Lymph    | CTRL vs. 16PPM | 0.0350 | 0.2690
Lymph    | CTRL vs. 50PPM | 0.4161 | 0.9940
Cardio   | CTRL vs. 4PPM  | 0.3163 | 0.9780
Cardio   | CTRL vs. 8PPM  | 0.0525 | 0.3630
Cardio   | CTRL vs. 16PPM | 0.4506 | 0.9990
Cardio   | CTRL vs. 50PPM | 0.7576 | 1.0000
Pitui    | CTRL vs. 4PPM  | 0.1250 | 0.7300
Pitui    | CTRL vs. 8PPM  | 0.4948 | 1.0000
Pitui    | CTRL vs. 16PPM | 0.2157 | 0.9130
Pitui    | CTRL vs. 50PPM | 0.5051 | 1.0000
Ovary    | CTRL vs. 4PPM  | 0.9437 | 1.0000
Ovary    | CTRL vs. 8PPM  | 0.8126 | 1.0000
Ovary    | CTRL vs. 16PPM | 0.7760 | 1.0000
Ovary    | CTRL vs. 50PPM | 0.3689 | 0.9940

The preceding "p-Values" table lists p-values for the
Fisher exact tests and their permutation-based adjustments. As
noted by Brown and Fears, only one of the twenty-four tests is
significant at the 5% level (Lymph, CTRL vs. 16PPM). Brown and
Fears report a 12% chance of observing at least one significant raw
p-value for 16PPM and a 9% chance of observing at least one
significant raw p-value for Lymph (both at the 5% level).
Adjusted p-values exhibit much lower chances of false
significances. For this example, none of the adjusted p-values
are close to significant.
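One way to pick out the tests discussed here is to filter the OUT= data set directly. The following sketch lists the rows of the data set p whose raw p-value is below 0.05; the 0.05 cutoff is the 5% level used in the text, and the VAR list is just a convenient selection of columns.

```sas
proc print data=p noobs;
   where raw_p < 0.05;                  /* raw significance at the 5% level */
   var _var_ _contrast_ raw_p perm_p;   /* show raw and adjusted p-values   */
run;
```

For the results in this example, the only row that satisfies the condition is Lymph, CTRL vs. 16PPM. Changing the condition to perm_p < 0.05 should select no rows, because the smallest adjusted p-value is 0.269.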
Obs | _test_ | _var_  | _contrast_     | _xval_ | _mval_ | _yval_ | _nval_ | raw_p   | perm_p | sim_se
  1 | FISHER | Liver  | CTRL vs. 4PPM  |     14 |     49 |     18 |     50 | 0.28282 | 0.964  | 0.005891
  2 | FISHER | Liver  | CTRL vs. 8PPM  |     14 |     49 |     17 |     48 | 0.30688 | 0.977  | 0.004740
  3 | FISHER | Liver  | CTRL vs. 16PPM |     14 |     49 |     11 |     43 | 0.71022 | 1.000  | 0.000000
  4 | FISHER | Liver  | CTRL vs. 50PPM |     14 |     49 |     12 |     50 | 0.77175 | 1.000  | 0.000000
  5 | FISHER | Lung   | CTRL vs. 4PPM  |     12 |     49 |     10 |     50 | 0.78180 | 1.000  | 0.000000
  6 | FISHER | Lung   | CTRL vs. 8PPM  |     12 |     49 |      8 |     48 | 0.88581 | 1.000  | 0.000000
  7 | FISHER | Lung   | CTRL vs. 16PPM |     12 |     49 |     11 |     43 | 0.54685 | 1.000  | 0.000000
  8 | FISHER | Lung   | CTRL vs. 50PPM |     12 |     49 |      9 |     50 | 0.84978 | 1.000  | 0.000000
  9 | FISHER | Lymph  | CTRL vs. 4PPM  |      8 |     49 |     12 |     50 | 0.24228 | 0.932  | 0.007961
 10 | FISHER | Lymph  | CTRL vs. 8PPM  |      8 |     49 |      8 |     48 | 0.58977 | 1.000  | 0.000000
 11 | FISHER | Lymph  | CTRL vs. 16PPM |      8 |     49 |     15 |     43 | 0.03498 | 0.269  | 0.014023
 12 | FISHER | Lymph  | CTRL vs. 50PPM |      8 |     49 |     10 |     50 | 0.41607 | 0.994  | 0.002442
 13 | FISHER | Cardio | CTRL vs. 4PPM  |      1 |     49 |      3 |     50 | 0.31631 | 0.978  | 0.004639
 14 | FISHER | Cardio | CTRL vs. 8PPM  |      1 |     49 |      6 |     48 | 0.05254 | 0.363  | 0.015206
 15 | FISHER | Cardio | CTRL vs. 16PPM |      1 |     49 |      2 |     43 | 0.45061 | 0.999  | 0.000999
 16 | FISHER | Cardio | CTRL vs. 50PPM |      1 |     49 |      1 |     50 | 0.75758 | 1.000  | 0.000000
 17 | FISHER | Pitui  | CTRL vs. 4PPM  |      0 |     49 |      3 |     50 | 0.12496 | 0.730  | 0.014039
 18 | FISHER | Pitui  | CTRL vs. 8PPM  |      0 |     49 |      1 |     48 | 0.49485 | 1.000  | 0.000000
 19 | FISHER | Pitui  | CTRL vs. 16PPM |      0 |     49 |      2 |     43 | 0.21572 | 0.913  | 0.008912
 20 | FISHER | Pitui  | CTRL vs. 50PPM |      0 |     49 |      1 |     50 | 0.50505 | 1.000  | 0.000000
 21 | FISHER | Ovary  | CTRL vs. 4PPM  |      3 |     49 |      1 |     50 | 0.94372 | 1.000  | 0.000000
 22 | FISHER | Ovary  | CTRL vs. 8PPM  |      3 |     49 |      2 |     48 | 0.81260 | 1.000  | 0.000000
 23 | FISHER | Ovary  | CTRL vs. 16PPM |      3 |     49 |      2 |     43 | 0.77596 | 1.000  | 0.000000
 24 | FISHER | Ovary  | CTRL vs. 50PPM |      3 |     49 |      5 |     50 | 0.36889 | 0.994  | 0.002442

The preceding table lists the OUT= data set. The _test_,
_var_, and _contrast_ variables provide the TEST
name, TEST variable, and CONTRAST label, respectively. The
_xval_, _mval_, _yval_, and _nval_
variables contain the components used to compute the Fisher exact
tests from the hypergeometric distribution. The raw_p
variable contains the p-values from the Fisher exact tests, and
the perm_p variable contains their permutation-based
adjustments. The variable sim_se is the simulation standard
error from the permutation resampling.
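The listed sim_se values are consistent with the usual binomial standard error of a proportion estimated from 1000 resamples. For example, perm_p = 0.964 gives sqrt(0.964 * 0.036 / 1000), which is about 0.005891. The following sketch checks this for every row of the OUT= data set. The formula is an assumption inferred from the listed values, 1000 is the NSAMPLE= value from the PROC MULTTEST statement, and the names se_check, se_binomial, and diff are illustrative.

```sas
data se_check;
   set p;
   /* binomial standard error of perm_p based on 1000 resamples */
   se_binomial = sqrt(perm_p * (1 - perm_p) / 1000);
   /* absolute difference from the value reported by PROC MULTTEST */
   diff = abs(se_binomial - sim_se);
run;

proc print data=se_check;
   var _var_ _contrast_ perm_p sim_se se_binomial diff;
run;
```

Rows with perm_p = 1.000 have a standard error of zero under this formula, which matches the zeros in the sim_se column.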
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.