Chapter Contents |
Previous |
Next |
The PHREG Procedure |
For survival data with multiple failure outcomes for each subject, the failures may be repetitions of the same kind of event or they may be events of different nature.
The Andersen-Gill (AG) model and the Prentice, Williams, and Peterson (1981) models, also referred to as the PWP models, can be used in the analysis of repeated failure outcomes of the same kind, while the marginal analysis approach of Wei, Lin, and Weissfeld (1989), also referred to as the WLW analysis, can be applied to both multiple events of the same types and multiple events of different types. The bladder cancer data listed in Wei, Lin, and Weissfeld (1989) are used to illustrate these methods of analyses.
The data consist of 86 patients with superficial bladder tumors, which were removed when they entered the study. Of these patients, 48 were randomized into the placebo group, and 38 were randomized into the thiotepa group. Many patients had multiple recurrences of tumors during the study, and new tumors were removed at each visit. The data set contains the first four recurrences of the tumor for each patient, and each recurrence time was measured from the patient's entry time into the study.
The input data consist of the following eight variables:
In the data set Bladder, four observations are created for each patient, each corresponding to one of the four tumor recurrences. In addition to values of Trt, Number, and Size for the patient, each observation contains the following variables:
For instance, a patient with only one recurrence time at month 6, who was followed until month 10, will have values for Visit, TStart, TStop, and Status of (1,0,6,1), (2,6,10,0), (3,10,10,0), and (4,10,10,0). If the follow-up time of a patient is beyond the time of the fourth tumor recurrence, an extra observation is created to represent this last follow-up period. For instance, a patient with recurrences at months 2, 15, 34, and 50, who was followed until month 61, will have five observations with values for Visit, TStart, TStop, and Status of (1,0,2,1), (2,2,15,1), (3,15,34,1), (4,34,50,1), and (5,50,61,0). In the former situation, the last two observations are redundant for the AG model, but they are important for the WLW analysis. In the latter situation, the fifth observation is not needed for the WLW model, but it is indispensable to the AG analysis.
The following SAS statements create the data set Bladder:
data Bladder; keep ID TStart TStop Status Trt Number Size Visit; retain ID TStart 0; array tt T1-T4; infile datalines missover; input Trt Time Number Size T1-T4; ID + 1; TStart=0; do over tt; Visit=_i_; if tt = . then do; TStop=Time; Status=0; end; else do; TStop=tt; Status=1; end; output; TStart=TStop; end; if (TStart < Time) then do; TStop= Time; Status=0; Visit=5; output; end; datalines; 1 0 1 1 1 1 1 3 1 4 2 1 1 7 1 1 1 10 5 1 1 10 4 1 6 1 14 1 1 1 18 1 1 1 18 1 3 5 1 18 1 1 12 16 1 23 3 3 1 23 1 3 10 15 1 23 1 1 3 16 23 1 23 3 1 3 9 21 1 24 2 3 7 10 16 24 1 25 1 1 3 15 25 1 26 1 2 1 26 8 1 1 1 26 1 4 2 26 1 28 1 2 25 1 29 1 4 1 29 1 2 1 29 4 1 1 30 1 6 28 30 1 30 1 5 2 17 22 1 30 2 1 3 6 8 12 1 31 1 3 12 15 24 1 32 1 2 1 34 2 1 1 36 2 1 1 36 3 1 29 1 37 1 2 1 40 4 1 9 17 22 24 1 40 5 1 16 19 23 29 1 41 1 2 1 43 1 1 3 1 43 2 6 6 1 44 2 1 3 6 9 1 45 1 1 9 11 20 26 1 48 1 1 18 1 49 1 3 1 51 3 1 35 1 53 1 7 17 1 53 3 1 3 15 46 51 1 59 1 1 1 61 3 2 2 15 24 30 1 64 1 3 5 14 19 27 1 64 2 3 2 8 12 13 2 1 1 3 2 1 1 1 2 5 8 1 5 2 9 1 2 2 10 1 1 2 13 1 1 2 14 2 6 3 2 17 5 3 1 3 5 7 2 18 5 1 2 18 1 3 17 2 19 5 1 2 2 21 1 1 17 19 2 22 1 1 2 25 1 3 2 25 1 5 2 25 1 1 2 26 1 1 6 12 13 2 27 1 1 6 2 29 2 1 2 2 36 8 3 26 35 2 38 1 1 2 39 1 1 22 23 27 32 2 39 6 1 4 16 23 27 2 40 3 1 24 26 29 40 2 41 3 2 2 41 1 1 2 43 1 1 1 27 2 44 1 1 2 44 6 1 2 20 23 27 2 45 1 2 2 46 1 4 2 2 46 1 4 2 49 3 3 2 50 1 1 2 50 4 1 4 24 47 2 54 3 4 2 54 2 1 38 2 59 1 3 ;
The counting process MODEL specification is used to carry out the analysis of the AG model. Note that some of the observations in the data set Bladder have a degenerated interval of risk. The presence of these observations does not affect the results of the analysis since none of these observations are included in any of the risk sets. However, the procedure will run more efficiently without these observations; consequently, in the following SAS statements, the WHERE clause is used to eliminate these redundant observations.
*title 'Andersen-Gill Multiplicative Hazards Model'; proc phreg data=Bladder; model (TStart, TStop) * Status(0) = Trt Number Size; where TStart < TStop; run;
Results of fitting the AG model are shown in Output 49.8.1.
Output 49.8.1: Fitting Andersen-Gill Multiplicative Hazards Model
The WLW analysis regards the recurrence times
as multivariate
failure times, and it models the marginal distribution of each
component time with the Cox proportional hazards model.
No specific correlation structure is imposed on the
multiple failure times.
For the kth marginal model,
let denote the row vector of regression parameters,
let denote the maximum likelihood estimate of ,let denote the covariance matrix obtained by inverting
the information matrix, and let Ri denote the matrix of score
residuals. Wei, Lin, and
Weissfeld (1989) showed that the joint distribution of
can be approximated by a multivariate normal distribution
with mean
vector and robust covariance matrix
The PHREG procedure computes the DFBETA statistics, which are precisely the products . By outputting the DFBETA statistics into an OUTPUT data set, you can subsequently use SAS/IML software to compute the robust covariance matrix in a straightforward manner.
In this example, there are four marginal proportional hazards models, one for each recurrence time. Instead of fitting one model at a time, you can fit all four marginal models in one analysis by using the STRATA statement. A new input data set (named Bladder2) is created from the data set Bladder by eliminating observations that have a Visit value of 5. These observations contain the follow-up information beyond the fourth recurrence time and are irrelevant in the fitting of the four marginal models. In addition, four treatment variables, Trt1, Trt2, Trt3, and Trt4, are created from variables Trt and Visit, representing the treatment group for each of the four values of Visit. For instance, Trt1=Trt when the Visit value is 1, and Trt1=0 when the Visit value is not 1. Likewise, variables Number1, Number2, Number3, and Number4 are created from the variables Number and Visit; variables Size1, Size2, Size3, and Size4 are created from the variables Size and Visit.
The following SAS statements create the data set Bladder2:
data Bladder2; set Bladder; if Visit < 5; Trt1= Trt * (Visit=1); Trt2= Trt * (Visit=2); Trt3= Trt * (Visit=3); Trt4= Trt * (Visit=4); Number1= Number * (Visit=1); Number2= Number * (Visit=2); Number3= Number * (Visit=3); Number4= Number * (Visit=4); Size1= Size * (Visit=1); Size2= Size * (Visit=2); Size3= Size * (Visit=3); Size4= Size * (Visit=4); run;
The following SAS statements fit the marginal models. The parameter estimates are output to a data set named Est1; the DFBETA statistics for the treatment variables are output to a data set named Out1.
*title 'Fitting Marginal Proportional Hazards Models'; proc phreg data=Bladder2 outest=Est1; model TStop*Status(0)=Trt1-Trt4 Number1-Number4 Size1-Size4; output out=Out1 dfbeta=dt1-dt4 / order=data; strata Visit; id ID; run;
The output of this analysis is shown in Output 49.8.2
Output 49.8.2: Fitting Marginal Proportional Hazards Models
|
proc means data=Out1 noprint; by ID; var dt1-dt4; output out=Out2 sum=dt1-dt4; proc iml; use Out2; read all var{dt1 dt2 dt3 dt4} into x; v=x` * x; reset noname; vname={"Trt1","Trt2","Trt3","Trt4"}; print,"Estimated Covariance Matrix",, v[colname=vname rowname=vname format=10.5]; create RCov from v[colname=vname rowname=vname]; append from v[rowname=vname];
The estimated robust covariance matrix is displayed in Output 49.8.3.
Output 49.8.3: Robust Covariance Matrix for the Treatment Coefficients
|
proc iml; use Est1; read all var{Trt1 Trt2 Trt3 Trt4} into Trt; b= Trt`; use Out2; read all var{dt1 dt2 dt3 dt4} into x; v=x` * x; nparm= nrow(b); se=sqrt(vecdiag(v)); reset noname; stitle={"Estimate", " Std Error"}; vname={"Trt1","Trt2","Trt3","Trt4"}; tmpprt= b || se; print,tmpprt[colname=stitle rowname=vname format=10.5]; print,"Estimated Covariance Matrix",, v[colname=vname rowname=vname format=10.5]; /* H0: beta11=beta12=beta13=beta14=0 */ chisq= b` * inv(v) * b; df= nrow(b); p= 1-probchi(chisq,df); print ,,"Testing H0: no treatment effects", , "Wald Chi-Square = " chisq, "DF = " df, "p-value = "p[format=5.4],; /* Assume beta11=beta12=beta13=beta14 and estimate the common value */ c= {1 0 0 0, 0 1 0 0, 0 0 1 0, 0 0 0 1}; cb= c * b; si= c * v * t(c); e= j(4,1,1); isi=inv(si); h= inv(e` * isi * e) * isi * e; b1= t(h) * cb; se= sqrt(t(h) * si * h); zscore= b1 / se; p= 1- probchi( zscore # zscore, 1); print ,"Estimation of the Common Parameter for Treatment",, "Optimal Weights = "h, "Estimate = " b1, "Standard Error = " se, "z-score =" zscore, "2-sided p-value = " p[format=5.4]; quit;
The PHREG procedure can also be used to fit the PWP model. In the PWP model, the risk set for the (k+1) recurrence is restricted to those patients who have experienced the first k recurrences. For example, a patient who experienced only one recurrence is an event observation for the first recurrence; this patient is a censored observation for the second recurrence and should not be included in the risk set for the third or fourth recurrence. The following DATA step eliminates those observations that should not be in the risk sets, forming a new input data set (named Bladder3) for fitting the PWP models. The gap times between successive recurrences are also calculated.
The following SAS statements create the data set Bladder3:
data Bladder3(drop=lstatus); retain lstatus; set Bladder2; by ID; if first.ID then lstatus=1; if (Status=0 and lstatus=0) then delete; lstatus=Status; GapTime=TStop-TStart; run;
The following statements fit the PWP total time model with noncommon effects:
*title2 'PWP Total Time Model with Noncommon Effects'; proc phreg data=Bladder3; model TStop*Status(0)=Trt1-Trt4 Number1-Number4 Size1-Size4; strata Visit; run;
The following statements fit the PWP gap time model with noncommon effects:
*title2 'PWP Gap Time Model with Noncommon Effects'; proc phreg data=Bladder3; model GapTime*Status(0)=Trt1-Trt4 Number1-Number4 Size1-Size4; strata Visit; run;
Results of these two analyses are shown in Output 49.8.4 and Output 49.8.5, respectively.
Output 49.8.4: Fitting PWP Total Time Model with Noncommon Effects
|
|
|
|
*title2 'PWP Total Time Model with Common Effects'; proc phreg data=Bladder3; model TStop*Status(0)=Trt Number Size; strata Visit; run;
The following statements fit the PWP gap time model with common effects:
*title2 'PWP Gap Time Model with Common Effects'; proc phreg data=Bladder3; model GapTime*Status(0)=Trt Number Size;; strata Visit; run;
Results of these two analyses are not shown in this document.
Chapter Contents |
Previous |
Next |
Top |
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.