The RELIABILITY Procedure |
Parameter Estimation
Maximum likelihood estimation of the parameters of a statistical
model involves maximizing the likelihood or, equivalently,
the log likelihood
with respect to the parameters. The parameter values at which the
maximum occurs are the maximum likelihood estimates of the
model parameters.
The likelihood is a function of the parameters and
of the data.
Let x1, x2, ... , xn be the observations in a random sample,
including the failures and censoring times (if the data are censored).
Let
be the probability density of failure time,
be the reliability function,
and
be the cumulative distribution function,
where
is the vector of parameters to be estimated,
.The probability density, reliability function, and CDF are determined by
the specific distribution selected as a model for the data.
The log likelihood is defined as
![{L({\theta})} & = & \sum_{i}\log(f({\theta};x_{i})) +
{\sum_{i}}^{'}\log(S({\th...
...theta};x_{i})) +
{\sum_{i}}^{'''}[\log(F({\theta};x_{ui})-F({\theta};x_{li}))]](images/releq63.gif)
where
-
is the sum over failed units
-
is the sum over right-censored units
-
is the sum over left-censored units
-
is the sum over interval-censored units
and (xli, xui) is the interval in which the ith unit
is interval censored. Only the sums appropriate to the type of
censoring in the data are included when the preceding equation is used.
The RELIABILITY procedure maximizes the log likelihood with respect to the
parameters
using a Newton-Raphson algorithm.
The Newton-Raphson algorithm is a recursive method for computing the
maximum of a function.
On the rth iteration, the algorithm updates
the parameter vector
with
![{{\theta}}_{r+1} = {{\theta}}_{r} - H^{-1}g](images/releq70.gif)
where H is the Hessian
(second derivative) matrix, and
g is the gradient
(first derivative) vector of the
log likelihood function, both
evaluated at the current value of the parameter vector.
That is,
![g = [g_{j}] =
. [ \frac{\partial L}{\partial \theta_{j}}]
| _{{\theta}={\theta}_{r}}](images/releq71.gif)
and
![H = [{h_{ij}}] = . [ \frac{\partial^2L}
{\partial\theta_{i}\partial\theta_{j}} ]
| _{{\theta}={\theta}_{r}}](images/releq72.gif)
Iteration
continues until the parameter estimates
converge. The convergence criterion is
![|\theta^{r+1}_{i}-\theta^r_{i}| \leq c \;\; { if } \;\; |\theta^{r+1}_{i}| \lt ....
...ta^r_{i}}{\theta^{r+1}_{i}}| \leq c \;\; { if } \;\; |\theta^{r+1}_{i}| \geq .01](images/releq73.gif)
for all i = 1,2, ... ,p where c is the convergence criterion. The
default value of c is 0.001, and it can be specified with
the CONVERGE= option in the MODEL, PROBPLOT, RELATIONPLOT, and ANALYZE
statements.
After convergence by the above criterion, the quantity
-
tc = [(gH-1g)/(L)]
is computed. If tc > d then a warning is printed that the algorithm did
not converge. tc is called the relative Hessian convergence
criterion. The default value of d is .0001. You can specify other
values for d with the CONVH= option. The relative Hessian criterion is useful
in detecting the occasional case where no progress can be made in
increasing the log-likelihood, yet the gradient g is not zero.
A location-scale model has a CDF of the form
![F(x) = G(\frac{x-\mu}{\sigma})](images/releq74.gif)
where
is the location parameter,
is the scale parameter,
and G is a standardized form
of the
cumulative distribution function.
The parameter vector is
=(
).
It is more convenient computationally to maximize log likelihoods that
arise from location-scale models.
If you specify a distribution from Table 30.37 that is not
a location-scale model, it is transformed
to a location-scale model by taking the natural (base e)
logarithm of the response. If you specify the lognormal base 10 distribution,
the logarithm (base 10) of the response is used.
The Weibull, lognormal, and log-logistic distributions in Table 30.37
are not location-scale models.
Table 30.38 shows the corresponding location-scale
models that result from taking the logarithm of the response.
Maximum likelihood is the default method of estimating the location and
scale parameters in the MODEL, PROBPLOT, RELATIONPLOT, and ANALYZE statements.
If the Weibull distribution is specified,
the logarithms of the responses are used to obtain maximum likelihood
estimates (
) of the location and scale parameters
of the extreme value distribution. The maximum likelihood estimates
(
,
) of the Weibull scale and shape parameters
are computed as
and
.
Regression Models
In a regression model, the location parameter for the ith
observation of a location-scale
model is a linear function of parameters:
![\mu_{i} = {x_{i}}'{{\beta}}](images/releq82.gif)
where xi is a vector of explanatory variables
for the ith observation determined
by the experimental setup and
is a vector of parameters to
be estimated.
You can specify a regression model using the MODEL statement.
For example, if you want to relate the lifetimes of electronic
parts in a test to operating temperature using the Arrhenius relationship,
then an appropriate model might be
![\mu_{i} = \beta_{0} + x_{i}\beta_{1}](images/releq84.gif)
where xi= 1000/(Ti+273.15), and Ti is the
centigrade temperature at which the ith unit is tested.
Here, xi' =[ 1 xi ].
There are two types of explanatory variables:
continuous
variables and class (or classification)
variables. Continuous variables represent physical quantities, such as
temperature or voltage, and they must be numeric.
Continuous explanatory variables are sometimes called covariates.
Class variables identify classification
levels and are declared in the CLASS statement.
These are also referred to as categorical, dummy,
qualitative, discrete, or nominal variables. Class
variables can be either character or numeric. The values of class
variables are called levels. For example, the class
variable BATCH could have levels `batch1' and `batch2'
to identify items from two production batches.
An indicator (0-1) variable is generated for each level of a class
variable and is used as an explanatory variable.
See Nelson (1990, p.277) for
an example using an indicator variable in the analysis of
accelerated life test data.
In a model, an explanatory variable that is not declared in a
CLASS statement is assumed to be
continuous.
By default, all regression models automatically contain an
intercept term; that is, the model is of the form
![\mu_{i} = \beta_{0} + \beta_{1}x_{i1} + ...](images/releq85.gif)
where
does not have an explanatory variable multiplier.
The intercept term can be excluded from the model by specifying
the NOINT option in the MODEL statement.
For numerical stability,
continuous explanatory variables are centered and scaled internally to the procedure.
This transforms the parameters
in the original model to a new
set of parameters.
The parameter estimates
and covariances are
transformed back to the original scale before reporting, so that
the parameters should be interpreted in terms of the originally
specified model. Covariates that are indicator variables,
that is, those
specified in a CLASS statement, are not centered and scaled.
Initial values of the regression parameters used in the
Newton-Raphson method are computed by ordinary least squares.
The parameters
and the scale parameter
are jointly
estimated by maximum likelihood, taking a logarithmic transformation
of the responses,
if necessary, to get a location-scale model.
The generalized gamma distribution is fit using the log lifetimes.
The regression parameters
,the scale parameter
, and the shape parameter
are jointly estimated.
The Weibull distribution shape parameter estimate is computed
as
, where
is the scale
parameter from the corresponding extreme value distribution.
The Weibull scale parameter
is not computed by
the procedure. Instead, the regression parameters
and the
shape
are reported.
In a model with a single continuous explanatory variable x,
you can use the RELATION= option in the MODEL statement to
specify a transformation that is applied to the
variable before model fitting. Table 30.45 shows the
available transformations.
Table 30.45: Variable Transformations
Relation
|
Transformed variable
|
ARRHENIUS (Nelson parameterization) | 1000/(x+273.15) |
ARRHENIUS2 (activation energy parameterization) | 11605/(x+273.15) |
POWER | log(x) |
LINEAR | x |
Stable Parameters
The location and scale parameters
are estimated by maximizing
the likelihood function by numerical methods, as described previously.
An alternative parameterization that may have better numerical properties
for heavy censoring is
, where
and zp is the pth quantile of the standardized distribution.
See Meeker and Escobar (1998, p. 90) and Doganaksoy and Schmee (1993)
for more details on alternate parameterizations.
By default, RELIABILITY estimates a value of zp from the data that
will improve the numerical properties of the estimation. You can also specify
values of p from which the value of zp will be computed with
the PSTABLE= option in the ANALYZE, PROBPLOT, RELATIONPLOT, or MODEL
statements. Note that a value of p=0.632 for the Weibull and extreme value
and p=0.5 for all other distributions will give zp=0 and the
parameterization will then be the usual location-scale parameterization.
All estimates and related statistics are reported in terms of the location and scale parameters
. If you specify the ITPRINT option in the ANALYZE, PROBPLOT, or
RELATIONPLOT statement, a table showing the values of p,
,
,and the last evaluation of the gradient and Hessian for these parameters
is produced.
Covariance Matrix
An estimate of the covariance matrix of the
maximum likelihood estimators (MLEs)
of the parameters
is given by the inverse
of the negative of the matrix of second derivatives of the
log likelihood, evaluated at the final
parameter estimates:
![{\Sigma} = [\sigma_{ij}] = -H^{-1} =
-[ \frac{\partial^2L}
{\partial\theta_{i}\partial\theta_{j}} ]^{-1}_{{\theta}={\hat{\theta}}}](images/releq91.gif)
The negative of the matrix of second derivatives is called the
Fisher information matrix.
The diagonal term
is an estimate of the variance of
.Estimates of standard errors of the MLEs are provided by
![{\rm SE}_{\theta_{i}} = \sqrt{\sigma_{ii}}](images/releq94.gif)
An estimator of the correlation matrix is
![R = [
\frac{\sigma_{ij}}{\sqrt{\sigma_{ii}\sigma_{jj}}} ]](images/releq95.gif)
The covariance matrix for the Weibull distribution parameter
estimators is computed by a first-order approximation from the
covariance matrix
of the estimators of the corresponding extreme value parameters
as
![{\rm Var}(\hat{\alpha}) & = & [\exp(\hat{\mu})]^2{\rm Var}(\hat{\mu}) \{\rm Var}...
... & = &
-\frac{\exp(\hat{\mu)}}{\hat{\sigma}^2}{\rm Cov}(\hat{\mu},\hat{\sigma})](images/releq96.gif)
For the regression model, the variance of the Weibull shape
parameter estimator
is computed from the
variance of the estimator of the
extreme value scale parameter
as shown previously. The covariance
of the regression parameter estimator
and the
Weibull shape parameter estimator
is computed in terms
of the covariance between
and
as
![{\rm Cov}(\hat{\beta}_{i}, \hat{\beta}) =
-\frac{{\rm Cov}(\hat{\beta_{i}},\hat{\sigma})}{\hat{\sigma}^2}](images/releq98.gif)
Confidence Intervals for Distribution Parameters
Table 30.46 shows the method of computation of approximate two-sided
confidence limits for distribution parameters.
The default value of
confidence is
.Other values of confidence are
specified using the CONFIDENCE= option.
In Table 30.46,
represents the
percentile of the standard normal distribution, and
and
are the MLEs
of the location and scale parameters for the normal, extreme value, and logistic
distributions. For the lognormal, Weibull, and log-logistic distributions,
and
represent the MLEs of the
corresponding location and scale parameters of the location-scale distribution
that results when the logarithm of the lifetime is used as the response.
For the Weibull distribution,
and
are the
location and scale parameters of the extreme value distribution
for the logarithm of the lifetime.
denotes the standard error of the
MLE of
, computed as the square root of the appropriate diagonal element
of the inverse of the Fisher information matrix.
Table 30.46: Confidence Limit Computation
|
Parameters
|
Distribution
|
Location
|
Scale
|
Shape
|
| | | |
Normal | ![\mu_{L}=\hat{\mu}-K_{\gamma}{\rm (SE}_{\hat{\mu}})](images/releq102.gif) | ![\sigma_{L}=\hat{\sigma}/\exp[K_{\gamma}{\rm (SE}_{\hat{\sigma}})/\hat{\sigma}]](images/releq103.gif) | |
| ![\mu_{U}=\hat{\mu}+K_{\gamma}{\rm (SE}_{\hat{\mu}})](images/releq104.gif) | ![\sigma_{U}=\hat{\sigma}\exp[K_{\gamma}{\rm (SE}_{\hat{\sigma}})/\hat{\sigma}]](images/releq105.gif) | |
| | | |
Lognormal | ![\mu_{L}=\hat{\mu}-K_{\gamma}{\rm (SE}_{\hat{\mu}})](images/releq102.gif) | ![\sigma_{L}=\hat{\sigma}/\exp[K_{\gamma}{\rm (SE}_{\hat{\sigma}})/\hat{\sigma}]](images/releq103.gif) | |
| ![\mu_{U}=\hat{\mu}+K_{\gamma}{\rm (SE}_{\hat{\mu}})](images/releq104.gif) | ![\sigma_{U}=\hat{\sigma}\exp[K_{\gamma}{\rm (SE}_{\hat{\sigma}})/\hat{\sigma}]](images/releq105.gif) | |
| | | |
Lognormal | ![\mu_{L}=\hat{\mu}-K_{\gamma}{\rm (SE}_{\hat{\mu}})](images/releq102.gif) | ![\sigma_{L}=\hat{\sigma}/\exp[K_{\gamma}{\rm (SE}_{\hat{\sigma}})/\hat{\sigma}]](images/releq103.gif) | |
(base 10) | ![\mu_{U}=\hat{\mu}+K_{\gamma}{\rm (SE}_{\hat{\mu}})](images/releq104.gif) | ![\sigma_{U}=\hat{\sigma}\exp[K_{\gamma}{\rm (SE}_{\hat{\sigma}})/\hat{\sigma}]](images/releq105.gif) | |
| | | |
Extreme Value | ![\mu_{L}=\hat{\mu}-K_{\gamma}{\rm (SE}_{\hat{\mu}})](images/releq102.gif) | ![\sigma_{L}=\hat{\sigma}/\exp[K_{\gamma}{\rm (SE}_{\hat{\sigma}})/\hat{\sigma}]](images/releq103.gif) | |
| ![\mu_{U}=\hat{\mu}+K_{\gamma}{\rm (SE}_{\hat{\mu}})](images/releq104.gif) | ![\sigma_{U}=\hat{\sigma}\exp[K_{\gamma}{\rm (SE}_{\hat{\sigma}})/\hat{\sigma}]](images/releq105.gif) | |
| | | |
Weibull | | ![\alpha_{L}=\exp[\hat{\mu}-K_{\gamma}{\rm (SE}_{\hat{\mu}})]](images/releq106.gif) | ![\beta_{L}=\exp[-K_{\gamma}{\rm (SE}_{\hat{\sigma}})/\hat{\sigma}]/\hat{\sigma}](images/releq107.gif) |
| | ![\alpha_{U}=\exp[\hat{\mu}+K_{\gamma}{\rm (SE}_{\hat{\mu}})]](images/releq108.gif) | ![\beta_{U}=\exp[K_{\gamma}{\rm (SE}_{\hat{\sigma}})/\hat{\sigma}]/\hat{\sigma}](images/releq109.gif) |
| | | |
Exponential | | ![\alpha_{L}=\exp[\hat{\mu}-K_{\gamma}{\rm (SE}_{\hat{\mu}})]](images/releq106.gif) | |
| | ![\alpha_{U}=\exp[\hat{\mu}+K_{\gamma}{\rm (SE}_{\hat{\mu}})]](images/releq108.gif) | |
| | | |
Logistic | ![\mu_{L}=\hat{\mu}-K_{\gamma}{\rm (SE}_{\hat{\mu}})](images/releq102.gif) | ![\sigma_{L}=\hat{\sigma}/\exp[K_{\gamma}{\rm (SE}_{\hat{\sigma}})/\hat{\sigma}]](images/releq103.gif) | |
| ![\mu_{U}=\hat{\mu}+K_{\gamma}{\rm (SE}_{\hat{\mu}})](images/releq104.gif) | ![\sigma_{U}=\hat{\sigma}\exp[K_{\gamma}{\rm (SE}_{\hat{\sigma}})/\hat{\sigma}]](images/releq105.gif) | |
| | | |
Log-logistic | ![\mu_{L}=\hat{\mu}-K_{\gamma}{\rm (SE}_{\hat{\mu}})](images/releq102.gif) | ![\sigma_{L}=\hat{\sigma}/\exp[K_{\gamma}{\rm (SE}_{\hat{\sigma}})/\hat{\sigma}]](images/releq103.gif) | |
| ![\mu_{U}=\hat{\mu}+K_{\gamma}{\rm (SE}_{\hat{\mu}})](images/releq104.gif) | ![\sigma_{U}=\hat{\sigma}\exp[K_{\gamma}{\rm (SE}_{\hat{\sigma}})/\hat{\sigma}]](images/releq105.gif) | |
| | | |
Generalized | | ![\sigma_{L}=\hat{\sigma}/\exp[K_{\gamma}{\rm (SE}_{\hat{\sigma}})/\hat{\sigma}]](images/releq103.gif) | ![\mu_{L}=\hat{\lambda}-K_{\gamma}{\rm (SE}_{\hat{\lambda}})](images/releq110.gif) |
gamma | | ![\sigma_{U}=\hat{\sigma}\exp[K_{\gamma}{\rm (SE}_{\hat{\sigma}})/\hat{\sigma}]](images/releq105.gif) | ![\mu_{U}=\hat{\lambda}+K_{\gamma}{\rm (SE}_{\hat{\lambda}})](images/releq111.gif) |
| | | |
Regression Parameters
Approximate
confidence limits for the regression
parameter
are given by
![\beta_{iL}=\hat{\beta}_{i} - K_{\gamma}({\rm SE}_{\hat{\beta}_{i}})](images/releq113.gif)
![\beta_{iU}=\hat{\beta}_{i} + K_{\gamma}({\rm SE}_{\hat{\beta}_{i}})](images/releq114.gif)
Percentiles
The maximum likelihood estimate of the p ×100% percentile
xp for the extreme value, normal, and logistic distributions
is given by
![\hat{x}_{p} = \hat{\mu} + z_{p}\hat{\sigma}](images/releq115.gif)
where zp=G-1(p), G is the standardized CDF shown in Table 30.47,
and
are the maximum likelihood estimates
of the location and scale parameters of the distribution.
The maximum likelihood estimate of the percentile
tp for the Weibull, lognormal, and log-logistic distributions
is given by
![\hat{t}_{p} = \exp[\hat{\mu} + z_{p}\hat{\sigma}]](images/releq117.gif)
where zp=G-1(p), and G is the standardized CDF of the location-scale model
corresponding to the logarithm of the response.
For the lognormal (base 10) distribution,
![\hat{t}_{p} = 10^{[\hat{\mu} + z_{p}\hat{\sigma}]}](images/releq118.gif)
Table 30.47: Standardized Cumulative Distribution Functions
|
Location-Scale
|
Location-Scale
|
Distribution
|
Distribution
|
CDF
|
| | |
Weibull | Extreme Value | 1 - exp[-exp(z)] |
| | |
Lognormal | Normal | ![\int_{-\infty}^z\frac{1}{\sqrt{2\pi}}\exp(-\frac{u^2}2)du](images/releq119.gif) |
| | |
Log-logistic | Logistic | [exp(z)/(1+exp(z))] |
Confidence Intervals
The variance of the MLE of the p×100% percentile
for the normal, extreme value, or logistic distribution is
![Var(\hat{x}_{p}) = Var(\hat{\mu}) + z_{p}^2Var(\hat{\sigma}) +
2Cov(\hat{\mu},\hat{\sigma})](images/releq120.gif)
Two-sided approximate
confidence limits for xp
are
![x_{pL} & = & \hat{x}_{p} - K_{\gamma}\sqrt{Var(\hat{x}_{p})} \
x_{pU} & = & \hat{x}_{p} + K_{\gamma}\sqrt{Var(\hat{x}_{p})}](images/releq121.gif)
where
represents the
percentile of the standard normal distribution.
The limits for the lognormal, Weibull, or log-logistic distributions are
![t_{pL} & = & \exp(\hat{x}_{p} - K_{\gamma}\sqrt{Var(\hat{x}_{p})}) \
t_{pU} & = & \exp(\hat{x}_{p} + K_{\gamma}\sqrt{Var(\hat{x}_{p})})](images/releq123.gif)
where xp refers to the percentile of the corresponding
location-scale distribution (normal, extreme value, or logistic)
for the logarithm of the lifetime. For the lognormal (base 10) distribution,
![t_{pL} & = & 10^{(\hat{x}_{p} - K_{\gamma}\sqrt{Var(\hat{x}_{p})})} \ t_{pU} & = & 10^{(\hat{x}_{p} + K_{\gamma}\sqrt{Var(\hat{x}_{p})})}](images/releq124.gif)
Reliability Function
For the extreme value, normal, and logistic distributions shown in
Table 30.47,
the maximum likelihood estimate of the reliability function
R(x) = Pr{X>x} is given by
![\hat{R}(x)=1-F(\frac{x-\hat{\mu}}{\hat{\sigma}})](images/releq125.gif)
The MLE of the CDF is
.Confidence Intervals
Let
. The variance of u is
![Var(\hat{u})\approx \frac{Var(\hat{\mu})+\hat{u}^2Var(\hat{\sigma})+
2\hat{u}Cov(\hat{\mu},\hat{\sigma})}{\hat{\sigma}^2}](images/releq128.gif)
Two-sided approximate
confidence intervals
for R(x) are computed as
![R_{L}(x)=\hat{R}(u_{2})](images/releq129.gif)
![R_{U}(x)=\hat{R}(u_{1})](images/releq130.gif)
where
![u_{1}=\hat{u} - K_{\gamma}\sqrt{Var(\hat{u})}](images/releq131.gif)
![u_{2}=\hat{u} + K_{\gamma}\sqrt{Var(\hat{u})}](images/releq132.gif)
and
represents the
percentile of the standard normal distribution.
The corresponding limits for the CDF are
-
FL(x) = 1-RU(x)
-
FU(x) = 1-RL(x)
Limits for the Weibull, lognormal, and log-logistic reliability
function R(t) are the same as those for the corresponding
extreme value, normal, or logistic reliability R(y), where
y = log(t).
Estimation with the Binomial and Poisson Distributions
In addition to estimating the parameters of the distributions in
Table 30.37, you can estimate parameters, compute confidence
limits, compute predicted values and prediction limits, and
compute chi-squared tests for differences in groups for
the binomial and Poisson distributions using the ANALYZE statement.
Specify either BINOMIAL or POISSON
in the DISTRIBUTION statement to use one of these distributions.
The ANALYZE statement options available for the binomial and
Poisson distributions are given in Table 30.5.
See "Analysis of Binomial Data"
for an example of an analysis of binomial data.
Binomial Distribution
If r is the number of successes and n is the number of trials in
a binomial experiment, then the maximum likelihood estimator
of the probability p in the binomial distribution in Table 30.39
is computed as
![\hat{p} = r/n](images/releq133.gif)
Two-sided
confidence
limits for p are computed as in Johnson, Kotz, and Kemp (1992, p.130):
![p_{L}= \frac{\nu_{1}F[(1-\gamma)/2; \nu{_1}, \nu_{2}] }
{\nu_{2} + \nu_{1}F[(1-\gamma)/2; \nu{_1}, \nu_{2}] }](images/releq134.gif)
with
and
and
![p_{U}= \frac{\nu_{1}F[(1+\gamma)/2; \nu{_1}, \nu_{2}] }
{\nu_{2} + \nu_{1}F[(1+\gamma)/2; \nu{_1}, \nu_{2}] }](images/releq137.gif)
with
and
, where
is the
percentile of the
F distribution with
degrees of freedom in the numerator
and
degrees of freedom in the denominator.
You can compute a sample size required to estimate
p within a specified tolerance w with probability
.Nelson (1982, p. 206) gives the following formula for the approximate
sample size:
![n \approx \hat{p}(1-\hat{p})(\frac{K_{\gamma}}w)^2](images/releq144.gif)
where
is the
percentile of the
standard normal distribution.
The formula is based on the normal approximation for the distribution
of
. Nelson recommends using this formula if
np > 10 and np(1-p) > 10.
The value of
used for computing confidence limits is
used in the sample size computation. The default value of
confidence is
.Other values of confidence are
specified using the CONFIDENCE= option.
You specify a tolerance of number with the TOLERANCE(number)
option.
The predicted number of successes X in a future sample of size m,
based on the previous estimate of p, is computed as
![\hat{X}=m(r/n) = m\hat{p}](images/releq146.gif)
Two-sided approximate
prediction limits are computed
as in Nelson (1982, p. 208). The prediction limits are the solutions
XL and XU of
![X_{U}/m = [(r+1)/n]F[(1+\gamma)/2;2(r+1),2X_{U}]](images/releq148.gif)
![m/(X_{L}+1)=(n/r)F[(1+\gamma)/2;2(X_{L}+1),2r]](images/releq149.gif)
where
is the
% percentile of the
F distribution with
degrees of freedom in the numerator
and
degrees of freedom in the denominator.
You request predicted values and prediction limits
for a future sample of size number
with the
PREDICT(number) option.
You can test groups of binomial data for equality of their binomial
probability using the ANALYZE statement. You specify the K groups
to be compared with a group variable having K levels.
Nelson (1982, p.450)
discusses a chi-squared test statistic for comparing
K binomial proportions for equality.
Suppose there are ri successes in
ni trials for i = 1,2, ... , K. The grouped estimate of
the binomial probability is
![\hat{p}=\frac{r_{1}+r_{2}+ ... +r_{K}}{n_{1}+n_{2}+ ... +n_{K}}](images/releq151.gif)
The chi-squared test statistic for testing the hypothesis
p1 = p2 = ... = pK against
for some i and
j is
![Q=\sum_{i=1}^K\frac{(r_{i}-n_{i}\hat{p})^2}{n_{i}\hat{p}(1-\hat{p})}](images/releq153.gif)
The statistic Q has an asymptotic chi-squared distribution with K-1
degrees of freedom. The RELIABILITY procedure computes the contribution
of each group to Q, the value of Q, and the p-value for Q
based on the limiting chi-squared distribution with K-1 degreees
of freedom. If you specify the PREDICT option, predicted values
and prediction limits are computed for each group, as well as
for the pooled group.
The p-value is defined as
, where
is the chi-squared CDF with K-1
degrees of freedom, and Q is the observed value.
A test of the
hypothesis of equal binomial probabilities among the groups with
significance level
is
-
: do not reject the equality hypothesis
-
: reject the equality hypothesis
Poisson Distribution
You can use the ANALYZE statement to model data
using the Poisson distribution.
The data consists of a count Y of occurrences
in a "length" of observation T.
Observation T is typically an exposure time,
but it can
have other units, such as distance.
The ANALYZE statement enables you to compute
the rate of occurrences, confidence limits, and prediction limits.
An estimate of the rate
is computed as
![\hat{\lambda}=Y/T](images/releq158.gif)
Two-sided
confidence limits for
are computed as in Nelson (1982, p. 201):
![\lambda_{L}=.5\chi^2[(1-\gamma)/2;2Y]/T](images/releq159.gif)
![\lambda_{U}=.5\chi^2[(1+\gamma)/2;2(Y+1)]/T](images/releq160.gif)
where
is the
percentile of
the chi-squared distribution with
degrees of freedom.
You can compute a length T required to estimate
within a specified tolerance w with probability
.Nelson (1982, p. 202) provides the following approximate formula:
![\hat{T}\approx\hat{\lambda}(\frac{K_{\gamma}}w)^2](images/releq163.gif)
where
is the
percentile of the
standard normal distribution.
The formula is based on the normal approximation for
and is more accurate for larger values of
. Nelson recommends
using the formula when
.The value of
used for computing confidence limits is also
used in the length computation. The default value of
confidence is
. Other values of confidence are
specified using the CONFIDENCE= option.
You specify a tolerance of number with the TOLERANCE(number)
option.
The predicted future number of occurrences in a length S is
![\hat{X}=(Y/T)S = \hat{\lambda}S](images/releq167.gif)
Two-sided approximate
prediction limits are computed
as in Nelson (1982, p. 203). The prediction limits are the solutions
XL and XU of
![X_{U}/S = [(Y+1)/T]F[(1+\gamma)/2;2(Y+1),2X_{U}]](images/releq168.gif)
![S/(X_{L}+1)=(T/Y)F[(1+\gamma)/2;2(X_{L}+1),2Y]](images/releq169.gif)
where
is the
percentile of the
F distribution with
degrees of freedom in the numerator
and
degrees of freedom in the denominator.
You request predicted values and prediction limits
for a future exposure number
with the
PREDICT(number) option.
You can compute a chi-squared test statistic for comparing
K Poisson rates for equality.
You specify the K groups
to be compared with a group variable having K levels.
Refer to Nelson (1982, p.444) for
more information on this test.
Suppose that there are Yi Poisson counts in
lengths Ti for i = 1,2, ... , K and that the Yi are
independent. The grouped estimate of
the Poisson rate is
![\hat{\lambda}=\frac{Y_{1}+Y_{2}+ ... +Y_{K}}{T_{1}+T_{2}+ ... +T_{K}}](images/releq170.gif)
The chi-squared test statistic for testing the hypothesis
against
for some i and j is
![Q=\sum_{i=1}^K\frac{(Y_{i}-\hat{\lambda}T_{i})^2}{\hat{\lambda}T_{i}}](images/releq173.gif)
The statistic Q has an asymptotic chi-squared distribution with K-1
degrees of freedom. The RELIABILITY procedure computes the contribution
of each group to Q, the value of Q, and the p-value for Q
based on the limiting chi-squared distribution with K-1 degreees
of freedom.
If you specify the PREDICT option, predicted values
and prediction limits are computed for each group, as well as
for the pooled group. The p-value is defined as
, where
is the chi-squared CDF with K-1
degrees of freedom and Q is the observed value. A test of the
hypothesis of equal Poisson rates among the groups with
significance level
is
-
: accept the equality hypothesis
-
: reject the equality hypothesis
Least Squares Fit to the Probability Plot
Fitting to the probability plot by least squares is an alternative
to maximum likelihood estimation of the parameters of a life
distribution. Only the failure times are used. A least squares
fit is computed using points (x(i), mi), where
mi=F-1(ai) and ai are the plotting positions as defined in
sref[d]ppos.
The xi are either the lifetimes
for the normal, extreme value, or logistic distributions or the
log lifetimes for the lognormal, Weibull, or log-logistic distributions.
The ANALYZE, PROBPLOT, or RELATIONPLOT statement option FITTYPE=LSXY
specifies the x(i) as the
dependent variable (`y-coordinate') and the mi as the independent
variable (`x-coordinate'). You can optionally reverse the
quantities used as dependent and independent variables by specifying
the FITTYPE=LSYX option.
Weibayes Estimation
Weibayes estimation is a method of performing a Weibull analysis
when there are few or no failures. The FITTYPE=WEIBAYES option
requests this method. The method of Nelson (1985) is used to compute
a one-sided confidence interval for the Weibull scale parameter when
the Weibull shape parameter is specified.
Also refer to Abernethy (1996) for more discussion and examples.
The Weibull shape parameter
is assumed to be known
and is specified to the procedure with the SHAPE=number
option.
Let T1,T2, ... ,Tn be the failure and censoring times, and
let
be the number of failures in the data.
If there are no failures (r=0),
a lower
confidence
limit for the Weibull scale parameter
is computed as
![\alpha_{L}=\{\sum_{i=1}^nT_{i}^{\beta}/[-\log(1-\gamma)]\}^{1/\beta}](images/releq175.gif)
The default value of
confidence is
.Other values of confidence are
specified using the CONFIDENCE= option.
If
, the MLE of
is given by
![\hat{\alpha}=[\sum_{i=1}^nT_{i}^{\beta}/r]^{1/\beta}](images/releq177.gif)
and a lower
confidence
limit for the Weibull scale parameter
is computed as
![\alpha_{L}=\hat{\alpha}[2r/\chi^2(\gamma, 2r+2)]^{1/\beta}](images/releq178.gif)
where
is the
percentile of a chi-square
distribution with 2r+2 degrees of freedom.
The procedure uses the specified value of
and the
computed value of
to compute
distribution percentiles
and the reliability function.
Copyright © 1999 by SAS Institute Inc., Cary, NC, USA. All rights reserved.