Computational Formulas

The CATMOD Procedure

The following calculations are shown for each population and then for all populations combined.

Source Formula Dimension

Probability Estimates

jth response p_ij = n_ij / n_i 1 ×1

ith population $p_i = \begin{bmatrix} p_{i1} \\ p_{i2} \\ \vdots \\ p_{ir} \end{bmatrix}$ r ×1

all populations $p = \begin{bmatrix} p_1 \\ p_2 \\ \vdots \\ p_s \end{bmatrix}$ sr ×1

Variance of Probability Estimates

ith population V_i = (1/n_i) (DIAG(p_i) - p_i p_i') r ×r

all populations V = DIAG(V₁, V₂, ... , V_s ) sr ×sr

Response Functions

ith population F_i = F(p_i) q ×1

all populations $F = \begin{bmatrix} F_1 \\ F_2 \\ \vdots \\ F_s \end{bmatrix}$ sq ×1

Derivative of Function with Respect to Probability Estimates

ith population $H_i = \displaystyle\frac{\partial F(p_i)}{\partial p_i}$ q ×r

all populations H = DIAG(H₁, H₂, ... , H_s ) sq ×sr

Variance of Functions

ith population S_i = H_i V_i H_i' q ×q

all populations S = DIAG(S₁, S₂, ... , S_s ) sq ×sq

Inverse Variance of Functions

ith population $S^i = (S_i)^{-1}$ q ×q

all populations $S^{-1} = \mathrm{DIAG}(S^1, S^2, \ldots, S^s)$ sq ×sq
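As a small numerical illustration of the first rows of this table, the following sketch computes p_i and V_i for a single population from its response counts. It is plain Python, not PROC CATMOD code; the function name `prob_and_variance` and the example counts are hypothetical and chosen only for illustration.

```python
def prob_and_variance(counts):
    """Return (p, V) for one population from its r response counts."""
    n = sum(counts)                      # n_i, the population sample size
    r = len(counts)
    p = [c / n for c in counts]          # p_ij = n_ij / n_i
    # V_i = (1 / n_i) * (DIAG(p_i) - p_i p_i'), an r x r matrix
    V = [[((p[a] if a == b else 0.0) - p[a] * p[b]) / n for b in range(r)]
         for a in range(r)]
    return p, V

# Illustrative counts for one population with r = 3 responses
p, V = prob_and_variance([20, 30, 50])
```

Each row of V_i sums to zero because the probabilities sum to one, so V_i itself is singular. The response functions F_i have q < r components, and it is S_i = H_i V_i H_i' that is inverted.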

Derivative Table for Compound Functions: Y=F(G(p))

In the following table, let G(p) be a vector of functions of p, and let D denote $\partial G / \partial p$, the first derivative matrix of G with respect to p.

Function Y = F(G) Derivative $(\partial Y / \partial p)$

Multiply matrix Y = A*G A*D

Logarithm Y = LOG(G) DIAG^-1(G)*D

Exponential Y = EXP(G) DIAG(Y)*D

Add constant Y = G + A D
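The rows of this table can be chained to differentiate a compound function. The sketch below applies them to Y = EXP(A*LOG(p)), which gives the odds p_j / p_r when A = (I, -j). The function name `odds_and_derivative`, the matrix A, and the probabilities are illustrative assumptions, not part of the procedure.

```python
import math

def odds_and_derivative(p, A):
    """Return Y = EXP(A * LOG(p)) and dY/dp = DIAG(Y) * A * DIAG^-1(p)."""
    log_p = [math.log(v) for v in p]
    Y = [math.exp(sum(a * l for a, l in zip(row, log_p))) for row in A]
    # Chain rule from the table, starting from G = p (so D is the identity):
    #   LOG(p)        -> D1 = DIAG^-1(p)
    #   A * LOG(p)    -> D2 = A * D1
    #   EXP(A*LOG(p)) -> dY/dp = DIAG(Y) * D2
    dY = [[Y[j] * A[j][k] / p[k] for k in range(len(p))]
          for j in range(len(A))]
    return Y, dY

A = [[1, 0, -1], [0, 1, -1]]             # illustrative contrast matrix
Y, dY = odds_and_derivative([0.2, 0.3, 0.5], A)
```

For these values the first row of the derivative is (1/p_r, 0, -p_1/p_r^2), which matches differentiating p_1 / p_r directly.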

Default Response Functions: Generalized Logits

In the following table, the population subscript i is suppressed. For each population i = 1, ... , s, let $f_j = \log(p_j / p_r)$ for j = 1, ... , r-1.

Inverse of Response Functions for a Population

$\begin{array}{rcl} p_j & = & \displaystyle \frac{\exp (f_j)}{1 + \sum_k \exp (f_k)} \quad \text{for } j = 1, \ldots, r-1 \\ p_r & = & \displaystyle \frac{1}{1 + \sum_k \exp (f_k)} \end{array}$
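The inverse mapping from generalized logits back to probabilities can be sketched in a few lines: p_j is exp(f_j) divided by 1 plus the sum of exp(f_k), and p_r is 1 divided by that same denominator. The function name `logits_to_probs` and the input values are hypothetical.

```python
import math

def logits_to_probs(f):
    """Map r-1 generalized logits f_j = log(p_j / p_r) back to r probabilities."""
    denom = 1.0 + sum(math.exp(v) for v in f)
    p = [math.exp(v) / denom for v in f]   # p_j for j = 1, ..., r-1
    p.append(1.0 / denom)                  # p_r, the reference response
    return p

# Round trip: logits built from (0.2, 0.3, 0.5) should give those values back
f = [math.log(0.2 / 0.5), math.log(0.3 / 0.5)]
p = logits_to_probs(f)
```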

Form of F and Derivative for a Population

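$\begin{array}{rcl} F & = & K\,\mathrm{LOG}(p) = (I_{r-1}, -j)\,\mathrm{LOG}(p) \\ H & = & \displaystyle \frac{\partial F}{\partial p} = \left( \mathrm{DIAG}_{r-1}^{-1}(p), \frac{-1}{p_r} j \right) \end{array}$

Here j in $(I_{r-1}, -j)$ denotes an (r-1) ×1 vector of ones (not the response index), and $\mathrm{DIAG}_{r-1}(p)$ is the diagonal matrix formed from the first r-1 elements of p.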

Covariance Results for a Population

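$\begin{array}{rcl} S & = & HVH' \\ & = & \displaystyle \frac{1}{n} \left( \mathrm{DIAG}_{r-1}^{-1}(p) + \frac{1}{p_r} j j' \right) \\ S^{-1} & = & n \left( \mathrm{DIAG}_{r-1}(p) - q q' \right) \quad \text{where } q = \mathrm{DIAG}_{r-1}(p)\, j \\ S^{-1} F & = & \displaystyle n\, \mathrm{DIAG}_{r-1}(p)\, F - \left( n \sum_j p_j f_j \right) q \\ F' S^{-1} F & = & \displaystyle n \sum_j p_j f_j^2 - n \left( \sum_j p_j f_j \right)^2 \end{array}$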
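The closed form for F'S^-1 F can be checked numerically against a direct computation of S = HVH'. The sketch below does this for one population with r = 3 responses, so that S is 2 ×2 and can be inverted by hand. The function name `quadratic_form_check` and the counts are illustrative assumptions.

```python
import math

def quadratic_form_check(counts):
    """Return (direct, closed) values of F' S^-1 F for one population, r = 3."""
    if len(counts) != 3:
        raise ValueError("this sketch inverts a 2 x 2 matrix, so it needs r = 3")
    n = sum(counts)
    p = [c / n for c in counts]
    f = [math.log(p[j] / p[2]) for j in range(2)]     # generalized logits
    # V = (DIAG(p) - p p') / n
    V = [[((p[a] if a == b else 0.0) - p[a] * p[b]) / n for b in range(3)]
         for a in range(3)]
    # H = (DIAG^-1_{r-1}(p), -(1/p_r) j), a 2 x 3 matrix
    H = [[1 / p[0], 0.0, -1 / p[2]],
         [0.0, 1 / p[1], -1 / p[2]]]
    # S = H V H'
    S = [[sum(H[a][k] * V[k][l] * H[b][l] for k in range(3) for l in range(3))
          for b in range(2)] for a in range(2)]
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    S_inv = [[S[1][1] / det, -S[0][1] / det],
             [-S[1][0] / det, S[0][0] / det]]
    direct = sum(f[a] * S_inv[a][b] * f[b] for a in range(2) for b in range(2))
    # Closed form: n * sum_j p_j f_j^2 - n * (sum_j p_j f_j)^2, j = 1, ..., r-1
    mean_f = sum(p[j] * f[j] for j in range(2))
    closed = n * sum(p[j] * f[j] ** 2 for j in range(2)) - n * mean_f ** 2
    return direct, closed

direct, closed = quadratic_form_check([20, 30, 50])
```

The two values should agree up to floating-point rounding. The closed form avoids forming or inverting S at all.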

The following calculations are shown for each population and then for all populations combined.

Source Formula Dimension

Design Matrix

ith population X_i q ×d

all populations $X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_s \end{bmatrix}$ sq ×d

Crossproduct of Design Matrix

ith population $C_i = X_i' S^i X_i$ d ×d

all populations $C = X' S^{-1} X = \sum_i C_i$ d ×d

Crossproduct of Design Matrix with Function

$R = X' S^{-1} F = \sum_i X_i' S^i F_i$ d ×1

Weighted Least-Squares Estimates

b = C^-1 R = (X' S^-1 X)^-1 (X' S^-1 F) d ×1

Covariance of Weighted Least-Squares Estimates

COV(b) = C^-1 d ×d

Predicted Response Functions

$\hat{F} = Xb$ sq ×1

Covariance of Predicted Response Functions

$V_{\hat{F}} = X C^{-1} X'$ sq ×sq

Residual Chi-Square

$\mathrm{RSS} = F' S^{-1} F - \hat{F}' S^{-1} \hat{F}$ 1 ×1

Chi-Square for $H_0\colon L \beta = 0$

Q = (Lb)' (LC^-1 L')^-1 (Lb) 1 ×1
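The weighted least-squares formulas above can be traced through a small case. The sketch below assumes binary responses (r = 2, q = 1), so that each S_i is the scalar 1 / (n_i p_i (1 - p_i)) and S^-1 is diagonal. It accumulates C and R over populations, solves C b = R, and forms the residual chi-square. The names `solve` and `wls_binary`, the counts, and the design matrices are hypothetical, and the code illustrates the formulas only, not how PROC CATMOD is implemented.

```python
import math

def solve(A, y):
    """Solve A x = y by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def wls_binary(successes, totals, X):
    """WLS fit of logits for s binary-response populations (q = 1)."""
    F, w = [], []
    for y, n in zip(successes, totals):
        p = y / n
        F.append(math.log(p / (1 - p)))    # F_i = log(p_i1 / p_i2)
        w.append(n * p * (1 - p))          # S^i = 1 / S_i for r = 2
    s, d = len(X), len(X[0])
    # C = X' S^-1 X and R = X' S^-1 F, accumulated over populations
    C = [[sum(w[i] * X[i][a] * X[i][b] for i in range(s)) for b in range(d)]
         for a in range(d)]
    R = [sum(w[i] * X[i][a] * F[i] for i in range(s)) for a in range(d)]
    b = solve(C, R)                        # b = C^-1 R
    F_hat = [sum(X[i][a] * b[a] for a in range(d)) for i in range(s)]
    # RSS = F' S^-1 F - F_hat' S^-1 F_hat
    rss = (sum(w[i] * F[i] ** 2 for i in range(s))
           - sum(w[i] * F_hat[i] ** 2 for i in range(s)))
    return b, F_hat, rss

# Saturated design for two populations: the fit reproduces F and RSS is zero
b_sat, F_hat_sat, rss_sat = wls_binary([30, 60], [100, 100], [[1, 1], [1, -1]])
# Intercept-only design: b is the weighted mean of the logits, RSS is positive
b_int, F_hat_int, rss_int = wls_binary([30, 60], [100, 100], [[1], [1]])
```

Solving C b = R directly, rather than forming C^-1 first, is a choice made for this sketch. COV(b) = C^-1 would still need the explicit inverse.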

Maximum Likelihood Method

Let C be the Hessian matrix and G be the gradient of the log-likelihood function (both are functions of $\pi$ and the parameters $\beta$). Let $p_i^*$ denote the vector containing the first r-1 sample proportions from population i, and let $\pi_i^*$ denote the corresponding vector of probability estimates from the current iteration. Starting with the least-squares estimates $b_0$ of $\beta$ (if you use the ML and WLS options; with the ML option alone, the procedure starts with 0), the probabilities $\pi(b)$ are computed, and b is calculated iteratively by the Newton-Raphson method until it converges (see the EPSILON= option).

The factor $\lambda$ is a step-halving factor that equals one at the start of each iteration. For any iteration in which the likelihood decreases, PROC CATMOD uses a series of subiterations in which $\lambda$ is repeatedly divided by two. The subiterations continue until the likelihood is greater than that of the previous iteration. If the likelihood has not reached that point after ten subiterations, then convergence is assumed, and a warning message is displayed.

Infinite parameters can sometimes occur in the model, either because one or more frequencies are zero or because the model is poorly specified, with collinearity among the estimates. If an estimate is tending toward infinity, then PROC CATMOD flags the parameter as infinite and holds the estimate fixed in subsequent iterations. PROC CATMOD regards a parameter as infinite when both of the following conditions apply:

The absolute value of its estimate exceeds five divided by the range of the corresponding variable.
The standard error of its estimate is at least three times greater than the estimate itself.
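The two conditions can be written as a small predicate. This is only a reading of the rule as stated here: the function name `looks_infinite` and its arguments are hypothetical, and treating "at least three times greater than the estimate itself" as a comparison against the absolute value of the estimate is an assumption of this sketch.

```python
def looks_infinite(estimate, std_error, variable_range):
    """Apply the two stated conditions to one parameter estimate."""
    # Condition 1: |estimate| exceeds five divided by the variable's range
    large_estimate = abs(estimate) > 5.0 / variable_range
    # Condition 2 (assumed reading): standard error is at least 3 * |estimate|
    large_std_error = std_error >= 3.0 * abs(estimate)
    return large_estimate and large_std_error

flagged = looks_infinite(12.0, 40.0, 1.0)       # both conditions hold
not_flagged = looks_infinite(12.0, 2.0, 1.0)    # standard error is too small
```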

The estimator of the asymptotic covariance matrix of the maximum likelihood predicted probabilities is given by Imrey, Koch, and Stokes (1981, eq. 2.18).

The following equations summarize the method:

$b_{k+1} = b_k - \lambda C^{-1} G$

where

$\begin{array}{rcl} C & = & X' S^{-1}(\pi) X \\ N & = & \begin{bmatrix} n_1 ( p_1^* - \pi_1^* ) \\ \vdots \\ n_s ( p_s^* - \pi_s^* ) \end{bmatrix} \\ G & = & X' N \end{array}$
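The iteration with step halving can be sketched for the simplest case: one parameter and binary responses. In this sketch C is the Hessian of the log-likelihood, as in the definition in the text, so it is negative and the update b - lambda * G / C moves uphill. The function names, the data layout, the convergence test on the change in log-likelihood, and the choice to keep the previous estimate when ten halvings fail are all assumptions made for illustration. The source specifies only the EPSILON= option and the warning.

```python
import math

def log_likelihood(beta, data):
    # data holds (x, y, n): design value, event count, sample size per population
    return sum(y * x * beta - n * math.log1p(math.exp(x * beta))
               for x, y, n in data)

def newton_step_halving(data, beta=0.0, epsilon=1e-5, max_iter=50):
    """One-parameter Newton-Raphson with step halving; returns (beta, warned)."""
    ll = log_likelihood(beta, data)
    for _ in range(max_iter):
        G = 0.0   # gradient of the log-likelihood
        C = 0.0   # Hessian of the log-likelihood (negative in this model)
        for x, y, n in data:
            pi = 1.0 / (1.0 + math.exp(-x * beta))
            G += x * (y - n * pi)
            C -= x * x * n * pi * (1.0 - pi)
        lam = 1.0                              # lambda restarts at one
        new_beta = beta - lam * G / C
        new_ll = log_likelihood(new_beta, data)
        halvings = 0
        while new_ll < ll and halvings < 10:   # likelihood decreased: halve
            lam /= 2.0
            new_beta = beta - lam * G / C
            new_ll = log_likelihood(new_beta, data)
            halvings += 1
        if new_ll < ll:
            return beta, True                  # assume convergence, with warning
        converged = abs(new_ll - ll) < epsilon
        beta, ll = new_beta, new_ll
        if converged:
            break
    return beta, False

# One population, 30 events out of 100: the estimate should approach log(30/70)
beta_hat, warned = newton_step_halving([(1.0, 30, 100)])
```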
