STAT 804

Lecture 11

Likelihood Theory

First we review likelihood theory for conditional and full maximum likelihood estimation.

Suppose the data is $ X=(Y,Z)$ and write the density of $ X$ as

$\displaystyle f(x\vert\theta) = f(y\vert z,\theta)f(z\vert\theta)
$

Differentiate the identity

$\displaystyle 1 = \int f(y\vert z,\theta) dy
$

with respect to $ \theta_j$ (the $ j$th component of $ \theta$) and pull the derivative under the integral sign to get

0 $\displaystyle = \int \frac{\partial f(y\vert z,\theta)}{\partial\theta_j}\, dy$
  $\displaystyle = \int \frac{\partial \log f(y\vert z,\theta)}{\partial\theta_j}\, f(y\vert z,\theta)\, dy$
  $\displaystyle =$   E$\displaystyle _\theta(U_{Y\vert Z;j}(\theta)\vert Z)$

where $ U_{Y\vert Z;j}(\theta)$ is the $ j$th component of $ U_{Y\vert Z}(\theta)$, the derivative of the conditional log-likelihood; $ U_{Y\vert Z}$ is called a conditional score. Since

   E$\displaystyle _\theta(U_{Y\vert Z;j}(\theta)\vert Z) = 0
$

we may take expected values to see that

   E$\displaystyle _\theta(U_{Y\vert Z;j}(\theta)) = 0
$
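To see this identity in a toy case that is not part of these notes (purely illustrative): suppose that given $ Z=z$ the variable $ Y$ is $ N(\theta z,1)$, so that $ \log f(y\vert z,\theta) = -(y-\theta z)^2/2$ up to a constant. Then

$\displaystyle U_{Y\vert Z}(\theta) = z(Y-\theta z), \qquad
{\rm E}_\theta\bigl(U_{Y\vert Z}(\theta)\bigm\vert Z=z\bigr)
= z\bigl\{{\rm E}_\theta(Y\vert Z=z)-\theta z\bigr\} = 0,
$

and averaging over $ Z$ gives the unconditional version as well.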

It is also true that the other two scores $ U_X(\theta)$ and $ U_Z(\theta)$ have mean 0 (when the expectations are computed at the true value of $ \theta$). Now differentiate the identity $ \int \{\partial \log f(y\vert z,\theta)/\partial\theta_j\}\, f(y\vert z,\theta)\, dy = 0$ once more, this time with respect to $ \theta_k$, to get

$\displaystyle 0 = \int \frac{\partial^2 \log f(y\vert z,\theta)}{\partial\theta_j\,\partial\theta_k}\, f(y\vert z,\theta)\, dy
+ \int \frac{\partial \log f(y\vert z,\theta)}{\partial\theta_j}\,
\frac{\partial \log f(y\vert z,\theta)}{\partial\theta_k}\,
f(y\vert z,\theta)\, dy
$

We define the conditional Fisher information matrix $ I_{Y\vert Z}(\theta)$ to have $ jk$th entry

   E$\displaystyle _\theta\left[- \frac{\partial^2 \ell}{\partial\theta_j\partial\theta_k}\Big\vert Z\right]
$

where $ \ell = \log f(y\vert z,\theta)$ is the conditional log-likelihood. The identity above then gives

$\displaystyle I_{Y\vert Z}(\theta) =$   Var$\displaystyle _\theta(U_{Y\vert Z}(\theta)\vert Z)
$
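Continuing the toy normal example from above (again only an illustration), $ -\partial^2\ell/\partial\theta^2 = z^2$ involves no randomness at all, so

$\displaystyle I_{Y\vert Z}(\theta) = z^2 = z^2\,{\rm Var}_\theta(Y\vert Z=z)
= {\rm Var}_\theta\bigl(U_{Y\vert Z}(\theta)\bigm\vert Z=z\bigr),
$

and the two sides of the identity agree.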

The corresponding identities based on $ f_X$ and $ f_Z$ are

$\displaystyle I_X(\theta) =$   Var$\displaystyle _\theta(U_X(\theta))
$

and

$\displaystyle I_Z(\theta) =$   Var$\displaystyle _\theta(U_Z(\theta))
$

Now let's look at the model $ X_t = \rho X_{t-1}+\epsilon_t$ in which the errors $ \epsilon_t$ are iid $ N(0,\sigma^2)$. Putting $ Y=(X_1,\ldots,X_{T-1})$ and $ Z=X_0$ we find

\begin{displaymath}
U_{Y\vert Z}(\rho,\sigma) =
\left[
\begin{array}{c}
\frac{\sum_1^{T-1}X_{t-1}(X_t-\rho X_{t-1})}{\sigma^2} \\
\frac{\sum_1^{T-1}(X_t-\rho X_{t-1})^2}{\sigma^3} -\frac{T-1}{\sigma}
\end{array}
\right]
\end{displaymath}

Differentiating again gives the matrix of second derivatives

\begin{displaymath}
\left[
\begin{array}{cc}
-\frac{\sum_1^{T-1}X_{t-1}^2}{\sigma^2} &
-\frac{2\sum_1^{T-1}X_{t-1}(X_t-\rho X_{t-1})}{\sigma^3} \\
-\frac{2\sum_1^{T-1}X_{t-1}(X_t-\rho X_{t-1})}{\sigma^3} &
-\frac{3\sum_1^{T-1}(X_t-\rho X_{t-1})^2}{\sigma^4} +\frac{T-1}{\sigma^2}
\end{array}
\right]
\end{displaymath}
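As a sanity check on the last two displays (this check is not in the original notes; the function names, the seed, and the values $ \rho=0.6$, $ \sigma=1$, $ X_0=2$, $ T=200$ are arbitrary choices for illustration), the Python sketch below simulates one AR(1) path with Gaussian errors, evaluates the score and second-derivative formulas above, and compares them with finite-difference derivatives of the conditional log-likelihood.

import numpy as np

rng = np.random.default_rng(0)

def simulate_ar1(T, rho, sigma, x0):
    # X_t = rho * X_{t-1} + eps_t with eps_t iid N(0, sigma^2), t = 1, ..., T-1
    x = np.empty(T)
    x[0] = x0
    for t in range(1, T):
        x[t] = rho * x[t - 1] + sigma * rng.standard_normal()
    return x

def cond_loglik(x, rho, sigma):
    # log f(X_1, ..., X_{T-1} | X_0; rho, sigma), additive constant dropped
    r = x[1:] - rho * x[:-1]
    return -(len(x) - 1) * np.log(sigma) - np.sum(r ** 2) / (2 * sigma ** 2)

def cond_score(x, rho, sigma):
    # the conditional score U_{Y|Z}(rho, sigma) displayed above
    r = x[1:] - rho * x[:-1]
    return np.array([np.sum(x[:-1] * r) / sigma ** 2,
                     np.sum(r ** 2) / sigma ** 3 - (len(x) - 1) / sigma])

def cond_hessian(x, rho, sigma):
    # the matrix of second derivatives displayed above
    r = x[1:] - rho * x[:-1]
    a = -np.sum(x[:-1] ** 2) / sigma ** 2
    b = -2 * np.sum(x[:-1] * r) / sigma ** 3
    c = -3 * np.sum(r ** 2) / sigma ** 4 + (len(x) - 1) / sigma ** 2
    return np.array([[a, b], [b, c]])

x = simulate_ar1(200, rho=0.6, sigma=1.0, x0=2.0)
h = 1e-5
num_score = np.array(
    [(cond_loglik(x, 0.6 + h, 1.0) - cond_loglik(x, 0.6 - h, 1.0)) / (2 * h),
     (cond_loglik(x, 0.6, 1.0 + h) - cond_loglik(x, 0.6, 1.0 - h)) / (2 * h)])
num_hess = np.column_stack(
    [(cond_score(x, 0.6 + h, 1.0) - cond_score(x, 0.6 - h, 1.0)) / (2 * h),
     (cond_score(x, 0.6, 1.0 + h) - cond_score(x, 0.6, 1.0 - h)) / (2 * h)])
print(cond_score(x, 0.6, 1.0), num_score)    # the two should agree closely
print(cond_hessian(x, 0.6, 1.0), num_hess)   # likewise for the Hessian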

Taking conditional expectations given $ X_0$, and noting that E$ (X_{t-1}(X_t-\rho X_{t-1})\vert X_0) =$   E$ (X_{t-1}\epsilon_t\vert X_0) = 0$ while E$ ((X_t-\rho X_{t-1})^2\vert X_0) = \sigma^2$, gives

\begin{displaymath}
I_{Y\vert Z}(\rho,\sigma) = \left[
\begin{array}{cc}
\frac{\sum_1^{T-1}{\rm E}(X_{t-1}^2\vert X_0)}{\sigma^2} & 0 \\
0 & \frac{2(T-1)}{\sigma^2}
\end{array}
\right]
\end{displaymath}
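A hedged Monte Carlo check of this matrix (again not in the original notes; the parameter values and replication count are arbitrary): hold $ X_0$ fixed, simulate many independent paths, and compare the sample covariance matrix of the conditional scores with $ I_{Y\vert Z}(\rho,\sigma)$ above, estimating E$ (X_{t-1}^2\vert X_0)$ by the Monte Carlo average.

import numpy as np

rng = np.random.default_rng(1)
rho, sigma, x0, T, R = 0.6, 1.0, 2.0, 50, 20000

# R independent AR(1) paths, all started at the same fixed X_0
x = np.empty((R, T))
x[:, 0] = x0
for t in range(1, T):
    x[:, t] = rho * x[:, t - 1] + sigma * rng.standard_normal(R)

# conditional score of each path, from the formula above
r = x[:, 1:] - rho * x[:, :-1]
u = np.column_stack([np.sum(x[:, :-1] * r, axis=1) / sigma ** 2,
                     np.sum(r ** 2, axis=1) / sigma ** 3 - (T - 1) / sigma])

var_u = np.cov(u, rowvar=False)   # Monte Carlo estimate of Var(U_{Y|Z} | X_0)
info = np.array([[np.sum(np.mean(x[:, :-1] ** 2, axis=0)) / sigma ** 2, 0.0],
                 [0.0, 2.0 * (T - 1) / sigma ** 2]])
print(var_u)
print(info)                       # the two matrices should be close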

To compute $ W_k \equiv$   E$ [X_k^2\vert X_0]$ write $ X_k = \rho X_{k-1}+\epsilon_k$, square both sides, and take conditional expectations (the cross term $ 2\rho X_{k-1}\epsilon_k$ has conditional mean 0) to get

$\displaystyle W_k =\rho^2 W_{k-1} + \sigma^2
$

with $ W_0=X_0^2$. You can check that, provided $ \vert\rho\vert < 1$, $ W_k$ converges to some $ W_\infty$ as $ k\to\infty$. This $ W_\infty$ satisfies $ W_\infty = \rho^2W_\infty + \sigma^2$, which gives

$\displaystyle W_\infty = \frac{\sigma^2}{1-\rho^2}
$
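The recursion is easy to iterate numerically; a minimal sketch (parameter values chosen arbitrarily) showing the convergence to $ \sigma^2/(1-\rho^2)$:

# iterate W_k = rho^2 * W_{k-1} + sigma^2 from W_0 = X_0^2
rho, sigma, x0 = 0.6, 1.0, 5.0
w = x0 ** 2
for k in range(50):
    w = rho ** 2 * w + sigma ** 2
print(w, sigma ** 2 / (1 - rho ** 2))   # both are about 1.5625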

It follows that

\begin{displaymath}
\frac{1}{T} I_{Y\vert Z}(\rho,\sigma) \to
\left[
\begin{array}{cc}
\frac{1}{1-\rho^2} & 0 \\
0 & \frac{2}{\sigma^2}
\end{array}
\right]
\end{displaymath}
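To check this limit numerically (a sketch using the same arbitrary parameter choices as above), compute the exact entries of $ I_{Y\vert Z}/T$ from the $ W_k$ recursion for a long series and compare them with the limiting matrix:

rho, sigma, x0, T = 0.6, 1.0, 5.0, 5000
w, s = x0 ** 2, 0.0
for t in range(T - 1):                  # accumulate W_0 + ... + W_{T-2}
    s += w
    w = rho ** 2 * w + sigma ** 2
print(s / (sigma ** 2 * T), 1 / (1 - rho ** 2))         # (1,1) entries
print(2 * (T - 1) / (sigma ** 2 * T), 2 / sigma ** 2)   # (2,2) entries

The gap between the finite-$ T$ values and the limit shrinks as $ T$ grows.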

Notice that although the conditional Fisher information might have been expected to depend on $ X_0$, the limit of $ I_{Y\vert Z}/T$ does not: the influence of the starting value washes out for long series.




Richard Lockhart
2001-09-30