Stat 804
Lecture 13 Notes
Non-Gaussian series.
The fitting methods we have studied are based on the likelihood
for a normal fit. However, the estimates work reasonably well
even if the errors are not normal.
Example: AR(1) fit. We fit
    X_t = \rho X_{t-1} + \epsilon_t
using
    \hat\rho = \frac{\sum_{t=2}^T X_t X_{t-1}}{\sum_{t=2}^T X_{t-1}^2},
which is consistent for non-Gaussian errors.
(In fact, divide the numerator and the denominator by T and apply the law
of large numbers to T^{-1}\sum_t X_t X_{t-1} and T^{-1}\sum_t X_{t-1}^2 to
see that \hat\rho is consistent.)
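As a quick numerical illustration of this consistency (an added sketch, not part of the original notes), the following Python fragment fits an AR(1) by the least-squares formula above; the true value \rho = 0.6, the seed, the sample sizes and the choice of centred exponential errors are all made up for the example.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
rho_true = 0.6

def simulate_ar1(T, rho, rng):
    """Simulate an AR(1) series driven by iid centred exponential errors
    (mean 0, variance 1, clearly non-normal)."""
    eps = rng.exponential(1.0, size=T) - 1.0
    x = np.zeros(T)
    for t in range(1, T):
        x[t] = rho * x[t - 1] + eps[t]
    return x

for T in (100, 1000, 10000):
    x = simulate_ar1(T, rho_true, rng)
    # least-squares / conditional ML estimate of rho
    rho_hat = np.sum(x[1:] * x[:-1]) / np.sum(x[:-1] ** 2)
    print(f"T = {T:6d}  rho_hat = {rho_hat:.4f}")
\end{verbatim}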
Here is an outline of the logic of what follows. We will assume that
the errors \epsilon_t are iid with mean 0, variance \sigma^2 and finite fourth
moment
\mu_4 = E(\epsilon_t^4) < \infty. We will not assume that
the errors have a normal distribution.
- The estimates of \rho and \sigma^2 are consistent.
- The score function U = U(\rho,\sigma^2) satisfies
  T^{-1/2} U \to MVN(0, B)
  where
  B = \begin{pmatrix} 1/(1-\rho^2) & 0 \\ 0 & (\mu_4-\sigma^4)/(4\sigma^8) \end{pmatrix}.
- The matrix of second derivatives satisfies
  -T^{-1} \partial U/\partial(\rho,\sigma^2) \to A and
  -T^{-1} E[\partial U/\partial(\rho,\sigma^2)] \to A,
  where
  A = \begin{pmatrix} 1/(1-\rho^2) & 0 \\ 0 & 1/(2\sigma^4) \end{pmatrix}.
- If \mathcal{I} is the (conditional) Fisher information
  then T^{-1}\mathcal{I} \to A.
- We can expand U(\hat\rho,\hat\sigma^2) = 0
  about (\rho,\sigma^2) and get
  0 = U(\rho,\sigma^2)
      + \frac{\partial U}{\partial(\rho,\sigma^2)}
        \begin{pmatrix} \hat\rho-\rho \\ \hat\sigma^2-\sigma^2 \end{pmatrix}
      + a negligible remainder.
- So
  \sqrt{T} \begin{pmatrix} \hat\rho-\rho \\ \hat\sigma^2-\sigma^2 \end{pmatrix}
      \to MVN(0, A^{-1} B A^{-1}),
  where
  A^{-1} B A^{-1} = \begin{pmatrix} 1-\rho^2 & 0 \\ 0 & \mu_4-\sigma^4 \end{pmatrix}.
- So
  \sqrt{T}(\hat\rho - \rho) \to N(0, 1-\rho^2)
  even for
  non-normal errors.
- On the other hand the estimate of \sigma^2 has a limiting
  distribution which will be different for non-normal errors (because
  it depends on \mu_4, which is 3\sigma^4 for normal errors and
  something else in general for non-normal errors). A simulation sketch
  illustrating these two limits follows this list.
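Here is a small Monte Carlo sketch (added for illustration, not from the original notes) checking the last two bullets. The error distribution (centred exponential, so \sigma^2 = 1 and \mu_4 = 9), the sample size and the number of replications are arbitrary choices.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(1)
rho, T, n_rep = 0.6, 2000, 500
sigma2, mu4 = 1.0, 9.0  # centred Exp(1) errors: variance 1, fourth moment 9

rho_hats, sig2_hats = [], []
for _ in range(n_rep):
    eps = rng.exponential(1.0, size=T) - 1.0
    x = np.zeros(T)
    for t in range(1, T):
        x[t] = rho * x[t - 1] + eps[t]
    r = np.sum(x[1:] * x[:-1]) / np.sum(x[:-1] ** 2)  # estimate of rho
    resid = x[1:] - r * x[:-1]                        # fitted residuals
    rho_hats.append(r)
    sig2_hats.append(np.mean(resid ** 2))             # estimate of sigma^2

# compare Monte Carlo variances with the two limiting variances
print("T * Var(rho_hat)  =", T * np.var(rho_hats),
      " vs  1 - rho^2 =", 1 - rho ** 2)
print("T * Var(sig2_hat) =", T * np.var(sig2_hats),
      " vs  mu_4 - sigma^4 =", mu4 - sigma2 ** 2,
      " (normal theory would give 2 sigma^4 =", 2 * sigma2 ** 2, ")")
\end{verbatim}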
Here are details.
Consistency: One of our many nearly equivalent
estimates of \rho is
    \hat\rho = \frac{\sum_{t=2}^T X_t X_{t-1}}{\sum_{t=2}^T X_{t-1}^2}.
Divide both top and bottom by T. You need essentially to prove
    T^{-1}\sum_t X_t X_{t-1} \to \rho\sigma^2/(1-\rho^2)
and
    T^{-1}\sum_t X_{t-1}^2 \to \sigma^2/(1-\rho^2).
Each of these is correct and hinges on the fact that these
linear processes are ergodic -- long time averages converge
to expected values. For these particular averages it is possible to
compute means and variances and prove that the mean squared error
converges to 0.
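The mean squared error statement can be checked numerically; the following added sketch (with made-up values of \rho and \sigma^2, and centred exponential errors) estimates the MSE of the lag-one time average as T grows.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(2)
rho, sigma = 0.6, 1.0
gamma1 = rho * sigma ** 2 / (1 - rho ** 2)  # limit of (1/T) sum X_t X_{t-1}

for T in (100, 1000, 10000):
    sq_errors = []
    for _ in range(100):
        eps = sigma * (rng.exponential(1.0, size=T) - 1.0)  # non-normal noise
        x = np.zeros(T)
        for t in range(1, T):
            x[t] = rho * x[t - 1] + eps[t]
        sq_errors.append((np.mean(x[1:] * x[:-1]) - gamma1) ** 2)
    print(f"T = {T:6d}  MSE of the lag-1 time average = {np.mean(sq_errors):.5f}")
\end{verbatim}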
Score function: asymptotic normality
The score function is
    U(\rho,\sigma^2) = \begin{pmatrix} \sigma^{-2}\sum_t \epsilon_t(\rho) X_{t-1} \\ -T/(2\sigma^2) + \sum_t \epsilon_t(\rho)^2/(2\sigma^4) \end{pmatrix},
where \epsilon_t(\rho) = X_t - \rho X_{t-1}.
If \rho and \sigma^2 are the true values of the parameters then
I claim that
    T^{-1/2} U(\rho,\sigma^2) \to MVN(0, B). This
is proved by the martingale central limit theorem. Technically
you fix a vector a = (a_1, a_2) and study
    T^{-1/2} a^\top U,
proving that the limit is
    N(0, a^\top B a). I do here only the
special cases a = (1,0) and a = (0,1). The second of these is
simply
    T^{-1/2} \sum_t (\epsilon_t^2 - \sigma^2)/(2\sigma^4),
which converges by the usual CLT to
    N(0, (\mu_4 - \sigma^4)/(4\sigma^8)). For a = (1,0) the claim is
that
    T^{-1/2} \sigma^{-2} \sum_t \epsilon_t X_{t-1} \to N(0, 1/(1-\rho^2))
because
    Var(\epsilon_t X_{t-1}) = E(\epsilon_t^2) E(X_{t-1}^2) = \sigma^4/(1-\rho^2).
To prove this assertion we define for each T a martingale
    M_k = \sum_{t=2}^k D_t  for k = 2, \ldots, T,
where
    D_t = \epsilon_t X_{t-1},
with
    \mathcal{F}_k the \sigma-field generated by \epsilon_1, \ldots, \epsilon_k.
The martingale property is that
    E(M_{k+1} \mid \mathcal{F}_k) = M_k,
which holds because E(D_{k+1} \mid \mathcal{F}_k) = X_k E(\epsilon_{k+1}) = 0.
The martingale central limit theorem (Hall, P. and Heyde, C. C. (1980).
Martingale Limit Theory and Its Application.
New York: Academic Press) states that
    T^{-1/2} M_T \to N(0, \sigma_*^2)
provided that
    T^{-1} \sum_t E(D_t^2 \mid \mathcal{F}_{t-1}) \to \sigma_*^2 in probability,
and provided that an analogue of Lindeberg's condition holds. Here
I check only the former condition:
    E(D_t^2 \mid \mathcal{F}_{t-1}) = \sigma^2 X_{t-1}^2,
so
    T^{-1} \sum_t E(D_t^2 \mid \mathcal{F}_{t-1}) = \sigma^2 T^{-1} \sum_t X_{t-1}^2 \to \sigma^4/(1-\rho^2)
(by the ergodic theorem or you could compute means and variances);
dividing T^{-1/2} M_T by \sigma^2 then gives the claimed limit.
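Both the conditional-variance condition and the resulting normal limit can be checked by simulation; here is an added sketch with arbitrarily chosen \rho, error law (centred exponential), sample size and number of replications.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(3)
rho, sigma2, T, n_rep = 0.6, 1.0, 2000, 500
target = sigma2 ** 2 / (1 - rho ** 2)  # sigma^4 / (1 - rho^2)

cond_var_avgs, scores = [], []
for _ in range(n_rep):
    eps = rng.exponential(1.0, size=T) - 1.0  # non-normal errors, variance 1
    x = np.zeros(T)
    for t in range(1, T):
        x[t] = rho * x[t - 1] + eps[t]
    d = eps[1:] * x[:-1]                      # martingale differences D_t
    cond_var_avgs.append(sigma2 * np.mean(x[:-1] ** 2))  # avg of E(D_t^2 | past)
    scores.append(np.sum(d) / np.sqrt(T))     # T^{-1/2} M_T

print("mean of (1/T) sum E(D_t^2 | past):", np.mean(cond_var_avgs),
      " target:", target)
print("variance of T^{-1/2} M_T over replications:", np.var(scores),
      " target:", target)
\end{verbatim}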
Second derivative matrix and Fisher information:
the matrix of negative second derivatives is
    -\frac{\partial U}{\partial(\rho,\sigma^2)} = \begin{pmatrix} \sigma^{-2}\sum_t X_{t-1}^2 & \sigma^{-4}\sum_t \epsilon_t(\rho) X_{t-1} \\ \sigma^{-4}\sum_t \epsilon_t(\rho) X_{t-1} & -T/(2\sigma^4) + \sigma^{-6}\sum_t \epsilon_t(\rho)^2 \end{pmatrix}.
If you evaluate at the true parameter value and divide by T the matrix
and the expected value of the matrix converge to
    A = \begin{pmatrix} 1/(1-\rho^2) & 0 \\ 0 & 1/(2\sigma^4) \end{pmatrix}.
(Again this uses the ergodic theorem or a variance calculation.)
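The convergence of the scaled negative second derivative matrix to A can be checked directly; a brief added sketch (illustrative only, with made-up parameter values):
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(4)
rho, sigma2, T = 0.6, 1.0, 50000

eps = rng.exponential(1.0, size=T) - 1.0
x = np.zeros(T)
for t in range(1, T):
    x[t] = rho * x[t - 1] + eps[t]
e = x[1:] - rho * x[:-1]  # epsilon_t(rho) at the true rho

# entries of the negative second derivative matrix, divided by T
h11 = np.sum(x[:-1] ** 2) / sigma2 / T
h12 = np.sum(e * x[:-1]) / sigma2 ** 2 / T
h22 = (-T / (2 * sigma2 ** 2) + np.sum(e ** 2) / sigma2 ** 3) / T

A = np.array([[1 / (1 - rho ** 2), 0.0], [0.0, 1 / (2 * sigma2 ** 2)]])
print("empirical:\n", np.array([[h11, h12], [h12, h22]]))
print("limit A:\n", A)
\end{verbatim}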
Taylor expansion: In the next step we are supposed
to prove that a random vector has a MVN limit. The usual tactic to
prove this uses the so-called Cramér-Wold device -- you prove that
each linear combination of the entries in the vector has a univariate
normal limit.
Then write \theta = (\rho, \sigma^2) and \hat\theta = (\hat\rho, \hat\sigma^2),
and Taylor's theorem is
that
    0 = U(\hat\theta) = U(\theta) + \frac{\partial U}{\partial\theta}(\theta)\,(\hat\theta - \theta) + R.
(Here we are using
    U(\hat\theta) = 0,
and R is a remainder
term -- a random variable with the property that
    P(|R|/\sqrt{T} > \delta) \to 0 for each \delta > 0.) Multiply through by
T^{-1/2} and get
    T^{-1/2} U(\theta) = \left(-T^{-1}\frac{\partial U}{\partial\theta}(\theta)\right)\sqrt{T}(\hat\theta - \theta) - T^{-1/2} R.
It is possible with care to prove that
    T^{-1/2} R \to 0 in probability.
Asymptotic normality: This is a consequence of
Slutsky's theorem applied to the Taylor expansion and the results
above for U and A. According to Slutsky's theorem the asymptotic
distribution of
    \sqrt{T}(\hat\theta - \theta)
is the same as that
of
    A^{-1} T^{-1/2} U(\theta),
which converges in distribution to
    MVN(0, A^{-1} B A^{-1}). Now since A and B are diagonal,
    A^{-1} B A^{-1} = \begin{pmatrix} 1-\rho^2 & 0 \\ 0 & \mu_4 - \sigma^4 \end{pmatrix}.
Behaviour of \hat\rho: pick off the first component
and find
    \sqrt{T}(\hat\rho - \rho) \to N(0, 1-\rho^2).
Notice that this answer is the same for normal and non-normal errors.
Behaviour of
\hat\sigma^2: on the other hand
    \sqrt{T}(\hat\sigma^2 - \sigma^2) \to N(0, \mu_4 - \sigma^4),
which has \mu_4 in it and will match the normal theory limit if and only
if
    \mu_4 = 3\sigma^4.
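For example (an added numerical illustration), if the errors are centred exponential (an Exp(1) variable minus its mean) then \sigma^2 = 1 and \mu_4 = 9, so \sqrt{T}(\hat\sigma^2 - \sigma^2) has limiting variance \mu_4 - \sigma^4 = 8, four times the normal-theory value 2\sigma^4 = 2.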
More general models: For an ARMA(p,q) model the parameter
vector is
    \theta = (a_1, \ldots, a_p, b_1, \ldots, b_q, \sigma^2).
In general
the matrices A and B are of the form
    A = \begin{pmatrix} A_1 & 0 \\ 0 & 1/(2\sigma^4) \end{pmatrix}
and
    B = \begin{pmatrix} A_1 & 0 \\ 0 & (\mu_4-\sigma^4)/(4\sigma^8) \end{pmatrix},
where \mu_4 = E(\epsilon_t^4) and A_1 is a function of the parameters
a_1, \ldots, a_p, b_1, \ldots, b_q only and is the same for both
normal and non-normal data.
Model assessment.
Having fitted an ARIMA model you get (essentially automatically)
fitted residuals
\hat\epsilon_t. Most of the fitting methods lead to
fewer residuals than there were observations in the original series. Since the
parameter estimates are consistent (if the model fitted is correct,
of course) the fitted residuals should be essentially the true
\epsilon_t, which are white noise. We will assess this by plotting the estimated
ACF of the \hat\epsilon_t
and then seeing if the estimates are all close
enough to 0 to pass for white noise.
To judge "close enough" we need asymptotic distribution theory for
autocovariance estimates.
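As an illustration of the idea (an added sketch, not from the original notes), the following fits an AR(1) to a simulated series, forms the fitted residuals and prints their estimated ACF at the first ten lags; how close to 0 is "close enough" is exactly what the distribution theory referred to above will settle. The simulated model, sample size and seed are arbitrary.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(5)
rho, T = 0.6, 1000

# simulate an AR(1) series (here with normal errors; any white noise would do)
eps = rng.normal(0.0, 1.0, size=T)
x = np.zeros(T)
for t in range(1, T):
    x[t] = rho * x[t - 1] + eps[t]

# fit the AR(1) and form the fitted residuals (one fewer than the series length)
rho_hat = np.sum(x[1:] * x[:-1]) / np.sum(x[:-1] ** 2)
resid = x[1:] - rho_hat * x[:-1]

# estimated ACF of the residuals at lags 1..10
r = resid - resid.mean()
denom = np.sum(r ** 2)
acf = [np.sum(r[k:] * r[:-k]) / denom for k in range(1, 11)]
print("rho_hat =", round(rho_hat, 3))
print("residual ACF, lags 1-10:", np.round(acf, 3))
\end{verbatim}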
Richard Lockhart
2001-09-30