Reading for Today's Lecture:
Goals of Today's Lecture:
Large Sample Theory of the MLE
Theorem: Under suitable regularity conditions there is a unique consistent root $\hat\theta$ of the likelihood equations. This root has the property that each of
$$\sqrt{nI(\theta_0)}\,(\hat\theta-\theta_0), \qquad \sqrt{nI(\hat\theta)}\,(\hat\theta-\theta_0), \qquad \sqrt{V(\theta_0)}\,(\hat\theta-\theta_0), \qquad \sqrt{V(\hat\theta)}\,(\hat\theta-\theta_0)$$
converges in distribution to $N(0,1)$, where $I(\theta)$ is the Fisher information per observation and $V(\theta) = -\ell''(\theta)$ is the observed information.
Note: If the square roots are replaced by matrix square roots we can let $\theta$ be vector valued and get $MVN(0,I)$ as the limit law.
Why bother with all these different forms? We actually use the limit
laws to test hypotheses and compute confidence intervals.
We test $H_0: \theta = \theta_0$ using one of the four quantities as the test statistic. To find confidence intervals we use the quantities as pivots. For example the second and fourth limits above lead to the confidence intervals
$$\hat\theta \pm z_{\alpha/2}\big/\sqrt{nI(\hat\theta)} \qquad\text{and}\qquad \hat\theta \pm z_{\alpha/2}\big/\sqrt{V(\hat\theta)}.$$
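As a concrete illustration, here is a minimal Python sketch of the interval based on the fourth limit for an Exponential($\theta$) sample, where the MLE is $\hat\theta = 1/\bar X$ and the observed information is $V(\theta) = n/\theta^2$; the simulated data and the 95% level are assumptions made only for this example.

    import numpy as np
    from scipy import stats

    # Hypothetical example: X_1, ..., X_n i.i.d. Exponential with rate theta.
    # log-likelihood: l(theta) = n*log(theta) - theta*sum(x)
    # MLE: theta_hat = 1/mean(x); observed information: V(theta) = n/theta**2.
    rng = np.random.default_rng(0)
    x = rng.exponential(scale=1 / 2.5, size=100)   # true rate theta_0 = 2.5 (assumed)
    n = len(x)

    theta_hat = 1 / x.mean()                # unique root of the likelihood equation
    obs_info = n / theta_hat**2             # V(theta_hat) = -l''(theta_hat)
    z = stats.norm.ppf(0.975)               # 95% two-sided critical value

    # Wald-type interval from the fourth limit: theta_hat +/- z / sqrt(V(theta_hat))
    lo, hi = theta_hat - z / np.sqrt(obs_info), theta_hat + z / np.sqrt(obs_info)
    print(f"theta_hat = {theta_hat:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")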
Estimating Equations
An estimating equation $U(\theta) = 0$ is unbiased if $E_\theta[U(\theta)] = 0$ for all $\theta$.
Theorem: Suppose $\hat\theta$ is a consistent root of the unbiased estimating equation $U(\theta) = 0$, where $U(\theta) = \sum_{i=1}^n u(X_i,\theta)$. Then under regularity conditions
$$\sqrt{n}\,(\hat\theta - \theta_0) \Rightarrow N\!\left(0,\; \frac{\mathrm{Var}_{\theta_0}[u(X_1,\theta_0)]}{\big(E_{\theta_0}[\partial u(X_1,\theta_0)/\partial\theta]\big)^2}\right).$$
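Here is a small Monte Carlo sketch (not from the notes; the Poisson zero-indicator equation is an assumed example) that checks the limiting variance formula above for a simple unbiased estimating equation.

    import numpy as np

    # Assumed illustration: X_i ~ Poisson(theta_0), with the unbiased estimating
    # equation U(theta) = sum_i (1{X_i = 0} - exp(-theta)) = 0,
    # whose root is theta_hat = -log(mean(1{X_i = 0})).
    # The variance formula gives Var[u] / (E[du/dtheta])^2 = exp(theta_0) - 1.
    rng = np.random.default_rng(1)
    theta_0, n, reps = 1.0, 500, 2000

    z = np.empty(reps)
    for r in range(reps):
        x = rng.poisson(theta_0, size=n)
        theta_hat = -np.log(np.mean(x == 0))      # root of the estimating equation
        z[r] = np.sqrt(n) * (theta_hat - theta_0)

    print("simulated variance of sqrt(n)(theta_hat - theta_0):", z.var())
    print("predicted limiting variance:", np.exp(theta_0) - 1)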
Method of Moments
Basic strategy: set sample moments equal to population moments and solve for the parameters.
Definition: The $k^{\rm th}$ sample moment (about the origin) is
$$\hat\mu_k' = \frac{1}{n}\sum_{i=1}^n X_i^k.$$
(Central moments are
$$\hat\mu_k = \frac{1}{n}\sum_{i=1}^n (X_i - \bar X)^k.)$$
If we have $p$ parameters we can estimate the parameters $\theta_1,\ldots,\theta_p$ by solving the system of $p$ equations
$$\hat\mu_k' = \mu_k'(\theta_1,\ldots,\theta_p), \qquad k = 1,\ldots,p,$$
where $\mu_k'(\theta) = E_\theta(X^k)$ is the corresponding population moment.
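To make the recipe concrete, here is a minimal Python sketch (the Beta model and simulated data are assumptions made for illustration) that solves the $p = 2$ moment equations numerically:

    import numpy as np
    from scipy.optimize import fsolve

    # Assumed illustration with p = 2 parameters: X_i ~ Beta(a, b), whose first
    # two population moments are
    #   mu1(a, b) = a/(a+b),   mu2(a, b) = a(a+1)/[(a+b)(a+b+1)].
    rng = np.random.default_rng(3)
    x = rng.beta(2.0, 5.0, size=500)

    m1, m2 = x.mean(), np.mean(x**2)          # first two sample moments

    def moment_equations(params):
        a, b = params
        return [a / (a + b) - m1,
                a * (a + 1) / ((a + b) * (a + b + 1)) - m2]

    a_hat, b_hat = fsolve(moment_equations, x0=[1.0, 1.0])
    print("method of moments estimates:", a_hat, b_hat)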
Gamma Example
The Gamma($\alpha,\beta$) density is
$$f(x;\alpha,\beta) = \frac{1}{\beta\Gamma(\alpha)}\left(\frac{x}{\beta}\right)^{\alpha-1} e^{-x/\beta}\, 1(x>0),$$
with mean $\alpha\beta$ and variance $\alpha\beta^2$. Setting the first two sample moments equal to the population moments gives
$$\bar X = \alpha\beta, \qquad \frac{1}{n}\sum_{i=1}^n X_i^2 = \alpha(\alpha+1)\beta^2,$$
whose solution is $\hat\alpha = \bar X^2/\hat\sigma^2$ and $\hat\beta = \hat\sigma^2/\bar X$, where $\hat\sigma^2 = n^{-1}\sum X_i^2 - \bar X^2$.
These equations are much easier to solve than the likelihood equations. The latter involve the digamma function $\psi(\alpha) = \frac{d}{d\alpha}\log\Gamma(\alpha)$, which must be evaluated numerically.
Why bother doing the Newton-Raphson steps? Why not just use the method of moments estimates? The answer is that the method of moments estimates are not usually as close to the right answer as the MLEs.
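The following Python sketch illustrates the comparison for the Gamma example: it computes the method of moments estimates and then refines $\hat\alpha$ with Newton-Raphson on the profile likelihood equation $\log\alpha - \psi(\alpha) = \log\bar X - \overline{\log X}$ (the simulated data and the number of iterations are assumptions made for illustration).

    import numpy as np
    from scipy.special import digamma, polygamma

    # Assumed data from a Gamma(shape=3, scale=2) population.
    rng = np.random.default_rng(2)
    x = rng.gamma(shape=3.0, scale=2.0, size=200)

    xbar, s2 = x.mean(), x.var()
    alpha = xbar**2 / s2            # method of moments: alpha_hat = xbar^2 / sigma_hat^2
    beta = s2 / xbar                # method of moments: beta_hat  = sigma_hat^2 / xbar
    print("method of moments:", alpha, beta)

    s = np.log(xbar) - np.mean(np.log(x))
    for _ in range(6):              # Newton-Raphson on g(a) = log(a) - psi(a) - s
        g = np.log(alpha) - digamma(alpha) - s
        gprime = 1 / alpha - polygamma(1, alpha)
        alpha -= g / gprime
    beta = xbar / alpha             # at the MLE, beta_hat = xbar / alpha_hat
    print("MLE:", alpha, beta)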
Rough principle: A good estimate $\hat\theta$ of $\theta$ is usually close to $\theta_0$ if $\theta_0$ is the true value of $\theta$. Closer estimates, more often, are better estimates.
This principle must be quantified if we are to ``prove'' that the MLE is a good estimate. In the Neyman-Pearson spirit we measure average closeness.
Definition: The Mean Squared Error (MSE) of an estimator $\hat\theta$ is the function
$$MSE_{\hat\theta}(\theta) = E_\theta\big[(\hat\theta - \theta)^2\big].$$
Standard identity:
$$MSE_{\hat\theta}(\theta) = \mathrm{Var}_\theta(\hat\theta) + \big(\mathrm{Bias}_{\hat\theta}(\theta)\big)^2, \qquad \mathrm{Bias}_{\hat\theta}(\theta) = E_\theta(\hat\theta) - \theta.$$
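The identity follows by adding and subtracting $E_\theta(\hat\theta)$ inside the square:
$$
\begin{aligned}
MSE_{\hat\theta}(\theta) &= E_\theta\big[(\hat\theta - E_\theta\hat\theta + E_\theta\hat\theta - \theta)^2\big] \\
&= E_\theta\big[(\hat\theta - E_\theta\hat\theta)^2\big] + 2\big(E_\theta\hat\theta - \theta\big)\,E_\theta\big[\hat\theta - E_\theta\hat\theta\big] + \big(E_\theta\hat\theta - \theta\big)^2 \\
&= \mathrm{Var}_\theta(\hat\theta) + \big(\mathrm{Bias}_{\hat\theta}(\theta)\big)^2,
\end{aligned}
$$
since the cross term vanishes.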
Primitive example: I take a coin from my pocket and toss it 6 times. I get HTHTTT. The MLE of the probability of heads is $\hat p = 2/6 = 1/3$. An alternative estimate is $\tilde p = 1/2$. That is, $\tilde p$ ignores the data and guesses the coin is fair. The MSEs of these two estimators are
$$MSE_{\hat p}(p) = \frac{p(1-p)}{6} \qquad\text{and}\qquad MSE_{\tilde p}(p) = \left(p - \tfrac12\right)^2.$$
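A short Python sketch (illustrative only; the grid of $p$ values is arbitrary) tabulates the two MSE functions and shows that each estimator wins on part of the parameter space:

    import numpy as np

    # Evaluate the two MSE functions from the coin example on a grid of p.
    # mse_mle:  MSE of p_hat = X/6   (variance p(1-p)/6, no bias)
    # mse_half: MSE of the constant guess p_tilde = 1/2 (pure squared bias)
    p = np.linspace(0, 1, 11)
    mse_mle = p * (1 - p) / 6
    mse_half = (p - 0.5) ** 2

    for pi, m1, m2 in zip(p, mse_mle, mse_half):
        better = "p_hat" if m1 < m2 else "p_tilde"
        print(f"p = {pi:.1f}:  MSE(p_hat) = {m1:.4f}  MSE(p_tilde) = {m2:.4f}  smaller: {better}")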
Now suppose I did the same experiment with a thumbtack. The tack can land point up (U) or tipped over (O). If I get UOUOOO how should I estimate $p$, the probability of U? The mathematics is identical to the above, but it seems clear that there is less reason to think $\tilde p$ is better than $\hat p$, since there is less reason to believe $p \approx 1/2$ than with a coin.
The problem above illustrates a general phenomenon. An estimator can be good for some values of $\theta$ and bad for others. When comparing $\hat\theta$ and $\tilde\theta$, two estimators of $\theta$, we will say that $\hat\theta$ is better than $\tilde\theta$ if it has uniformly smaller MSE:
$$MSE_{\hat\theta}(\theta) \le MSE_{\tilde\theta}(\theta) \qquad \text{for every } \theta.$$
The definition raises the question of the existence of a best estimate - one which is better than every other estimator. There is no such estimate. Suppose $\hat\theta$ were such a best estimate. Fix a $\theta^*$ in $\Theta$ and let $\tilde\theta \equiv \theta^*$. Then the MSE of $\tilde\theta$ is 0 when $\theta = \theta^*$. Since $\hat\theta$ is better than $\tilde\theta$ we must have $MSE_{\hat\theta}(\theta^*) = 0$, so that $\hat\theta = \theta^*$ with probability 1; that is, $\hat\theta$ is a constant. Since $\theta^*$ was arbitrary, $\hat\theta$ would have to be the constant $\theta^*$ for every $\theta^*$ in $\Theta$ at once, which is impossible unless $\Theta$ contains a single point.