The Bienaymé-Galton-Watson (BGW) process was introduced by Irénée-Jules Bienaymé (1845) to explain mathematically the observed phenomenon (Malthus, 1817; de Châteauneuf, 1845) that family names, among both the aristocracy and the bourgeoisie, tend to become extinct. The process has since found uses in many other areas, such as genetics (Ewens, 1969), epidemiology (Becker, 1977), queueing theory (Kendall, 1951), and demography (Keyfitz, 1985). Statistical methods for these branching processes were first developed by Harris (1948), and have been the subject of extensive research for the last two decades.
To define the process, let $X_{n,i}$, $n \ge 1$, $i \ge 1$, be independent, identically distributed random variables, taking values in the nonnegative integers, with probability generating function $f(s) = \sum_{k \ge 0} p_k s^k$. We assume throughout that we are in the supercritical case, i.e., that $m = f'(1) > 1$. The BGW branching process with offspring distribution $(p_k)$, starting from $z_0$ ancestors, is defined recursively by $Z_0 = z_0$ and
$$Z_n = \sum_{i=1}^{Z_{n-1}} X_{n,i}, \qquad n \ge 1, \tag{1}$$
where $\sum_{i=1}^{0}$ is taken to mean 0. We will assume that . It is clear from (1) that $(Z_n)$ is a Markov chain with transition probabilities given by
$$P(Z_{n+1} = j \mid Z_n = i) = p_j^{*i},$$
where $p_j^{*i}$ are the probabilities of an $i$-fold convolution of the offspring distribution.
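The recursive definition and the convolution form of the transition probabilities can be illustrated numerically. The following Python sketch is illustrative only: the offspring distribution `p`, the function names, and the random seed are assumptions chosen for the example and do not come from the paper.

```python
import numpy as np

def convolution_power(p, i):
    """Return the pmf of a sum of i iid offspring counts (i-fold convolution of p)."""
    out = np.array([1.0])              # pmf of the empty sum: point mass at 0
    for _ in range(i):
        out = np.convolve(out, p)
    return out

def transition_prob(p, i, j):
    """P(Z_{n+1} = j | Z_n = i) for an offspring pmf p on {0, 1, ..., len(p) - 1}."""
    row = convolution_power(p, i)
    return float(row[j]) if j < len(row) else 0.0

def simulate_bgw(p, z0, n_generations, rng):
    """Simulate generation sizes Z_0, ..., Z_n by summing iid offspring counts."""
    support = np.arange(len(p))
    sizes = [z0]
    for _ in range(n_generations):
        parents = sizes[-1]
        # The empty sum is 0, so state 0 is absorbing.
        if parents > 0:
            children = int(rng.choice(support, size=parents, p=p).sum())
        else:
            children = 0
        sizes.append(children)
    return sizes

# Hypothetical supercritical offspring distribution on {0, 1, 2} with mean 1.3.
p = np.array([0.2, 0.3, 0.5])
rng = np.random.default_rng(0)
path = simulate_bgw(p, z0=1, n_generations=10, rng=rng)
```

Here `transition_prob(p, 2, 0)` is the chance that two parents both leave no offspring, i.e. `0.2 * 0.2`. Because the chosen `p` puts positive mass on 0, some simulated paths are absorbed at 0.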
Since, unless $p_0 = 0$ (that is, unless every individual has at least one offspring), the process has positive probability of hitting the absorbing state 0, it is not possible to estimate any parameters consistently. Usually, therefore, inferences are made conditional upon non-extinction (Sweeting, 1986, argues the approximate ancillarity of this conditioning). Lockhart (1982) showed that two branching processes whose offspring distributions share the same mean, variance, and lattice, and have finite moments, cannot be distinguished, even on the set of non-extinction, on the basis of a path of observed generation sizes. In essence, large generation sizes behave like normal random variables, since they are sums of a large number of iid random variables. Dion (1974, 1975) exhibited conditionally consistent estimators of the offspring mean and the offspring variance. The estimator of the mean based on the generation sizes $Z_0, \ldots, Z_n$, namely
$$\hat m = \frac{Z_1 + \cdots + Z_n}{Z_0 + \cdots + Z_{n-1}},$$
which Harris derived as the maximum likelihood estimator based on observing the entire family tree, was shown by Feigin (1977) and, independently, by Keiding and Lauritzen (1978) to be a nonparametric maximum likelihood estimator even when only the generation sizes are observed. In general, there are no (conditionally) consistent estimates of other interesting parameters such as the extinction probability or the offspring distribution.
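As a minimal sketch of the mean estimator, assuming its usual ratio form (total offspring in generations 1 through n divided by total parents in generations 0 through n-1), the computation from a path of generation sizes might look as follows. The function name and the observed path are hypothetical.

```python
def harris_mean_estimate(sizes):
    """Ratio estimator of the offspring mean: total children / total parents."""
    if len(sizes) < 2:
        raise ValueError("need at least two generation sizes")
    parents = sum(sizes[:-1])    # Z_0 + ... + Z_{n-1}
    children = sum(sizes[1:])    # Z_1 + ... + Z_n
    if parents == 0:
        raise ValueError("no parents observed, so the estimator is undefined")
    return children / parents

# Hypothetical observed generation sizes Z_0, ..., Z_5.
observed = [1, 2, 3, 5, 6, 9]
m_hat = harris_mean_estimate(observed)   # equals 25 / 17
```

Only the generation sizes enter the computation, which is why the same estimator is available whether the full family tree or just the generation counts are recorded.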
Dion et al. (1982) discussed maximum likelihood estimation of the offspring distribution of the BGW process. They concentrated on the case where the distribution is supported on three points. The results of Lockhart (1982) show that in this case it may be possible to consistently estimate the offspring distribution on the explosion set.
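One way to see why the three-point case is special, under the added assumption that the three support points are known: total probability, mean, and variance then give three linear equations in three unknown probabilities, so the distribution is pinned down by exactly the quantities that can be estimated consistently. The sketch below is illustrative; the function name, support, and moment values are hypothetical.

```python
import numpy as np

def three_point_pmf(support, mean, variance):
    """Solve for the pmf on three known support points from its mean and variance."""
    x = np.asarray(support, dtype=float)
    second_moment = variance + mean ** 2
    # Rows: total probability, first moment, second moment (a Vandermonde system).
    A = np.vstack([np.ones(3), x, x ** 2])
    b = np.array([1.0, mean, second_moment])
    probs = np.linalg.solve(A, b)
    if np.any(probs < -1e-12):
        raise ValueError("no distribution on this support has these moments")
    return np.clip(probs, 0.0, None)

# Moments of the hypothetical pmf (0.2, 0.3, 0.5) on {0, 1, 2}.
pmf = three_point_pmf([0, 1, 2], mean=1.3, variance=0.61)
```

With four or more support points the same system is underdetermined, so distinct offspring distributions can share a mean and a variance.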
Guttorp (1991) gave an algorithm for computing the mle of the offspring distribution which is considerably simpler than the method proposed by Dion et al. (1982). The algorithm and the induced estimate of the offspring variance are described in section 2. We know that the variance is consistently estimable, and that the offspring distribution is not, but we cannot immediately deduce that the mle of the variance (based on the inconsistent mle of the offspring distribution) is consistent. Guttorp (1991) proved consistency for distributions with finite support whose positive probabilities are all bounded below by some positive constant. The latter condition was needed to allow the assumption of fixed lattice size. In this paper we remove this condition. Section 3 is devoted to a technical tool needed to establish consistency, namely a local limit theorem for discrete random variables which does not (as do most such results in the literature) assume that the lattice sizes of the random variables are known and equal. In section 4 we apply this result to show consistency, still under fairly restrictive regularity conditions, and section 5 discusses possible extensions.