Bandwidth Selection

The KDE Procedure

In the univariate case, PROC KDE provides several bandwidth selection methods. Following the recommendations of Jones, Marron, and Sheather (1996), the default method is the plug-in formula of Sheather and Jones.

This method solves the fixed-point equation

$h = [ \frac{R(\varphi)}{nR(\hat{f}^{''}_{g(h)}) (\int x^2 \varphi(x) dx)^2} ] ^{1/5}$

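where $R(\varphi) = \int \varphi^2(x) dx$ and $\hat{f}^{''}_{g(h)}$ is a kernel estimate of the second derivative of the density, computed with a pilot bandwidth $g(h)$ that depends on $h$.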

PROC KDE solves this equation by first evaluating it on a grid of values spaced equally on a log scale. The largest two values from this grid that bound a solution are then used as starting values for a bisection algorithm.
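
As a rough illustration of the grid-then-bisection strategy, the following Python sketch scans a log-spaced grid for the uppermost pair of adjacent points that bracket a solution of $h = \mathrm{rhs}(h)$ and then bisects between them. The function name `solve_fixed_point`, the callable `rhs`, the grid size, and the tolerance are assumptions made for illustration. They are not PROC KDE's internals, and the Sheather-Jones right-hand side itself is not implemented here.

```python
import math


def solve_fixed_point(rhs, h_min, h_max, n_grid=50, tol=1e-10):
    """Solve h = rhs(h) on [h_min, h_max] by a log-grid scan plus bisection.

    rhs is a hypothetical callable standing in for the right-hand side of
    the fixed-point equation; n_grid and tol are illustrative defaults.
    """
    def g(h):
        # A fixed point of rhs is a root of g.
        return h - rhs(h)

    # Grid of candidate bandwidths spaced equally on a log scale.
    step = (math.log(h_max) - math.log(h_min)) / (n_grid - 1)
    grid = [h_min * math.exp(i * step) for i in range(n_grid)]
    vals = [g(h) for h in grid]

    # Scan from the top so the bracket found is the one with the largest h.
    for i in range(n_grid - 1, 0, -1):
        if vals[i] == 0.0:
            return grid[i]
        if vals[i - 1] * vals[i] < 0.0:
            lo, hi, g_lo = grid[i - 1], grid[i], vals[i - 1]
            break
    else:
        if vals[0] == 0.0:
            return grid[0]
        raise ValueError("no sign change found on the grid")

    # Plain bisection on the bracketing interval.
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_mid == 0.0:
            return mid
        if g_lo * g_mid < 0.0:
            hi = mid
        else:
            lo, g_lo = mid, g_mid
    return 0.5 * (lo + hi)
```

For example, `solve_fixed_point(lambda h: 0.5 * (h + 2.0 / h), 0.01, 100.0)` converges to approximately $\sqrt{2}$, the only fixed point of that toy map on the interval.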

The simple normal reference rule works by assuming $\hat f$ is Gaussian in the preceding fixed-point equation. This results in

$h = {\hat \sigma} [4/(3n)]^{1/5}$

where ${\hat \sigma}$ is the sample standard deviation.
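
The constant $4/3$ can be checked directly. The check takes $\varphi$ to be the standard normal density (an assumption here, although it is the choice consistent with the stated result) and replaces $\hat f$ by a normal density $f$ with standard deviation $\sigma$:

$R(\varphi) = \frac{1}{2\sqrt{\pi}}, \quad \int x^2 \varphi(x) dx = 1, \quad R(f^{''}) = \frac{3}{8\sqrt{\pi}\sigma^5}$

$h^5 = \frac{1/(2\sqrt{\pi})}{n \cdot 3/(8\sqrt{\pi}\sigma^5)} = \frac{4\sigma^5}{3n}$

Taking the fifth root and substituting ${\hat \sigma}$ for $\sigma$ gives the normal reference rule.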

Silverman's rule of thumb (1986, §3.4.2) is computed as

$h = 0.9 \min [{\hat \sigma},(Q_{3}-Q_{1})/1.34] n^{-1/5}$

where $Q_{3}$ and $Q_{1}$ are the third and first sample quartiles, respectively.

The oversmoothed bandwidth is computed as

$h = 3{\hat \sigma} [1/(70 \sqrt{\pi} n)]^{1/5}$
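
The three closed-form rules can be compared side by side with a short Python sketch. The function names are hypothetical. The sketch assumes the sample standard deviation uses the $n-1$ divisor and that quartiles come from `statistics.quantiles` with `method="inclusive"`; the quartile definition that PROC KDE actually uses may differ.

```python
import math
import statistics


def normal_reference_bandwidth(x):
    """h = sigma * [4 / (3n)]^(1/5)."""
    n = len(x)
    sigma = statistics.stdev(x)  # assumed n - 1 divisor
    return sigma * (4.0 / (3.0 * n)) ** 0.2


def silverman_bandwidth(x):
    """h = 0.9 * min(sigma, IQR / 1.34) * n^(-1/5)."""
    n = len(x)
    sigma = statistics.stdev(x)
    # Assumed quartile definition; other conventions give slightly
    # different values of Q1 and Q3.
    q1, _, q3 = statistics.quantiles(x, n=4, method="inclusive")
    return 0.9 * min(sigma, (q3 - q1) / 1.34) * n ** -0.2


def oversmoothed_bandwidth(x):
    """h = 3 * sigma * [1 / (70 * sqrt(pi) * n)]^(1/5)."""
    n = len(x)
    sigma = statistics.stdev(x)
    return 3.0 * sigma * (1.0 / (70.0 * math.sqrt(math.pi) * n)) ** 0.2
```

Written as multiples of ${\hat \sigma} n^{-1/5}$, the constants are at most 0.9 for Silverman's rule, about 1.059 for the normal reference rule, and about 1.144 for the oversmoothed bandwidth, so the three results are always ordered from smallest to largest in that sequence.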

When you specify a WEIGHT variable, PROC KDE uses weighted versions of $Q_{3}$, $Q_{1}$, and ${\hat \sigma}$ in the preceding expressions. The weighted quartiles are computed as weighted order statistics, and the weighted variance takes the form

${\hat \sigma}^2=\frac{\sum_{i=1}^n W_{i}(X_{i}-\overline{X})^2}{\sum_{i=1}^n W_{i}}$

where $\overline{X} = (\sum_{i=1}^n W_{i}X_{i}) / (\sum_{i=1}^n W_{i})$ is the weighted sample mean.
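
A minimal Python sketch of the weighted mean and variance follows. The function name `weighted_mean_and_variance` is hypothetical, and the weighted quartiles are left out because the exact order-statistic definition is not spelled out here.

```python
def weighted_mean_and_variance(x, w):
    """Return the weighted mean and the weighted variance of x.

    Both sums are divided by the total weight, matching the formula
    above (there is no n - 1 style correction).
    """
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    total = sum(w)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    mean = sum(wi * xi for xi, wi in zip(x, w)) / total
    var = sum(wi * (xi - mean) ** 2 for xi, wi in zip(x, w)) / total
    return mean, var
```

With equal weights this reduces to the variance with divisor $n$ rather than $n-1$, so it is slightly smaller than the usual unweighted sample variance.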

For the bivariate case, Wand and Jones (1993) note that automatic bandwidth selection is both difficult and computationally expensive. Their study of various ways of specifying a bandwidth matrix also shows that using two bandwidths, one in each coordinate direction, is often adequate. PROC KDE enables you to adjust the two bandwidths by specifying a multiplier for each of the default bandwidths recommended by Bowman and Foster (1992):

$h_{X} = {\hat \sigma}_{X} n^{-1/6}$

$h_{Y} = {\hat \sigma}_{Y} n^{-1/6}$

Here ${\hat \sigma}_{X}$ and ${\hat \sigma}_{Y}$ are the sample standard deviations of X and Y, respectively. These are the optimal bandwidths for two independent normal variables that have the same variances as X and Y. They are, therefore, conservative in the sense that they tend to oversmooth the surface.

You can specify the BWM= option to adjust the aforementioned bandwidths to provide the appropriate amount of smoothing for your application.
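
The effect of a multiplier on the default bivariate bandwidths can be sketched in Python as follows. The function name `bivariate_bandwidths` and its `bwm` argument are hypothetical stand-ins that mirror the BWM= option; this is not PROC KDE syntax, and the sample standard deviation is assumed to use the $n-1$ divisor.

```python
import statistics


def bivariate_bandwidths(x, y, bwm=(1.0, 1.0)):
    """Return (h_X, h_Y) = (bwm_X * sigma_X, bwm_Y * sigma_Y) * n^(-1/6)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    scale = n ** (-1.0 / 6.0)
    # Default bandwidths, each scaled by its own illustrative multiplier.
    h_x = bwm[0] * statistics.stdev(x) * scale
    h_y = bwm[1] * statistics.stdev(y) * scale
    return h_x, h_y
```

Because the defaults tend to oversmooth, multipliers below 1 are the natural first thing to try when the estimated surface hides structure you expect to see; multipliers above 1 smooth further.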
