

STAT 801 Lecture 20

Reading for Today's Lecture:

Goals of Today's Lecture:


Hypothesis Testing

Jargon defined so far: hypothesis, power function, level, critical region, null hypothesis, simple hypothesis, Type I error, Type II error.

Simple versus Simple testing

Theorem: For each fixed $\lambda$ the quantity $\beta+\lambda\alpha$ is minimized by any $\phi$ which has

\begin{displaymath}\phi(x) =\left\{\begin{array}{ll}
1 & \frac{f_1(x)}{f_0(x)} > \lambda
\\
0 & \frac{f_1(x)}{f_0(x)} < \lambda
\end{array}\right.
\end{displaymath}

Neyman and Pearson suggested that in practice the two kinds of errors might well have unequal consequences. They suggested that rather than minimize any quantity of the form above you pick the more serious kind of error, label it Type I, and require your rule to hold the probability $\alpha$ of a Type I error to be no more than some prespecified level $\alpha_0$. (This value $\alpha_0$ is typically 0.05 these days, chiefly for historical reasons.)

The Neyman and Pearson approach is then to minimize $\beta$ subject to the constraint $\alpha \le \alpha_0$. Usually this is really equivalent to the constraint $\alpha=\alpha_0$ (because if you use $\alpha<\alpha_0$ you could make R larger, keep $\alpha \le \alpha_0$ and make $\beta$ smaller). For discrete models, however, this may not be possible.

Example: Suppose X is Binomial(n,p) and either $p=p_0=1/2$ or $p=p_1=3/4$. If R is any critical region (so R is a subset of $\{0,1,\ldots,n\}$) then

\begin{displaymath}P_{1/2}(X\in R) = \frac{k}{2^n}
\end{displaymath}

for some integer k. If we want $\alpha_0=0.05$ with, say, n=5 then we have to recognize that the possible values of $\alpha$ are 0, 1/32=0.03125, 2/32=0.0625 and so on. For $\alpha_0=0.05$ we must use one of three rejection regions: $R_1$ which is the empty set, $R_2$ which is the set $\{0\}$ or $R_3$ which is the set $\{5\}$. These three regions have $\alpha$ equal to 0, 0.03125 and 0.03125 respectively and $\beta$ equal to 1, $1-(1/4)^5$ and $1-(3/4)^5$ respectively, so that $R_3$ minimizes $\beta$ subject to $\alpha\le 0.05$. If we raise $\alpha_0$ slightly to 0.0625 then the possible rejection regions are $R_1$, $R_2$, $R_3$ and a fourth region $R_4=R_2\cup R_3$. The first three have the same $\alpha$ and $\beta$ as before while $R_4$ has $\alpha=\alpha_0=0.0625$ and $\beta=1-(3/4)^5-(1/4)^5$. Thus $R_4$ is optimal! The trouble is that this region says that if all the trials are failures we should choose p=3/4 rather than p=1/2, even though the latter makes 5 failures much more likely than the former.
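These calculations are easy to check by brute force. The following short Python sketch (my addition, not part of the original notes; the function name best_region is mine) enumerates every subset of $\{0,1,\ldots,5\}$, keeps those with level at most $\alpha_0$ and reports one with the smallest $\beta$:

\begin{verbatim}
from itertools import combinations
from math import comb

n, p0, p1 = 5, 0.5, 0.75
f0 = [comb(n, x) * p0**x * (1 - p0)**(n - x) for x in range(n + 1)]
f1 = [comb(n, x) * p1**x * (1 - p1)**(n - x) for x in range(n + 1)]

def best_region(alpha0):
    # Enumerate every subset R of {0,...,n} with level <= alpha0 and
    # return the one minimizing beta = P_{p1}(X not in R).
    best = None
    for k in range(n + 2):
        for R in combinations(range(n + 1), k):
            alpha = sum(f0[x] for x in R)
            beta = 1 - sum(f1[x] for x in R)
            if alpha <= alpha0 and (best is None or beta < best[2]):
                best = (R, alpha, beta)
    return best

print(best_region(0.05))    # ((5,), 1/32, 1 - (3/4)**5)
print(best_region(0.0625))  # ((0, 5), 2/32, 1 - (3/4)**5 - (1/4)**5)
\end{verbatim}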

The problem in the example is one of discreteness. Here's how we get around the problem. First we expand the set of possible values of $\phi$ to include numbers between 0 and 1. Values of $\phi(x)$ between 0 and 1 represent the chance that we choose H1 given that we observe x; the idea is that we actually toss a (biased) coin to decide! This tactic will show us the kinds of rejection regions which are sensible. In practice we then restrict our attention to levels $\alpha_0$ for which the best $\phi$ is always either 0 or 1. In the binomial example we will insist that the value of $\alpha_0$ be either 0 or $P_{\theta_0} ( X\ge 5)$ or $P_{\theta_0} ( X\ge 4)$ or ...

Here is a smaller example: take n=3 in the binomial example above. There are 4 possible values of X and $2^4=16$ possible rejection regions. Here is a table of the levels for each possible rejection region R:

R                              $\alpha$
{}                             0
{3}, {0}                       1/8
{0,3}                          2/8
{1}, {2}                       3/8
{0,1}, {0,2}, {1,3}, {2,3}     4/8
{0,1,3}, {0,2,3}               5/8
{1,2}                          6/8
{0,1,2}, {1,2,3}               7/8
{0,1,2,3}                      1

The best level 2/8 test has rejection region $\{0,3\}$ and $\beta = 1-[(3/4)^3+(1/4)^3] = 36/64$. If, instead, we permit randomization then the best level 2/8 test rejects when X=3 and, when X=2, tosses a coin which has chance 1/3 of landing heads, rejecting if the coin lands heads. The level of this test is 1/8+(1/3)(3/8) = 2/8 and the probability of a Type II error is $\beta =1-[(3/4)^3 +(1/3)(3)(3/4)^2(1/4)] = 28/64$.
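Again the arithmetic is easy to confirm numerically; here is a minimal Python check (my addition; the variable names are mine) of both the non-randomized level 2/8 test and the randomized one just described:

\begin{verbatim}
from math import comb

n, p0, p1 = 3, 0.5, 0.75
f0 = [comb(n, x) * p0**x * (1 - p0)**(n - x) for x in range(n + 1)]
f1 = [comb(n, x) * p1**x * (1 - p1)**(n - x) for x in range(n + 1)]

# Non-randomized test: reject when X is 0 or 3.
alpha_nr = f0[0] + f0[3]               # 2/8
beta_nr = 1 - (f1[0] + f1[3])          # 36/64

# Randomized test: reject when X = 3; when X = 2 reject with probability 1/3.
alpha_r = f0[3] + (1/3) * f0[2]        # 1/8 + (1/3)(3/8) = 2/8
beta_r = 1 - (f1[3] + (1/3) * f1[2])   # 28/64

print(alpha_nr, beta_nr)   # 0.25 0.5625
print(alpha_r, beta_r)     # 0.25 0.4375
\end{verbatim}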

Definition: A hypothesis test is a function $\phi(x)$ whose values are always in [0,1]. If we observe X=x then we choose H1 with conditional probability $\phi(x)$. In this case we have

\begin{displaymath}\pi(\theta) = E_\theta(\phi(X))
\end{displaymath}


\begin{displaymath}\alpha = E_0(\phi(X))
\end{displaymath}

and

\begin{displaymath}\beta = E_1(1-\phi(X))
\end{displaymath}

The Neyman Pearson Lemma

Theorem: In testing $f_0$ against $f_1$ the probability $\beta$ of a Type II error is minimized, subject to $\alpha \le \alpha_0$, by the test function:

\begin{displaymath}\phi(x) =\left\{\begin{array}{ll}
1 & \frac{f_1(x)}{f_0(x)} > \lambda
\\
\gamma & \frac{f_1(x)}{f_0(x)} = \lambda
\\
0 & \frac{f_1(x)}{f_0(x)} < \lambda
\end{array}\right.
\end{displaymath}

where $\lambda$ is the largest constant such that

\begin{displaymath}P_0( \frac{f_1(x)}{f_0(x)} \ge \lambda) \ge \alpha_0
\end{displaymath}

and

\begin{displaymath}P_0( \frac{f_1(x)}{f_0(x)}\le \lambda) \ge 1-\alpha_0
\end{displaymath}

and where $\gamma$ is any number chosen so that

\begin{displaymath}E_0(\phi(X)) = P_0( \frac{f_1(x)}{f_0(x)} > \lambda) + \gamma P_0( \frac{f_1(x)}{f_0(x)}
=\lambda) = \alpha_0
\end{displaymath}

The value of $\gamma$ is unique if $P_0( \frac{f_1(x)}{f_0(x)} = \lambda) > 0$.

Example: In the Binomial(n,p) model with $p_0=1/2$ and $p_1=3/4$ the ratio $f_1/f_0$ is

\begin{displaymath}3^x \, 2^{-n}
\end{displaymath}

Now if n=5 then this ratio must be one of the numbers 1, 3, 9, 27, 81, 243 divided by 32. Suppose we have $\alpha_0 = 0.05$. The value of $\lambda$ must be one of the possible values of $f_1/f_0$. If we try $\lambda = 243/32$ then

\begin{displaymath}P_0(3^X 2^{-5} \ge 243/32) = P_0(X=5) = 1/32 < 0.05
\end{displaymath}

and

\begin{displaymath}P_0(3^X 2^{-5} \ge 81/32) = P_0(X \ge 4) = 6/32 > 0.05
\end{displaymath}

This means that $\lambda=81/32$. Since

\begin{displaymath}P_0(3^X 2^{-5} > 81/32) = P_0(X=5) = 1/32
\end{displaymath}

we must solve

\begin{displaymath}P_0(X=5) + \gamma P_0(X=4) = 0.05
\end{displaymath}

for $\gamma$ and find

\begin{displaymath}\gamma = \frac{0.05-1/32}{5/32}= 0.12
\end{displaymath}
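The values of $\lambda$ and $\gamma$ can be reproduced mechanically from the definition in the theorem. The short Python sketch below (my addition, not part of the notes, for the same n=5, $p_0=1/2$ setup) searches the possible values of the likelihood ratio for the largest $\lambda$ with $P_0(f_1/f_0 \ge \lambda) \ge \alpha_0$ and then solves for $\gamma$:

\begin{verbatim}
from math import comb

n, p0, alpha0 = 5, 0.5, 0.05
f0 = [comb(n, x) * p0**x * (1 - p0)**(n - x) for x in range(n + 1)]
ratio = [3**x * 2**(-n) for x in range(n + 1)]   # f1(x)/f0(x)

# lambda: the largest possible ratio value with P_0(f1/f0 >= lambda) >= alpha0.
for lam in sorted(set(ratio), reverse=True):
    if sum(f0[x] for x in range(n + 1) if ratio[x] >= lam) >= alpha0:
        break

p_gt = sum(f0[x] for x in range(n + 1) if ratio[x] > lam)
p_eq = sum(f0[x] for x in range(n + 1) if ratio[x] == lam)
gamma = (alpha0 - p_gt) / p_eq

print(lam, gamma)   # lam = 81/32 = 2.53125, gamma = 0.12
\end{verbatim}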

NOTE: No one ever uses this procedure. Instead the value of $\alpha_0$ used in discrete problems is chosen to be a possible value of the rejection probability when $\gamma=0$ (or $\gamma=1$). When the sample size is large you can come very close to any desired $\alpha_0$ with a non-randomized test.

If $\alpha_0=6/32$ then the optimal test rejects whenever $X\ge 4$; we can describe it either by taking $\lambda=27/32$ and $\gamma=0$ or by taking $\lambda=81/32$ and $\gamma=1$. However, our definition of $\lambda$ in the theorem (the largest such constant) makes $\lambda=81/32$ and $\gamma=1$.

When the theorem is used for continuous distributions it can be the case that the cdf of $f_1(X)/f_0(X)$ has a flat spot where it is equal to $1-\alpha_0$. This is the point of the word ``largest'' in the theorem.

Example: If $X_1,\ldots,X_n$ are iid $N(\mu,1)$ and we have $\mu_0=0$ and $\mu_1 >0$ then

\begin{displaymath}\frac{f_1(X_1,\ldots,X_n)}{f_0(X_1,\ldots,X_n)}
=
\exp\{\mu_1 \sum X_i -n\mu_1^2/2 - \mu_0 \sum X_i + n\mu_0^2/2\}
\end{displaymath}

which simplifies to

\begin{displaymath}\exp\{\mu_1 \sum X_i -n\mu_1^2/2 \}
\end{displaymath}

We now have to choose $\lambda$ so that

\begin{displaymath}P_0(\exp\{\mu_1 \sum X_i -n\mu_1^2/2 \}> \lambda ) = \alpha_0
\end{displaymath}

We can make it equal because in this case $f_1(X)/f_0(X)$ has a continuous distribution. Rewrite the probability as

\begin{displaymath}P_0(\sum X_i > [\log(\lambda) +n\mu_1^2/2]/\mu_1)
=1-\Phi([\log(\lambda) +n\mu_1^2/2]/[n^{1/2}\mu_1])
\end{displaymath}

If $z_\alpha$ is notation for the usual upper $\alpha$ critical point of the normal distribution then we find

\begin{displaymath}z_{\alpha_0} = [\log(\lambda) +n\mu_1^2/2]/[n^{1/2}\mu_1]
\end{displaymath}

which you can solve to get a formula for $\lambda$ in terms of $z_{\alpha_0}$, n and $\mu_1$.
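Carrying out that algebra (a step I am filling in) gives

\begin{displaymath}\lambda = \exp\{ n^{1/2}\mu_1 z_{\alpha_0} - n\mu_1^2/2 \}
\end{displaymath}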

The rejection region looks complicated: reject if a complicated statistic is larger than $\lambda$ which has a complicated formula. But in calculating $\lambda$ we re-expressed the rejection region in terms of

\begin{displaymath}\frac{\sum X_i}{\sqrt{n}} > z_{\alpha_0}
\end{displaymath}

The key feature is that this rejection region is the same for any $\mu_1 >0$. [WARNING: in the algebra above I used $\mu_1 >0$.] This is why the Neyman Pearson lemma is a lemma!

Proof of the Neyman Pearson lemma: Given a test $\phi$ with level strictly less than $\alpha_0$ we can define the test

\begin{displaymath}\phi^*(x) = \frac{1-\alpha_0}{1-\alpha} \phi(x) + \frac{\alpha_0-\alpha}{1-\alpha}
\end{displaymath}

which has level $\alpha_0$ and a $\beta$ no larger than that of $\phi$. Hence we may assume without loss of generality that $\alpha=\alpha_0$ and minimize $\beta$ subject to $\alpha=\alpha_0$. However, the argument which follows doesn't actually need this.

Lagrange Multipliers

Suppose you want to minimize f(x) subject to g(x) = 0. Consider first the function

\begin{displaymath}h_\lambda(x) = f(x) + \lambda g(x)
\end{displaymath}

If $x_\lambda$ minimizes $h_\lambda$ then for any other x

\begin{displaymath}f(x_\lambda) \le f(x) +\lambda[ g(x) - g(x_\lambda)]
\end{displaymath}

Now suppose you can find a value of $\lambda$ such that the solution $x_\lambda$ has $g(x_\lambda) = 0$. Then for any x we have

\begin{displaymath}f(x_\lambda) \le f(x) +\lambda g(x)
\end{displaymath}

and for any x satisfying the constraint g(x) = 0 we have

\begin{displaymath}f(x_\lambda) \le f(x)
\end{displaymath}

This proves that for this special value of $\lambda$ the quantity $x_\lambda$ minimizes f(x) subject to g(x)=0.

Notice that to find $x_\lambda$ you set the usual partial derivatives equal to 0; then to find the special value of $\lambda$ you add in the condition $g(x_\lambda)=0$.
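A tiny worked example (my addition) may make the recipe concrete. To minimize $f(x)=x^2$ subject to $g(x)=x-1=0$ form

\begin{displaymath}h_\lambda(x) = x^2 + \lambda(x-1), \qquad x_\lambda = -\lambda/2 .
\end{displaymath}

Choosing $\lambda=-2$ gives $g(x_\lambda)=0$, so $x_\lambda=1$ minimizes $x^2$ subject to the constraint, exactly as the general argument guarantees.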

Proof of NP lemma

For each $\lambda> 0$ we have seen that $\phi_\lambda$ minimizes $\lambda\alpha+\beta$ where $\phi_\lambda=1(f_1(x)/f_0(x) \ge \lambda) $.

As $\lambda$ increases the level of $\phi_\lambda$ decreases from 1 when $\lambda=0$ to 0 when $\lambda = \infty$. There is thus a value $\lambda_0$ where for $\lambda < \lambda_0$ the level is at least $\alpha_0$ while for $\lambda > \lambda_0$ the level is less than $\alpha_0$. Temporarily let $\delta=P_0(f_1(X)/f_0(X) = \lambda_0)$. If $\delta = 0$ define $\phi=\phi_{\lambda_0}$. If $\delta > 0$ define

\begin{displaymath}\phi(x) =\left\{\begin{array}{ll}
1 & \frac{f_1(x)}{f_0(x)} > \lambda_0
\\
\gamma & \frac{f_1(x)}{f_0(x)} = \lambda_0
\\
0 & \frac{f_1(x)}{f_0(x)} < \lambda_0
\end{array}\right.
\end{displaymath}

where $P_0(f_1(X)/f_0(X) > \lambda_0)+\gamma\delta = \alpha_0$. You can check that $\gamma\in[0,1]$.

Now $\phi$ has level $\alpha_0$ and according to the theorem above minimizes $\lambda_0\alpha+\beta$. Suppose $\phi^* $ is some other test with level $\alpha^* \le \alpha_0$. Then

\begin{displaymath}\lambda_0\alpha_\phi+ \beta_\phi \le \lambda_0\alpha_{\phi^*} +
\beta_{\phi^*}
\end{displaymath}

We can rearrange this as

\begin{displaymath}\beta_{\phi^*} \ge \beta_\phi +(\alpha_\phi-\alpha_{\phi^*})\lambda_0
\end{displaymath}

Since

\begin{displaymath}\alpha_{\phi^*} \le \alpha_0 = \alpha_\phi
\end{displaymath}

the second term is non-negative and

\begin{displaymath}\beta_{\phi^*} \ge \beta_\phi
\end{displaymath}

which proves the Neyman Pearson Lemma.

Definition: In the general problem of testing $\Theta_0$ against $\Theta_1$ the level of a test function $\phi$ is

\begin{displaymath}\alpha = \sup_{\theta\in\Theta_0}E_\theta(\phi(X))
\end{displaymath}

The power function is

\begin{displaymath}\pi(\theta) = E_\theta(\phi(X))
\end{displaymath}

A test $\phi^* $ is a Uniformly Most Powerful level $\alpha_0$ test if

1.
$\phi^* $ has level $\alpha \le \alpha_0$.

2.
If $\phi$ has level $\alpha \le \alpha_0$ then for every $\theta\in \Theta_1$ we have

\begin{displaymath}E_\theta(\phi(X)) \le E_\theta(\phi^*(X))
\end{displaymath}

Application of the NP lemma: In the $N(\mu,1)$ model consider $\Theta_1=\{\mu>0\}$ and $\Theta_0=\{0\}$ or $\Theta_0=\{\mu \le 0\}$. The UMP level $\alpha_0$ test of $H_0:
\mu\in\Theta_0$ against $H_1:\mu\in\Theta_1$ is

\begin{displaymath}\phi(X_1,\ldots,X_n) = 1(n^{1/2}\bar{X} > z_{\alpha_0})
\end{displaymath}

Proof: For either choice of $\Theta_0$ this test has level $\alpha_0$ because for $\mu\le 0$ we have
\begin{align*}P_\mu(n^{1/2}\bar{X} > z_{\alpha_0}) & = P_\mu(n^{1/2}(\bar{X}-\mu) > z_{\alpha_0}-n^{1/2}\mu)
\\
& \le P(N(0,1) > z_{\alpha_0})
\\
& = \alpha_0
\end{align*}
(Notice the use of $\mu\le 0$. The central point is that the critical point is determined by the behaviour on the edge of the null hypothesis.)
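The same calculation, a step I am spelling out rather than one written in the original notes, gives the power function of this test for every $\mu$:

\begin{displaymath}\pi(\mu) = P_\mu(n^{1/2}\bar{X} > z_{\alpha_0})
= 1-\Phi(z_{\alpha_0}-n^{1/2}\mu) = \Phi(n^{1/2}\mu - z_{\alpha_0})
\end{displaymath}

which is increasing in $\mu$ and equals $\alpha_0$ at $\mu=0$.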

Now if $\phi$ is any other level $\alpha_0$ test then we have

\begin{displaymath}E_0(\phi(X_1,\ldots,X_n)) \le \alpha_0
\end{displaymath}

Fix a $\mu > 0$. According to the NP lemma

\begin{displaymath}E_\mu(\phi(X_1,\ldots,X_n)) \le E_\mu(\phi_\mu(X_1,\ldots,X_n))
\end{displaymath}

where $\phi_\mu$ rejects if $f_\mu(X_1,\ldots,X_n)/f_0(X_1,\ldots,X_n)
> \lambda$ for a suitable $\lambda$. But we just checked that this test had a rejection region of the form

\begin{displaymath}n^{1/2}\bar{X} > z_{\alpha_0}
\end{displaymath}

which is the rejection region of $\phi^* $. The NP lemma produces the same test for every $\mu > 0$ chosen as an alternative. So we have shown that $\phi_\mu=\phi^*$ for any $\mu > 0$; hence $E_\mu(\phi) \le E_\mu(\phi^*)$ for every $\mu > 0$, which is exactly the statement that $\phi^*$ is UMP.

This phenomenon is somewhat general. What happened was this. For any $\mu > \mu_0$ the likelihood ratio $f_\mu/f_{\mu_0}$ is an increasing function of $\sum X_i$. The rejection region of the NP test is thus always a region of the form $\sum X_i > k$. The value of the constant k is determined by the requirement that the test have level $\alpha_0$ and this depends on $\mu_0$, not on the alternative $\mu$.

Definition: The family $\{f_\theta;\theta\in \Theta\subset R\}$ has monotone likelihood ratio with respect to a statistic T(X) if for each $\theta_1>\theta_0$ the likelihood ratio $f_{\theta_1}(X) / f_{\theta_0}(X)$ is a monotone increasing function of T(X).

Theorem: For a monotone likelihood ratio family the Uniformly Most Powerful level $\alpha$ test of $\theta \le \theta_0$ (or of $\theta=\theta_0$) against the alternative $\theta>\theta_0$ is

\begin{displaymath}\phi(x) =\left\{\begin{array}{ll}
1 & T(x) > t_\alpha
\\
\gamma & T(x)=t_\alpha
\\
0 & T(x) < t_\alpha
\end{array}\right.
\end{displaymath}

where $P_{\theta_0}(T(X) > t_\alpha)+\gamma P_{\theta_0}(T(X) = t_\alpha) = \alpha$.
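Here is a concrete instance in Python (my addition; the Poisson family is my choice of illustration, and it does have monotone likelihood ratio in $T(X)=\sum X_i$). The sketch computes $t_\alpha$ and $\gamma$ for testing $\theta\le\theta_0$ against $\theta>\theta_0$ from an iid Poisson($\theta$) sample:

\begin{verbatim}
from math import exp, factorial

theta0, n, alpha = 2.0, 10, 0.05
mu = n * theta0                 # under theta0, T = sum of X_i is Poisson(n*theta0)

def pmf(t):
    # Poisson(mu) probability mass function at t
    return exp(-mu) * mu**t / factorial(t)

# Find the smallest t_alpha with P_0(T > t_alpha) <= alpha, then choose gamma
# so that P_0(T > t_alpha) + gamma * P_0(T = t_alpha) = alpha.
t, tail = 0, 1 - pmf(0)         # tail = P_0(T > t)
while tail > alpha:
    t += 1
    tail -= pmf(t)
gamma = (alpha - tail) / pmf(t)

print(t, gamma)   # reject if T > t; if T = t reject with probability gamma
\end{verbatim}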

A typical family where this will work is a one parameter exponential family. In almost any other problem the method doesn't work and there is no uniformly most powerful test. For instance, to test $\mu=\mu_0$ against the two sided alternative $\mu\neq\mu_0$ there is no UMP level $\alpha$ test. If there were, its power at $\mu > \mu_0$ would have to be as high as that of the one sided level $\alpha$ test and so its rejection region would have to be the same as that test, rejecting for large positive values of $\bar{X} -\mu_0$. But it also has to have power as good as the one sided test for the alternative $\mu < \mu_0$ and so would have to reject for large negative values of $\bar{X} -\mu_0$. This would make its level too large.

The favourite test is the usual 2 sided test which rejects for large values of $\vert\bar{X} -\mu_0\vert$ with the critical value chosen appropriately. This test maximizes the power subject to two constraints: first, that the level be $\alpha$ and second that the test have power which is minimized at $\mu=\mu_0$. This second condition is really that the power on the alternative be larger than it is on the null.

Definition: A test $\phi$ of $\Theta_0$ against $\Theta_1$ is unbiased level $\alpha$ if it has level $\alpha$ and, for every $\theta\in \Theta_1$ we have

\begin{displaymath}\pi(\theta) \ge \alpha \, .
\end{displaymath}

When testing a point null hypothesis like $\mu=\mu_0$ this requires that the power function be minimized at $\mu_0$ which will mean that if $\pi$ is differentiable then

\begin{displaymath}\pi^\prime(\mu_0) =0
\end{displaymath}

We now apply that condition to the $N(\mu,1)$ problem. If $\phi$ is any test function then

\begin{displaymath}\pi^\prime(\mu) = \frac{\partial}{\partial\mu} \int \phi(x) f(x,\mu) dx
\end{displaymath}

We can differentiate this under the integral and use

\begin{displaymath}\frac{\partial f(x,\mu)}{\partial\mu} = \sum(x_i-\mu) f(x,\mu)
\end{displaymath}

to get the condition

\begin{displaymath}\int \phi(x) \bar{x} f(x,\mu_0) dx = \mu_0 \alpha_0
\end{displaymath}

Consider the problem of minimizing $\beta(\mu)$ subject to the two constraints $E_{\mu_0}(\phi(X)) = \alpha_0$ and $E_{\mu_0}(\bar{X} \phi(X)) = \mu_0 \alpha_0$. Now fix two values $\lambda_1>0$ and $\lambda_2$ and minimize

\begin{displaymath}\lambda_1\alpha + \lambda_2 E_{\mu_0}[(\bar{X} - \mu_0)\phi(X)] + \beta
\end{displaymath}

The quantity in question is just

\begin{displaymath}\int [\phi(x) f_0(x)(\lambda_1+\lambda_2(\bar{x} - \mu_0)) +
(1-\phi(x))f_1(x)] dx
\end{displaymath}

As before this is minimized by

\begin{displaymath}\phi(x) =\left\{\begin{array}{ll}
1 & \frac{f_1(x)}{f_0(x)} > \lambda_1+\lambda_2(\bar{x} - \mu_0)
\\
0 & \frac{f_1(x)}{f_0(x)} < \lambda_1+\lambda_2(\bar{x} - \mu_0)
\end{array}\right.
\end{displaymath}

The likelihood ratio $f_1/f_0$ is simply

\begin{displaymath}\exp\{ n(\mu_1-\mu_0)\bar{X} + n(\mu_0^2-\mu_1^2)/2\}
\end{displaymath}

and this exceeds the linear function

\begin{displaymath}\lambda_1+\lambda_2(\bar{X} - \mu_0)
\end{displaymath}

for all $\bar{X}$ sufficiently large or small. That is, the quantity

\begin{displaymath}\lambda_1\alpha + \lambda_2 E_{\mu_0}[(\bar{X} - \mu_0)\phi(X)] + \beta
\end{displaymath}

is minimized by a rejection region of the form

\begin{displaymath}\{\bar{X} > K_U\} \cup \{ \bar{X} < K_L\}
\end{displaymath}

To satisfy the constraints we adjust $K_U$ and $K_L$ to get level $\alpha$ and $\pi^\prime(\mu_0) = 0$. The second condition shows that the rejection region is symmetric about $\mu_0$ and then we discover that the test rejects for

\begin{displaymath}\sqrt{n}\vert\bar{X} - \mu_0\vert > z_{\alpha/2}
\end{displaymath}

Now you have to mimic the Neyman Pearson lemma proof to check that if $\lambda_1$ and $\lambda_2$ are adjusted so that the unconstrained problem has the rejection region given then the resulting test minimizes $\beta$ subject to the two constraints.
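For reference, the power function of this two sided test can be written down directly (my addition, filling in a formula the notes use implicitly):

\begin{displaymath}\pi(\mu) = \Phi(-z_{\alpha/2} - \sqrt{n}(\mu-\mu_0))
+ 1 - \Phi(z_{\alpha/2} - \sqrt{n}(\mu-\mu_0))
\end{displaymath}

so that $\pi(\mu_0)=\alpha$ and $\pi^\prime(\mu_0) = \sqrt{n}[\varphi(z_{\alpha/2}) - \varphi(-z_{\alpha/2})] = 0$ by symmetry of the standard normal density $\varphi$.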

A test $\phi^* $ is a Uniformly Most Powerful Unbiased level $\alpha_0$ test if

1.
$\phi^* $ has level $\alpha \le \alpha_0$.

2.
$\phi^* $ is unbiased.

3.
If $\phi$ has level $\alpha \le \alpha_0$ then for every $\theta\in \Theta_1$ we have

\begin{displaymath}E_\theta(\phi(X)) \le E_\theta(\phi^*(X))
\end{displaymath}

Conclusion: The two sided z test which rejects if

\begin{displaymath}\vert Z\vert > z_{\alpha/2}
\end{displaymath}

where

\begin{displaymath}Z=n^{1/2}(\bar{X} -\mu_0)
\end{displaymath}

is the uniformly most powerful unbiased test of $\mu=\mu_0$ against the two sided alternative $\mu\neq\mu_0$.





Richard Lockhart
2000-03-15