Reading for Today's Lecture:
Goals of Today's Lecture:
Today's notes
Example: Decide between 4 modes of transportation to work: C (drive my car), B (ride my bike), T (take the bus / public transit), or H (stay home).
Ingredients of Decision Problem: No data case. A decision space $D = \{C, B, T, H\}$ of possible actions, a set $\Theta = \{R, S\}$ of possible states of nature (rain or sun), and a loss function $L(d, \theta)$, the loss incurred by taking action $d$ when the state is $\theta$.
In the example we might use the following table for L:
$\theta \backslash d$ | C | B | T | H
R | 3 | 8 | 5 | 25
S | 5 | 0 | 2 | 25
Notice that if it rains I will be glad I drove ($L(C,R)=3$ is the smallest loss in the rain row), while if it is sunny I will be glad I rode my bike ($L(B,S)=0$). Rain or shine, staying at home is expensive.
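For concreteness, here is a minimal Python sketch of this loss table (the name LOSS and the layout are my own, not from the notes), finding the best action in each state:

```python
# Loss L(d, theta): outer key is the state of nature, inner key the action.
LOSS = {
    "R": {"C": 3, "B": 8, "T": 5, "H": 25},  # rain
    "S": {"C": 5, "B": 0, "T": 2, "H": 25},  # sun
}

# Best action in each state: the argmin of the loss over actions.
for state, row in LOSS.items():
    best = min(row, key=row.get)
    print(state, "->", best, row[best])  # R -> C 3, S -> B 0
```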
In general we study this problem by comparing various functions of $\theta$, namely the losses $L(d, \theta)$ for the different possible decisions $d$. In this problem a function of $\theta$ has only two values, one for rain and one for sun, and we can plot any such function as a point in the plane. We do so to indicate the geometry of the problem before stating the general theory.
$\theta \backslash d$ | C | B | T | H
R | 3 | 8 | 5 | 25
S | 5 | 0 | 2 | 25
Maximum | 5 | 8 | 5 | 25
Smallest maximum: 5, achieved by C and T. Minimax action: take my car or take public transit, minimizing the worst case loss.
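Explicitly, plotting the loss in sun on the horizontal axis and the loss in rain on the vertical axis (the convention implied by the coordinates used below), the four decisions become the points
$$C = (5, 3), \qquad B = (0, 8), \qquad T = (2, 5), \qquad H = (25, 25).$$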
Now imagine: toss a coin with probability $\lambda$ of getting Heads; take my car if Heads, otherwise take transit. The long run average daily loss would be $3\lambda + 5(1-\lambda)$ when it rains and $5\lambda + 2(1-\lambda)$ when it is sunny. Call this procedure $d_\lambda$; add it to the graph for each value of $\lambda$. Varying $\lambda$ from 0 to 1 gives a straight line running from $(2,5)$ to $(5,3)$. The two losses are equal when $\lambda = 3/5$. For smaller $\lambda$ the worst case loss occurs when it rains; for larger $\lambda$ the worst case loss occurs when it is sunny.
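The value $\lambda = 3/5$ is a one-line computation from the two loss expressions above:
$$3\lambda + 5(1-\lambda) = 5\lambda + 2(1-\lambda) \;\Longleftrightarrow\; 5 - 2\lambda = 2 + 3\lambda \;\Longleftrightarrow\; \lambda = \tfrac{3}{5},$$
and at $\lambda = 3/5$ both losses equal $19/5 = 3.8$, strictly smaller than the worst case loss of 5 for either pure minimax action.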
Added to the graph: the loss functions of the $d_\lambda$ (the straight line segment) and the set of $(x, y)$ pairs for which $\max(x, y) = 19/5$ -- the worst case loss of $d_\lambda$ when $\lambda = 3/5$.
In general we might consider using a 4-sided coin where we take action B with probability $\lambda_B$, C with probability $\lambda_C$, and so on. The loss function of such a procedure is a convex combination of the losses of the four basic decisions,
$$L(d_\lambda, \theta) = \lambda_B L(B, \theta) + \lambda_C L(C, \theta) + \lambda_T L(T, \theta) + \lambda_H L(H, \theta),$$
making the set of losses achievable with the aid of randomization the convex hull of the four points $C$, $B$, $T$ and $H$ plotted above, as in the following sketch.
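Since the original figure is not reproduced here, this is a small Python sketch of it (assuming matplotlib and scipy are available; all names are mine):

```python
# Sketch: plot the four pure losses and the convex hull of risks
# achievable by randomizing among them.
import matplotlib.pyplot as plt
from scipy.spatial import ConvexHull

# (loss in sun, loss in rain) for each pure decision
points = {"C": (5, 3), "B": (0, 8), "T": (2, 5), "H": (25, 25)}
xy = list(points.values())

hull = ConvexHull(xy)
for i, j in hull.simplices:  # edges of the achievable risk set
    plt.plot([xy[i][0], xy[j][0]], [xy[i][1], xy[j][1]], "k-")
for name, (x, y) in points.items():
    plt.plot(x, y, "ko")
    plt.annotate(name, (x, y))
plt.xlabel("loss when sunny")
plt.ylabel("loss when rainy")
plt.show()
```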
SO: replace the decision space $D$ with the set of all randomized decision procedures; the achievable losses then fill out this convex hull.
The graph shows that many points in the picture correspond to bad decision procedures. Rain or shine, taking my car to work has lower loss than staying home, so staying home is inadmissible.
Definition: A decision $d$ is inadmissible if there is a decision $d'$ such that $L(d', \theta) \le L(d, \theta)$ for every $\theta$, with strict inequality for at least one $\theta$. A decision which is not inadmissible is admissible.
Admissible decisions have risks on the lower left boundary of the graph; i.e., the line segments connecting B to T and T to C make up the admissible decisions.
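A tiny check of this, reusing the LOSS table from the sketch above (again my own illustration):

```python
def dominates(d1, d2, loss):
    """True if d1 is at least as good as d2 in every state and better in some."""
    no_worse = all(loss[s][d1] <= loss[s][d2] for s in loss)
    better = any(loss[s][d1] < loss[s][d2] for s in loss)
    return no_worse and better

for d in "CBTH":
    if any(dominates(other, d, LOSS) for other in "CBTH" if other != d):
        print(d, "is inadmissible")  # prints only H (dominated by C and T)
```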
There is a connection between Bayes decisions and admissible decisions. A prior distribution in our example problem is specified by two probabilities, $\pi_R$ and $\pi_S$, which add up to 1. If $L = (L_R, L_S)$ is the loss function for some decision then the Bayes risk is
$$r_\pi = \pi_R L_R + \pi_S L_S.$$
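As a worked illustration (my own arithmetic from the loss table), write $\pi_R = \pi$ and $\pi_S = 1 - \pi$; the Bayes risks of the four pure decisions are
$$r_\pi(C) = 3\pi + 5(1-\pi) = 5 - 2\pi, \quad r_\pi(B) = 8\pi, \quad r_\pi(T) = 2 + 3\pi, \quad r_\pi(H) = 25.$$
Comparing these, B is the Bayes decision when $\pi \le 2/5$, T when $2/5 \le \pi \le 3/5$, and C when $\pi \ge 3/5$; H is never Bayes. The Bayes decisions are exactly the admissible ones on the lower left boundary.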
Statistical problems have another ingredient, the data. We observe $X$, a random variable taking values in, say, $\mathcal{X}$. We may make our decision $d$ depend on $X$: a decision rule is a function $\delta$ from $\mathcal{X}$ to $D$. We will want $L(\delta(X), \theta)$ to be small for all $\theta$. Since $X$ is random we quantify this by averaging over $X$ and compare procedures $\delta$ in terms of the risk function
$$R_\delta(\theta) = E_\theta\big[L(\delta(X), \theta)\big].$$
To compare two procedures we must compare two functions of $\theta$ and pick ``the smaller one''. But typically the two functions will cross each other and there won't be a unique ``smaller one''.
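A standard instance of this (my own choice of example, not from the notes): if $X \sim N(\theta, 1)$, compare $\delta_1(X) = X$ with the constant rule $\delta_2(X) = 0$ under squared error loss. Then
$$R_{\delta_1}(\theta) = 1 \quad \text{for all } \theta, \qquad R_{\delta_2}(\theta) = \theta^2,$$
so $\delta_2$ is better when $|\theta| < 1$ and $\delta_1$ is better when $|\theta| > 1$: neither risk function is uniformly smaller.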
Example: In estimation theory, to estimate a real parameter $\theta$ we used $D = \Theta$, the decision rule $\delta(X) = \hat\theta(X)$ given by an estimator, and the squared error loss $L(d, \theta) = (d - \theta)^2$.
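Under that squared error loss (assuming this is the loss intended above), the risk function is the familiar mean squared error, with its bias-variance decomposition:
$$R_{\hat\theta}(\theta) = E_\theta\big[(\hat\theta(X) - \theta)^2\big] = \mathrm{Var}_\theta\big(\hat\theta(X)\big) + \big(E_\theta[\hat\theta(X)] - \theta\big)^2.$$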