Reading for Today's Lecture: Chapter 1 of Casella and Berger.
Goals of Today's Lecture:
Course outline:
Standard view of scientific inference: a set of theories, each of which makes a prediction about the outcome of an experiment:
Theory | Prediction
A      | 1
B      | 2
C      | 3
Conduct experiment, see outcome 2: we infer that Theory B is correct (or at least that A and C are wrong).
Add Randomness
Theory | Prediction
A      | Usually 1, sometimes 2, never 3
B      | Usually 2, sometimes 1, never 3
C      | Usually 3, sometimes 1, never 2
See outcome 2: infer Theory B probably correct, Theory A probably not correct, Theory C is wrong.
Probability Theory: construct the table, i.e., compute the likely outcomes of experiments.
Statistics: inverse process. Use table to draw inferences from outcome of experiment. How should we do it and how wrong are our inferences likely to be?
Probability Space (or Sample Space): ordered triple $(\Omega, \mathcal{F}, P)$: a set $\Omega$ of possible outcomes, a $\sigma$-field $\mathcal{F}$ of subsets of $\Omega$ (the events), and a probability measure $P$ defined on $\mathcal{F}$.
The axioms guarantee we can compute probabilities by the usual rules, including approximation (passing to limits), without fear of contradiction.
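For reference, the axioms in question are the standard ones, stated here as a reminder in the notation just introduced:

\[
% Standard Kolmogorov axioms for the triple (\Omega, \mathcal{F}, P) above.
P(A) \ge 0 \ \text{for every } A \in \mathcal{F}, \qquad P(\Omega) = 1,
\]
\[
P\Big(\bigcup_{i=1}^{\infty} A_i\Big) = \sum_{i=1}^{\infty} P(A_i)
\quad \text{whenever } A_1, A_2, \ldots \in \mathcal{F} \text{ are pairwise disjoint.}
\]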
Vector-valued random variable: a function $X: \Omega \to \mathbb{R}^p$ with the property that, writing $X(\omega) = (X_1(\omega), \ldots, X_p(\omega))$, every set of the form $\{\omega \in \Omega : X_1(\omega) \le x_1, \ldots, X_p(\omega) \le x_p\}$ is in $\mathcal{F}$.
In almost all of probability and statistics the dependence of a random variable on a point in the probability space is hidden! You almost always see $X$, not $X(\omega)$.
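A toy illustration of the definition (my example, not from the text): toss a fair coin twice and let $X$ count the heads.

\[
% Illustrative example (not from the lecture): two tosses of a fair coin.
\Omega = \{HH, HT, TH, TT\}, \qquad X(HH) = 2, \quad X(HT) = X(TH) = 1, \quad X(TT) = 0.
\]

Every set of the form $\{\omega : X(\omega) \le x\}$ is one of $\emptyset$, $\{TT\}$, $\{TT, HT, TH\}$, or $\Omega$, so the measurability condition holds for any $\sigma$-field containing these sets; and in use one simply writes $X$, never $X(\omega)$.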
Now for formal definitions:
Borel $\sigma$-field in $\mathbb{R}^p$: the smallest $\sigma$-field in $\mathbb{R}^p$ containing every open ball.
Every common set is a Borel set, that is, in the Borel $\sigma$-field.
An $\mathbb{R}^p$-valued random variable is a map $X: \Omega \to \mathbb{R}^p$ such that whenever $A$ is Borel, $X^{-1}(A) = \{\omega : X(\omega) \in A\} \in \mathcal{F}$.
Fact: this is equivalent to the condition above, that $\{\omega : X_1(\omega) \le x_1, \ldots, X_p(\omega) \le x_p\} \in \mathcal{F}$ for every $x \in \mathbb{R}^p$.
Jargon and notation: we write $P(X \in A)$ for $P(\{\omega \in \Omega : X(\omega) \in A\})$, and define the distribution of $X$ to be the map $A \mapsto P(X \in A)$, defined on Borel sets $A$.
Cumulative Distribution Function (or CDF) of $X$: the function $F_X$ on $\mathbb{R}^p$ defined by $F_X(x) = P(X \le x) = P(X_1 \le x_1, \ldots, X_p \le x_p)$.
Properties of $F_X$ (or just $F$ when there's only one CDF under consideration) for $p=1$: $F$ is non-decreasing; $F$ is right continuous; $\lim_{x \to -\infty} F(x) = 0$; $\lim_{x \to \infty} F(x) = 1$.
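Two standard consequences of these properties (included here for later use, not tied to any particular example) show how $F$ determines probabilities:

\[
% Standard consequences of the CDF properties for p = 1.
P(a < X \le b) = F(b) - F(a) \ \text{for } a < b,
\qquad
P(X = x) = F(x) - F(x^-), \ \text{where } F(x^-) = \lim_{t \uparrow x} F(t).
\]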
The distribution of a random variable $X$ is discrete (we also call the random variable discrete) if there is a countable set $x_1, x_2, \ldots$ such that $P(X \in \{x_1, x_2, \ldots\}) = 1$.
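The fair-coin variable sketched earlier is a simple instance (again my illustration, not the lecture's):

\[
% Continues the hypothetical fair-coin example: X = number of heads in two tosses.
P(X \in \{0, 1, 2\}) = 1, \qquad P(X=0) = \tfrac{1}{4}, \quad P(X=1) = \tfrac{1}{2}, \quad P(X=2) = \tfrac{1}{4},
\]

and the CDF $F_X$ is a step function that jumps by $P(X = x)$ at $x = 0, 1, 2$.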
The distribution of a random variable $X$ is absolutely continuous if there is a function $f$ (the density of $X$) such that $P(X \in A) = \int_A f(x)\,dx$ for every Borel set $A$.
Example: $X$ is exponential (rate 1, say): $f_X(x) = e^{-x}$ for $x > 0$ and $f_X(x) = 0$ for $x \le 0$, so that $F_X(x) = 1 - e^{-x}$ for $x > 0$ and $F_X(x) = 0$ otherwise.
General Problem: Start with assumptions about the density or CDF of a random vector $X = (X_1, \ldots, X_p)$. Define $Y = g(X)$ to be some function of $X$ (usually some statistic of interest). How can we compute the distribution or CDF or density of $Y$?
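One way to sanity-check any answer is simulation. The sketch below is my illustration (the distribution of $X$, the transformation $g$, and the claimed density are assumptions chosen for the example, not part of the lecture): draw many copies of $X$, apply $g$, and compare a histogram of $Y$ with the density derived by the methods that follow.

    # Hedged sketch, not from the lecture: Monte Carlo check of a derived density.
    # Assumptions for illustration: X ~ exponential(1), Y = g(X) = sqrt(X),
    # and the CDF method gives f_Y(y) = 2*y*exp(-y**2) for y > 0.
    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.exponential(scale=1.0, size=100_000)  # draws of X
    y = np.sqrt(x)                                # draws of Y = g(X)

    # Compare a normalized histogram of the simulated Y with the claimed density.
    edges = np.linspace(0.01, 3.0, 51)
    hist, edges = np.histogram(y, bins=edges, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    claimed = 2 * centers * np.exp(-centers**2)   # f_Y from the CDF method
    print(np.max(np.abs(hist - claimed)))         # should be close to zero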
Univariate Techniques
Method 1: compute the CDF $F_Y$ by integration and differentiate to find $f_Y$.
Example: take a random variable $X$ with a given density, let $Y$ be a transformation of $X$, and write $F_Y(y) = P(Y \le y)$ as an integral of that density.
Example: do the same for a variable $Z$ with an explicitly given density, i.e., write $F_Z$ as an integral.
We will find indicator notation useful: $1(x \in A)$ (sometimes written $1_A(x)$) equals $1$ if $x \in A$ and $0$ otherwise; for instance, the exponential density above is $f_X(x) = e^{-x}\,1(x > 0)$.
Notice: I never evaluated $F_Y$ before differentiating it. In fact $F_Y$ and $F_Z$ are integrals I can't do in closed form, but I can differentiate them anyway.
Remember the fundamental theorem of calculus: if $F(x) = \int_a^x f(t)\,dt$ with $f$ continuous at $x$, then $F'(x) = f(x)$.
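Putting Method 1 and the fundamental theorem together, here is one worked instance. The choice of $Z$ standard normal and $Y = Z^2$ is mine, for illustration; it matches the remark above that $F_Y$ and $F_Z$ cannot be evaluated in closed form:

\[
\begin{aligned}
% Hedged illustration: Z assumed standard normal, Y = Z^2.
F_Y(y) &= P(Z^2 \le y) = P(-\sqrt{y} \le Z \le \sqrt{y})
        = \int_{-\sqrt{y}}^{\sqrt{y}} \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\,dz
        = F_Z(\sqrt{y}) - F_Z(-\sqrt{y}), \qquad y > 0, \\[4pt]
% Neither F_Y nor F_Z has a closed form, but the chain rule and the
% fundamental theorem of calculus still give the density:
f_Y(y) &= F_Y'(y)
        = f_Z(\sqrt{y})\,\frac{1}{2\sqrt{y}} + f_Z\big(-\sqrt{y}\big)\,\frac{1}{2\sqrt{y}}
        = \frac{1}{\sqrt{2\pi y}}\, e^{-y/2}\, 1(y > 0).
\end{aligned}
\]

The result is the $\chi^2_1$ density; the pattern (set up $F_Y$ as an integral, then differentiate) is exactly Method 1.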