STAT 330 Lecture 32
Reading for Today's Lecture: 12.4, 12.5.
Goals of Today's Lecture:
Today's notes
Correlation Analysis
Our model will be that the pairs $(X_1,Y_1),\ldots,(X_n,Y_n)$
are sampled from a bivariate normal population of pairs.
Properties of the bivariate normal density:
Facts about $\rho$:
Sample Version of this theory:
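Definition: the sample correlation coefficient is
$$ r = \frac{S_{xy}}{\sqrt{S_{xx}\,S_{yy}}} $$
(and many other formulae are possible) where
$$ S_{xy} = \sum_{i=1}^{n} (x_i-\bar{x})(y_i-\bar{y}) $$
and
$$ S_{xx} = \sum_{i=1}^{n} (x_i-\bar{x})^2, \qquad S_{yy} = \sum_{i=1}^{n} (y_i-\bar{y})^2 . $$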
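As a rough illustration, r can be computed directly from the centered sums of squares and cross-products, r = S_xy / sqrt(S_xx S_yy). The function name `sample_correlation` and the small data set are made up for this sketch; they do not come from the lecture.

```python
import math


def sample_correlation(xs, ys):
    """Sample correlation r = S_xy / sqrt(S_xx * S_yy)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Centered sum of cross-products and centered sums of squares.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when either sample is constant")
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical data, roughly linear with positive slope.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
r = sample_correlation(xs, ys)
print("r =", round(r, 4))
```

Data lying exactly on a line with positive slope give r = 1, and with negative slope give r = -1, up to floating-point rounding.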
Facts about r:
Here are some plots of data sets in which the sample correlation varies from picture to picture.
Here is a scatterplot in which the correlation is 0 even though X and Y are quite strongly related; the point is that the relation is not a straight line. The best line through the picture is flat.
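A small numerical sketch of the same phenomenon, using made-up data rather than the data in the scatterplot: take Y = X^2 on x-values placed symmetrically about 0. Y is completely determined by X, yet the sample correlation and the least-squares slope are both 0.

```python
import math

# Hypothetical data: a perfect, but non-linear, relation y = x^2.
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [x ** 2 for x in xs]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_yy = sum((y - y_bar) ** 2 for y in ys)

r = s_xy / math.sqrt(s_xx * s_yy)   # sample correlation
slope = s_xy / s_xx                 # slope of the least-squares line

print("r =", r)
print("least-squares slope =", slope)
```

The positive and negative cross-products cancel because the x-values are symmetric about their mean, which is why the fitted line is flat.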
Here is a simulated data set which illustrates the following points:
You are supposed to see that the histogram is reasonably normal. I have superimposed a normal curve whose mean is given by the regression line and whose SD is given by the root mean square error. The curve follows the histogram pretty well.
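The same kind of check can be sketched numerically. The simulation settings below (sample size, intercept, slope, error SD, seed) are assumptions chosen for illustration, not the ones behind the lecture's data set, and the root mean square error is computed here with divisor n - 2, which is one common convention. Instead of drawing a histogram, the sketch compares the residuals from the fitted line with what a normal curve with SD equal to the RMSE would predict: roughly 68% within one SD and 95% within two.

```python
import math
import random

random.seed(330)  # arbitrary seed so the run is reproducible

# Hypothetical simulation settings (not taken from the lecture).
n = 2000
beta0, beta1, sigma = 1.0, 2.0, 0.5

xs = [random.gauss(0.0, 1.0) for _ in range(n)]
ys = [beta0 + beta1 * x + random.gauss(0.0, sigma) for x in xs]

# Least-squares regression line.
x_bar = sum(xs) / n
y_bar = sum(ys) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Residuals about the fitted line and their root mean square error.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
rmse = math.sqrt(sum(e * e for e in residuals) / (n - 2))

# Fractions of residuals within one and two RMSEs of the line.
within_one = sum(abs(e) <= rmse for e in residuals) / n
within_two = sum(abs(e) <= 2 * rmse for e in residuals) / n

print("fitted slope:", round(slope, 3), "RMSE:", round(rmse, 3))
print("within 1 RMSE:", round(within_one, 3))
print("within 2 RMSE:", round(within_two, 3))
```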