2. What is the difference between SD and SE, and how do these quantities relate to Bayesian prior and posterior distributions?
I will explain this issue by reproducing here a message I received via email,
together with my answers.
> I want to clarify my understanding of setting up a prior for
> inference for a single normal mean. When we specify
> the prior, we use the notation N(mean, sd or variance
> or SE), but I'm never sure if I should be stating the standard
> deviation or the variance or the standard error.
In frequentist analyses, SDs are used for population description purposes,
i.e., to describe how one person varies from another on the measure of
interest and within the population of interest. SEs are used to estimate
how accurately a model parameter has been estimated. So, you might have SD
= 10 to describe how blood pressure varies from one person to the next in a
population, and then SE = 1, for example, once a sample has been collected,
representing how accurately we now know the mean value.
Note that the SD is a population characteristic, and stays constant
regardless of sample size, while the SE depends on the sample size. In
general, the larger the sample size the smaller the SE.
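
As a rough illustration of this point, here is a minimal Python sketch, assuming the usual formula SE = SD/sqrt(n) for the mean of a sample. The function name and the sample sizes are hypothetical, chosen so that n = 100 reproduces the SD = 10, SE = 1 example above.

```python
import math


def standard_error(sd: float, n: int) -> float:
    """Frequentist standard error of a sample mean: SD / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    return sd / math.sqrt(n)


population_sd = 10.0  # person-to-person variability; does not depend on n

# Hypothetical sample sizes: the SD stays at 10 while the SE shrinks.
for n in (25, 100, 400):
    print(f"n = {n:3d}  SD = {population_sd}  SE = {standard_error(population_sd, n)}")
```

Quadrupling the sample size halves the SE, while the SD describing the population is untouched.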
When I used to teach 607, I handed out an article on this topic each year,
and if you want to read more, you can find it on page 106 (or 109 of the pdf
file) of my 607 course notes, which are here:
607 course notes
All of the above is for frequentist analyses. Bayesian analysis is quite
different, since the term SE is never used, only SD, which is used for all
variability related terms, and always represents the variability in whatever
distribution is under discussion. So, if one is discussing a distribution
representing person to person variability in a population, then the SD
represents that variability. If one is discussing a prior distribution for
a mean parameter, then the SD of that prior represents how accurately the
researcher knows that mean value, before analyzing the data. If one is
discussing the SD from the posterior distribution, then the SD represents
how accurately one knows the mean value after analyzing the data.
So, to draw an analogy between frequentist and Bayesian analysis, the SD
from the freq viewpoint is the same as the Bayesian SD when discussing
person to person variability in a population. The SE from a freq viewpoint
is similar to the Bayesian SD for the posterior distribution (but will be
numerically similar only if little prior information is used). There is of
course no freq equivalent to the Bayesian prior SD, as freqs ignore prior
information.
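
To make the "numerically similar only if little prior information is used" remark concrete, here is a small sketch in Python. It assumes the simplest textbook setting, which is not spelled out in the email: normally distributed data with a known population SD and a normal prior on the mean. The function name and all of the numbers are hypothetical.

```python
import math


def posterior_for_mean(prior_mean, prior_sd, sample_mean, data_sd, n):
    """Normal prior + normal data with known SD (an assumed, simplified model).

    Returns (posterior_mean, posterior_sd) for the mean parameter.
    """
    prior_precision = 1.0 / prior_sd**2
    data_precision = n / data_sd**2  # same as 1 / SE**2
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean
                 + data_precision * sample_mean) / post_precision
    return post_mean, math.sqrt(1.0 / post_precision)


# Hypothetical data: population SD = 10 and n = 100, so the freq SE is 1.
data_sd, n, sample_mean = 10.0, 100, 120.0
freq_se = data_sd / math.sqrt(n)

# Very vague prior (SD = 1000) versus an informative prior (SD = 1).
_, sd_vague = posterior_for_mean(100.0, 1000.0, sample_mean, data_sd, n)
_, sd_informative = posterior_for_mean(100.0, 1.0, sample_mean, data_sd, n)

print("freq SE:                        ", freq_se)
print("posterior SD, vague prior:      ", sd_vague)
print("posterior SD, informative prior:", sd_informative)
```

With the vague prior the posterior SD is essentially equal to the SE, while the informative prior gives a noticeably smaller posterior SD, since it contributes information beyond the data.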
> Also, is this normal distribution we specify the distribution of
> where we think
> the mean value lies, or where we think an individual's value is likely to be?
> If it is a distribution for the mean, we would use the SE, but if it is a distribution
> for an individual's possible value, we would use either sd or variance, correct?
Yes, I think you have the right idea, but see the explanations above, as
Bayesians would not call it the SE. If setting up a prior distribution for
the mean, the SD you need is the one that represents your uncertainty about
the mean value, and if discussing person to person variability, you would use
the SD that represents population variability.
> Also, I'm confused about generally the same issue for the posterior
> distribution... is
> the posterior distribution a probability distribution for where the mean lies, or the
> probability distribution of where an individual's value may lie... and therefore what
> is the second parameter specified in the posterior notation N(71.69, 2.68)
> for example?
Your first guess is correct: the posterior distribution is a probability
distribution for where the mean lies, after analyzing your data. So, in the
above example, if 2.68 is the posterior variance, then sqrt(2.68) = 1.64 is
the posterior SD, which represents how accurately the mean is known after
data analysis.
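
The arithmetic can be checked with a few lines of Python. This sketch assumes, as above, that the second parameter in N(71.69, 2.68) is a variance. The 95% interval is an extra illustration of what "how accurately the mean is known" can mean in practice; it is not part of the original question.

```python
import math
from statistics import NormalDist

posterior_mean = 71.69
posterior_variance = 2.68  # assumption: second parameter is a variance
posterior_sd = math.sqrt(posterior_variance)

# Normal posterior for the mean; central 95% interval from its quantiles.
posterior = NormalDist(mu=posterior_mean, sigma=posterior_sd)
lower = posterior.inv_cdf(0.025)
upper = posterior.inv_cdf(0.975)

print("posterior SD:", round(posterior_sd, 2))
print("95% interval for the mean:", round(lower, 1), "to", round(upper, 1))
```

The posterior SD comes out at about 1.64, and the interval runs from roughly 68.5 to 74.9. Note that this is an interval for the mean, not for an individual's value.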
> My thoughts are that the prior is a distribution for the possible
> values
> for an individual, and therefore the second parameter should be the sd
> or variance,
No, this is not correct, as per the above discussion. The prior is a
distribution representing your knowledge about the mean parameter value.
> and the posterior distribution
> is the probability distribution for the mean, and therefore the second
> parameter is the SE,
> but I would really like to hear your clarification on this.
Yes, that is correct, but, again, Bayesians would not call it the SE, and it
would not be the same numerical value as the freq SE unless little prior info
is used.
One last point: Bayesians never need to use SE = SD/sqrt(n), since the SD
from the posterior distribution already incorporates information from the
data, so the sample size is "built-in" already.
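
One way to see how the sample size is "built-in" is the standard formula for the simplest case: normal data with a known population SD and a normal prior on the mean. The symbols are introduced here for illustration only and do not appear in the email: \sigma is the population SD, \tau_0 the prior SD, and \tau_n the posterior SD after n observations.

```latex
\[
  \frac{1}{\tau_n^2} \;=\; \frac{1}{\tau_0^2} + \frac{n}{\sigma^2},
  \qquad
  \tau_n \;=\; \left( \frac{1}{\tau_0^2} + \frac{n}{\sigma^2} \right)^{-1/2}
  \;\longrightarrow\; \frac{\sigma}{\sqrt{n}}
  \quad \text{as } \tau_0 \to \infty .
\]
```

The sample size n enters the posterior SD directly, and in the limit of a very vague prior the posterior SD reduces to the familiar SD/sqrt(n).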