
I will explain this issue by copying here a message received via email and my answer.

> I want to clarify my understanding of setting up a prior for an Inference for a single normal mean. When we specify the prior, we use the notation N(mean, sd or variance or SE), but I'm never sure if I should be stating the standard deviation or the variance or the standard error.

In frequentist analyses, SDs are used for population description purposes, i.e., to describe how one person varies from another on the measure of interest and within the population of interest. SEs are used to estimate how accurately a model parameter has been estimated. So, you might have SD = 10 to describe how blood pressure varies from one person to the next in a population, and then SE = 1, for example, once a sample has been collected, representing how accurately we now know the mean value.

Note that the SD is a population characteristic, and stays constant regardless of sample size, while the SE depends on the sample size. In general, the larger the sample size the smaller the SE.
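As a small illustration of this point, the following Python sketch computes the SE of a sample mean as SD/sqrt(n) for a fixed population SD of 10. The sample sizes are assumptions chosen for illustration, since the blood pressure example above does not state n; with n = 100 the SE happens to match the SE = 1 quoted earlier.

```python
import math


def standard_error(sd, n):
    """Standard error of a sample mean: SD / sqrt(n)."""
    return sd / math.sqrt(n)


# Person-to-person variability, taken from the blood pressure example.
population_sd = 10.0

# Hypothetical sample sizes, chosen only for illustration.
for n in (25, 100, 400):
    se = standard_error(population_sd, n)
    print(f"n = {n:3d}  SD = {population_sd:.1f}  SE = {se:.2f}")
```

The SD column stays at 10 throughout, while the SE halves each time the sample size is quadrupled.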

When I used to teach 607, I handed out an article on this topic each year, and if you want to read more, you can find it on page 106 (or 109 of the pdf file) of my 607 course notes, which are here:

607 course notes

All of the above is for frequentist analyses. Bayesian analysis is quite different, since the term SE is never used, only SD, which is used for all variability-related terms, and always represents the variability in whatever distribution is under discussion. So, if one is discussing a distribution representing person to person variability in a population, then the SD represents that variability. If one is discussing a prior distribution for a mean parameter, then the SD of that prior represents how accurately the researcher knows that mean value, before analyzing the data. If one is discussing the SD from the posterior distribution, then the SD represents how accurately one knows the mean value after analyzing the data.

So, to draw an analogy between frequentist and Bayesian analysis, the SD from the freq viewpoint is the same as the Bayesian SD when discussing person to person variability in a population. The SE from a freq viewpoint is similar to the Bayesian SD for the posterior distribution (but will be numerically similar only if little prior information is used). There is of course no freq equivalent to the Bayesian prior SD, as freqs ignore prior information.

> Also, is this normal distribution we specify the distribution of where we think the mean value lies, or where we think an individual's value is likely to be? If it is a distribution for the mean, we would use the SE, but if it is a distribution for an individual's possible value, we would use either sd or variance, correct?

Yes, I think you have the right idea, but see the explanations above, as Bayesians would not call it the SE. If setting up a prior distribution for the mean, the SD you need is the one that represents your uncertainty about the mean value, and if discussing person to person variability, you would use the SD that represents population variability.

> Also, I'm confused about generally the same issue for the posterior distribution... is the posterior distribution a probability distribution for where the mean lies, or the probability distribution of where an individual's value may lie... and therefore what is the second parameter specified in the posterior notation N(71.69, 2.68) for example?

Your first guess is correct, the posterior distribution is a probability distribution for where the mean lies, after analyzing your data. So, in the above example, if 2.68 is the posterior variance, then sqrt(2.68) = 1.64 is the posterior SD, which represents how accurately the mean is known after data analysis.
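To make the arithmetic concrete, here is a short Python sketch for the N(71.69, 2.68) example. It assumes, as in the answer above, that 2.68 is the posterior variance. The 95% interval is an addition for illustration only and is not part of the original question.

```python
import math
from statistics import NormalDist

posterior_mean = 71.69
posterior_variance = 2.68  # assumption: the second parameter is a variance

# The posterior SD describes how accurately the mean is known after the data.
posterior_sd = math.sqrt(posterior_variance)

# NormalDist is parameterized by the SD (sigma), not the variance.
posterior = NormalDist(mu=posterior_mean, sigma=posterior_sd)

# Central 95% posterior interval for the mean (illustrative addition).
lower = posterior.inv_cdf(0.025)
upper = posterior.inv_cdf(0.975)

print(f"posterior SD = {posterior_sd:.2f}")
print(f"95% interval for the mean: ({lower:.2f}, {upper:.2f})")
```

Note that software differs in whether the second parameter of a normal distribution is the SD, the variance, or the precision, so it is worth checking before plugging in a value like 2.68.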

> My thoughts are that the prior is a distribution for the possible values for an individual, and therefore the second parameter should be the sd or variance,

No, this is not correct, as per above discussion. The prior is a distribution representing your knowledge about the mean parameter value.

> and the posterior distribution is the probability distribution for the mean, and therefore the second parameter is the SE, but I would really like to hear your clarification on this.

Yes, that is correct, but, again, Bayesians would not call it the SE, and it would not be the same numerical value as the freq SE unless little prior info is used.

One last point: Bayesians never need to use SE = SD/sqrt(n), since the SD from the posterior distribution already incorporates information from the data, so the sample size is "built-in" already.
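The way the sample size gets "built in" can be sketched with the standard conjugate update for a normal mean. This is a minimal sketch, assuming the population SD is known and the prior for the mean is normal; the function name and all numerical values (sample mean, prior means, prior SDs, n) are hypothetical and chosen only to echo the blood pressure example.

```python
import math


def posterior_for_normal_mean(prior_mean, prior_sd, sample_mean, population_sd, n):
    """Normal prior + normal data with known population SD -> normal posterior.

    Precisions (1 / variance) add, so n enters through the data precision.
    """
    prior_precision = 1.0 / prior_sd**2
    data_precision = n / population_sd**2
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean + data_precision * sample_mean) / post_precision
    post_sd = math.sqrt(1.0 / post_precision)
    return post_mean, post_sd


# Hypothetical data: known population SD = 10, n = 100, sample mean = 120.
population_sd, n, sample_mean = 10.0, 100, 120.0
freq_se = population_sd / math.sqrt(n)  # frequentist SE, for comparison only

# Little prior information: a very wide prior on the mean.
vague_mean, vague_sd = posterior_for_normal_mean(100.0, 100.0, sample_mean, population_sd, n)

# Substantial prior information: a prior as precise as the data.
inf_mean, inf_sd = posterior_for_normal_mean(115.0, 1.0, sample_mean, population_sd, n)

print(f"frequentist SE                   = {freq_se:.3f}")
print(f"posterior SD (vague prior)       = {vague_sd:.3f}, mean = {vague_mean:.2f}")
print(f"posterior SD (informative prior) = {inf_sd:.3f}, mean = {inf_mean:.2f}")
```

With the wide prior, the posterior SD is nearly identical to the frequentist SE; with the informative prior, it is noticeably smaller, which is the sense in which the two agree numerically only when little prior information is used.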

I have not taken any formal surveys. Nevertheless, the disadvantages of p-values have been discussed almost since they were proposed. There are literally hundreds of articles in the statistical and medical literature explaining the disadvantages of p-values, and expressing preferences for confidence intervals and/or Bayesian methods. As discussed in class (see page 13 of notes from January 6th lecture), major journals, including the leading journal in our field, Epidemiology, virtually ban the use of p-values or even discussions based on "significance" within their pages.

Check out these web pages for many references concerning p-values:

www.fharrell.com/post/pval-litany/

www.indiana.edu/~stigtsts/

warnercnr.colostate.edu/~anderson/thompson1.html

www.npwrc.usgs.gov/resource/methods/pressugg/intro.htm

www.cnr.colostate.edu/~anderson/null.html

Here is a small sampling of other articles (and check their reference lists for many, many others):

Malakoff D. Bayes offers a "new" way to make sense of numbers. Science. 1999 Nov 19;286(5444):1460-4. (Review from Science about the rise of Bayesian analysis in Science.)

Dunson DB. Practical advantages of Bayesian analysis of epidemiologic data. American Journal of Epidemiology 2001 Jun 15;153(12):1222-6.

Goodman SN. Of P-values and Bayes: A modest proposal. Epidemiology. 2001 May;12(3):295-297.

Evans J, Mills P, Dawson J. The end of the p-value? British Heart Journal 1988; 60:177-180.

Altman DG. Why we need confidence intervals. World J Surgery 2005;29(5):554-556.

Gardner MJ, Altman DG. Confidence intervals rather than P values: estimation rather than hypothesis testing. Br Med J. 1986 Mar 15;292(6522):746-750.

Altman DG, Gardner MJ. Calculating confidence intervals for regression and correlation. Br Med J. 1988 Apr 30;296(6631):1238-1242.

Gardner MJ, Altman DG. Using confidence intervals. Lancet. 1987 Mar 28;1(8535):746.

Taken together, you can see that major scientific journals such as Science, the two most respected epidemiology journals (American Journal of Epidemiology and Epidemiology), and major medical journals such as BMJ and Lancet have all published editorials encouraging methods other than frequentist hypothesis testing. As one last example, I have published an article in JAMA with Dr. James Brophy, another professor in our Department, comparing Bayesian credible intervals to frequentist p-values for clinical decision making. It summarizes many of the arguments made in our 621 course about why p-values are not useful for making any important clinical decisions.

Brophy JM, Joseph L. Placing trials in context using Bayesian analysis: GUSTO revisited by Reverend Bayes. Journal of the American Medical Association 1995;273(11):871-875.

If you search the web, you will find many other examples, as well as lecture notes from courses at other universities that contain similar material.