**November lectures**

**Dear Dr. Hanley.
Just a few quick questions regarding your epidemiology lecture to our med2 class**.

NOT CORRECT

inversely related to the square ROOT of the n

see eqns. on page 2 of notes, left column ...

does this mean the SE is reduced by the square of 4?

NO,

it is reduced by a factor of sqrt[2] = 1.41 !

It goes from proportional to 1/sqrt[4] to proportional to 1/sqrt[8]

that is from 1/2 = 0.5 to 1/sqrt[8]= 0.353

that is NOT reduced by the square of 4 !

it IS reduced by the square ROOT of 2 !

If you went from 4 to 16 (quadruple), you would reduce it by a factor of 2

ie from 1/sqrt[4] to 1/sqrt[16]

ie from 0.5 to 0.25! that's a factor of 2 !
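
A quick numerical check of the arithmetic above. This is only a sketch: the helper name `relative_se` is made up for illustration, and it tracks just the 1/sqrt[n] part of the SE (the SD that multiplies it is left out, since it cancels when comparing two sample sizes).

```python
import math

def relative_se(n):
    """Return 1/sqrt(n), the part of the SE that depends on sample size."""
    return 1 / math.sqrt(n)

# Doubling n (4 -> 8) and quadrupling n (4 -> 16), as in the reply above.
for n_old, n_new in [(4, 8), (4, 16)]:
    factor = relative_se(n_old) / relative_se(n_new)
    print(f"n {n_old} -> {n_new}: "
          f"{relative_se(n_old):.3f} -> {relative_se(n_new):.3f}, "
          f"reduced by a factor of {factor:.2f}")
```

Doubling n shrinks the SE by sqrt[2], not by 2; it takes four times the n to halve it.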

Do I have this right 'As you increase the n, you decrease the SE (by the square of n?) ',

YES you decrease the SE , BUT not in the way you say it..

SE if use n

is proportional to 1/sqrt[n] (1)

SE if use k times n

is proportional to 1/sqrt[k times n]

ie proportional to a value which is sqrt[k] times smaller than (1)

INDEED!

INDEED!

Remember my example on last page of my handout

n=1000 (all Canada) margin of error = 3 percentage points

n= 250 (Quebec only) margin of error = 6 percentage points

Quebec sample size is 4 times smaller

so SE (and thus the margin of error in a 95% CI, namely 1.96 x SE)

is sqrt[4] = 2 times larger
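
The two margins of error can be roughly reproduced as follows. This is a sketch under assumptions not stated in the handout example: simple random sampling, a proportion near 50%, and SE = sqrt[p(1-p)/n]. The function name `margin_of_error` is illustrative only.

```python
import math

Z_95 = 1.96  # multiplier for a 95% CI, as in the notes

def margin_of_error(n, p=0.5, z=Z_95):
    """Margin of error (as a proportion) for a sample proportion.

    Assumes simple random sampling and SE = sqrt(p * (1 - p) / n);
    p = 0.5 is an assumed value, chosen because it gives the largest SE.
    """
    se = math.sqrt(p * (1 - p) / n)
    return z * se

for label, n in [("all Canada", 1000), ("Quebec only", 250)]:
    points = 100 * margin_of_error(n)
    print(f"{label}: n = {n}, margin of error = {points:.1f} percentage points")
```

Whatever value of p is assumed, the ratio of the two margins is sqrt[1000/250] = 2.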

that the confidence in this CI is less?

YES

if you use a smaller multiple (eg 1.645 rather than 1.96) of the SAME SE to get

a smaller margin of error, then you CAN'T have as much

confidence in it

how about i ask you to estimate my age to within +/- 10 years

HOW much, with same info, would you be willing to bet that your interval is correct?

What if I said you must estimate it to within +/- 1 year?

Would you be willing to bet as much?

if i ask you for a CI for the average salary of all Quebec md's

and I ask you to give a number +/- $1000

you (or i ) won't have as much confidence in that interval

as if you could use a margin of error of say +/- $10000.

BUT kept the multiplier (eg 1.96) the SAME

then your confidence (95%) is the same..

but you now have a smaller margin of error...

that is a bit like asking me some more info

like when i got my PhD and what my BP is

and the age of one of my sibs ! more info

lets you narrow the interval,

while keeping the same degree of confidence

ie you would now bet same amount on narrower interval

or keep the same interval but have more confidence (bet more)

the degree of confidence is something you set to suit

the demands of the situation and unless you can change

one of the basic inputs (such as the amount of info)

you must settle for less confidence in narrower intervals

or more confidence in wider ones
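
The trade-off can be put in numbers. A minimal sketch, assuming a mean whose SE is SD/sqrt[n]; the SD and n below are hypothetical values chosen only for illustration, and `margin` is an illustrative helper name.

```python
import math

Z_90, Z_95 = 1.645, 1.96  # multipliers quoted in the notes

def margin(multiplier, sd, n):
    """Margin of error = multiplier x SE, with SE = sd / sqrt(n)."""
    return multiplier * sd / math.sqrt(n)

sd, n = 10.0, 100  # hypothetical SD and sample size

# Same information, smaller multiplier: narrower interval, less confidence.
print(f"95%, n = {n}:   +/- {margin(Z_95, sd, n):.3f}")
print(f"90%, n = {n}:   +/- {margin(Z_90, sd, n):.3f}")

# More information (4 times the n), same multiplier:
# half the margin, same 95% confidence.
print(f"95%, n = {4 * n}:   +/- {margin(Z_95, sd, 4 * n):.3f}")
```

Only the third line narrows the interval without giving up confidence, and it does so by changing a basic input (the amount of info).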

CI, constructed in this way, contain the true value

95% of the time, and don't 5% of the time. We don't

know which one though!

EXCELLENT!
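
One way to see the "95% of the time" statement is to simulate it. The sketch below makes assumptions that are not in the lecture: a Gaussian population with a known SD, a sample mean as the statistic, and arbitrary values for the true mean, SD, n and the random seed. The function name `coverage` is illustrative.

```python
import math
import random

def coverage(n_reps=2000, n=25, true_mean=50.0, sd=10.0, z=1.96, seed=1):
    """Fraction of simulated CIs (sample mean +/- z x SE) that contain true_mean."""
    rng = random.Random(seed)
    se = sd / math.sqrt(n)  # SE of the mean, with the SD treated as known
    hits = 0
    for _ in range(n_reps):
        sample_mean = sum(rng.gauss(true_mean, sd) for _ in range(n)) / n
        if sample_mean - z * se <= true_mean <= sample_mean + z * se:
            hits += 1
    return hits / n_reps

print(f"fraction of intervals containing the true value: {coverage():.3f}")
```

The fraction should come out close to 0.95. In a real study there is only one interval, and no way to tell whether it is one of the roughly 5% that miss.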

**How do you get 1.96 for 95%CI. Ie. What is it for 90%?**

In my notes I said it was 1.645 for 90%

These values come from the Gaussian (Normal) distribution

we know that 95% of the values in a Gaussian (Normal) distribution fall within 1.96 SD's of the mean

we know that 90% of the values in a Gaussian (Normal) distribution fall within 1.645 SD's of the mean

we know that 68% of the values in a Gaussian (Normal) distribution fall within 1.0 SD's of the mean

etc

these are tabulated and available on calculators
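
For those who prefer software to tables, the same values can be looked up with Python's standard library (`statistics.NormalDist`, Python 3.8 or later). This is a sketch; the helper names `multiplier` and `share_within` are made up for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # Gaussian with mean 0 and SD 1

def multiplier(confidence):
    """Number of SD's either side of the mean that covers the central `confidence` share."""
    # The central share leaves (1 - confidence) / 2 in each tail.
    return std_normal.inv_cdf(0.5 + confidence / 2)

def share_within(z):
    """Share of a Gaussian distribution lying within z SD's of the mean."""
    return std_normal.cdf(z) - std_normal.cdf(-z)

for conf in (0.90, 0.95):
    print(f"{conf:.0%} CI: multiplier = {multiplier(conf):.3f}")

print(f"share within 1.0 SD of the mean = {share_within(1.0):.3f}")
```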

and before the euro, the equation of the normal curve used to be on the German 10 mark note along with a portrait of Gauss

this distribution is usually used for individual values (eg heights)

BUT it also applies to STATISTICS (ie numbers calculated from aggregates,

ie samples)

so it is the same business.. except we use SE to describe the

variability of a statistic.

I would be grateful for any feedback on this or the lectures..

J Hanley (James.Hanley@McGill.CA)

=====================

**September lectures...**

Dear Dr. Hanley.

I am a student in the med/dent 2 class. I have two questions I would like to ask
you.

**1.
At the beginning of today's lecture (Friday), you gave us material that was not addressed
in the lecture. As you know, the material was about statistics. I read it, and had
great difficulty understanding the concepts. My question is, will you explain this
material in subsequent lectures, or would you like us to understand it now for the
upcoming exam?
**

agree that it is difficult without me there to explain it in person -- not suitable for review by yourself..

I will indeed have to go over it again in Nov and so it won't be on the upcoming exam

[there just wasn't enough time to get thru all this in 5 hrs]

Am I correct with the following terminology/concepts:

-experimental: investigator selects and allocates experiment groups.

YES.. AND WITH A VIEW TO LEARNING SOMETHING..

ie it can't be just in course of usual clinical activities

(where investigator could also select and allocate.. )

CORRECT .. subjects select their own lifestyle, environment, etc.. (or luck and their parents pass on certain genes or blood groups etc to them!)

that's why some people call them "observational" studies [they would be better to say "observation

correct.. there can be many factors that differ b/w the group "exposed" to the "agent" of interest and the group not exposed, and that

could distort the comparison

(unless investigator is VERY FORTUNATE -- as was John Snow)

People don't say "cohort EXPERIMENT"

cohort is a group that is followed up (a clin trial has at least 2 such groups, and IS experimental...)

BUT there are many cohorts whose "exposure" is NOT allocated by the investigator

(as you correctly state above, to be an experiment, it is the investigator who allocates the Rx)

ie most cohort studies are NON-EXPERIMENTAL and indeed some see cohorts as a subtype of

NON-EXPERIMENTAL studies. If you read the excerpt from Rothman and Greenland, you will see that they put cohort studies as a subtype of NON-EXPERIMENTAL studies.

the main thing that defines the cohort is the "start with denominator" attitude. There is nothing in the definition that says who put which persons in the "exposed" subgroup, and which in the "unexposed" subgroup

-- it could be the subjects themselves (non-exptl) OR the investigator (experimental eg clin or field or community trial)

YES, but

PS: we were talking of medical reside

Comparison between groups.

careful here.. CONCEPTUALLY we always compare rates in the "exposed" and the "not exposed"

even if we have to use quasi-denominators to do so (and limit ourselves to RATIOS
of rates)

**Numerator selected first. **

YES (ie start with the cases of the disease / outcome of interest)

then sample from the "base" that generated the cases in order to estimate
the relative sizes of the "exposed" denominator and the "unexposed"
denominator

[denominator series is usually called "control series" but as I and others
point out, the word "control" can be quite confusing here ]

it is much more descriptive to refer to the sample used to estimate the relative
sizes of the 2 denominators as the "denominator series"
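
The logic can be sketched in a few lines. All counts below are hypothetical, invented purely for illustration, and the function name `rate_ratio` is not from the notes. The sketch assumes the denominator series is a fair sample of the base that generated the cases.

```python
def rate_ratio(cases_exposed, cases_unexposed, denom_exposed, denom_unexposed):
    """Estimate the RATIO of rates (exposed vs unexposed) from a case-control study.

    The denominator series cannot give the absolute sizes of the two
    denominators, only their relative sizes, so only the ratio of the two
    rates can be estimated, not the rates themselves.
    """
    relative_numerators = cases_exposed / cases_unexposed
    relative_denominators = denom_exposed / denom_unexposed  # quasi-denominators
    return relative_numerators / relative_denominators

# Hypothetical example: 100 cases (40 exposed, 60 unexposed) and a
# denominator series of 100 sampled from the base (20 exposed, 80 unexposed).
estimate = rate_ratio(40, 60, 20, 80)
print(f"estimated rate ratio = {estimate:.2f}")
```

In these made-up data, 40% of the cases are exposed but only 20% of the base is, so the rate in the exposed is estimated to be (40/60)/(20/80), about 2.7 times the rate in the unexposed.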

**It is nonexperimental.**

YES case-ctl studies are ALWAYS NON-EXPTL

if one STARTS with numerators (ie cases) then by definition it is already after the
fact, and (presumably) the first time that the investigator has "come on the
scene" .. so investigator could not even have been present earlier on when the
subjects chose their "exposure".

JH