Hi Dr. Hanley:
Here is the data you requested.  Will is sending his regards.  He was at
Cornell before he came was enrolled in our medical school.  He will be
graduating from our medical school this May.  Will did the paper as part of
a research requirement for the MD degree with Distinction in Research.  I am
his advisor on the project.

*************************************
Kitaw Demissie, MD, PhD
Associate Professor
University of Medicine and Dentistry of New Jersey
732.235.5267

-----Original Message-----
From: William Kostis [mailto:kostiswj@umdnj.edu] 
Sent: Thursday, March 15, 2007 4:34 PM
To: Kitaw Demissie
Subject: Re: Article

Here are the first week mortality data (with better precision) as  
requested by Dr. Hanley.  Please send him my regards.

Will

----------- data and R code  -- jh -----last update march 21----

n.weekdays=44244; n.weekends=15542;

cath.weekdays=c(14.0,20.5,24.3,28.6,32.3,34.4,35.7,56.2);
cath.weekends=c( 8.3,12.3,19.9,26.9,30.9,32.9,33.4,55.3);

pci.weekdays=c(10.0,12.9,14.1,15.1,15.8,16.2,16.5);
pci.weekends=c( 6.7, 8.6,10.9,12.3,13.1,13.6,13.7);

cabg.weekdays=c(0.9,2.0,2.8,3.4,4.1,4.6,5.1);
cabg.weekends=c(0.4,0.8,1.4,2.3,3.1,3.6,3.8);


ci.weekdays=c(1.10, 2.72, 3.83, 4.70, 5.40, 5.97, 6.57,  9.4, 10.9, 12.0, 18.9, 22.9);
ci.weekends=c(1.27, 3.31, 4.67, 5.77, 6.37, 6.95, 7.54, 10.4, 11.8, 12.9, 20.0, 23.9);

n.points=length(ci.weekends);

admission.day=c(1:7,14,21,30,180,365); 
period.lengths=c(1,1,1,1,1,1,1, 7,7,9,150,185);

s.weekdays=100-ci.weekdays;
s.weekends=100-ci.weekends;

at.risk.weekdays = c(100,s.weekdays[1:(n.points-1)]);
at.risk.weekends = c(100,s.weekends[1:(n.points-1)]);

h.weekdays=( ci.weekdays - c(0,ci.weekdays[1:(n.points-1)]) ) /
           ( at.risk.weekdays * period.lengths );
h.weekends=( ci.weekends - c(0,ci.weekends[1:(n.points-1)]) ) /
           ( at.risk.weekends * period.lengths );
           
## note: in above period calculations, the real person time in 
##       denominator of the hazard is slightly smaller that what 
##       is used (technically, we should take out person time
##       from the times of death to the end of the periods, but
##       this small correction wouldn't greatly distort hazard ratios)

mid.point=c(1:7,11, 18, 26,105, 273);

plot(1:n.points,h.weekends/h.weekdays,type="l");

plot(mid.point,h.weekends/h.weekdays,type="l");

# unweighted average of ratio, over 7 days

  mean((h.weekends/h.weekdays)[1:7])

end.point=c(1:7,14, 21, 30,180, 365);
plot(end.point,ci.weekends,type="l");
lines(end.point,ci.weekdays);

# a check on proportionality

# cumulative hazard curves

cum.hazard.weekdays = -log(0.01*s.weekdays);
cum.hazard.weekends = -log(0.01*s.weekends);

plot (1:7,cum.hazard.weekdays[1:7],type="l");
lines(1:7,cum.hazard.weekends[1:7],col="red");

# log[-log[S] ] .ie. log[cumulative hazard], curves

log.cum.hazard.weekdays = log( cum.hazard.weekdays ); 
log.cum.hazard.weekends = log( cum.hazard.weekends );

plot (mid.point,log.cum.hazard.weekdays,type="l");
lines(mid.point,log.cum.hazard.weekends,col="red");

1st 7 days

plot (1:7,log.cum.hazard.weekdays[1:7],type="l");
lines(1:7,log.cum.hazard.weekends[1:7],col="red");  

# ################################################################################

# --dataset for Cox Model - use frequencies to denote # copies of each record..

# ################################################################################

deaths.weekdays=round(0.01*(n.weekdays*h.weekdays*at.risk.weekdays)[1:7]); deaths.weekdays;
deaths.weekends=round(0.01*(n.weekends*h.weekends*at.risk.weekends)[1:7]); deaths.weekends;

persons.weekdays=round(0.01*(n.weekdays*at.risk.weekdays)[1:7]); persons.weekdays;
persons.weekends=round(0.01*(n.weekends*at.risk.weekends)[1:7]); persons.weekends;

deaths =c(deaths.weekdays,  deaths.weekends); deaths
persons=c(persons.weekdays,persons.weekends); persons

# set up dataset with SPLIT records.. ie TAKE IT ONE DAY AT A TIME

# numbers of deaths first, then survivors

  freq=c(deaths,persons-deaths); length(freq) ; 
  
# 1s for deaths, 0 for survivors
  
  event=c( rep(1,14),  rep(0,14) ); length(event); 
 
# weekend indicator

 weekend=c( rep(0,7), rep(1,7), rep(0,7), rep(1,7) ) ; length(weekend);
  
# "start,stop" format

  day.start=c( (1:7), (1:7), (1:7), (1:7) ) ; length(days); 
  day.stop= day.start + 0.99;

# IF had 60,000 individual records, we would have 1 record 
# for each patient for each day lived
# so for 7 days, just less than 42,000 person-days

# for interaction term

  t.weekend.product    = (day.start-1)*weekend;
  
  # its square
                 
  t.sq.weekend.product = (day.start-1)*(day.start-1)*weekend; 

# form the data set..

ds=data.frame(t.start=day.start, t.stop=day.stop,  
      weekend=weekend, t.weekend.product = t.weekend.product, 
      t.sq.weekend.product =t.sq.weekend.product, dead=event, freq = freq);

# print in sorted order..

  ds[order(ds$t.start),] ; # 4 records per day (like 2 x 2 table)


# the empirical HR's

   plot(1:7,(h.weekends/h.weekdays)[1:7])
   lines(1:7,rep(1,7),col="blue")
   points(1:7,(h.weekends/h.weekdays)[1:7],col="red")

# load survival package

  require(survival)

 # constant HR over 7 days
 
 # don't ask for exact methods with this many ties!!
 
   fit.constantHR = coxph(Surv(t.start,t.stop,dead)~ weekend, 
         weights=freq, data=ds); 
         
   summary(fit.constantHR);
   
   lines( 1:7, rep(exp(fit.constantHR$coefficients[1]),7)  )
   
   
## commented out by JH after he understood why it didnt work!

## ds$day = ds$t.start-1; ds

## fit.unfittablemodel = coxph(Surv(t.start,t.stop,dead)~ weekend + day, 
         weights=freq, data=ds); 
         

 # HR whose log increases or decreases linearly over 7 days
   
   fit.LinearLogHR=coxph(Surv(t.start,t.stop,dead) ~ 
                                    weekend + t.weekend.product, 
         weights=freq, data=ds); 
   
   summary(fit.LinearLogHR);

   exp.LinearLogHR=exp(fit.LinearLogHR$coefficients[1] + 
                       fit.LinearLogHR$coefficients[2]*(0:6)
                      )
   lines(1:7,exp.LinearLogHR)
       
 # HR whose log follows quadratic course over 7 days

  fit.CurviLinearLogHR = coxph(Surv(t.start,t.stop,dead) ~ 
                weekend + t.weekend.product + t.sq.weekend.product, 
                weights=freq, data=ds) ;
                  
  summary(fit.CurviLinearLogHR);
  
  exp.CurviLinearLogHR=exp(fit.CurviLinearLogHR$coefficients[1]+
                           fit.CurviLinearLogHR$coefficients[2]*(0:6) +    
                           fit.CurviLinearLogHR$coefficients[3]*(0:6)*(0:6));
  exp.CurviLinearLogHR
  
  lines(1:7,exp.CurviLinearLogHR)

       
 # plot the exp(HR) for this most complex one of the 3 fitted ..

   plot(0:6,exp( 0.1884 + 0.0230*(0:6) - 0.0112*(0:6)*(0:6) ));
   
   
   


  # ======= for comparison.. model the rates and rate ratios via Poisson model
  
  #  (not needed for assignment )

  week.end = c(rep(0,7),rep(1,7) ); 
  admit.day = c(1:7,1:7); 
  product = week.end*(admit.day-1) ;
  
  ds.poisson=data.frame(week.end=week.end, deaths=deaths, persons=persons,
                        product=product, admit.day=admit.day);
                        
  ds.poisson$log.rate=log(deaths/persons)
  ds.poisson
  
  plot(admit.day[1:7], c( rep(-3.5,3),rep(-5.5,4) ),col="white" )
  points(admit.day[1:7], ds.poisson$log.rate[1: 7]           )
  points(admit.day[8:14],ds.poisson$log.rate[8:14] ,col="red")
  
  # first, regress the log rates via lm( ) 
  
  fit.lm=lm(log.rate ~ week.end,data=ds.poisson) ;   # not very good
  summary(fit.lm); exp(fit.lm$coefficients[2])
  lines(1:7,fit.lm$fitted[1:7],col="black");
  lines(1:7,fit.lm$fitted[8:14],col="red")
  
  summary(fit.lm); exp(fit.lm$coefficients[2])
  
  fit.lm2=lm(log.rate ~ week.end + product,data=ds.poisson) ;  # even worse
   
  summary(fit.lm2); exp(fit.lm2$coefficients[2]+fit.lm2$coefficients[2]*(0:6))
  lines(1:7,fit.lm2$fitted[1:7], col="black")
  lines(1:7,fit.lm2$fitted[8:14],col="red");
  
  # could try to model black (weekday) hazard curve, but NOT EASY
  # people like to leave form of "rate function for those with x=0" unspecified
  
  # Poisson regression (gives more weight where more deaths are) not much better.. 
  
  fit0 = glm(deaths~week.end,family=poisson,offset=log(persons) );
  summary(fit0);
  exp( fit0$coefficients[2] )
  
  fitL = glm(deaths ~ week.end + product,family=poisson,offset=log(persons) );
  summary(fitL);
  
  exp( fitL$coefficients[2] + fitL$coefficients[3]*(0:6) )
  
  

=========

-----Original Message-----
From: james.hanley [mailto:james.hanley@mcgill.ca] 
Sent: Wednesday, March 14, 2007 11:20 PM
To: demisski@umdnj.edu
Cc: self
Subject: Article

Hi Kitaw

Greetings..

Great to see a McGill and Dept. grad in the NEJM !!

And Fig 1 and table 3 are going to figure big in my teaching this Friday

I am doing a 1-credit survival analysis and related topics course
http://www.epi.mcgill.ca/hanley/c634

First year students (lawrence is teaching 621 data analysis
(regular and logistic regression.. Michal is on sabbatical
and lawrence has taken over 621)

And last and this week am covering the cox model

Last week's assignment was the data from the uganda rct of circumcision
and prevention of hiv  -- no effect in 1st 6 mo, big effect thereafter.
So hazard ratio not constant, so the average of 50% reduction is probably
too low

For this week I will ask students to 'take apart' (day by day) the 2
cumulative columns in Fig and table 3, and check proportionality .. I think
in your case the action is mainly in the 1st few days..

Indeed I think the CBC newssite
http://www.cbc.ca/health/story/2007/03/14/heart-attacks-weekend.html
underestimates the effect when they say

"The risk of death was about five per cent higher in the month after the
attack, the team reports in Thursday's issue of the New England Journal of
Medicine. In the U.S., the rate translates into several thousand people
each"

I think they got that from the 1.048 HR at bottom of last column of Table 3
Or maybe from the abstract [Olli would say that in 30 years time, for these
2 cohorts, the hr will be close to 1..  Since no further perturbations after
say 30 days or even after 14]

Is there any chance of getting the actual numbers of deaths for days 1 to
say day 30 (or even day 7) in each of the two groups for 1999-2002.. ? The
rounding in the %s makes the day by days hazards a bit too coarse, and the
day by day HR's a bit wobbly for the exercise I am planning)

Or if takes too long to run these these over again,  I would be happy with
the cumulative mortality to even one more decimal place

Thanks, and again, great work... And its great that the paper gives so much
of the (almost) raw data .. And of course I don't expect the adjustments for
the very minor confounding are worth fussing about .. The students can learn
the big picture from these data alone...

IF you can put your hands on the counts or 2 decimal places cumulative %s
and fax or email them to me right away, I would put them in the assignment
I will be giving out on Friday! IF NOT, I can still make do (I can imagine
what it must be like with an international media story of this size ..

And I love the way you guys 'fingered' the lower use of interventions..
by putting this in the model and explianing away the difference... Our
students need to understand the word 'mediate" {I don't think we even used
that work correctly when you were here!}

Regards, and congratulations again..

Jim