From Hosmer and Lemeshow, Applied Logistic Regression

1.5 The ICU Study

A data set that will be used in exercises throughout the text consists of a sample of 200 subjects who were part of a much larger study on survival of patients following admission to an adult intensive care unit (ICU). The major goal of this study was to develop a logistic regression model to predict the probability of survival to hospital discharge of these patients. A number of publications have appeared that focus on various facets of the problem. The reader wishing to learn more about the clinical aspects of this study should start with Lemeshow, Teres, Avrunin, and Pastides (1988). A code sheet for the variables to be considered in this text is given below in Table 1.4. A listing of the data is provided in Appendix 2.

Table 1.4 Code Sheet for the ICU Data.

#   Name                                                     Codes/Values                                      Column Heading (Appendix 2)
1   Identification Code                                      ID Number                                         ID
2   Vital Status                                             0 = Lived; 1 = Died                               STA
3   Age                                                      Years                                             AGE
4   Sex                                                      0 = Male; 1 = Female                              SEX
5   Race                                                     1 = White; 2 = Black; 3 = Other                   RACE
6   Service at ICU Admission                                 0 = Medical; 1 = Surgical                         SER
7   Cancer Part of Present Problem                           0 = No; 1 = Yes                                   CAN
8   History of Chronic Renal Failure                         0 = No; 1 = Yes                                   CRN
9   Infection Probable at ICU Admission                      0 = No; 1 = Yes                                   INF
10  CPR Prior to ICU Admission                               0 = No; 1 = Yes                                   CPR
11  Systolic Blood Pressure at ICU Admission                 mm Hg                                             SYS
12  Heart Rate at ICU Admission                              Beats/min                                         HRA
13  Previous Admission to an ICU within 6 Months             0 = No; 1 = Yes                                   PRE
14  Type of Admission                                        0 = Elective; 1 = Emergency                       TYP
15  Long Bone, Multiple, Neck, Single Area, or Hip Fracture  0 = No; 1 = Yes                                   FRA
16  PO2 from Initial Blood Gases                             0 = > 60; 1 = <= 60                               PO2
17  PH from Initial Blood Gases                              0 = >= 7.25; 1 = < 7.25                           PH
18  PCO2 from Initial Blood Gases                            0 = <= 45; 1 = > 45                               PCO
19  Bicarbonate from Initial Blood Gases                     0 = >= 18; 1 = < 18                               BIC
20  Creatinine from Initial Blood Gases                      0 = <= 2.0; 1 = > 2.0                             CRE
21  Level of Consciousness at ICU Admission                  0 = No Coma or Stupor; 1 = Deep Stupor; 2 = Coma  LOC

Exercises

1.1 In the ICU data described in Section 1.5 the primary outcome variable is vital status at hospital discharge, STA. Clinicians associated with the study felt that a key determinant of survival was the patient's age at admission, AGE. Write down the equation for the logistic regression model of STA on AGE. Write down the equation for the logit transformation of this logistic regression model. What characteristic of the outcome variable, STA, leads us to consider the logistic regression model as opposed to the usual linear regression model to describe the relationship between STA and AGE?

1.2 Form a scatterplot of STA versus AGE.

1.3 Using intervals based on the empirical octiles (eighths) of AGE, compute the STA mean over subjects within each AGE interval. Plot these values of mean STA versus the midpoint of the AGE interval using the same set of axes as was used in problem 1.2.

1.4 Write down expressions for the likelihood and log-likelihood for the logistic regression model in problem 1.1 using the ungrouped, n = 200, data. Obtain expressions for the two likelihood equations.

1.5 Using a logistic regression package of your choice, obtain the maximum likelihood estimates of the parameters of the logistic regression model in problem 1.1. These estimates should be based on the ungrouped, n = 200, data. Using these estimates, write down the equation for the fitted values, that is, the estimated logistic probabilities. Plot the equation for the fitted values on the axes used in the scatterplots in problems 1.2 and 1.3.

1.6 Summarize (describe in words) the results presented in the plot obtained from problems 1.2, 1.3, and 1.5.

1.7 Using the results of the output from the logistic regression package used for problem 1.5, assess the significance of the slope coefficient for AGE using the likelihood ratio test, the Wald test, and, if possible, the Score test.
What assumptions are needed for the p-values computed for each of these tests to be valid? Are the results of these tests consistent with one another? What is the value of the deviance for the fitted model?

1.8 Compute the values of the discriminant function estimates of the parameters in the logistic regression model of STA on AGE and compare them to the maximum likelihood estimates obtained in problem 1.5. Briefly summarize the assumptions necessary for the discriminant function estimators to be valid and compare them to the assumptions necessary for the conditional maximum likelihood estimators used in problem 1.5.

Repeat problems 1.1, 1.2, and 1.4-1.8 using the variable "type of admission," TYP, as the covariate. The variable TYP is dichotomous, so the scatterplot is actually a 2 x 2 contingency table.
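
The computations asked for in problems 1.3, 1.5, and 1.7 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the text's own method. The ICU data from Appendix 2 are not reproduced here, so the arrays age and sta below are synthetic values generated from made-up coefficients. The function names (octile_means, log_likelihood, fit_logistic) are hypothetical. Newton-Raphson is used in place of a logistic regression package, and the groups are equal-count eighths of the sorted ages rather than intervals cut at exact octile values.

```python
import math
import random

def octile_means(x, y, groups=8):
    """Return (interval midpoint, mean of y) for equal-count groups of sorted x."""
    pairs = sorted(zip(x, y))
    n = len(pairs)
    out = []
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        xs = [p[0] for p in chunk]
        ys = [p[1] for p in chunk]
        out.append(((min(xs) + max(xs)) / 2, sum(ys) / len(ys)))
    return out

def log_likelihood(b0, b1, x, y):
    """Ungrouped log-likelihood: sum of y*eta - log(1 + exp(eta)), eta = b0 + b1*x."""
    return sum(yi * (b0 + b1 * xi) - math.log1p(math.exp(b0 + b1 * xi))
               for xi, yi in zip(x, y))

def fit_logistic(x, y, max_iter=50, tol=1e-10):
    """Newton-Raphson solution of the two likelihood equations.

    Returns (b0, b1, se_b1). The standard error comes from the information
    matrix evaluated just before the final (negligible) update.
    """
    ybar = sum(y) / len(y)
    b0, b1 = math.log(ybar / (1 - ybar)), 0.0  # start at the intercept-only fit
    for _ in range(max_iter):
        u0 = u1 = i00 = i01 = i11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            u0 += yi - p              # first likelihood equation
            u1 += xi * (yi - p)       # second likelihood equation
            i00 += w
            i01 += w * xi
            i11 += w * xi * xi
        det = i00 * i11 - i01 * i01
        d0 = (i11 * u0 - i01 * u1) / det
        d1 = (i00 * u1 - i01 * u0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1, math.sqrt(i00 / det)

# Synthetic stand-in for the ICU data (NOT the real Appendix 2 values).
random.seed(0)
age = [random.randint(16, 92) for _ in range(200)]
sta = [1 if random.random() < 1 / (1 + math.exp(3.0 - 0.03 * a)) else 0
       for a in age]

groups = octile_means(age, sta)                 # problem 1.3
b0, b1, se_b1 = fit_logistic(age, sta)          # problem 1.5

# Problem 1.7: likelihood ratio and Wald tests for the slope.
ybar = sum(sta) / len(sta)
ll_null = log_likelihood(math.log(ybar / (1 - ybar)), 0.0, age, sta)
ll_fit = log_likelihood(b0, b1, age, sta)
G = 2 * (ll_fit - ll_null)                      # chi-square with 1 df under H0
p_lr = math.erfc(math.sqrt(max(G, 0.0) / 2))    # upper tail of chi-square(1)
z = b1 / se_b1
p_wald = math.erfc(abs(z) / math.sqrt(2))       # two-sided normal p-value
deviance = -2 * ll_fit                          # saturated log-likelihood is 0

print("octile (midpoint, mean STA):", groups)
print("b0 =", b0, "b1 =", b1, "se(b1) =", se_b1)
print("G =", G, "p =", p_lr, "| Wald z =", z, "p =", p_wald)
print("deviance =", deviance)
```

With the real data, age and sta would be replaced by the AGE and STA columns of Appendix 2. The fitted curve for problem 1.5 is then 1 / (1 + exp(-(b0 + b1 * AGE))), which can be drawn over the scatterplot and the octile means.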