Course 678: Analysis of Multivariable Data. June 1999

Homework to be handed in by Wednesday June 2

1. KKM Ch 5, Question 6 pp 69-70

   a    Import (or type) the data into your statistical package
        Make a plot -- as on p 69
        Use your software to verify the least squares estimates of the
        slope and intercept shown in the output on page 70.
        Make a plot with the fitted line drawn in and comment on the fit.

   b-e  Answer KKMN's questions b,c,d,e

   f    Use the output on page 70, or your software, to answer KKMN's
        question f.

   g    Explain why, as a parent, it would have made more sense to pose a
        question on the material in section 5-10, than the question they
        posed (which relates to section 5-9). You might want to read my
        notes on section 5-10. Question d on Blood Alcohol and Eye
        Movements (below) gets at the same issue.

        [If we have not yet covered sections 5-9 and 5-10 in class, defer
        questions f and g]

   h    From your own experience, and from the authors' description of
        the ATST data (e.g. a value of 461.75 based on an average of
        3 nights!), do you have any doubts about the authenticity of
        these data!!

2. A British study examined the distribution of weight and stature
   (height) of 4,995 women. The slope of the linear regression of weight
   on height was reported to be 2.7, but the units were not given.
   Imagine what the scatter plot looked like. From the reported slope,
   do you think the weight and height were

                   (a)      (b)    (c)    (d)      (e)
      Weight in:   lbs.     Kg     lbs.   Kg       other
      Height in:   inches   cm     cm     inches   other

   Explain your reasoning. None of (a)-(d) may be that realistic, but it
   should be possible to rule out some possibilities!

3. KKM Ch 5, Question 12 pp 78-79

   a    Import (or type) the data into your statistical package
        Make a plot

   b    Use your software to verify the least squares estimates of the
        slope and intercept shown in the output on page 78.
        Make a plot with the fitted line drawn in and comment on the fit.

   c,d  Answer KKMN's questions c,d

   e    Answer KKMN's question e.
        [Elective] Fit your suggested model and comment on the fit.
        You might -- in INSIGHT -- try "fitting" the highest possible
        degree polynomial to the data -- and then having to explain why
        the vocabulary "dips" between ages 5 and 6!! Remember to ask me
        about the investigator who fitted the highest possible degree
        polynomial to daily WBCs (White Blood Cell counts)! And ask
        yourself HOW the investigator determined that at age 5 the child
        has a vocabulary of EXACTLY 2072 words.

   f    I expect that they are worried about INDEPENDENCE of the "error"
        components in the model. If these were 15 observations from 15
        DIFFERENT children, and one had a good model for the expected
        (MEAN) values, this clearly would not be a problem. But if our
        purpose is simply to describe the progression of THIS ONE child,
        and to use the model for interpolation FOR THIS ONE CHILD, the
        "errors" may well be independent. In (even a good) model for
        THIS ONE CHILD, why might the "errors" be somewhat (serially)
        correlated?

========================================================================

OPTIONAL

   Blood Alcohol and Eye Movements (see under datasets on web page)
   Why do old men have big ears? (see under Chapter 5 on web page)
   Sleeping through the Night (also under Chapter 5)
   Difference in Bone Density over 2 centuries (also under Chapter 5)
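
Note on Questions 1a and 3b: if your statistical package is not at hand, the least squares estimates can be checked from the usual formulas, slope = Sxy/Sxx and intercept = ybar - slope * xbar. The sketch below is one possible way to do this in Python with NumPy. The function name least_squares_line and the demo numbers are made up for illustration; they are not the KKMN data, which you would type in yourself.

```python
import numpy as np

def least_squares_line(x, y):
    """Return (intercept, slope) of the least squares line of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    # slope = Sxy / Sxx, intercept = ybar - slope * xbar
    slope = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    intercept = y_bar - slope * x_bar
    return intercept, slope

# Hypothetical numbers for illustration only (NOT the KKMN data).
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0]
y_demo = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = least_squares_line(x_demo, y_demo)

# Cross-check against NumPy's own degree-1 fit (returns slope first).
b1_np, b0_np = np.polyfit(x_demo, y_demo, 1)

print(f"intercept = {b0:.4f}, slope = {b1:.4f}")
```

Replacing x_demo and y_demo with the textbook data should let you compare b0 and b1 with the printed output in the text.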
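
Note on Question 2: a regression slope carries the units "weight unit per height unit", so the same fitted line has a different numerical slope under each choice of units. As a rough aid to the reasoning (not an answer), the sketch below re-expresses the reported 2.7 in kg per cm under each of options (a)-(d), so the four candidates can be compared on one scale. The function name and the dictionary of options are my own labels; the conversion factors are the standard ones.

```python
LB_PER_KG = 2.20462262   # pounds in one kilogram
CM_PER_INCH = 2.54       # centimetres in one inch (exact)

def slope_in_kg_per_cm(slope, weight_unit, height_unit):
    """Re-express a weight-on-height slope in kg per cm."""
    kg_per_weight_unit = {"kg": 1.0, "lb": 1.0 / LB_PER_KG}[weight_unit]
    cm_per_height_unit = {"cm": 1.0, "in": CM_PER_INCH}[height_unit]
    return slope * kg_per_weight_unit / cm_per_height_unit

reported_slope = 2.7  # from Question 2; its units are what we are guessing

# Options (a)-(d) as laid out in the question: (weight unit, height unit).
options = {
    "a": ("lb", "in"),
    "b": ("kg", "cm"),
    "c": ("lb", "cm"),
    "d": ("kg", "in"),
}

converted = {
    label: slope_in_kg_per_cm(reported_slope, w, h)
    for label, (w, h) in options.items()
}

for label, value in converted.items():
    print(f"({label}) would mean {value:.2f} kg per extra cm of height")
```

Which of these values is plausible for adult women is the judgement the question asks you to make.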
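
Note on the elective part of Question 3e: with n data points, a polynomial of degree n - 1 can be made to pass through every point, so a "perfect" fit says nothing about how the curve behaves between the observed ages. The sketch below illustrates this with NumPy's Polynomial.fit. The ages and word counts are invented for illustration and are not the KKMN data.

```python
import numpy as np
from numpy.polynomial import Polynomial

# Hypothetical ages (years) and vocabulary sizes (words); NOT the KKMN data.
ages = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
words = np.array([10.0, 300.0, 900.0, 1500.0, 2000.0, 2600.0])

# With n points, a polynomial of degree n - 1 can pass through all of them.
degree = len(ages) - 1
curve = Polynomial.fit(ages, words, degree)

# Residuals at the observed ages are zero up to rounding error.
max_abs_residual = np.max(np.abs(curve(ages) - words))

# Between the observed ages the curve is unconstrained, so inspect
# its slope on a fine grid; a negative value would mean a "dip".
grid = np.linspace(ages.min(), ages.max(), 501)
min_slope = curve.deriv()(grid).min()

print(f"degree = {degree}, largest residual = {max_abs_residual:.2e}")
print(f"smallest slope on the grid = {min_slope:.1f} words per year")
```

Running the same steps on the textbook data is one way to see for yourself whether the interpolating curve dips.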
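
Note on Question 3f: one simple, informal check for serial correlation is the lag-1 autocorrelation of the residuals taken in time (age) order. The sketch below shows one common version of that statistic. The function name and both residual series are hypothetical, chosen only to show what the two extremes look like.

```python
import numpy as np

def lag1_autocorrelation(residuals):
    """Lag-1 autocorrelation of residuals listed in time order."""
    e = np.asarray(residuals, dtype=float)
    e = e - e.mean()
    # Sum of products of neighbouring residuals over the total sum of squares.
    return np.sum(e[1:] * e[:-1]) / np.sum(e ** 2)

# Hypothetical residuals that drift slowly from negative to positive.
drifting = [-3.0, -2.0, -1.0, 1.0, 2.0, 3.0]

# Hypothetical residuals that flip sign at every step.
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]

r_drifting = lag1_autocorrelation(drifting)
r_alternating = lag1_autocorrelation(alternating)

print(f"drifting residuals:    r1 = {r_drifting:.3f}")
print(f"alternating residuals: r1 = {r_alternating:.3f}")
```

A value near zero is what independent errors would tend to give; with only 15 observations the statistic is noisy, so treat it as a description rather than a test.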