Description of 'bodyfat0.dat' (free format)

Variable
---------------------------------------
Triceps    cm    Triceps SkinFold Thickness
Thigh      cm    Thigh Circumference
Midarm     cm    Midarm Circumference
BodyFat    %     Body Fat

BODY FAT DATA
Source: Neter, Example p. 260; Table 7.1, p. 261

An extra sum of squares measures the marginal reduction in the error sum of squares when one or several predictor variables are added to the regression model, given that other predictor variables are already in the model. Equivalently, one can view an extra sum of squares as measuring the marginal increase in the regression sum of squares when one or several predictor variables are added to the regression model. We first use an example to illustrate these ideas, and then we present definitions of extra sums of squares and discuss a variety of uses of extra sums of squares in tests about regression coefficients.

Example. Table 7.1 contains a portion of the data for a study of the relation of amount of body fat (Y) to several possible predictor variables, based on a sample of 20 healthy females 25-34 years old. The possible predictor variables are triceps skinfold thickness (X1), thigh circumference (X2), and midarm circumference (X3). The amount of body fat in Table 7.1 for each of the 20 persons was obtained by a cumbersome and expensive procedure requiring the immersion of the person in water. It would therefore be very helpful if a regression model with some or all of these predictor variables could provide reliable estimates of the amount of body fat, since the measurements needed for the predictor variables are easy to obtain.

Table 7.2 on pages 262 and 263 contains some of the main regression results when body fat (Y) is regressed (1) on triceps skinfold thickness (X1) alone, (2) on thigh circumference (X2) alone, (3) on X1 and X2 only, and (4) on all three predictor variables. To keep track of the regression model that is fitted, we shall modify our notation slightly.

The regression sum of squares when X1 only is in the model is, according to Table 7.2a, 352.27. This sum of squares will be denoted by SSR(X1). The error sum of squares for this model will be denoted by SSE(X1); according to Table 7.2a it is SSE(X1) = 143.12. Similarly, Table 7.2c indicates that when X1 and X2 are in the regression model, the regression sum of squares is SSR(X1, X2) = 385.44 and the error sum of squares is SSE(X1, X2) = 109.95.

Notice that the error sum of squares when X1 and X2 are in the model, SSE(X1, X2) = 109.95, is smaller than when the model contains only X1, SSE(X1) = 143.12. The difference is called an extra sum of squares and will be denoted by SSR(X2 | X1):

    SSR(X2 | X1) = SSE(X1) - SSE(X1, X2) = 143.12 - 109.95 = 33.17

This reduction in the error sum of squares is the result of adding X2 to the regression model when X1 is already included in the model. Thus, the extra sum of squares SSR(X2 | X1) measures the marginal effect of adding X2 to the regression model when X1 is already in the model. The notation SSR(X2 | X1) reflects this additional or extra reduction in the error sum of squares associated with X2, given that X1 is already included in the model.
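As a computational illustration (not part of the source text), SSR(X2 | X1) can be reproduced by fitting the two regressions and differencing their error sums of squares. The sketch below assumes 'bodyfat0.dat' holds the four whitespace-separated columns X1, X2, X3, Y in the order listed above; the file name and column order come from this description, everything else is illustrative.

    # Sketch: compute SSR(X2 | X1) = SSE(X1) - SSE(X1, X2)
    # Assumes bodyfat0.dat has free-format columns: X1 X2 X3 Y
    import numpy as np

    data = np.loadtxt("bodyfat0.dat")
    X1, X2, X3, Y = data.T

    def sse(*predictors):
        """Error sum of squares for an OLS fit of Y on the given predictors."""
        X = np.column_stack([np.ones_like(Y), *predictors])
        beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
        resid = Y - X @ beta
        return resid @ resid

    print(round(sse(X1), 2))                 # about 143.12  = SSE(X1)
    print(round(sse(X1, X2), 2))             # about 109.95  = SSE(X1, X2)
    print(round(sse(X1) - sse(X1, X2), 2))   # about  33.17  = SSR(X2 | X1)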
The extra sum of squares SSR(X2 | X1) can equivalently be viewed as the marginal increase in the regression sum of squares:

    SSR(X2 | X1) = SSR(X1, X2) - SSR(X1) = 385.44 - 352.27 = 33.17

The reason for the equivalence of the marginal reduction in the error sum of squares and the marginal increase in the regression sum of squares is the basic analysis of variance identity (2.50):

    SSTO = SSR + SSE

Since SSTO measures the variability of the Y observations and hence does not depend on the regression model fitted, any reduction in SSE implies an identical increase in SSR.

We can consider other extra sums of squares, such as the marginal effect of adding X3 to the regression model when X1 and X2 are already in the model. We find from Tables 7.2c and 7.2d that:

    SSR(X3 | X1, X2) = SSE(X1, X2) - SSE(X1, X2, X3) = 109.95 - 98.41 = 11.54

or, equivalently:

    SSR(X3 | X1, X2) = SSR(X1, X2, X3) - SSR(X1, X2) = 396.98 - 385.44 = 11.54

We can even consider the marginal effect of adding several variables, such as adding both X2 and X3 to the regression model already containing X1 (see Tables 7.2a and 7.2d):

    SSR(X2, X3 | X1) = SSE(X1) - SSE(X1, X2, X3) = 143.12 - 98.41 = 44.71

or, equivalently:

    SSR(X2, X3 | X1) = SSR(X1, X2, X3) - SSR(X1) = 396.98 - 352.27 = 44.71

(A computational sketch of these extra sums of squares follows the data table below.)

Triceps      Thigh           Midarm          Body
SkinFold     Circumference   Circumference   Fat
Thickness
   X1             X2              X3           Y
  19.5           43.1            29.1         11.9
  24.7           49.8            28.2         22.8
  30.7           51.9            37.0         18.7
  29.8           54.3            31.1         20.1
  19.1           42.2            30.9         12.9
  25.6           53.9            23.7         21.7
  31.4           58.5            27.6         27.1
  27.9           52.1            30.6         25.4
  22.1           49.9            23.2         21.3
  25.5           53.5            24.8         19.3
  31.1           56.6            30.0         25.4
  30.4           56.7            28.3         27.2
  18.7           46.5            23.0         11.7
  19.7           44.2            28.6         17.8
  14.6           42.7            21.3         12.8
  29.5           54.4            30.1         23.9
  27.7           55.3            25.7         22.6
  30.2           58.6            24.6         25.4
  22.7           48.2            27.1         14.8
  25.2           51.0            27.5         21.1
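A similar sketch, again assuming the table above is stored in 'bodyfat0.dat' as free-format columns X1, X2, X3, Y, reproduces the remaining extra sums of squares both as reductions in SSE and as increases in SSR, using the identity SSTO = SSR + SSE.

    # Sketch: extra sums of squares computed two equivalent ways.
    # Assumes bodyfat0.dat holds the table above as columns X1 X2 X3 Y.
    import numpy as np

    data = np.loadtxt("bodyfat0.dat")
    X1, X2, X3, Y = data.T
    ssto = np.sum((Y - Y.mean()) ** 2)      # total sum of squares

    def fit(*predictors):
        """Return (SSR, SSE) for an OLS fit of Y on the given predictors."""
        X = np.column_stack([np.ones_like(Y), *predictors])
        beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
        sse = np.sum((Y - X @ beta) ** 2)
        return ssto - sse, sse              # uses SSTO = SSR + SSE

    ssr_1,   sse_1   = fit(X1)
    ssr_12,  sse_12  = fit(X1, X2)
    ssr_123, sse_123 = fit(X1, X2, X3)

    # SSR(X3 | X1, X2): about 11.54 either way
    print(round(sse_12 - sse_123, 2), round(ssr_123 - ssr_12, 2))
    # SSR(X2, X3 | X1): about 44.71 either way
    print(round(sse_1 - sse_123, 2), round(ssr_123 - ssr_1, 2))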