Q1


Refer to the article Exposure to Scientific Theories Affects Women’s Math Performance by Ilan Dar-Nimrod
and Steven J. Heine. You can find the article
in the .pdf file [link]. The file contains, courtesy of the first author, the
pre- and post-manipulation mathematics scores on which the article is based, along
with some supplementary material (and analyses and notes from JH). If you have trouble
extracting the data from the pdf file, they, and some SAS code, can also be found
in this text file [link].
The analyses done by JH at the end of this .pdf file used the data from all 4 groups.
For this assignment, restrict attention to two groups, i.e. the 'ND vs. S' comparison,
and redo the requested analyses 'from scratch'. Note that for some portions below,
rather than work with the raw math1 scores, it may be easier to work with 'centered'
math1 scores (JH called the variable math1c) whose average across the combined
ND and S groups is 0. He got these by first obtaining the overall math1 mean in the
two groups combined, and subtracting this mean from each individual's math1 score.
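The centering step described above can be sketched in a few lines. This is only an illustration in Python (not one of the packages named in this assignment); the scores are made-up values, not the study data, and the names math1_nd, math1_s, math1c_nd and math1c_s are hypothetical.

```python
import numpy as np

# Hypothetical pre-manipulation scores (NOT the study data), for illustration only.
math1_nd = np.array([10.0, 12.0, 9.0, 11.0])
math1_s = np.array([13.0, 11.0, 12.0, 14.0])

# Overall math1 mean in the two groups combined.
overall_mean = np.concatenate([math1_nd, math1_s]).mean()

# Centered scores: subtract the combined mean from each individual's score.
math1c_nd = math1_nd - overall_mean
math1c_s = math1_s - overall_mean

# By construction, the centered scores average 0 across the combined groups.
combined_centered_mean = np.concatenate([math1c_nd, math1c_s]).mean()
print(overall_mean, combined_centered_mean)
```

Centering leaves every difference between scores unchanged; it only moves the zero point to the combined mean.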
a. Check 'how well the randomization worked' by computing the mean pre-manipulation
math score in each of these two groups, and the difference of the two means. On this
basis, which of the two groups has a 'math advantage' even before the manipulation?
b. Compute the mean and SD of the post-manipulation math scores in each of
these two groups, and the (crude) difference of the two means. By hand, compute the
t-statistic (common variance version). Verify your calculation by running the t-test
in your favourite statistical package {it is called TTEST in SAS and ttest (or
the immediate form ttesti) in Stata}. Comment on the p-value [or the CI for
the difference in means].
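As a reminder of what the 'common variance' calculation involves, here is a minimal sketch in Python. The function name pooled_t and the two small samples are hypothetical and are not the study data; the point is only the order of the arithmetic (pooled variance, then standard error, then t).

```python
import math

def pooled_t(x, y):
    """Two-sample t-statistic using a common (pooled) variance estimate."""
    n1, n2 = len(x), len(y)
    m1, m2 = sum(x) / n1, sum(y) / n2
    # Sums of squared deviations about each group's own mean.
    ss1 = sum((v - m1) ** 2 for v in x)
    ss2 = sum((v - m2) ** 2 for v in y)
    df = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / df
    # Standard error of the difference of the two means.
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

# Made-up scores, for illustration only.
t_stat, df = pooled_t([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(t_stat, df)
```

If SciPy is available, scipy.stats.ttest_ind(x, y, equal_var=True) should give the same statistic, together with a two-sided p-value.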
c. As suggested by some, 'level the playing field' by working with the post-minus-pre
difference in math scores rather than the post-manipulation scores you used in (b),
i.e. repeat step (b) but using the change scores. Comment on the p-value.
d. Even within the ND group (or within the S group), there isn't a perfect
100% correlation between the pre- and post-scores. For each group, plot the post-
vs. pre-scores. Obtain the (within-group) correlations of pre- and post-scores,
and the (again, within-group) regression equations of the post-scores on the pre-scores
(for the regressions: if in SAS, you can for example use PROC REG; if in Stata
you can use 'regress').
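For readers working outside SAS or Stata, the within-group correlation and regression line can be obtained along the following lines. This is a sketch in Python using NumPy; the arrays pre and post hold made-up scores for a single group, not the study data.

```python
import numpy as np

# Hypothetical pre/post scores for ONE group (not the study data).
pre = np.array([1.0, 2.0, 3.0, 4.0])
post = np.array([2.0, 3.0, 5.0, 6.0])

# Within-group correlation of pre- and post-scores.
r = np.corrcoef(pre, post)[0, 1]

# Least-squares line post = intercept + slope * pre.
# np.polyfit returns coefficients from the highest degree down.
slope, intercept = np.polyfit(pre, post, 1)

print(r, slope, intercept)
```

The same lines would be run once per group, each time restricted to that group's subjects.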
e. For two groups of ND subjects, based on the regression equation fitted
to the scores in the ND group, how far apart would you predict their averages to
be on the post-manipulation scores if on average they were (i) 1 point apart on the
pre-manipulation math exam? (ii) 0.9 points apart pre?
Make the same type of calculation for two groups of S individuals 1 point apart pre-manipulation.
f. Given that the mean pre-scores of the S and ND groups were in fact just
about 0.9 points apart, how far apart would you expect the mean post-scores to be
IF the manipulation had NO effect? Do the calculation twice, the first time using
an 'exchange rate' {slope} for the value of 1 extra point pre-manipulation based
on what you saw in the ND group, and the second time using the slope from the S group.
g. Given the crude difference you did see in the post-scores in (b), and the
advantage calculated in (f), what differences in the mean post-scores would you arrive
at if you have leveled the playing field using these two different correction factors?
{you can think of using the pre-scores as giving each person a different 'handicap'
in the second competition, just as if the contest between ND and S groups involved
golf rather than math!}
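The 'exchange rate' arithmetic in (e) to (g) amounts to one multiplication and one subtraction. The sketch below, in Python, uses hypothetical function names and made-up numbers (a slope of 0.5 and a crude difference of 2.0); only the 0.9-point pre-score gap is taken from the text above, and none of the outputs are results from the study.

```python
def expected_post_gap(slope, pre_gap):
    """Post-score gap predicted from a pre-score gap alone (no manipulation effect)."""
    return slope * pre_gap

def leveled_difference(crude_post_diff, slope, pre_gap):
    """Crude post-score difference minus the head start implied by the pre-score gap.

    Both differences must be taken in the same direction
    (e.g. group A minus group B for pre and for post).
    """
    return crude_post_diff - expected_post_gap(slope, pre_gap)

# Made-up inputs, for illustration only.
head_start = expected_post_gap(slope=0.5, pre_gap=0.9)
leveled = leveled_difference(crude_post_diff=2.0, slope=0.5, pre_gap=0.9)
print(head_start, leveled)
```

Running it once with the slope from one group and once with the slope from the other gives the two corrected differences.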
h. Repeat step (c) but using an intermediate (common) exchange rate to obtain
an adjusted post-score for each subject, i.e.,
adjusted post-score
= post-score - 0.58 x (pre-score - average pre-score in combined groups)
where 0.58 is the (assumed common) regression coefficient (slope) obtained in (i)
below.
{this approach uses parallel regression lines for the two groups. Next term,
you will learn how to test whether this assumption of parallel lines, i.e. a common
slope (an assumption in the 'analysis of covariance' that the authors referred
to 6 lines from bottom of second column of article), is justified by the data}.
Comment on the between-group difference in the means
of the adjusted values, and its associated p-value.
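A minimal Python sketch of the adjustment in (h) follows. The slope 0.58 is the value quoted above; the score arrays are made-up values rather than the study data, and the variable names are hypothetical.

```python
import numpy as np

COMMON_SLOPE = 0.58  # assumed common slope, as quoted in item (h)

# Hypothetical scores (NOT the study data), for illustration only.
pre_nd = np.array([10.0, 12.0, 9.0, 11.0])
post_nd = np.array([11.0, 13.0, 10.0, 12.0])
pre_s = np.array([13.0, 11.0, 12.0, 14.0])
post_s = np.array([15.0, 13.0, 14.0, 16.0])

# Average pre-score in the combined groups.
pre_mean_all = np.concatenate([pre_nd, pre_s]).mean()

# adjusted post-score = post-score - slope x (pre-score - combined pre-score mean)
adj_nd = post_nd - COMMON_SLOPE * (pre_nd - pre_mean_all)
adj_s = post_s - COMMON_SLOPE * (pre_s - pre_mean_all)

crude_diff = post_s.mean() - post_nd.mean()
adjusted_diff = adj_s.mean() - adj_nd.mean()
print(crude_diff, adjusted_diff)
```

With these made-up numbers the adjusted difference equals the crude difference minus 0.58 times the pre-score gap between the groups, which is the same bookkeeping as in (g) but with one common slope. The adjusted scores can then be passed to the same t-test as in (b).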
i. For each subject, use an indicator variable S=1 if in group S and 0 if
not, i.e. if in group ND. Then run the following (multiple) regression equation:
(average) postscore = B0 + B1*prescore + B2*S.
In SAS: PROC REG; MODEL Postscore = prescore S;
In Stata: regress Postscore prescore S
How close is the B2 coefficient for S to the difference
in (adjusted) means shown in Figure 1 (Left) in the article?
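For those not using SAS or Stata, the same model can be fitted by ordinary least squares in Python. This is a sketch only: the data below are made up, and are constructed to lie exactly on postscore = 1 + 0.5*prescore + 2*S so that the fit can be checked by eye. They are not the study data.

```python
import numpy as np

# Hypothetical data (NOT the study data): four ND subjects (S=0), four S subjects (S=1).
prescore = np.array([10.0, 12.0, 9.0, 11.0, 13.0, 11.0, 12.0, 14.0])
S = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])
# Constructed with no residual error so the known coefficients are recovered.
postscore = 1.0 + 0.5 * prescore + 2.0 * S

# Design matrix with columns: intercept, prescore, group indicator S.
X = np.column_stack([np.ones_like(prescore), prescore, S])

# Least-squares estimates of B0, B1, B2.
coef, _, _, _ = np.linalg.lstsq(X, postscore, rcond=None)
B0, B1, B2 = coef
print(B0, B1, B2)
```

B2 is the vertical distance between the two fitted parallel lines, i.e. the difference in mean post-scores between groups at any common pre-score.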
j. Draw the pair of fitted parallel lines (obtained by setting S=0 and S=1
respectively in the fitted equation in (i)) in a diagram similar to that in the 'confounding:
reducing it by regression' notes found at the end of the .pdf file.
Interpret the 'crude' and 'adjusted' differences in the light of this diagram, or
in light of the 'anatomy of the adjustment' section of the same notes.
