The determinants and measurement of human capital
THE DETERMINANTS AND MEASUREMENT OF HUMAN CAPITAL

by Teresa Molina

A dissertation presented to the faculty of the University of Southern California Graduate School in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Economics

May 2017

Copyright 2017 Teresa Molina

Abstract

This thesis explores questions related to the early-life determinants of human capital and the measurement of one crucial component of human capital: health.

The long-term effects of early-life health shocks on later-life human capital are well-documented, but the reasons why men and women often respond differently to these shocks are less well-studied. In Chapter 2, using data from Mexico, I show that exposure to pollution in the second trimester of gestation leads to significantly lower cognitive ability in adulthood for both men and women. For women only, however, this shock to cognitive ability also leads to lower high school completion and income. I identify two labor market features that explain why women adjust their schooling decisions more than men: (1) women sort into the white-collar sector at higher rates, and (2) schooling and ability are more complementary in the white-collar sector than in the blue-collar sector. I verify the higher degree of complementarity in white-collar jobs by structurally estimating the wage parameters for each sector, using a dynamic discrete choice model of education and occupational choice.

Can investing in children who faced adverse events in early childhood help them catch up? In Chapter 3, Achyuta Adhvaryu, Anant Nyshadham, Jorge Tamayo, and I answer this question using two orthogonal sources of variation (resource availability at birth, measured by local rainfall, and cash incentives for school enrollment) to identify the interaction between early endowments and investments in children. We find that adverse rainfall in the year of birth decreases grade attainment, post-secondary enrollment, and employment outcomes.
But children whose families were randomized to receive conditional cash transfers experienced a much smaller decline: each additional year of program exposure during childhood mitigated more than 20 percent of early disadvantage.

Self-reported measures of health are becoming more widely used to study health inequalities both across and within countries, but comparisons of these subjective measures can be distorted by the use of different response thresholds across individuals. In Chapter 4, I use anchoring vignettes from Indonesia, the U.S., England, and China to study the extent to which differences in self-reported health across genders and education levels can be explained by the use of different response thresholds. To determine whether statistically significant differences between groups remain after adjusting thresholds, I calculate standard errors for the simulated probabilities, largely ignored in previous literature. Accounting for reporting heterogeneity reduces the gender gap in many health domains across the four countries, but to varying degrees. Health disparities across education levels persist and even widen after equalizing thresholds across the two groups.

Acknowledgements

This dissertation would not exist in its current form without the support of a wonderful group of mentors, colleagues, family, and friends. The guidance and support I received from my advisor, John Strauss, has been crucial to the growth and improvement of each chapter of this thesis, as well as my skills as a researcher. For all of the valuable time he devoted to thoroughly reading numerous drafts, providing feedback at seminars, and meeting to discuss various research projects, I am extremely grateful. I would also like to thank my mentors and co-authors, Achyuta Adhvaryu and Anant Nyshadham, for giving me early hands-on experience with rigorous academic research and sharing their contagious passion for this work.
The diverse expertise of the rest of my qualifying exam and dissertation committees (Arie Kapteyn, Jeff Nugent, Neeraj Sood, and Geert Ridder) was invaluable throughout various stages of each project. All of my work has benefited immensely from conversations with other USC graduate students and faculty members, always generous with their time and willing to answer technical questions or provide insightful comments. I am beyond fortunate to have had the most incredible family, fiancé, and friends supporting me throughout the process.

I gratefully acknowledge funding from the USC Provost's Ph.D. Fellowship, the USC Dornsife INET graduate student fellowship, the Oakley Endowed Fellowship, and the Gold Family Fellowship. Many thanks to Graciela Teruel and the MxFLS Support Team for providing me with the restricted-use data used in Chapter 2. All errors are my own.

Contents

1 Introduction
2 Pollution, Ability, and Gender-Specific Investment Responses to Shocks
    2.1 Introduction to Chapter 2
    2.2 Model
    2.3 Background and Data
        2.3.1 Pollution and Thermal Inversions
        2.3.2 Data
    2.4 Empirical Strategy
        2.4.1 Reduced Form Analysis
        2.4.2 Structural Estimation
    2.5 Results
        2.5.1 Main Reduced Form Results
        2.5.2 Labor Market Mechanisms
        2.5.3 Threats to Identification
        2.5.4 Structural Estimates of Wage Function
    2.6 Chapter 2 Conclusion
3 Helping Children Catch Up: Early Life Shocks and the Progresa Experiment [1]
    3.1 Introduction to Chapter 3
    3.2 Program Background
    3.3 Data
        3.3.1 Progresa Data
        3.3.2 Rainfall Data
        3.3.3 Outcome Variables
        3.3.4 Progresa Exposure Variable
        3.3.5 Rainfall Shock Variable
        3.3.6 Summary Statistics
    3.4 Empirical Strategy
    3.5 Results
        3.5.1 Education Results
        3.5.2 Employment Outcomes
        3.5.3 Robustness Checks
    3.6 Chapter 3 Conclusion
4 Reporting Heterogeneity and Health Disparities across Gender and Education Levels: Evidence from Four Countries [2]
    4.1 Introduction to Chapter 4
    4.2 Anchoring Vignettes
    4.3 Econometric Model
    4.4 Data
        4.4.1 Summary Statistics
    4.5 Estimation Strategy
        4.5.1 Estimating the Model
        4.5.2 Simulating Distributions and Standard Errors for Predicted Probabilities
    4.6 Results
        4.6.1 Simulations
        4.6.2 Standard Errors for Simulated Probabilities
    4.7 Chapter 4 Conclusion
5 Conclusion
Bibliography
Appendices
A Appendix to Chapter 2 (Pollution, Ability, and Gender Differences)
    A.1 Appendix Tables
    A.2 Data Appendix
    A.3 Structural Estimation Details
B Appendix to Chapter 3 (Helping Children Catch Up)
    B.1 Appendix Tables
C Appendix to Chapter 4 (Reporting Heterogeneity and Health Disparities)
    C.1 Description of Datasets
    C.2 Anchoring Vignette Questions
    C.3 Estimation Details and Likelihood Function
    C.4 Standard Error Derivations
    C.5 Pooled HOPIT Results
    C.6 Vignette Equivalence

[1] This paper was co-authored with Achyuta Adhvaryu, Anant Nyshadham, and Jorge Tamayo.
[2] Accepted for publication at Demography. The final publication is available at http://link.springer.com/article/10.1007/s13524-016-0456-z.

Chapter 1
Introduction

The economic well-being of an individual depends on how productive she is in the labor market. This productivity is in large part determined by human capital investments, which "improve the physical and mental abilities of people and therefore raise real income prospects" (Becker, 1962). Schooling, on the job training, exercise, and healthy eating are just a few examples of ways in which people can invest in their own human capital. For economists interested in understanding why incomes differ so drastically across individuals, both across and within countries, understanding the process of human capital formation is a crucial step.

There is a large and growing literature documenting that early life circumstances play a crucial role in shaping human capital endowments and investment decisions. Health shocks in early childhood and in utero have been found to have lasting effects on many adult outcomes, including health, educational attainment, and labor market performance (Heckman, 2006; Almond and Currie, 2011; Currie and Vogl, 2013). [1]
[1] In the medical and epidemiological literature, the hypothesis that fetal conditions can have long-term consequences was popularized by Barker (1990), but research on this topic dates back to studies linking birth defects with in utero conditions (Jones et al., 1973; McBride, 1961; Von Lenz and Knapp, 1962) and studies on the long-run effects of exposure to the Dutch famine in 1944 (Stein et al., 1975).

Motivated by this literature, the next two chapters of this dissertation investigate the effects of early life shocks on the human capital investment decisions made throughout an individual's life. Understanding these investment responses can help shed light on why some individuals are hit harder than others and whether we can remediate early life disadvantage with later-life interventions.

In Chapter 2, I address the first question (why some individuals are hit harder than others) by studying gender differences in the long-run effects of in utero pollution exposure in Mexico. I first document that exposure to air pollution in the second trimester of gestation leads to significantly lower cognitive test scores measured in adolescence and adulthood, for both men and women. For women only, however, I find that this shock also leads to significantly lower high school completion and income. By documenting these long-run effects, I am expanding the existing literature on the health effects of early life exposure to air pollution. Most of the previous work on this topic has focused on the effects of in utero exposure on short-term birth outcomes (like infant mortality, birth weight, and gestation length). [2] This chapter joins a smaller set of studies that attempt to explore longer-run impacts (Sanders, 2012; Isen et al., 2014; Bharadwaj et al., 2014; Peet, 2016).
The primary reason so few papers have been able to estimate these long-run effects is simply lack of data: most countries do not have pollution measurements that date back far enough for us to link adult respondents in current surveys with their in utero exposure. In order to circumvent this issue, I rely on exogenous variation in pollution levels driven by weather events known as thermal inversions, which worsen air quality, and for which I have monthly, spatially granular data going back to 1979 for the entire country of Mexico.

After documenting that exposure to thermal inversions (and therefore air pollution) in the second trimester of gestation is a negative shock to cognitive ability that has larger effects on female schooling and income, I investigate whether gender-specific labor market conditions might be driving this difference. Indeed, I find that the gender difference in the schooling response is almost fully explained by the fact that women have a higher likelihood of entering the white-collar sector, where schooling and cognitive ability are more complementary than in the blue-collar sector. I verify the higher degree of complementarity in the white-collar wage function by structurally estimating a dynamic discrete choice model that endogenizes both schooling and occupational choice. By providing empirical evidence that labor market conditions can explain why men and women respond differently to early life health shocks, this study offers an explanation for heterogeneous effects in the literature more generally, shedding light on why the estimated effects of early life shocks often differ quite drastically both across and within studies.

The effects of early life shocks do not only vary based on individual characteristics like gender. How individuals are ultimately affected by early life conditions in the long run might also depend on the types of policies and interventions that affect them later in life.
It is important to understand how these later-life interventions interact with early life conditions, as this could tell us whether it is possible to remediate early life disadvantage with investments later in childhood. Though important, this question is not easy to answer empirically, primarily because human capital investment decisions are almost always endogenous responses to human capital endowments. In Chapter 3, Achyuta Adhvaryu, Anant Nyshadham, Jorge Tamayo, and I address this issue by exploiting two orthogonal sources of variation: rainfall at birth (which drives resource availability in early childhood) and the experimental rollout of Mexico's Progresa program, which offered cash to mothers to incentivize their children's enrollment in school. We are interested in understanding whether the effects of a negative rainfall shock at birth were less severe for those exposed to this conditional cash transfer program, which would be an indication of Progresa's ability to help disadvantaged children catch up. We find that adverse rainfall in the year of birth decreases educational attainment and employment outcomes, but exposure to Progresa lessened this decline: each additional year of program exposure during childhood mitigated about 20 percent of the effect of adverse rainfall at birth. These results offer reason to be optimistic about the ability of well-designed interventions to keep disadvantaged children from falling too far behind.

[2] See Chay and Greenstone (2003), Currie and Neidell (2005), Jayachandran (2009), and others summarized in Currie et al. (2014).

The analysis conducted in both of these chapters was facilitated by the availability of straightforward measures of individual human capital levels and investments (cognitive test scores and educational attainment). However, some components of human capital, like physical health, are much harder to measure and compare across individuals.
Self-reported health is a relatively simple and widely available measure that can be used to investigate what drives differences in health, but comparisons of self-reported health can be confounded by the use of different response scales across individuals. In Chapter 4, I use a survey tool called anchoring vignettes to provide valid comparisons of self-reported health, adjusted for differences in reporting thresholds, across genders and education levels.

When used with a hierarchical probit model, anchoring vignettes allow us to simulate the self-reported health distribution for a particular group using either their own response thresholds or the thresholds of another group. In order to make valid comparisons that are not confounded by heterogeneous response scales, we can compare the proportion of healthy individuals in two different groups, predicted using the same set of response thresholds. In existing literature, standard errors are ignored in comparisons of these predicted probabilities across groups, which precludes any statements about statistical significance. This chapter is the first paper to analytically derive standard errors for these predicted probabilities, which I use to determine whether statistically significant differences between groups remain after adjusting thresholds. I find that accounting for reporting heterogeneity reduces the gender gap in many health domains across the four countries I analyze (the United States, England, Indonesia, and China). On the other hand, health disparities across education levels persist and even widen after equalizing thresholds across the two groups.
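The idea of predicting two groups' reported health with a common set of response thresholds can be illustrated with a minimal sketch. It assumes a single cutoff and standard normal errors rather than the full hierarchical ordered probit model, and every number and function name below is hypothetical, chosen only for illustration.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def share_reporting_healthy(latent_mean: float, cutoff: float) -> float:
    """P(latent health > cutoff) when latent health ~ Normal(latent_mean, 1)."""
    return 1.0 - normal_cdf(cutoff - latent_mean)


# Hypothetical groups with identical true (latent) health but different
# reporting cutoffs: group B needs better health before answering "healthy".
mean_a, cutoff_a = 0.5, 0.2
mean_b, cutoff_b = 0.5, 0.6

# Raw comparison: each group is evaluated with its own cutoff.
raw_gap = (share_reporting_healthy(mean_a, cutoff_a)
           - share_reporting_healthy(mean_b, cutoff_b))

# Adjusted comparison: both groups are evaluated with group A's cutoff.
adjusted_gap = (share_reporting_healthy(mean_a, cutoff_a)
                - share_reporting_healthy(mean_b, cutoff_a))

print(f"raw gap: {raw_gap:.3f}, adjusted gap: {adjusted_gap:.3f}")
```

In this stylized case the raw gap is entirely a reporting artifact and vanishes once a common cutoff is applied. In practice the thresholds and latent means would be estimated, which is why standard errors for the simulated probabilities matter.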
Chapter 2
Pollution, Ability, and Gender-Specific Investment Responses to Shocks

2.1 Introduction to Chapter 2

Shocks experienced early in life can impact a wide range of outcomes in adulthood, [1] by directly affecting physical and cognitive ability (Almond and Currie, 2011), as well as by influencing the human capital investment decisions made throughout an individual's life (Almond and Mazumder, 2013). Early life exposure to pollution has become a particularly salient example of this. Many studies show that in utero exposure to air pollution can negatively affect birth outcomes (like birth weight or infant mortality), [2] but there are only a handful that look at the longer-term impact of this particular early-life shock (Sanders, 2012; Isen et al., 2014; Bharadwaj et al., 2014; Peet, 2016). [3] There is still much to be learned, particularly in developing countries, about the effects of pollution exposure on human capital in adulthood.

[1] In the medical and epidemiological literature, research on the hypothesis (popularized by Barker (1990)) that fetal conditions can have long-term consequences dates back to discoveries of links between birth defects and in utero conditions (Jones et al., 1973; McBride, 1961; Von Lenz and Knapp, 1962). Around the same time, a series of studies on the Dutch famine in 1944 linked exposure to the famine with negative outcomes in adulthood (Stein et al., 1975). Almond and Currie (2011) provide a more detailed history of the literature and the economic evidence that followed. See Heckman (2006) and Currie and Vogl (2013) for other reviews of the evidence.
[2] See Chay and Greenstone (2003), Currie and Neidell (2005), Jayachandran (2009), and others summarized in Currie et al. (2014).
[3] The bulk of existing research on long-term effects of pollution has focused on exposure to radiation from nuclear accidents (Almond et al., 2009; Black et al., 2014), a much more extreme case of air pollution than what we might be interested in for policy reasons.

Focusing on Mexico, this paper estimates the long-term effects of in utero exposure to air pollution on adult cognitive ability, educational attainment, and income. I use thermal inversions, a meteorological phenomenon that negatively impacts air quality, as an exogenous source of variation in pollution levels. I find that men and women exposed to more thermal inversions (and thus worse pollution) during their second trimester in utero score significantly lower on Raven's tests of fluid intelligence as young adults. For women only, however, this cognitive shock also leads to lower high school completion and income.

Larger female schooling responses to early-life health shocks have been commonly documented in existing literature (Bobonis et al., 2006; Maluccio et al., 2009; Field et al., 2009; Maccini and Yang, 2009; Anderson, 2008; Hoynes et al., 2016), but the reasons for these gender differences are less well-explored. My paper finds evidence that the different labor market conditions faced by men and women influence how they respond to early-life shocks. Specifically, women exhibit larger schooling responses because they are more likely to enter the white-collar sector, where I show that the optimal schooling response to an ability shock is larger than in the blue-collar sector. In fact, once I allow for the effect of pollution to vary by the gender-specific availability of white-collar opportunities, the gender difference in the schooling response disappears. This result rules out other common explanations for gender differences in the effects of shocks, including son preference or gender-specific norms regarding high school completion.

Sectoral choice [4] plays an important role in determining optimal schooling responses to early-life shocks because white-collar jobs (which tend to be favored by women) reward schooling and ability differently than blue-collar ones do.
Figure 2.1 offers descriptive evidence to support this idea, using data from the Mexican Family Life Survey (MxFLS). In the left panel, I plot the relationship between annual income and Raven's test scores, separately for four different schooling-sector combinations. The right panel plots the difference between the two schooling lines for each sector. The striking difference between the two sectors is clear: the income boost enjoyed by high school graduates is increasing in ability in the white-collar sector, but decreasing in ability in the blue-collar sector. Although these figures do not take into account selection into sectors or schooling, they offer suggestive evidence that the complementarity between schooling and ability may be higher in the white-collar than in the blue-collar sector.

[4] In this paper, I will use the word sector to describe occupational rather than industry sectors.

Sectoral differences in the complementarity between schooling and ability could generate important gender differences in responses to early-life shocks because men and women make different occupational choices, as demonstrated by Figure 2.2. In all but one of the eight countries shown, working women participate in the white-collar sector at much higher rates than working men do. Taken together, the two features of the labor market illustrated in Figures 2.1 and 2.2 have important implications for how men and women respond to their cognitive endowments. A boy and girl of the same age growing up in the same village could respond differently to an early-life cognitive shock because of the different sectors they expect to enter.

Figure 2.1: Income-Ability Relationship
Notes: These local linear regressions use individuals aged 30 to 50 in the MxFLS. The white-collar and blue-collar sectors are defined using the classifications in Vogl (2014), summarized in Table A1. Income is measured using the inverse hyperbolic sine of total earned annual income.
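For readers unfamiliar with the inverse hyperbolic sine transformation mentioned in the figure notes, the short sketch below shows its two commonly cited properties: it is defined at zero income, and for large incomes it is close to log(2 * income). The income values are hypothetical and are not taken from the MxFLS.

```python
import math


def inverse_hyperbolic_sine(income: float) -> float:
    """asinh(y) = ln(y + sqrt(y^2 + 1)); unlike ln(y), it is defined at y = 0."""
    return math.asinh(income)


# Hypothetical annual incomes, for illustration only.
zero_income = inverse_hyperbolic_sine(0.0)
large_income = inverse_hyperbolic_sine(50_000.0)

# For large y, asinh(y) is approximately ln(2y), so differences in the
# transformed variable read roughly like differences in log income.
log_approximation = math.log(2.0 * 50_000.0)

print(zero_income, large_income, log_approximation)
```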
To formalize this hypothesis, I develop a model in which the key parameters that drive differential schooling responses to endowment shocks are the cross-partials between schooling and ability in the white-collar wage function and the blue-collar wage function. In order to take into account the endogeneity of schooling and sectoral choice, which the descriptive exercise in Figure 2.1 does not do, I use a dynamic discrete choice model to structurally estimate these parameters of interest. I find that schooling and ability are complements in the white-collar sector, but not in the blue-collar sector, which is consistent with the finding that schooling responses to pollution are largest for those likely to go into white-collar jobs.

Figure 2.2: Cross-Country White-Collar Shares, by Gender
Notes: Weighted shares calculated from employed adults aged 30 to 50 in the 2010 censuses of the listed countries. White-collar jobs are identified using the ISCO occupation codes, which are defined as white-collar or blue-collar using the classifications in Vogl (2014), summarized in Table A1.

In the existing literature, it is common for authors to use gender-specific labor market conditions to explain gender differentials in the estimated effects of early-life shocks (Bhalotra and Venkataramani, 2013; Cutler et al., 2010; Hoddinott et al., 2008), but very few studies directly test for this link. My paper joins two recent exceptions (Pitt et al., 2012; Rosenzweig and Zhang, 2013), which formally model the idea that the sectoral sorting tendencies of men and women might explain differential schooling responses across genders. The authors find evidence of gender-specific schooling responses to a physical health shock, similar to what I find in the context of a cognitive ability shock.
A major contribution of the model I develop in this paper is the identification of sector-specific complementarities, along with gender differences in occupational choice, as a reason why men and women may adjust their schooling differently in response to a shock. To my knowledge, I am the first to estimate these sector-specific complementarities and show that the gender-specific schooling responses I find are driven primarily by differences in male and female sectoral choice.

By emphasizing the interaction between early-life shocks and labor market opportunities more generally, this paper addresses two other broad questions in the early life literature. First, how do early-life shocks interact with policy interventions or economic conditions later in life (Adhvaryu et al., 2015; Bharadwaj et al., 2014; Rossin-Slater and Wüst, 2015; Gunnsteinsson et al., 2014)? Second, what can explain the substantial heterogeneity, both across and within studies, in the estimated schooling responses reported in the existing literature? [5] Given that labor market opportunities vary over time, across space, and across groups, this heterogeneity can be partially explained by the main result of this paper: individuals facing different labor market opportunities respond differently to early-life shocks.

In the next section, I outline a conceptual framework that illustrates how local labor market conditions can influence the way schooling decisions respond to an early-life shock. The remainder of the paper provides empirical support for the model predictions, using pollution exposure as a shock to cognitive ability. In section 2.3, I provide background information on pollution and describe my data sources. The reduced form and structural estimation strategies are described in section 2.4. Results are discussed in section 2.5. Section 2.6 concludes.
2.2 Model The model outlined in this section illustrates that local labor market opportunities can aect how schooling decisions respond to a cognitive endowment shock. Because men and women tend to sort into sectors dierently (as shown in Figure 2.2), male and female schooling decisions should respond dierently to an early-life shock to cognitive ability. Individuals are born with an ability endowment . As adults, individuals can work in one of two sectors: a white-collar (k = w) or a blue-collar sector (k = b), which I dene using the sector cate- gorizations in Vogl (2014), described in Table A1. Each sector has a dierent wage function, where educational attainment E and ability are rewarded dierently. This captures the idea that worker characteristics command dierent prices in dierent sectors, which is re ected in the descriptive evi- dence in Figure 2.1, as well as in the existing economic literature (Heckman and Scheinkman, 1987). I denote the sector-specic expected wage functions as W k (E;; k ); (2.1) where @W k @E > 0, @W k @ > 0, @ 2 W k @E@E < 0, and @ 2 W k @@ < 0 fork =w;b: k are the parameters that map ability and schooling to wages (i.e. the returns to each of these inputs and the cross-partial between the two). 5 Many studies nd that negative (positive) shocks decrease (increase) schooling (Almond, 2006; Bleakley, 2007), while others nd no eect (Venkataramani, 2012; Cutler et al., 2010), or much smaller eects for certain groups (Maluccio et al., 2009; Maccini and Yang, 2009; Field et al., 2009; Bleakley, 2010). 9 The opportunity cost of schooling takes the following form: c(E;;); where @c @E > 0 and @ 2 c @E@E > 0. Individuals pick the optimalE to maximize their expected future wages, net of the cost of schooling, as in the maximization problem below. 
By choosing to model only the schooling decision, which is the focus of this paper, I assume that any major investments parents might make to change take place before the crucial schooling decisions are made. This assumption is consistent with the well-documented nding that there are higher returns to investing in a child's skill formation early in life (before primary school) compared to later on (Cunha et al., 2010; Heckman, 2006). Moreover, for children in Mexico, the end of primary school marks the rst critical schooling transition period when many drop out (Behrman et al., 2011). The maximization problem can be written max E p jg W w (E;; w ) + (1p jg )W b (E;; b )c(E;;): p jg represents the probability that an individual goes into the white-collar sector. In this simple set-up, this probability only depends on the child's gender g and locationj, but I later explicitly model the sectoral choice and allow this to be endogenously determined. The location j subscript captures variation in the availability of white-collar jobs across space (and over time). The g subscript captures dierent sectoral sorting across gender. The rst order conditions for optimal schooling are: p jg @W w @E + (1p jg ) @W b @E @c @E = 0 Using the implicit function theorem, I can show how optimal schooling will respond to a positive shock to : dE d = p jg @ 2 W w @E@ + (1p jg ) @ 2 W b @E@ @ 2 c @E@ p jg @ 2 W w @E@E + (1p jg ) @ 2 W b @E@E @ 2 c @E@E 1 ; where the denominator is negative by assumption. With wage functions that dier across sectors, it 10 is clear that dierences in p jg , which captures expectations about the local labor market, will result in dierent schooling responses. In particular, if schooling and ability are more complementary in white- collar than blue-collar jobs ( @ 2 Ww @E@ > @ 2 W b @E@ ), then individuals exposed to higher p jg will increase their schooling more in response to a positive ability shock (higher dE d ). 
The intuition is simple: higher $p_{jg}$ places more weight on the cross-partial between schooling and ability in the white-collar wage function ($\partial^2 W_w / \partial E \partial \theta$) and less weight on the cross-partial in the blue-collar wage function ($\partial^2 W_b / \partial E \partial \theta$). If the opposite is true ($\partial^2 W_w / \partial E \partial \theta < \partial^2 W_b / \partial E \partial \theta$), then higher $p_{jg}$ will translate to a lower schooling response ($dE/d\theta$).

It should be noted that these model predictions remain unchanged if I relax the assumption of a fixed $p_{jg}$ across individuals within location-gender groups. In particular, I can allow $p_{jg}$ to take the following form: $p_{jg} = \bar{p}_{jg} + p(E, \theta)$, which means that there is a gender- and location-specific constant that does not depend on schooling or ability, and a separate term (which does not vary over $j$ or $g$) that governs how sectoral choice depends on schooling and ability. Using this functional form for $p_{jg}$, it can be shown that individuals facing a higher $\bar{p}_{jg}$ will exhibit higher $dE/d\theta$ if schooling and ability are more complementary in the white-collar than blue-collar sector. In the structural estimation described in section 2.4.2, I do not rely on this functional form assumption and instead explicitly model the sectoral choice.

As Figure 2.2 and Table A1 clearly show, women are on average more likely to enter the white-collar sector than men, implying that $p_{jf} > p_{jm}$. Given this, women should exhibit larger $dE/d\theta$ than men if there is a higher degree of complementarity between schooling and ability in the white-collar sector. I test this hypothesis in two steps. First, I document that there are gender differences in the schooling response to an exogenous cognitive shock (in utero exposure to pollution) which are driven by differences in the white-collar opportunities available to men and women. Second, I structurally estimate the sector-specific cross-partials ($\partial^2 W_k / \partial E \partial \theta$) in order to confirm that the signs of these parameters are consistent with the gender differences that I find.
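The prediction above can be made concrete with a small numerical sketch. The function below evaluates the implicit-function-theorem expression for the schooling response to an ability shock. Every number is hypothetical and chosen only for illustration (a white-collar cross-partial larger than the blue-collar one, and a higher white-collar probability for women than for men); none of them are estimates from this thesis, and the function and argument names are my own.

```python
def schooling_response(p, cross_w, cross_b, cross_c, second_w, second_b, second_c):
    """Response of optimal schooling to an ability shock (implicit function theorem).

    p        : probability of entering the white-collar sector
    cross_*  : cross-partials (schooling x ability) of W_w, W_b and the cost c
    second_* : second derivatives in schooling of W_w, W_b and the cost c
    """
    numerator = p * cross_w + (1 - p) * cross_b - cross_c
    denominator = p * second_w + (1 - p) * second_b - second_c
    if denominator >= 0:
        # The model assumes concave wages and convex costs in schooling.
        raise ValueError("denominator must be negative under the model's assumptions")
    return -numerator / denominator


# Hypothetical curvature values (assumptions, not estimates).
curvature = dict(
    cross_w=0.8,    # schooling and ability more complementary in white-collar jobs
    cross_b=0.2,
    cross_c=0.0,
    second_w=-1.0,  # diminishing returns to schooling in both sectors
    second_b=-1.0,
    second_c=0.5,   # convex cost of schooling
)

response_women = schooling_response(p=0.6, **curvature)  # higher white-collar probability
response_men = schooling_response(p=0.3, **curvature)    # lower white-collar probability
```

With these illustrative values the group facing the higher white-collar probability has the larger schooling response. Swapping `cross_w` and `cross_b` reverses the ordering, which mirrors the "if the opposite is true" case in the text.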
In the next section, I outline the biological reasons for considering pollution exposure as a shock to cognitive ability and describe the data I use to estimate these effects.

2.3 Background and Data

2.3.1 Pollution and Thermal Inversions

Substantial medical and epidemiological evidence demonstrates that in utero exposure to pollution can be harmful to the fetus (Lacasaña et al., 2005; Peterson et al., 2015; Le et al., 2012; Saenen et al., 2015; Backes et al., 2013). Concrete evidence that pins down the biological mechanisms is more limited, but there are a few commonly cited suspected pathways that primarily relate to two types of pollutants: carbon monoxide (CO) and particulate matter (PM-10 or PM-2.5).

CO is a colorless and odorless gas that binds more readily to hemoglobin than oxygen and hinders the body's ability to carry oxygen. CO is produced in combustion, and its main source (especially in urban areas) is vehicle emissions. In a pregnant woman, CO can hinder the delivery of oxygen to the fetus, leading to long-term neurological and skeletal damage (Aubard and Magne, 2000).

Particulate matter refers to a mixture of solid and liquid particles in the air, which includes fine particles known as PM-2.5 (with diameters less than 2.5 micrometers) and inhalable coarse particles known as PM-10 (with diameters less than 10 but greater than 2.5 micrometers). These particles can be emitted directly from a source, like fires or construction sites. They can also form as a result of chemical reactions in the atmosphere. When inhaled by a pregnant woman, particulate matter can cause inflammation or infection. This can thicken blood and plasma, hindering blood flow and glucose transport to the placenta (Lacasaña et al., 2005). The effects of one particular component of particulate matter, polycyclic aromatic hydrocarbons (PAHs), can be especially dangerous.
PAHs are thought to increase the prevalence of DNA adducts, which are associated with negative birth outcomes like low birth weight and decreased head circumference (Perera et al., 1998; Le et al., 2012; Lacasaña et al., 2005). Moreover, PAHs can cross the placental barrier and damage the fetal brain by causing inflammation or oxidative stress, or by damaging blood vessels. Recent evidence has shown this can result in lower cognition later in childhood (Peterson et al., 2015).

Disrupting the transport of blood, glucose, or oxygen to the fetus could in theory have negative impacts on both the physical and cognitive aspects of fetal development. Whether pollution exposure results in primarily physical or cognitive damage likely depends on the timing of exposure (Dobbing and Sands, 1973). For instance, medical and economic studies on exposure to radiation (Otake, 1998; Almond et al., 2009; Black et al., 2014) flag the second trimester as the most sensitive period for brain development. [6] Although day-to-day air pollution and radiation are very different types of pollution, these radiation studies highlight generalizable findings regarding critical periods in brain development, which appear to also be relevant for other external stressors, like influenza (Schwandt, 2016).

In fact, the critical period highlighted by these studies coincides with crucial processes in the development of the fetal brain. The migration of neurons, from their place of origin to their final location in the brain, peaks in the second trimester and is largely complete by the beginning of the third trimester. Similarly, synaptic connections in the cortex are refined and become more permanent starting in the second trimester; this process peaks by the beginning of the third trimester (Tau and Peterson, 2010).

As outlined in section 2.2, individuals make different schooling decisions and earn different wages partially because of heterogeneous levels of skill.
Any effect that in utero exposure to pollution has on schooling and labor market outcomes is likely working through its biological effect on this unobserved endowment, of which cognitive ability is an important component.

A major obstacle to identifying the effects of exposure at birth on later life outcomes is the lack of high quality historical pollution data going back far enough to link adults with their in utero exposure. In order to circumvent this issue, I rely on thermal inversions, a meteorological phenomenon known to worsen air quality, as an exogenous source of variation in pollution levels for which there is data dating back to 1979. Air temperature typically falls with altitude, but when a thermal inversion occurs, this relationship reverses, which results in a warm layer of air sitting above cooler air, trapping pollutants released near the surface. That thermal inversions can negatively impact air quality is well-documented, both in the atmospheric sciences literature (Jacobson, 2002) and, more recently, in the economics literature (Jans et al., 2014; Arceo et al., 2016).

There are three common types of inversions that are associated with worsened air quality; they form in slightly different circumstances but all result in a warm layer of air above a cooler layer. [7] Radiation inversions take place at night, as the surface cools by emitting thermal infrared radiation. Unlike during the day, when radiation from the sun tends to have a stronger opposing effect, this results in cooler air near the surface than further above ground. Radiation inversions are more common during long, calm, and dry nights, when there is more time for the cooling to take place, less mixing in the air, and little water vapor to absorb the thermal infrared energy. Subsidence inversions take place when air descends and warms as it compresses, creating a warm layer above cooler air. This can happen in mountainous regions, when air flows down the side of a slope, or in high pressure systems, [8] which are characterized by this descending movement and compression of air. Over coastal areas, marine inversions take place when air above the sea, which is cooler than the air above land, flows inland and pushes the warm inland air upward.

[6] Otake (1998) documents that weeks 8 to 25 (late first and almost entire second trimester) are particularly crucial for brain development. Black et al. (2014) also find that the 3rd, 4th, and 5th months of pregnancy were the critical periods during which exposure to nuclear fallout resulted in lower IQ as adults.
[7] See Jacobson (2002) for a more detailed discussion of the different types and causes of inversions.

In general, inversions are the result of the combination of various atmospheric forces and geographic conditions. I argue that after controlling for all of the relevant main effects (fixed geographic characteristics, time of year, temperature, humidity, cloud coverage, etc.), the occurrence of a thermal inversion is exogenous: essentially the random interaction of all of the necessary conditions. Like Jans et al. (2014) and Arceo et al. (2016), I assume that thermal inversions can only affect my outcomes of interest through their effect on pollution levels, once I have controlled for all of the weather controls, geographic fixed effects, and non-linear time trends.

2.3.2 Data

This section outlines the pollution, weather, individual-level, and labor market data used to document the effects of an exogenous shock to $\theta$ and pin down the role played by the local labor market. More details are provided in the Data Appendix (section A.2).

Pollution and Weather Data

The ideal data set for this analysis would consist of pollution data going back to the in utero months of adults observed in my data set, the MxFLS.
Currently, pollution measurements for CO, O3, SO2, NO2, PM-10, and, most recently, PM-2.5 are publicly available on Mexico's National Institute of Ecology (INECC) website for a total of 16 cities. However, the majority of this spatially limited data does not go back far enough to study at-birth exposure of adults. The earliest pollution measurements date back to 1986, but for only CO in Mexico City, for which there are large sections of missing data until about 1993. As a result of these limitations, I rely on thermal inversions as an exogenous driver of pollution for which I have data going back to 1979 for all of Mexico. I use the INECC pollution measurements to verify the link between inversions and pollution in my data. The INECC database also includes temperature measurements for six cities, which I use to validate the temperature data set described in the next sub-section.

[8] High pressure systems are associated with high temperatures, clear skies, and light winds at the surface.

I identify thermal inversions in Mexico using the North American Regional Reanalysis (NARR) data, which provides air temperatures just above the surface and at various pressure levels above sea-level on a 0.3 x 0.3 degree grid (roughly 30 km by 30 km) across the North American continent. [9] Using atmospheric modeling techniques, the NARR combines temperature, wind, moisture, and precipitation data from a number of different sources, including weather balloons, commercial aircraft recordings, ground-based rainfall measurements, and satellite data. [10] The resulting data set records, every three hours for each grid point, a wide array of meteorological variables at the surface, a few meters above the surface, and at 29 pressure levels (extending vertically into the atmosphere), from 1000 hPa (roughly equivalent to sea level) to 100 hPa (about 16,000 meters above sea-level).
To identify thermal inversions, I take the air temperature 2 meters above the surface [11] and subtract this from the air temperature recorded at the pressure level 25 hPa lower (roughly 300 meters higher) than the surface pressure at a given location. [12] I identify an inversion episode as any time this difference is greater than zero. I use the 25 hPa increment because this is the smallest increment between pressure levels available in the NARR data. Looking further above the surface (50 hPa or 75 hPa, for example) does not detect many additional inversions and therefore, unsurprisingly, leaves my results virtually unchanged. In general, I am most interested in the inversions close to the surface as they are likely to have the largest effects on air quality.

Like Jans et al. (2014), I focus on nighttime inversions. There is greater variation in the occurrence of nighttime (compared to daytime) inversions over time and across space, which makes nighttime inversions much stronger predictors of pollution in my first-stage checks. Moreover, nighttime inversions are much less visible than daytime inversions and are therefore less likely to generate behavioral responses. [13]

[9] NCEP Reanalysis data provided by the NOAA/OAR/ESRL PSD, Boulder, Colorado, USA, from their Web site at http://www.esrl.noaa.gov/psd/.
[10] See Mesinger et al. (2006) for more detail about the various data sources and model.
[11] 2-meter temperature is what is reported by meteorologists in weather reports and is distinct from "skin" surface temperature, which the NARR also records.
[12] Because of varying surface altitude across Mexico, I do not take temperature from the same pressure level for all points. For example, for a municipality at sea level, I use the temperature at 975 hPa, whereas for a higher-altitude location in Mexico City, I use the 700 hPa pressure level because the surface pressure is 725 hPa.
[13] Daytime inversions are not always visible but are more likely to be seen in warm and humid climates like Mexico's.

Detailed validation exercises have concluded that the NARR data closely matches observational data and offers a considerable improvement over prior global reanalysis data sets (Mesinger et al., 2006). Because all of these checks have included the United States and Canada, which may dominate the validation exercises due to their size, I verify that these conclusions are still valid when I restrict to only Mexico. First, using temperature data that is available on the same INECC pollution database described above, I find a very high correlation (0.87) between the NARR 2-meter temperature and these ground-level measurements. Second, I compare my measure of inversions to a measure calculated using temperature readings from satellite data: NASA's Atmospheric Infrared Sounder (AIRS), used by Jans et al. (2014) to identify thermal inversions in Sweden. Because the AIRS was launched in 2002, the data is too recent to use as my measure of inversions or to instrument for my current measure of inversions, but for two overlapping years (2002 and 2003), I find correlations between the NARR and AIRS inversions measures of around 0.7. [14]

In addition to using the NARR to identify thermal inversions, I also utilize this data set's relative humidity, wind speed, and total cloud coverage variables as important controls in all specifications. Although precipitation is also available in the NARR data set, I use ground measurements recorded by Mexico's National Meteorological Service (CONAGUA) to control for rainfall because these likely involve less measurement error.

As mentioned above, Mexican pollution measures do not date back far enough to enable me to use thermal inversions as an instrument for in utero exposure to pollution, as Arceo et al. (2016) do in their study of the contemporaneous effects of pollution.
However, using the pollution measures that do exist, I check whether thermal inversions drive pollution levels in the years and cities for which I have pollution data. To establish a link between thermal inversions ($I_{jym}$) and pollution levels ($P_{jym}$) in a given municipality $j$, during the three-month period starting from month $m$ in year $y$, I run the following regression:

\[ P_{jym} = \beta_1 I_{jym} + \beta_2' W_{jym} + \mu_j + \gamma_y + \eta_m + v_{jym}. \tag{2.2} \]

I aggregate to three-month periods here because I eventually analyze the effects of pollution by trimester. $P_{jym}$ represents CO (8-hour daily maximum) or PM-10 (24-hour mean) averaged over the three-month period starting in month $m$ of year $y$. $I_{jym}$ represents the number of days per month with a nighttime inversion, averaged over that same three-month period. I include municipality ($\mu_j$), year ($\gamma_y$), and month ($\eta_m$) fixed effects. $W_{jym}$ is a vector of flexible weather controls (also averaged across the three-month period): linear, quadratic, and cubic terms of minimum, maximum, and mean 2-meter temperature, rainfall, relative humidity, wind speed, and total cloud coverage. In this regression, these weather controls are important because they influence the likelihood of a thermal inversion but also have the potential to directly affect pollution levels.

[14] It should be noted that there are several factors that complicate the comparison between the NARR and AIRS data. First of all, the times at which the AIRS and NARR data recorded temperatures do not match up exactly. Secondly, the AIRS data has a 1 by 1 degree resolution, substantially coarser than the NARR's 0.3 by 0.3 degree resolution. Finally, the AIRS data records temperatures at fewer pressure levels than the NARR. If anything, these factors are likely to weaken the correlation between the two measures, suggesting that a correlation of 0.7 may be an underestimate.
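The construction of the inversion measure can be sketched in a few lines of Python. The sketch follows the rule described in this section (temperature at the pressure level 25 hPa below surface pressure minus 2-meter temperature, flagged as an inversion when positive) and then averages monthly inversion counts over a three-month window. The function names, the list of pressure levels, and the nightly temperature readings are all hypothetical placeholders, not values from the NARR or the code used in this thesis.

```python
# Illustrative pressure levels in hPa (an assumption; not the full reanalysis grid).
PRESSURE_LEVELS_HPA = [1000, 975, 950, 925, 900, 875, 850, 825,
                       800, 775, 750, 725, 700]


def level_above_surface(surface_pressure_hpa, levels_hpa=PRESSURE_LEVELS_HPA, step_hpa=25):
    """Pick the available pressure level closest to 25 hPa below surface pressure."""
    target = surface_pressure_hpa - step_hpa
    return min(levels_hpa, key=lambda level: abs(level - target))


def is_inversion(temp_2m, temp_aloft):
    """An inversion is flagged when the air aloft is warmer than the 2-meter air."""
    return temp_aloft - temp_2m > 0


def add_months(year, month, offset):
    """Shift a (year, month) pair by a number of months (offset may be negative)."""
    index = year * 12 + (month - 1) + offset
    return index // 12, index % 12 + 1


def average_monthly_inversions(nights_by_month, year, month):
    """Average number of inversion nights per month over the three-month
    period starting in (year, month).

    nights_by_month maps (year, month) to a list of (temp_2m, temp_aloft) pairs,
    one pair per night.
    """
    total = 0
    for k in range(3):
        key = add_months(year, month, k)
        total += sum(is_inversion(t2m, aloft) for t2m, aloft in nights_by_month[key])
    return total / 3


# Hypothetical nightly readings (three nights per month, for brevity).
example_nights = {
    (1994, 11): [(12.0, 13.5), (11.0, 10.0), (9.5, 11.0)],  # 2 inversion nights
    (1994, 12): [(8.0, 9.0), (7.5, 9.5), (6.0, 8.0)],       # 3 inversion nights
    (1995, 1): [(10.0, 9.0), (9.0, 8.5), (8.0, 9.0)],       # 1 inversion night
}
example_average = average_monthly_inversions(example_nights, 1994, 11)
```

Selecting the level relative to local surface pressure, rather than using one fixed level everywhere, is what allows the same rule to be applied to municipalities at very different altitudes.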
Table 2.1 reports the results of this regression, using data from 1994, when more complete data was being recorded, to 2009, the last year of available pollution data. Even after controlling for a complete set of fixed effects and weather controls, inversions are positively and significantly related to both CO and PM-10 levels. The F-statistics in this "quasi-first-stage" exceed conventional thresholds for strong instruments.

Table 2.1 Relationship between Thermal Inversions and Pollution, 3-Month Periods

                                                     CO           PM-10
Average Monthly Inversions During 3-Month Period     0.0140***    0.474***
                                                     (0.00254)    (0.0828)
N                                                    23821        21292
Mean of DV                                           2.306        55.17
F-stat                                               30.449       32.835

Notes: * p < 0.1, ** p < 0.05, *** p < 0.01. Standard errors (clustered at municipality level) in parentheses. CO and PM-10 represent three-month averages of the 8-hour daily maximum (in ppm) and 24-hour daily mean (in µg/m³), respectively. All regressions control for month, year, and municipality fixed effects, cubic functions of average monthly mean, minimum, and maximum 2m temperatures, average monthly relative humidity, average monthly precipitation, and average monthly cloud coverage during each relevant 3-month period.

Mexican Family Life Survey

All outcome variables come from the Mexican Family Life Survey (MxFLS), a nationally representative longitudinal household survey that began in 2002 and conducted follow-ups in 2005 and 2009. In addition to collecting standard demographic, schooling, and employment information, this survey also measured several physical biomarkers (like height) and administered Raven's tests of fluid intelligence. I use these measures of cognitive ability and height, along with educational attainment and earned annual income, as my main outcomes of interest. I include individuals found in any wave of the survey in order to obtain as large a sample as possible.
For all outcomes except for Raven's scores, I take the outcome from the most recent wave in which the individual was interviewed. For Raven's tests, I use each individual's first test score in order to minimize the effect that test-taking experience (either from the survey or elsewhere) may have on their scores. [15] Another key variable obtained from the MxFLS is municipality of birth, a restricted-use variable that enables me to link adults and adolescents (including those who have migrated) with thermal inversion exposure specific to their birthplace at their time of birth.

Table 2.2 reports summary statistics for the outcomes and main regressors for all individuals with non-missing thermal inversion data (implying a non-missing birth month, birth municipality, and birth year after 1978) and who were at least 15 years of age in the last MxFLS wave in which they appeared. These are the individuals old enough to have been included in the migration module of the survey, which obtains information about place of birth. I report raw means for Raven's test scores and height in this table but use standardized variables in the regressions. [16] The sample size for annual income is much smaller compared to the other variables, primarily because I restrict to those who report work as their primary activity in the week prior to the survey. [17] I do this in order to exclude those still in school but working part-time, whose income is likely a poor representation of their labor market productivity or lifetime earning potential. On average, individuals in this sample are exposed to approximately 18 inversion nights per month during any given trimester.

Mexican Labor Market Data

In order to investigate the interaction between labor market conditions and pollution exposure, I use occupation information (specifically, white-collar shares) from the 1990, 2000, and 2010 Mexican censuses (Minnesota Population Center, 2015).
Following Atkin (2016), I collapse to the commuting zone level and link this labor market information to individuals using their commuting zone of residence during their school-aged years. Commuting zones, which I discuss in more detail in Appendix section A.2.3, are municipalities or groups of municipalities that better represent local labor markets: for instance, large metropolitan areas or neighboring municipalities.

[15] It should be noted that Raven's tests were identical across waves.
[16] I standardize test scores using the full sample mean and standard deviation. For height, I use WHO standards for everyone under 20 and for the remainder of the sample simply standardize using the gender-specific mean and standard deviation of the sample population 20 and older. I identify and drop gross outliers.
[17] They make up about 40% of this relatively young sample. About 1,000 more are dropped due to missing income data.

Table 2.2 Summary Statistics for Reduced Form Sample

Variable Name                                    Mean        Standard Deviation    N
Outcome Variables
  Raven's test score (% correct)                 0.56        0.228                 10320
  Height (cm)                                    160.31      10.52                 10398
  Years of schooling                             9.37        3.078                 10715
  Annual income                                  27761.21    61580.8               3155
Control Variables
  Mother's Education                             6.17        3.834                 9770
  Father's Education                             6.50        4.258                 9090
  1(Male)                                        0.47        0.499                 10848
  Age for Raven's Test variable                  17.18       3.357                 10320
  Age for height variable                        20.13       4.506                 10398
  Age for schooling variables                    20.21       4.449                 10715
  Age for income variable                        22.45       3.698                 3155
Independent Variables
  Average monthly inversions during trimester 1  18.09       8.206                 10848
  Average monthly inversions during trimester 2  17.93       8.235                 10848
  Average monthly inversions during trimester 3  17.80       8.288                 10848

Notes: Sample includes individuals with non-missing thermal inversion data who were at least 15 years of age in the last MxFLS wave in which they appeared.

In some specifications, I directly use the gender-specific share of white-collar workers calculated from the census.
I assign these values to individuals based on the census conducted closest to the year in which they turned 12. In other specifications, I predict values for years in between censuses and assign individuals to the predicted values from the exact year they turned 12. To calculate these predicted values, I use a shift-share strategy, similar to Bartik (1991) and others, which involves predicting economic variables for geographic regions (like states or, in this case, commuting zones) by combining national industry-level growth rates and baseline industry compositions for these regions of interest (see section A.2.3 for more details). I calculate national industry-level growth rates from the Encuesta Nacional de Ingresos y Gastos de los Hogares (ENIGH), a nationally representative household survey that was first conducted by Mexico's National Institute of Statistics and Geography (INEGI) in 1982, and every two years since 1992. I cannot obtain municipality-level data directly from this data set because it is only representative at the national level (and at the state level for a limited number of states and years).

2.4 Empirical Strategy

2.4.1 Reduced Form Analysis

To estimate the effects of pollution, I directly regress my outcomes of interest on thermal inversion counts over three-month periods prior to and after a child's birth. In addition to helping to overcome the pollution data limitations described above, using thermal inversions also addresses the endogeneity of pollution. Pollution is not randomly assigned: individuals born in highly polluted areas are different from those born in less polluted areas. While location fixed effects can be used to alleviate these residential sorting concerns, they do not control for location-specific trends in pollution that may coincide with trends in the outcomes of interest. In this framework, thermal inversions can be thought of as an "instrument" that generates exogenous variation in an endogenous variable that I do not observe.
This endogenous variable is not a particular pollutant but rather air quality in general. The approach of this paper is not designed to estimate the dose-response function of specific pollutants: rather, it offers a well-identified way to learn whether being exposed to higher pollution while in utero has discernible effects in the long term.

For individual $i$, born in municipality $j$, in year $y$ and month $m$, whose outcome $Y_{ijymw}$ comes from survey wave $w$, I estimate the following specification:

\[ Y_{ijymw} = \beta_0 + \sum_{k=-7}^{4} \beta_k I^{3k}_{jym} + \sum_{k=-7}^{4} \lambda_k' W^{3k}_{jym} + \alpha' X_i + \mu_j + (\gamma_y \times \omega_w) + \eta_m + \epsilon_{ijymw}. \tag{2.3} \]

$I^{a}_{jym}$ represents the average number of monthly thermal inversions that took place in individual $i$'s municipality of birth during the three-month period starting $a$ months after the individual's birth month (where negative values indicate months before birth). I include all three-month periods starting a year before conception (21 months before birth) until a year after birth in order to identify critical periods and ensure that any effects I find in the in utero period are not being driven by serial correlation in the thermal inversion variable year to year. Omitting the thermal inversion variables from before and after pregnancy could result in their effects loading onto the trimester coefficients. The coefficients on inversions prior to conception also serve as a falsification check, as pollution exposure before a child is conceived should not have direct effects on that child's outcomes. $W^{a}_{jym}$ is a vector of weather controls (minimum, maximum, and mean temperatures, rain, total cloud coverage, relative humidity, and wind speed), averaged over each three-month period, along with their squares and cubes.

In this specification, municipality fixed effects ($\mu_j$) address cross-sectional pollution endogeneity concerns, including residential sorting issues, by ensuring that identification comes from within-municipality variation over time.
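The timing of the exposure windows in this specification can be illustrated with a short Python sketch. It enumerates, for a given birth year and month, the three calendar months covered by each window index k from -7 to 4, where window k starts 3k months after the birth month. The helper names are hypothetical, and the sketch only builds the calendar windows; it does not reproduce the regression itself.

```python
def add_months(year, month, offset):
    """Shift a (year, month) pair by a number of months (offset may be negative)."""
    index = year * 12 + (month - 1) + offset
    return index // 12, index % 12 + 1


def exposure_windows(birth_year, birth_month, first_k=-7, last_k=4):
    """Map each window index k to the three (year, month) pairs it covers.

    Window k starts 3*k months after the birth month, so k = -7 starts
    21 months before birth and k = 4 starts 12 months after birth.
    """
    windows = {}
    for k in range(first_k, last_k + 1):
        start = 3 * k
        windows[k] = [add_months(birth_year, birth_month, start + i) for i in range(3)]
    return windows


# Hypothetical individual born in June 1985.
windows = exposure_windows(1985, 6)
earliest_window = windows[-7]   # starts 21 months before birth
conception_window = windows[-3]  # starts 9 months before birth
```

Under the timing described in the text, where conception falls nine months before the birth month, the windows indexed -3, -2, and -1 would correspond to the three trimesters, and the windows indexed -7 to -4 would serve as the pre-conception falsification periods.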
Year ($\gamma_y$) and month ($\eta_m$) fixed effects control flexibly for long-term and seasonal time trends. The interaction of year and wave dummies ($\gamma_y \times \omega_w$) captures both wave and age effects. Controls $X_i$ include gender, mother's education, and father's education, for which I set missing values to zero and include dummies for missing values. In more rigorous specifications, I add various combinations of location fixed effects and location-specific trends in order to allow for differential long-term and seasonal trends across geographic areas.

I run these regressions for the full sample and then separately for men and women. In order to explore whether gender-specific labor market conditions play a role in determining schooling responses to shocks, I also estimate a specification that interacts various labor market variables with the trimester coefficients of interest.

I cluster the standard errors at the municipality level. [18] As stated above, I am restricted to individuals born in 1979 or later due to the availability of the NARR data, and to those who are at least 15 years of age in their most recent MxFLS interview. Because I am identifying off of variation within municipalities over time (controlling non-linearly for year and month effects), I also drop individuals in municipalities with very small numbers of individuals (fewer than 30), which make up less than 5% of the full sample.

[18] There are 150 municipalities in the final sample.

2.4.2 Structural Estimation

In order to verify whether the results of the reduced form strategy described above are consistent with the model predictions in section 2.2, consistent estimates of the sector-specific cross-partials between schooling and ability are needed. This section describes the structural model, estimation methods, and sample used to obtain these parameters. Schooling and occupational choice are endogenous decisions, potentially made jointly.
As a result, sector-specific Mincer-style regressions may not yield consistent estimates of the parameters of interest. The endogeneity of schooling in a wage regression has long been acknowledged as an important issue to consider when attempting to obtain causal estimates of the return to schooling (Griliches and Mason, 1972). Studies have used both structural approaches and instrumental variables strategies to deal with this issue (Belzil, 2007). Similarly, self-selection into sectors is also recognized as a potential source of bias in the estimation of sector-specific wage parameters (Roy, 1951; Heckman and Sedlacek, 1985), for which structural approaches are, again, a common solution.

In order to deal with both sources of endogeneity, taking into account that schooling decisions may also depend on expected sectoral choice, I use a two-period dynamic discrete choice model. This dynamic model moves away from the static setting used in section 2.2, allowing individuals to choose their schooling in the first period (based on expected future wages) and their occupational sector in the second period. In order to facilitate estimation, I collapse the continuous schooling decision into a binary choice about whether to obtain a high school degree. [19]

[19] High school completion is the only schooling milestone affected by the cognitive shock in the reduced form analysis. See section 2.5.1.

Decision Tree

The agent's decision tree is outlined in Figure 2.3. I use a similar set-up to that of Eisenhauer et al. (2015), which models several discrete schooling decisions from high school enrollment to college completion. I focus on a single schooling decision (high school completion) but expand the model to include a labor market decision after graduation or dropout.

Figure 2.3 Decision Tree

s = 0
    s = h: High School
        s = hn: No Work w/ HS
        s = hw: White-Collar w/ HS
        s = hb: Blue-Collar w/ HS
    s = l: No High School
        s = ln: No Work w/o HS
        s = lw: White-Collar w/o HS
        s = lb: Blue-Collar w/o HS
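The structure of Figure 2.3 can also be written down as a small data structure, which makes the set of feasible states at each node explicit. The sketch below is only an illustration of the tree's shape: the dictionary and function names are hypothetical, and nothing here reproduces the estimation code.

```python
# States follow the labels in Figure 2.3; names of the containers are hypothetical.
DECISION_TREE = {
    "0": ["h", "l"],          # period 1: high school degree or not
    "h": ["hn", "hw", "hb"],  # period 2, with a high school degree
    "l": ["ln", "lw", "lb"],  # period 2, without a high school degree
}

STATE_LABELS = {
    "h": "High School",
    "l": "No High School",
    "hn": "No Work w/ HS",
    "hw": "White-Collar w/ HS",
    "hb": "Blue-Collar w/ HS",
    "ln": "No Work w/o HS",
    "lw": "White-Collar w/o HS",
    "lb": "Blue-Collar w/o HS",
}


def feasible_states(state):
    """States that can be reached directly from the given state."""
    return DECISION_TREE.get(state, [])


def terminal_states(root="0"):
    """All end states reachable from the root, in tree order."""
    children = feasible_states(root)
    if not children:
        return [root]
    result = []
    for child in children:
        result.extend(terminal_states(child))
    return result


end_states = terminal_states()
working_states = [s for s in end_states if s.endswith(("w", "b"))]
```

Here `working_states` collects the four end states in which the individual works, which are the states that carry a wage in the model.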
In the first period, individuals decide whether to obtain a high school degree. In the second period, they decide whether to work in the white-collar sector, work in the blue-collar sector, or remain out of the labor force.

Figure 2.3 highlights that using a two-period model requires some drastic simplifications. For example, once individuals have chosen their employment status and occupation, they do not change it. While this is certainly a strong assumption, the majority of individuals in my sample do not switch sectors across waves, as I discuss in more detail in section 2.4.2.

Like the model in Eisenhauer et al. (2015), this framework can be viewed as a deliberately simplified version of the dynamic discrete choice model of education and occupational choice in Keane and Wolpin (1997). These simplifications make it possible to estimate a joint schooling and occupational choice model when I do not have the long-run annual panel data that is typically required of papers aiming to model and predict the evolution of wages over the life cycle. Because these life cycle predictions are not the focus of this paper, I use a simpler model which allows me to estimate sector-specific wage parameters while accounting for the endogeneity of schooling and sectoral choice, using only one cross-section of individuals.

Wage Equations

In this model, the value of each state depends on the immediate net rewards, as well as the expected future value of all feasible states made available by entering that state. I denote the current state $s \in S = \{0, h, l, hn, hw, hb, ln, lw, lb\}$. When an individual picks $s'$, they earn net rewards

\[ R(s') = Y(s') - C(s'). \]

In the four states in which the individual is working, $Y(s')$ is simply equal to the discounted sum of annual income earned during their working lives (which I assume to be from age 30 to 50) given that they have chosen $s'$.
In other words, if they choose the white-collar sector,

$$Y(s') = W_w(E(s'), \theta, \beta_w) = \sum_{t=0}^{20} \delta^t \, w_w(E(s'), \theta, t, \beta_w) \quad \forall s' \in \{hw, lw\}, \tag{2.4}$$

and if they choose the blue-collar sector,

$$Y(s') = W_b(E(s'), \theta, \beta_b) = \sum_{t=0}^{20} \delta^t \, w_b(E(s'), \theta, t, \beta_b) \quad \forall s' \in \{hb, lb\}, \tag{2.5}$$

where now $E$ is an indicator equal to 1 for high school completion and $w_k(E(s), \theta, t, \beta_k)$ is the income earned in state $s$ (which determines the sector $k$) at time $t$,[20] expressed as a linear function of observed characteristics and an unobserved component.

[20] Time, measured in years from the beginning of an individual's working life, is equal to age minus 30.

As in equation 2.1 in the static model, wages depend on schooling and ability $\theta$. Unobserved by the researcher but known to the agent, ability is captured by an individual's Raven's test score plus a normal error $\nu$,

$$\theta = \text{Raven's Score} + \nu,$$

in order to allow for other dimensions of skill to be included in this measure of labor market ability. The variance of $\nu$ is a parameter to be estimated. Because $\theta$ enters all relevant equations of this model, $\nu$ allows for correlation in the unobservables that govern the schooling choice, sectoral choice, and wages. This is assumed to be the only source of dependency among the unobservables in the model.

Let $A_1$ to $A_3$ represent three indicator variables for each of the three 5-year age categories spanning ages 36 to 50, which leaves the age group 30 to 35 as the omitted category. As in the Eisenhauer et al. (2015) framework, which allows for a different set of coefficients for each state, I allow for the effects of ability and experience to vary not only by sector but also by schooling level. Stochastic shocks $\epsilon(s,t)$ are independently normally distributed with mean zero (and variance normalized to 1). These components form the per-period wage functions for each sector:

$$w_w(E(s), \theta, t, \beta_w) = \beta_{w0} + \beta_{w1} E(s) + \beta_{w2} \theta + \beta_{w3} E(s)\theta + \sum_{j=4}^{6} \beta_{wj} A_{j-3}(t) + \sum_{j=7}^{9} \beta_{wj} A_{j-6}(t) E(s) + \sum_{j=10}^{k_w} \beta_{wj} X_{wj} + \epsilon(s,t) \quad \forall s \in \{hw, lw\} \tag{2.6}$$

$$w_b(E(s), \theta, t, \beta_b) = \beta_{b0} + \beta_{b1} E(s) + \beta_{b2} \theta + \beta_{b3} E(s)\theta + \sum_{j=4}^{6} \beta_{bj} A_{j-3}(t) + \sum_{j=7}^{9} \beta_{bj} A_{j-6}(t) E(s) + \sum_{j=10}^{k_b} \beta_{bj} X_{bj} + \epsilon(s,t) \quad \forall s \in \{hb, lb\}. \tag{2.7}$$

The coefficients that map schooling and ability to wages are sector-specific, which implies that a fixed set of characteristics will map to a different level of wages in the white-collar and blue-collar sectors. This is consistent with the existence of two types of skill: one that is rewarded in the white-collar sector and one that is rewarded in the blue-collar sector, each formed by different functions of individual characteristics. Individuals are therefore choosing their sector in a generalized version of the Roy (1951) economy.

In terms of tying my reduced form results to the predictions of the conceptual framework, $\beta_{w3}$ and $\beta_{b3}$ are the key parameters in question. These represent the non-separability between schooling and ability in each sector, or $\partial^2 W_w / \partial E \, \partial \theta$ and $\partial^2 W_b / \partial E \, \partial \theta$ using the previous notation. Values greater than zero indicate the existence of complementarities between schooling and ability in the wage function. Most importantly, however, the difference between $\beta_{w3}$ and $\beta_{b3}$ will indicate whether schooling and ability are more complementary in one sector than the other.

Individuals do not earn "adult" income in period 1, while high-school-aged. Even if they do earn "adolescent" wages during this period, I do not observe this for the vast majority of individuals in the data. However, I do allow for the opportunity cost of schooling, which includes foregone wages, to vary across individuals, as shown in the cost functions below.

Relative Cost Equations

At each node, I normalize the non-stochastic portion of the cost of one state (states $l$, $hb$, and $lb$ for nodes $s = 0$, $s = h$, and $s = l$, respectively) to equal zero because only relative costs can be identified.
Relative costs depend on $\theta$, a vector of observed characteristics $Q_s$, and stochastic shocks $\eta(s)$. $c(h)$ represents the total cost of obtaining a high school degree. $c(hw)$ and $c(lw)$ represent the costs of going into the white-collar sector, relative to the blue-collar sector, for individuals with and without a high school degree.

$$c(h) = c_h + \alpha_h \theta + \sum_{j=1}^{q_h} \gamma_{hj} Q_{hj} + \eta(h) - \eta(l) \tag{2.8}$$

$$c(hw) = c_{hw} + \alpha_{hw} \theta + \sum_{j=1}^{q_{hw}} \gamma_{hwj} Q_{hwj} + \eta(hw) - \eta(hb) \tag{2.9}$$

$$c(lw) = c_{lw} + \alpha_{lw} \theta + \sum_{j=1}^{q_{lw}} \gamma_{lwj} Q_{lwj} + \eta(lw) - \eta(lb) \tag{2.10}$$

Net rewards for states $hn$ and $ln$ are defined below. Although individuals do receive non-monetary rewards in these two states, I do not observe these rewards and cannot separately identify them from costs. Instead, I allow for net rewards ($Y(s') - C(s')$) to be a function of $E$, $\theta$, and observable characteristics:

$$Y(hn) - c(hn) = -\Big(c_{hn} + \alpha_{hn} \theta + \sum_{j=1}^{q_{hn}} \gamma_{hnj} Q_{hnj} + \eta(hn) - \eta(hb)\Big) \tag{2.11}$$

$$Y(ln) - c(ln) = -\Big(c_{ln} + \alpha_{ln} \theta + \sum_{j=1}^{q_{ln}} \gamma_{lnj} Q_{lnj} + \eta(ln) - \eta(lb)\Big). \tag{2.12}$$

The shocks in the cost function, $\eta(s)$, are assumed to be independent across all $s$. I assume that they are drawn from a Type 1 extreme value distribution, with scale factors specific to each node: $\sigma_n$ for the initial schooling decision, $\sigma_h$ for the sector decision among high school graduates, and $\sigma_l$ for the sector decision among non-graduates. Although accompanied by a number of strong assumptions,[21] this error structure greatly reduces the computational burden of estimating the model as it allows for an analytic expression of the likelihood function and the calculation of standard errors using the information matrix.

In period 1, individuals choose whether or not to go to high school based on current rewards and the continuation value of each choice. In period 2, they choose whether to work and in which sector to work, by comparing the expected net benefits of each choice. Individuals observe the cost shocks $\eta(s)$ before they decide on their next state, but only observe the reward shocks $\epsilon(s)$ after making their choice.
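The analytic convenience of this error structure comes from a standard property of Type 1 extreme value shocks: choice probabilities take a logit form, and the expected value of the best alternative has a closed form. The Python sketch below illustrates that property only. It is not the likelihood used in this chapter (the actual decision rules are in Appendix section A.3), and the function names, the example values, and the scale of 1.0 are hypothetical.

```python
import math


def choice_probabilities(values, scale=1.0):
    """Logit choice probabilities.

    `values` holds the deterministic value of each alternative. Each
    alternative is assumed to receive an additive i.i.d. Type 1 extreme
    value shock with the given scale (an assumption of this sketch).
    """
    top = max(values)  # subtract the maximum for numerical stability
    weights = [math.exp((v - top) / scale) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def expected_max(values, scale=1.0):
    """Expected value of the best alternative before shocks are observed.

    Uses the log-sum formula: scale * (log(sum(exp(v / scale))) + gamma),
    where gamma is the Euler-Mascheroni constant.
    """
    euler_gamma = 0.5772156649015329
    top = max(values)
    log_sum = top / scale + math.log(
        sum(math.exp((v - top) / scale) for v in values)
    )
    return scale * (log_sum + euler_gamma)


# Hypothetical period-2 values for a high school graduate:
# no work, white-collar, blue-collar (illustrative numbers only).
period2_values = [1.0, 2.0, 0.5]
probs = choice_probabilities(period2_values, scale=1.0)
continuation_value = expected_max(period2_values, scale=1.0)
```

In a two-period setting of this kind, a quantity like `continuation_value` is what a forward-looking agent would compare across the two schooling branches in period 1, before the period-2 shocks are realized.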
[21] This implies, for instance, a constant error variance across sector choices within each schooling branch and the independence of irrelevant alternatives.

This set-up generates decision rules and transition probabilities (outlined in Appendix section A.3), which I use to construct a likelihood function. I estimate the model using maximum likelihood, integrating over the ability error term using 500 simulations, and setting the discount rate to 0.04.

Structural Estimation Sample and Variables

To estimate this model, I use the MxFLS and generate additional labor market controls from the 1970 to 2000 decennial censuses. I restrict to those 30 and older to exclude late high school graduates and those who have not entered their main career sector. I also restrict to those 50 and younger to exclude early retirees and avoid classifying individuals into a sector they may have switched into later in their careers (in preparation for retirement, for example). It should be noted that this sample is distinct from the one used in the reduced form analysis because thermal inversion data is only available for a relatively young sample (born after 1979), while this model requires information from adults later in life.

I take the schooling, sector, and total annual income for each individual from the most recent survey wave in which they were aged between 30 and 50 and living in Mexico.[22] Other variables I obtain from the MxFLS include age, gender, maternal schooling, paternal schooling, and an urban indicator for the individual's place of residence. Gender and urban dummies are included in all equations, while parental schooling is only allowed to affect wages through its effect on schooling and sectoral choice. To represent the non-stochastic portion of ability $\theta$, I use the individual's standardized Raven's test score from the first test they took.

As mentioned earlier, switching across sectors is ignored by the model.
Fortunately, 85% of the individuals who I observe more than once between the ages 30 and 50 never switch between the white-collar and blue-collar sectors. More importantly, however, only 5% of those surveyed in all three waves switch more than once.[23] Most individuals appear to be picking a sector and staying in it, or else choosing a sector and eventually ending up there (potentially after dabbling in the other sector first).

[22] Although the MxFLS tracks migrants, even those that move to the U.S., data from the detailed interviews of these U.S. migrants are not publicly available. As a result, I include migrants in my analysis, but I use their income and sector information from the most recent wave in which they were still in Mexico. Doing this alleviates concerns about comparing income earned in the vastly different labor markets of the U.S. and Mexico.

[23] While the existence of switchers does suggest that an individual's occupational choice is at least partially determined by time-varying shocks or learning, it does not invalidate the important assumption underlying this framework: that individuals can calculate, with reasonable accuracy, the probability that they end up in a particular sector for the majority of their career.

Like the sectoral decision, the "no work" decision is also a permanent one in this framework. In order to avoid erroneously placing individuals who are only temporarily out of work in this category, I only include individuals who report having never worked before in this group. Individuals who are currently out of work and therefore missing sector and wage information, but who report having worked before, are dropped from the analysis.

Zone-level labor market variables that serve as exclusion restrictions are calculated from the 1970 to 2000 decennial censuses. For school-aged variables, I assign individuals, by commuting zone, to the value from the census at the beginning of the decade in which they turned 12.
For early working age variables, I assign individuals to the census at the beginning of the decade in which they turned 22.[24] As a cost shifter in the white-collar cost equations ($Q_{hw}$ and $Q_{lw}$), I use the gender-specific proportion of men or women in the white-collar sector in an individual's municipality of residence in their early working years. As a cost shifter in the cost equation for not working ($Q_{hn}$ and $Q_{ln}$), I use the adult unemployment rate while the individual was working-aged. As cost shifters in the schooling opportunity cost equation ($Q_h$), I use gender-specific youth employment rates during an individual's school-aged years as well as a measure of teacher availability. Specifically, for each commuting zone, I calculate the proportion of boys (for males) or girls (for females) aged 16 to 18 who report being currently employed, as well as the number of teachers per 10 children aged 6 to 18. Appendix Table A2 reports summary statistics for all of the relevant variables described in this section.

[24] Because there is no 1980 census, for individuals whose school aged or working age census was the 1980 census, I use the 1990 census instead.

2.5 Results

In this section, I begin by discussing the results of my reduced form analysis. After documenting the overall and gender-specific effects of pollution, I then investigate the labor market mechanisms driving the gender differences that I find. Next, I address potential threats to identification. Finally, I discuss the wage parameter estimates from the structural model.

2.5.1 Main Reduced Form Results

To display my reduced form results, I graphically illustrate the estimated coefficients from equation 2.3. All corresponding tables are available in the Appendix. Figure 2.4 reports the estimated biological effects of pollution on Raven's test scores, a measure of cognitive ability. In addition to the coefficients from the baseline specification, I plot the coefficients estimated from a specification that adds state-specific
quadratic year trends and state-specific quarter of the year dummies, hereafter referred to as season dummies. Across both specifications, thermal inversions in the second trimester have a significant negative impact on Raven's test scores. In the specification with state-specific trends, I estimate a coefficient of -0.013, which implies that a standard deviation increase in average monthly thermal inversions per trimester (8.2) leads to a 0.106 standard deviation decline in Raven's test scores. I do not find any significant effects associated with any of the other three-month periods. This is consistent with the medical and economic literature discussed in section 2.3.1, which flags the second trimester as a crucial period for brain development (Otake, 1998; Almond et al., 2009; Black et al., 2014; Schwandt, 2016).

Figure 2.4 Effects of Pollution on Raven's Test Scores
Notes: Intervals represent 90% confidence intervals. "Basic" coefficients are from regressions that control for birth month, birth year, municipality of birth, and survey wave by birth year fixed effects, gender, mother's education, father's education, cubic functions of average monthly mean, minimum, and maximum 2m temperatures, monthly relative humidity, monthly precipitation, and monthly cloud coverage during each relevant 3-month period. "State-Specific Trends" include all basic controls, state-by-season fixed effects and state-by-quadratic year trends. See Table A3, columns 1 and 2, for corresponding estimates.

In contrast, Figure 2.5 shows no evidence of a robust relationship between pollution exposure (in any period) and height, which is often used as a cumulative measure of the quality of health and nutritional inputs early in life (Thomas and Strauss, 1997; Maccini and Yang, 2009; Vogl, 2014) and has been shown to be causally linked to fetal health measures like birth weight (Behrman and Rosenzweig, 2004; Black et al., 2007). These results suggest that pollution did not substantially hinder the physical development of fetuses and therefore that the negative impact of in utero pollution exposure was primarily cognitive.

Figure 2.5 Effects of Pollution on Height
Notes: Intervals represent 90% confidence intervals. "Basic" coefficients are from regressions that control for birth month, birth year, municipality of birth, and survey wave by birth year fixed effects, gender, mother's education, father's education, cubic functions of average monthly mean, minimum, and maximum 2m temperatures, monthly relative humidity, monthly precipitation, and monthly cloud coverage during each relevant 3-month period. "State-Specific Trends" include all basic controls, state-by-season fixed effects and state-by-quadratic year trends. See Table A3, columns 3 and 4, for corresponding estimates.

In order to study differences across gender, I run these regressions separately for men and women. In the following figures, I plot the coefficients from the state-specific trend specification for males and females on the same graph, reporting only the three trimester coefficients (even though all regressions control for the remaining three-month periods). In the Appendix, I report these trimester coefficients, along with their differences and associated standard errors. In addition to 90% confidence intervals, I also plot 75% confidence intervals, which can be used for a rough visual detection of differences across groups (significant at the 10% level).

The first panel of Figure 2.6 shows that the second trimester estimates for the effect of pollution on Raven's scores are very similar in magnitude for males and females: -0.0107 for females compared to -0.0127 for males, which are not significantly different from each other. Neither coefficient is significant individually, likely due to the smaller sample sizes, but given the significance of the negative effect in the full sample, the main takeaway from this panel is that cognition appears to be affected by pollution in similar ways for men and women. For height, in the second panel of Figure 2.6, there are no significant gender differences.

Figure 2.6 Effects of Pollution on Health, by Gender
Notes: Separate regressions are conducted for men and women. Intervals represent 90% and 75% confidence intervals. Controls include birth month, birth year, municipality of birth, and survey wave by birth year fixed effects, state-by-season fixed effects, state-by-quadratic year trends, gender, mother's education, father's education, cubic functions of average monthly mean, minimum, and maximum 2m temperatures, monthly relative humidity, monthly precipitation, and monthly cloud coverage during each relevant 3-month period. See Table A4, columns 2 and 4, for corresponding estimates. Although not plotted here, inversions in all other three-month periods are included in these regressions.

It is important to note that the effects being estimated here are reduced form effects: they are the result of the biological effects of pollution as well as a series of investments made by parents up until the age at which the Raven's tests are administered and height is measured.[25] The purpose of this analysis is not to tease out the biological effect from the investment responses, as the data is not well-suited for this question: for the sample that I am using, information on early parental investments is not available. What is important for the goals of this paper is the fact that thermal inversions provide exogenous variation in cognitive ability, which allows me to study how schooling decisions respond to exogenously determined cognitive endowments.

[25] See Cunha and Heckman (2007), Cunha and Heckman (2008), and Cunha et al. (2010) for a commonly used dynamic framework for the production function of skill.

Having established that in utero exposure to pollution acted as a negative and primarily cognitive endowment shock that did not affect men and women differentially, I next ask whether there were any
differences in male and female schooling responses to this shock. Clear gender differences are apparent in Figure 2.7. Though both panels depict a similar pattern, the result is more pronounced in the regression on high school completion: thermal inversions had a significant negative impact on high school completion for women only. The male coefficient, on the other hand, is positive, statistically indistinguishable from zero, and significantly different from the female coefficient. High school graduation appears to be the only milestone affected by pollution: Appendix Table A7 shows that in utero thermal inversions had no significant impact on elementary school or junior high school completion for either gender. This suggests that this cognitive shock primarily affected later-life schooling decisions and had little effect on early parental education decisions.

Figure 2.7 Effects of Pollution on Educational Attainment, by Gender
Notes: Separate regressions are conducted for men and women. Intervals represent 90% and 75% confidence intervals. Controls include birth month, birth year, municipality of birth, and survey wave by birth year fixed effects, state-by-season fixed effects, state-by-quadratic year trends, gender, mother's education, father's education, cubic functions of average monthly mean, minimum, and maximum 2m temperatures, monthly relative humidity, monthly precipitation, and monthly cloud coverage during each relevant 3-month period. See Table A5, columns 2 and 4, for corresponding estimates. Although not plotted here, inversions in all other three-month periods are included in these regressions.

Figure 2.8 reports the effects of thermal inversions on income, again by gender, among those that report work as their primary activity in the previous week. This deliberately excludes individuals who may be working part time while still in school and whose annual income would not be an appropriate measure of their labor market productivity. Once again, I find that thermal inversions in the second
trimester have a significant negative effect on female income. The effect of second trimester pollution on men is smaller in magnitude and not significantly different from zero, but still negative, sizable, and not significantly different from the female coefficient. Unlike the high school completion results, Figure 2.8 does not offer clear-cut evidence for stark gender differences. Although it appears that pollution affected incomes primarily for women, there are also some non-negligible effects on men, which would be consistent with existing examples of early-life circumstances that significantly affected male labor market outcomes despite having very little effect on their schooling decisions (Hoddinott et al., 2008; Rosenzweig and Zhang, 2013; Politi, 2015).

Figure 2.8 Effects of Pollution on Income, by Gender
Notes: Separate regressions are conducted for men and women. Intervals represent 90% and 75% confidence intervals. Controls include birth month, birth year, municipality of birth, and survey wave by birth year fixed effects, state-by-season fixed effects, state-by-quadratic year trends, gender, mother's education, father's education, cubic functions of average monthly mean, minimum, and maximum 2m temperatures, monthly relative humidity, monthly precipitation, and monthly cloud coverage during each relevant 3-month period. See Table A6, column 2, for corresponding estimates. Although not plotted here, inversions in all other three-month periods are included in these regressions.

Because this is a young sample (aged 15 to 34), the estimated coefficients represent the effect of pollution on early career outcomes, which might be very different from the effects on lifetime income. In particular, the career wage trajectories of men and women likely differ drastically; the direction and magnitude of the gender differences found here may not be the same as those in lifetime income effects.
Moreover, these regressions ignore selection into the sample of full-time workers, although I do not find evidence that thermal inversions affected whether an individual reported work as their main activity (results available upon request).

2.5.2 Labor Market Mechanisms

According to the model in section 2.2, gender differences in schooling responses to shocks can arise from gender-specific tendencies to sort into the white-collar sector. In order to directly test this, I take advantage of differences across space and over time in the proportion of men and women in white collar jobs in local labor markets, which I argue are reasonable proxies for $p_{jg}$.[26] In particular, the gender-specific proportion of white collar employment in the local labor market during a child's critical schooling transition period should be positively related to the expectation that she will end up in a white collar job ($p_{jg}$).

Like Rosenzweig and Zhang (2013), I focus on the local labor market in which a child is residing at age 12. In Mexico, the end of elementary school is a critical transition period during which a large proportion of children drop out (Behrman et al., 2011). Moreover, for the majority of individuals in my sample, I have data on their municipality of residence at age 12 specifically.

For the following analysis, I create an indicator equal to 1 if the predicted share of men (for males) or women (for females) working in white collar jobs in the commuting zone in which the individual was residing at age 12 falls in the top quartile of the predicted white-collar proportion distribution. The results in Table 2.3 use a discrete transformation of proportions predicted by combining municipality-level occupation distributions from the census with national-level industry growth rates from ENIGH, using an industry shift-share strategy similar to Bartik (1991) and others (see Appendix section A.2 for more details).
However, the pattern of results is robust to the use of a continuous instead of a discrete measure, as well as simply assigning individuals the relevant value from the census decade in which they turned 12 (Table A8).

[26] Although it is difficult to capture expectations without subjective expectations data, the existing literature suggests current labor market conditions can serve as a reasonable proxy for $p_{jg}$. For example, Jensen (2010) finds that 70% of survey respondents in the Dominican Republic report that people in their community were their main source of information about expected earnings. Similarly, Nguyen (2008) shows that information about current labor market conditions can affect parental and child expectations about future returns. In a slightly different context, Attanasio and Kaufmann (2012) use current conditions in the marriage market (gender ratios for various education categories) to proxy for marriage market expectations.

I begin this exercise by reporting, in columns 1 and 3 of Table 2.3, the trimester coefficients from the fully-interacted specification used to generate the second panel of Figure 2.7, which demonstrates the significant gender difference in the effect of thermal inversions on high school completion. In columns 2 and 4, I add inversion-by-$p_{jg}$ interactions to investigate the extent to which this gender difference is being driven by gender-specific labor market opportunities. The negative effect of second trimester inversions on high school completion is concentrated among individuals more likely to go into the white-collar sector: in both specifications, the coefficient on the second trimester interaction is negative and significant at the 5% level, while the main effect is much smaller and insignificant. This establishes a clear link between labor market conditions and investment responses to shocks, likely operating through the effect the current labor market has on expectations.[27]

More importantly, the gender difference (reported in columns 1 and 3) completely disappears when the labor market interactions are included: it is much smaller in magnitude than the $p_{jg}$ interaction and insignificant. The drastic decrease in the second trimester male interaction with the inclusion of the $p_{jg}$ interactions demonstrates that the gender difference in this context is driven by the different labor market conditions facing men and women. Importantly, this rules out other common explanations for gender differences in the effects of shocks, including gender discrimination or gender-specific norms regarding high school completion. If the larger negative effect of pollution on female schooling was driven by son preference or parental beliefs that men should complete high school no matter their ability level, the gender difference should persist even after controlling for gender-specific white-collar opportunities.

2.5.3 Threats to Identification

Fertility Timing

The validity of the above analysis relies on the assumption that mothers in a given municipality who experience many thermal inversions during their second trimester are not systematically different from mothers in that same municipality who experience fewer thermal inversions in that same period. One way of testing this is to regress observable maternal characteristics on the thermal inversion variables of interest. Columns 1 and 3 of Table 2.4 report the results of regressions of maternal years of schooling and an indicator for whether an individual's mother ever worked on thermal inversions in the second trimester.[28] In both columns, there is no systematic relationship between inversion exposure and these two maternal characteristics. In Columns 2 and 4, I report the regression results from running the entire specification used for the above analysis (excluding the maternal and paternal schooling controls), with these two maternal characteristics as my dependent variables. None of the trimester coefficients are
None of the trimester coecients are 27 The nding that parental and child expectations can in uence child schooling decisions is consistent with evidence from subjective expectations data from urban Mexico (Kaufmann, 2014; Attanasio and Kaufmann, 2014). 28 These are the only two maternal characteristics which are recorded in a comparable way for individuals with parents living in the household and individuals whose parents do not live in the same household. 35 Table 2.3 Eects of Pollution on High School Graduation, by White Collar Opportunities (1) (2) (3) (4) Average monthly inversions… HS Completion HS Completion HS Completion HS Completion Trimester 1 0.00375 0.00462 0.00363 0.00447 (0.00353) (0.00534) (0.00353) (0.00553) Trimester 2 -0.00773** -0.00172 -0.00792*** -0.000896 (0.00306) (0.00415) (0.00299) (0.00435) Trimester 3 0.000748 0.000168 0.00179 0.000764 (0.00308) (0.00571) (0.00311) (0.00579) Trimester 1 -0.00460 -0.00437 -0.00423 -0.00429 x 1(Male) (0.00476) (0.00596) (0.00478) (0.00621) Trimester 2 0.00702* 0.00258 0.00771* 0.00221 x 1(Male) (0.00393) (0.00448) (0.00405) (0.00483) Trimester 3 0.00240 0.00185 0.00112 0.000707 x 1(Male) (0.00429) (0.00593) (0.00431) (0.00584) Trimester 1 -0.000873 -0.000643 x 1(Predicted white collar proportion in top quartile) (0.00456) (0.00476) Trimester 2 -0.00758** -0.00891** x 1(Predicted white collar proportion in top quartile) (0.00359) (0.00370) Trimester 3 0.000228 0.000691 x 1(Predicted white collar proportion in top quartile) (0.00454) (0.00455) N 10715 10572 10715 10572 Dependent variable mean 0.266 0.264 0.266 0.264 Additional Fixed Effects None state-by-season, state-by-quadratic- year Notes: Standard errors (clustered at municipality level) in parentheses. 
* p< 0:1 ** p< 0:05 *** p< 0:01 All regressions control for the following variables and their interactions with a male indicator (as well as the main eect of gender): birth month, birth year, municipality of birth, and survey wave by birth year xed eects, mother's education, father's education, cubic functions of average monthly mean, minimum, and maximum 2m temperatures, monthly relative humidity, monthly precipitation, and monthly cloud coverage during each relevant 3-month period, as well as inversions in all other three-month periods. In columns 2 and 4, the main eect of the white collar variable and the interactions with inversions in all other three month periods are also included. Predicted white collar proportions calculated using census data and annual industry growth rates from ENIGH. See Data Appendix for details on the construction of predicted white collar proportions. 36 signicantly dierent from zero (and all are small in magnitude), suggesting that, conditional on all of the xed eects and weather controls, thermal inversion exposure is truly exogenous to these maternal characteristics. Of course, these two characteristics may not represent all of the observed or unobserved dimensions that could be systematically correlated with thermal inversion exposure. Perhaps the more relevant variables are those related to maternal characteristics in the year before birth, which are not available in this data set. For example, thermal inversions are more common in winter, and pregnant mothers who are in their second trimester during winter give birth in the spring. In areas where the maize harvest is in the spring, mothers who can aord to give birth in the spring might be less likely to be working in agriculture, for example, than mothers who choose instead to give birth in the fall. In the current specication, month xed eects help account for this, but are an incomplete solution if these seasonal eects vary over time or space. 
In order to better control for time-varying or municipality- specic seasonal eects, I run two additional specications. In the rst specication, I replace the state-season xed eects with municipality-season xed eects. In the second specication, I keep these municipality-season xed eects and replace the year and month xed eects with interacted year- month dummies. The latter allows for monthly trends to dier non-linearly over time, which would be important if the incentives to time births have changed over the two decade period spanning the birth years in my sample. As Appendix Figures A1 and A2 show, my main results are robust to these specication changes: pollution signicantly reduces Raven's test scores for the whole sample and high school completion for women only. Mortality Selection Given that in utero exposure to pollution is known to aect infant mortality, one important concern is whether my results are being driven by selective mortality. First, it is worthwhile to note that if the infants that do not survive as a result of pollution exposure are mostly from the left tail of the ability distribution, my estimated eects should be an underestimate of pollution's true impact. However, in order to verify whether selective mortality is an issue in my setting, I check whether thermal inversions before birth have any eect on cohort size or cohort gender composition. Using all individuals in the MxFLS born after 1979 and old enough in at least one survey wave to have been asked about their place of birth, I rst calculate the total number of individuals and fraction that is male for each birth municipality, birth month, and birth year combination. 
Table 2.4 Maternal Characteristics and Thermal Inversions

                                   (1)           (2)           (3)           (4)
Average monthly inversions…     Mother's      Mother's      1(Mother      1(Mother
                                Education     Education     Worked)       Worked)
BEFORE CONCEPTION
19-21 months before birth                     -0.00180                    0.00134
                                              (0.0177)                    (0.00238)
16-18 months before birth                     -0.0229                     -0.00329
                                              (0.0194)                    (0.00219)
13-15 months before birth                     -0.0295                     -0.00150
                                              (0.0184)                    (0.00266)
10-12 months before birth                     -0.0137                     0.00132
                                              (0.0183)                    (0.00256)
DURING PREGNANCY
Trimester 1                                   0.00429                     -0.00149
                                              (0.0233)                    (0.00269)
Trimester 2                     -0.000340     -0.00930      -0.0000635    0.00196
                                (0.0175)      (0.0215)      (0.00132)     (0.00248)
Trimester 3                                   0.0125                      0.00237
                                              (0.0202)                    (0.00309)
AFTER BIRTH
0-2 months after birth                        -0.00557                    -0.00119
                                              (0.0178)                    (0.00262)
3-5 months after birth                        -0.00262                    0.000120
                                              (0.0199)                    (0.00281)
6-8 months after birth                        0.0121                      0.00153
                                              (0.0170)                    (0.00257)
9-11 months after birth                       0.0280                      -0.00234
                                              (0.0202)                    (0.00256)

N                               10322         9770          11104         10496
Mean of dependent variable      6.105         6.170         0.462         0.466
Basic Controls                  No            Yes           No            Yes
Additional Fixed Effects: none in columns 1 and 3; state-by-season and state-by-quadratic-year in columns 2 and 4.

Notes: Standard errors (clustered at municipality level) in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01. The "Basic Controls" in columns 2 and 4 are: birth month, birth year, municipality of birth, and survey wave by birth year fixed effects, gender, cubic functions of average monthly mean, minimum, and maximum 2m temperatures, monthly relative humidity, monthly precipitation, and monthly cloud coverage during each relevant 3-month period.

With each observation representing a year-month-municipality, I regress these aggregate values on thermal inversions during pregnancy and in the year before and after. My results, reported in Table 2.5, show no evidence for selective mortality in this sample.
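The aggregation step behind this mortality-selection check (collapsing individuals into municipality-year-month cells and computing cohort size and fraction male) can be sketched as follows. This is only an illustration under stated assumptions: the record layout and the field names `municipality`, `birth_year`, `birth_month`, and `male` are hypothetical, not the MxFLS variable names, and the subsequent regression on thermal inversions is not shown.

```python
from collections import defaultdict

def cohort_aggregates(individuals):
    """Collapse individual records into municipality-year-month cells.

    Each record is a dict with hypothetical keys 'municipality',
    'birth_year', 'birth_month', and 'male' (1 for male, 0 for female).
    Returns {cell: (cohort_size, fraction_male)}.
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [n_total, n_male]
    for person in individuals:
        cell = (person["municipality"], person["birth_year"], person["birth_month"])
        counts[cell][0] += 1
        counts[cell][1] += person["male"]
    return {cell: (n, n_male / n) for cell, (n, n_male) in counts.items()}

# Toy data, invented purely for illustration
sample = [
    {"municipality": "A", "birth_year": 1985, "birth_month": 3, "male": 1},
    {"municipality": "A", "birth_year": 1985, "birth_month": 3, "male": 0},
    {"municipality": "B", "birth_year": 1990, "birth_month": 7, "male": 1},
]
cells = cohort_aggregates(sample)
```

Each resulting cell would then serve as one observation in a regression of cohort size or fraction male on inversion exposure.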
While the absence of any pollution-driven changes in cohort size may seem inconsistent with previous studies documenting a positive link between pollution and infant mortality (Arceo et al., 2016; Jayachandran, 2009; Currie and Neidell, 2005; Chay and Greenstone, 2003), it does not necessarily rule out the possibility that thermal inversions led to higher infant mortality in this sample as well. These null effects are consistent with a situation in which thermal inversions increased infant mortality by accelerating the deaths of infants who would have died before reaching adolescence or adulthood in the absence of pollution. By the time I observe my sample, pollution-driven changes in its composition do not appear to be a substantial concern.

Correlates of White Collar Proportions

In the investigation of labor market mechanisms summarized in Table 2.3, I have interpreted the gender-specific white-collar shares as representing individuals' expected probabilities of going into the white-collar sector. This interpretation may be flawed, however, if these white-collar shares are simply capturing the effects of omitted variables that are correlated with these shares. For example, if white-collar proportions are correlated with higher pollution levels (due to greater economic activity and urbanization, for example), it could be the case that the stronger negative effect I find in high white-collar areas is due to a non-linear relationship between thermal inversions and pollution. That is, if thermal inversions exacerbate pollution more in highly polluted areas (compared to less polluted areas), this would lead to larger reduced form effects in highly polluted areas. It is important to note, however, that while this would explain the significant negative coefficient on the interaction between white-collar proportions and thermal inversions, this would not be able to explain why the gender difference disappears after controlling for these variables.
Another piece of evidence that rules out this alternative explanation can be found in Table A9. Here, I repeat the analysis conducted in Table 2.3, instead using cognitive ability as the dependent variable. If the high white-collar proportions were simply capturing larger effects in more polluted areas, there should also be a similar pattern with cognitive ability: stronger negative effects in high white-collar areas. On the contrary, in this cognitive ability regression, I find no differential effect across the different labor market conditions, and the effect sizes (for males and females) are not affected by the inclusion of these labor market controls.

Table 2.5 Effects of Pollution on Cohort Size

                                   (1)           (2)           (3)           (4)
Average monthly inversions…     Cohort size   Cohort size   Fraction      Fraction
                                                            male          male
BEFORE CONCEPTION
19-21 months before birth       -0.00351      -0.00482      -0.000354     0.000394
                                (0.00329)     (0.00322)     (0.00271)     (0.00269)
16-18 months before birth       -0.00127      -0.00123      0.000777      0.00122
                                (0.00320)     (0.00324)     (0.00263)     (0.00275)
13-15 months before birth       0.00514       0.00462       -0.00109      -0.000581
                                (0.00323)     (0.00327)     (0.00278)     (0.00285)
10-12 months before birth       0.00369       0.00411       0.00254       0.00368
                                (0.00361)     (0.00378)     (0.00259)     (0.00259)
DURING PREGNANCY
Trimester 1                     0.00161       0.00178       -0.00134      -0.000932
                                (0.00331)     (0.00344)     (0.00246)     (0.00256)
Trimester 2                     0.00216       0.00295       -0.000612     -0.000372
                                (0.00357)     (0.00348)     (0.00284)     (0.00277)
Trimester 3                     -0.00557      -0.00593      -0.000615     -0.000327
                                (0.00467)     (0.00483)     (0.00240)     (0.00233)
AFTER BIRTH
0-2 months after birth          -0.000247     0.000248      0.000183      0.00142
                                (0.00397)     (0.00403)     (0.00244)     (0.00257)
3-5 months after birth          0.000784      0.000110      -0.00167      -0.000926
                                (0.00379)     (0.00408)     (0.00267)     (0.00276)
6-8 months after birth          -0.00101      -0.00100      -0.000345     -0.000430
                                (0.00345)     (0.00354)     (0.00248)     (0.00261)
9-11 months after birth         -0.00439      -0.00472      -0.00131      -0.00119
                                (0.00361)     (0.00365)     (0.00307)     (0.00316)

N (municipality-year-months)    10108         10108         10098         10098
Mean of dependent variable      1.352         1.352         0.482         0.482
Additional Fixed Effects: none in columns 1 and 3; state-by-season and state-by-quadratic-year in columns 2 and 4.

Notes: Standard errors (clustered at municipality level) in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01. In these regressions, each observation represents a unique municipality-month-year combination. All regressions control for month, year, and municipality fixed effects, cubic functions of average monthly mean, minimum, and maximum 2m temperatures, monthly relative humidity, monthly precipitation, and monthly cloud coverage during each relevant 3-month period.

In order for an omitted correlated variable to be driving both the significant negative white-collar interaction and the disappearance of the gender difference in Table 2.3, it has to be gender-specific (like the white-collar proportion variables I construct) and on average different for men and women. Specifically, I need only to be concerned about variables that are correlated with white-collar proportions, within each gender, and which have different means for men and women. Variables that are positively (negatively) correlated with white-collar proportions and which are on average higher (lower) for women than for men are a concern. Fortunately, most correlates of white-collar shares do not fulfill these criteria.

For example, one might be concerned that high white-collar shares for a particular gender represent higher income or better job opportunities, specifically for that gender. Unlike white-collar shares, however, incomes are on average higher for men than women, which means that this variable would not be able to generate the pattern of results in Table 2.3. Another concern is that high white-collar shares could be related to features of the marriage market. Specifically, high white-collar shares for a particular gender may be associated with later marriage or later parenthood for that gender.
However, in order for this to produce the pattern of results in Table 2.3, men would have to marry and have children at a younger age than women on average, which is not the case. Though I can rule out gender-specific incomes or gender-specific marital and fertility behavior as alternative explanations for my results in Table 2.3, gender-specific agricultural industry shares do satisfy the criteria of being negatively correlated with white-collar proportions and on average lower for women than men. In other words, areas with high white-collar occupation shares also tend to have low shares of workers in the agricultural industry, while in addition, men on average tend to work in the agricultural industry at higher rates than women. If agricultural shares (which are defined by industry) rather than white-collar shares (which are defined by occupation type) are responsible for the results in Table 2.3, this would rule out white-collar expectations as the main mechanism and therefore imply an underlying model different from the one outlined in section 2.2. In order to check whether this is the case, I repeat the analysis conducted in Table 2.3, this time including interactions between thermal inversions and agricultural industry shares. The inclusion of these interactions does not affect any coefficients from the previous regressions. In Appendix Table A10, I still find that there are significantly larger negative effects for individuals living in high white-collar areas, and no significant differences across individuals facing different agricultural industry shares.

2.5.4 Structural Estimates of Wage Function

The above analysis has shown that female schooling decisions are more strongly affected by a cognitive endowment shock than male schooling decisions. This is primarily driven by the higher tendency of women to enter the white-collar sector.
Recall that the model in section 2.2 predicted a larger schooling response among women if schooling and ability are more complementary in the white-collar than in the blue-collar sector. I verify the latter by structurally estimating these sector-specific parameters. Table 2.6 reports the estimates for the wage parameters from the model described in section 2.4.2. The parameter estimates for the cost functions can be found in Appendix Table A12. In Table 2.6, the first two columns report the coefficient estimates and standard errors for the white-collar wage parameters, and the second two for the blue-collar wage function.

Average wages are higher in the white-collar sector than in the blue-collar sector. The age patterns differ across sectors as well; the white-collar sector offers greater rewards for experience in the older age categories, but the standard errors of these differences are large. Individuals in urban areas earn higher income on average. Conditioning on schooling and ability, men earn significantly higher wages than women in both sectors. Notably, the male advantage is significantly larger in the blue-collar sector than in the white-collar sector, which is consistent with men having a comparative advantage in the more physical blue-collar sector. The variance of the ability error term is significantly different from zero, which is an indication that unobservables that drive schooling and sectoral choice as well as wages do need to be taken into account (and would be a potential source of bias in simple sector-specific Mincer-style regressions).

Most relevant to the model predictions, however, is the relative magnitude of the sector-specific coefficients on the interaction between high school completion and ability. This term is positive and significant in the white-collar sector, offering evidence for complementarities between ability and educational investments in this sector.
In the blue-collar sector, on the other hand, this term is negative but not statistically significant. The difference between the two coefficients is significant at the 1% level (with a t-statistic of 4.31). Because sector-specific wage functions separately capture two different types of skill (one that is rewarded in the white-collar sector and one that is rewarded in the blue-collar sector), this result offers a nuanced contribution to the discussion about the production function of skill and whether there are complementarities between cognitive ability accumulated during childhood and schooling investments in adolescence (Cunha and Heckman, 2007; Cunha et al., 2010; Aizer and Cunha, 2012). In short, the existence or strength of complementarities can be heterogeneous across different skill types.

Table 2.6 Wage Function Parameter Estimates

                            White Collar                  Blue Collar
                            Estimate   Standard Error     Estimate   Standard Error
Constant                    10.447     (0.118)***         9.996      (0.078)***
HS                          0.266      (0.178)            0.131      (0.193)
Ability                     -0.082     (0.065)            0.282      (0.025)***
HS x Ability                0.194      (0.085)**          -0.283     (0.078)***
HS: Age 35-40               0.069      (0.105)            0.147      (0.142)
HS: Age 41-45               0.044      (0.101)            0.217      (0.150)
HS: Age 46-50               0.550      (0.118)***         0.299      (0.189)
No HS: Age 35-40            0.292      (0.140)**          -0.022     (0.060)
No HS: Age 41-45            0.227      (0.148)            -0.052     (0.062)
No HS: Age 46-50            0.293      (0.171)*           0.072      (0.074)
Male                        0.361      (0.060)***         0.732      (0.055)***
Urban                       0.289      (0.076)***         0.487      (0.049)***
Ability Error Variance      2.128      (0.200)***

Notes: Standard errors in parentheses, calculated analytically using the information matrix. * p < 0.1, ** p < 0.05, *** p < 0.01.

These estimates confirm that the reduced form results discussed earlier do indeed support the model predictions from section 2.2. Because women tend to sort disproportionately into a sector where schooling and ability are more complementary, they exhibit stronger schooling responses to a cognitive endowment shock.
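As a rough illustration of how a test of the difference between the two HS x Ability coefficients can be formed, the sketch below computes a t-statistic from the Table 2.6 point estimates and standard errors. It assumes the two estimates are uncorrelated, which is an assumption made here and not something the text states; because the parameters are estimated jointly, the reported statistic of 4.31 may incorporate a covariance term that this shortcut omits. The function name `diff_t_stat` is hypothetical.

```python
import math

def diff_t_stat(b1, se1, b2, se2):
    """t-statistic for b1 - b2, ASSUMING the two estimates are uncorrelated."""
    return (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)

# HS x Ability estimates and standard errors from Table 2.6
# (white collar: 0.194 (0.085); blue collar: -0.283 (0.078))
t = diff_t_stat(0.194, 0.085, -0.283, 0.078)
print(round(t, 2))
```

Under the zero-covariance assumption the statistic comes out at roughly 4.1, in the same range as the figure reported in the text and well above conventional 1% critical values.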
2.6 Chapter 2 Conclusion

This study offers evidence that gender differences in investment responses can arise from gender-specific opportunities in the labor market. Long-term cognitive damage caused by pollution exposure in utero affects the schooling decisions and income of women, but not of men. This gender difference is largely driven by the different labor market opportunities faced by men and women. In particular, women have a comparative advantage in white-collar jobs, where I show that schooling and ability exhibit a higher degree of complementarity than in blue-collar jobs. These findings suggest that interventions aimed at improving cognitive ability will have larger effects on female schooling and potentially also labor market outcomes, therefore helping to close gender gaps in these outcomes.

These results shed light on the important links between current labor market conditions, future labor market expectations, and investment responses to early-life shocks. This paper joins Pitt et al. (2012) and Rosenzweig and Zhang (2013) in underscoring that gender-specific comparative advantage affects how males and females respond to shocks. I also offer evidence that parents and individuals respond to expectations about future labor market opportunities, which is consistent with related studies that use subjective expectations data (Kaufmann, 2014; Attanasio and Kaufmann, 2014). This finding also speaks to a broader literature documenting that labor market conditions, including current and future job opportunities, affect schooling decisions (Jensen, 2012; Atkin, 2016; Shah and Steinberg, 2015). Finally, these results address an important question that has motivated a number of recent studies: how do early-life shocks interact with events or conditions later in life?
Whether these events are policy interventions (Adhvaryu et al., 2015; Rossin-Slater and Wüst, 2015; Gunnsteinsson et al., 2014), economic shocks (Bharadwaj et al., 2014), or simply the labor market conditions studied in this paper, the fact that they interact with early-life conditions in ways we may not yet fully understand has important implications for future policy and the interpretation of existing results.

Chapter 3

Helping Children Catch Up: Early Life Shocks and the Progresa Experiment [1]

3.1 Introduction to Chapter 3

Poor circumstance in early life often has long-lasting negative impacts (Almond and Currie, 2011; Heckman, 2006, 2007; Currie and Vogl, 2013). [2] What role can important change agents (parents, communities, governments) play in lessening the burden of adverse events in a young child's life? Research has demonstrated that in many contexts, parents provide more time and material resources to their more disadvantaged children (Almond and Mazumder, 2013). We ask: how much of a difference does this extra investment make? That is, to what extent is remediation possible, and which behaviors and policies can generate meaningful catch-up? This relates closely to recent work evaluating the impacts of policies that provide support to disadvantaged children (Conti et al., 2015; Gertler et al., 2014; Chetty et al., 2016; Aizer et al., 2016; Hoynes et al., 2016; Lavy and Schlosser, 2005; Lavy et al., 2016).

[1] This paper was co-authored with Achyuta Adhvaryu, Anant Nyshadham, and Jorge Tamayo.

[2] Shocks to the early life environment (disease, poverty, maternal stress, nutritional or income availability, and conflict, among many others) affect a wide range of adult outcomes (see, e.g., Bhalotra and Venkataramani (2011); Venkataramani (2012); Fink et al. (2015); Almond (2006); Maccini and Yang (2009); Adhvaryu et al. (2016); Persson and Rossin-Slater (2014); Duque (2017); Gould et al. (2011)).

The answer to this question is neither theoretically obvious nor empirically straightforward. The theory of dynamic human capital formation suggests that timing matters a great deal (Heckman and Mosso, 2014; Cunha et al., 2010). Due to the decreasing degree of static (within-period) substitutability of investments and stocks of human capital as individuals age, investing in children very early in their lives yields the largest returns; attempting to correct for disadvantage in later childhood (say, adolescence) or adulthood may be economically inefficient (Conti and Heckman, 2014; Heckman and Mosso, 2014). It is yet unclear at what ages this drop in returns kicks in, and thus when the potential remediating effects of investments may disappear.

The main empirical challenge in answering this question rigorously is that investments following a shock are, in general, endogenous responses. Investments and resulting outcomes are jointly determined by parents' preferences, families' access to resources, and the like. Comparing the outcomes of two people who faced the same shock but were privy to different levels of corrective investment will therefore produce a biased estimate of the remediation value of investments if these investments are correlated with unobserved determinants of the outcomes in which we are interested. As Almond and Mazumder (2013) put it in their recent review, resolving this identification problem "may be asking for 'lightning to strike' twice: two identification strategies affecting the same cohort but at adjacent developmental stages. Clearly this is a tall order."

In this study, we attempt to overcome this difficulty. We demonstrate that recovery from early life shocks is possible, at least with regard to educational attainment and employment outcomes, via conditional cash transfers during childhood.
We leverage the combination of a natural experiment that induced variation in the extent of early disadvantage and a large-scale cluster randomized controlled trial of cash transfers for school enrollment in Mexico. In our study's agrarian setting, where weather plays a significant role in determining household income (and thus the availability of nutrition and other health inputs for children), we verify that adverse rainfall lowers the agricultural wage, and show that Mexican youth born during periods of adverse rainfall have worse educational attainment and employment outcomes than those born in normal rainfall periods. Exposure to adverse rainfall in the year of one's birth, a crucial period for the determination of long-term health and human capital, decreased years of completed education by more than half a year.

However, for children whose households were randomized to receive conditional cash transfers through Progresa, Mexico's landmark experiment in education policy, each additional year of exposure mitigated the long-term impact of rainfall shocks on educational attainment by 0.1 years. By reducing the opportunity cost of schooling, Progresa enabled all children to stay in school longer than they would have otherwise, but had the largest effects on those impacted by negative rainfall shocks at birth. Each additional year of program exposure during childhood mitigated more than 20 percent of early disadvantage. The negative effects of adverse rainfall become discernible after primary school, with the largest impacts measured for completion of grades 7 through 9. The mitigative impact of Progresa, as well as the main effect of the program, is also largest precisely in these years.
Finally, although data limitations preclude the analysis of longer-term outcomes for much of our sample, for the oldest individuals (who were 18 at the time of the 2003 survey), we find a similar pattern of coefficients in regressions on continued education (after high school) and employment outcomes. [3] Adverse rainfall in the year of birth leads to a reduction of 17 percentage points in the probability of working, while each additional year of Progresa exposure offsets nearly 8 percentage points of this impact. At 2 years of program exposure (the within-cohort difference due to randomized treatment), Progresa offsets more than 88 percent of the disadvantage caused by adverse rainfall in the year of birth in terms of employment at age 18.

Put another way, there is substantial heterogeneity in the treatment effect of Progresa across the distribution of initial endowments, as determined by economic circumstance in early life. The effect of conditional cash transfers on schooling in our case is driven in large part by the impact on disadvantaged children. At the mean length of program exposure, children born in "normal" circumstances get around 0.5 years of additional schooling. But program exposure increases schooling for disadvantaged children by double this amount, slightly over 1 year. With respect to employment at age 18, we find that Progresa has little to no effect on children born during normal rainfall, with roughly the entire impact of Progresa exhibited for disadvantaged children.

Our study furthers the understanding of a crucial aspect of the complex process of human capital formation: how do early stocks of human capital and subsequent investments interact to determine long-run outcomes (Heckman and Mosso, 2014; Cunha et al., 2010)? Our attempt to answer this question exploits two orthogonal sources of variation: exposure to abnormal rainfall around the time of birth and exposure to a large-scale randomized conditional cash transfer program.
In this regard, our work is most related to three recent working papers: Gunnsteinsson et al. (2016), who examine the interaction of a natural disaster and a randomized vitamin supplementation program in Bangladesh; Rossin-Slater and Wüst (2015), who study the interaction of nurse home visitation and high quality preschool daycare in Denmark; and Malamud et al. (2016), who examine the interaction of access to abortion and better schooling in Romania. Despite the vastly different contexts and types of programs studied, the results in these papers, quite remarkably, mirror what we find in our work, an (at least weakly) negative interaction effect, indicating that remediation of early-life shocks via investments can indeed be successful.

[3] Attrition and low quality data in the 2007 wave of the survey make this wave unusable. Accordingly, we have post-secondary schooling and employment outcomes only for 18 year olds in 2003, who are also impacted by both sources of exogenous variation.

Part of the argument for targeting low-endowment children is the idea that the return on investment is highest for this group, but we do not have credible evidence that this is indeed the case. While there is substantial evidence that early interventions for disadvantaged children can have large long-term impacts (Heckman et al., 2010, 2013; Chetty et al., 2016; Gould et al., 2011; Hoynes et al., 2016; Lavy et al., 2016), we know little about how large those returns are compared to the returns of similar interventions for less disadvantaged populations. The ethical imperative for parents, communities, and the government to improve the circumstance of disadvantaged children may be clear. But if returns to investment are highest for high-endowment children (i.e., if "skill begets skill"), then this moral argument would be at odds with the economic drive to invest where the return is largest. [4] Our results show that in terms of schooling and employment outcomes, children disadvantaged at birth are actually the highest-return beneficiaries of remediating investments. This result is consistent with new evidence from the Head Start program in the United States (Bitler et al., 2014). [5]

[4] In other words, there would be an equity-efficiency tradeoff for late stage child investments (Heckman, 2007).

[5] In both contexts, it should be noted that what is being estimated is the return to an intervention for the poorest among a disadvantaged population, as both Progresa and Head Start already target low-income households.

Our empirical context is appealing because of the relatively high potential for external validity. Adverse rainfall is likely the most common type of shock experienced by poor households in much of the developing world (Dinkelman, 2013), and has large short- and long-term consequences (Paxson, 1992; Maccini and Yang, 2009; Shah and Steinberg, 2013). Given the rising importance of wide-scale cash transfer programs around the world (Haushofer and Shapiro, 2013; Blattman et al., 2013), it is important to learn here that these programs, if administered as successfully as Progresa was in Mexico, can mitigate a sizable portion of the adverse impacts of poor rainfall at the time of birth.

The rest of the paper is organized as follows. Section 3.2 provides background on the Progresa program in Mexico. Section 3.3 describes the survey data and rainfall data we use. Section 3.4 describes our empirical strategy. Section 3.5 details our results and section 3.6 concludes.

3.2 Program Background

In 1997, the Government of Mexico began a conditional cash transfer program called Progresa, aimed at alleviating poverty and improving the health, education and nutritional status of poor families, particularly children and mothers, in rural communities.
In this paper, we focus on the education component of Progresa, which consisted of bimonthly cash payments to mothers during the school year, contingent on their children's regular school attendance (an attendance record of 85% is required to continue receiving the grant). [6] Initially ranging from 60 to 205 pesos in 1997, the size of the subsidy depended on the number of children enrolled in school and the grade levels and genders of the children. As shown in Table 3.1, from seventh grade onwards, the grants increase with grade level, with higher amounts for girls than boys. [7] At the program's onset, grants were provided only for children between third and ninth grade (the third year of junior high school). In 2001, the grants were extended to high school. Table 3.1 summarizes the monthly grant amounts for the second semester of 1997, 1998 and 2003.

[6] The health component involved conditional cash transfers that incentivized health behaviors.

[7] Given the lower rates of attendance of girls in rural Mexico, the policy's intention was to provide additional incentives to girls (Skoufias, 2005). Skoufias and Parker (2001), Behrman et al. (2009), and Behrman et al. (2011) cover additional program details in depth.

Table 3.1 Monthly Amount of Educational Transfers to Beneficiary Households

                      1997              1998              2003
                   Boys    Girls     Boys    Girls     Boys    Girls
Primary
  3rd year           60      60        70      70       105     105
  4th year           70      70        80      80       120     120
  5th year           90      90       100     100       155     155
  6th year          120     120       135     135       210     210
Secondary
  1st year          175     185       200     210       305     320
  2nd year          185     205       210     235       320     355
  3rd year          195     205       220     625       335     390
High School
  1st year            -       -         -       -       510     585
  2nd year            -       -         -       -       545     625
  3rd year            -       -         -       -       580     660

Notes: 1. Amounts (in pesos) are for the second semester of the year. 2. Grants extended to high school in 2001.

The program was initially implemented in 506 rural localities from the states of Guerrero, Hidalgo, Michoacán, Puebla, Querétaro, San Luis Potosí and Veracruz.
320 localities (the "treatment group") were randomly assigned to start receiving benefits in the Spring of 1998. 186 localities were kept as a control group and started receiving Progresa benefits at the end of 1999. This randomized variation has allowed for rigorous evaluations of the program's effects on a wide range of outcomes. For instance, studies have found that Progresa improved educational outcomes and decreased child work (Skoufias and Parker, 2001; Schultz, 2004; Behrman et al., 2011), reduced infant and elderly mortality (Barham and Rowberry, 2013; Barham, 2011), increased investment in farm assets (Gertler et al., 2012), and improved health and nutrition across a number of dimensions (Hoddinott and Skoufias, 2004; Fernald et al., 2008b,a; Barber and Gertler, 2008; Fernald et al., 2008c; Gertler, 2004).

Like these studies, we take advantage of the random assignment and treat Progresa as an exogenous shock to the cost of schooling. We also exploit additional variation in years of treatment exposure across cohorts. We follow the majority of previous studies in utilizing the extensive margin of program exposure and ignoring actual receipt of transfers or specific grant amounts, which depend on fertility and other endogenous characteristics and decisions of the household. However, it should be noted that the vast majority of households eligible for the program actually did receive benefits (Hoddinott and Skoufias, 2004). [8] Because only households who were classified as poor by the program administration were eligible to receive the benefits from the program, we focus, as many previous studies do, on this subset of the population in our analysis. The next section describes the surveys conducted as part of this program and identifies the specific datasets and variables used in this study.
[8] Hoddinott and Skoufias (2004) report that only 5% of the households in treatment localities who were defined as eligible to receive benefits and formally included in the program in 1998 had not received any benefits by March 2000.

3.3 Data

3.3.1 Progresa Data

The data collected for the Progresa program include a baseline survey of all households in Progresa villages (not just eligible poor households) in October 1997 and follow-ups every six months thereafter for the first three years of the program (1998 to 2000). These surveys collect detailed information on many indicators related to household demographics, education, health, expenditures, and income.

To evaluate the medium-term impact of the program, a new follow-up survey was carried out in 2003 in all 506 localities that were part of the original evaluation sample. By that time all localities that had participated in the baseline survey as control localities had also received the treatment. Like previous surveys, the 2003 wave contains detailed information on household demographics and individual socioeconomic, health, schooling and employment outcomes. A follow-up survey was also conducted in 2007, but we do not use this wave due to high attrition rates. [9]

We use data from the first survey and the survey carried out in 2003, focusing only on households who were eligible for the program ("poor" households). We construct different education outcomes using the information provided by the 2003 follow-up survey. Similarly, based on the findings of Behrman and Todd (1999) and Skoufias and Parker (2001), we also construct control variables related to parental characteristics, demographic composition of the household, and community level characteristics using the baseline survey. We focus on individuals in poor households aged 12 to 18 in 2003.
We restrict to these ages because 12-year-olds are the youngest cohort for which there is differential exposure to Progresa in treatment and control villages (see Table B1), while individuals over 18 are more likely to have moved out of the household by the 2003 survey and are therefore not surveyed. 10 While survey respondents (usually mothers or grandmothers) are still asked some questions about non-resident individuals, these responses are likely to introduce greater measurement error, potentially correlated with our regressors of interest. To avoid this issue, which is particularly problematic for our employment outcomes (which are missing for non-resident household members), we exclude individuals over 18 years old. Following Behrman et al. (2011), we also drop individuals who have non-matching genders across the 1997 and 2003 waves, as well as those who report birth years that differ by more than 2 years. For those whose reported birth years do not match but differ by less than 2 years, we use the birth year reported in the 1997 wave.

9 We lose over half of our 2003 sample, partially due to household-level attrition, but primarily due to individual migration (no proxy information is collected for those no longer living in the originally surveyed household), which is likely to be endogenous. This unfortunate feature of the 2007 data has resulted in its limited use in the literature: the few studies that do use the 2007 data (for example, Behrman et al. (2008) and Fernald et al. (2009)) focus exclusively on Progresa's health effects on a much younger cohort, for whom migration is less of an issue.

3.3.2 Rainfall Data

We exploit variation in early life rainfall to identify changes in early-life circumstances not correlated with the initial conditions of the parents. We use rainfall data from local weather stations collected by Mexico's National Meteorological Service (CONAGUA) and match those rainfall stations to program
localities using their geocodes. Due to changes in the use of weather stations as well as irregular reporting by some stations, there are some localities for which the nearest rainfall station has missing observations during the period of time relevant for our study. To deal with this issue, we use data from all of the stations within a 20 kilometer radius of the locality. Then, we take a weighted average of rainfall from these nearby stations, weighting each value by the inverse of the distance between that station and the locality. 11 Using this procedure, 69 of the 506 localities were still missing rainfall measurements for our study period. Thus, our final sample, after excluding individuals missing rainfall for their particular year of birth and restricting to those from poor households in our desired age group who meet the data quality requirements, consists of individuals from 420 localities.

10 As Figure B1 shows, the proportion of 19-year-olds not living in the household is over 40%, and this proportion continues to grow with age.

3.3.3 Outcome Variables

Our main education outcome variables include continuous years of schooling, a dummy for grade progression, and a dummy for having completed the appropriate years of schooling for one's age. Given the fairly young age restrictions of our sample, the latter two variables are used as potentially more appropriate variables for individuals who have yet to complete their schooling. Educational attainment is constructed using information on the last grade-level achieved in 2003. 12 "Grade progression" is a binary variable equal to 1 if an individual progressed at least five complete grades between 1997 and 2003. We also define an indicator for age-appropriate grade completion. This is equal to 1 if an individual completed the appropriate years of schooling for their age. For an individual who is 7 years old, we expect them to have completed one year of schooling, for an 8-year-old, two years, and so on.
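The station-averaging step described in section 3.3.2 can be illustrated with a minimal sketch. It assumes station-to-locality distances (in km) have already been computed from the geocodes; the function name, the input layout, and the handling of a station at distance zero are hypothetical choices, not taken from the study.

```python
def idw_rainfall(stations, radius_km=20.0):
    """Inverse-distance-weighted rainfall for one locality and one period.

    stations: list of (distance_km, rainfall_mm) pairs; rainfall_mm is None
    when the station did not report. Returns None if no usable station lies
    within the radius (the locality stays missing for that period).
    """
    usable = [(d, r) for d, r in stations if r is not None and d <= radius_km]
    if not usable:
        return None
    # Assumption: a station located exactly at the locality supplies the value.
    for d, r in usable:
        if d == 0:
            return float(r)
    raw_weights = [1.0 / d for d, _ in usable]
    total = sum(raw_weights)  # normalizing makes the weights sum to 1
    return sum(w / total * r for w, (_, r) in zip(raw_weights, usable))


# Two usable stations at 5 km and 10 km get weights 2/3 and 1/3.
example = idw_rainfall([(5.0, 100.0), (10.0, 40.0), (25.0, 999.0), (8.0, None)])
```

The station 25 km away is ignored because it lies outside the radius, and the non-reporting station is skipped rather than treated as zero rainfall.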
In order to study differential effects by grade, we also use 12 dummy variables, each indicating whether the individual received at least 3, 4, and up to 12 years of schooling. For individuals who are 18 years old in 2003, we also look at continued enrollment and employment outcomes. Specifically, we create indicators for whether an individual is still enrolled in school (after having received a high school degree). Similarly, we are interested in whether an individual was employed in the past week, employed in the past year, and employed in a non-laborer job in the past year. This last variable attempts to separate the lowest skill and least stable jobs from the rest of the employment categories (by grouping those working as spot laborers with the unemployed).

11 Weights are normalized to sum to 1.

12 Students with complete primary education have a maximum of 6 years of schooling; junior high school adds a maximum of three additional years; and high school three years more. College education adds a maximum of five additional years of schooling and graduate work an additional one. We do not count years in preschool and kindergarten.

3.3.4 Progresa Exposure Variable

Our two independent variables of interest represent two types of shocks: an early-life endowment shock and an investment shock. The investment shock we use is the Progresa program. In particular, we calculate the years an individual was exposed to Progresa, which depends on their locality (treatment or control status) and age. Table B1 shows, for each birth cohort, the number of years of exposure to Progresa by treatment status, calculated by first computing the number of months of exposure, dividing by 12, and rounding to the nearest year.
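The schooling indicators defined in section 3.3.3 can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the rule "expected years = age minus 6" is an assumption inferred from the examples in the text (one year at age 7, two years at age 8).

```python
def grade_progression(years_1997, years_2003, min_grades=5):
    """1 if the individual progressed at least `min_grades` grades between waves."""
    return int(years_2003 - years_1997 >= min_grades)


def age_appropriate_completion(age, years_schooling, offset=6):
    """1 if completed schooling is at least the expected amount for that age.

    Assumption: expected years = age - offset, so age 7 -> 1, age 8 -> 2, ...
    """
    expected = max(age - offset, 0)
    return int(years_schooling >= expected)


def attainment_dummies(years_schooling, thresholds=range(3, 13)):
    """Indicators for having received at least k years, for each threshold k."""
    return {k: int(years_schooling >= k) for k in thresholds}
```

For example, a 14-year-old with 6 completed years would be coded 0 for age-appropriate completion under this rule, since 8 years would be expected.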
For the majority of cohorts, the difference between treatment and control exposure is 2 years, but the difference is only 1 year for the youngest cohort with any differential exposure at all (who aged into the program) and the oldest cohort with differential exposure (because the control group aged out at the end of 1999, and started receiving benefits when the program was expanded to include high school in 2001). Creating a continuous years of exposure variable takes advantage of the variation in exposure lengths across different age cohorts within the treatment and control groups, in addition to the exogenous variation generated by the randomization of the Progresa program.

3.3.5 Rainfall Shock Variable

For our early life shock, we use annual rainfall during an individual's calendar year of birth in their locality of residence in 1997. 13 To calculate the rainfall levels, we simply sum all monthly rainfall during an individual's calendar year of birth. We do not use month of birth to define this annual shock because in our sample, approximately 30% of individuals report different birth months in the 1997 and 2003 surveys. In robustness checks (not shown here but available on request), we find that our results using calendar-year annual rainfall are very similar to results using the sum of monthly rainfall from the 6 months before and 6 months after birth (using either the 1997 reported month or 2003 reported month). This suggests that most of the effects we find are coming from input shocks in the latest prenatal and earliest post-natal months.

13 The data does not include locality of birth, which would be the ideal geographic identifier in this context. We therefore use locality of residence (as of 1997), which should be equivalent for most of the individuals in our sample, as migration is minimal due to their young ages.
Our interest is not in the absolute level of rainfall itself, but rather in a measure of rainfall that maps best to household incomes at the time of birth (and therefore to a child's biological endowment). Specifically, we define a shock as a level of rainfall that is more than one standard deviation above or below the locality-specific mean (calculated over the 10 years prior to the birth year). In our analysis, we use a "normal rainfall" dummy in order to represent the absence of a negative shock (for ease of interpretation of the interaction coefficients). This dummy equals 1 if the rainfall in an individual's locality during their year of birth fell within a standard deviation of the locality-specific historical mean. We use this relative measure instead of an absolute measure of rainfall in order to capture the fact that the same amount of rainfall may have different consequences for different regions based on average rainfall levels. As we discuss in detail in section 3.4, both previous literature as well as our own data show that defining the shock variable in this way captures the relationship between rainfall and agricultural wages: normal years are associated with better outcomes than shock years. It is also important to note that this shock variable eliminates much (but not all) of the spatial correlation that typically poses a problem in studies of rainfall, a highly spatially correlated variable. This is illustrated in Figure 3.1, which maps all Progresa localities by their rainfall status. Black dots represent localities that experienced a rainfall shock (according to our definition) in 1987, while gray crosses represent those that experienced normal rainfall in that same year. We see a great deal of variation within states, and even within clusters of neighboring localities, in the rainfall shock variable. 14 We show only one year in Figure 3.1 for illustrative purposes, and chose 1987 because it is the birth year of the largest number of individuals in our sample.
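The construction of the "normal rainfall" dummy described above can be sketched as below. The function names and the dictionary input are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which variant is used.

```python
import statistics


def normalized_rainfall(annual_by_year, birth_year, window=10):
    """Standardize birth-year rainfall by the mean and SD of the prior `window` years.

    annual_by_year: dict mapping calendar year -> total annual rainfall (mm)
    for one locality.
    """
    history = [annual_by_year[y] for y in range(birth_year - window, birth_year)]
    mean = statistics.mean(history)
    sd = statistics.stdev(history)  # assumption: sample standard deviation
    return (annual_by_year[birth_year] - mean) / sd


def normal_rainfall_dummy(z):
    """1 if rainfall was within one SD of the historical mean, 0 in a shock year."""
    return int(abs(z) <= 1.0)


# Hypothetical locality: rainfall alternates between 900 and 1100 mm in 1977-1986.
series = {year: (900.0 if year % 2 else 1100.0) for year in range(1977, 1987)}
series[1987] = 1200.0
z_1987 = normalized_rainfall(series, 1987)
```

In this made-up series the historical mean is 1000 mm, so 1200 mm in the birth year lies well over one standard deviation above it and the dummy is 0 (a shock year).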
This exercise also maps well to our estimating equation, which includes birth year fixed effects and is accordingly identified using within-birth-year variation. In the Appendix, Figure B2 uses rainfall from all birth years. Since we ultimately care about the interaction between rainfall and Progresa exposure, it is also important to note that for both treatment and control villages, we still see substantial variation in rainfall shock status, even within small geographic areas, as shown in Figure 3.2.

Table 3.2 Summary Statistics for Individual-Level Variables in 2003

Individual Variables              Full Sample    Treatment      Control        Treatment - Control
                                                 Villages       Villages       Differences

12 to 18-year-olds
Educational Attainment            6.786          6.847          6.692          0.154***
                                  (2.109)        (2.094)        (2.128)        (0.0397)
Grade Progression                 0.579          0.591          0.561          0.0295***
                                  (0.494)        (0.492)        (0.496)        (0.00955)
Appropriate Grade Completion      0.465          0.479          0.442          0.0366***
                                  (0.499)        (0.500)        (0.497)        (0.00939)
Number of individuals             11829          7193           4636

18-year-olds
Currently Enrolled w/ HS Degree   0.0607         0.0584         0.0641         -0.00574
                                  (0.239)        (0.235)        (0.245)        (0.0122)
Worked this Week                  0.502          0.514          0.485          0.0290
                                  (0.500)        (0.500)        (0.500)        (0.0301)
Worked this Year                  0.532          0.543          0.515          0.0284
                                  (0.499)        (0.498)        (0.500)        (0.0301)
Worked in Non-Laborer Job         0.354          0.356          0.351          0.00511
                                  (0.479)        (0.479)        (0.478)        (0.0288)
Number of individuals             1597           942            655

Notes: Standard errors in parentheses (*** p<0.01, ** p<0.05, * p<0.1).
Variable definitions:
- Educational attainment: years of schooling
- Grade progression: 1(progressed 5 grades between 1997 and 2003)
- Appropriate grade completion: 1(completed the age-appropriate years of schooling, e.g., 1 for age 7, 2 for age 8, etc.)
- Currently enrolled w/ HS degree: 1(still enrolled in school after having received a high school degree)
- Worked last week: 1(worked in the week before survey)
- Worked last year: 1(worked in year before survey)
- Worked in non-laborer job: 1(worked in year before survey at a job other than as a spot laborer)

Figure 3.1 Progresa Localities by Rainfall Shock in 1987

Figure 3.2 Progresa Localities by Treatment Status and Rainfall Shock in 1987 (panels: Treatment Localities, Control Localities)

3.3.6 Summary Statistics

Table 3.2 reports summary statistics for individual-level variables from the 2003 survey for our sample of interest: individuals aged 12 to 18 (and for employment outcomes, only those aged 18) living in households eligible for Progresa. 15 Average educational attainment is 6.8 years for the pooled sample, with individuals in treatment villages receiving on average 0.154 more years of schooling than those in control villages. This difference is significant at the 1% level. Similarly, the proportion of children who progressed at least 5 grades from 1997 to 2003 and the proportion that completed the appropriate number of years of schooling for their age are significantly higher in the treatment villages. Note that employment outcomes for 18-year-olds do not appear to be impacted significantly by treatment on average. In the next section, we outline how we analyze these differences in more robust specifications, controlling for covariates and taking into account heterogeneous impacts for individuals with different endowments. Table 3.3 reports summary statistics for the variables related to our two shocks, Progresa exposure and rainfall.
Years of Progresa exposure, annual rainfall during the year of birth, and occurrence of a rainfall shock all vary at the locality x birth year level. Summary statistics are calculated accordingly and reported in two panels, one for the full sample and one for a trimmed sample described below. By experimental design, treatment villages were exposed to Progresa for longer than control villages. On average, treatment individuals received 1.9 more years of Progresa (the treatment-control difference is 2 years for the majority of cohorts, but 1 for the youngest and oldest cohorts, as shown in Table B1). Mean rainfall, both in raw levels and in normalized terms, is not significantly different across treatment and control villages. However, there appears to be a small but statistically significant difference in the prevalence of a one-standard deviation shock between treatment and control villages. Since Progresa treatment was randomly allocated and rainfall is exogenous, this difference in the prevalence of a shock does not necessarily indicate an identification issue (especially because, as we describe in section 3.4, we control for the main effects of Progresa and rainfall and focus on the sign of the interaction). However, this imbalance could be problematic if it resulted from a lack of common support across the treatment and control rainfall distributions. Accordingly, we verify in Figure 3.3 that the rainfall distributions for treatment and control localities indeed share a common support and are actually quite similar overall. Moreover, looking at Figure 3.2, it is clear that though there are more shocks in the treatment group, the spatial distribution of rainfall shocks is similar across the two groups (and both are quite dispersed).

14 While it may be surprising to see some localities situated so close together take on different values for this shock variable, we are able to detect these differences because of the large number of rainfall stations (most localities have several stations within 20km) as well as our use of inverse-distance weighting, which assigns different rainfall values to even very closely situated localities.

15 In this table, as in the rest of the analysis, we restrict to individuals who satisfy the data quality requirements described in section 3.3.1.

Table 3.3 Summary Statistics for Shock Variables

                                  Full Sample    Treatment      Control        Treatment - Control
                                                 Villages       Villages       Differences

A. Full Sample
Years of Progresa exposure        4.841          5.574          3.695          1.879***
                                  (1.168)        (0.727)        (0.720)        (0.0296)
Annual rainfall                   1182.4         1180.6         1185.3         -4.752
                                  (644.3)        (654.8)        (628.0)        (26.32)
Normalized rainfall               -0.0704        -0.0539        -0.0962        0.0423
                                  (0.812)        (0.792)        (0.841)        (0.0332)
Rainfall Shock                    0.242          0.223          0.272          -0.0483***
                                  (0.428)        (0.417)        (0.445)        (0.0175)
Number of locality x birth-year
observations                      2519           1536           983

B. Trimmed Sample
Years of Progresa exposure        4.812          5.576          3.707          1.869***
                                  (1.166)        (0.724)        (0.707)        (0.0313)
Annual rainfall                   1181.1         1171.1         1195.5         -24.43
                                  (644.0)        (654.8)        (628.0)        (28.12)
Normalized rainfall               -0.0667        -0.0511        -0.0891        0.0379
                                  (0.844)        (0.833)        (0.859)        (0.0368)
Rainfall Shock                    0.277          0.266          0.294          -0.0279
                                  (0.448)        (0.442)        (0.456)        (0.0195)
Number of locality x birth-year
observations                      2170           1282           888

Notes: Standard errors in parentheses (*** p<0.01, ** p<0.05, * p<0.1).
Variable definitions:
- Rainfall shock: 1(Normalized rainfall greater than 1 or less than -1)
- Normalized rainfall: Total annual rainfall, standardized using locality-specific, 10-year historical mean and standard deviation
- Annual rainfall: Total annual rainfall in mm

Figure 3.3 Normalized Rainfall Distributions in Treatment and Control Villages
Notes: Rainfall levels are normalized using each locality's location-specific historical mean and standard deviation.

Nevertheless, in order to alleviate concerns that this imbalance is driving our results, we also trim the sample by excluding localities that could be considered outliers.
That is, we drop any localities that either experienced no rainfall shocks throughout the sample period or experienced rainfall shocks in every year throughout the period, noting that such localities would not contribute to coefficient estimates. As shown in Panel B of Table 3.3, this trimming results in a sample of balanced rainfall shocks across treatment and control. Figure 3.4, which maps this trimmed sample, is not noticeably different from Figure 3.2, emphasizing that this trimming did not substantially change the distribution of rainfall shocks (by removing localities only from a particular area, for example). In the Appendix, we repeat our main empirical analysis using the trimmed sample and show that our results remain nearly unchanged.

Figure 3.4 Progresa Localities by Treatment Status and Rainfall Shock in 1987, Trimmed Sample (panels: Treatment Localities, Control Localities)

Despite the randomized nature of the Progresa experiment, previous literature has found that some household-level and locality-level characteristics are not fully balanced across treatment and control villages (Behrman and Todd, 1999). For this reason, in keeping with empirical methods used in previous studies of Progresa impacts, we include a rich set of controls that are summarized in Appendix Table B2. At the household level, the sample is fairly balanced across the groups with the exception of household head age, several household composition variables, two parental education variables, and father's language. At the locality level, access to a public water network as well as garbage disposal techniques are significantly different across treatment and control villages, at the 10% level. We control for all of these household and locality-level variables in our regression analysis, which we outline in the following section.
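The trimming rule described above (dropping localities with no shock years or only shock years) can be sketched as a simple filter. The function name and the dictionary layout are hypothetical; this illustrates the rule as stated in the text rather than the code actually used.

```python
def trim_localities(shocks_by_locality):
    """Keep localities whose shock indicator varies over the sample period.

    shocks_by_locality: dict mapping locality id -> list of 0/1 rainfall-shock
    indicators, one per birth year in the sample.
    """
    return {
        loc: shocks
        for loc, shocks in shocks_by_locality.items()
        if 0 < sum(shocks) < len(shocks)  # neither all zeros nor all ones
    }


kept = trim_localities({
    "never_shocked": [0, 0, 0, 0],
    "always_shocked": [1, 1, 1, 1],
    "mixed": [0, 1, 0, 0],
})
```

Only the locality with at least one shock year and at least one normal year survives the filter.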
In the Appendix, we run additional specifications that control for the interaction of these unbalanced controls with the rainfall shock and find that this does not substantially change our results. 16

16 Similar to the strategy used in Acemoglu et al. (2004), this ensures that the unbalanced characteristics do not confound the estimate of our treatment-rainfall interaction.

3.4 Empirical Strategy

We use rainfall during an individual's year of birth as a shock to that individual's biological endowment. Maccini and Yang (2009) have shown that early-life rainfall shocks can impact adult outcomes like health and educational attainment, and this operates through the positive impact rainfall has on agricultural output in rural settings. Increased household income means increased nutritional availability for the fetus or infant during a crucial stage of development, which could lead to improved physical health and cognitive ability. Like the Indonesian villages in Maccini and Yang (2009), the Progresa villages are also rural, suggesting that rainfall also serves as an important income shock to these communities. Bobonis (2009) confirms that negative rainfall shocks have a large negative impact on household expenditures in rural Mexico. Unlike in Indonesia, however, where the relationship between rainfall and income appears to be more monotonic, Bobonis (2009) finds that expenditures can be negatively impacted by large deviations from the mean in either direction. Specifically, he finds that rainfall shocks, defined as monthly rainfall above or below one standard deviation from the historical mean, reduce household expenditures by 16.7%. In the same setting as Bobonis (2009), we allow for droughts and floods to both have negative impacts on household income. Using locality-level wages reported by village leaders in the Progresa data, we show graphically that this is indeed the appropriate relationship to use.
Figure 3.5 depicts the relationship, using lowess smoothing, between average male wages from the 2003 surveys and rainfall in that same year, normalized using the locality-specific 10-year historical mean and standard deviation. The clear inverted U-shape, which peaks at around zero, shows that wages are highest around the locality mean but fall at the tails of the rainfall distribution. Motivated by this figure and the prior literature, we define a negative shock as a realized rainfall level that is over one standard deviation above or below the locality-specific mean calculated over the 10 years prior.

Figure 3.5 Locality Wages
Notes: Dashed lines represent 95% confidence intervals, calculated from 1000 bootstrap replications.

Our investment shock, which is the total number of years of Progresa exposure, also depends on the year of birth and locality of residence during the Progresa program. The rainfall shock, years of exposure, and their interaction form the basis of our empirical specification. For individual $i$, living in state $s$ and locality $l$ in 1997, born in year $t$, the education or employment outcome $y_{islt}$ can be expressed as follows:

$$y_{islt} = \beta_1 R_{slt} + \beta_2 P_{slt} + \alpha' X_{islt} + \mu_s \times \delta_t + \epsilon_{islt} \qquad (3.1)$$

where $R_{slt}$ represents a normal rainfall dummy, indicating that rainfall during the individual's year of birth was within one standard deviation of the ten-year locality-specific mean. In order for this variable to be interpreted as a positive endowment shock (in the same way Progresa is seen as a positive investment shock), we use a 1 to indicate a normal year (or absence of a shock) and 0 to indicate a shock year. $P_{slt}$ represents the number of years of Progresa exposure, which varies across treatment and control villages as well as across different birth cohorts within villages. Our basic specification includes state x birth year fixed effects ($\mu_s \times \delta_t$).
In some specifications we add municipality fixed effects, which are the smallest set of geographic fixed effects we can use, given that one of our primary sources of exogeneity (the Progresa randomization) varies at the locality level. In our base specification, we cluster our standard errors at the municipality level, which is a larger administrative unit than the locality. In addition to this, we also show standard errors that adjust for spatial correlation (unrelated to administrative boundaries) using the method described in Conley (1999). As discussed in section 3.3.5, using a rainfall shock dummy instead of rainfall levels reduces the spatial correlation in our independent variable of interest, but we correct our standard errors for any spatial correlation that may remain. We show two sets of standard errors that allow for spatial correlation. First, we allow for dependence between observations located less than 100km apart, but no dependence between those further than that. Our second weighting function allows for dependence between observations up to 500km apart. For both of these standard errors, we impose a weight that decreases linearly in distance until it hits zero at the relevant cutoff point. In keeping with previous work on Progresa (Skoufias and Parker, 2001; Schultz, 2004; Behrman et al., 2011), we include a rich set of controls in order to account for some significant differences across treatment and control villages that exist despite the randomization. All of our specifications include controls for individual gender, household size, household head age, household head gender, household composition variables, 17 as well as locality controls for water source type, garbage disposal methods, the existence of a public phone, hospital or health center, and a DICONSA store in the locality. 18 In the Appendix, we show specifications that include interactions between the rainfall shock and each of the characteristics that are not balanced across treatment and control.
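The linearly declining spatial weight described above can be written as a short function. This is only a sketch of the weighting kernel as stated in the text, with a hypothetical function name; it is not a full implementation of the Conley (1999) variance estimator.

```python
def linear_distance_weight(distance_km, cutoff_km):
    """Weight for a pair of observations: 1 at zero distance, falling linearly
    to 0 at the cutoff, and 0 for any pair farther apart than the cutoff."""
    if distance_km >= cutoff_km:
        return 0.0
    return 1.0 - distance_km / cutoff_km


# The same pair of localities 50 km apart under the two cutoffs used in the text.
w_100 = linear_distance_weight(50.0, 100.0)
w_500 = linear_distance_weight(50.0, 500.0)
```

Under the 100km cutoff a pair 50km apart receives half weight, while under the 500km cutoff the same pair receives a weight of 0.9, so the wider cutoff allows for more dependence between nearby observations as well as between distant ones.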
Although parental education and language (specifically, a dummy for whether the parents speak the indigenous language) are important controls (Skoufias and Parker, 2001; Schultz, 2004; Behrman et al., 2011), these are missing for 30% and 10% of the sample, respectively. Similarly, distance to secondary school and distance to bank are missing for 58% and 12% of localities, respectively. In order to include these variables without reducing sample size, we control for missing values instead of dropping missing observations. Parental education and parental language are represented by a set of dummy variables, with the omitted category representing a dummy for missing. 19 Similarly, distance to bank and distance to secondary school are set to zero for missing observations but missing dummies for each variable are added to the specification.

17 These include counts of the number of children aged 0-2, children aged 3-5, males aged 6-7, males aged 8-12, males aged 13-18, females aged 6-7, females aged 8-12, females aged 13-18, females aged 19-54, females aged 55 and over, and males aged 55 and over.

In equation 3.1, the coefficient on the normal rainfall dummy, $\beta_1$, represents the average effect of a positive early-life shock on our outcomes of interest, while the coefficient on Progresa exposure, $\beta_2$, represents the average effect of a positive investment shock: specifically, we measure the effect of one more year of exposure to Progresa, which incentivized and decreased the opportunity cost of schooling. This specification, however, does not measure potential heterogeneity in the effect of the investment shock on individuals with different endowments. The following specification adds an interaction term to measure precisely this heterogeneity:

$$y_{islt} = \beta_1 R_{slt} + \beta_2 P_{slt} + \beta_3 R_{slt} P_{slt} + \alpha' X_{islt} + \mu_s \times \delta_t + \epsilon_{islt} \qquad (3.2)$$

Now, $\beta_1$ represents the main effect of a positive early-life income shock, and $\beta_2$ represents the effect of a positive investment shock for individuals who did not experience a positive rainfall shock. $\beta_2 + \beta_3$
represents the total effect of the Progresa shock on individuals who also experienced a positive rainfall shock, and the interaction coefficient $\beta_3$ therefore gives us the differential effect of Progresa for the higher endowment individuals (who experienced a positive shock). If $\beta_3$ is positive, this would suggest that Progresa had a larger effect for higher endowment individuals than lower endowment individuals, while a negative $\beta_3$ would suggest the opposite: that Progresa helped to mitigate the negative impact of an early life shock.

18 DICONSA stores, operated by the Ministry of Social Development, are responsible for distributing the nutritional supplements that are part of the health component of Progresa.

19 For parental education, the included dummies are less than primary school completion, completion of primary school, and completion of secondary school; for parental language, the included dummies are a dummy for speaking the indigenous language and a dummy for not speaking the indigenous language.

3.5 Results

In this section, we report and discuss estimation results from the strategy discussed above. We begin with a graphical illustration of our results on education, which reflects the pattern found in the remainder of the empirical results. We then move on to present the results of the regression analysis for all outcomes, first discussing educational outcomes and then enrollment and employment outcomes for the oldest cohort of our sample. Finally, we discuss a number of checks to address concerns about selective fertility, attrition, and imbalance in the prevalence of rainfall shocks across treatment and control. Figure 3.6 illustrates the intuition underlying our identification strategy, using lowess smoothing to depict the non-monotonic relationship between rainfall at birth and educational attainment across treatment and control households, as well as in the pooled sample.
We first regress educational attainment on our full set of controls (state-by-birth year fixed effects, and all household and locality-level controls described in Section 3.4). We then plot non-parametrically the relationship between the educational attainment residuals on the y axis and normalized rainfall on the x axis. The solid line represents the relationship for the pooled sample, including both treatment and control villages, which had varying degrees of exposure to the Progresa experiment. We also examine the same education-rainfall relationships separately for treatment and control villages. The control group has an inverted U-shape, which reinforces the idea that extreme deviations from mean rainfall are harmful for children. Comparing the dotted control group line to the dashed treatment line, there are two important features to note. First, the treatment line is above the control line across the entire range of rainfall deviations. Consistent with our summary statistics and previous work on Progresa, education outcomes are improved for those exposed longer to Progresa. Second, the distance between the treatment and control lines is smallest around a normalized rainfall deviation of zero and grows larger in the tails (below and above one standard deviation, depicted by the vertical lines). Furthermore, the treatment line is essentially flat, as compared to the control line, indicating that Progresa exposure successfully mitigates the impacts of extreme rainfall at birth on educational attainment.

Figure 3.6 Years of Educational Attainment by Rainfall in Year of Birth
Notes: All three lines represent the lowess-smoothed educational attainment residuals for the relevant group, calculated after regressing educational attainment on state by birth-year fixed effects and the control variables described in section 3.4. Vertical lines depict one standard deviation above and below the mean of normalized rainfall, which is trimmed at the 5th and 95th percentiles.
3.5.1 Education Results

The following tables report the analogous parametric regression estimates from the specifications discussed in Section 3.4. Panel A of Table 3.4 displays the results from specification 3.1, which includes only the main effects of rainfall and Progresa exposure. The first three columns show the regression results from our base specification, which includes state-by-year fixed effects and household and locality controls. 20 For each coefficient of interest, we report three standard errors: first, clustered at the municipality level; second, allowing for spatial correlation using a 100km cutoff; and third, allowing for spatial correlation using a 500km cutoff. The results in column 1 show that one year of Progresa exposure leads individuals to obtain 0.129 more years of schooling on average: this effect is significant at the 5% level. Multiplying this coefficient by 1.5 years (the number of years between the treatment and control villages' first exposure to Progresa), we obtain a treatment effect of 0.1935 years, which is consistent with previous work by Behrman et al. (2009, 2011), which estimated a treatment effect of 0.2 years (using a slightly different sample). Individuals who did not experience a negative rainfall shock at birth show a similarly sized boost in educational attainment of 0.102 years, marginally significant using the first two types of standard errors reported. Since our sample includes children who may not have completed their schooling yet, we also look at the two other variables that adjust for age. Grade progression is positively impacted by both years of exposure and normal rainfall, although these coefficients are generally not significant at the 5% level. In column 3, we see that Progresa and normal rainfall have positive and significant impacts on appropriate grade completion. In the specification with municipality fixed effects, none of the main effects are significant at the 5% level.
[20] Because these results are very similar to those from a simplified specification that only includes the state-by-year fixed effects, gender, and household size, we only report results using the more complete set of controls.

Table 3.4 Effects of Progresa and Rainfall on Education Outcomes
Effects of Rainfall and Progresa on Educational Attainment for Ages 12-18
Dependent variable | Years of Education | Grade Progression | Appropriate Grade Completion | Years of Education | Grade Progression | Appropriate Grade Completion
Column | (1) | (2) | (3) | (4) | (5) | (6)
Panel A: Main Effects Only
Years of Progresa Exposure | 0.129 | 0.0145 | 0.0167 | 0.0423 | -0.00819 | -0.00610
  | (0.0365)*** | (0.00960) | (0.00740)** | (0.0462) | (0.0121) | (0.0109)
  | [0.0257]*** | [0.00628]** | [0.00638]*** | [0.0327] | [0.00849] | [0.00817]
  | {0.0205}*** | {0.00543}*** | {0.00660}** | {0.0314} | {0.00774} | {0.00918}
No Rainfall Shock | 0.102 | 0.0119 | 0.0272 | 0.0664 | -0.000747 | 0.0205
  | (0.0557)* | (0.0145) | (0.0117)** | (0.0539) | (0.0138) | (0.0110)*
  | [0.0617]* | [0.0147] | [0.0133]** | [0.0499] | [0.0124] | [0.0120]*
  | {0.0677} | {0.0154} | {0.0146}* | {0.0487} | {0.0123} | {0.0117}*
Panel B: Main Effects and Interaction
Years of Progresa Exposure | 0.217 | 0.0304 | 0.0315 | 0.145 | 0.0107 | 0.0136
  | (0.0546)*** | (0.0132)** | (0.0110)*** | (0.0582)** | (0.0149) | (0.0140)
  | [0.0456]*** | [0.0111]*** | [0.00970]*** | [0.0428]*** | [0.0112] | [0.0112]
  | {0.0562}*** | {0.0118}** | {0.00909}*** | {0.0435}*** | {0.0106} | {0.0108}
No Rainfall Shock | 0.648 | 0.111 | 0.120 | 0.703 | 0.116 | 0.142
  | (0.279)** | (0.0556)** | (0.0506)** | (0.267)*** | (0.0570)** | (0.0536)***
  | [0.271]** | [0.0583]* | [0.0487]** | [0.227]*** | [0.0484]** | [0.0477]***
  | {0.340}* | {0.0646}* | {0.0474}** | {0.247}*** | {0.0458}** | {0.0433}***
No Shock x Exposure | -0.112 | -0.0203 | -0.0189 | -0.130 | -0.0238 | -0.0248
  | (0.0531)** | (0.0109)* | (0.0102)* | (0.0509)** | (0.0114)** | (0.0107)**
  | [0.0528]** | [0.0121]* | [0.0102]* | [0.0435]*** | [0.00955]** | [0.00956]***
  | {0.0623}* | {0.0130} | {0.00911}** | {0.0452}*** | {0.00859}*** | {0.00813}***
Observations | 11824 | 11216 | 11824 | 11824 | 11216 | 11824
Mean of Dependent Variable | 6.787 | 0.579 | 0.465 | 6.787 | 0.579 | 0.465
Fixed Effects | Birth year x state | Birth year x state | Birth year x state | Birth year x state, Municipality | Birth year x state, Municipality | Birth year x state, Municipality
Notes:
- Standard errors clustered at the municipality are reported in parentheses, Conley standard errors using a 100km cutoff are reported in square brackets, and Conley standard errors using a 500km cutoff are reported in curly brackets. (*** p<0.01, ** p<0.05, * p<0.1).
- "No rainfall shock" = 1 for individuals whose birth-year rainfall was within one standard deviation of the 10-year historical locality-specific mean.
- All specifications include gender, household head gender and age, household size, household composition variables, parental education, parental language, and locality characteristics. Controls for parental language/education and locality distance include dummies for missing values.

These results, however, do not allow the investment shock to have heterogeneous impacts on individuals with different endowments. Panel B of Table 3.4 displays the results from specification 3.2. Again, columns 1 to 3 show the results with the baseline set of controls, while columns 4 to 6 add the municipality fixed effects. Again, we report three sets of standard errors, which are generally quite similar. For educational attainment in column 1, the main effects of Progresa and normal rainfall are positive and significant while the interaction is negative and significant, all at the 5% level (10% level when using the 500km Conley standard errors). The same pattern holds for grade progression and appropriate grade completion. Compared to the coefficients in Panel A, both the size and the significance of the main effects increase with the inclusion of the interaction. The coefficient on Progresa exposure in Panel B represents the effect of Progresa for those who experienced a negative rainfall shock. The fact that this is larger than the main effects in Panel A suggests that Progresa had a larger impact on those with a lower endowment, which is verified by the significant negative interaction terms.
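To make the interpretation of the Panel B coefficients explicit, the following is a sketch of the general form an interaction specification of this kind takes. The notation is an assumption made here for illustration and may differ from the symbols used in equation 3.2 of Section 3.4: $y_i$ is the education outcome, $\text{Exposure}_i$ is years of Progresa exposure, $\text{NoShock}_i$ is the normal-rainfall indicator, $X_i$ collects the household and locality controls, and $\mu_{s,t}$ is a state-by-birth-year fixed effect.

```latex
\begin{align*}
y_i &= \beta_1\,\text{Exposure}_i + \beta_2\,\text{NoShock}_i
      + \beta_3\,(\text{NoShock}_i \times \text{Exposure}_i)
      + X_i'\gamma + \mu_{s,t} + \varepsilon_i \\[4pt]
\frac{\partial y_i}{\partial\,\text{Exposure}_i}
    &= \begin{cases}
         \beta_1 & \text{if } \text{NoShock}_i = 0 \text{ (negative rainfall shock)} \\
         \beta_1 + \beta_3 & \text{if } \text{NoShock}_i = 1 \text{ (normal rainfall)}
       \end{cases}
\end{align*}
```

Under this form, a negative $\beta_3$ means that an additional year of exposure raises the outcome by less for children born in normal-rainfall years than for children born in shock years, which is the pattern reported in Panel B.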
Looking at the magnitude of our estimates, having normal rainfall during the year of birth increases schooling by 0.648 years in our base specification. Although Progresa increases educational attainment for lower-endowment individuals by 0.217 years, it only increases educational attainment for higher-endowment individuals by 0.105 years (still positive and significant), indicating that educational outcomes respond less for children with relatively high endowments. Looking at the specification with municipality fixed effects in columns 4 to 6, the pattern of the results is the same, with positive main effects and negative interaction effects, which here almost completely offset the positive main effects of Progresa. In the regressions on grade progression and appropriate grade completion, the main effects of Progresa are positive but not significant, likely due to lack of variation in treatment and control status within municipalities. Although municipality fixed effects are appealing in the sense that they control for location-specific unobservables on a finer level than state, the fact that over half of the municipalities consisted of either all treatment or all control villages reduces the amount of variation we can exploit. For this reason, we focus on the baseline specification (reported here in columns 1 through 3) for the remainder of the paper.

The large magnitudes of the interaction terms in all regressions suggest a large potential for policy interventions like Progresa to remediate inequalities in endowments. At 2 years of exposure (the average difference between treatment and control exposure), the program mitigated 35% of the disadvantage caused by the rainfall shock at birth in years of completed schooling. For grade progression and appropriate grade completion, the figures are similarly high: 37% and 32%, respectively. [21]

In Table 3.5, we look at schooling completion by grade.
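The magnitudes discussed above can be reproduced from the Panel B coefficients in columns 1 to 3 of Table 3.4. The sketch below is one reading of the calculation, stated as an assumption rather than as the authors' exact procedure: the Progresa effect for normal-rainfall children is the exposure coefficient plus the interaction, and the mitigated share at a given exposure is the interaction coefficient times years of exposure, divided by the normal-rainfall main effect. The function and dictionary names are hypothetical.

```python
# Coefficients transcribed from Table 3.4, Panel B, columns 1 to 3.
PANEL_B = {
    "years_of_education": {"exposure": 0.217, "no_shock": 0.648, "interaction": -0.112},
    "grade_progression": {"exposure": 0.0304, "no_shock": 0.111, "interaction": -0.0203},
    "appropriate_grade_completion": {"exposure": 0.0315, "no_shock": 0.120, "interaction": -0.0189},
}


def effect_for_normal_rainfall(coefs):
    """Per-year Progresa effect for children born in normal-rainfall years."""
    return coefs["exposure"] + coefs["interaction"]


def mitigated_share(coefs, years_of_exposure):
    """Share of the rainfall-shock disadvantage closed after the given exposure.

    Assumed reading: the gap between normal-rainfall and shock children is
    no_shock + interaction * exposure, so exposure closes
    -interaction * exposure of the initial gap of size no_shock.
    """
    return -coefs["interaction"] * years_of_exposure / coefs["no_shock"]


for outcome, coefs in PANEL_B.items():
    share = mitigated_share(coefs, years_of_exposure=2)
    print(f"{outcome}: effect if no shock = {effect_for_normal_rainfall(coefs):.4f}, "
          f"share mitigated at 2 years = {share:.1%}")
```

For years of education this gives 0.217 - 0.112 = 0.105 for higher-endowment children and a mitigated share of roughly 35% at 2 years of exposure, matching the figures in the text.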
[21] These proportions are calculated using the results from columns 1 to 3.

Table 3.5 Interaction Effects on Schooling Completion by Grade
Effects of Rainfall and Progresa on School Completion Dummies for Ages 12-18
School level | Primary School | Primary School | Primary School | Primary School | Secondary (Junior High) School | Secondary (Junior High) School | Secondary (Junior High) School | High School | High School | High School
Years completed | 3 yrs | 4 yrs | 5 yrs | 6 yrs | 7 yrs | 8 yrs | 9 yrs | 10 yrs | 11 yrs | 12 yrs
Column | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10)
Fixed Effects: Birth year x state
Years of Progresa Exposure | 0.00224 | 0.0120 | 0.0183 | 0.0226 | 0.0501 | 0.0456 | 0.0421 | 0.0132 | 0.00656 | 0.00280
  | (0.00380) | (0.00514)** | (0.00706)** | (0.00873)** | (0.0148)*** | (0.0135)*** | (0.0121)*** | (0.00637)** | (0.00303)** | (0.00208)
  | [0.00373] | [0.00494]** | [0.00629]*** | [0.00780]*** | [0.0115]*** | [0.0113]*** | [0.0102]*** | [0.00615]** | [0.00312]** | [0.00240]
  | {0.00410} | {0.00597}** | {0.00717}** | {0.00898}** | {0.0119}*** | {0.0118}*** | {0.0133}*** | {0.00570}** | {0.00328}** | {0.00240}
No Rainfall Shock | -0.0122 | 0.00896 | 0.0311 | 0.0365 | 0.167 | 0.166 | 0.157 | 0.0512 | 0.0405 | 0.0161
  | (0.0203) | (0.0283) | (0.0384) | (0.0473) | (0.0638)*** | (0.0618)*** | (0.0614)** | (0.0361) | (0.0189)** | (0.0136)
  | [0.0186] | [0.0281] | [0.0339] | [0.0420] | [0.0613]*** | [0.0608]*** | [0.0594]*** | [0.0286]* | [0.0194]** | [0.0159]
  | {0.0205} | {0.0337} | {0.0351} | {0.0485} | {0.0640}*** | {0.0633}*** | {0.0709}** | {0.0261}** | {0.0221}* | {0.0194}
No Shock x Exposure | 0.00203 | -0.00253 | -0.00515 | -0.00471 | -0.0267 | -0.0303 | -0.0291 | -0.00923 | -0.00637 | -0.00273
  | (0.00401) | (0.00536) | (0.00718) | (0.00897) | (0.0127)** | (0.0118)** | (0.0117)** | (0.00716) | (0.00356)* | (0.00251)
  | [0.00369] | [0.00529] | [0.00657] | [0.00799] | [0.0126]** | [0.0126]** | [0.0119]** | [0.00599] | [0.00367]* | [0.00297]
  | {0.00399} | {0.00614} | {0.00674} | {0.00872} | {0.0123}** | {0.0125}** | {0.0136}** | {0.00535}* | {0.00401} | {0.00360}
Observations | 11824 | 11824 | 11824 | 11824 | 11824 | 11824 | 11824 | 11824 | 11824 | 11824
Mean of Dependent Variable | 0.970 | 0.935 | 0.881 | 0.785 | 0.484 | 0.369 | 0.260 | 0.0610 | 0.0308 | 0.0123
Fixed Effects: Birth year x state, Municipality
Years of Progresa Exposure | 0.000162 | 0.00638 | 0.00807 | 0.00948 | 0.0333 | 0.0345 | 0.0347 | 0.0111 | 0.00320 | 0.00312
  | (0.00454) | (0.00566) | (0.00810) | (0.00984) | (0.0180)* | (0.0166)** | (0.0126)*** | (0.00644)* | (0.00321) | (0.00211)
  | [0.00430] | [0.00530] | [0.00631] | [0.00807] | [0.0113]*** | [0.0113]*** | [0.00990]*** | [0.00628]* | [0.00338] | [0.00231]
  | {0.00405} | {0.00533} | {0.00602} | {0.00731} | {0.0110}*** | {0.0101}*** | {0.0114}*** | {0.00573}* | {0.00370} | {0.00208}
No Rainfall Shock | -0.00818 | 0.00558 | 0.0177 | 0.00903 | 0.190 | 0.186 | 0.173 | 0.0650 | 0.0469 | 0.0237
  | (0.0215) | (0.0297) | (0.0370) | (0.0464) | (0.0580)*** | (0.0604)*** | (0.0588)*** | (0.0380)* | (0.0196)** | (0.0138)*
  | [0.0190] | [0.0266] | [0.0301] | [0.0382] | [0.0521]*** | [0.0533]*** | [0.0509]*** | [0.0291]** | [0.0196]** | [0.0150]
  | {0.0199} | {0.0293} | {0.0271} | {0.0365} | {0.0450}*** | {0.0460}*** | {0.0541}*** | {0.0274}** | {0.0227}** | {0.0190}
No Shock x Exposure | 0.00109 | -0.00235 | -0.00377 | -0.00107 | -0.0324 | -0.0354 | -0.0328 | -0.0126 | -0.00779 | -0.00451
  | (0.00419) | (0.00557) | (0.00703) | (0.00885) | (0.0115)*** | (0.0117)*** | (0.0114)*** | (0.00766) | (0.00374)** | (0.00256)*
  | [0.00380] | [0.00511] | [0.00588] | [0.00745] | [0.0101]*** | [0.0106]*** | [0.0102]*** | [0.00588]** | [0.00369]** | [0.00282]
  | {0.00383} | {0.00546} | {0.00547} | {0.00728} | {0.00843}*** | {0.00876}*** | {0.0104}*** | {0.00545}** | {0.00426}* | {0.00355}
Observations | 11824 | 11824 | 11824 | 11824 | 11824 | 11824 | 11824 | 11824 | 11824 | 11824
Mean of Dependent Variable | 0.970 | 0.935 | 0.881 | 0.785 | 0.484 | 0.369 | 0.260 | 0.0610 | 0.0308 | 0.0123
Notes:
- Standard errors clustered at the municipality are reported in parentheses, Conley standard errors using a 100km cutoff are reported in square brackets, and Conley standard errors using a 500km cutoff are reported in curly brackets. (*** p<0.01, ** p<0.05, * p<0.1).
- "No rainfall shock" = 1 for individuals whose birth-year rainfall was within one standard deviation of the 10-year historical locality-specific mean.
- All specifications include gender, household head gender and age, household size, household composition variables, parental education, parental language, and locality characteristic dummies. Controls for parental language/education and locality distance include dummies for missing values.

We create separate dummy variables for the completion of 3 years to 12 years of school and estimate specification 3.2 using these dummies as the dependent variables. We start with 3 years of school because this is the youngest grade directly affected by the conditional cash transfers. In columns 2 to 9, we see that the impact of Progresa on completing grades 4 to 11 is positive and significant. The size of this main effect is largest in magnitude for the 7th year of schooling, which Behrman et al. (2011) highlight as a critical transition period (between primary and secondary school) during which many children drop out. This is clearly an important transition period, as it is also only starting in 7th grade that the main effect of normal rainfall becomes positive and significant. Prior to this, the high completion rates suggest that endowments may not matter much during this period, as the vast majority attend school and pass.
Also starting in 7th grade, we see significant negative interaction coefficients that offer support for the potential for interventions to mitigate the effects of early life shocks by encouraging the completion of secondary schooling among those hit by these shocks. As in Table 3.4, these interaction terms are over half of the size of the main effects of Progresa.

We are also interested in how our endowment and investment shocks interact to determine skill, not just educational attainment. We thus look at the Woodcock-Johnson dictation, word identification, and applied problems test scores available for a sample of the population, as a potential proxy for ability. The tests were administered to a sample of the population aged 15 to 21 in 2003. We find small effects of Progresa, rainfall, and their interaction on these tests, tightly bounded around zero (see Appendix Table B7). This is consistent with previous literature (Behrman et al., 2009), which has found no main effect of Progresa on test scores. The lack of any Progresa impact on cognitive scores could potentially be due to low school quality as well as the absence of variation in Progresa exposure for the older ages in the sample of test-takers. For both the endowment and investment shocks, the smaller sample size also makes it difficult to detect their effects. Moreover, it is possible that the tests were unable to capture enough variation in skill or ability. In the letter-word identification test, for example, almost 30% of the sample answered everything correctly (and over 50% only made 2 mistakes) in a test of 58 questions.

3.5.2 Employment Outcomes

We are also interested in whether the endowment and investment shocks we study have similar effects on longer-run labor outcomes that are not directly tied to the Progresa cash incentive. Unfortunately, much of our sample is too young for us to study impacts on their employment outcomes, [22] but the oldest cohort (who were 18 at the time of the 2003 survey) were just old enough to be graduating from high school and pursuing either further education or formal employment. In this smaller sample, we estimate the effects of Progresa, birth year rainfall, and their interaction on a set of variables related to continuing education and employment after high school.

[22] We do not use the 2007 survey because of significant attrition problems. We also cannot use individuals who are older than 18 in 2003 as the fraction living outside of the original household and, accordingly, missing employment data, grows large after age 18. See section 3.3.1 for more details.

Table 3.6 Effects of Progresa and Rainfall on Longer-Term Outcomes
Dependent variable | Currently Enrolled w/ HS Degree | Worked this Week | Worked this Year | Worked in Non-Laborer Job | Enrolled or Currently Working | Enrolled or Worked this Year | Enrolled or Worked in Non-Laborer Job
Column | (1) | (2) | (3) | (4) | (5) | (6) | (7)
Years of Progresa Exposure | 0.0126 | 0.0884 | 0.0798 | 0.0841 | 0.0811 | 0.0814 | 0.0853
  | (0.0135) | (0.0427)** | (0.0384)** | (0.0393)** | (0.0462)* | (0.0443)* | (0.0429)**
  | [0.0124] | [0.0428]** | [0.0290]*** | [0.0314]*** | [0.0483]* | [0.0345]** | [0.0341]**
  | {0.0128} | {0.0478}* | {0.0155}*** | {0.0249}*** | {0.0578} | {0.0273}*** | {0.0218}***
No Rainfall Shock | 0.103 | 0.206 | 0.174 | 0.220 | 0.215 | 0.206 | 0.259
  | (0.0532)* | (0.146) | (0.128) | (0.128)* | (0.146) | (0.141) | (0.133)*
  | [0.0531]* | [0.153] | [0.101]* | [0.0959]** | [0.157] | [0.110]* | [0.104]**
  | {0.0474}** | {0.168} | {0.0585}*** | {0.0719}*** | {0.185} | {0.0853}** | {0.0628}***
No Shock x Exposure | -0.0175 | -0.0874 | -0.0767 | -0.0985 | -0.0792 | -0.0780 | -0.0996
  | (0.0165) | (0.0442)** | (0.0391)* | (0.0395)** | (0.0464)* | (0.0444)* | (0.0416)**
  | [0.0169] | [0.0434]** | [0.0304]** | [0.0312]*** | [0.0491] | [0.0375]** | [0.0368]***
  | {0.0158} | {0.0464}* | {0.0182}*** | {0.0230}*** | {0.0552} | {0.0311}** | {0.0252}***
Observations | 1597 | 1147 | 1143 | 1143 | 1145 | 1139 | 1138
Mean of Dependent Variable | 0.0607 | 0.502 | 0.532 | 0.354 | 0.563 | 0.587 | 0.414
Fixed Effects: Birth year x state (all columns)
Notes:
- Standard errors clustered at the municipality are reported in parentheses, Conley standard errors using a 100km cutoff are reported in square brackets, and Conley standard errors using a 500km cutoff are reported in curly brackets. (*** p<0.01, ** p<0.05, * p<0.1).
- "No rainfall shock" = 1 for individuals whose birth-year rainfall was within one standard deviation of the 10-year historical locality-specific mean.
- All specifications include gender, household head gender and age, household size, household composition variables, parental education, parental language, and locality characteristics. Controls for parental language/education and locality distance include dummies for missing values.
- These regressions restrict to individuals aged 18 in 2003.

Our first dependent variable of interest is the continuation of education after high school: this is an indicator equal to 1 if an individual is enrolled in school (including college or vocational training) and has already completed 12 years of schooling. In columns 2 and 3, we create dummies for employment in the week of survey and in the past year. Column 4 attempts to separate those employed in low-skilled, intermittent jobs from the pool of employed individuals by using an indicator equal to 1 if an individual was employed and worked in a non-laborer job; that is, those who were working as spot laborers were grouped in the same category as the unemployed. In the last 3 columns, we take the stance that both continued enrollment and employment are "desirable" outcomes, and create dummies that combine the continued enrollment variable with each of our employment variables. For instance, the dependent variable in column 5 is an indicator equal to 1 if individuals report either being currently enrolled or having worked that week. An important takeaway from this table is the consistent pattern of coefficients across all columns: both main effects are positive, while interaction terms are all negative.
Some of the coefficients are imprecisely estimated, which is unsurprising given the much smaller sample sizes, but the overall pattern clearly suggests that the mitigative effects of Progresa are not limited to school-aged outcomes directly incentivized by the program. The results in columns 4 and 7 are particularly striking. Normal birth-year rainfall significantly increases the probability of an individual being employed in a non-laborer (i.e., higher skill and more stable) job, and Progresa also has a positive effect for individuals who experienced negative rainfall shocks. But the effect of Progresa is essentially zero for higher-endowment children. That is, Progresa has significant impacts on the probability of stable employment immediately following high school completion among disadvantaged children, but no impact on children with higher endowments. Taken in sum, these results illustrate the ability of investments in adolescence to offset the impacts of insults in early life and the higher return to investments for disadvantaged children.

3.5.3 Robustness Checks

Other Programs

One potential threat to validity is the rollout of other programs during the period between the birth years of our sample individuals and our survey year, 2003. In particular, though we argue that the occurrence of a rainfall shock is random, it is possible that a rainfall shock in a given year affects the probability of a household or locality being the target of another program in subsequent years. This of course is more of a concern in situations where localities are hit by repeated shocks, which are more likely to affect future agricultural activity than a single shock. To this end, the exercise conducted in section 3.5.3 helps alleviate these concerns by showing that the exclusion of localities hit by multiple consecutive shocks does not affect our results. We also directly address this issue by controlling specifically for programs or reforms targeted to individuals based on agricultural activity.
The Program for Direct Assistance in Agriculture (PROCAMPO) was a cash transfer program introduced in 1994 in order to compensate for the anticipated negative effects of NAFTA on rural incomes (Sadoulet et al., 2001). Land use in 1993 was used to determine eligibility for the program as well as the size of all future payments: transfers were made per hectare of land that was used to grow at least one of the following crops: corn, beans, rice, wheat, sorghum, barley, soybeans, cotton, or cardamom. The 2003 survey asks whether anyone in the household receives PROCAMPO payments, and we use this as an additional control in our next set of regressions.

In general, the effects of the trade liberalization reforms that took place in the 1990s likely varied across localities, and one important source of variation in these effects was the types of crops grown in each village. Price changes as a result of trade liberalization were clearly crop-specific, as were the support policies implemented to protect farmers. [23] In short, an important concern is whether trends over time varied for localities growing different types of crops. To address this concern, we create indicators for whether a locality reports corn, kidney beans, or sugar as one of their top three crops, and interact these indicators with individual birth year dummies.

Finally, we also control for the rollout of a land certification program (PROCEDE) that essentially eliminated the link between land use and land rights in communally farmed agricultural communities called ejidos. PROCEDE has been found to have affected migration decisions (De Janvry et al., 2015) and therefore might have also affected the returns to and opportunity costs of schooling. Controlling for the age of an individual in the year their locality was certified, [24] we address concerns that correlations between PROCEDE's rollout and rainfall shocks might be confounding our estimates.

Appendix Table B8 addresses all of these concerns by running our main regressions with the addition of several controls: an indicator for PROCAMPO recipients, crop variables interacted with birth year dummies, and individual age during PROCEDE rollout. Our results are robust to these adjustments.

[23] For example, import quotas for most traditional crops (except maize and beans) were eliminated in 1991. Similarly, although tariffs for most commodities were phased out by 2006, transitional tariffs for maize, dry edible beans, milk, and sugar were not scheduled to be phased out until 2008 (OECD, 2006).

[24] We obtain this data from De Janvry et al. (2015), which restricts attention to ejidos that were certified after 1996. Therefore, we are unable to distinguish between ejido localities certified in 1993, 1994, 1995, 1996, and localities that were not part of an ejido at all. For individuals in this category, we set the PROCEDE age variable to zero and include a dummy for missing PROCEDE information.

Selective Fertility

Table 3.7 Effects of Progresa and Rainfall on Fertility
Level of analysis | Locality-Level (1) | Individual-Level | Individual-Level
Dependent variable | Total Number of Children | Number of Younger Siblings | Birth Spacing (in days) between younger sibling
Column | (1) | (2) | (3)
Years of Progresa Exposure |  | 0.0228* | -13.79
  |  | (0.0132) | (13.71)
No Rainfall Shock | 0.0898 | 0.101 | -29.07
  | (0.161) | (0.0665) | (80.00)
No Shock x Exposure |  | -0.0202 | 6.373
  |  | (0.0126) | (15.52)
Observations | 2519 | 11686 | 7230
Mean of Dependent Variable | 4.827 | 1.982 | 1107.9
Fixed Effects | Birth year x state | Birth year x state | Birth year x state
Notes:
(1) Locality-level analysis: unit of observation is birth-year-locality.
- "No rainfall shock" = 1 for locality birth years that experienced rainfall levels within one standard deviation of the 10-year historical locality-specific mean.
- Standard errors clustered at the municipality level are in parentheses (*** p<0.01, ** p<0.05, * p<0.1).
- All specifications include locality controls and individual/household characteristics (gender, household head gender and age, household size, household composition, parental education and language). For the locality-level variables, these are averaged at the locality-birth-year level.

In Table 3.7 we investigate how Progresa and rainfall shocks may have affected fertility, which could lead to potential selection issues. One concern might be that negative rainfall shocks during a year may affect the number of children that are born and/or survive to school-aged years. If this were the case, the composition of individuals in our sample who were born in shock years would be different from those in our sample born in regular years. In order to check this, we collapse to the locality x birth year level and count the total number of children born in a particular year in each locality. We then use this constructed panel to regress the total number of children born that year on our rainfall shock. Column 1 of Table 3.7 reports results from this regression. We find no evidence of selective fertility or selective child mortality.

Our next test is to check whether Progresa, rainfall shocks, and their interaction had any impact on mothers' subsequent fertility decisions. Specifically, we might be concerned that a good rainfall shock would increase the likelihood of having more children (or total fertility), or decrease the birth spacing between children, just as exposure to Progresa may do the same (by lowering the opportunity cost of having children). If this were the case, an individual's exposure to Progresa or rainfall shocks would also be related to intrahousehold allocation issues that may vary with the total number of siblings and spacing between siblings. To check for this, we estimate equation 3.2, again at the individual level, using number of younger siblings and birth spacing between next youngest sibling (in days) as dependent variables.
With one exception, the main effects and interaction are all insignificant. Given that the coefficient on Progresa exposure in column 2 is very small in magnitude, we interpret these results as finding little evidence in support of selection bias.

Attrition

Table 3.8 Effects of Progresa and Rainfall on Attrition
Dependent variable | Household found in 2003 | Meets Data Quality Restrictions | Non-missing education variable | Non-missing employment variable
Column | (1) | (2) | (3) | (4)
Years of Progresa Exposure | -0.00415 | -0.00342 | 0.00343 | -0.0328
  | (0.00889) | (0.00384) | (0.00373) | (0.0379)
No Rainfall Shock | -0.0336 | -0.0297 | 0.00168 | -0.129
  | (0.0383) | (0.0228) | (0.0192) | (0.135)
No Shock x Exposure | 0.00652 | 0.00492 | 0.0000278 | 0.0393
  | (0.00757) | (0.00450) | (0.00369) | (0.0400)
Observations | 14525 | 12917 | 12159 | 1646
Mean of Dependent Variable | 0.889 | 0.941 | 0.973 | 0.697
Ages | 12 to 18 | 12 to 18 | 12 to 18 | 18
Fixed Effects: Birth year x state (all columns)
Notes:
- The sample in column 2 restricts to households found in 2003, while columns 3 and 4 restrict to those that meet data quality restrictions.
- Standard errors clustered at the municipality level are in parentheses (*** p<0.01, ** p<0.05, * p<0.1).
- "No rainfall shock" = 1 for individuals whose birth-year rainfall was within one standard deviation of the 10-year historical locality-specific mean.

As in any longitudinal study, we must consider the extent to which selective attrition may be confounding our results. In Table 3.8, we show that although attrition between the baseline and 2003 surveys was sizeable, it appears to be uncorrelated with our regressors of interest. In this table, we simply regress various attrition indicators on years of Progresa exposure, the positive rainfall indicator, their interaction, and state by birth year fixed effects. In column 1, we investigate household attrition, including all eligible individuals in the baseline survey who would have been aged 12 to 18 in 2003.
We do not find that our investment or endowment shocks influenced the likelihood of a household being dropped from the 2003 sample. In column 2, conditional on the household being found in 2003, we show that our regressors of interest do not significantly predict the likelihood of an individual being included in our sample given the data quality restrictions we impose (matching genders across surveys and birth year differences of less than 2 years). Finally, in columns 3 and 4, we investigate whether the shocks predict the probability that an individual (who is found in 2003 and meets the data quality restrictions) has non-missing education and employment variables (restricting of course to 18-year-olds in column 4). We do not find any evidence of either.

Balance

We investigate further the implications of the small but statistically significant imbalance in rainfall shock prevalence across Progresa treatment and control villages in our baseline sample. First, to further test whether our results are being driven by this imbalance, we repeat our analysis using the trimmed sample described in Section 3.3, in which rainfall shock prevalence is the same across treatment and control villages. This sample omits localities exhibiting shocks for rainfall measures in every year, or no shocks in any year, over the study period. As Tables B3, B4, and B5 show, our results are virtually identical to the full sample results. [25]

Second, we conduct a robustness exercise regarding the unbalanced demographic characteristics discussed in section 3.3 above and identified in previous studies. Table B6 reports the results of regressions on our main outcomes of interest, additionally controlling for interactions between the rainfall shock variable and each of the control variables that are not balanced across treatment and control groups. The results are once again very similar to the main results reported above.
[25] Because our previous results revealed little difference across the three types of standard errors used, we only show standard errors clustered at the municipality level in this section.

3.6 Chapter 3 Conclusion

In this paper, we leverage the combination of two sources of exogenous variation (in early life circumstance and in investments during childhood) to study whether, and the extent to which, it is possible to mitigate the impact of early life shocks, a question that is usually confounded by the endogeneity of investment responses. Using the Progresa experiment and year-of-birth rainfall shocks, we study the impacts of these investment and endowment shocks on educational attainment and employment outcomes. We find that better early-life circumstance and more investments generate greater schooling and employment probabilities. Moreover, the coefficient on the interaction between Progresa exposure and normal rainfall is negative and significant across most outcome measures, indicating that remediation of early-life shocks is possible through investments. Put differently, the positive impact of Progresa exposure on educational outcomes is largest for individuals with low endowment realizations due to adverse early-life conditions. The magnitude of the interaction term is telling: in most cases, it is over half of the size of the main effect of Progresa, suggesting that cash transfer programs like Progresa have the potential to offset almost entirely the inequality generated by early life circumstances. We find similar patterns when studying continued education and employment outcomes in a sub-sample of older individuals. That is, longer-run post-schooling labor outcomes exhibit the potential for remediation as well.

Our study contributes to the large literature evaluating Progresa and conditional cash transfer programs more generally (Schultz, 2004; Behrman et al., 2011; Haushofer and Shapiro, 2013; Blattman et al., 2013).
While most evaluations of such programs tend to focus on average effects, we compare impacts across individuals with different unobserved endowments, exploiting rainfall shocks as our source of exogenous variation in this unobservable. Indeed, unlike the few other studies attempting this sort of exercise, the continuous nature of the endowment shock we observe allows us to calculate treatment effects of Progresa at every point along the endowment distribution.

Progresa appeared to have had a very targeted impact on those who experienced negative shocks early in life. An important finding for policymakers, this suggests that programs like these may be most efficient if targeted toward the disadvantaged, not just in terms of income (as Progresa already targets the poor) but also in terms of endowments. While the challenges involved with this sort of targeting are not trivial, our results offer reason for optimism about the ability of policies to mitigate the negative impacts and inequality generated by early life shocks.

Chapter 4
Reporting Heterogeneity and Health Disparities across Gender and Education Levels: Evidence from Four Countries [1]

4.1 Introduction to Chapter 4

Understanding health disparities across gender and education levels is crucial for informing policies aimed at reducing such inequalities and for understanding why life choices and outcomes may differ across these groups. [2] Valid measures of these health inequalities are required, and self-reported health is a relatively simple and widely available measure that can be used. Unfortunately, comparisons of self-reported health can be confounded by the use of different response scales across individuals. In this paper, I use anchoring vignettes to quantify the extent to which differences in reporting behavior may drive these differences across gender and, additionally, differences across education levels.
I draw on data from surveys in four different countries: the Indonesian Family Life Survey (IFLS), the United States Health and Retirement Study (HRS), the English Longitudinal Study of Aging (ELSA), and the China Health and Retirement Longitudinal Study (CHARLS). All of these surveys ask respondents to rate their own health difficulties from 1 to 5 (where 1 represents the least and 5 the most severe problems) in six domains: mobility, pain, cognition, sleep, affect, and breathing. In addition, for each domain, all surveys ask respondents to rate the health of three hypothetical individuals in order to anchor the respondents' numerical self-reports. These anchoring vignettes allow me to adjust for the use of different response thresholds across gender and education levels using a hierarchical ordered probit (HOPIT) model, enabling comparisons that are not confounded by systematic reporting differences.

[1] Accepted for publication at Demography. The final publication is available at http://link.springer.com/article/10.1007/s13524-016-0456-z.
[2] For example: human capital investment, occupational choice, marriage, income, or life satisfaction.

In most health domains across countries, I find that gender gaps are reduced after accounting for the use of different thresholds, though less drastically in Indonesia and the United States, where half of the domains still reveal significant gender differences after adjustment. In England and China, adjusting for thresholds completely eliminates the gender gap in the majority of domains. This elimination (or reduction) of significant gender differences after adjusting for response thresholds offers a partial explanation for one quite persistent puzzle that has emerged from studies of self-reported health: women have significantly worse self-reported health than men, despite the fact that they have lower mortality rates (Case and Paxson, 2005; Strauss et al., 1993; Macintyre et al., 1999; Nathanson, 1975; Verbrugge, 1989).
The observed female disadvantage in self-reported health could be driven by women's use of different response thresholds when evaluating a person's health. This is not the only possible explanation for the gender paradox [3] or the first time this particular hypothesis has been proposed (Macintyre et al., 1999; Verbrugge, 1989), but this paper offers evidence that the use of different response thresholds across men and women can confound gender comparisons of self-reported health because women have a higher bar for considering someone "healthy."

The narrowing or elimination of gender gaps is not a mechanical result of the econometric exercise: when I repeat this analysis to compare individuals of different education levels, I find no evidence of existing differences shrinking. Across all four datasets, I find persistent education differences that do not diminish (and in most cases widen) after adjusting for the use of different thresholds. This adds further support to the large literature on the education-health gradient, [4] emphasizing that, if anything, differential reporting behavior may result in an underestimation of the strength of the link between education and health.

[3] Mortality selection is one potential reason for the gender paradox, but Strauss et al. (1993) find that adjusting for it reduces but does not eliminate the gender gap in self-reported health. Case and Paxson (2005) find evidence that men and women face different distributions of chronic conditions, and for some conditions, the severity is worse for men than women. The combination of these two findings helps explain why women, afflicted with more chronic conditions that are less fatal, may report worse health yet still live longer than men.
[4] See Cutler and Lleras-Muney (2006) and Grossman (2006) for reviews of the theory and empirical evidence and Vogl (2012) for a review specifically for developing countries.
In addition to offering evidence on the role of reporting behavior in explaining gender and education gaps, this paper contributes to the literature on anchoring vignettes by expanding their use to within-country gender and education differences in four different countries. Most of the early anchoring vignettes papers focus on cross-country comparisons: for example, political efficacy in China and Mexico (King et al., 2004) or work disability and life satisfaction in the United States and the Netherlands (Kapteyn et al., 2007, 2010). A more recent strand of literature has used vignettes and the HOPIT model to analyze within-country differences, particularly in self-reported health (Mu, 2014; Dowd and Todd, 2011; Bago d'Uva et al., 2008b,a). In these papers, any discussion of differences across genders or education levels is usually limited to a comparison of coefficients in a pooled HOPIT model, which only allows gender and education to have a level effect on latent health and response thresholds.

Unlike existing work, I estimate the HOPIT model separately for men and women (and separately for more educated and less educated individuals) and then simulate self-report distributions using adjusted and unadjusted thresholds. This allows for gender and education to change how other covariates affect health and reporting behavior. Kapteyn et al. (2007), Kapteyn et al. (2010), and Mu (2014) all run the HOPIT model separately for different countries or different regions, but this paper is the first to conduct this exercise for gender and education levels. This paper is also the first to calculate standard errors for a key estimate: the difference between the simulated proportion of individuals falling into the "healthiest" category in two different groups. Previously ignored in the literature, standard errors allow me to conclude whether groups are statistically different before and after allowing for the use of different response thresholds across groups.
The next section outlines how anchoring vignettes help solve the problems that arise due to individuals' use of different response thresholds. Section 4.3 outlines the econometric model, and Section 4.4 describes the four datasets used in this analysis. I outline the estimation methods in Section 4.5, discuss my results in Section 4.6, and conclude with Section 4.7.

4.2 Anchoring Vignettes

Many economic studies have turned to self-reported health measures as outcome variables (Finkelstein et al., 2012; Gertler and Gruber, 2002; Maccini and Yang, 2009; Strauss et al., 1993; Manning et al., 1987) since objective measures of health are often infeasible to measure for large populations or too narrow to capture the multidimensional nature of health. The particular type of measure studied in this paper is a response to a question like "overall, in the last 30 days, how much pain or bodily aches did you have?", chosen from 5 options: none, mild, moderate, severe, and extreme. These self-reports are simple and may be better suited to capture an individual's health as a whole, compared to objective measures that are more specific (like blood pressure or BMI) or more extreme (like mortality). Moreover, self-reported health is also strongly linked with objective measures of health. [5] General self-reported health, which is slightly different from the measures used in this paper, has been repeatedly shown to have a significant relationship with mortality, robust to the inclusion of a host of demographic and socioeconomic controls. [6]

Despite this, subjective scale measures have also long been the source of some controversy, due to potential differences in reporting behavior across groups. Dow et al. (1997), in their analysis of the effect of health care prices on health outcomes, highlight that self-reported measures often suffer from reporting bias that is non-random, potentially correlated with variables like income and healthcare utilization.
Clearly, self-reported measures of health that assign a quantitative value to how healthy one feels are not perfect measures of actual health. They also incorporate an individual's interpretation of the response choices: what do mild, moderate, severe, and extreme really mean? The idea that individuals may use different reporting thresholds in their self-reports is particularly problematic when making comparisons across groups or individuals. The underlying problem is that we cannot ascertain whether the differences we see are being driven by actual differences in health status or simply the use of different response scales, what King et al. (2004) refer to as "differential item functioning" (DIF), a term originally from the education testing literature. [7] Equivalently, we are also unsure if, across groups that appear similar, there exist differences masked by different response scales. In short, with systematically different response scales, we must first adjust for this DIF before any valid comparisons can be made.

Methods recently developed to make these necessary adjustments involve the use of anchoring vignettes, introduced by King et al. (2004). These vignettes tell a brief story about a hypothetical person and ask respondents to evaluate the severity of the person's situation. For example,

[John] can concentrate while watching TV, reading a magazine, or playing a game of cards or chess. Once a week he forgets where his keys or glasses are, but finds them within five minutes. Overall how much difficulty did [John] have remembering things? [8]

A vignette like this one would help anchor respondents' answers to the question: "Overall in the last 30 days, how much difficulty did you have remembering things?" In general, vignettes allow us to evaluate how people set their thresholds and therefore help adjust for differences in response scales.

[5] Idler and Benyamini (1997) review 27 studies conducted in eight different countries. With remarkable consistency, these studies show that the coefficient on general self-rated health in a regression on mortality remains significant even when other covariates and health status indicators are included. A more recent meta-analysis by DeSalvo et al. (2006) finds that individuals who report being in "poor" health have almost double the mortality risk of those who reported being in "excellent" health. This calculation included studies which controlled for various covariates like age, socioeconomic status, and others.
[6] General self-reported health is an answer to the question: "In general, how healthy do you feel?" I use domain-specific and not general self-reported health in this paper because the standard vignettes have been designed for domain-specific health.
[7] A test question with "differential item functioning" is one that two people of the same ability but from different groups (races or genders, for example) have different probabilities of answering correctly.

Figure 4.1 Comparing Subjective Scales (from King et al. (2004)). Panels: Panel A, Panel B, Panel C.

A simple figure can summarize why comparisons based on subjective scales can be problematic and how anchoring vignettes can be used to address these issues. Figure 4.1, from King et al. (2004), shows two different respondents: A and B. In Panel A, Self1 represents A's numerical response to a subjective question like "how is your health in general?" Self2, in Panel B, represents B's response to this same question. A naive comparison of these two numbers would lead to the conclusion that A is in better health than B. However, these figures also depict how A and B evaluate three hypothetical vignette individuals, Alison, Jane, and Moses. Even though A and B are faced with identical vignette descriptions, they give very different evaluations of the three vignettes, indicating the use of potentially very different response scales.
Panel C shows what B's responses would look like if she had instead used A's response scale. This essentially boils down to aligning B's vignette evaluations to A's and comparing Self1 and Self2 on the new scale. Comparing Panel A and Panel C shows us that B is actually in better health than A but has a higher bar for defining what is "healthy."

Anchoring vignettes allow us to infer something about respondents' internal response scales that are otherwise completely unobservable to the researcher. When comparing two groups of individuals, we can use the scale in one group as a benchmark in order to make valid comparisons. The validity of these comparisons hinges on two important assumptions. First, we assume response consistency, which means that respondents use the same response scales when evaluating themselves and evaluating others. The second assumption is vignette equivalence, which means that the way respondents interpret the scenarios and questions is independent of their individual characteristics. In other words, respondents only differ in the thresholds they use, not in how they interpret the question. In the next section, I discuss what both of these assumptions mean in the context of the econometric model.

[8] This vignette is from the cognition domain and is used in all four datasets in this paper. See appendix (section C.2) for a complete list of vignettes.

Response consistency would not hold if, for some reason, the respondents held the hypothetical individuals to a different standard than their own. For example, King et al. (2004) suggest that response consistency in their study of political efficacy would be violated if respondents felt inferior to the people in vignettes and set a higher bar for what it means to have "a lot of say" in the government. Both King et al. (2004) and van Soest et al. (2011) test for response consistency by using objective measures and find strong evidence to support response consistency.
Unfortunately, tests like these are only possible when relevant objective measures, which map directly to the unobserved latent variable, exist. [9] While the validity of this assumption may depend on the particular context of the vignettes, I argue that the straightforward nature of the vignettes in this paper makes this a reasonable assumption for the self-reported health setting. The individuals described in the vignettes in this paper suffer from common ailments undoubtedly somewhat familiar to respondents in all countries. This familiarity, combined with the fact that health is an issue these elderly respondents deal with every day, unlike the political issues in King et al. (2004), makes it unlikely that respondents would hold the vignette individuals to a different standard, or use a different scale to evaluate them.

The second assumption, vignette equivalence, would not hold if there are systematic differences in the way respondents interpret the questions or vignettes, which is more likely when dealing with abstract concepts. Since vignettes are brief, vignette equivalence may also be violated if respondents fill in any gaps by making assumptions to create a complete picture. These assumptions are likely to vary by person and are problematic if correlated with individual characteristics. Fortunately, all of the vignettes used in this paper are straightforward and deal with tangible, familiar concepts. However, because of their brevity, they may be slightly open to interpretation.

Because of the dearth of objective measures that map directly to my domain-specific health variables of interest, as well as the strong support in the literature for the validity of response consistency (van Soest et al., 2011; King et al., 2004; Grol-Prokopczyk et al., 2015), I take this first assumption as given. However, I test for vignette equivalence using methods proposed by Bago d'Uva et al. (2011).

[9] For example, King et al. (2004) use vision tests to validate subjective scale questions about vision impairment and van Soest et al. (2011) use actual counts of alcoholic drinks to validate subjective questions about the severity of drinking problems.

4.3 Econometric Model

In order to separately identify the effect of individual characteristics on true health from their effect on reporting thresholds, I use the same econometric model used in Kapteyn et al. (2007) and Kapteyn et al. (2010). For each health dimension $d$, I model the subjective response of an individual $i$, $Y_{di}$, using the following ordered response equations, where $Y_{di}$ ranges from 1 (least severe) to 5 (most severe). $Y_{di}$ is determined by a latent variable $Y^*_{di}$, which is a function of individual respondent characteristics and an error term. For simplicity, I drop the subscript $d$ in the model exposition but analyze a separate model for each health domain in the empirical section.

1. $Y^*_i = X_i \beta + \varepsilon_i$; $\varepsilon_i$ is $N(0, \sigma)$, $\varepsilon_i$ independent of $X_i$ and the other error terms in the model
2. $Y_i = j$ if $\tau^{j-1}_i < Y^*_i \le \tau^j_i$, $j = 1, \ldots, 5$
3. $\tau^0_i = -\infty$, $\tau^5_i = \infty$, $\tau^1_i = \gamma^1 X_i + u_i$, $\tau^j_i = \tau^{j-1}_i + e^{\gamma^j X_i}$, $j = 2, 3, 4$; $u_i$ is $N(0, \sigma^2_u)$ and is independent of $X_i$ and the other error terms in the model.

What sets this apart from a standard ordered response model is that the thresholds $\tau^j_i$ vary across individuals. These thresholds are also a function of individual characteristics and an unobserved individual effect, $u_i$, which allows individuals with identical $X$ characteristics to have different response scale thresholds. The individual-specific $\tau^j_i$'s are the essence of DIF.

Given data on self-reported health and individual characteristics only, it is impossible to identify $\beta$ and $\gamma^1$ separately (but $\gamma^j$ for $j > 1$ is identified through the non-linearity of the exponential function). For this, we use the three vignette evaluations given by each respondent for each health domain.
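The threshold construction in Eq. 3 and the classification rule in Eq. 2 can be illustrated with a small numerical sketch. This is only an illustration under stated assumptions, not the estimation code used in this chapter: the function names, the coefficient values, and the covariate vector (a constant plus a male dummy) are all hypothetical, and the individual effect $u_i$ is set to zero.

```python
import math


def thresholds(x, gammas, u=0.0):
    """Return [tau^0, ..., tau^5] for one respondent, following Eq. 3.

    x      : covariate vector (hypothetical: constant, male dummy)
    gammas : four coefficient vectors, gamma^1 ... gamma^4 (hypothetical values)
    u      : unobserved individual effect u_i (set to 0 in this illustration)
    """
    def dot(g):
        return sum(gk * xk for gk, xk in zip(g, x))

    taus = [-math.inf, dot(gammas[0]) + u]        # tau^0 and tau^1
    for g in gammas[1:]:                          # tau^2, tau^3, tau^4
        taus.append(taus[-1] + math.exp(dot(g)))  # gaps are always positive
    taus.append(math.inf)                         # tau^5
    return taus


def report(y_star, taus):
    """Eq. 2: return the category j with tau^{j-1} < y_star <= tau^j."""
    for j in range(1, len(taus)):
        if taus[j - 1] < y_star <= taus[j]:
            return j
    raise ValueError("no category found (is y_star NaN?)")


# Hypothetical coefficients: only the first threshold differs by gender.
GAMMAS = [[0.0, 0.5], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
female_taus = thresholds([1.0, 0.0], GAMMAS)
male_taus = thresholds([1.0, 1.0], GAMMAS)

# Two respondents with the same latent health but different thresholds.
same_latent_health = 0.3
female_report = report(same_latent_health, female_taus)
male_report = report(same_latent_health, male_taus)
```

With these made-up numbers, the two respondents share the same latent value but land in different reported categories, which is the reporting heterogeneity (DIF) that the thresholds are meant to capture.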
The vignette responses (of individual $i$ to vignette number $l$ for domain $d$) can be modeled in a similar ordered response framework. Again, the $d$ subscript is omitted. In this paper, $l = 1, 2, 3$.

4. $Y^*_{li} = \theta_l + \varepsilon_{li}$; $\varepsilon_{li}$ is $N(0, \sigma_v)$, $\varepsilon_{li}$ independent of $X_i$ and the other error terms in the model
5. $Y_{li} = j$ if $\tau^{j-1}_i < Y^*_{li} \le \tau^j_i$, $j = 1, \ldots, 5$

The non-negative exponential function in threshold Eq. 3 ensures that $\tau^1 \le \tau^2 \le \tau^3 \le \tau^4$. Its non-linearity ends up identifying the $\gamma^j$ coefficients for $j > 1$. The results in the paper use the exponential function to define the gaps between different thresholds, as in Eq. 3, but in the appendix, I also test the sensitivity of these results by replacing the exponential in Eq. 3 with a square, as follows.

3a. $\tau^0_i = -\infty$, $\tau^5_i = \infty$, $\tau^1_i = \gamma^1 X_i + u_i$, $\tau^j_i = \tau^{j-1}_i + (\gamma^j X_i)^2$, $j = 2, 3, 4$

I also explore the possibility of using a linear specification for the threshold equations in the appendix. The results remain remarkably consistent across alternate functional forms. This is true for all domains and all four datasets.

The model's first crucial assumption, response consistency, means that the $\tau_i$'s in Eq. 3 are used for both the self-reports (Eq. 1 and Eq. 2) and the vignette responses (Eq. 4 and Eq. 5). Since vignette responses $Y_{li}$ only depend on individual characteristics through their influence on the thresholds $\tau_i$, it is possible to identify the $\gamma$ and $\theta$ vectors from Eq. 4 and Eq. 5. Here, $\theta_l$ is a vignette fixed effect that, together with an unobserved individual error $\varepsilon_{li}$, completely determines the latent variable for vignette evaluations, $Y^*_{li}$.

The assumption of vignette equivalence implies that $\theta_l$ is constant across all individuals, and the unobserved error is uncorrelated with individual characteristics. That is, individual characteristics do not affect the perceived underlying severity of each vignette. Respondent characteristics can only affect evaluations of vignettes through their effect on thresholds.
This leads naturally to a test of vignette equivalence which involves including respondent characteristics $X_i$ in vignette Eq. 4. I discuss this vignette equivalence check in section C.6 of the appendix. Like Bago d'Uva et al. (2011) (who developed this test) and Grol-Prokopczyk et al. (2015) (who apply the same methods), I find evidence that vignette equivalence is not always satisfied. However, adjusting the model to allow for violations does not significantly change my coefficient estimates and therefore my conclusions.

4.4 Data

I use data from the 2007 wave of the IFLS (Strauss et al., 2009), the 2007 Disability Vignette Study mail survey from the HRS (HRS 2014), the 2006-2007 wave of the ELSA (Marmot et al., 2014), and the first wave of the CHARLS, conducted in 2011 (Zhao et al., 2013). Each of these four datasets includes the following domain-specific self-reported health questions:

Overall in the last 30 days...
1. ... how much of a problem did you have with moving around?
2. ... how much pain or bodily aches did you have?
3. ... how much difficulty did you have remembering things?
4. ... how much difficulty did you have with sleeping, such as falling asleep, waking up frequently during the night, or waking up too early in the morning?
5. ... how much of a problem did you have with feeling sad, low, or depressed?
6. ... how much of a problem did you have because of shortness of breath?

In addition to these questions, all four surveys include the exact same set of three vignettes per health domain (see section C.2 in the appendix for a list of all the vignettes). The inclusion of all six of the same health domains, as well as the use of identical vignettes across the four datasets, makes this combination of datasets particularly appealing.
Moreover, unlike several other surveys that also include vignettes, all of these datasets either focus on the elderly or have a large enough sample of elderly individuals to estimate the HOPIT model separately for different subgroups within the elderly population, the group likely to be most familiar with the health problems discussed in the vignettes. Focusing on this narrow (and arguably more relevant) age range allows me to home in on sources of reporting heterogeneity other than age.

Answers to the health status questions and anchoring vignettes form the outcome variables of interest for this analysis: domain-specific $Y_i$, $Y_{1i}$, $Y_{2i}$, and $Y_{3i}$ in the HOPIT model. For the explanatory variables $X_i$, I purposely focus on a simple set of variables in order to facilitate comparisons across the datasets: gender, age, and education levels. Specifically, I create two age dummies (for those aged 56 to 70 and those older than 70, leaving those 55 and younger as the omitted category) and a dummy variable for males. Because I eventually split each sample into high and low education groups, I define different education dummies for each dataset in order to have groups that are large enough (see Table 4.1 for category descriptions).

Although all datasets include the same self-report questions and anchoring vignettes, there are some important differences in the way the information was collected. For instance, the IFLS and CHARLS were in-person surveys, while the ELSA and HRS involved written questionnaires for the vignettes. Appendix section C.1 contains more information about the individual datasets.
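The covariate coding described in this section (two age dummies and a male dummy, with education dummies and interactions added in the estimation specifications of Section 4.5) can be sketched as follows. This is a minimal illustration, not the code used to build the analysis samples: the function and key names are hypothetical, ages are assumed to be recorded in whole years, and the education group labels stand in for the dataset-specific definitions in Table 4.1.

```python
def age_dummies(age):
    """Two age dummies; respondents aged 55 and younger are the omitted category.

    Assumes age is recorded in whole years.
    """
    return {"age_56_70": int(56 <= age <= 70), "age_over_70": int(age > 70)}


def gender_analysis_covariates(age, educ_group):
    """Covariates for models estimated separately by gender.

    educ_group is a hypothetical label: "high", "medium", or "low" (omitted).
    """
    if educ_group not in ("high", "medium", "low"):
        raise ValueError("educ_group must be 'high', 'medium', or 'low'")
    row = age_dummies(age)
    row["educ_high"] = int(educ_group == "high")
    row["educ_medium"] = int(educ_group == "medium")
    # Interactions between each age dummy and each education dummy.
    for a in ("age_56_70", "age_over_70"):
        for e in ("educ_high", "educ_medium"):
            row[a + "_x_" + e] = row[a] * row[e]
    return row


def education_analysis_covariates(age, is_male):
    """Covariates for models estimated separately by education group."""
    row = age_dummies(age)
    row["male"] = int(bool(is_male))
    # Interactions between each age dummy and the male dummy.
    for a in ("age_56_70", "age_over_70"):
        row[a + "_x_male"] = row[a] * row["male"]
    return row


example_gender_row = gender_analysis_covariates(60, "high")
example_education_row = education_analysis_covariates(75, True)
```

Whether a constant term enters $X_i$ is not stated in this section, so none is added here.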
4.4.1 Summary Statistics

Table 4.1 Summary Statistics

                                     (1)        (2)        (3)        (4)
                                     IFLS       HRS        ELSA       CHARLS
Age                                  52.00      63.76      65.80      59.42
                                     (9.618)    (9.046)    (10.30)    (10.04)
1(Male)                              0.535      0.453      0.461      0.466
                                     (0.499)    (0.498)    (0.499)    (0.499)
1(High Education Group) [1]          0.218      0.281      0.358      0.364
                                     (0.413)    (0.449)    (0.480)    (0.481)
1(Medium Education Group) [2]        0.436      0.570      0.225      0.214
                                     (0.496)    (0.495)    (0.418)    (0.410)
Mobility Self-Report                 1.430      1.742      1.644      1.296
                                     (0.848)    (0.910)    (0.944)    (0.801)
Pain Self-Report                     1.815      2.366      2.288      1.872
                                     (1.027)    (0.871)    (0.932)    (1.116)
Cognition Self-Report                1.687      1.834      1.801      1.711
                                     (0.989)    (0.776)    (0.815)    (1.011)
Affect Self-Report                   1.678      2.309      2.278      1.727
                                     (1.034)    (0.922)    (1.044)    (1.058)
Sleep Self-Report                    1.473      1.777      1.583      1.476
                                     (0.896)    (0.876)    (0.836)    (0.877)
Breathing Self-Report                1.282      1.450      1.408      1.336
                                     (0.727)    (0.772)    (0.782)    (0.786)
Average Pairwise Correlation         0.39       0.42       0.34       0.34
Year of Vignette Survey              2007       2007       2006-2007  2011
Observations                         3058       4158       2192       3630

Notes:
[1] IFLS: high school graduates; HRS: college graduates; ELSA: A-levels and above; CHARLS: junior high and above.
[2] IFLS: primary but not high school; HRS: high school but not college; ELSA: any degree lower than A-levels; CHARLS: primary but not junior high.
- All data are weighted using individual cross-sectional sampling weights provided by each dataset to make summary statistics representative of the United States for the HRS, England for the ELSA, China for the CHARLS, and the 13 IFLS provinces in Indonesia for the IFLS.
- Standard errors reported in parentheses.
- Self-reports are on a scale from 1 to 5, with 1 representing the least and 5 the most severe health difficulties.

Table 4.1 lists summary statistics for all four datasets, including only individuals who responded to the self-report and three vignette evaluations for at least one of the domains and who were not missing any of the other covariates of interest.
Each survey represents one cross section of data, with the IFLS and HRS sampled in 2007, the ELSA sampled during 2006 and 2007, and the CHARLS sampled in 2011. For the IFLS and CHARLS, the sample sizes reported here are much larger than the sample sizes in each individual domain since individuals only responded to two domains each. [10]

[10] See Appendix section C.1 for more detail.

Although t-tests are not reported here, there are large and significant differences across all four countries that arise from differences in survey parameters, covariate distributions within each country, or a combination of both. For instance, the HRS and ELSA samples are older on average, which could be due in part to the higher life expectancies in these two countries but is likely driven primarily by the higher age threshold for inclusion in these datasets: 50 compared to 40 in the IFLS and 45 in the CHARLS. [11] Rather than drop all IFLS and CHARLS respondents younger than 50, I choose to include everyone and control for age in order to retain as many observations as possible.

The longer life expectancy of females relative to males is reflected in the fact that less than half the population is male in all samples except the IFLS (which is also the youngest sample). This disproportionate female share is particularly apparent in the older HRS and ELSA samples, which have significantly higher female proportions than the other two: again, most likely an artifact of the survey design but potentially also generated by demographic differences across countries.

The education statistics must be interpreted with caution because, as described above, the "high education," "medium education," and "low education" category definitions differ across the samples and are roughly equivalent to using the 75th percentile as the high education cutoff. Keeping this in mind, it is clear that there are large differences in the levels of educational attainment across countries.
Over 80% of the American sample are high school graduates, while this figure is less than a quarter for Indonesian respondents, an older cohort in a developing country. In the CHARLS sample, less than 10% of the sample graduated from high school. In the ELSA sample, 36% received their A-levels or higher, which is a slightly more advanced qualification than high school graduation in the United States.

Table 4.1 also lists the self-report means for each health domain, and the average of all pair-wise correlations between self-reports for different domains. The correlations are positive but weak for all four datasets. For IFLS and CHARLS respondents, all self-report means fall between 1 ("no difficulty") and 2 ("mild difficulty"). Pain and, to a lesser extent, cognition appear to be the most serious afflictions for these two groups. The U.S. sample reports the worst health on average across all domains; pain and affect appear to be the most serious problems for this group. These are also the two most serious afflictions for the ELSA sample, whose self-report averages are almost on the same level as the HRS. Given the significant differences in covariates across groups, the different formats and languages of the surveys, and of course, the possibility of different response thresholds across countries, it is difficult to use these raw differences in self-reports to draw any conclusions about the relative true health levels of these countries. [12]

[11] The HRS, ELSA, and CHARLS are all aging datasets focused on the elderly, while the IFLS is a household survey that interviews all members of a sample household. The vignettes in the IFLS, however, were targeted only to those 40 and older.
[12] Molina (2014) demonstrates that response thresholds play a large role in explaining the drastic cross-country differences between these four countries. Although the HRS and ELSA samples seem less healthy in initial comparisons, they are in fact significantly healthier than both the IFLS and CHARLS respondents once thresholds are equalized across countries.

Table 4.2 reports the responses to the hypothetical vignettes for each sample and each domain. I report the domain-specific sample size at the bottom of each column. Here, I number the vignettes in order of increasing intended severity based on the IFLS sample and questionnaire. [13] In all samples, the average perceptions of severity are generally in accord with the intended relative levels. With the exception of the sleep domain for the ELSA and CHARLS samples (which is one of the least straightforward of all vignette domains) and the pain domain for the CHARLS, the first vignette is on average rated healthier than the second, which in turn is rated healthier than the third. [14]

[13] The vignettes in the IFLS are grouped by domain and within each domain appear to be ordered with the least severe vignettes at the beginning and the most severe at the end. For most domains, the ordering is quite clear, while domains like cognition and sleep are more open to interpretation. However, the data confirm that the relative severity perceived by IFLS respondents is consistent with the ordering of vignettes in the interview.
[14] In these three exceptions, the differences in average ratings are very small in magnitude. It should be noted that my arbitrarily chosen ordering is irrelevant to the estimation of the model, as the $\theta_l$'s, which capture the actual ordering of perceived severity, are directly estimated.

Table 4.2 Vignette Responses

IFLS            Mobility   Pain       Cognition  Sleep      Affect     Breathing
Vignette 1      2.352      2.525      2.536      2.712      2.508      2.794
                (1.047)    (1.006)    (1.000)    (1.018)    (0.966)    (1.064)
Vignette 2      2.843      2.726      2.884      3.058      3.025      3.330
                (1.065)    (0.971)    (1.050)    (1.042)    (1.002)    (1.056)
Vignette 3      3.520      3.457      3.175      3.396      3.703      3.758
                (1.081)    (1.076)    (1.093)    (1.094)    (1.175)    (1.142)
Observations    1003       1027       1018       1122       944        996

HRS             Mobility   Pain       Cognition  Sleep      Affect     Breathing
Vignette 1      2.461      1.902      1.948      3.030      2.567      3.092
                (0.722)    (0.652)    (0.735)    (0.721)    (0.693)    (0.769)
Vignette 2      3.708      3.187      2.796      3.852      3.357      3.973
                (0.817)    (0.739)    (0.769)    (0.837)    (0.762)    (0.804)
Vignette 3      3.834      3.790      3.776      3.858      4.532      4.382
                (0.802)    (0.775)    (0.759)    (0.780)    (0.761)    (0.767)
Observations    4118       4123       4127       4126       4113       4119

ELSA            Mobility   Pain       Cognition  Sleep      Affect     Breathing
Vignette 1      2.485      1.967      2.098      2.994      2.627      3.197
                (0.770)    (0.569)    (0.680)    (0.718)    (0.709)    (0.789)
Vignette 2      3.616      3.035      2.888      3.649      3.274      3.865
                (0.878)    (0.733)    (0.745)    (0.890)    (0.777)    (0.816)
Vignette 3      3.860      3.902      3.690      3.582      4.318      4.434
                (0.796)    (0.785)    (0.834)    (0.778)    (0.840)    (0.808)
Observations    2115       2145       2121       2148       2088       2085

CHARLS          Mobility   Pain       Cognition  Sleep      Affect     Breathing
Vignette 1      1.758      2.080      1.826      2.333      2.107      2.708
                (0.902)    (0.784)    (0.873)    (0.930)    (0.863)    (1.085)
Vignette 2      2.393      2.075      2.504      3.167      2.730      3.454
                (1.067)    (0.792)    (0.927)    (1.163)    (0.937)    (1.060)
Vignette 3      3.532      3.263      2.626      3.054      3.822      3.933
                (0.991)    (0.940)    (1.058)    (0.979)    (1.075)    (1.095)
Observations    1067       1045       1136       1155       1116       1082

Notes:
- All data are weighted using individual cross-sectional sampling weights provided by each dataset to make summary statistics representative of the United States for the HRS, England for the ELSA, China for the CHARLS, and the 13 IFLS provinces in Indonesia for the IFLS.
- Standard errors reported in parentheses.
- Vignettes are evaluated on a scale from 1 to 5, with 1 representing the least severe and 5 the most severe health difficulties.

As shown in Appendix Figures C1 and C2, there are substantial within-country differences in self-reported health across gender and education. For all datasets, there are at least three domains which show significantly different distributions for men and women and at least four domains for which highly educated and less educated individuals have significantly different distributions. I investigate these differences using the HOPIT model discussed in Section 4.3, which I estimate using the methods described in the following section.

4.5 Estimation Strategy

4.5.1 Estimating the Model

I use maximum likelihood to estimate the model described in Section 4.3. Details about the estimation procedure, as well as the likelihood function, can be found in section C.3 of the appendix. I estimate the model separately for each dataset and health domain, since assuming common response scales across health domains would be a strong assumption (Kapteyn et al., 2007). In order to simulate distributions by subgroup, I also estimate the model separately for males and females, and then for high-education and pooled medium and low education individuals (which I will refer to for the remainder of the paper as the "lower-education" category). For the gender analysis, my specification includes the following in the vector $X_i$: two age dummies, a dummy for high education, and a dummy for medium education, which essentially breaks down the sample into three groups, where the omitted category is the low education group. I also include interactions between the age and education dummies. For the education analysis, $X_i$ includes the age dummies, a male dummy, and the age-gender interactions. [15]

4.5.2 Simulating Distributions and Standard Errors for Predicted Probabilities

Using the coefficients from the separately estimated models, I simulate the distribution of self-reports for the separate groups in several ways. I simulate the distribution of domain-specific self-reported health separately for males (high-education individuals) using their own thresholds, females (lower-education individuals) using their own thresholds, and then males (high-education) using female (lower-education) thresholds.
As a summary measure for each simulated distribution, I calculate the simulated proportion of males and females (or high and lower-education groups) who fall into the healthiest category. Therefore, to analyze the differences between groups, I can look at two estimates. The first is the difference between the simulated proportion of males and females (or high versus lower-education groups) in the healthiest category, calculated using their own group's coefficients estimated from the model. The second comparison is the difference between the simulated proportion of healthy males predicted using female thresholds and the simulated proportion of healthy females using female thresholds. This can be thought of as a DIF-adjusted gender comparison, and an analogous analysis can be conducted to compare high and lower education groups. This DIF-adjusted comparison illustrates how different the two groups would be if they used the same reporting thresholds.

In previous literature that has conducted these simulations, most analysis and interpretation has been conducted by simply comparing the distributions calculated using own-group thresholds and then the same thresholds for both groups. Without standard errors, however, it is difficult to draw definitive conclusions about how much the thresholds matter and whether significant differences still exist after adjustment. In order to conduct statistical inference, I analytically calculate standard errors for the two differences described above. The standard errors I calculate (Eq. C.11 and Eq. C.12) take into account correlations in covariates across individuals in a married couple. Section C.4 of the appendix goes into greater detail about the derivations of all the formulas used.

15 In the appendix, I estimate both an ordered probit and HOPIT model on the entire IFLS sample, in order to illustrate the importance of accounting for reporting heterogeneity. For pooled analyses of the HRS, ELSA, and CHARLS vignettes, see Dowd and Todd (2011), Bago d'Uva et al. (2011) and Mu (2014) respectively. Dowd and Todd (2011) and Bago d'Uva et al. (2011) use the exact same data as I use here, while Mu (2014) uses the pilot wave of the CHARLS. I use a slightly different specification from these papers.

4.6 Results

4.6.1 Simulations

In this section, I discuss the simulation results by gender and by education for each of the four datasets. 16 Table 4.3 reports the results of various simulations that compare males to females. Each panel summarizes the results from a different dataset, and each column represents a different domain. Every cell in the table reports the same summary measure of the simulated distribution: the proportion of individuals (in the given subgroup, either in the raw data or simulated using the specified parameters) that fall into the healthiest category (corresponding to a self-report response of one).

The first row simply reports the proportion of ones in the self-reports raw data for men, while the last row reports the proportion among women. These reflect the same numbers represented graphically in Figure C1. The second row uses the coefficients estimated using the male-specific HOPIT model to simulate the distribution of self-reports. Taking the explanatory variables for males as given, I use the male-specific coefficients to predict the proportion of the male sample in each self-report category and report the proportion in the healthiest category. The fourth row conducts the same exercise for the female sample. The middle row is the most informative. These calculations once again take the male explanatory variables and coefficients as given, but instead use the female thresholds (the threshold coefficients estimated for women) to predict the distribution of self-reports among men. This essentially predicts what the male distribution would look like if they had the same thresholds as women.
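The counterfactual in the middle row can be sketched in a few lines of Python. This is a minimal illustration, not the estimation code behind the tables: it assumes a standard normal latent error, a latent index x'beta in which higher values mean more health difficulty, and a first threshold that is linear in the same covariates (tau1 = x'gamma1), so that P(report = 1) = Phi(x'gamma1 - x'beta). The function names, the notation, and every numeric value below are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written with math.erf to avoid extra dependencies."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(ai * bi for ai, bi in zip(a, b))


def share_healthiest(X, beta, gamma1, weights=None):
    """Weighted share of a sample predicted to report category 1 (healthiest).

    Assumed model (hypothetical notation):
        P(y_i = 1) = Phi(x_i'gamma1 - x_i'beta)
    """
    if weights is None:
        weights = [1.0] * len(X)
    weighted_sum = sum(
        w * normal_cdf(dot(x, gamma1) - dot(x, beta))
        for x, w in zip(X, weights)
    )
    return weighted_sum / sum(weights)


# Hypothetical inputs: each row is (constant, age dummy) for one male respondent.
X_male = [[1.0, 0.0], [1.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
beta_male = [0.2, 0.5]      # made-up latent health coefficients for men
gamma1_male = [0.9, 0.1]    # made-up first-threshold coefficients for men
gamma1_female = [0.6, 0.1]  # made-up first-threshold coefficients for women

# Row 2 analogue: male covariates and coefficients with male thresholds.
own = share_healthiest(X_male, beta_male, gamma1_male)
# Row 3 analogue: same covariates and coefficients, female thresholds.
counterfactual = share_healthiest(X_male, beta_male, gamma1_female)

print(f"male sample, male thresholds:   {own:.4f}")
print(f"male sample, female thresholds: {counterfactual:.4f}")
```

Only the threshold coefficients change between the two calls, so under these assumptions the gap between the two shares isolates the contribution of reporting behavior.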
In the IFLS and ELSA data, the third row narrows the gap between males (row 2) and females (row 4) in all domains. In the HRS, the gap is narrowed for cognition, affect, and breathing, but widened in mobility, pain, and sleep. In the CHARLS, the gender gap is close to eliminated in the pain domain and is narrowed in several others. In general, the significance of the reductions or increases that take place is unclear.

Table 4.4, which summarizes the results of this same analysis conducted instead to compare high-education to lower-education individuals, shows a more universal pattern across countries. Across the overwhelming majority of domains and datasets, using the same thresholds for both groups does not narrow the education gap and in fact, seems to widen it. In all domains for the IFLS and HRS and at least four domains in the CHARLS and ELSA, the numbers in row 3 are of larger magnitude than those in row 2, indicating that the proportion of high education individuals falling into the healthiest category increases when predicted using the same thresholds as lower-education individuals. This is because high education individuals usually have a lower first threshold: although they may be healthier than lower-education individuals, they are also less likely to categorize themselves or others as having no difficulty with a particular health problem. 17 This results in an understatement of differences across education levels.

16 Appendix section C.5.1 contains a discussion of the individual coefficients from the pooled HOPIT model for the IFLS, in order to illustrate the differences between an ordered probit and the HOPIT model.

Table 4.3 Simulated Proportion Falling in Healthiest Category, by Gender

                                             (1)        (2)        (3)        (4)        (5)        (6)
IFLS                                         Mobility   Pain       Cognition  Sleep      Affect     Breathing
1 Male sample raw data                       76.16%     54.95%     65.25%     66.96%     78.53%     85.28%
2 Male sample using Male thresholds          75.15%     54.99%     61.87%     66.03%     77.10%     84.04%
3 Male sample using Female thresholds        71.00%     54.31%     59.06%     62.10%     76.10%     82.18%
4 Female sample using Female thresholds      69.74%     44.98%     53.18%     54.09%     67.50%     84.46%
5 Female sample raw data                     71.70%     44.67%     54.15%     55.08%     68.04%     85.60%

HRS                                          Mobility   Pain       Cognition  Sleep      Affect     Breathing
1 Male sample raw data                       51.07%     14.96%     38.43%     21.32%     50.81%     68.82%
2 Male sample using Male thresholds          53.16%     17.73%     40.45%     25.92%     53.64%     70.37%
3 Male sample using Female thresholds        58.77%     18.60%     33.76%     26.20%     44.76%     66.29%
4 Female sample using Female thresholds      52.54%     15.40%     36.14%     21.30%     44.18%     70.96%
5 Female sample raw data                     50.99%     12.29%     35.10%     17.25%     42.06%     69.09%

ELSA                                         Mobility   Pain       Cognition  Sleep      Affect     Breathing
1 Male sample raw data                       64.64%     24.19%     43.39%     34.07%     65.59%     76.19%
2 Male sample using Male thresholds          65.40%     25.28%     45.44%     36.92%     67.96%     78.05%
3 Male sample using Female thresholds        63.38%     18.59%     36.73%     34.94%     57.49%     67.17%
4 Female sample using Female thresholds      60.49%     18.13%     42.07%     24.23%     56.26%     73.56%
5 Female sample raw data                     59.56%     17.36%     40.59%     22.93%     54.63%     71.90%

CHARLS                                       Mobility   Pain       Cognition  Sleep      Affect     Breathing
1 Male sample raw data                       85.80%     59.02%     70.03%     66.94%     75.36%     83.00%
2 Male sample using Male thresholds          86.37%     57.35%     64.33%     68.01%     75.19%     82.25%
3 Male sample using Female thresholds        85.03%     50.89%     61.44%     59.83%     70.15%     86.76%
4 Female sample using Female thresholds      82.61%     48.69%     52.79%     52.71%     66.45%     78.35%
5 Female sample raw data                     83.55%     52.03%     57.45%     54.53%     65.93%     76.50%

Notes:
- Individual cross-sectional sampling weights are used.
- Proportions are calculated using coefficients from a HOPIT specification with the following explanatory variables: two age dummies, 1(High Ed), 1(Medium Ed), and all age-education interactions.

Table 4.4 Simulated Proportion Falling in Healthiest Category, by Education Level

                                             (1)        (2)        (3)        (4)        (5)        (6)
IFLS                                         Mobility   Pain       Cognition  Sleep      Affect     Breathing
1 High-Ed sample raw data                    84.39%     55.40%     61.67%     68.13%     74.69%     86.53%
2 High-Ed sample using High-Ed thresholds    81.49%     57.55%     58.82%     67.67%     74.28%     84.02%
3 High-Ed sample using Lower-Ed thresholds   89.49%     69.78%     68.72%     68.63%     81.01%     88.80%
4 Lower-Ed sample using Lower-Ed thresholds  70.58%     48.13%     57.48%     58.46%     72.47%     84.18%
5 Lower-Ed sample raw data                   71.52%     48.58%     59.68%     59.46%     73.44%     85.11%

HRS                                          Mobility   Pain       Cognition  Sleep      Affect     Breathing
1 High-Ed sample raw data                    64.23%     18.13%     45.72%     23.65%     54.10%     79.89%
2 High-Ed sample using High-Ed thresholds    64.51%     21.01%     46.07%     27.13%     55.70%     79.88%
3 High-Ed sample using Lower-Ed thresholds   70.90%     30.50%     64.45%     45.42%     63.12%     90.03%
4 Lower-Ed sample using Lower-Ed thresholds  48.23%     14.37%     34.59%     22.53%     45.70%     67.12%
5 Lower-Ed sample raw data                   45.86%     11.70%     33.06%     17.32%     42.89%     64.70%

ELSA                                         Mobility   Pain       Cognition  Sleep      Affect     Breathing
1 High-Ed sample raw data                    70.45%     24.73%     49.08%     28.28%     61.65%     81.27%
2 High-Ed sample using High-Ed thresholds    70.95%     25.54%     50.65%     30.20%     63.22%     82.13%
3 High-Ed sample using Lower-Ed thresholds   68.05%     29.47%     61.55%     46.33%     79.62%     84.34%
4 Lower-Ed sample using Lower-Ed thresholds  58.19%     19.28%     39.91%     30.51%     61.01%     72.13%
5 Lower-Ed sample raw data                   57.03%     18.15%     37.79%     27.99%     58.62%     69.67%

CHARLS                                       Mobility   Pain       Cognition  Sleep      Affect     Breathing
1 High-Ed sample raw data                    92.04%     64.31%     75.31%     71.37%     80.44%     86.79%
2 High-Ed sample using High-Ed thresholds    91.22%     62.09%     68.50%     70.49%     80.24%     87.83%
3 High-Ed sample using Lower-Ed thresholds   87.91%     63.15%     63.25%     72.17%     85.82%     92.96%
4 Lower-Ed sample using Lower-Ed thresholds  80.29%     47.18%     50.99%     52.99%     64.28%     75.51%
5 Lower-Ed sample raw data                   80.07%     49.94%     55.13%     53.08%     63.86%     75.33%

Notes:
- "Lower-Ed" pools both the medium and low education categories.
- Individual cross-sectional sampling weights are used.
- Proportions are calculated using coefficients from a HOPIT specification with the following explanatory variables: two age dummies, 1(Male), and all age-gender interactions.

4.6.2 Standard Errors for Simulated Probabilities

The preceding discussion about the importance of response thresholds has been based on simply comparing one simulated proportion to another, without considering statistical significance. Not only are the simulated proportions calculated from estimated parameters, but they are also calculated using the distribution of covariates in a sample of the true population. For many comparisons, including some of the education comparisons discussed here, standard errors may be less important because definitive conclusions can be drawn without them. For the domains where significant education differences existed in the raw data, if adjusting for DIF widens the difference between the proportion of high education and low education individuals that fall into the healthiest category, it is clear that the use of different thresholds at the very least does nothing to explain the education gap, and at most, masks even larger differences.
However, certain types of analysis, like that of the gender gap, require more subtlety. For instance, in the sleep domain of the IFLS, where using female thresholds to predict male distributions appeared to narrow the gender gap slightly but not completely (dropping the male proportion from 66% to 62%, bringing it closer to but still somewhat higher than the female proportion of 54%), it is unclear whether males and females remain significantly different even after the same thresholds are used. The opposite problem exists with, for example, the mobility domain of the HRS, where the groups seemed similar initially but diverged when the same thresholds were used. This second issue is also relevant to some education comparisons, where differences appeared trivial to begin with and widened after the DIF adjustment.

17 A specific example of this is discussed in more detail in the appendix, section C.5.1.

In order to assess the statistical significance of the differences between sub-groups, before and after accounting for thresholds, I calculate standard errors for two differences: first, the difference between the male (high-education) proportion in the healthiest category, predicted using male (high-education) thresholds, and the female (lower-education) proportion in the healthiest category, predicted using female (lower-education) thresholds (row 2 minus row 4 in Tables 4.3 and 4.4); second, the difference between the male (high-education) proportion in the healthiest category, predicted using female (lower-education) thresholds, and the female (lower-education) proportion using female (lower-education) thresholds (row 3 minus row 4 of Tables 4.3 and 4.4). The formulas for the estimated variances are in the appendix (section C.4, Eq. C.11 for the gender differences and Eq. C.12 for the education differences).
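As an arithmetic illustration of the two comparisons, the sketch below forms both differences from three simulated proportions and converts each into a t-statistic. It does not reproduce the analytical variance calculation in Eq. C.11 and Eq. C.12: the standard errors are taken as given inputs, and the star cutoffs assume two-sided standard normal critical values (1.645, 1.960, 2.576), which is an assumption made here rather than something stated in the text. The function names are hypothetical; the inputs are the IFLS sleep entries from Tables 4.3 and 4.5.

```python
def group_differences(p_a_own, p_a_other_thresholds, p_b_own):
    """Return (unadjusted, DIF-adjusted) differences in the healthiest-category share.

    unadjusted:   group A with its own thresholds minus group B with its own thresholds
    DIF-adjusted: group A with group B's thresholds minus group B with its own thresholds
    """
    return p_a_own - p_b_own, p_a_other_thresholds - p_b_own


def t_statistic(difference, standard_error):
    """Ratio of an estimated difference to its standard error."""
    return difference / standard_error


def significance_stars(t_stat):
    """Stars under assumed two-sided standard normal critical values."""
    size = abs(t_stat)
    if size >= 2.576:
        return "***"
    if size >= 1.960:
        return "**"
    if size >= 1.645:
        return "*"
    return ""


# IFLS sleep domain: rows 2, 3 and 4 of Table 4.3, expressed as proportions.
unadjusted, adjusted = group_differences(0.6603, 0.6210, 0.5409)

# Standard errors as reported for IFLS sleep in Table 4.5 (taken as given).
t_unadjusted = t_statistic(unadjusted, 0.0316)
t_adjusted = t_statistic(adjusted, 0.0332)

print(f"row 2 - row 4: {unadjusted:.4f}  t = {t_unadjusted:.3f}{significance_stars(t_unadjusted)}")
print(f"row 3 - row 4: {adjusted:.4f}  t = {t_adjusted:.3f}{significance_stars(t_adjusted)}")
```

Because the proportions in Table 4.3 are rounded to two decimal places, results computed this way agree with the Table 4.5 entries only up to rounding.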
In Tables 4.5 and 4.6, I report gender and education differences, along with their respective standard errors and t-statistics, for differences calculated using group-specific thresholds and differences calculated using the same thresholds for both subgroups. Each panel represents a different dataset, and each row represents a different domain. Perhaps the most informative comparisons to make are between columns 3 and 6. Those comparisons indicate whether significant differences between gender and education exist before adjustment for DIF and after adjustment for DIF.

The gender results reported in Table 4.5 reveal an important role for reporting behavior in explaining the gender gap, particularly in the ELSA and CHARLS. In the ELSA, five domains show significant differences before adjustment, but only one (sleep) remains significant after using the same thresholds to simulate the probabilities. In the CHARLS data, four domains start out with differences significant at the 10% level, but none remain significant after adjusting for DIF. For these two datasets, it is clear that reporting differences are driving the majority of the significant gender differences that show up in naive comparisons. On the other hand, in the IFLS, the differences in pain, sleep, and affect remain significant even after adjustment, although all of the differences are narrowed. In the HRS, significant differences in mobility, pain, and sleep remain even after adjusting for thresholds. Interestingly, the significant difference in the mobility domain arises only after adjusting for thresholds, suggesting that DIF in this case distorts naive comparisons by masking existing differences instead of generating spurious ones. It is surprising that the English and Chinese appear more similar (in terms of the absence of gender differences after adjustment) than the English and Americans or the Indonesians and Chinese, which represent pairings of countries at more similar stages of economic development.
Table 4.5 Standard Errors and t-statistics for Simulated Gender Differences

                 Using Different Thresholds                 Using Same Thresholds
                 (1)          (2)        (3)                (4)          (5)        (6)
                 Gender       Standard                      Gender       Standard
Domain           Difference   Error      t-statistic        Difference   Error      t-statistic

IFLS
Mobility         0.0541       0.0339     1.5972             0.0126       0.0445     0.2841
Pain             0.1001       0.0327     3.064***           0.0934       0.0396     2.355**
Cognition        0.0868       0.0334     2.597***           0.0588       0.0379     1.5497
Sleep            0.1193       0.0316     3.771***           0.0801       0.0332     2.412**
Affect           0.0960       0.0350     2.747***           0.0860       0.0427     2.014**
Breathing        -0.0042      0.0308     -0.1356            -0.0227      0.0376     -0.6048

HRS
Mobility         0.0062       0.0199     0.3098             0.0623       0.0306     2.036**
Pain             0.0233       0.0122     1.906*             0.0319       0.0180     1.774*
Cognition        0.0431       0.0183     2.35**             -0.0238      0.0340     -0.6997
Sleep            0.0462       0.0150     3.085***           0.0490       0.0235     2.082**
Affect           0.0946       0.0191     4.949***           0.0058       0.0394     0.1482
Breathing        -0.0059      0.0208     -0.2863            -0.0467      0.0453     -1.0308

ELSA
Mobility         0.0491       0.0218     2.257**            0.0290       0.0331     0.8738
Pain             0.0715       0.0178     4.011***           0.0046       0.0227     0.2037
Cognition        0.0337       0.0221     1.5284             -0.0533      0.0440     -1.2122
Sleep            0.1270       0.0202     6.287***           0.1072       0.0275     3.9***
Affect           0.1170       0.0211     5.542***           0.0123       0.0497     0.2473
Breathing        0.0449       0.0194     2.308**            -0.0640      0.0518     -1.2336

CHARLS
Mobility         0.0376       0.0552     0.6820             0.0242       0.0593     0.4076
Pain             0.0866       0.0507     1.708*             0.0220       0.0502     0.4387
Cognition        0.1153       0.0679     1.698*             0.0865       0.0679     1.2748
Sleep            0.1531       0.0535     2.859***           0.0712       0.0524     1.3599
Affect           0.0874       0.0504     1.735*             0.0370       0.0597     0.6198
Breathing        0.0390       0.0524     0.7448             0.0841       0.0565     1.4876

Notes: *** p<0.01, ** p<0.05, * p<0.1
- "Gender Difference" is the difference between the proportion of males in the healthiest category and the proportion of females in the healthiest category.
- Simulated proportions are calculated using coefficients from a HOPIT specification with the following explanatory variables: two age dummies, 1(High Ed), 1(Medium Ed), and all age-education interactions.
- Standard errors are calculated analytically.

Table 4.6 Standard Errors and t-statistics for Simulated Education Differences

                 Using Different Thresholds                 Using Same Thresholds
                 (1)          (2)        (3)                (4)          (5)        (6)
                 Education    Standard                      Education    Standard
Domain           Difference   Error      t-statistic        Difference   Error      t-statistic

IFLS
Mobility         0.1091       0.0463     2.354**            0.1892       0.0494     3.827***
Pain             0.0942       0.0414     2.277**            0.2165       0.0538     4.022***
Cognition        0.0134       0.0399     0.3365             0.1124       0.0457     2.458**
Sleep            0.0921       0.0381     2.416**            0.1017       0.0428     2.375**
Affect           0.0181       0.0438     0.4138             0.0854       0.0525     1.6272
Breathing        -0.0016      0.0385     -0.0404            0.0463       0.0416     1.1124

HRS
Mobility         0.1629       0.0231     7.042***           0.2267       0.0374     6.06***
Pain             0.0664       0.0150     4.427***           0.1613       0.0308     5.243***
Cognition        0.1148       0.0213     5.384***           0.2986       0.0510     5.858***
Sleep            0.0460       0.0177     2.599***           0.2289       0.0357     6.41***
Affect           0.0999       0.0224     4.453***           0.1742       0.0476     3.662***
Breathing        0.1276       0.0239     5.328***           0.2291       0.0383     5.984***

ELSA
Mobility         0.1275       0.0221     5.782***           0.0986       0.0336     2.934***
Pain             0.0627       0.0187     3.344***           0.1019       0.0320     3.187***
Cognition        0.1074       0.0226     4.758***           0.2164       0.0509     4.25***
Sleep            -0.0031      0.0209     -0.1489            0.1582       0.0303     5.225***
Affect           0.0221       0.0220     1.0018             0.1860       0.0406     4.587***
Breathing        0.1000       0.0198     5.062***           0.1221       0.0405     3.017***

CHARLS
Mobility         0.1093       0.0661     1.655*             0.0762       0.0689     1.1066
Pain             0.1491       0.0584     2.551**            0.1597       0.0623     2.563**
Cognition        0.1751       0.0915     1.914*             0.1226       0.0834     1.4709
Sleep            0.1751       0.0758     2.309**            0.1918       0.0843     2.276**
Affect           0.1597       0.0579     2.758***           0.2155       0.0671     3.211***
Breathing        0.1233       0.0637     1.934*             0.1746       0.0706     2.471**

Notes: *** p<0.01, ** p<0.05, * p<0.1
- "Education Difference" is the difference between the proportion of high-ed individuals in the healthiest category and the proportion of lower-ed individuals in the healthiest category.
- Simulated proportions are calculated using coefficients from a HOPIT specification with the following explanatory variables: two age dummies, 1(Male), and all age-gender interactions.
- Standard errors are calculated analytically.

Nevertheless, the narrowing or elimination of gender gaps as a general result is broadly consistent with findings from existing studies that analyze biomarkers and other objective health measures from these datasets. For example, in CHARLS data, the magnitude of the female disadvantage in hypertension, diabetes, depression, and cognition measures is much smaller than the magnitude of their disadvantage in self-reported health (Zhao et al., 2012). For cognition specifically, Lei et al. (2013) find that the significant female disadvantage in objective measures is almost completely explained (for mental intactness) or completely explained (for episodic memory) by differences in education levels. Crimmins et al. (2010) look at gender differences in the prevalence of various conditions in HRS and ELSA data and find that women are significantly more likely to have certain disabling conditions (like arthritis or depressive symptoms) than men.
While this is consistent with my result that HRS gender differences still remain significant after adjustment, it seems contradictory to the result that most ELSA gender differences do disappear after adjustment. It should be noted, however, that each domain self-report potentially takes into account a number of conditions: some conditions that afflict women more (hypertension, functional limitations), as well as conditions more prevalent among men (heart problems, stroke, diabetes). As a result, the significance, sign, and magnitude of a gender difference in self-reported health is in part driven by the relative severities and prevalences of the two sets of conditions. In the United States, for example, there is a much higher prevalence of hypertension and functional limitation than in the ELSA (Crimmins et al., 2010), which could explain why women in the HRS are significantly worse off than men in the pain domain, for example, but not in the ELSA. 18 Potential explanations aside, this discussion highlights an important point: what is captured by self-reported health is not necessarily the same as what is captured by objective measures like disease prevalence rates.

Table 4.6 tells a more straightforward story. On the whole, education differences in reporting behavior appear to be masking larger underlying differences between the two groups. In the IFLS, although only three domains show significant education differences before adjustment, using the same thresholds to adjust for DIF reveals significant differences in an additional domain (cognition). Similarly, in the ELSA data, unadjusted significant differences exist in only four domains, but significant differences in the adjusted proportions exist in all six. For the HRS, significant differences are found both before and after adjustment in all six domains. The CHARLS shows significant differences in all six domains before adjustment, but for mobility and cognition, the differences narrow and become insignificant after adjusting for DIF.
Despite this, across all datasets, including CHARLS, education differences are generally quite large and persistent. For pain and sleep, all datasets show significant differences across education levels after accounting for reporting heterogeneity.

18 Though hypertension itself may not result in more pain, related conditions like obesity or inactivity might. I thank an anonymous reviewer for making this point.

4.7 Chapter 4 Conclusion

Anchoring vignettes are a vital tool that can be used to account for reporting bias in subjective scale measures. Ignoring DIF underestimates the differences in health across education levels in Indonesia, the United States, England, and China, because educated individuals have a higher bar for considering someone healthy. If individuals' evaluations of health are based partially on comparisons with peers, this could be because more educated people are surrounded by more educated and healthier peers and therefore have a tendency to consider themselves (and hypothetical individuals) relatively less healthy. 19 If schooling directly affects one's knowledge about health and disease, then it is also possible that more educated individuals are simply more aware of potential threats to health, or more knowledgeable about the consequences of certain symptoms.

The result that education disparities in health can be underestimated by reporting heterogeneity is consistent with previous literature that uses the same datasets (Dowd and Todd, 2011; Bago d'Uva et al., 2011), as well as with studies of elderly health in different countries (Bago d'Uva et al., 2008a). However, the universality of this finding should not be overstated: it does not appear to be true in younger populations (Bago d'Uva et al., 2008b), or for variables other than domain-specific self-reported health. Using general self-reported health instead of the domain-specific health I use here, Grol-Prokopczyk et al. (2011) find that the education gap actually diminishes after adjustment.
For work disability, the results are mixed (Kapteyn et al., 2007; Angelini et al., 2011).

This paper's conclusions about gender differences are slightly less uniform than its education results. Although significant differences between males and females remain in three out of the six domains for the IFLS and HRS even after adjusting for thresholds, in England and China, accounting for thresholds completely eliminates significant differences between males and females in all but one domain (sleep in the ELSA). Overall, however, it is clear that reporting differences across gender are important, as gender gaps are narrowed after adjustment in the majority of domains for all datasets except the HRS.

Previous vignette studies have found that both male and female respondents rate a given vignette condition as more severe when the hypothetical vignette individual is female (Kapteyn et al., 2007). Together with the results of this paper, these findings suggest that the gender of the object of evaluation, whether a hypothetical individual or one's own self, plays a role in shaping the elicited evaluations of health. Separating the effect of the respondent's gender from the effect of the object's gender is outside the scope of this paper, 20 but existing research suggests that the gender of the respondent matters much more than the gender of the vignette individual (Grol-Prokopczyk, 2014). What I can conclude from this analysis is that, irrespective of the reasons for their use of different thresholds, males and females in the ELSA, CHARLS, and to a lesser extent, IFLS, would report much more similar levels of health if they used the same thresholds.

19 See Dowd and Todd (2011) for a more detailed discussion.

The narrowing of the gender gap after adjusting for reporting heterogeneity provides empirical support for the hypothesis that differential reporting behavior may play a partial role in the gender puzzle discussed in section 4.1.
Males are more stoic in their evaluations of health, which leads to overstated differences between the self-reports of each gender that are not aligned with differences in objective measures. 21 Although this finding holds true across the majority of domain-dataset combinations in this paper, it is a partial explanation at best. Gender gaps fail to narrow after adjustment, not only in several HRS health domains in this paper, but also in other vignette studies that use different measures of health (Kapteyn et al., 2007; Grol-Prokopczyk et al., 2011; Angelini et al., 2011).

Education disparities in self-reported health appear to reflect true (and, if anything, understated) differences in health. Although over-stated in some contexts, gender inequalities also do exist (particularly in the HRS). Both of these findings emphasize the importance of pinning down the causal mechanisms linking health, gender, education, and related life outcomes. They also highlight how crucial it is to consider reporting heterogeneity when comparing self-reported measures. Fortunately, the increasing availability of anchoring vignettes in surveys across the globe is making it easier to avoid relying on naive, distorted comparisons of self-reported health.

20 Although some studies are able to include vignette gender as a variable in the vignette latent variable equation, I do not have this information for all four datasets.

21 These heterogeneous reporting styles are likely related to the tendency of women to incorporate a wider range of non-physical factors into self-reports (Benyamini et al., 2000) or societal expectations that consider males the tougher gender (Courtenay, 2000).

Chapter 5

Conclusion

The concept of human capital incorporates a wide variety of distinct but related elements, including education and health.
Given the vastness of this topic, understanding the process of human capital formation requires answers to a seemingly unlimited set of questions, of which this thesis answers a narrow subset. For example, what determines the level of human capital, including cognitive ability and physical health, that a child is born with? How do parents and individuals invest in these human capital endowments from early life until adulthood? For children who are endowed with lower levels of human capital, can policy interventions help them catch up with the rest of their peers? Focusing on the health component of human capital, I also investigate health disparities using subjective measures of health. Are there differences in health levels across genders and education levels? Can we distinguish between "true" differences and those driven by differences in reporting behaviors?

The chapters of this thesis provide some new insight to the above questions. First of all, there are many components of human capital, like physical health and cognitive ability, that can be permanently affected by events early in life, including before birth. But how these human capital endowments eventually translate into adult well-being depends on a number of factors, including the gender of the individual, the labor market conditions faced by that individual, and the policies that affect that individual throughout various stages of life. Specifically, I show that in utero pollution exposure affects cognitive ability for both men and women, but this results in different schooling responses for boys and girls because of the different labor market conditions that men and women face. In addition, rainfall shocks around the time of birth have negative effects on educational attainment and employment in agricultural settings, but these negative effects can be partially mitigated by conditional cash transfer programs like Progresa.
Another lesson from this thesis, which is borne out in very different ways in chapters 2 and 4, is the fact that gender differences in human capital levels or investment responses cannot always be attributed to simple biological differences between men and women. They might, instead, reflect more nuanced pathways, like gender-specific labor market conditions or differential reporting behavior, which require more careful analysis to tease out.

Finally, I demonstrate that using subjective measurements of health to document health disparities across genders and socioeconomic groups requires some caution. With the help of anchoring vignettes, I am able to show that health differences across education levels persist even after accounting for response heterogeneity. There are clearly important links between these different components of human capital. On the other hand, consistent with the idea that gender differences are not always straightforward, differences between men and women become much more muted when response thresholds are allowed to vary across genders.

The analysis conducted in this thesis highlights the complexity involved in studying what drives human capital differences across individuals. Identifying causal relationships is not straightforward, as these questions are rife with endogeneity concerns. When exogenous variation in endowments and investments can be found, in weather shocks or randomly assigned interventions, they can substantially advance our knowledge on this topic. Similarly, it is important to think carefully about measurement issues when dealing with components of human capital that are not straightforward to measure. Fortunately, improvements in survey tools and empirical methodologies have brought huge improvements in our ability to tackle these difficult questions.

Bibliography

Acemoglu, D., Autor, D. H., and Lyle, D. (2004). Women, war, and wages: The effect of female labor supply on the wage structure at midcentury.
Journal of Political Economy, 112(3):497–551.
Adhvaryu, A., Fenske, J., and Nyshadham, A. (2016). Early life circumstance and adult mental health. Technical report, Centre for the Study of African Economies, University of Oxford.
Adhvaryu, A., Molina, T., Nyshadham, A., and Tamayo, J. (2015). Helping Children Catch Up: Early Life Shocks and the Progresa Experiment.
Aizer, A. and Cunha, F. (2012). The production of human capital: Endowments, investments and fertility. Technical report, National Bureau of Economic Research.
Aizer, A., Eli, S., Ferrie, J., and Lleras-Muney, A. (2016). The long-run impact of cash transfers to poor families. American Economic Review, 106(4):935–71.
Almond, D. (2006). Is the 1918 Influenza pandemic over? Long-term effects of in utero Influenza exposure in the post-1940 US population. Journal of Political Economy, 114(4):672–712.
Almond, D. and Currie, J. (2011). Killing me softly: The fetal origins hypothesis. The Journal of Economic Perspectives, pages 153–173.
Almond, D., Edlund, L., and Palme, M. (2009). Chernobyl's subclinical legacy: prenatal exposure to radioactive fallout and school outcomes in Sweden. Quarterly Journal of Economics, (November).
Almond, D. and Mazumder, B. (2013). Fetal origins and parental responses. Annu. Rev. Econ., (January 2013).
Anderson, M. L. (2008). Multiple inference and gender differences in the effects of early intervention: A reevaluation of the Abecedarian, Perry Preschool, and Early Training Projects. Journal of the American Statistical Association, 103(484):1481–1495.
Angelini, V., Cavapozzi, D., and Paccagnella, O. (2011). Dynamics of reporting work disability in Europe. Journal of the Royal Statistical Society: Series A (Statistics in Society), 174(3):621–638.
Arceo, E., Hanna, R., and Oliva, P. (2016). Does the Effect of Pollution on Infant Mortality Differ Between Developing and Developed Countries? Evidence from Mexico City. The Economic Journal, 126(591):257–280.
Atkin, D. (2016).
Endogenous Skill Acquisition and Export Manufacturing in Mexico. Technical Report 8.
Attanasio, O. and Kaufmann, K. (2012). Education choices and returns on the labor and marriage markets: Evidence from data on subjective expectations.
Attanasio, O. P. and Kaufmann, K. M. (2014). Education choices and returns to schooling: Mothers' and youths' subjective expectations and their role by gender. Journal of Development Economics, 109:203–216.
Aubard, Y. and Magne, I. (2000). Carbon monoxide poisoning in pregnancy. BJOG: An International Journal of Obstetrics & Gynaecology, 107(7):833–838.
Backes, C. H., Nelin, T., Gorr, M. W., and Wold, L. E. (2013). Early life exposure to air pollution: how bad is it? Toxicology Letters, 216(1):47–53.
Bago d'Uva, T., Lindeboom, M., O'Donnell, O., and Van Doorslaer, E. (2011). Slipping anchor? Testing the vignettes approach to identification and correction of reporting heterogeneity. Journal of Human Resources, 46(4):875–906.
Bago d'Uva, T., O'Donnell, O., and van Doorslaer, E. (2008a). Differential health reporting by education level and its impact on the measurement of health inequalities among older Europeans. International Journal of Epidemiology, 37(6):1375–1383.
Bago d'Uva, T., Van Doorslaer, E., Lindeboom, M., and O'Donnell, O. (2008b). Does reporting heterogeneity bias the measurement of health disparities? Health Economics, 17(3):351–375.
Barber, S. L. and Gertler, P. J. (2008). The impact of Mexico's conditional cash transfer programme, Oportunidades, on birthweight. Tropical Medicine & International Health, 13(11):1405–1414.
Barham, T. (2011). A healthier start: the effect of conditional cash transfers on neonatal and infant mortality in rural Mexico. Journal of Development Economics, 94(1):74–85.
Barham, T. and Rowberry, J. (2013). Living longer: The effect of the Mexican conditional cash transfer program on elderly mortality. Journal of Development Economics, 105:226–236.
Barker, D. J. (1990).
The fetal and infant origins of adult disease. BMJ: British Medical Journal, 301(6761):1111.
Bartik, T. J. (1991). Who benefits from state and local economic development policies. W. E. Upjohn Institute for Employment Research, Kalamazoo, Michigan.
Becker, G. S. (1962). Investment in human capital: A theoretical analysis. Journal of Political Economy, 70(5, Part 2):9–49.
Behrman, J. R., Fernald, L., Gertler, P., Neufeld, L. M., and Parker, S. (2008). Long-term effects of Oportunidades on rural infant and toddler development, education and nutrition after almost a decade of exposure to the program, volume I, chapter 1, pages 15–58. Secretaría de Desarrollo Social.
Behrman, J. R., Parker, S. W., and Todd, P. E. (2009). Medium-term impacts of the Oportunidades conditional cash transfer program on rural youth in Mexico. Poverty, Inequality and Policy in Latin America, pages 219–70.
Behrman, J. R., Parker, S. W., and Todd, P. E. (2011). Do conditional cash transfers for schooling generate lasting benefits? A five-year followup of PROGRESA/Oportunidades. Journal of Human Resources, 46(1):93–122.
Behrman, J. R. and Rosenzweig, M. R. (2004). Returns to Birthweight. The Review of Economics and Statistics, 86(2):586–601.
Behrman, J. R. and Todd, P. E. (1999). Randomness in the experimental samples of Progresa (education, health, and nutrition program). International Food Policy Research Institute, Washington, DC.
Belzil, C. (2007). The return to schooling in structural dynamic models: a survey. European Economic Review, 51(5):1059–1105.
Benyamini, Y., Leventhal, E. A., and Leventhal, H. (2000). Gender differences in processing information for making self-assessments of health. Psychosomatic Medicine, 62(3):354–364.
Bhalotra, S. and Venkataramani, A. (2013). Cognitive Development and Infectious Disease: Gender Differences in Investments and Outcomes.
Bhalotra, S. R. and Venkataramani, A. (2011).
The captain of the men of death and his shadow: Long-run impacts of early life pneumonia exposure. Technical report, Discussion Paper series, Forschungsinstitut zur Zukunft der Arbeit.
Bharadwaj, P., Gibson, M., Zivin, J., and Neilson, C. (2014). Gray Matters: Fetal Pollution Exposure and Human Capital Formation. econweb.ucsd.edu.
Bitler, M. P., Hoynes, H. W., and Domina, T. (2014). Experimental evidence on distributional effects of Head Start. Technical Report 20434, National Bureau of Economic Research.
Black, S., Bütikofer, A., Devereux, P., and Salvanes, K. (2014). This Is Only a Test? Long-Run and Intergenerational Impacts of Prenatal Exposure to Radioactive Fallout. Scientific American, pages 1–50.
Black, S. E., Devereux, P. J., and Salvanes, K. G. (2007). From the Cradle to the Labor Market? The Effect of Birth Weight on Adult Outcomes. The Quarterly Journal of Economics, 122(1):409–439.
Blattman, C., Fiala, N., and Martinez, S. (2013). The economic and social returns to cash transfers: Evidence from a Ugandan aid program. Technical report, CEGA Working Paper.
Bleakley, H. (2007). Disease and development: evidence from hookworm eradication in the American South. The Quarterly Journal of Economics, 122(1):73.
Bleakley, H. (2010). Malaria eradication in the Americas: A retrospective analysis of childhood exposure. American Economic Journal: Applied Economics, 2(2).
Bobonis, G. J. (2009). Is the allocation of resources within the household efficient? New evidence from a randomized experiment. Journal of Political Economy, 117(3):453–503.
Bobonis, G. J., Miguel, E., and Puri-Sharma, C. (2006). Anemia and school participation. Journal of Human Resources, 41(4):692–721.
Case, A. and Paxson, C. (2005). Sex differences in morbidity and mortality. Demography, 42(2):189–214.
Chay, K. Y. and Greenstone, M. (2003). The Impact of Air Pollution on Infant Mortality: Evidence from Geographic Variation in Pollution Shocks Induced by a Recession.
The Quarterly Journal of Economics, 118(3):1121–1167.
Chetty, R., Hendren, N., and Katz, L. F. (2016). The effects of exposure to better neighborhoods on children: New evidence from the Moving to Opportunity experiment. American Economic Review, 106(4):855–902.
Conley, T. (1999). GMM estimation with cross sectional dependence. Journal of Econometrics, 92(1):1–45.
Conti, G. and Heckman, J. J. (2014). Economics of Child Well-Being, pages 363–401. Springer Netherlands, Dordrecht.
Conti, G., Heckman, J. J., and Pinto, R. (2015). The effects of two influential early childhood interventions on health and healthy behaviors. Technical report, National Bureau of Economic Research.
Courtenay, W. H. (2000). Constructions of masculinity and their influence on men's well-being: a theory of gender and health. Social Science & Medicine, 50(10):1385–1401.
Crimmins, E. M., Kim, J. K., and Solé-Auró, A. (2010). Gender differences in health: results from SHARE, ELSA and HRS. The European Journal of Public Health, page ckq022.
Cunha, F. and Heckman, J. (2007). The Technology of Skill Formation. The American Economic Review, 97(2):31–47.
Cunha, F. and Heckman, J. J. (2008). Formulating, identifying and estimating the technology of cognitive and noncognitive skill formation. Journal of Human Resources, 43(4):738–782.
Cunha, F., Heckman, J. J., and Schennach, S. M. (2010). Estimating the technology of cognitive and noncognitive skill formation. Econometrica, 78(3):883–931.
Currie, J. and Neidell, M. (2005). Air Pollution and Infant Health: What Can We Learn From California's Recent Experience? Quarterly Journal of Economics, 120(3):1003–1030.
Currie, J. and Vogl, T. (2013). Early-Life Health and Adult Circumstance in Developing Countries. Annu. Rev. Econ, 5:1–36.
Currie, J., Zivin, J. G., Mullins, J., and Neidell, M. (2014). What Do We Know About Short- and Long-Term Effects of Early-Life Exposure to Pollution? Annual Review of Resource Economics, 6(March):217–247.
Cutler, D., Fung, W., Kremer, M., Singhal, M., and Vogl, T. (2010). Early-life malaria exposure and adult outcomes: Evidence from malaria eradication in India. American Economic Journal: Applied Economics, pages 72–94.
Cutler, D. M. and Lleras-Muney, A. (2006). Education and health: evaluating theories and evidence. Technical report, National Bureau of Economic Research.
De Janvry, A., Emerick, K., Gonzalez-Navarro, M., and Sadoulet, E. (2015). Delinking land rights from land use: Certification and migration in Mexico. The American Economic Review, 105(10):3125–3149.
DeSalvo, K. B., Bloser, N., Reynolds, K., He, J., and Muntner, P. (2006). Mortality prediction with a single general self-rated health question. Journal of General Internal Medicine, 21(3):267–275.
Dinkelman, T. (2013). Mitigating long-run health effects of drought: Evidence from South Africa. Technical Report 19756, National Bureau of Economic Research.
Dobbing, J. and Sands, J. (1973). Quantitative growth and development of human brain. Archives of Disease in Childhood, 48(10):757–767.
Dow, W. H., Gertler, P., Schoeni, R. F., Strauss, J., and Thomas, D. (1997). Health care prices, health and labor outcomes: Experimental evidence. RAND.
Dowd, J. B. and Todd, M. (2011). Does self-reported health bias the measurement of health inequalities in US adults? Evidence using anchoring vignettes from the Health and Retirement Study. The Journals of Gerontology Series B: Psychological Sciences and Social Sciences, 66(4):478–489.
Duque, V. (2017). Early-life conditions and child development: Evidence from a violent conflict. Social Science & Medicine, 3:121–131.
Eisenhauer, P., Heckman, J. J., and Mosso, S. (2015). Estimation of dynamic discrete choice models by maximum likelihood and the simulated method of moments. International Economic Review, 56(2):331–357.
Fernald, L. C., Gertler, P. J., and Hou, X. (2008a).
Cash component of conditional cash transfer program is associated with higher body mass index and blood pressure in adults. The Journal of Nutrition, 138(11):2250–2257.
Fernald, L. C., Gertler, P. J., and Neufeld, L. M. (2008b). Role of cash in conditional cash transfer programmes for child health, growth, and development: an analysis of Mexico's Oportunidades. The Lancet, 371(9615):828–837.
Fernald, L. C., Gertler, P. J., and Neufeld, L. M. (2009). 10-year effect of Oportunidades, Mexico's conditional cash transfer programme, on child growth, cognition, language, and behaviour: a longitudinal follow-up study. The Lancet, 374(9706):1997–2005.
Fernald, L. C., Hou, X., and Gertler, P. J. (2008c). Oportunidades program participation and body mass index, blood pressure, and self-reported health in Mexican adults. Prev Chronic Dis, 5(3):A81.
Field, E., Robles, O., and Torero, M. (2009). Iodine Deficiency and Schooling Attainment in Tanzania. American Economic Journal: Applied Economics, pages 140–169.
Fink, G., Venkataramani, A., and Zanolini, A. (2015). Do it well or not at all? Malaria control and child development in Zambia. Malaria Control and Child Development in Zambia (December 18, 2015).
Finkelstein, A., Taubman, S., Wright, B., Bernstein, M., Gruber, J., Newhouse, J. P., Allen, H., and Baicker, K. (2012). The Oregon Health Insurance Experiment: Evidence from the First Year. Quarterly Journal of Economics.
Gertler, P. (2004). Do conditional cash transfers improve child health? Evidence from Progresa's control randomized experiment. American Economic Review, 94(2):336–341.
Gertler, P. and Gruber, J. (2002). Insuring consumption against illness. American Economic Review, 92(1):51–70.
Gertler, P., Heckman, J., Pinto, R., Zanolini, A., Vermeersch, C., Walker, S., Chang, S. M., and Grantham-McGregor, S. (2014). Labor market returns to an early childhood stimulation intervention in Jamaica. Science, 344(6187):998–1001.
Gertler, P. J., Martinez, S.
W., and Rubio-Codina, M. (2012). Investing cash transfers to raise long-term living standards. American Economic Journal: Applied Economics, pages 164–192.
Gould, E. D., Lavy, V., and Paserman, M. D. (2011). Sixty years after the magic carpet ride: The long-run effect of the early childhood environment on social and economic outcomes. The Review of Economic Studies, 78(3):938–973.
Griliches, Z. and Mason, W. M. (1972). Education, Income, and Ability. Journal of Political Economy, 80(3):S74–S103.
Grol-Prokopczyk, H. (2014). Age and sex effects in anchoring vignette studies: Methodological and empirical contributions. In Survey Research Methods, volume 8, page 1. NIH Public Access.
Grol-Prokopczyk, H., Freese, J., and Hauser, R. M. (2011). Using anchoring vignettes to assess group differences in general self-rated health. Journal of Health and Social Behavior, 52(2):246–261.
Grol-Prokopczyk, H., Verdes-Tennant, E., McEniry, M., and Ispány, M. (2015). Promises and pitfalls of anchoring vignettes in health survey research. Demography, 52(5):1703–1728.
Grossman, M. (2006). Education and nonmarket outcomes. Handbook of the Economics of Education, 1:577–633.
Gunnsteinsson, S., Adhvaryu, A., Christian, P., Labrique, A., Sugimoto, J., Shamim, A. A., and West Jr., K. P. (2014). Vitamin A and Resilience to Early Life Shocks. Working Paper.
Gunnsteinsson, S., Adhvaryu, A., Christian, P., Labrique, A., Sugimoto, J., Shamim, A. A., and West Jr., K. P. (2016). Resilience to early life shocks. Technical report.
Haushofer, J. and Shapiro, J. (2013). Household response to income changes: Evidence from an unconditional cash transfer program in Kenya. Technical report.
Health and Retirement Study, public use dataset (2014). Produced and distributed by the University of Michigan with funding from the National Institute on Aging (grant number NIA U01AG009740). Ann Arbor, MI: Accessed July, 2014.
Heckman, J. (2006).
Skill formation and the economics of investing in disadvantaged children. Science, 1688(June):2005–2007.
Heckman, J., Moon, S. H., Pinto, R., Savelyev, P., and Yavitz, A. (2010). Analyzing social experiments as implemented: A reexamination of the evidence from the HighScope Perry Preschool Program. Quantitative Economics, 1(1):1–46.
Heckman, J., Pinto, R., and Savelyev, P. (2013). Understanding the mechanisms through which an influential early childhood program boosted adult outcomes. American Economic Review, 103(6):1–35.
Heckman, J. and Scheinkman, J. (1987). The importance of bundling in a Gorman-Lancaster model of earnings. The Review of Economic Studies, 54(2):243–255.
Heckman, J. J. (2007). The economics, technology, and neuroscience of human capability formation. Proceedings of the National Academy of Sciences, 104(33):13250–13255.
Heckman, J. J. and Mosso, S. (2014). The economics of human development and social mobility. Technical report, National Bureau of Economic Research.
Heckman, J. J. and Sedlacek, G. (1985). Heterogeneity, aggregation, and market wage functions: an empirical model of self-selection in the labor market. Journal of Political Economy, 93(6):1077–1125.
Hoddinott, J., Maluccio, J. A., Behrman, J. R., Flores, R., and Martorell, R. (2008). Effect of a nutrition intervention during early childhood on economic productivity in Guatemalan adults. The Lancet, 371(9610):411–416.
Hoddinott, J. and Skoufias, E. (2004). The impact of Progresa on food consumption. Economic Development and Cultural Change, 53(1):37–61.
Hoynes, H., Schanzenbach, D. W., and Almond, D. (2016). Long-Run Impacts of Childhood Access to the Safety Net. The American Economic Review, 106(4):903–934.
Idler, E. L. and Benyamini, Y. (1997). Self-rated health and mortality: a review of twenty-seven community studies. Journal of Health and Social Behavior, pages 21–37.
Isen, A., Rossin-Slater, M., and Walker, W. (2014).
Every Breath You Take – Every Dollar You'll Make: The Long-Term Consequences of the Clean Air Act of 1970.
Jacobson, M. Z. (2002). Atmospheric pollution: history, science, and regulation. Cambridge University Press.
Jans, J., Johansson, P., and Nilsson, J. P. (2014). Economic Status, Air Quality, and Child Health: Evidence from Inversion Episodes. (7929).
Jayachandran, S. (2009). Air Quality and Early-Life Mortality: Evidence from Indonesia's Wildfires. Journal of Human Resources, 44(4):916–954.
Jensen, R. (2010). The (perceived) returns to education and the demand for schooling. Quarterly Journal of Economics, 125(2).
Jensen, R. (2012). Do labor market opportunities affect young women's work and family decisions? Experimental evidence from India. The Quarterly Journal of Economics, page qjs002.
Jones, K., Smith, D., Ulleland, C., and Streissguth, A. (1973). Pattern of malformation in offspring of chronic alcoholic mothers. The Lancet, 301(7815):1267–1271.
Kapteyn, A., Smith, J., and Soest, A. v. (2007). Vignettes and self-reports of work disability in the United States and the Netherlands. The American Economic Review, 1.
Kapteyn, A., Smith, J. P., and van Soest, A. (2010). Life satisfaction. International Differences in Well-Being, pages 70–104.
Kaufmann, K. M. (2014). Understanding the income gradient in college attendance in Mexico: The role of heterogeneity in expected returns. Quantitative Economics, 5(3):583–630.
Keane, M. P. and Wolpin, K. I. (1997). The Career Decisions of Young Men. Journal of Political Economy, 105(3):473–522.
King, G., Murray, C. J. L., Salomon, J. A., and Tandon, A. (2004). Enhancing the Validity and Cross-Cultural Comparability of Measurement in Survey Research. American Political Science Review, 98(01).
Lacasaña, M., Esplugues, A., and Ballester, F. (2005). Exposure to ambient air pollution and prenatal and early childhood health effects. European Journal of Epidemiology, 20(2):183–199.
Lavy, V. and Schlosser, A. (2005).
Targeted remedial education for underperforming teenagers: Costs and benefits. Journal of Labor Economics, 23(4).
Lavy, V., Schlosser, A., and Shany, A. (2016). Out of Africa: Human capital consequences of in utero conditions. Technical report, National Bureau of Economic Research.
Le, H. Q., Batterman, S. A., Wirth, J. J., Wahl, R. L., Hoggatt, K. J., Sadeghnejad, A., Hultin, M. L., and Depa, M. (2012). Air pollutant exposure and preterm and term small-for-gestational-age births in Detroit, Michigan: long-term trends and associations. Environment International, 44:7–17.
Lei, X., Smith, J. P., Sun, X., and Zhao, Y. (2013). Gender differences in cognition in China and reasons for change over time: Evidence from CHARLS.
Maccini, S. and Yang, D. (2009). Under the Weather: Health, Schooling, and Economic Consequences of Early-Life Rainfall. American Economic Review, 99(3):1006–1026.
Macintyre, S., Ford, G., and Hunt, K. (1999). Do women over-report morbidity? Men's and women's responses to structured prompting on a standard question on long standing illness. Social Science & Medicine, 48(1):89–98.
Malamud, O., Pop-Eleches, C., and Urquiola, M. (2016). Interactions between family and school environments: Evidence on dynamic complementarities? Technical report, National Bureau of Economic Research.
Maluccio, J. A., Hoddinott, J., Behrman, J. R., Martorell, R., Quisumbing, A. R., and Stein, A. D. (2009). The impact of improving nutrition during early childhood on education among Guatemalan adults. The Economic Journal, 119(537):734–763.
Manning, W. G., Newhouse, J. P., Duan, N., Keeler, E. B., Leibowitz, A., and Marquis, M. S. (1987). Health insurance and the demand for medical care: evidence from a randomized experiment. The American Economic Review, 77(3):251–77.
Marmot, M., Oldfield, Z., Clemens, S., Blake, M., Phelps, A., Nazroo, J., Steptoe, A., Rogers, N., and Banks, J. (2014). English Longitudinal Study of Ageing: Waves 0-6, 1998-2013.
UK Data Archive, Colchester, Essex, 21 edition. SN: 5050, http://dx.doi.org/10.5255/UKDA-SN-5050-8. Accessed July 2014.
McBride, W. (1961). Thalidomide and Congenital Abnormalities. The Lancet, 278(7216):1358.
Mesinger, F., DiMego, G., Kalnay, E., Mitchell, K., Shafran, P. C., Ebisuzaki, W., Jovic, D., Woollen, J., Rogers, E., Berbery, E. H., et al. (2006). North American regional reanalysis. Bulletin of the American Meteorological Society, 87(3):343–360.
Minnesota Population Center (2015). Integrated Public Use Microdata Series, International: Version 6.4 [Machine-readable database]. University of Minnesota, Minneapolis.
Molina, T. (2014). Adjusting for heterogeneous response thresholds in cross-country comparisons of mid-aged and elderly self-reported health.
Mu, R. (2014). Regional disparities in self-reported health: Evidence from Chinese older adults. Health Economics, 23(5):529–549.
Nathanson, C. A. (1975). Illness and the feminine role: a theoretical review. Social Science & Medicine (1967), 9(2):57–62.
Nguyen, T. (2008). Information, Role Models and Perceived Returns to Education: Experimental Evidence from Madagascar.
OECD (2006). Agricultural and Fisheries Policies in Mexico: Recent Achievements, Continuing the Reform Agenda. Organisation for Economic Co-operation and Development.
Otake, M. (1998). Review: Radiation-related brain damage and growth retardation among the prenatally exposed atomic bomb survivors. International Journal of Radiation Biology, 74(2):159–171.
Paxson, C. H. (1992). Using weather variability to estimate the response of savings to transitory income in Thailand. American Economic Review, 82(1):15–33.
Peet, E. D. (2016). Environment and Human Capital: The Effects of Early-Life Exposure to Pollutants in the Philippines.
Perera, F. P., Whyatt, R. M., Jedrychowski, W., Rauh, V., Manchester, D., Santella, R. M., and Ottman, R. (1998).
Recent developments in molecular epidemiology: a study of the effects of environmental polycyclic aromatic hydrocarbons on birth outcomes in Poland. American Journal of Epidemiology, 147(3):309–314.
Persson, P. and Rossin-Slater, M. (2014). Family ruptures and intergenerational transmission of stress. Technical Report 1022, Research Institute of Industrial Economics.
Peterson, B. S., Rauh, V. A., Bansal, R., Hao, X., Toth, Z., Nati, G., Walsh, K., Miller, R. L., Arias, F., Semanek, D., et al. (2015). Effects of Prenatal Exposure to Air Pollutants (Polycyclic Aromatic Hydrocarbons) on the Development of Brain White Matter, Cognition, and Behavior in Later Childhood. JAMA Psychiatry.
Pitt, M. M., Rosenzweig, M. R., and Hassan, N. (2012). Human capital investment and the gender division of labor in a brawn-based economy. The American Economic Review, 102(7):3531.
Politi, D. (2015). The effects of the generalized use of iodized salt on occupational patterns in Switzerland. Working Paper.
Rosenzweig, M. R. and Zhang, J. (2013). Economic growth, comparative advantage, and gender differences in schooling outcomes: Evidence from the birthweight differences of Chinese twins. Journal of Development Economics, 104:245–260.
Rossin-Slater, M. and Wüst, M. (2015). Are Different Early Investments Complements or Substitutes? Long-Run and Intergenerational Evidence from Denmark.
Roy, A. D. (1951). Some thoughts on the distribution of earnings. Oxford Economic Papers, 3(2):135–146.
Rubalcava, L. and Teruel, G. (2006). Mexican Family Life Survey, Second Round. www.ennvih-mxfls.org.
Rubalcava, L. and Teruel, G. (2007). The Mexican Family Life Survey: First Wave. www.mxfls.uia.mx.
Rubalcava, L. and Teruel, G. (2013). Mexican Family Life Survey, Third Round. www.ennvih-mxfls.org.
Sadoulet, E., De Janvry, A., and Davis, B. (2001). Cash transfer programs with income multipliers: Procampo in Mexico. World Development, 29(6):1043–1056.
Saenen, N., Plusquin, M., Bijnens, E., Janssen, B., Gyselaers, W., Cox, B., Fierens, F., Molenberghs, G., Penders, J., Vrijens, K., et al. (2015). In Utero Fine Particle Air Pollution and Placental Expression of Genes in the Brain-Derived Neurotrophic Factor Signaling Pathway: An ENVIRONAGE Birth Cohort Study.
Sanders, N. J. (2012). What Doesn't Kill you Makes you Weaker: Prenatal Pollution Exposure and Educational Outcomes. Journal of Human Resources, 47(3):826–850.
Schultz, T. P. (2004). School subsidies for the poor: evaluating the Mexican Progresa poverty program. Journal of Development Economics, 74(1):199–250.
Schwandt, H. (2016). The Lasting Legacy of Seasonal Influenza: In-utero Exposure and Human Capital Development. Technical report.
Shah, M. and Steinberg, B. M. (2013). Drought of opportunities: Contemporaneous and long term impacts of rainfall shocks on human capital. Technical Report 19140, National Bureau of Economic Research.
Shah, M. and Steinberg, B. M. (2015). Workfare and Human Capital Investment: Evidence from India. Technical report, National Bureau of Economic Research.
Skoufias, E. (2005). Progresa and its impacts on the welfare of rural households in Mexico. Technical Report 139, International Food Policy Research Institute.
Skoufias, E. and Parker, S. W. (2001). Conditional cash transfers and their impact on child work and schooling: Evidence from the Progresa program in Mexico. Economia, 2(1):45–96.
Stein, Z., Susser, M., Saenger, G., and Marolla, F. (1975). Famine and human development: The Dutch hunger winter of 1944-1945.
Strauss, J., Gertler, P. J., Rahman, O., and Fox, K. (1993). Gender and life-cycle differentials in the patterns and determinants of adult health. Journal of Human Resources.
Strauss, J., Witoelar, F., Sikoki, B., and Wattie, A. (2009). The Fourth Wave of the Indonesian Family Life Survey (IFLS4): Overview and Field Report. Technical report, WR-675/1-NIA/NICHD.
Tau, G. Z. and Peterson, B. S. (2010).
Normal development of brain circuits. Neuropsychopharmacology, 35(1):147–168.
Thomas, D. and Strauss, J. (1997). Health and wages: Evidence on men and women in urban Brazil. Journal of Econometrics, 77(1):159–185.
van Soest, A., Delaney, L., Harmon, C., Kapteyn, A., and Smith, J. P. (2011). Validating the use of anchoring vignettes for the correction of response scale differences in subjective questions. Journal of the Royal Statistical Society.
Venkataramani, A. S. (2012). Early life exposure to malaria and cognition in adulthood: evidence from Mexico. Journal of Health Economics, 31(5):767–80.
Verbrugge, L. M. (1989). The twain meet: empirical explanations of sex differences in health and mortality. Journal of Health and Social Behavior, pages 282–304.
Vogl, T. (2012). Education and Health in Developing Economies. (December 2012).
Vogl, T. S. (2014). Height, skills, and labor market outcomes in Mexico. Journal of Development Economics, 107:84–96.
Von Lenz, W. and Knapp, K. (1962). Die Thalidomid-Embryopathie. Deutsche Medizinische Wochenschrift, 87(24):1232–1242.
Zhao, Y., Hu, Y., Smith, J. P., Strauss, J., and Yang, G. (2012). Cohort profile: The China Health and Retirement Longitudinal Study (CHARLS). International Journal of Epidemiology.
Zhao, Y., Strauss, J., Yang, G., Giles, J., Hu, P., Hu, Y., Lei, X., Park, A., Smith, J. P., and Wang, Y. (2013). China Health and Retirement Longitudinal Study – 2011-2012 National Baseline Users Guide. Beijing: National School of Development, Peking University.

Appendices

Appendix A
Appendix to Chapter 2 (Pollution, Ability, and Gender Differences)

A.1 Appendix Tables

Table A1 reports ISCO occupation shares, by gender, from the 2010 Mexican census.
I group the occupation categories into two groups using the brain-intensive and brawn-intensive classification used by Vogl (2014), who calculates average skill and strength intensities of each occupational category using job requirement scores in the Dictionary of Occupational Titles.

Table A2 reports summary statistics for all of the relevant variables for the structural estimation described in section 2.4.2.

Tables A3 to A6 provide the coefficient estimates, standard errors, and observation counts for the graphs in section 2.5.1. For each variable, the first column includes the basic fixed effects (municipality, month, and year) and the second adds state-specific season fixed effects and state-specific quadratic trends.

Table A8 demonstrates the robustness of the labor market mechanism results to the use of other proxies of $p_{jg}$. The regression in column 2 assigns to each individual the relevant white-collar proportion from the census decade in which they turned 12. Results are consistent with those in Table 2.3. Column 3 reports the results from using a continuous version of the discrete measure used in Table 2.3, demeaned so that the main effects can be interpreted as the effects for the average individual. Though less precise, these results are consistent with the previous finding that gender-specific labor market opportunities appear to be playing a more important role than gender itself – the male interaction with second trimester inversions is much smaller in magnitude than in column 1.

Figures A1 and A2 show the robustness of my main results to the inclusion of additional fixed effects.

Table A9 repeats the analysis conducted in Table 2.3, except using Raven's test scores as the dependent variable. Unlike the effect of second trimester pollution on high school completion, the effect of second trimester pollution on cognitive ability is not heterogeneous across white-collar proportions.
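
The role of demeaning in column 3 of Table A8 can be illustrated with a generic interacted regression. This is a sketch in illustrative notation, not the specification estimated in Chapter 2: $y_i$ is an outcome, $T_i$ the exposure of interest, $w_i$ a continuous moderator such as a white-collar proportion, and $\bar{w}$ its sample mean. All of these symbols are assumptions introduced here.

```latex
\begin{align*}
y_i &= \alpha + \beta\, T_i + \gamma\, T_i\,(w_i - \bar{w}) + \delta\,(w_i - \bar{w}) + \varepsilon_i, \\
\frac{\partial\, \mathbb{E}[\,y_i \mid T_i, w_i\,]}{\partial T_i} &= \beta + \gamma\,(w_i - \bar{w}).
\end{align*}
```

Under these assumptions, $\beta$ is the effect of $T_i$ for an individual at the sample mean of $w_i$. Without demeaning, the same coefficient would instead measure the effect at $w_i = 0$, which may lie outside the range of the data.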

Table A10 repeats the analysis conducted in Tables 2.3 and A8, but adds interactions between thermal inversions and agricultural shares. Zone-specific agricultural shares are calculated using industry codes in the census and matched to individuals using the commuting zone in which they lived at age 12 and the census decade during which they turned 12. The main results from Table 2.3 are robust to the inclusion of these additional interactions.

Table A1: Occupation Distributions by Gender

ISCO Occupation Code & Description                              Male     Female
White-Collar ("Brains")                                         19.76    34.85
  1  Legislators, senior officials and managers                 5        4.47
  2  Professionals                                              7.39     7.88
  3  Technicians and associate professionals                    4.08     12.04
  4  Clerks                                                     3.29     10.46
Blue-Collar ("Brawn")                                           80.25    65.15
  5  Service workers and shop and market sales                  17.62    29.63
  6  Skilled agricultural and fishery workers                   12.25    1.7
  7  Crafts and related trades workers                          24.23    8.26
  8  Plant and machine operators and assemblers                 14.19    5.44
  9  Elementary occupations (domestic workers, laborers, etc)   11.96    20.12

Notes: Brain and brawn categorizations from Vogl (2014). Weighted percentages calculated from adults aged 30 to 50 in the 2010 Mexican census.
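
As a quick consistency check on Table A1, the sketch below re-adds the ISCO major-group shares into the two sector subtotals. The shares are copied from the table; the dictionary layout and the function name `sector_totals` are illustrative choices, not part of the thesis.

```python
# Percent of adults aged 30 to 50 in each ISCO major group, copied from Table A1.
# Tuple layout (an illustrative choice): (description, male share, female share).
ISCO_SHARES = {
    1: ("Legislators, senior officials and managers", 5.00, 4.47),
    2: ("Professionals", 7.39, 7.88),
    3: ("Technicians and associate professionals", 4.08, 12.04),
    4: ("Clerks", 3.29, 10.46),
    5: ("Service workers and shop and market sales", 17.62, 29.63),
    6: ("Skilled agricultural and fishery workers", 12.25, 1.70),
    7: ("Crafts and related trades workers", 24.23, 8.26),
    8: ("Plant and machine operators and assemblers", 14.19, 5.44),
    9: ("Elementary occupations", 11.96, 20.12),
}

# Major groups 1-4 are the white-collar ("brains") rows in Table A1.
WHITE_COLLAR_CODES = {1, 2, 3, 4}


def sector_totals(shares, white_collar_codes):
    """Sum male and female shares within the white- and blue-collar groups."""
    totals = {"white_collar": [0.0, 0.0], "blue_collar": [0.0, 0.0]}
    for code, (_description, male, female) in shares.items():
        sector = "white_collar" if code in white_collar_codes else "blue_collar"
        totals[sector][0] += male
        totals[sector][1] += female
    # Round to two decimals to match the precision reported in the table.
    return {sector: (round(m, 2), round(f, 2)) for sector, (m, f) in totals.items()}


totals = sector_totals(ISCO_SHARES, WHITE_COLLAR_CODES)
for sector, (male, female) in totals.items():
    print(f"{sector}: male {male:.2f}, female {female:.2f}")
```

The sums reproduce the subtotals reported in the table (19.76 and 34.85 for white-collar, 80.25 and 65.15 for blue-collar). The male column adds to 100.01 rather than 100 because the reported shares are themselves rounded.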
Table A2 Summary Statistics for Structural Estimation Sample

| Variable Name | Mean | Standard Deviation | N |
|---|---|---|---|
| Individual-Level Variables | | | |
| Total annual income (inverse hyperbolic sine) | 7.13 | 5.246 | 5265 |
| 1(Completed high school) | 0.25 | 0.433 | 5265 |
| 1(White collar sector) | 0.21 | 0.407 | 5265 |
| 1(Blue collar sector) | 0.45 | 0.497 | 5265 |
| 1(Not employed) | 0.34 | 0.475 | 5265 |
| Raven's test score (% correct) | 0.52 | 0.240 | 5265 |
| Age | 38.72 | 5.795 | 5265 |
| 1(Male) | 0.41 | 0.491 | 5265 |
| 1(Urban) | 0.62 | 0.486 | 5265 |
| Mother's education | 3.35 | 3.477 | 5265 |
| Father's education | 3.68 | 3.891 | 5265 |
| Labor Market Variables | | | |
| White collar proportion | 0.33 | 0.170 | 5265 |
| Unemployment rate | 0.02 | 0.0127 | 5265 |
| Youth employment rate, per 10 children | 3.21 | 1.972 | 5265 |
| Number of teachers per 10 school-aged children | 0.11 | 0.0923 | 5265 |

Notes: Sample includes individuals aged 30 to 50 who are either currently employed (with non-missing sector and income information) or reported never having worked before. Individual-level variables are from the Mexican Family Life Survey. Labor market variables are from the 1970 to 2000 censuses and matched to individuals by their commuting zone of residence. White collar proportion is the fraction of adult males (for men) and adult females (for women) working in the white collar sector during an individual's early working years. Unemployment rate is the adult unemployment rate during an individual's early working years. Youth employment rate is the proportion of boys (for men) or girls (for women) in the stated age category who report being employed during an individual's school-aged years.
Table A3 Effects of Pollution on Health

| Average monthly inversions… | (1) Raven's test z-score | (2) Raven's test z-score | (3) Height z-score | (4) Height z-score |
|---|---|---|---|---|
| BEFORE CONCEPTION | | | | |
| 19-21 months before birth | 0.00453 (0.00463) | 0.00351 (0.00466) | -0.00105 (0.00580) | -0.00283 (0.00605) |
| 16-18 months before birth | 0.00435 (0.00387) | 0.00297 (0.00398) | -0.00177 (0.00519) | -0.00242 (0.00515) |
| 13-15 months before birth | 0.00767 (0.00497) | 0.00556 (0.00507) | 0.00356 (0.00501) | 0.00221 (0.00507) |
| 10-12 months before birth | 0.00284 (0.00542) | 0.000185 (0.00557) | 0.00888* (0.00527) | 0.00533 (0.00536) |
| DURING PREGNANCY | | | | |
| Trimester 1 | 0.00398 (0.00595) | 0.00324 (0.00592) | 0.00519 (0.00542) | 0.00570 (0.00526) |
| Trimester 2 | -0.0119** (0.00561) | -0.0130** (0.00581) | 0.00187 (0.00550) | 0.000907 (0.00557) |
| Trimester 3 | 0.00465 (0.00543) | 0.00350 (0.00535) | 0.00254 (0.00519) | 0.00116 (0.00514) |
| AFTER BIRTH | | | | |
| 0-2 months after birth | -0.00349 (0.00615) | -0.00448 (0.00634) | -0.00196 (0.00467) | -0.00442 (0.00470) |
| 3-5 months after birth | 0.00321 (0.00457) | 0.00318 (0.00477) | 0.000718 (0.00539) | 0.000341 (0.00530) |
| 6-8 months after birth | 0.000462 (0.00466) | -0.000623 (0.00470) | -0.000524 (0.00571) | -0.00208 (0.00589) |
| 9-11 months after birth | -0.00594 (0.00502) | -0.00846 (0.00521) | -0.00133 (0.00539) | -0.00207 (0.00552) |
| N | 10320 | 10320 | 10398 | 10398 |
| Mean of dependent variable | 0.0164 | 0.0164 | -1.008 | -1.008 |
| Additional Fixed Effects | None | state-by-season, state-by-quadratic-year | None | state-by-season, state-by-quadratic-year |

Notes: Standard errors (clustered at municipality level) in parentheses.
* p < 0.1, ** p < 0.05, *** p < 0.01. All regressions control for birth month, birth year, municipality of birth, and survey wave by birth year fixed effects, gender, mother's education, father's education, cubic functions of average monthly mean, minimum, and maximum 2m temperatures, monthly relative humidity, monthly precipitation, and monthly cloud coverage during each relevant 3-month period.

Table A4 Effects of Pollution on Health by Gender

| Average monthly inversions… | (1) Raven's test z-score | (2) Raven's test z-score | (3) Height z-score | (4) Height z-score |
|---|---|---|---|---|
| FEMALE | | | | |
| Trimester 1 | 0.00464 (0.00774) | 0.00221 (0.00790) | 0.00182 (0.00810) | 0.00603 (0.00818) |
| Trimester 2 | -0.00971 (0.00832) | -0.0107 (0.00850) | 0.00565 (0.00729) | 0.00453 (0.00756) |
| Trimester 3 | 0.00392 (0.00770) | 0.00257 (0.00784) | -0.00178 (0.00668) | -0.00153 (0.00668) |
| N | 5455 | 5455 | 5506 | 5506 |
| Dependent variable mean | -0.00429 | -0.00429 | -1.043 | -1.043 |
| MALE | | | | |
| Trimester 1 | 0.00193 (0.00754) | 0.00155 (0.00761) | 0.00746 (0.00754) | 0.00521 (0.00759) |
| Trimester 2 | -0.0139* (0.00814) | -0.0127 (0.00883) | -0.00539 (0.00805) | -0.00830 (0.00825) |
| Trimester 3 | 0.00438 (0.00862) | 0.00294 (0.00842) | 0.0107 (0.00831) | 0.00790 (0.00826) |
| N | 4865 | 4865 | 4892 | 4892 |
| Dependent variable mean | 0.0397 | 0.0397 | -0.970 | -0.970 |
| MALE - FEMALE DIFFERENCE | | | | |
| Trimester 1 | -0.00272 (0.0101) | -0.000663 (0.0109) | 0.00564 (0.0111) | -0.000822 (0.0115) |
| Trimester 2 | -0.00422 (0.0117) | -0.00199 (0.0122) | -0.0110 (0.0108) | -0.0128 (0.0110) |
| Trimester 3 | 0.000465 (0.0120) | 0.000376 (0.0120) | 0.0124 (0.0105) | 0.00943 (0.0106) |
| Additional Fixed Effects | None | state-by-season, state-by-quadratic-year | None | state-by-season, state-by-quadratic-year |

Notes: Standard errors (clustered at municipality level) in parentheses.
* p < 0.1, ** p < 0.05, *** p < 0.01. All regressions control for birth month, birth year, municipality of birth, and survey wave by birth year fixed effects, mother's education, father's education, cubic functions of average monthly mean, minimum, and maximum 2m temperatures, monthly relative humidity, monthly precipitation, and monthly cloud coverage during each relevant 3-month period. Separate regressions are conducted for men and women.

Table A5 Effects of Pollution on Schooling by Gender

| Average monthly inversions… | (1) Years of Schooling | (2) Years of Schooling | (3) HS Completion | (4) HS Completion |
|---|---|---|---|---|
| FEMALE | | | | |
| Trimester 1 | 0.0296 (0.0217) | 0.0254 (0.0213) | 0.00375 (0.00352) | 0.00363 (0.00352) |
| Trimester 2 | -0.0165 (0.0208) | -0.0232 (0.0202) | -0.00773** (0.00305) | -0.00792*** (0.00298) |
| Trimester 3 | -0.0140 (0.0210) | -0.0109 (0.0196) | 0.000748 (0.00307) | 0.00179 (0.00311) |
| N | 5634 | 5634 | 5634 | 5634 |
| Dependent variable mean | 9.521 | 9.521 | 0.288 | 0.288 |
| MALE | | | | |
| Trimester 1 | -0.000200 (0.0194) | -0.00251 (0.0192) | -0.000848 (0.00298) | -0.000607 (0.00304) |
| Trimester 2 | 0.00665 (0.0183) | 0.00771 (0.0196) | -0.000714 (0.00261) | -0.000211 (0.00270) |
| Trimester 3 | -0.00524 (0.0178) | -0.00777 (0.0174) | 0.00315 (0.00320) | 0.00291 (0.00333) |
| N | 5081 | 5081 | 5081 | 5081 |
| Dependent variable mean | 9.199 | 9.199 | 0.241 | 0.241 |
| MALE - FEMALE DIFFERENCE | | | | |
| Trimester 1 | -0.0298 (0.0304) | -0.0279 (0.0293) | -0.00460 (0.00476) | -0.00423 (0.00478) |
| Trimester 2 | 0.0231 (0.0281) | 0.0309 (0.0295) | 0.00702* (0.00393) | 0.00771* (0.00405) |
| Trimester 3 | 0.00874 (0.0256) | 0.00314 (0.0243) | 0.00240 (0.00429) | 0.00112 (0.00431) |
| Additional Fixed Effects | None | state-by-season, state-by-quadratic-year | None | state-by-season, state-by-quadratic-year |

Notes: Standard errors (clustered at municipality level) in parentheses.
* p < 0.1, ** p < 0.05, *** p < 0.01. All regressions control for birth month, birth year, municipality of birth, and survey wave by birth year fixed effects, mother's education, father's education, cubic functions of average monthly mean, minimum, and maximum 2m temperatures, monthly relative humidity, monthly precipitation, and monthly cloud coverage during each relevant 3-month period. Separate regressions are conducted for men and women.

Table A6 Effects of Pollution on Income by Gender

| Average monthly inversions… | (1) Annual income | (2) Annual income |
|---|---|---|
| FEMALE | | |
| Trimester 1 | -140.1 (545.4) | 10.54 (763.1) |
| Trimester 2 | -1131.4** (570.1) | -1067.9* (585.7) |
| Trimester 3 | -38.51 (735.5) | 95.39 (847.6) |
| N | 946 | 946 |
| Dependent variable mean | 24314.0 | 24314.0 |
| MALE | | |
| Trimester 1 | -404.8 (430.1) | -766.2 (521.2) |
| Trimester 2 | -465.2 (372.4) | -653.8 (415.4) |
| Trimester 3 | -225.6 (305.6) | -508.7 (312.1) |
| N | 1833 | 1833 |
| Dependent variable mean | 31101.5 | 31101.5 |
| MALE - FEMALE DIFFERENCE | | |
| Trimester 1 | -264.7 (649.7) | -776.7 (783.3) |
| Trimester 2 | 666.2 (656.4) | 414.1 (698.4) |
| Trimester 3 | -187.1 (666.6) | -604.1 (747.9) |
| Additional Fixed Effects | None | state-by-season, state-by-quadratic-year |

Notes: Standard errors (clustered at municipality level) in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01. All regressions control for birth month, birth year, municipality of birth, and survey wave by birth year fixed effects, mother's education, father's education, cubic functions of average monthly mean, minimum, and maximum 2m temperatures, monthly relative humidity, monthly precipitation, and monthly cloud coverage during each relevant 3-month period. Separate regressions are conducted for men and women.
Table A7 Effects of Pollution on Early Educational Attainment, by Gender

| Average monthly inversions… | (1) Elementary School Completion | (2) Elementary School Completion | (3) Junior High School Completion | (4) Junior High School Completion |
|---|---|---|---|---|
| FEMALE | | | | |
| Trimester 1 | -0.000130 (0.00207) | -0.000326 (0.00203) | 0.00264 (0.00326) | 0.00210 (0.00325) |
| Trimester 2 | 0.00217 (0.00179) | 0.00167 (0.00187) | 0.000556 (0.00343) | -0.000529 (0.00342) |
| Trimester 3 | -0.00173 (0.00180) | -0.00129 (0.00179) | -0.00285 (0.00361) | -0.00324 (0.00344) |
| N | 5634 | 5634 | 5634 | 5634 |
| Dependent variable mean | 0.929 | 0.929 | 0.709 | 0.709 |
| MALE | | | | |
| Trimester 1 | -0.000277 (0.00187) | -0.000402 (0.00197) | -0.00475 (0.00344) | -0.00493 (0.00351) |
| Trimester 2 | 0.00185 (0.00208) | 0.00122 (0.00221) | 0.00229 (0.00364) | 0.00188 (0.00373) |
| Trimester 3 | -0.00252 (0.00209) | -0.00303 (0.00208) | -0.00175 (0.00283) | -0.00162 (0.00293) |
| N | 5081 | 5081 | 5081 | 5081 |
| Dependent variable mean | 0.908 | 0.908 | 0.664 | 0.664 |
| MALE - FEMALE DIFFERENCE | | | | |
| Trimester 1 | -0.000147 (0.00300) | -0.0000758 (0.00307) | -0.00739 (0.00470) | -0.00702 (0.00486) |
| Trimester 2 | -0.000324 (0.00262) | -0.000447 (0.00283) | 0.00174 (0.00550) | 0.00241 (0.00545) |
| Trimester 3 | -0.000790 (0.00256) | -0.00174 (0.00243) | 0.00111 (0.00450) | 0.00162 (0.00441) |
| Additional Fixed Effects | None | state-by-season, state-by-quadratic-year | None | state-by-season, state-by-quadratic-year |

Notes: Standard errors (clustered at municipality level) in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01. All regressions control for birth month, birth year, municipality of birth, and survey wave by birth year fixed effects, mother's education, father's education, cubic functions of average monthly mean, minimum, and maximum 2m temperatures, monthly relative humidity, monthly precipitation, and monthly cloud coverage during each relevant 3-month period. Separate regressions are conducted for men and women.
Table A8 Effects of Pollution on High School Graduation, by Alternative White-Collar Variables

| Average monthly inversions… | (1) HS Completion | (2) HS Completion | (3) HS Completion |
|---|---|---|---|
| Trimester 1 | 0.00363 (0.00353) | 0.00155 (0.00498) | 0.000103 (0.00503) |
| Trimester 2 | -0.00792*** (0.00299) | 0.000500 (0.00441) | -0.00509 (0.00436) |
| Trimester 3 | 0.00179 (0.00311) | 0.00216 (0.00546) | -0.000695 (0.00475) |
| Trimester 1 x 1(Male) | -0.00423 (0.00478) | -0.00337 (0.00584) | 0.000263 (0.00606) |
| Trimester 2 x 1(Male) | 0.00771* (0.00405) | 0.00112 (0.00481) | 0.00510 (0.00515) |
| Trimester 3 x 1(Male) | 0.00112 (0.00431) | 0.00000468 (0.00531) | 0.00266 (0.00556) |
| Trimester 1 x White Collar Variable | | 0.00238 (0.00389) | 0.0165 (0.0156) |
| Trimester 2 x White Collar Variable | | -0.00971** (0.00378) | -0.0152 (0.0142) |
| Trimester 3 x White Collar Variable | | -0.000640 (0.00435) | 0.00828 (0.0137) |
| N | 10715 | 10677 | 10572 |
| Dependent variable mean | 0.266 | 0.265 | 0.264 |
| White Collar Variable | None | Discrete, assigned by census | Continuous, predicted |
| Additional Fixed Effects | state-by-season, state-by-quadratic-year | state-by-season, state-by-quadratic-year | state-by-season, state-by-quadratic-year |

Notes: Standard errors (clustered at municipality level) in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01. All regressions control for the following variables and their interactions with a male indicator (as well as the main effect of gender): birth month, birth year, municipality of birth, and survey wave by birth year fixed effects, mother's education, father's education, cubic functions of average monthly mean, minimum, and maximum 2m temperatures, monthly relative humidity, monthly precipitation, and monthly cloud coverage during each relevant 3-month period, as well as inversions in all other three-month periods. In columns 2 and 3, the main effect of the white collar variable and the interactions with inversions in all other three-month periods are also included.

Figure A1 Effects of Pollution on Cognitive Ability, with Additional Fixed Effects

Notes: Intervals represent 90% confidence intervals.
All regressions control for birth month, birth year, municipality of birth, and survey wave by birth year fixed effects, state-by-quadratic year trends, gender, mother's education, father's education, cubic functions of average monthly mean, minimum, and maximum 2m temperatures, monthly relative humidity, monthly precipitation, and monthly cloud coverage during each relevant 3-month period.

Figure A2 Effects of Pollution on High School Completion by Gender, with Additional Fixed Effects

Notes: Separate regressions are conducted for men and women. Intervals represent 90% and 75% confidence intervals. Controls include birth month, birth year, municipality of birth, and survey wave by birth year fixed effects, state-by-quadratic year trends, mother's education, father's education, cubic functions of average monthly mean, minimum, and maximum 2m temperatures, monthly relative humidity, monthly precipitation, and monthly cloud coverage during each relevant 3-month period.

Table A9 Effects of Pollution on Cognitive Ability, by White Collar Opportunities

| Average monthly inversions… | (1) Raven's test z-score | (2) Raven's test z-score | (3) Raven's test z-score | (4) Raven's test z-score |
|---|---|---|---|---|
| Trimester 1 | 0.00464 (0.00777) | 0.0195 (0.0122) | 0.00221 (0.00793) | 0.0181 (0.0126) |
| Trimester 2 | -0.00971 (0.00835) | -0.0134 (0.0115) | -0.0107 (0.00853) | -0.0198* (0.0115) |
| Trimester 3 | 0.00392 (0.00772) | 0.0144 (0.0133) | 0.00257 (0.00786) | 0.0153 (0.0133) |
| Trimester 1 x 1(Male) | -0.00272 (0.0101) | -0.0143 (0.0132) | -0.000663 (0.0109) | -0.0134 (0.0140) |
| Trimester 2 x 1(Male) | -0.00422 (0.0117) | -0.000782 (0.0138) | -0.00199 (0.0122) | 0.00564 (0.0143) |
| Trimester 3 x 1(Male) | 0.000465 (0.0120) | -0.00880 (0.0154) | 0.000376 (0.0120) | -0.0105 (0.0153) |
| Trimester 1 x 1(Predicted white collar proportion in top quartile) | | -0.0154 (0.00961) | | -0.0165* (0.00988) |
| Trimester 2 x 1(Predicted white collar proportion in top quartile) | | 0.00535 (0.00896) | | 0.0115 (0.00918) |
| Trimester 3 x 1(Predicted white collar proportion in top quartile) | | -0.0116 (0.0108) | | -0.0140 (0.0108) |
| N | 10320 | 10171 | 10320 | 10171 |
| Dependent variable mean | 0.0164 | 0.0201 | 0.0164 | 0.0201 |
| Additional Fixed Effects | None | None | state-by-season, state-by-quadratic-year | state-by-season, state-by-quadratic-year |

Notes: Standard errors (clustered at municipality level) in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01. All regressions control for the following variables and their interactions with a male indicator (as well as the main effect of gender): birth month, birth year, municipality of birth, and survey wave by birth year fixed effects, mother's education, father's education, cubic functions of average monthly mean, minimum, and maximum 2m temperatures, monthly relative humidity, monthly precipitation, and monthly cloud coverage during each relevant 3-month period, as well as inversions in all other three-month periods. In columns 2 and 4, the main effect of the white collar variable and the interactions with inversions in all other three-month periods are also included. Predicted white collar proportions calculated using census data and annual industry growth rates from ENIGH. See Data Appendix for details on the construction of predicted white collar proportions.
Table A10 Effects of Pollution on High School Graduation, by White Collar Opportunities and Agricultural Shares

| Average monthly inversions… | (1) HS Completion | (2) HS Completion |
|---|---|---|
| Trimester 1 | 0.00454 (0.00551) | 0.00152 (0.00495) |
| Trimester 2 | -0.000317 (0.00439) | 0.00143 (0.00442) |
| Trimester 3 | 0.000787 (0.00583) | 0.00162 (0.00553) |
| Trimester 1 x 1(Male) | -0.00502 (0.00619) | -0.00399 (0.00582) |
| Trimester 2 x 1(Male) | 0.00225 (0.00483) | 0.00107 (0.00478) |
| Trimester 3 x 1(Male) | 0.000563 (0.00582) | -0.000221 (0.00528) |
| Trimester 1 x 1(White collar variable in top quartile) | -0.000625 (0.00481) | 0.00247 (0.00392) |
| Trimester 2 x 1(White collar variable in top quartile) | -0.00943** (0.00377) | -0.0105*** (0.00384) |
| Trimester 3 x 1(White collar variable in top quartile) | 0.000744 (0.00459) | 0.00000423 (0.00438) |
| Trimester 1 x 1(Agricultural share variable in top quartile) | 0.00401 (0.00994) | 0.00211 (0.00986) |
| Trimester 2 x 1(Agricultural share variable in top quartile) | -0.00623 (0.00561) | -0.00806 (0.00641) |
| Trimester 3 x 1(Agricultural share variable in top quartile) | -0.000157 (0.00659) | 0.00638 (0.00957) |
| N | 10572 | 10677 |
| Dependent variable mean | 0.264 | 0.265 |
| White Collar Variable | Predicted | Assigned by census |
| Agricultural Share Variable | Assigned by census | Assigned by census |
| Additional Fixed Effects | state-by-season, state-by-quadratic-year | state-by-season, state-by-quadratic-year |

Notes: Standard errors (clustered at municipality level) in parentheses. * p < 0.1, ** p < 0.05, *** p < 0.01. All regressions control for the following variables and their interactions with a male indicator (as well as the main effect of gender): birth month, birth year, municipality of birth, and survey wave by birth year fixed effects, mother's education, father's education, cubic functions of average monthly mean, minimum, and maximum 2m temperatures, monthly relative humidity, monthly precipitation, and monthly cloud coverage during each relevant 3-month period, as well as inversions in all other three-month periods.
The main effect of the white collar variable, the agricultural share variable, and each one's interactions with inversions in all other three-month periods are also included. See Data Appendix for details on the construction of predicted white collar proportions.

A.2 Data Appendix

A.2.1 Construction of Thermal Inversion Variables

As described in the main body of the paper, the NARR dataset provides temperature values on a 0.3 by 0.3 degree grid for 29 pressure levels (extending vertically into the atmosphere), every three hours. For each latitude-longitude grid point and for each recorded hour, I create an indicator equal to 1 if the 2-meter temperature (equivalent to what is usually reported in weather reports) is lower than the temperature at the first pressure level above the surface, which lies roughly 300 meters above the surface. Because surface pressure varies across space, I use the temperatures from different pressure levels depending on the altitude at a particular grid point. For a municipality at sea level (1000 hPa), I use the temperature at 975 hPa, but for a higher altitude location in Mexico City, for example, I might use the temperature at 700 hPa (if surface pressure is 725 hPa).

After creating inversion indicators for every recorded hour, I collapse them to two indicators per day: one for any daytime inversion and one for any nighttime inversion. I then match each Mexican municipality to its four closest grid points and assign each municipality the inverse-distance weighted average of the nighttime inversion indicator for each day. I sum this indicator over the month and then average over three-month periods.

I assign inversions to individuals based on municipality of birth, a restricted-use variable obtained from the migration module of the MxFLS, which is directed to individuals aged 15 and older. For the individuals who are missing this variable,¹ I assign them the inversions in their municipality of residence.
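The construction steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the thesis: the pressure-level list is a truncated stand-in for the 29 levels, all function names are hypothetical, and an inversion is assumed to be flagged when the air at the first pressure level above the surface is warmer than the 2-meter air.

```python
from typing import Sequence

# Illustrative subset of pressure levels in hPa (an assumption; the dataset has 29 levels).
PRESSURE_LEVELS_HPA = (1000, 975, 950, 925, 900, 875, 850, 825, 800, 775, 750, 725, 700)


def level_above_surface(surface_hpa: float,
                        levels: Sequence[float] = PRESSURE_LEVELS_HPA) -> float:
    """Pick the first pressure level above the surface: the largest level
    whose pressure is strictly below the surface pressure."""
    above = [p for p in levels if p < surface_hpa]
    if not above:
        raise ValueError("no pressure level lies above this surface pressure")
    return max(above)


def is_inversion(t2m: float, t_aloft: float) -> bool:
    """Assumed definition: the air aloft is warmer than the 2-meter air."""
    return t_aloft > t2m


def any_inversion(t2m_series: Sequence[float], aloft_series: Sequence[float]) -> int:
    """Collapse the 3-hourly readings of one night (or one day) to a 0/1 flag."""
    return int(any(is_inversion(s, a) for s, a in zip(t2m_series, aloft_series)))


def idw_average(values: Sequence[float], distances: Sequence[float]) -> float:
    """Inverse-distance weighted average over a municipality's nearest grid points."""
    for v, d in zip(values, distances):
        if d == 0:  # municipality sits exactly on a grid point
            return float(v)
    weights = [1.0 / d for d in distances]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def three_month_average(daily_by_month: Sequence[Sequence[float]]) -> float:
    """Sum the daily values within each month, then average the monthly totals."""
    totals = [sum(days) for days in daily_by_month]
    return sum(totals) / len(totals)
```

For example, `level_above_surface(725)` selects the 700 hPa level, matching the Mexico City illustration in the text. The weights here use simple inverse distance; a squared-distance variant would be a different choice that the text does not describe.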
I do this instead of dropping these individuals because over 80% of individuals who move municipalities between birth and the survey date report that they are currently living in their state of birth and about 70% report living in their municipality of birth. Therefore, for over half of the individuals with missing birth municipalities, using municipality of residence is the correct imputation, while a majority of the remainder only moved short distances (i.e. within state). To create trimester-specific inversion variables for each individual, because I do not have the actual date of conception or date of birth, I simply count backwards, in three-month increments, from an individual's month of birth and average over each three-month period.

¹ Less than 5% of individuals in each wave of the migration module listed either no municipality at all or a municipality name that could not be mapped to a unique municipality code. A slightly larger percentage of individuals in my final sample were missing this variable simply because they had not completed a migration module in any wave, despite being older than 15 by the most recent wave.

A.2.2 Construction of Individual-Level Variables

Both my reduced form and structural analyses only require one cross-section of data. I therefore merge all waves of data for each individual and extract the relevant information from the relevant waves. For variables that should be consistent across waves (gender, birth year), I use data from all available waves and resolve inconsistencies by prioritizing values that are consistent across at least two waves. For other variables, I pick one wave for each individual. In particular, I use the Raven's test score from the first wave in which the individual took the test (to avoid capturing any learning effects). For all other reduced-form outcome variables (schooling and income), I use the variable from the most recent survey wave available, with one important exception.
In 2009, the share of individuals who report being a technician in their main job rises by 11 percentage points, from 1% in 2005 and 2% in 2002. This dramatic increase does not show up when comparing the share of technicians in the 2000 and 2010 censuses and therefore seems to be driven by a change in coding, rather than an actual increase in the share of individuals in this occupation. In order to avoid using variables that are coded differently across survey waves, I ignore all work-related variables for individuals who report working as a technician in 2009. This does not mean that I drop these individuals from the sample; it simply means that their work-related variables (income and occupation category) are taken from the most recent available wave prior to 2009.

To represent occupation types, the MxFLS uses a different categorization system from the ISCO codes that are used in the census (summarized in Appendix Table A1). Appendix Table A11 lists how I map these Mexican Classification of Occupations (CMO) codes to the white-collar and blue-collar categories. This mapping was fairly straightforward, based simply on comparing CMO descriptions to ISCO descriptions (and then using the Vogl (2014) classification to categorize into white-collar and blue-collar).

A.2.3 Predicting White Collar Proportions

Conceptually, $p_g$ represents the perceived likelihood of an individual entering the white collar sector. If parents or individuals use current conditions to inform this expectation, then this probability should vary with the local labor market opportunities at times when children are making important schooling transitions. I aim to match individuals to the relevant labor market variables in the year they turn 12 years old, in the municipality in which they are living at that age.
Table A11 Mexican Classification of Occupation (CMO) Codes

| CMO Code and Description |
|---|
| White-Collar ("Brains") |
| 11 Professionals |
| 12 Technicians |
| 13 Education Workers |
| 14 Arts, sports, performance, and sports workers |
| 21 Employees and directors of the public, private, and social sectors |
| 61 Department chiefs, coordinators and supervisors of the administrative activities and services |
| 62 Workers in the support of the administrative activities |
| Blue-Collar ("Brawn") |
| 41 Agricultural, cattle activities, foresting, hunting, and fishing workers |
| 51 Chiefs, supervisors, and other control workers in craft and industrial manufacture and in maintenance and repairing activities |
| 52 Craftsmen and manufacturers in the transformation industry and workers of maintenance and repairing activities |
| 53 Operators of fixed machinery of continuous movement and equipment in the process of industrial production |
| 54 Assistants, laborers, and similar in the process of artisan and industrial manufacture and in repairing and maintenance activities |
| 55 Conductors and assistants of conductors of movable machinery and means of transport |
| 71 Retailers, employees in commerce, and sales agents |
| 72 Street sales and services workers |
| 81 Workers in personal establishments |
| 82 Workers in domestic services |

Notes: CMO codes are first matched to ISCO codes. Brain and brawn categorizations from Vogl (2014).

For the vast majority of the sample, I know exactly where they are living at age 12. If individuals report that they are currently living in the same municipality in which they were living at age 12, I use their current residence. If individuals report that they were living in their municipality of birth when they were 12 years old, I use their municipality of birth. For the remainder of individuals, who make up less than 10% of the sample, I also assign them to their municipality of birth, acknowledging that there will be some measurement error, because municipality of residence at age 12 is a restricted-use variable.
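The three-case assignment rule for the municipality at age 12 can be written as a small function. The sketch below is illustrative only: the `Respondent` fields are hypothetical stand-ins for the survey variables, and the returned flag simply records whether the location is known exactly or imputed.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Respondent:
    # All field names are hypothetical stand-ins for survey variables.
    current_muni: str
    birth_muni: str
    lived_in_current_muni_at_12: bool
    lived_in_birth_muni_at_12: bool


def municipality_at_age_12(r: Respondent) -> Tuple[str, bool]:
    """Return (municipality code, whether the age-12 location is known exactly)."""
    if r.lived_in_current_muni_at_12:
        return r.current_muni, True
    if r.lived_in_birth_muni_at_12:
        return r.birth_muni, True
    # Remainder (under 10% of the sample, per the text): fall back to the
    # municipality of birth and accept some measurement error.
    return r.birth_muni, False
```

Keeping the exactness flag alongside the code makes it easy to check how much of a sample relies on the fallback, or to rerun an analysis on exact matches only.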
For the actual data on local labor market conditions, I use the 1990, 2000, and 2010 Mexican censuses, which span the decades during which individuals in my sample transitioned from elementary to junior high. I use the provided International Standard Classification of Occupations (ISCO) codes to categorize individuals as working in white-collar or blue-collar jobs using the same classification as in Vogl (2014). I then calculate the proportion of men and women in white-collar jobs, separately for each commuting zone. Following Atkin (2016), I use commuting zones instead of individual municipalities because these better represent local labor markets. For instance, large metropolitan areas are often composed of many municipalities, with individuals often working and residing in different ones. I combine all municipalities in the same Zona Metropolitana (according to the 2000 INEGI classification) into a single commuting zone and also combine municipalities where over 10% of the working population in one reports commuting to another for work (according to the more detailed version of the 2000 census, obtained from INEGI).

The census data provides me with gender-specific white-collar proportions for each commuting zone for 1990, 2000 and 2010. However, my empirical test requires knowledge about labor market conditions for each birth cohort at age 12. I use two different methods to assign values to the individuals who turn twelve during intercensal years. The simplest method involves assigning individuals the relevant value from the census just prior to the decade in which they turned 12. This would be the 2000 census for those who were born in the years 1988 to 1997, for example. These results are reported in Table A8. The results discussed in the main body of the paper (Table 2.3) combine census data with national-level growth rates from ENIGH to predict intercensal years.
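The simpler of the two assignment methods described above reduces to a one-line date rule plus a lookup. The sketch below is a minimal illustration, assuming the rule is "census year at the start of the decade in which the individual turned 12"; the function names, the lookup-table layout, and the zone label are hypothetical.

```python
from typing import Dict, Sequence, Tuple

CENSUS_YEARS = (1990, 2000, 2010)


def census_year_for(birth_year: int,
                    census_years: Sequence[int] = CENSUS_YEARS) -> int:
    """Census at the start of the decade in which the individual turned 12."""
    year_turned_12 = birth_year + 12
    decade_start = (year_turned_12 // 10) * 10
    if decade_start not in census_years:
        raise ValueError(f"no census available for {decade_start}")
    return decade_start


def white_collar_proportion(birth_year: int, gender: str, zone: str,
                            table: Dict[Tuple[str, str, int], float]) -> float:
    """Look up the gender-specific proportion for a commuting zone and census year.

    `table` is a hypothetical mapping keyed by (zone, gender, census year).
    """
    return table[(zone, gender, census_year_for(birth_year))]
```

With this rule, birth years 1988 through 1997 all map to the 2000 census, as in the example given in the text.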
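For each year $y$, I calculate national-level growth rates of six major industries² (subscripted by $j$) relative to the most recent census decade $d$. I denote these growth rates $g_{yjd}$. From the census, in addition to the gender-specific proportions of white collar jobs in each decade ($p_{gd}$), I also calculate the gender-specific share of brain-intensive jobs in each industry, $s_{gjd}$. My predicted proportion, $\hat{p}_{gyjd}$, is simply:

\[
\hat{p}_{gyjd} = p_{gd} + \sum_{j=1}^{5} s_{gjd}\, g_{yjd}. \tag{A.1}
\]

² The six broad industry categories I use are: (1) agriculture, (2) oil, natural gas, and construction, (3) education, health, and government, (4) manufacturing, (5) service and hospitality, and (6) trade.

A.3 Structural Estimation Details

In this section, I outline the decision rules, transition probabilities, and likelihood function used to estimate the structural model in section 2.4.2. The last sub-section discusses model fit.

A.3.1 Agent's Decision Rules

To simplify future notation, I collect all the non-stochastic terms of each state's net rewards into one parameter, so that they can now be written

\begin{align*}
Y(hw) - c(hw) &= \Phi_{hw} + \sum_{t=0}^{20} \delta^{t}\, \varepsilon(hw,t) - \big(\eta(hw) - \eta(hb)\big) \\
Y(lw) - c(lw) &= \Phi_{lw} + \sum_{t=0}^{20} \delta^{t}\, \varepsilon(lw,t) - \big(\eta(lw) - \eta(lb)\big) \\
Y(hb) - c(hb) &= \Phi_{hb} + \sum_{t=0}^{20} \delta^{t}\, \varepsilon(hb,t) \\
Y(lb) - c(lb) &= \Phi_{lb} + \sum_{t=0}^{20} \delta^{t}\, \varepsilon(lb,t) \\
Y(hn) - c(hn) &= \Phi_{hn} - \big(\eta(hn) - \eta(hb)\big) \\
Y(ln) - c(ln) &= \Phi_{ln} - \big(\eta(ln) - \eta(lb)\big) \\
Y(h) - c(h) &= \Phi_{h} - \big(\eta(h) - \eta(l)\big),
\end{align*}

where

\begin{align*}
\Phi_{hw} &= \frac{1-\delta^{21}}{1-\delta}\Big[\beta_{w0} + \beta_{w1} + (\beta_{w2} + \beta_{w3})\theta + \sum_{j=10}^{k_w} \beta_{wj} X_{wj}\Big] + \frac{1-\delta^{5}}{1-\delta} \sum_{j=4}^{6} \big(\beta_{wj} + \beta_{w(j+3)}\big)\, \delta^{5(j-3)+1} - \Big(c_{hw} + \alpha_{hw}\theta + \sum_{j=1}^{q_{hw}} \gamma_{hwj} Q_{hwj}\Big) \\
\Phi_{lw} &= \frac{1-\delta^{21}}{1-\delta}\Big[\beta_{w0} + \beta_{w2}\theta + \sum_{j=10}^{k_w} \beta_{wj} X_{wj}\Big] + \frac{1-\delta^{5}}{1-\delta} \sum_{j=4}^{6} \beta_{wj}\, \delta^{5(j-7)+1} - \Big(c_{lw} + \alpha_{lw}\theta + \sum_{j=1}^{q_{lw}} \gamma_{lwj} Q_{lwj}\Big) \\
\Phi_{hb} &= \frac{1-\delta^{21}}{1-\delta}\Big[\beta_{b0} + \beta_{b1} + (\beta_{b2} + \beta_{b3})\theta + \sum_{j=10}^{k_b} \beta_{bj} X_{bj}\Big] + \frac{1-\delta^{5}}{1-\delta} \sum_{j=4}^{6} \big(\beta_{bj} + \beta_{b(j+3)}\big)\, \delta^{5(j-3)+1} \\
\Phi_{lb} &= \frac{1-\delta^{21}}{1-\delta}\Big[\beta_{b0} + \beta_{b2}\theta + \sum_{j=10}^{k_b} \beta_{bj} X_{bj}\Big] + \frac{1-\delta^{5}}{1-\delta} \sum_{j=4}^{6} \beta_{bj}\, \delta^{5(j-7)+1} \\
\Phi_{hn} &= -\Big(c_{hn} + \alpha_{hn}\theta + \sum_{j=1}^{q_{hn}} \gamma_{hnj} Q_{hnj}\Big) \\
\Phi_{ln} &= -\Big(c_{ln} + \alpha_{ln}\theta + \sum_{j=1}^{q_{ln}} \gamma_{lnj} Q_{lnj}\Big) \\
\Phi_{h} &= -\Big(c_{h} + \alpha_{h}\theta + \sum_{j=1}^{q_{h}} \gamma_{hj} Q_{hj}\Big).
\end{align*}

The agent's value function at state $s$ is

\[
V(s) = Y(s) + \underbrace{\max_{s' \in S^{f}(s)} \Big\{ -c(s') + E\big[V(s') \mid I(s)\big] \Big\}}_{CV(s)\,=\,\text{Continuation Value}}, \tag{A.2}
\]

where $I(s)$ denotes the agent's information set at state $s$, and $S^{f}(s)$ denotes the set of feasible states at $s$. The second term on the right-hand side is the continuation value of state $s$, $CV(s)$. This value function determines the agent's decision in each state.

I solve for the optimal decision rules at each node, starting with the terminal nodes. An agent in $s = h$ chooses $hw$ if, given her high school degree, the expected lifetime net rewards from the white collar sector exceed expected lifetime net rewards from the blue collar sector and the expected net rewards from not working, i.e. if

\begin{align}
\Phi_{hw} - \eta(hw) &> \Phi_{hb} - \eta(hb) \quad \text{and} \tag{A.3} \\
\Phi_{hw} - \eta(hw) &> \Phi_{hn} - \eta(hn). \tag{A.4}
\end{align}

An agent chooses $hb$ if

\begin{align}
\Phi_{hb} - \eta(hb) &\geq \Phi_{hw} - \eta(hw) \quad \text{and} \tag{A.5} \\
\Phi_{hb} - \eta(hb) &> \Phi_{hn} - \eta(hn), \tag{A.6}
\end{align}

and chooses $hn$ otherwise. Similarly, an agent in $s = l$ chooses $lw$ if

\begin{align}
\Phi_{lw} - \eta(lw) &> \Phi_{lb} - \eta(lb) \quad \text{and} \tag{A.7} \\
\Phi_{lw} - \eta(lw) &> \Phi_{ln} - \eta(ln), \tag{A.8}
\end{align}

chooses $lb$ if

\begin{align}
\Phi_{lb} - \eta(lb) &\geq \Phi_{lw} - \eta(lw) \quad \text{and} \tag{A.9} \\
\Phi_{lb} - \eta(lb) &> \Phi_{ln} - \eta(ln), \tag{A.10}
\end{align}

and chooses $ln$ otherwise.

These decision rules also help to solve for the agent's optimal decision in $s = 0$. Here, an agent will choose $h$ if the expected net rewards plus continuation value of a high school degree exceed the expected net rewards plus continuation value of dropping out:

\[
E\big[Y(h) - c(h) + CV(h) \mid I(0)\big] > E\big[Y(l) - c(l) + CV(l) \mid I(0)\big], \tag{A.11}
\]

where

\begin{align}
E\big[Y(h) - c(h) + CV(h) \mid I(0)\big] &= E\big[Y(h) + CV(h) \mid I(0)\big] - c(h) \notag \\
&= E\big[CV(h) \mid I(0)\big] + \Phi_{h} - \big(\eta(h) - \eta(l)\big) \notag \\
&= E\Big[\max_{s' \in \{hw,\, hb,\, hn\}} \Big\{ -c(s') + E\big[V(s') \mid I(h)\big] \Big\}\Big] + \Phi_{h} - \big(\eta(h) - \eta(l)\big) \notag \\
&= \tau_{h} \ln\Big(\exp\big(\tfrac{\Phi_{hw}}{\tau_{h}}\big) + \exp\big(\tfrac{\Phi_{hb}}{\tau_{h}}\big) + \exp\big(\tfrac{\Phi_{hn}}{\tau_{h}}\big)\Big) + \Phi_{h} - \big(\eta(h) - \eta(l)\big), \tag{A.12}
\end{align}

where the simplification in the final line is due to the Type 1 extreme value distribution assumption. Similarly,

\begin{align}
E\big[Y(l) - c(l) + CV(l) \mid I(0)\big] &= E\big[CV(l) \mid I(0)\big] \notag \\
&= \tau_{l} \ln\Big(\exp\big(\tfrac{\Phi_{lw}}{\tau_{l}}\big) + \exp\big(\tfrac{\Phi_{lb}}{\tau_{l}}\big) + \exp\big(\tfrac{\Phi_{ln}}{\tau_{l}}\big)\Big). \tag{A.13}
\end{align}

Combining equations A.11, A.12, and A.13, we can derive a cutoff rule for the agent's first decision.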
She will choose h if h ln(exp( hw h ) + exp( hb h ) + exp( hn h )) + h (h) > l ln(exp( lw l ) + exp( lb l ) + exp( ln l ))(l) (A.14) 134 A.3.2 Transition Probabilities and Likelihood Function The individual likelihood contribution of a particular agent is the joint probability of observing that agent's schooling choice, sectoral choice, and (for working individuals) income that is realized in the data. 3 Beginning with the choice probabilities, I dene for each agent an indicator function d(s) which equals 1 if the agent visits states, and calculate the conditional probability of visiting each state inS v (s), the set of visited states. Collecting all of the observed characteristics in D and structural parameters in a vector , we can use the above cuto rules to write out these transition probabilities as follows: Pr(d(h) = 1jD; ;) = 1 + exp 1 0 h ln(e hw h +e hb h +e hn h ) + h l ln(e lw l +e lb l +e ln l ) 1 Pr(d(l) = 1jD; ;) = 1 + exp 1 0 h ln(e hw h +e hb h +e hn h ) + h l ln(e lw l +e lb l +e ln l ) 1 Pr(d(hw) = 1jD; ;) = exp( hw h ) exp( hw h ) + exp( hb h ) + exp( hn h ) 1 Pr(d(hb) = 1jD; ;) = exp( hb h ) exp( hw h ) + exp( hb h ) + exp( hn h ) 1 Pr(d(hn) = 1jD; ;) = exp( hn h ) exp( hw h ) + exp( hb h ) + exp( hn h ) 1 Pr(d(lw) = 1jD; ;) = exp( lw l ) exp( lw l ) + exp( lb l ) + exp( ln l ) 1 Pr(d(lb) = 1jD; ;) = exp( lb l ) exp( lw l ) + exp( lb l ) + exp( ln l ) 1 Pr(d(ln) = 1jD; ;) = exp( ln l ) exp( lw l ) + exp( lb l ) + exp( ln l ) 1 These transition probabilities are combined with the per-period wage functions (of which we only observe one per working individual) to construct the individual likelihood function for observation i. Recall that t i is the age of the individual (in the year their income is observed) minus 30. Then, i's 3 Although the agent observes the idiosyncratic shocks(s 0 ) (before deciding on their next state) and(s) (after making their decision), the researcher does not. 
contribution to the likelihood function is:

$$\int_{-\infty}^{\infty} \left[ \prod_{s \in S} \left\{ \Pr(d_i(s) = 1 \mid D_i, \theta_i(\nu); \psi) \prod_{t=0}^{20} f(Y_i(s,t) \mid D, \theta_i(\nu); \psi)^{1(t_i = t)} \right\}^{1(s \in S_i^v)} \right] dF(\nu), \tag{A.15}$$

where

$$\begin{aligned}
f(Y(hw,t) \mid D, \theta; \psi) &= \phi\left(\beta_{w0} + \beta_{w1} + (\beta_{w2} + \beta_{w3})\theta + \sum_{j=4}^{6} (\beta_{wj} + \beta_{w(j+3)}) A_{j-3}(t) - Y(hw,t)\right) \\
f(Y(lw,t) \mid D, \theta; \psi) &= \phi\left(\beta_{w0} + \beta_{w2}\theta + \sum_{j=4}^{6} \beta_{wj} A_{j-3}(t) - Y(lw,t)\right) \\
f(Y(hb,t) \mid D, \theta; \psi) &= \phi\left(\beta_{b0} + \beta_{b1} + (\beta_{b2} + \beta_{b3})\theta + \sum_{j=4}^{6} (\beta_{bj} + \beta_{b(j+3)}) A_{j-3}(t) - Y(hb,t)\right) \\
f(Y(lb,t) \mid D, \theta; \psi) &= \phi\left(\beta_{b0} + \beta_{b2}\theta + \sum_{j=4}^{6} \beta_{bj} A_{j-3}(t) - Y(lb,t)\right) \\
f(Y(s,t) \mid D, \theta; \psi) &= 1 \quad \forall s \in \{h, l, hn, ln\}.
\end{aligned}$$

$\theta$ is equal to the individual's standardized Raven's test score plus a normal error term $\nu$. The discount factor is set to 0.04. I use maximum likelihood to estimate the structural parameters $\psi$, using 500 simulations to calculate the integral over $\nu$ for each individual. The wage parameters are reported in Table 2.6, and the cost parameters are reported in Table A12.

Table A12 Cost Parameter Estimates

Panel A: Sectoral Choice

| | Estimate | Standard Error |
|---|---|---|
| White Collar - Blue Collar | | |
| HS: Constant | 3.336 | (9.512) |
| HS: Ability | -0.632 | (2.683) |
| HS: White Collar Proportion | 6.369 | (6.486) |
| HS: Male | 7.619 | (2.868)*** |
| HS: Urban | -6.379 | (2.713)** |
| HS: Mother's Education | -0.559 | (0.309)* |
| HS: Father's Education | -0.455 | (0.271)* |
| No HS: Constant | 17.275 | (8.404)** |
| No HS: Ability | -6.260 | (1.508)*** |
| No HS: White Collar Proportion | -6.151 | (6.126) |
| No HS: Male | -5.378 | (1.386)*** |
| No HS: Urban | -4.005 | (1.771)** |
| No HS: Mother's Education | -0.196 | (0.202) |
| No HS: Father's Education | -0.250 | (0.259) |
| No Work - Blue Collar | | |
| HS: Constant | -167.781 | (6.305)*** |
| HS: Ability | -3.151 | (1.893)* |
| HS: Unemployment | -1.214 | (3.664) |
| HS: Male | 34.195 | (8.723)*** |
| HS: Urban | -3.865 | (2.103)* |
| HS: Mother's Education | -0.133 | (0.324) |
| HS: Father's Education | 0.399 | (0.249) |
| No HS: Constant | -146.447 | (1.827)*** |
| No HS: Ability | 3.906 | (2.301)* |
| No HS: Unemployment | -2.271 | (2.868) |
| No HS: Male | 0.660 | (11.151) |
| No HS: Urban | -5.400 | (1.798)*** |
| No HS: Mother's Education | 0.092 | (0.107) |
| No HS: Father's Education | 0.040 | (0.053) |
| HS: Scale Parameter | 11.429 | (2.313)*** |
| No HS: Scale Parameter | 2.451 | (2.442) |

Panel B: Schooling Choice

| | Estimate | Standard Error |
|---|---|---|
| Constant | 31.699 | (6.254)*** |
| Ability | -6.991 | (2.256)*** |
| Male | -12.480 | (3.878)*** |
| Mother's Education | -0.237 | (0.332) |
| Father's Education | -0.462 | (0.350) |
| Urban | -1.181 | (1.746) |
| Youth employment rate, per 10 children | 2.297 | (1.827) |
| Number of teachers per 10 school-aged children | -3.302 | (2.301) |
| Scale Parameter | 1.769 | (1.072)* |

Notes: Standard errors are calculated analytically using the information matrix. * p < 0.1, ** p < 0.05, *** p < 0.01.

Appendix B
Appendix to Chapter 3 (Helping Children Catch Up)

B.1 Appendix Tables

Figure B1 Proportion of Individuals Not Living in Household, by Age

Table B1 Exposure to Progresa

The last three columns report years exposed to PROGRESA in 2003.

| Age in 1998 | School Grade in 1998 | Age in 2003 | Treatment Villages | Control Villages | Difference in Exposure |
|---|---|---|---|---|---|
| 5 | - | 10 | 3 | 3 | 0 |
| 6 | 1st year primary | 11 | 4 | 4 | 0 |
| 7 | 2nd year primary | 12 | 5 | 4 | 1 |
| 8 | 3rd year primary | 13 | 6 | 4 | 2 |
| 9 | 4th year primary | 14 | 6 | 4 | 2 |
| 10 | 5th year primary | 15 | 6 | 4 | 2 |
| 11 | 6th year primary | 16 | 6 | 4 | 2 |
| 12 | 1st year junior high | 17 | 6 | 4 | 2 |
| 13 | 2nd year junior high | 18 | 4 | 2 | 2 |
| 14 | 3rd year junior high | 19 | 2 | 1 | 1 |
| 15 | 1st year high school | 20 | 0 | 0 | 0 |
| 16 | 2nd year high school | 21 | 0 | 0 | 0 |

Figure B2 Progresa Localities by Proportion of Years with a Rainfall Shock, 1985-1991

Notes: Percentages in the legend correspond to the proportion of years from 1985 to 1991 (in which rainfall data was available for that locality) that a rainfall shock was experienced.
Table B2 Summary Statistics for Control Variables

Panel A: Household-level

| | Full Sample | Treatment Villages | Control Villages | Treatment - Control Differences |
|---|---|---|---|---|
| Household size | 7.415 (2.190) | 7.422 (2.215) | 7.403 (2.150) | 0.0190 (0.0407) |
| Household head age | 41.73 (11.29) | 41.42 (11.09) | 42.21 (11.58) | -0.794*** (0.210) |
| Female household head | 0.0565 (0.231) | 0.0563 (0.231) | 0.0568 (0.231) | -0.000474 (0.00429) |
| Number of children aged 0-2 | 0.0729 (0.0865) | 0.0735 (0.0860) | 0.0719 (0.0872) | 0.00158 (0.00161) |
| Number of children aged 3-5 | 0.101 (0.0961) | 0.103 (0.0961) | 0.0992 (0.0960) | 0.00336* (0.00178) |
| Number of boys aged 6-7 | 0.0520 (0.0774) | 0.0509 (0.0761) | 0.0537 (0.0793) | -0.00275* (0.00144) |
| Number of boys aged 8-12 | 0.124 (0.113) | 0.126 (0.113) | 0.121 (0.113) | 0.00488** (0.00210) |
| Number of boys aged 13-18 | 0.0696 (0.0947) | 0.0699 (0.0958) | 0.0692 (0.0929) | 0.000671 (0.00176) |
| Number of girls aged 6-7 | 0.0513 (0.0763) | 0.0516 (0.0766) | 0.0508 (0.0758) | 0.000818 (0.00142) |
| Number of girls aged 8-12 | 0.120 (0.112) | 0.119 (0.111) | 0.121 (0.114) | -0.00226 (0.00208) |
| Number of girls aged 13-18 | 0.0658 (0.0911) | 0.0653 (0.0914) | 0.0667 (0.0908) | -0.00144 (0.00169) |
| Number of women aged 19-54 | 0.160 (0.0611) | 0.159 (0.0608) | 0.160 (0.0617) | -0.000961 (0.00114) |
| Number of men aged 55 and over | 0.0185 (0.0506) | 0.0182 (0.0506) | 0.0190 (0.0507) | -0.000871 (0.000941) |
| Number of women aged 55 and over | 0.0173 (0.0503) | 0.0166 (0.0496) | 0.0184 (0.0513) | -0.00179* (0.000934) |
| Mother's educational attainment | 3.926 (2.071) | 3.924 (2.068) | 3.928 (2.075) | -0.00403 (0.0476) |
| Mother's educational attainment missing | 0.342 (0.474) | 0.333 (0.471) | 0.357 (0.479) | -0.0238*** (0.00881) |
| Father's educational attainment | 3.977 (2.247) | 4.033 (2.313) | 3.889 (2.136) | 0.144*** (0.0502) |
| Father's educational attainment missing | 0.307 (0.461) | 0.304 (0.460) | 0.312 (0.463) | -0.00798 (0.00857) |
| Mother speaks indigenous language | 0.378 (0.485) | 0.373 (0.484) | 0.385 (0.487) | -0.0124 (0.00920) |
| Mother's language missing | 0.0408 (0.198) | 0.0390 (0.194) | 0.0436 (0.204) | -0.00460 (0.00367) |
| Father speaks indigenous language | 0.392 (0.488) | 0.383 (0.486) | 0.405 (0.491) | -0.0215** (0.00953) |
| Father's language missing | 0.0957 (0.294) | 0.0969 (0.296) | 0.0939 (0.292) | 0.00305 (0.00546) |
| Number of households | 6233 | 3795 | 2438 | |

Panel B: Locality-level

| | Full Sample | Treatment Villages | Control Villages | Treatment - Control Differences |
|---|---|---|---|---|
| Community Well | 0.376 (0.485) | 0.366 (0.483) | 0.393 (0.490) | -0.0269 (0.0486) |
| Well Spring | 0.481 (0.500) | 0.510 (0.501) | 0.436 (0.497) | 0.0741 (0.0500) |
| Public Water Network | 0.148 (0.355) | 0.121 (0.326) | 0.190 (0.394) | -0.0696* (0.0354) |
| Bury Garbage | 0.181 (0.385) | 0.206 (0.405) | 0.141 (0.349) | 0.0651* (0.0385) |
| Public Dumpster | 0.0167 (0.128) | 0.00778 (0.0880) | 0.0307 (0.173) | -0.0229* (0.0128) |
| Public Drainage | 0.0381 (0.192) | 0.0350 (0.184) | 0.0429 (0.203) | -0.00793 (0.0192) |
| Public Phone | 0.519 (0.500) | 0.518 (0.501) | 0.521 (0.501) | -0.00396 (0.0501) |
| Hospital or health center | 0.150 (0.357) | 0.132 (0.339) | 0.178 (0.384) | -0.0456 (0.0358) |
| Distance to health center | 13.52 (24.43) | 13.74 (24.31) | 13.17 (24.67) | 0.574 (2.449) |
| DICONSA store | 0.238 (0.426) | 0.261 (0.440) | 0.202 (0.403) | 0.0582 (0.0427) |
| Distance to Bank | 38.72 (51.76) | 40.50 (59.25) | 36.01 (37.62) | 4.482 (5.497) |
| Distance to Bank Missing | 0.117 (0.321) | 0.128 (0.335) | 0.0982 (0.298) | 0.0302 (0.0322) |
| Distance to Secondary School | 11.82 (15.89) | 12.17 (15.95) | 11.33 (15.91) | 0.836 (2.438) |
| Distance to Secondary School Missing | 0.581 (0.494) | 0.599 (0.491) | 0.552 (0.499) | 0.0471 (0.0495) |
| Number of localities | 420 | 257 | 163 | |

Notes: Standard errors in parentheses (*** p<0.01, ** p<0.05, * p<0.1). Missing indicators for parental education and language are binary variables equal to 1 for individuals missing the relevant information. Community well, well spring, public water network, public dumpster, public drainage, public phone, hospital or health center, and DICONSA store are all indicators equal to 1 for localities that have the relevant public good or facility. Bury garbage is an indicator equal to 1 for localities that report burying garbage as their main form of garbage disposal. Distances reported in kilometers. Missing distance variables are indicators for localities that did not report a distance to the nearest secondary school or bank.

To verify that our results are not being driven by the imbalance in rainfall shock prevalence across treatment and control, we repeat our analysis using the trimmed sample described in Section 3.3, in which rainfall shock prevalence is the same across treatment and control villages. As Tables B3, B4, and B5 show, our results are virtually identical to the full sample results.
Table B3 Effects of Progresa and Rainfall on Education Outcomes: Trimmed Sample

| | (1) Years of Education | (2) Grade Progression | (3) Appropriate Grade Completion | (4) Years of Education | (5) Grade Progression | (6) Appropriate Grade Completion |
|---|---|---|---|---|---|---|
| Panel A: Main Effects Only | | | | | | |
| Years of Progresa Exposure | 0.144 (0.0410)*** | 0.0152 (0.0108) | 0.0191 (0.00829)** | 0.0832 (0.0499)* | -0.00219 (0.0138) | 0.00265 (0.0119) |
| No Rainfall Shock | 0.129 (0.0585)** | 0.0101 (0.0151) | 0.0321 (0.0117)*** | 0.0698 (0.0578) | -0.00624 (0.0145) | 0.0200 (0.0114)* |
| Panel B: Main Effects and Interaction | | | | | | |
| Years of Progresa Exposure | 0.234 (0.0576)*** | 0.0315 (0.0138)** | 0.0352 (0.0114)*** | 0.192 (0.0611)*** | 0.0170 (0.0163) | 0.0235 (0.0146) |
| No Rainfall Shock | 0.711 (0.293)** | 0.116 (0.0579)** | 0.136 (0.0531)** | 0.767 (0.277)*** | 0.118 (0.0594)** | 0.154 (0.0553)*** |
| No Shock x Exposure | -0.119 (0.0556)** | -0.0216 (0.0113)* | -0.0213 (0.0107)** | -0.142 (0.0528)*** | -0.0253 (0.0118)** | -0.0274 (0.0111)** |
| Observations | 10236 | 9713 | 10236 | 10236 | 9713 | 10236 |
| Mean of Dependent Variable | 6.780 | 0.586 | 0.470 | 6.780 | 0.586 | 0.470 |
| Fixed Effects | Birth year x state | Birth year x state | Birth year x state | Birth year x state, Municipality | Birth year x state, Municipality | Birth year x state, Municipality |

Notes:
- Standard errors clustered at the municipality level are reported in parentheses (*** p<0.01, ** p<0.05, * p<0.1).
- "No rainfall shock" = 1 for individuals whose birth-year rainfall was within one standard deviation of the 10-year historical locality-specific mean.
- All specifications include gender, household head gender and age, household size, household composition variables, parental education, parental language, and locality characteristics. Controls for parental language/education and locality distance include dummies for missing values.

Table B4 Interaction Effects on Schooling Completion by Grade: Trimmed Sample

Columns (1)-(4) are primary school (3 to 6 years), columns (5)-(7) are junior high school (7 to 9 years), and columns (8)-(10) are high school (10 to 12 years).

| | (1) 3 yrs | (2) 4 yrs | (3) 5 yrs | (4) 6 yrs | (5) 7 yrs | (6) 8 yrs | (7) 9 yrs | (8) 10 yrs | (9) 11 yrs | (10) 12 yrs |
|---|---|---|---|---|---|---|---|---|---|---|
| Years of Progresa Exposure | 0.00317 (0.00399) | 0.0125 (0.00535)** | 0.0178 (0.00723)** | 0.0217 (0.00925)** | 0.0565 (0.0151)*** | 0.0501 (0.0139)*** | 0.0459 (0.0125)*** | 0.0147 (0.00670)** | 0.00760 (0.00323)** | 0.00272 (0.00226) |
| No Rainfall Shock | -0.00988 (0.0216) | 0.00827 (0.0295) | 0.0325 (0.0394) | 0.0309 (0.0503) | 0.177 (0.0643)*** | 0.187 (0.0633)*** | 0.173 (0.0634)*** | 0.0586 (0.0393) | 0.0479 (0.0208)** | 0.0166 (0.0151) |
| No Shock x Exposure | 0.00199 (0.00416) | -0.00214 (0.00554) | -0.00487 (0.00735) | -0.00308 (0.00950) | -0.0278 (0.0126)** | -0.0335 (0.0121)*** | -0.0316 (0.0121)*** | -0.0103 (0.00778) | -0.00742 (0.00394)* | -0.00277 (0.00286) |
| Observations | 10236 | 10236 | 10236 | 10236 | 10236 | 10236 | 10236 | 10236 | 10236 | 10236 |
| Mean of Dependent Variable | 0.970 | 0.934 | 0.881 | 0.783 | 0.483 | 0.368 | 0.258 | 0.0610 | 0.0311 | 0.0124 |

Fixed Effects (all columns): Birth year x state

Notes:
- Standard errors clustered at the municipality level are reported in parentheses (*** p<0.01, ** p<0.05, * p<0.1).
- "No rainfall shock" = 1 for individuals whose birth-year rainfall was within one standard deviation of the 10-year historical locality-specific mean.
- All specifications include gender, household head gender and age, household size, household composition variables, parental education, parental language, and locality characteristics. Controls for parental language/education and locality distance include dummies for missing values.

Table B5 Effects of Progresa and Rainfall on Longer-Term Outcomes: Trimmed Sample

| | (1) Currently Enrolled w/ HS Degree | (2) Worked this Week | (3) Worked this Year | (4) Worked in Non-Laborer Job | (5) Enrolled or Currently Working | (6) Enrolled or Worked this Year | (7) Enrolled or Worked in Non-Laborer Job |
|---|---|---|---|---|---|---|---|
| Years of Progresa Exposure | 0.0101 (0.0145) | 0.0777 (0.0462)* | 0.0816 (0.0407)** | 0.102 (0.0396)** | 0.0673 (0.0485) | 0.0857 (0.0485)* | 0.105 (0.0461)** |
| No Rainfall Shock | 0.0958 (0.0559)* | 0.162 (0.163) | 0.191 (0.137) | 0.315 (0.129)** | 0.145 (0.155) | 0.221 (0.159) | 0.351 (0.148)** |
| No Shock x Exposure | -0.0161 (0.0178) | -0.0788 (0.0481) | -0.0844 (0.0419)** | -0.123 (0.0408)*** | -0.0655 (0.0486) | -0.0859 (0.0487)* | -0.125 (0.0456)*** |
| Observations | 1320 | 970 | 966 | 966 | 969 | 963 | 962 |
| Mean of Dependent Variable | 0.0652 | 0.494 | 0.519 | 0.348 | 0.556 | 0.576 | 0.411 |

Fixed Effects (all columns): Birth year x state

Notes:
- Standard errors clustered at the municipality level are reported in parentheses (*** p<0.01, ** p<0.05, * p<0.1).
- "No rainfall shock" = 1 for individuals whose birth-year rainfall was within one standard deviation of the 10-year historical locality-specific mean.
- All specifications include gender, household head gender and age, household size, household composition variables, parental education, parental language, and locality characteristics. Controls for parental language/education and locality distance include dummies for missing values.
- These regressions restrict to individuals aged 18 in 2003.
Table B6 Effects of Progresa and Rainfall on Education and Employment Outcomes, Controlling for Rainfall Shock Interactions with Unbalanced Characteristics

| | (1) Years of Education | (2) Grade Progression | (3) Appropriate Grade Completion | (4) Currently Enrolled w/ HS Degree | (5) Worked this Week | (6) Worked this Year | (7) Worked in Non-Laborer Job | (8) Enrolled or Worked in Non-Laborer Job |
|---|---|---|---|---|---|---|---|---|
| Years of Progresa Exposure | 0.215 (0.0553)*** | 0.0297 (0.0134)** | 0.0315 (0.0110)*** | 0.0225 (0.0142) | 0.102 (0.0507)** | 0.0861 (0.0477)* | 0.0810 (0.0513) | 0.0901 (0.0505)* |
| No Rainfall Shock | 0.513 (0.356) | 0.144 (0.0843)* | 0.0705 (0.0770) | 0.203 (0.0856)** | 0.00861 (0.372) | 0.262 (0.364) | 0.563 (0.360) | 0.629 (0.375)* |
| No Shock x Exposure | -0.110 (0.0537)** | -0.0194 (0.0110)* | -0.0185 (0.0102)* | -0.0269 (0.0178) | -0.102 (0.0524)* | -0.0843 (0.0481)* | -0.0961 (0.0515)* | -0.105 (0.0507)** |
| Observations | 11824 | 11216 | 11824 | 1597 | 1147 | 1143 | 1143 | 1138 |
| Mean of Dependent Variable | 6.787 | 0.579 | 0.465 | 0.0607 | 0.502 | 0.532 | 0.354 | 0.414 |
| Ages | 12 to 18 | 12 to 18 | 12 to 18 | 18 | 18 | 18 | 18 | 18 |

Fixed Effects (all columns): Birth year x state

Notes:
- Standard errors clustered at the municipality level are in parentheses (*** p<0.01, ** p<0.05, * p<0.1).
- "No rainfall shock" = 1 for individuals whose birth-year rainfall was within one standard deviation of the 10-year historical locality-specific mean.
- All specifications include gender, household head gender and age, household size, household composition variables, parental education, parental language, and locality characteristics. Controls for parental language/education and locality distance include dummies for missing values.
- All specifications include interactions between the rainfall shock variable and each of the control variables that are unbalanced across treatment and control villages (see Table A2).

Table B6 reports the results of regressions on our main outcomes of interest, additionally controlling for interactions between the rainfall shock variable and each of the control variables that are not balanced across treatment and control groups. It should be noted that the main effect can no longer be interpreted as an overall endowment shock, as these specifications include a number of interactions that need to be summed in order to obtain the total effect of rainfall. What is important to note is that the years of exposure and interaction coefficients remain very similar to the main results reported in the body of this paper.

Table B7 Effects of Progresa on Woodcock-Johnson Test Scores

| | (1) Letter Word Identification | (2) Applied Problems | (3) Dictation | (4) Average Score |
|---|---|---|---|---|
| Years of Progresa Exposure | -0.0515 (0.0513) | 0.0145 (0.0500) | 0.0596 (0.0605) | 0.00792 (0.0448) |
| No Rainfall Shock | -0.133 (0.229) | 0.146 (0.252) | 0.182 (0.281) | 0.0643 (0.210) |
| No Shock x Exposure | 0.0557 (0.0481) | -0.000391 (0.0517) | -0.0333 (0.0569) | 0.00456 (0.0440) |
| Observations | 1593 | 1586 | 1581 | 1571 |

Fixed Effects (all columns): Birth year x state

Notes:
- Standard errors clustered at the municipality level are in parentheses (*** p<0.01, ** p<0.05, * p<0.1).
- "No rainfall shock" = 1 for individuals whose birth-year rainfall was within one standard deviation of the 10-year historical locality-specific mean.
- All specifications include gender, household head gender and age, household size, household composition variables, parental education, parental language, and locality characteristics. Controls for parental language/education and locality distance include dummies for missing values.
- Sample includes individuals aged 15 to 21.
- Scores are standardized by test type, and the average score in column 4 takes the average across all three z-scores.

In 2003, Woodcock-Johnson dictation, word identification, and applied problems tests were administered to a sub-sample of individuals aged 15 to 21. Table B7 reports the results of regressions on these standardized test scores.

Table B8 Effects of Progresa and Rainfall on Education and Employment Outcomes, Controlling for Other Government Programs

| | (1) Years of Education | (2) Grade Progression | (3) Appropriate Grade Completion | (4) Currently Enrolled w/ HS Degree | (5) Worked this Week | (6) Worked this Year | (7) Worked in Non-Laborer Job | (8) Enrolled or Worked in Non-Laborer Job |
|---|---|---|---|---|---|---|---|---|
| Years of Progresa Exposure | 0.207 (0.0542)*** | 0.0292 (0.0126)** | 0.0308 (0.0107)*** | 0.00914 (0.0122) | 0.0998 (0.0436)** | 0.0877 (0.0382)** | 0.0942 (0.0376)** | 0.0893 (0.0401)** |
| No Rainfall Shock | 0.636 (0.283)** | 0.104 (0.0539)* | 0.121 (0.0502)** | 0.0819 (0.0467)* | 0.247 (0.155) | 0.197 (0.126) | 0.245 (0.118)** | 0.263 (0.122)** |
| No Shock x Exposure | -0.108 (0.0535)** | -0.0189 (0.0106)* | -0.0193 (0.0100)* | -0.0127 (0.0149) | -0.0985 (0.0454)** | -0.0869 (0.0394)** | -0.111 (0.0374)*** | -0.104 (0.0390)*** |
| Observations | 11734 | 11135 | 11734 | 1587 | 1138 | 1134 | 1134 | 1131 |
| Mean of Dependent Variable | 6.786 | 0.579 | 0.464 | 0.0605 | 0.500 | 0.532 | 0.353 | 0.412 |
| Ages | 12 to 18 | 12 to 18 | 12 to 18 | 18 | 18 | 18 | 18 | 18 |

Fixed Effects (all columns): Birth year x state

Notes:
- Standard errors clustered at the municipality level are in parentheses (*** p<0.01, ** p<0.05, * p<0.1).
- "No rainfall shock" = 1 for individuals whose birth-year rainfall was within one standard deviation of the 10-year historical locality-specific mean.
- All specifications include gender, household head gender and age, household size, household composition variables, parental education, parental language, and locality characteristics. Controls for parental language/education and locality distance include dummies for missing values.
- All specifications control for household receipt of PROCAMPO cash transfers, indicators for corn, sugar, and kidney bean growing localities interacted with birth year dummies, and the individual's age in the year PROCEDE reached its locality (along with a dummy for individuals missing PROCEDE information, for whom the PROCEDE age variable is set to zero).

Table B8 reports the results of regressions on our main outcomes of interest, taking into account other contemporaneous programs and policies, including PROCAMPO, PROCEDE, and crop-specific agricultural policies.

Appendix C
Appendix to Chapter 4 (Reporting Heterogeneity and Health Disparities)

C.1 Description of Datasets

C.1.1 Indonesian Family Life Survey (IFLS)

I use the 2007 wave of the IFLS, an ongoing longitudinal household survey of individuals in 13 out of the 27 Indonesian provinces, representative of 80% of the Indonesian population. This paper utilizes information from the individual-level demographic and health status modules. IFLS 4 also randomly chose 2,500 households to participate in the health vignette module. In selected households, all adults over 40 were asked the six domain-specific health questions.

Crucially, the IFLS included three anchoring vignettes per health domain in addition to the above self reports. While all vignette households were asked all of the questions listed above, due to time constraints each vignette household was only assigned to respond to anchoring vignettes for two randomly chosen domains out of the six, leaving between 1,100 and 1,300 individuals per domain. During the interview, the interviewers read aloud a vignette like the one described in Section 4.2 (see Appendix section C.2 for a list of all of the vignettes). They then repeated the domain-relevant question from the list of questions above (of course replacing the word "you" with the name of the hypothetical vignette person). The gender of the hypothetical individuals, depicted through their names, was randomized at the household level.
Answers to the health status questions and anchoring vignettes form the outcome variables of interest for this analysis. Purposely focusing on a set of simple explanatory variables in order to facilitate comparisons with the three other datasets, I use gender, age, and education levels. Specifically, I create a dummy variable for males, a dummy for high school graduates, and a dummy for those who completed primary but not high school.

C.1.2 Health and Retirement Study (HRS)

Since 1992, the HRS has interviewed a representative sample of Americans older than 50, re-interviewing the original sample and adding new cohorts every 2 years. In 2007, an "off-year" in between two main interview years, the Disability Vignette Study (DVS) was sent out as a mail survey to a subsample, of which 81.7% (over 4,000) responded. This survey included the exact same anchoring vignettes for the same six domains found in the IFLS vignette modules, except with American instead of Indonesian names. Unlike the IFLS, two versions of the questionnaires, which ordered the questions differently and used different genders for the hypothetical individuals, were used. I combine data from this off-year study with data from the most recent main survey prior to it, which took place in 2006. From the 2006 interviews, I obtain the basic explanatory variables: age, gender, and educational attainment. Since the vast majority of HRS respondents are high school graduates, I use college graduation as my "high education" group and high school graduates (who have not completed college) as my "medium education" group.

C.1.3 English Longitudinal Study of Aging (ELSA)

Similar to the HRS, the ELSA is a longitudinal panel of individuals aged over 50 living in England (Marmot et al., 2014). Since 2002, the representative sample, which was initially drawn from the Health Survey for England, has been re-interviewed every two years. The ELSA sample was also refreshed at waves 3, 4, and 6.
I use data from the third wave, collected during 2006 and 2007, which included self-completion vignette questionnaires that were handed out to a randomly selected third of the sample (and completed by almost 2,500 individuals). Individuals were asked to rate their own health in the six domains and then to respond to the same vignettes found in the IFLS and HRS. Unlike the other datasets, which randomized the genders of vignette individuals in varying ways, the ELSA only had one version of the questionnaire, which had the same names (and thus genders) assigned to the same questions for all respondents. The vignette genders alternated throughout the questionnaire, with half of the vignette individuals assigned female names and the other half male names.

Along with respondent age and gender, I use degree qualifications as my education variable because precise years of schooling are not included in this survey. The "high-education" category includes those who have received their A-levels or higher, while the "medium-education" category includes all qualifications lower than A-levels. This leaves those with no qualifications as the low-education group.

C.1.4 China Health and Retirement Longitudinal Study (CHARLS)

Finally, I also use data from the first wave of the CHARLS,¹ conducted in 2011 (Zhao et al., 2013). Very similar to the other two longitudinal aging studies described above (the HRS and ELSA), the CHARLS has interviewed a representative sample of over 17,000 Chinese residents aged 45 and older and plans to follow up with the respondents every two years. The CHARLS is one of very few Chinese surveys that include domain-specific self-reports and vignette questions, which are asked as part of the full in-person interview for a random sub-sample of households. Like in the IFLS, each vignette household is randomly assigned to 2 out of the 6 domains, resulting in around 1,100 to 1,300 respondents per domain.
The genders of the hypothetical individuals are also randomized at the household level. As control variables, I use age, gender, and years of schooling. Because high school graduation rates for this sample are so low (less than 10%), I use junior high school completion as my "high education" cutoff and primary school completion as the boundary between the medium and low-education groups.

C.1.5 Self-Report Distributions

Figures C1 and C2 explore within-country differences across gender and education. Figure C1 depicts the distribution of self-report responses by gender for each dataset separately. On each domain graph, I report the p-value corresponding to the Pearson chi-squared statistic for the test of the null hypothesis that the distribution of the responses is the same for males and females. In the IFLS and CHARLS, for pain, cognition, affect, and sleep, males and females have significantly different self-report distributions, with males disproportionately falling in the healthiest category. In the HRS, there are significantly different male and female distributions in the cognition, affect, and sleep domains going in the same direction. In the ELSA, the domains that exhibit significant gender differences are pain, sleep, and affect.

Figure C2 shows even more drastically different distributions of self-reports, this time between high education and "lower" education groups (for which I pool the medium and low education categories). In virtually all domains in all four samples (with the exception of cognition and affect in the IFLS), the distributions are significantly different, with the higher education group disproportionally represented in the healthiest categories.

¹ CHARLS is conducted by the National School of Development (China Center for Economic Research) at Beijing University. See http://charls.ccer.edu.cn/charls/ for more detail.
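The Pearson chi-squared test described above can be sketched as follows. The response counts are hypothetical and serve only to illustrate the calculation for one domain with five ordered response categories; they are not taken from any of the four surveys. The upper-tail probability uses a closed form that holds only for even degrees of freedom, which covers the two-group, five-category case (four degrees of freedom) assumed here.

```python
import math


def pearson_chi2(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def chi2_sf_even_dof(x, dof):
    """Upper-tail chi-squared probability; closed form valid for even dof only."""
    if dof % 2 != 0:
        raise ValueError("closed form implemented for even degrees of freedom only")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(dof // 2))


# Hypothetical counts per response category, healthiest category first
males = [220, 180, 90, 40, 20]
females = [170, 190, 110, 55, 25]

stat, dof = pearson_chi2([males, females])
p_value = chi2_sf_even_dof(stat, dof)
```

A small `p_value` leads to rejecting the null hypothesis that the two groups share the same distribution of responses.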
Figure C1 Distribution of Self-Reports by Gender

Figure C2 Distribution of Self-Reports by Education

C.2 Anchoring Vignette Questions

These vignette questions were taken from the IFLS, but the HRS, ELSA, and CHARLS data all use the same scenarios except with different names.

C.2.1 Domain: Mobility

Pak Taryono/Bu Taryini is able to walk distances of up to 200 metres without any problems but feels tired after walking one kilometer. He has no problems with day-to-day activities, such as carrying food from the market.

Pak Tumino/Bu Tumini does not exercise. He cannot climb stairs or do other physical activities because he is obese. He is able to carry the groceries and do some light household work.

Pak Sidik/Bu Endah has a lot of swelling in his legs due to his health condition. He has to make an effort to walk around his home as his legs feel heavy.

C.2.2 Domain: Pain

Pak Budiarto/Bu Budiarti has a headache once a month that is relieved after taking a pill. During the headache she can carry on with her day-to-day affairs.

Pak Sumarno/Bu Sumarni has pain that radiates down her right arm and wrist during her day at work. This is slightly relieved in the evenings when she is no longer working on her computer.

Pak Mulyono/Bu Mulyanti has pain in his knees, elbows, wrists and fingers, and the pain is present almost all the time. Although medication helps, he feels uncomfortable when moving around, holding and lifting things.

C.2.3 Domain: Cognition

Pak Taryono/Bu Taryini can concentrate while watching TV, reading a magazine or playing a game of cards or chess. Once a week he forgets where his keys or glasses are, but finds them within five minutes.

Pak Suwarso/Bu Suwarsih is keen to learn new recipes but finds that she often makes mistakes and has to reread several times before she is able to do them properly.
Pak Mugiono/Bu Mugianti cannot concentrate for more than 15 minutes and has difficulty paying attention to what is being said to him. Whenever he starts a task, he never manages to finish it and often forgets what he was doing. He is able to learn the names of people he meets.

C.2.4 Domain: Sleep

Pak Partono/Bu Partini falls asleep easily at night, but two nights a week she wakes up in the middle of the night and cannot go back to sleep for the rest of the night.

Pak Darma/Bu Darmi wakes up almost once every hour during the night. When he wakes up in the night, it takes around 15 minutes for him to go back to sleep. In the morning he does not feel well-rested.

Pak Parto/Bu Parti takes about two hours every night to fall asleep. He wakes up once or twice a night feeling panicked and takes more than one hour to fall asleep again.

C.2.5 Domain: Affect

Pak Arman/Bu Lina enjoys her work and social activities and is generally satisfied with her life. She gets depressed every 3 weeks for a day or two and loses interest in what she usually enjoys but is able to carry on with her day-to-day activities.

Pak Sukarso/Bu Sukarsih feels nervous and anxious. He worries and thinks negatively about the future, but feels better in the company of people or when doing something that really interests him. When he is alone he tends to feel useless and empty.

Pak Rano/Bu Rina feels depressed most of the time. She weeps frequently and feels hopeless about the future. She feels that she has become a burden on others and that she would be better dead.

C.2.6 Domain: Breathing

Pak Sugiarto/Bu Suwarsih has no problems while walking slowly. He gets out of breath easily when climbing uphill for 20 meters or a flight of stairs.

Pak Ramlan/Bu Badriah suffers from respiratory infections about once every year. He is short of breath 3 or 4 times a week and had to be admitted in hospital twice in the past month with a bad cough that required treatment with antibiotics.
Pak Hamid/Bu Karsini has been a heavy smoker for 30 years and wakes up with a cough every morning. He gets short of breath even while resting and does not leave the house anymore. He often needs to be put on oxygen.

C.3 Estimation Details and Likelihood Function

Using maximum likelihood to estimate the model described in Section 4.3, I normalize $\sigma_\varepsilon^2 = 1$ and estimate $\sigma_u^2$, as these are not separately identified. I also normalize $\theta_3 = 0$. Due to the independence of $\varepsilon_i$ and $v_i$, the individual likelihood contribution, conditional on $u_i$, is simply the product of four cumulative normal probabilities (one for the latent health equation and one for each of the three vignettes). I calculate the unconditional likelihood contribution of each individual using simulated methods, drawing 50 $u_i$'s from a normal distribution and taking the average of the individual likelihood contribution over the $u_i$ draws.²

In order to express the log-likelihood function, I define the indicator function $D_{ijj_1j_2j_3} = 1(Y_i = j, Y_{1i} = j_1, Y_{2i} = j_2, Y_{3i} = j_3)$. Then, the likelihood function is:

$$L(\beta, \gamma, \theta, \sigma_v, \sigma_u) = \prod_{i=1}^{N} \prod_{j=1}^{5} \prod_{j_3=1}^{5} \prod_{j_2=1}^{5} \prod_{j_1=1}^{5} \Pr(Y_i = j, Y_{1i} = j_1, Y_{2i} = j_2, Y_{3i} = j_3)^{D_{ijj_1j_2j_3}}.$$

I calculate the unconditional likelihood contribution of individual $i$, $\prod_{j=1}^{5} \prod_{j_3=1}^{5} \prod_{j_2=1}^{5} \prod_{j_1=1}^{5} \Pr(Y_i = j, Y_{1i} = j_1, Y_{2i} = j_2, Y_{3i} = j_3)^{D_{ijj_1j_2j_3}}$, by taking the average of the following conditional likelihood contribution over 50 simulated $u_i$'s (from a standard normal distribution) for each individual.
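$$\begin{aligned}
\Pr(Y_i = j, Y_{1i} = j_1, Y_{2i} = j_2, Y_{3i} = j_3 \mid u_i) = {} & \left[\Phi\left(\tau_i^{j}(u_i) - X_i\beta\right) - \Phi\left(\tau_i^{j-1}(u_i) - X_i\beta\right)\right] \\
& \times \left[\Phi\left(\frac{\tau_i^{j_1}(u_i) - \theta_1}{\sigma_v}\right) - \Phi\left(\frac{\tau_i^{j_1 - 1}(u_i) - \theta_1}{\sigma_v}\right)\right] \\
& \times \left[\Phi\left(\frac{\tau_i^{j_2}(u_i) - \theta_2}{\sigma_v}\right) - \Phi\left(\frac{\tau_i^{j_2 - 1}(u_i) - \theta_2}{\sigma_v}\right)\right] \\
& \times \left[\Phi\left(\frac{\tau_i^{j_3}(u_i)}{\sigma_v}\right) - \Phi\left(\frac{\tau_i^{j_3 - 1}(u_i)}{\sigma_v}\right)\right].
\end{aligned} \tag{C.1}$$

For $j, j_1, j_2, j_3 > 2$, this becomes

$$\begin{aligned}
= {} & \left[\Phi\left((\gamma^1 - \beta)X_i + \sum_{n=2}^{j} e^{\gamma^n X_i} + \sigma_u u_i\right) - \Phi\left((\gamma^1 - \beta)X_i + \sum_{n=2}^{j-1} e^{\gamma^n X_i} + \sigma_u u_i\right)\right] \\
& \times \left[\Phi\left(\frac{\gamma^1 X_i - \theta_1 + \sum_{n=2}^{j_1} e^{\gamma^n X_i} + \sigma_u u_i}{\sigma_v}\right) - \Phi\left(\frac{\gamma^1 X_i - \theta_1 + \sum_{n=2}^{j_1 - 1} e^{\gamma^n X_i} + \sigma_u u_i}{\sigma_v}\right)\right] \\
& \times \left[\Phi\left(\frac{\gamma^1 X_i - \theta_2 + \sum_{n=2}^{j_2} e^{\gamma^n X_i} + \sigma_u u_i}{\sigma_v}\right) - \Phi\left(\frac{\gamma^1 X_i - \theta_2 + \sum_{n=2}^{j_2 - 1} e^{\gamma^n X_i} + \sigma_u u_i}{\sigma_v}\right)\right] \\
& \times \left[\Phi\left(\frac{\gamma^1 X_i + \sum_{n=2}^{j_3} e^{\gamma^n X_i} + \sigma_u u_i}{\sigma_v}\right) - \Phi\left(\frac{\gamma^1 X_i + \sum_{n=2}^{j_3 - 1} e^{\gamma^n X_i} + \sigma_u u_i}{\sigma_v}\right)\right].
\end{aligned}$$

This follows directly from Eq. C.1 and the formulas for the $\tau_i$'s in Eq. 3 in Section 4.3. The individual likelihood contributions for $j, j_1, j_2, j_3 \leq 2$ can be obtained in the same way.

² In practice, results were not sensitive to the number of draws used. I ran the analysis using 10, 20, 40, 50, 80, and 100 draws, and obtained very similar results in all attempts.

C.4 Standard Error Derivations

C.4.1 General Case

I begin with the general case and in the next sub-section specialize to the setting relevant to this paper. I define $\hat{f}$, an estimate of a population proportion, as

$$\hat{f} = \frac{1}{N} \sum_{i=1}^{N} h(X_i; \hat{\delta}),$$

where $h(X; \delta)$ is a continuous and differentiable function. In my application, $0 \leq h(X; \delta) \leq 1$. The parameter vector $\delta$ is estimated in a preliminary step and is $\sqrt{N}$-consistent with

$$\sqrt{N}(\hat{\delta} - \delta_0) \xrightarrow{d} N(0, V).$$

I define $\tilde{f}$ as the sample fraction calculated using the true parameter:

$$\tilde{f} = \frac{1}{N} \sum_{i=1}^{N} h(X_i; \delta_0).$$

The population fraction is

$$f = E[h(X; \delta_0)],$$

where the expectation is over the joint distribution of $X$. If the uniform law of large numbers (ULLN) holds, or in other words, if

$$E\left[\sup_{\delta \in \Delta} h(X; \delta)\right] < \infty,$$

then $\hat{f} \xrightarrow{p} f$. I decompose the difference between my estimated $\hat{f}$ and the population proportion $f$ into two parts:

$$\sqrt{N}(\hat{f} - f) = \sqrt{N}(\hat{f} - \tilde{f}) + \sqrt{N}(\tilde{f} - f). \tag{C.2}$$

I start with the first term.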
By the mean value theorem,

\[
\hat f = \tilde f + \frac{1}{N} \sum_{i=1}^{N} \frac{\partial h}{\partial \theta}(X_i; \bar\theta)' (\hat\theta - \theta_0),
\]

where $\bar\theta$ is a random variable strictly between $\hat\theta$ and $\theta_0$. If

\[
E\Big[\sup_{\theta \in \Theta} \Big\lVert \frac{\partial h}{\partial \theta}(X;\theta) \Big\rVert\Big] < \infty,
\]

then another application of the ULLN and the Slutsky theorem gives

\[
\sqrt{N}(\hat f - \tilde f) = E\Big[\frac{\partial h}{\partial \theta}(X;\theta_0)\Big]' \sqrt{N}(\hat\theta - \theta_0) + o_p(1),
\]

so the variance of the asymptotic normal distribution is

\[
\mathrm{Var}(\hat f - \tilde f) = \frac{1}{N} E\Big[\frac{\partial h}{\partial \theta}(X;\theta_0)\Big]' V \, E\Big[\frac{\partial h}{\partial \theta}(X;\theta_0)\Big] \equiv \sigma_N^2. \tag{C.3}
\]

Moving on to the second term of Eq. C.2, we have that

\[
\sqrt{N}(\tilde f - f) = \frac{1}{\sqrt{N}} \sum_{i=1}^{N} \big(h(X_i;\theta_0) - E(h(X;\theta_0))\big).
\]

By the central limit theorem, this has an asymptotic normal distribution with variance

\[
\mathrm{Var}(\tilde f - f) = \frac{1}{N} \mathrm{Var}(h(X;\theta_0)) \equiv s_N^2. \tag{C.4}
\]

Because $\sqrt{N}(\hat f - \tilde f)$ and $\sqrt{N}(\tilde f - f)$ are independent,

\[
\mathrm{Var}(\hat f - f) = \sigma_N^2 + s_N^2, \tag{C.5}
\]

where $\sigma_N^2$ is defined by Eq. C.3, and $s_N^2$ is defined by Eq. C.4.

C.4.2 Standard Errors for Proportion Differences

In this paper, rather than the standard error of an estimated proportion, I am interested in the standard error of a difference between estimated proportions. In fact, there are two differences of interest. The first is the difference between the estimated proportions of males and females (or high and lower-education groups) who fall into the healthiest category, each calculated using the coefficients estimated on that group's own sample. I will denote these $\hat p_m$ and $\hat p_f$, respectively. The second comparison is the difference between the simulated proportion of healthy males predicted using female thresholds (which I will denote $\hat p_g$) and the simulated proportion of healthy females using female thresholds (the same $\hat p_f$ as above). This can be thought of as a DIF-adjusted gender comparison, and an analogous analysis can be conducted to compare high and lower education groups. As the calculation of standard errors for $(\hat p_m - \hat p_f)$ is a special case of the more complex second comparison, I focus on the latter: the difference between $\hat p_g$ and $\hat p_f$.
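The general-case decomposition in Eq. C.5 can be sketched in a few lines. This is a minimal illustration, not the code used for the estimates: the functional form `h(x, theta) = Phi(x'theta)`, the function names, and the input `cov_theta` (an estimate of $V/N$, the covariance matrix of $\hat\theta$) are all assumptions chosen for the example.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def h(x, theta):
    """Illustrative h(X; theta) = Phi(x'theta); this functional form is an assumption."""
    return _STD_NORMAL.cdf(sum(a * b for a, b in zip(x, theta)))


def grad_h(x, theta):
    """Gradient of Phi(x'theta) with respect to theta: phi(x'theta) * x."""
    density = _STD_NORMAL.pdf(sum(a * b for a, b in zip(x, theta)))
    return [density * a for a in x]


def proportion_variance(X, theta_hat, cov_theta):
    """Return (f_hat, sigma2_N, s2_N, total variance) as in Eqs. C.3-C.5.

    cov_theta is assumed to be the estimated covariance matrix of theta_hat (V / N).
    """
    n, k = len(X), len(theta_hat)
    values = [h(x, theta_hat) for x in X]
    f_hat = sum(values) / n

    # First term: sample average of the gradient, then the quadratic form g' (V/N) g.
    grads = [grad_h(x, theta_hat) for x in X]
    g_bar = [sum(g[j] for g in grads) / n for j in range(k)]
    sigma2 = sum(g_bar[a] * cov_theta[a][b] * g_bar[b]
                 for a in range(k) for b in range(k))

    # Second term: (1/N) times the sample variance of h.
    s2 = sum((v - f_hat) ** 2 for v in values) / n / n
    return f_hat, sigma2, s2, sigma2 + s2
```

The first term captures uncertainty from estimating the parameters, and the second captures sampling variation in the covariates; the two are added because the corresponding terms in Eq. C.2 are treated as independent.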
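I formally define $\hat p_g$ as

\[
\hat p_g = \frac{1}{N_m} \sum_{i \in M} \Pr(X_i'\hat\beta_m + \varepsilon_i \le X_i'\hat\gamma_f + u_i) = \frac{1}{N_m} \sum_{i \in M} \Pr\big(\varepsilon_i - u_i \le X_i'(\hat\gamma_f - \hat\beta_m)\big) = \frac{1}{N_m} \sum_{i \in M} \Phi\!\left(\frac{X_i'(\hat\gamma_f - \hat\beta_m)}{\sqrt{1 + \hat\sigma_{uf}^2}}\right).
\]

The $m$ and $f$ subscripts indicate the sample (male or female) used to estimate the coefficients. For simplicity, I omit the 1 superscript in $\gamma^1$ as this is the only $\gamma$ vector that is relevant to this discussion. [3] Defining $\hat p_f$ using these coefficient subscripts,

\[
\hat p_f = \frac{1}{N_f} \sum_{i \in F} \Phi\!\left(\frac{X_i'(\hat\gamma_f - \hat\beta_f)}{\sqrt{1 + \hat\sigma_{uf}^2}}\right),
\]

it is clear now that I cannot simply calculate $\mathrm{Var}(\hat p_f)$ and $\mathrm{Var}(\hat p_g)$ separately because of the common $\hat\gamma_f$ and $\hat\sigma_{uf}$. Therefore, I consider the difference (rather than the individual proportions) as my estimate of interest:

\[
\Delta\hat f = \hat p_g - \hat p_f.
\]

Defining

\[
h(X; \beta, \gamma, \sigma_u) = \Phi\!\left(\frac{X'(\gamma - \beta)}{\sqrt{1 + \sigma_u^2}}\right),
\]

I have

\[
\Delta\hat f = \frac{1}{N_m} \sum_{i \in M} h(X_i; \hat\beta_m, \hat\gamma_f, \hat\sigma_{uf}) - \frac{1}{N_f} \sum_{i \in F} h(X_i; \hat\beta_f, \hat\gamma_f, \hat\sigma_{uf}) \equiv f(\hat\theta, X),
\]

where in the last line I define $\hat\theta = (\hat\beta_m \;\; \hat\beta_f \;\; \hat\delta)'$, grouping the common parameters together and letting $\hat\delta \equiv (\hat\gamma_f \;\; \hat\sigma_{uf})'$. The analogous sample difference, calculated using true parameters, is

\[
\Delta\tilde f = \tilde p_g - \tilde p_f = \frac{1}{N_m} \sum_{i \in M} h(X_i; \beta_{m0}, \gamma_{f0}, \sigma_{uf0}) - \frac{1}{N_f} \sum_{i \in F} h(X_i; \beta_{f0}, \gamma_{f0}, \sigma_{uf0}),
\]

and the population difference is

\[
\Delta f = p_g - p_f = E[h(X_i; \beta_{m0}, \gamma_{f0}, \sigma_{uf0}) \mid X \in M] - E[h(X_i; \beta_{f0}, \gamma_{f0}, \sigma_{uf0}) \mid X \in F].
\]

[3] I also omit sampling weights from these formulas, but they simply enter as individual-specific constants.

Recalling that the variance of a simulated proportion consists of two terms (as shown in Eq. C.5), I begin with calculating an estimate for the first term, $\sigma_N^2$.

Estimating $\sigma_N^2$

Again using the mean value theorem, I have

\[
f(\hat\theta, X) = f(\theta_0, X) + \frac{\partial f(X; \bar\theta)}{\partial \theta} (\hat\theta - \theta_0),
\]

which can be decomposed into three sums that involve the male-specific coefficients, the female-specific coefficients, and the common coefficients. Because the female proportion enters $f$ with a negative sign, the female-specific derivatives of $h$ carry a minus sign:

\[
\begin{aligned}
\Delta\hat f ={}& \Delta\tilde f + \frac{\partial f(X;\bar\theta)}{\partial \beta_m}(\hat\beta_m - \beta_{m0}) + \frac{\partial f(X;\bar\theta)}{\partial \beta_f}(\hat\beta_f - \beta_{f0}) + \frac{\partial f(X;\bar\theta)}{\partial \delta}(\hat\delta - \delta_0) \\
={}& \Delta\tilde f + \frac{1}{N_m} \sum_{i \in M} \frac{\partial h(X_i; \bar\beta_m, \bar\delta)}{\partial \beta_m}(\hat\beta_m - \beta_{m0}) - \frac{1}{N_f} \sum_{i \in F} \frac{\partial h(X_i; \bar\beta_f, \bar\delta)}{\partial \beta_f}(\hat\beta_f - \beta_{f0}) \\
&+ \left( \frac{1}{N_m} \sum_{i \in M} \frac{\partial h(X_i; \bar\beta_m, \bar\delta)}{\partial \delta} - \frac{1}{N_f} \sum_{i \in F} \frac{\partial h(X_i; \bar\beta_f, \bar\delta)}{\partial \delta} \right)(\hat\delta - \delta_0).
\end{aligned}
\]

The ULLN and Slutsky theorem once again give

\[
\sqrt{N}(\Delta\hat f - \Delta\tilde f) = G_0' \sqrt{N}(\hat\theta - \theta_0) + o_p(1),
\qquad
G_0 \equiv \begin{pmatrix}
E\big[\frac{\partial h}{\partial \beta_m}(X; \beta_{m0}, \delta_0) \mid X \in M\big] \\[4pt]
-E\big[\frac{\partial h}{\partial \beta_f}(X; \beta_{f0}, \delta_0) \mid X \in F\big] \\[4pt]
E\big[\frac{\partial h}{\partial \delta}(X; \beta_{m0}, \delta_0) \mid X \in M\big] - E\big[\frac{\partial h}{\partial \delta}(X; \beta_{f0}, \delta_0) \mid X \in F\big]
\end{pmatrix},
\]

so that the variance of the asymptotic normal distribution is

\[
\mathrm{Var}(\Delta\hat f - \Delta\tilde f) = G_0' \, V_N \, G_0 \equiv \sigma_N^2,
\]

and can be estimated by

\[
\hat\sigma_N^2 = \hat G' \, \hat V_N \, \hat G, \tag{C.6}
\]

where, with $\phi$ denoting the standard normal density,

\[
\hat G = \begin{pmatrix}
-\frac{1}{N_m} \sum_{i \in M} \frac{1}{\sqrt{1 + \hat\sigma_{uf}^2}} \, \phi\!\left(\frac{X_i'(\hat\gamma_f - \hat\beta_m)}{\sqrt{1 + \hat\sigma_{uf}^2}}\right) X_i \\[8pt]
\frac{1}{N_f} \sum_{i \in F} \frac{1}{\sqrt{1 + \hat\sigma_{uf}^2}} \, \phi\!\left(\frac{X_i'(\hat\gamma_f - \hat\beta_f)}{\sqrt{1 + \hat\sigma_{uf}^2}}\right) X_i \\[8pt]
\frac{1}{N_m} \sum_{i \in M} \frac{1}{\sqrt{1 + \hat\sigma_{uf}^2}} \, \phi\!\left(\frac{X_i'(\hat\gamma_f - \hat\beta_m)}{\sqrt{1 + \hat\sigma_{uf}^2}}\right) X_i - \frac{1}{N_f} \sum_{i \in F} \frac{1}{\sqrt{1 + \hat\sigma_{uf}^2}} \, \phi\!\left(\frac{X_i'(\hat\gamma_f - \hat\beta_f)}{\sqrt{1 + \hat\sigma_{uf}^2}}\right) X_i \\[8pt]
-\frac{1}{N_m} \sum_{i \in M} \frac{\hat\sigma_{uf}}{(1 + \hat\sigma_{uf}^2)^{3/2}} \, \phi\!\left(\frac{X_i'(\hat\gamma_f - \hat\beta_m)}{\sqrt{1 + \hat\sigma_{uf}^2}}\right) X_i'(\hat\gamma_f - \hat\beta_m) + \frac{1}{N_f} \sum_{i \in F} \frac{\hat\sigma_{uf}}{(1 + \hat\sigma_{uf}^2)^{3/2}} \, \phi\!\left(\frac{X_i'(\hat\gamma_f - \hat\beta_f)}{\sqrt{1 + \hat\sigma_{uf}^2}}\right) X_i'(\hat\gamma_f - \hat\beta_f)
\end{pmatrix}.
\]

The four blocks of $\hat G$ correspond to $\hat\beta_m$, $\hat\beta_f$, $\hat\gamma_f$, and $\hat\sigma_{uf}$, respectively. Here, because $\hat\theta$ involves coefficients from the estimation over the male population and over the female population, the matrix $V$ involves a combination of estimated variance-covariance matrices from both estimations. In particular, let $\hat V_m / N_m$ represent the variance-covariance matrix for the male-specific parameters of interest ($\hat\beta_m$), and $\hat V_f / N_f$ represent the variance-covariance matrix for the female parameters of interest ($\hat\beta_f$, $\hat\delta$).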
Then, the relevant variance-covariance matrix needed for this calculation is

\[
\hat V_N = \begin{pmatrix} \hat V_m / N_m & 0 \\ 0 & \hat V_f / N_f \end{pmatrix}.
\]

Running separate estimations for males and females, I assume independence of the male and female coefficients.

Note that the formula for $\hat\sigma_N^2$ (Eq. C.6) can be easily applied to calculating $\hat\sigma_N^2$ for the simpler difference, $\hat p_m - \hat p_f$. The female coefficients in the male summations are replaced by male coefficients, gender-specific $\gamma$'s and $\sigma_u$'s are included in the male- and female-specific vectors, and the common coefficients, $\hat\delta$, are dropped. $\hat V_m / N_m$ and $\hat V_f / N_f$ are simply the variance-covariance matrices from the separately-conducted male estimation and female estimation, respectively.

Estimating $s_N^2$

The calculation of $s_N^2$ is straightforward if I assume the independence of the $X$'s across the male and female populations.

\[
\mathrm{Var}(\Delta\tilde f - \Delta f) = \frac{1}{N_m} \mathrm{Var}(h(X; \beta_{m0}, \delta_0) \mid X \in M) + \frac{1}{N_f} \mathrm{Var}(h(X; \beta_{f0}, \delta_0) \mid X \in F) \equiv s_N^2.
\]

It can be estimated by

\[
\begin{aligned}
\hat s_N^2 ={}& \frac{1}{N_m} \left( \frac{1}{N_m} \sum_{i \in M} \left( \Phi\!\left(\frac{X_i'(\hat\gamma_f - \hat\beta_m)}{\sqrt{1 + \hat\sigma_{uf}^2}}\right) - \frac{1}{N_m} \sum_{k \in M} \Phi\!\left(\frac{X_k'(\hat\gamma_f - \hat\beta_m)}{\sqrt{1 + \hat\sigma_{uf}^2}}\right) \right)^2 \right) \\
&+ \frac{1}{N_f} \left( \frac{1}{N_f} \sum_{i \in F} \left( \Phi\!\left(\frac{X_i'(\hat\gamma_f - \hat\beta_f)}{\sqrt{1 + \hat\sigma_{uf}^2}}\right) - \frac{1}{N_f} \sum_{k \in F} \Phi\!\left(\frac{X_k'(\hat\gamma_f - \hat\beta_f)}{\sqrt{1 + \hat\sigma_{uf}^2}}\right) \right)^2 \right).
\end{aligned}
\tag{C.7}
\]

For simplicity in future notation, I define $\tilde s(g(X_i), N)$ as the deviation of a function $g(X_i)$ from its sample mean, from a sample of size $N$:

\[
\tilde s(g(X_i), N) = g(X_i) - \frac{1}{N} \sum_{k=1}^{N} g(X_k). \tag{C.8}
\]

Using this notation, I can rewrite Eq. C.7 as

\[
\hat s_N^2 = \frac{1}{N_m^2} \sum_{i \in M} \tilde s\big(h(X_i; \hat\beta_m, \hat\gamma_f, \hat\sigma_{uf}), N_m\big)^2 + \frac{1}{N_f^2} \sum_{i \in F} \tilde s\big(h(X_i; \hat\beta_f, \hat\gamma_f, \hat\sigma_{uf}), N_f\big)^2.
\]

The calculation of $s_N^2$ becomes more complicated if I consider correlations between couples. In all of the surveys used, when a household is (randomly) selected, both husband and wife are included in the sample if both are present and eligible. If couples match non-randomly, this would create correlations across observations (within couples), which violates the assumption of independence across male and female covariates. [4]

Though the entire discussion has been framed in terms of the male-female comparison, any formulas described until now can be directly applied to the comparison between educated and non-educated groups. However, taking into account correlations within couples when comparing across high and lower-education groups requires a slightly different approach than what is needed when simply comparing across males and females. I first describe the methods used to account for correlations in the gender comparisons.

Let $SM$ denote the set of single males and $N_{SM}$ the number of individuals in this set. Similarly, let $SF$ represent the set of single females, $N_{SF}$ the number of single females, $C$ the set of individuals belonging to a married couple with both individuals in the sample, and $N_C$ the number of such couples. Within a couple $j$, let $X_j^m$ represent the characteristics of the male in the couple and $X_j^f$ the characteristics of the female in the couple. With this additional notation, I rewrite $\Delta\tilde f$ as follows:

\[
\begin{aligned}
\Delta\tilde f ={}& \frac{1}{N_{SM}} \sum_{i \in SM} h(X_i; \beta_{m0}, \delta_0) \frac{N_{SM}}{N_m} - \frac{1}{N_{SF}} \sum_{i \in SF} h(X_i; \beta_{f0}, \delta_0) \frac{N_{SF}}{N_f} \\
&+ \frac{1}{N_C} \sum_{j \in C} \left( h(X_j^m; \beta_{m0}, \delta_0) \frac{N_C}{N_m} - h(X_j^f; \beta_{f0}, \delta_0) \frac{N_C}{N_f} \right).
\end{aligned}
\]

Assuming independence across couples but not within couples, I can calculate the asymptotic variance as follows:

\[
\begin{aligned}
\mathrm{Var}(\Delta\tilde f - \Delta f) ={}& \frac{1}{N_{SM}} \mathrm{Var}\Big(h(X_i; \beta_{m0}, \delta_0) \frac{N_{SM}}{N_m}\Big) + \frac{1}{N_{SF}} \mathrm{Var}\Big(h(X_i; \beta_{f0}, \delta_0) \frac{N_{SF}}{N_f}\Big) \\
&+ \frac{1}{N_C} \mathrm{Var}\Big(h(X_j^m; \beta_{m0}, \delta_0) \frac{N_C}{N_m} - h(X_j^f; \beta_{f0}, \delta_0) \frac{N_C}{N_f}\Big) \\
={}& \frac{1}{N_{SM}} \mathrm{Var}\Big(h(X_i; \beta_{m0}, \delta_0) \frac{N_{SM}}{N_m}\Big) + \frac{1}{N_{SF}} \mathrm{Var}\Big(h(X_i; \beta_{f0}, \delta_0) \frac{N_{SF}}{N_f}\Big) \\
&+ \frac{1}{N_C} \left[ \mathrm{Var}\Big(h(X_j^m; \beta_{m0}, \delta_0) \frac{N_C}{N_m}\Big) + \mathrm{Var}\Big(h(X_j^f; \beta_{f0}, \delta_0) \frac{N_C}{N_f}\Big) \right] \\
&- 2 \frac{1}{N_C} \mathrm{Cov}\Big(h(X_j^m; \beta_{m0}, \delta_0) \frac{N_C}{N_m}, \; h(X_j^f; \beta_{f0}, \delta_0) \frac{N_C}{N_f}\Big).
\end{aligned}
\]

The corresponding estimate is:

\[
\begin{aligned}
\hat s^2_{N,\text{gender}} ={}& \frac{1}{N_{SM}^2} \sum_{i \in SM} \tilde s\Big(h(X_i; \hat\beta_m, \hat\delta) \frac{N_{SM}}{N_m}, N_{SM}\Big)^2 + \frac{1}{N_{SF}^2} \sum_{i \in SF} \tilde s\Big(h(X_i; \hat\beta_f, \hat\delta) \frac{N_{SF}}{N_f}, N_{SF}\Big)^2 \\
&+ \frac{1}{N_C^2} \sum_{j \in C} \left[ \tilde s\Big(h(X_j^m; \hat\beta_m, \hat\delta) \frac{N_C}{N_m}, N_C\Big)^2 + \tilde s\Big(h(X_j^f; \hat\beta_f, \hat\delta) \frac{N_C}{N_f}, N_C\Big)^2 \right] \\
&- 2 \frac{1}{N_C^2} \sum_{j \in C} \tilde s\Big(h(X_j^m; \hat\beta_m, \hat\delta) \frac{N_C}{N_m}, N_C\Big) \, \tilde s\Big(h(X_j^f; \hat\beta_f, \hat\delta) \frac{N_C}{N_f}, N_C\Big).
\end{aligned}
\tag{C.9}
\]

[4] For the validity of the maximum likelihood estimation, I require independence across observations conditional on the included covariates. Therefore, if couples only match on age and education (which are included as my regressors), this conditional independence is not violated.

As mentioned earlier, adjusting for correlations in the education analysis requires a slightly different approach. Here, I break the sample into 6 groups: $N_{SH}$ single educated individuals (in set $SH$), $N_{SL}$ single lower-education individuals (in set $SL$), $N_{C10}$ couples (in set $C10$) where the male is educated and the female is not, $N_{C11}$ couples (in set $C11$) where both partners are educated, $N_{C00}$ couples (in set $C00$) where both partners are in the lower-education category, and $N_{C01}$ couples (in set $C01$) where the female is educated but the male is not. Let $N_H$ denote the total number of high-education individuals and $N_L$ denote the total number of lower-education individuals.
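The sampling-variance estimates in Eqs. C.7 to C.9 can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, sampling weights are ignored, and each input is assumed to be a list of already-evaluated $h(\cdot)$ values, with the two couple lists ordered so that position `j` in each refers to the same couple.

```python
def deviations(values):
    """Eq. C.8: each value minus the sample mean of its own group."""
    if not values:
        return []
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def _sum_sq_over_n2(devs):
    """(1 / N^2) times the sum of squared deviations; zero for an empty group."""
    return sum(d * d for d in devs) / len(devs) ** 2 if devs else 0.0


def s2_independent(h_male, h_female):
    """Eq. C.7: males and females treated as independent samples."""
    return (_sum_sq_over_n2(deviations(h_male))
            + _sum_sq_over_n2(deviations(h_female)))


def s2_gender_couples(h_single_m, h_single_f, h_couple_m, h_couple_f):
    """Eq. C.9: allows correlation within couples, independence across them."""
    n_c = len(h_couple_m)
    n_m = len(h_single_m) + n_c  # N_m: all males
    n_f = len(h_single_f) + n_c  # N_f: all females

    def scaled_devs(values, group_total):
        if not values:
            return []
        weight = len(values) / group_total  # e.g. N_SM / N_m
        return deviations([v * weight for v in values])

    d_cm = scaled_devs(h_couple_m, n_m)
    d_cf = scaled_devs(h_couple_f, n_f)

    total = (_sum_sq_over_n2(scaled_devs(h_single_m, n_m))
             + _sum_sq_over_n2(scaled_devs(h_single_f, n_f)))
    if n_c:
        # Variance terms for partners, minus twice their within-couple covariance.
        total += sum(m * m + f * f for m, f in zip(d_cm, d_cf)) / n_c ** 2
        total -= 2 * sum(m * f for m, f in zip(d_cm, d_cf)) / n_c ** 2
    return total
```

With no couples in the sample the second function reduces to the first, and positively correlated partners lower the estimated variance of the male-female difference, since the covariance term enters with a negative sign.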
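Using an $h$ subscript to indicate the high education group and the $l$ subscript for the lower-education group, and writing $h_g(X) \equiv h(X; \beta_{g0}, \delta_0)$ and $\hat h_g(X) \equiv h(X; \hat\beta_g, \hat\delta)$ for $g \in \{h, l\}$ to shorten the expressions, I can therefore re-write $\Delta\tilde f$ as:

\[
\begin{aligned}
\Delta\tilde f ={}& \frac{1}{N_{SH}} \sum_{i \in SH} h_h(X_i) \frac{N_{SH}}{N_H} - \frac{1}{N_{SL}} \sum_{i \in SL} h_l(X_i) \frac{N_{SL}}{N_L} \\
&+ \frac{1}{N_{C11}} \sum_{j \in C11} \left( h_h(X_j^m) \frac{N_{C11}}{N_H} + h_h(X_j^f) \frac{N_{C11}}{N_H} \right) \\
&+ \frac{1}{N_{C10}} \sum_{j \in C10} \left( h_h(X_j^m) \frac{N_{C10}}{N_H} - h_l(X_j^f) \frac{N_{C10}}{N_L} \right) \\
&+ \frac{1}{N_{C01}} \sum_{j \in C01} \left( h_h(X_j^f) \frac{N_{C01}}{N_H} - h_l(X_j^m) \frac{N_{C01}}{N_L} \right) \\
&- \frac{1}{N_{C00}} \sum_{j \in C00} \left( h_l(X_j^m) \frac{N_{C00}}{N_L} + h_l(X_j^f) \frac{N_{C00}}{N_L} \right).
\end{aligned}
\]

The asymptotic variance is

\[
\begin{aligned}
\mathrm{Var}(\Delta\tilde f - \Delta f) ={}& \frac{1}{N_{SH}} \mathrm{Var}\Big(h_h(X_i) \frac{N_{SH}}{N_H}\Big) + \frac{1}{N_{SL}} \mathrm{Var}\Big(h_l(X_i) \frac{N_{SL}}{N_L}\Big) \\
&+ \frac{1}{N_{C11}} \left[ \mathrm{Var}\Big(h_h(X_j^m) \frac{N_{C11}}{N_H}\Big) + \mathrm{Var}\Big(h_h(X_j^f) \frac{N_{C11}}{N_H}\Big) \right] + 2 \frac{1}{N_{C11}} \mathrm{Cov}\Big(h_h(X_j^m) \frac{N_{C11}}{N_H}, \; h_h(X_j^f) \frac{N_{C11}}{N_H}\Big) \\
&+ \frac{1}{N_{C10}} \left[ \mathrm{Var}\Big(h_h(X_j^m) \frac{N_{C10}}{N_H}\Big) + \mathrm{Var}\Big(h_l(X_j^f) \frac{N_{C10}}{N_L}\Big) \right] - 2 \frac{1}{N_{C10}} \mathrm{Cov}\Big(h_h(X_j^m) \frac{N_{C10}}{N_H}, \; h_l(X_j^f) \frac{N_{C10}}{N_L}\Big) \\
&+ \frac{1}{N_{C01}} \left[ \mathrm{Var}\Big(h_h(X_j^f) \frac{N_{C01}}{N_H}\Big) + \mathrm{Var}\Big(h_l(X_j^m) \frac{N_{C01}}{N_L}\Big) \right] - 2 \frac{1}{N_{C01}} \mathrm{Cov}\Big(h_h(X_j^f) \frac{N_{C01}}{N_H}, \; h_l(X_j^m) \frac{N_{C01}}{N_L}\Big) \\
&+ \frac{1}{N_{C00}} \left[ \mathrm{Var}\Big(h_l(X_j^m) \frac{N_{C00}}{N_L}\Big) + \mathrm{Var}\Big(h_l(X_j^f) \frac{N_{C00}}{N_L}\Big) \right] + 2 \frac{1}{N_{C00}} \mathrm{Cov}\Big(h_l(X_j^m) \frac{N_{C00}}{N_L}, \; h_l(X_j^f) \frac{N_{C00}}{N_L}\Big),
\end{aligned}
\]

and the sample analog, which replaces the true parameters with their estimates, follows directly:

\[
\begin{aligned}
\hat s^2_{N,\text{educ}} ={}& \frac{1}{N_{SH}^2} \sum_{i \in SH} \tilde s\Big(\hat h_h(X_i) \frac{N_{SH}}{N_H}, N_{SH}\Big)^2 + \frac{1}{N_{SL}^2} \sum_{i \in SL} \tilde s\Big(\hat h_l(X_i) \frac{N_{SL}}{N_L}, N_{SL}\Big)^2 \\
&+ \frac{1}{N_{C11}^2} \sum_{j \in C11} \left[ \tilde s\Big(\hat h_h(X_j^m) \frac{N_{C11}}{N_H}, N_{C11}\Big)^2 + \tilde s\Big(\hat h_h(X_j^f) \frac{N_{C11}}{N_H}, N_{C11}\Big)^2 \right] \\
&+ 2 \frac{1}{N_{C11}^2} \sum_{j \in C11} \tilde s\Big(\hat h_h(X_j^m) \frac{N_{C11}}{N_H}, N_{C11}\Big) \, \tilde s\Big(\hat h_h(X_j^f) \frac{N_{C11}}{N_H}, N_{C11}\Big) \\
&+ \frac{1}{N_{C10}^2} \sum_{j \in C10} \left[ \tilde s\Big(\hat h_h(X_j^m) \frac{N_{C10}}{N_H}, N_{C10}\Big)^2 + \tilde s\Big(\hat h_l(X_j^f) \frac{N_{C10}}{N_L}, N_{C10}\Big)^2 \right] \\
&- 2 \frac{1}{N_{C10}^2} \sum_{j \in C10} \tilde s\Big(\hat h_h(X_j^m) \frac{N_{C10}}{N_H}, N_{C10}\Big) \, \tilde s\Big(\hat h_l(X_j^f) \frac{N_{C10}}{N_L}, N_{C10}\Big) \\
&+ \frac{1}{N_{C01}^2} \sum_{j \in C01} \left[ \tilde s\Big(\hat h_h(X_j^f) \frac{N_{C01}}{N_H}, N_{C01}\Big)^2 + \tilde s\Big(\hat h_l(X_j^m) \frac{N_{C01}}{N_L}, N_{C01}\Big)^2 \right] \\
&- 2 \frac{1}{N_{C01}^2} \sum_{j \in C01} \tilde s\Big(\hat h_h(X_j^f) \frac{N_{C01}}{N_H}, N_{C01}\Big) \, \tilde s\Big(\hat h_l(X_j^m) \frac{N_{C01}}{N_L}, N_{C01}\Big) \\
&+ \frac{1}{N_{C00}^2} \sum_{j \in C00} \left[ \tilde s\Big(\hat h_l(X_j^m) \frac{N_{C00}}{N_L}, N_{C00}\Big)^2 + \tilde s\Big(\hat h_l(X_j^f) \frac{N_{C00}}{N_L}, N_{C00}\Big)^2 \right] \\
&+ 2 \frac{1}{N_{C00}^2} \sum_{j \in C00} \tilde s\Big(\hat h_l(X_j^m) \frac{N_{C00}}{N_L}, N_{C00}\Big) \, \tilde s\Big(\hat h_l(X_j^f) \frac{N_{C00}}{N_L}, N_{C00}\Big).
\end{aligned}
\tag{C.10}
\]

Therefore, the estimate for the variance of $\hat p_g - \hat p_f$ when comparing across genders is

\[
\hat V(\hat p_g - \hat p_f) = \hat\sigma_N^2 + \hat s^2_{N,\text{gender}}, \tag{C.11}
\]

while the estimate for the variance when comparing across education levels can be written

\[
\hat V(\hat p_g - \hat p_f) = \hat\sigma_N^2 + \hat s^2_{N,\text{educ}}, \tag{C.12}
\]

where $\hat\sigma_N^2$ is defined by Eq. C.6, $\hat s^2_{N,\text{gender}}$ by Eq. C.9, and $\hat s^2_{N,\text{educ}}$ by Eq. C.10.

Another way to deal with correlations between couples is to randomly select one individual from each household in order to generate a random sample of individuals. Now, with no correlations between men and women or educated and uneducated individuals, I can simply calculate $\hat s_N^2$ using Eq. C.7, for both the gender and education comparisons. The results from this analysis are reported in Tables C1 and C2. Even though the sample sizes have fallen dramatically, the basic story remains the same: education differences either remain constant or are exacerbated after adjusting for DIF. On the other hand, significant differences between males and females lose significance in most domains for the ELSA and CHARLS after accounting for thresholds, while there is stronger evidence for remaining gender differences in the HRS and IFLS.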
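The one-individual-per-household approach described above can be sketched as a simple resampling step. The record layout and the key name `household_id` are hypothetical, chosen only for this illustration; they are not the variable names of any of the four surveys.

```python
import random
from collections import defaultdict


def one_per_household(records, seed=0):
    """Keep one randomly chosen respondent from each household.

    records: list of dicts, each with a (hypothetical) 'household_id' key.
    A fixed seed makes the draw reproducible.
    """
    rng = random.Random(seed)
    by_household = defaultdict(list)
    for record in records:
        by_household[record["household_id"]].append(record)
    # Every household contributes exactly one member to the new sample.
    return [rng.choice(members) for members in by_household.values()]
```

Because each household contributes a single respondent, no couple appears twice in the resulting sample, which is what allows Eq. C.7 to be used without the within-couple covariance terms.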
Table C1 Standard Errors and t-statistics for Simulated Gender Differences using a Random Sample of Individuals

Columns (1)-(3): Using Different Thresholds. Columns (4)-(6): Using Same Thresholds.

Domain | (1) Gender Difference | (2) Standard Error | (3) t-statistic | (4) Gender Difference | (5) Standard Error | (6) t-statistic
IFLS
Mobility | 0.0709 | 0.0408 | 1.738* | 0.0575 | 0.0511 | 1.1267
Pain | 0.1467 | 0.0396 | 3.708*** | 0.1008 | 0.0479 | 2.102**
Cognition | 0.1227 | 0.0409 | 2.998*** | 0.0874 | 0.0465 | 1.88*
Sleep | 0.1302 | 0.0384 | 3.396*** | 0.0650 | 0.0401 | 1.6225
Affect | 0.0893 | 0.0431 | 2.074** | 0.0558 | 0.0514 | 1.0858
Breathing | 0.0202 | 0.0375 | 0.5380 | 0.0018 | 0.0513 | 0.0351
HRS
Mobility | 0.0156 | 0.0224 | 0.6963 | 0.1027 | 0.0338 | 3.041***
Pain | 0.0229 | 0.0139 | 1.645* | 0.0475 | 0.0213 | 2.232**
Cognition | 0.0482 | 0.0206 | 2.338** | -0.0044 | 0.0387 | -0.1138
Sleep | 0.0337 | 0.0166 | 2.027** | 0.0637 | 0.0278 | 2.29**
Affect | 0.0904 | 0.0212 | 4.267*** | 0.0299 | 0.0433 | 0.6915
Breathing | 0.0068 | 0.0235 | 0.2913 | -0.0476 | 0.0508 | -0.9375
ELSA
Mobility | 0.0470 | 0.0226 | 2.075** | 0.0282 | 0.0353 | 0.7993
Pain | 0.0822 | 0.0188 | 4.375*** | 0.0013 | 0.0231 | 0.0543
Cognition | 0.0332 | 0.0230 | 1.4439 | -0.0620 | 0.0456 | -1.3595
Sleep | 0.1279 | 0.0212 | 6.031*** | 0.1048 | 0.0291 | 3.598***
Affect | 0.1204 | 0.0220 | 5.48*** | 0.0250 | 0.0497 | 0.5022
Breathing | 0.0458 | 0.0202 | 2.27** | -0.0412 | 0.0512 | -0.8053
CHARLS
Mobility | 0.0367 | 0.0586 | 0.6272 | 0.0348 | 0.0654 | 0.5320
Pain | 0.0831 | 0.0454 | 1.83* | 0.0018 | 0.0431 | 0.0408
Cognition | 0.1078 | 0.0605 | 1.782* | 0.0539 | 0.0613 | 0.8802
Sleep | 0.1605 | 0.0514 | 3.121*** | 0.0532 | 0.0504 | 1.0557
Affect | 0.1113 | 0.0566 | 1.967** | 0.0937 | 0.0665 | 1.4098
Breathing | 0.0568 | 0.0567 | 1.0009 | 0.0680 | 0.0630 | 1.0788

Notes: *** p<0.01, ** p<0.05, * p<0.1
- "Gender Difference" is the difference between the proportion of males in the healthiest category and the proportion of females in the healthiest category.
- Simulated proportions are calculated using coefficients from a HOPIT specification with the following explanatory variables: two age dummies, 1(High Ed), 1(Medium Ed), and all age-education interactions.
- Standard errors are calculated analytically.

Table C2 Standard Errors and t-statistics for Simulated Education Differences using a Random Sample of Individuals

Columns (1)-(3): Using Different Thresholds. Columns (4)-(6): Using Same Thresholds.

Domain | (1) Education Difference | (2) Standard Error | (3) t-statistic | (4) Education Difference | (5) Standard Error | (6) t-statistic
IFLS
Mobility | 0.1061 | 0.0490 | 2.164** | 0.1655 | 0.0517 | 3.202***
Pain | 0.0977 | 0.0497 | 1.965** | 0.1998 | 0.0626 | 3.19***
Cognition | 0.0188 | 0.0474 | 0.3968 | 0.1003 | 0.0529 | 1.897*
Sleep | 0.0902 | 0.0449 | 2.007** | 0.1170 | 0.0477 | 2.451**
Affect | -0.0053 | 0.0506 | -0.1050 | 0.0625 | 0.0596 | 1.0491
Breathing | 0.0097 | 0.0431 | 0.2259 | 0.0781 | 0.0470 | 1.661*
HRS
Mobility | 0.1678 | 0.0259 | 6.474*** | 0.2552 | 0.0406 | 6.29***
Pain | 0.0785 | 0.0172 | 4.56*** | 0.1662 | 0.0341 | 4.871***
Cognition | 0.1087 | 0.0240 | 4.536*** | 0.3406 | 0.0517 | 6.594***
Sleep | 0.0399 | 0.0192 | 2.078** | 0.2322 | 0.0409 | 5.679***
Affect | 0.1030 | 0.0247 | 4.177*** | 0.1927 | 0.0540 | 3.569***
Breathing | 0.1375 | 0.0263 | 5.231*** | 0.2377 | 0.0416 | 5.717***
ELSA
Mobility | 0.1293 | 0.0228 | 5.668*** | 0.0991 | 0.0359 | 2.764***
Pain | 0.0623 | 0.0196 | 3.178*** | 0.0956 | 0.0330 | 2.894***
Cognition | 0.1087 | 0.0235 | 4.62*** | 0.2181 | 0.0541 | 4.034***
Sleep | 0.0015 | 0.0220 | 0.0668 | 0.1592 | 0.0327 | 4.871***
Affect | 0.0368 | 0.0228 | 1.6169 | 0.1746 | 0.0421 | 4.143***
Breathing | 0.1015 | 0.0205 | 4.955*** | 0.1191 | 0.0412 | 2.889***
CHARLS
Mobility | 0.1182 | 0.0710 | 1.666* | 0.0820 | 0.0755 | 1.0859
Pain | 0.1577 | 0.0501 | 3.147*** | 0.1857 | 0.0505 | 3.679***
Cognition | 0.1999 | 0.0734 | 2.724*** | 0.1651 | 0.0727 | 2.269**
Sleep | 0.1665 | 0.0632 | 2.633*** | 0.2108 | 0.0655 | 3.219***
Affect | 0.1765 | 0.0652 | 2.708*** | 0.2325 | 0.0726 | 3.203***
Breathing | 0.1219 | 0.0714 | 1.707* | 0.1979 | 0.0773 | 2.559**

Notes: *** p<0.01, ** p<0.05, * p<0.1
- "Education Difference" is the difference between the proportion of high-ed individuals in the healthiest category and the proportion of lower-ed individuals in the healthiest category.
- Simulated proportions are calculated using coefficients from a HOPIT specification with the following explanatory variables: two age dummies, 1(Male), and all age-gender interactions.
- Standard errors are calculated analytically.

C.5 Pooled HOPIT Results

C.5.1 HOPIT Estimation of the IFLS

In this section, I discuss the results of estimating an ordered probit and a HOPIT model on the entire IFLS sample, in order to illustrate the importance of reporting heterogeneity. Table C3 reports the coefficients from the main self-report equation ($\beta$ in Eq. 1) using both an ordered probit and the HOPIT model, for each of the six health domains. The threshold equations for the cognition domain are discussed in this section, and the threshold equations for the remainder of the domains are available upon request.

Since a one represents the healthiest response choice, negative coefficients mean the regressors are associated with better health. More educated people appear to be healthier in the HOPIT model, across all domains except affect and breathing.
Interestingly, the coefficient on the high school graduate dummy is often smaller and sometimes even indistinguishable from zero in the ordered probit model but negative and significant in the HOPIT model, suggesting that ignoring the possibility of DIF underestimates the positive relationship between educational attainment and health. The threshold equations for the cognition domain in Table C4 shed light on this hypothesis. In the first threshold equation, the coefficients on both education dummies are negative and significant, which means that more educated people have lower $\tau_i^1$'s. In other words, they set a higher bar for what they deem as having "no difficulty," in both their own self-reports and for the hypothetical vignettes. Failing to account for thresholds makes it seem like high school graduates are no different from non-graduates, even though there are significant differences in both the true latent variable and the reporting behavior across groups. Although the high school graduate coefficient is positive and significant in the next threshold equation, coefficients in higher threshold equations are harder to interpret as they represent the effects of the covariates on the relative distance between one threshold and the next. Furthermore, higher thresholds are less important because the majority of individuals in the full sample fall in the "healthiest" category.

Across all health domains except mobility and breathing, gender is significantly related to self-reported health at the 5% level in the ordered probit, with males seemingly healthier. Moreover, males also appear significantly healthier in the HOPIT models for pain, sleep, and affect, suggesting that response thresholds do not explain much of the gender gap in these domains. In cognition, however, it appears as though the gender gap can be explained by threshold differences since the gender dummy is no longer significant in the HOPIT specification.
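A small numerical illustration of how a lower first threshold can mask a true health difference is sketched below. The index and threshold values are hypothetical and are not estimates from Tables C3 or C4; the only modeling assumptions are the unit-variance normal error and the convention that a respondent reports the healthiest category when latent health falls below the first threshold.

```python
from statistics import NormalDist


def prob_healthiest(latent_index, first_threshold):
    """Pr(Y = 1) = Phi(tau^1 - X*beta), with the error variance normalized to one."""
    return NormalDist().cdf(first_threshold - latent_index)


# Hypothetical values: the more educated group has better latent health
# (a lower index) but also a lower first threshold, i.e. a higher bar
# for reporting "no difficulty".
p_high_ed = prob_healthiest(latent_index=-0.3, first_threshold=-0.5)
p_low_ed = prob_healthiest(latent_index=0.0, first_threshold=-0.2)

# Holding the threshold fixed at the lower-education value reveals the gap.
p_high_ed_common = prob_healthiest(latent_index=-0.3, first_threshold=-0.2)
```

In this constructed case the two groups report the healthiest category at the same rate even though their latent health differs, while applying a common threshold restores the difference. This is the mechanism by which an ordered probit that ignores thresholds can understate the education gradient.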
Table C3 Ordered Probit and HOPIT Estimation of Health in the IFLS

 | (1) Mobility | | (2) Pain | | (3) Cognition | | (4) Sleep | | (5) Affect | | (6) Breathing |
 | Ordered Probit | HOPIT | Ordered Probit | HOPIT | Ordered Probit | HOPIT | Ordered Probit | HOPIT | Ordered Probit | HOPIT | Ordered Probit | HOPIT
1(55 < Age <= 70) | 0.201 (0.171) | 0.195 (0.183) | 0.508*** (0.154) | 0.546*** (0.164) | 0.308** (0.152) | 0.231 (0.162) | 0.207 (0.151) | 0.185 (0.158) | 0.146 (0.184) | 0.257 (0.195) | 0.210 (0.205) | 0.0805 (0.218)
1(Age > 70) | 0.971*** (0.257) | 1.077*** (0.273) | 0.332 (0.256) | 0.493* (0.273) | 0.495* (0.287) | 0.572* (0.306) | 0.443* (0.256) | 0.446* (0.266) | -0.484 (0.371) | -0.513 (0.393) | 0.335 (0.447) | 0.214 (0.484)
1(Male) | -0.110 (0.102) | -0.0929 (0.109) | -0.228*** (0.0883) | -0.195** (0.0953) | -0.177** (0.0892) | -0.131 (0.0963) | -0.245*** (0.0861) | -0.307*** (0.0908) | -0.337*** (0.101) | -0.323*** (0.107) | 0.0480 (0.118) | 0.0537 (0.124)
1(High Education) | -0.409*** (0.145) | -0.543*** (0.155) | -0.226* (0.120) | -0.334** (0.130) | -0.117 (0.114) | -0.317** (0.123) | -0.233** (0.114) | -0.375*** (0.121) | -0.0286 (0.133) | -0.174 (0.143) | 0.0961 (0.154) | -0.132 (0.164)
1(Medium Education) | -0.0894 (0.112) | -0.128 (0.119) | -0.0847 (0.102) | -0.0813 (0.109) | -0.225** (0.105) | -0.341*** (0.114) | -0.0261 (0.0990) | -0.0753 (0.104) | 0.0142 (0.115) | 0.0271 (0.122) | 0.0252 (0.139) | -0.0862 (0.145)
1(Male) x 1(55 < Age <= 70) | -0.0346 (0.187) | -0.0114 (0.200) | -0.123 (0.159) | -0.144 (0.170) | -0.185 (0.165) | -0.197 (0.176) | -0.136 (0.164) | -0.0708 (0.173) | 0.0862 (0.200) | 0.0901 (0.212) | -0.156 (0.206) | -0.0788 (0.218)
1(High Education) x 1(55 < Age <= 70) | 0.0532 (0.279) | 0.211 (0.298) | -0.356 (0.221) | -0.387 (0.239) | -0.00835 (0.230) | -0.00718 (0.249) | 0.0560 (0.230) | 0.109 (0.243) | -0.336 (0.317) | -0.467 (0.340) | 0.00799 (0.302) | 0.183 (0.320)
1(Medium Education) x 1(55 < Age <= 70) | 0.112 (0.203) | 0.0877 (0.218) | -0.144 (0.180) | -0.133 (0.192) | 0.227 (0.184) | 0.344* (0.197) | -0.0594 (0.185) | -0.0655 (0.194) | 0.0155 (0.215) | -0.104 (0.227) | 0.402* (0.235) | 0.537** (0.250)
1(Male) x 1(Age > 70) | -0.474 (0.319) | -0.807** (0.345) | 0.106 (0.280) | 0.138 (0.300) | -0.506 (0.339) | -0.730** (0.361) | 0.195 (0.309) | -0.0649 (0.328) | 0.354 (0.403) | 0.331 (0.432) | -0.203 (0.552) | -0.387 (0.598)
1(High Education) x 1(Age > 70) | 0.358 (0.528) | 0.609 (0.565) | -0.0138 (0.596) | -0.126 (0.678) | 0.0781 (0.519) | 0.261 (0.575) | 0.544 (0.576) | 1.011 (0.642) | 0.945* (0.513) | 0.980* (0.550) | 0.00791 (0.759) | 0.622 (0.789)
1(Medium Education) x 1(Age > 70) | -0.515 (0.344) | -0.745** (0.375) | -0.154 (0.290) | -0.355 (0.310) | 0.732** (0.349) | 0.681* (0.369) | -0.380 (0.326) | -0.476 (0.344) | 0.169 (0.400) | 0.0511 (0.432) | -3.942 (82.41) | -4.459 (112.8)
Constant | | -1.711*** (0.112) | | -1.270*** (0.0978) | | -1.125*** (0.0976) | | -1.061*** (0.0879) | | -1.672*** (0.112) | | -2.092*** (0.134)
Cutoff 1 (probit) / theta 1 (HOPIT) | 0.521*** (0.0941) | -0.741*** (0.0480) | -0.144* (0.0870) | -0.643*** (0.0345) | 0.0487 (0.0886) | -0.431*** (0.0296) | 0.0985 (0.0825) | -0.392*** (0.0238) | 0.458*** (0.0946) | -0.653*** (0.0416) | 1.140*** (0.115) | -0.469*** (0.0366)
Cutoff 2 (probit) / theta 2 (HOPIT) | 1.066*** (0.0980) | -0.432*** (0.0345) | 0.620*** (0.0881) | -0.498*** (0.0308) | 0.736*** (0.0906) | -0.208*** (0.0248) | 0.549*** (0.0834) | -0.186*** (0.0190) | 0.973*** (0.0979) | -0.381*** (0.0293) | 1.515*** (0.119) | -0.221*** (0.0225)
Cutoff 3 (probit) / sigma v (HOPIT) | 1.662*** (0.109) | 0.506*** (0.0305) | 1.235*** (0.0941) | 0.491*** (0.0221) | 1.339*** (0.0978) | 0.481*** (0.0232) | 1.127*** (0.0880) | 0.370*** (0.0173) | 1.446*** (0.106) | 0.388*** (0.0234) | 2.189*** (0.138) | 0.320*** (0.0236)
Cutoff 4 (probit) / sigma u (HOPIT) | 2.374*** (0.152) | 0.369*** (0.0242) | 2.127*** (0.128) | 0.441*** (0.0213) | 2.259*** (0.142) | 0.434*** (0.0222) | 2.090*** (0.122) | 0.398*** (0.0190) | 2.242*** (0.150) | 0.366*** (0.0232) | 2.768*** (0.192) | 0.379*** (0.0277)
Observations | 1003 | 1003 | 1027 | 1027 | 1018 | 1018 | 1122 | 1122 | 944 | 944 | 996 | 996

Notes: t-statistics in parentheses (*** p<0.01, ** p<0.05, * p<0.1).

Table C4 Threshold Equations for Cognition Domain in the IFLS

 | HOPIT | Threshold 1 | ln(Threshold2 - Threshold1) | ln(Threshold3 - Threshold2) | ln(Threshold4 - Threshold3)
1(55 < Age <= 70) | 0.231 (0.162) | -0.0705 (0.0934) | 0.0345 (0.113) | -0.0204 (0.109) | -0.121 (0.145)
1(Age > 70) | 0.572* (0.306) | 0.130 (0.173) | -0.0115 (0.227) | 0.0515 (0.218) | 0.280 (0.302)
1(Male) | -0.131 (0.0963) | 0.0601 (0.0545) | -0.00451 (0.0651) | -0.0253 (0.0607) | -0.0636 (0.0867)
1(High Education) | -0.317** (0.123) | -0.270*** (0.0722) | 0.211** (0.0842) | 0.0887 (0.0790) | 0.120 (0.110)
1(Medium Education) | -0.341*** (0.114) | -0.157** (0.0626) | 0.114 (0.0772) | 0.100 (0.0735) | -0.0456 (0.104)
1(Male) x 1(55 < Age <= 70) | -0.197 (0.176) | -0.0268 (0.104) | 0.0334 (0.122) | -0.0686 (0.119) | 0.175 (0.160)
1(High Education) x 1(55 < Age <= 70) | -0.00718 (0.249) | -0.0495 (0.157) | 0.129 (0.167) | -0.175 (0.172) | 0.188 (0.206)
1(Medium Education) x 1(55 < Age <= 70) | 0.344* (0.197) | 0.116 (0.113) | -0.0665 (0.138) | 0.0173 (0.133) | 0.120 (0.183)
1(Male) x 1(Age > 70) | -0.730** (0.361) | -0.454** (0.221) | 0.331 (0.258) | 0.207 (0.245) | -0.162 (0.345)
1(High Education) x 1(Age > 70) | 0.261 (0.575) | 0.319 (0.333) | -0.516 (0.416) | 0.0942 (0.308) | 1.795 (42.05)
1(Medium Education) x 1(Age > 70) | 0.681* (0.369) | 0.0344 (0.230) | -0.146 (0.262) | -0.210 (0.250) | -0.307 (0.352)
Constant | -1.125*** (0.0976) | -0.998*** (0.0659) | -0.472*** (0.0737) | -0.549*** (0.0744) | -0.539*** (0.0957)
Cutoff 1 (probit) / theta 1 (HOPIT) | -0.431*** (0.0296) | | | |
Cutoff 2 (probit) / theta 2 (HOPIT) | -0.208*** (0.0248) | | | |
Cutoff 3 (probit) / sigma v (HOPIT) | 0.481*** (0.0232) | | | |
Cutoff 4 (probit) / sigma u (HOPIT) | 0.434*** (0.0222) | | | |
Observations | 1018 | | | |

Notes: t-statistics in parentheses (*** p<0.01, ** p<0.05, * p<0.1).

It should be noted that this pooled analysis, unlike the subgroup simulations conducted in the body of this paper, only allows for gender and education to have level effects on latent health and thresholds.
In the subgroup analysis, by estimating the model separately, I allow the slope coefficients and standard errors ($\sigma_u$, $\sigma$, $\sigma_v$) to vary across subgroups as well.

Across all domains except affect and breathing, older people appear to be in worse health in both the ordered probit and HOPIT models (the omitted category in these regressions is the youngest age category, 55 and younger). However, overall, there is little evidence that age changes the effect of gender and education on health, as most interactions are insignificant.

C.5.2 Robustness to Alternative Functional Form

Table C5 reports the coefficients for the HOPIT estimation of the cognition domain in the IFLS, including all threshold equations. Instead of the exponential function in Eq. 3, I estimate the model using a squared term:

3a. $\tau_i^0 = -\infty$, $\tau_i^5 = \infty$, $\tau_i^1 = \gamma^1 X_i + u_i$, $\tau_i^j = \tau_i^{j-1} + (\gamma^j X_i)^2$, $j = 2, 3, 4$

The coefficients in the latent variable equation and the first threshold equation are almost identical when comparing Table C5 with Table C4, as are the signs and significance levels of the coefficients in the second to fourth threshold equations, alleviating concerns about sensitivity to functional form assumptions. This lack of sensitivity to functional form holds for all domains and datasets.

It is also possible to drop the requirement that $\tau^1 \le \tau^2 \le \tau^3 \le \tau^4$ and instead use a linear specification for the threshold equations, as below. Bago d'Uva et al. (2011) use this specification because they find that $\tau^1 \le \tau^2 \le \tau^3 \le \tau^4$ is always satisfied.

3b. $\tau_i^0 = -\infty$, $\tau_i^5 = \infty$, $\tau_i^j = \gamma^j X_i + u_i$, $j = 1, 2, 3, 4$

As Table C6 shows, the latent variable equation coefficients are virtually identical when this linear specification is used instead. The threshold coefficients for $j > 1$ cannot be directly compared because in the exponential and square specifications, these coefficients represent the marginal effect on the difference between two thresholds, while in the linear specification, they represent the marginal effect on the level of one specific threshold.
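The three threshold specifications compared in this sub-section can be sketched side by side. This is an illustrative helper, not the estimation code: the function name, the argument layout (`gammas` as a list of four coefficient vectors), and the `spec` labels are assumptions made for the example.

```python
import math


def thresholds(x, gammas, u, spec="exponential"):
    """Return [tau^1, tau^2, tau^3, tau^4] for one respondent.

    x      : covariate values for the respondent
    gammas : four coefficient vectors, one per threshold equation
    u      : the respondent's threshold shifter u_i
    spec   : "exponential" (Eq. 3), "squared" (Eq. 3a) or "linear" (Eq. 3b)
    """
    if spec not in ("exponential", "squared", "linear"):
        raise ValueError(f"unknown specification: {spec}")
    index = [sum(g * v for g, v in zip(gamma, x)) for gamma in gammas]

    if spec == "linear":
        # Each threshold has its own level equation; ordering is not imposed.
        return [value + u for value in index]

    taus = [index[0] + u]
    for value in index[1:]:
        # The increment is non-negative by construction, so thresholds are ordered.
        step = math.exp(value) if spec == "exponential" else value ** 2
        taus.append(taus[-1] + step)
    return taus
```

For example, with a single covariate equal to one, coefficients `[[0.0], [-1.0], [0.5], [-2.0]]` and `u = 0`, the squared form yields `[0.0, 1.0, 1.25, 5.25]`, while the linear form yields `[0.0, -1.0, 0.5, -2.0]`, which is not ordered. This is why the linear form relies on the ordering holding in the data, as Bago d'Uva et al. (2011) report it does.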
Table C5 HOPIT Estimation of Cognition Domain in the IFLS (Using Alternative Functional Form)

 | HOPIT | Threshold 1 | ln(Threshold2 - Threshold1) | ln(Threshold3 - Threshold2) | ln(Threshold4 - Threshold3)
1(55 < Age <= 70) | 0.231 (0.162) | -0.0715 (0.0938) | 0.0147 (0.0464) | -0.00782 (0.0419) | -0.0469 (0.0542)
1(Age > 70) | 0.568* (0.306) | 0.124 (0.171) | 0.000760 (0.0877) | 0.0191 (0.0848) | 0.117 (0.126)
1(Male) | -0.131 (0.0962) | 0.0600 (0.0544) | -0.00235 (0.0271) | -0.00976 (0.0236) | -0.0210 (0.0326)
1(High Education) | -0.317** (0.123) | -0.269*** (0.0722) | 0.0878** (0.0352) | 0.0340 (0.0304) | 0.0450 (0.0424)
1(Medium Education) | -0.341*** (0.114) | -0.157** (0.0626) | 0.0462 (0.0313) | 0.0388 (0.0284) | -0.0164 (0.0388)
1(Male) x 1(55 < Age <= 70) | -0.196 (0.176) | -0.0242 (0.104) | 0.0104 (0.0510) | -0.0212 (0.0447) | 0.0662 (0.0608)
1(High Education) x 1(55 < Age <= 70) | -0.00688 (0.249) | -0.0492 (0.157) | 0.0628 (0.0750) | -0.0665 (0.0622) | 0.0804 (0.0824)
1(Medium Education) x 1(55 < Age <= 70) | 0.344* (0.196) | 0.116 (0.113) | -0.0252 (0.0561) | 0.00186 (0.0501) | 0.0464 (0.0685)
1(Male) x 1(Age > 70) | -0.730** (0.361) | -0.457** (0.221) | 0.142 (0.108) | 0.0837 (0.0968) | -0.0719 (0.133)
1(High Education) x 1(Age > 70) | 0.265 (0.575) | 0.328 (0.335) | -0.225 (0.171) | 0.0487 (0.138) | 1.168 (27.67)
1(Medium Education) x 1(Age > 70) | 0.688* (0.369) | 0.0558 (0.229) | -0.0760 (0.111) | -0.0806 (0.0991) | -0.122 (0.135)
Constant | -1.125*** (0.0976) | -0.998*** (0.0659) | 0.790*** (0.0293) | 0.760*** (0.0284) | 0.763*** (0.0364)
Cutoff 1 (probit) / theta 1 (HOPIT) | -0.431*** (0.0296) | | | |
Cutoff 2 (probit) / theta 2 (HOPIT) | -0.208*** (0.0248) | | | |
Cutoff 3 (probit) / sigma v (HOPIT) | 0.481*** (0.0232) | | | |
Cutoff 4 (probit) / sigma u (HOPIT) | 0.434*** (0.0222) | | | |
Observations | 1018 | | | |

Notes: t-statistics in parentheses (*** p<0.01, ** p<0.05, * p<0.1).

Table C6 HOPIT Estimation of Cognition Domain in the IFLS (Using Linear Functional Form)

 | HOPIT | Threshold 1 | ln(Threshold2 - Threshold1) | ln(Threshold3 - Threshold2) | ln(Threshold4 - Threshold3)
1(55 < Age <= 70) | 0.230 (0.162) | -0.0725 (0.0942) | -0.0473 (0.0800) | -0.0597 (0.0820) | -0.132 (0.0997)
1(Age > 70) | 0.564* (0.306) | 0.119 (0.169) | 0.128 (0.155) | 0.157 (0.168) | 0.350 (0.241)
1(Male) | -0.131 (0.0962) | 0.0599 (0.0543) | 0.0553 (0.0446) | 0.0402 (0.0459) | 0.0127 (0.0591)
1(High Education) | -0.317** (0.123) | -0.269*** (0.0722) | -0.123** (0.0578) | -0.0707 (0.0593) | -0.00291 (0.0779)
1(Medium Education) | -0.341*** (0.114) | -0.158** (0.0626) | -0.0824 (0.0531) | -0.0224 (0.0551) | -0.0463 (0.0699)
1(Male) x 1(55 < Age <= 70) | -0.195 (0.176) | -0.0220 (0.104) | -0.00991 (0.0847) | -0.0357 (0.0868) | 0.0647 (0.112)
1(High Education) x 1(55 < Age <= 70) | -0.00697 (0.249) | -0.0496 (0.157) | 0.0713 (0.118) | -0.0296 (0.119) | 0.107 (0.157)
1(Medium Education) x 1(55 < Age <= 70) | 0.344* (0.196) | 0.116 (0.113) | 0.0767 (0.0947) | 0.0736 (0.0978) | 0.146 (0.125)
1(Male) x 1(Age > 70) | -0.731** (0.361) | -0.465** (0.222) | -0.213 (0.182) | -0.0795 (0.191) | -0.201 (0.247)
1(High Education) x 1(Age > 70) | 0.270 (0.575) | 0.340 (0.336) | -0.0573 (0.261) | 0.0406 (0.279) | 3.321 (102.3)
1(Medium Education) x 1(Age > 70) | 0.695* (0.368) | 0.0779 (0.229) | -0.0763 (0.188) | -0.200 (0.196) | -0.391 (0.253)
Constant | -1.125*** (0.0976) | -0.998*** (0.0659) | -0.373*** (0.0493) | 0.205*** (0.0484) | 0.785*** (0.0683)
Cutoff 1 (probit) / theta 1 (HOPIT) | -0.431*** (0.0296) | | | |
Cutoff 2 (probit) / theta 2 (HOPIT) | -0.208*** (0.0248) | | | |
Cutoff 3 (probit) / sigma v (HOPIT) | 0.481*** (0.0232) | | | |
Cutoff 4 (probit) / sigma u (HOPIT) | 0.434*** (0.0222) | | | |
Observations | 1018 | | | |

Notes: t-statistics in parentheses (*** p<0.01, ** p<0.05, * p<0.1).

C.6 Vignette Equivalence

Vignette equivalence is an important assumption underlying this model, which is not always tested in existing applications of this methodology.
I test for vignette equivalence using the methods outlined by Bago d'Uva et al. (2011). This test is based on the idea that vignette equivalence rules out systematic differences in respondents' understanding or interpretation of the vignettes. In other words, covariates can be excluded from the equation for the latent variable for vignette health, $Y_{li} = \theta_l + \varepsilon_{li}$. In order to test this necessary condition for vignette equivalence, Bago d'Uva et al. (2011) suggest including covariates in all but one of the vignette equations. This allows for systematic variation in vignette responses that is not captured by the different response thresholds. In other words, I replace the original vignette equations (Eq. 4) with the following:

4a. $Y_{1i} = \theta_1 + \varepsilon_{1i}$
4b. $Y_{li} = \theta_l + \lambda_l X_i + \varepsilon_{li}, \quad l \neq 1$

Under the null of vignette equivalence, $\lambda_l = 0$ for $l = 2, 3$. Following Bago d'Uva et al. (2011), I run the HOPIT model again but replace the original vignette equations with Eq. 4a and Eq. 4b.

Table C7 displays the results from the original HOPIT model (which assumed vignette equivalence) and compares this to the model which tests for vignette equivalence by including covariates in the vignette equations. I report the coefficients from Eq. 1 in the basic HOPIT, Eq. 1 in the HOPIT testing for vignette equivalence, then Eq. 4a and 4b in the vignette equivalence tests. The first 4 columns show the results from the pain domain in the IFLS and the last 4 show the results from the cognition domain (results for all other domains and datasets available upon request).

To test whether the covariates belong in the vignette equations, I run a likelihood ratio test. For the cognition domain, the likelihood ratio test cannot reject the null of vignette equivalence. Moreover, both the AIC and the BIC prefer the simpler model. For the pain domain, I reject the null that all coefficients in the vignette equivalence equations are equal to zero, which translates to a rejection of the vignette equivalence assumption.
However, in order to judge the severity of the consequences of this violation, I compare the coefficients in the first two columns of each dataset, Eq. 1 in the basic model and Eq. 1 in the enhanced model. None of the coefficients are significantly different from each other, which suggests that including the covariates in the vignette equations leaves all major interpretations unchanged. In fact, although the AIC favors the more complex model, the BIC prefers the simpler specification due to its penalization of extra parameters. In short, although vignette equivalence may not hold in this particular scenario, adjusting the model to allow for violations does little to change the conclusions.

In fact, using the IFLS data, the likelihood ratio test rejects the null for four out of the six domains, but in none of these domains are the coefficients from the simple and enhanced model significantly different from each other. This result is not unique to the IFLS. In the other three datasets, although vignette equivalence is more consistently rejected across domains (as in Bago d'Uva et al. (2011) using the ELSA data), the vast majority of coefficients are statistically indistinguishable across specifications.

Table C7 Testing Vignette Equivalence

                                          Pain                                            Cognition
                                          Assuming    Testing     Vignette    Vignette    Assuming    Testing     Vignette    Vignette
                                          V.E.        V.E.        Equation 1  Equation 2  V.E.        V.E.        Equation 1  Equation 2
1(55 < Age <= 65)                         0.546***    0.504***    -0.0803     -0.0180     0.231       0.290*      0.122       0.0416
                                          (0.164)     (0.178)     (0.112)     (0.109)     (0.162)     (0.173)     (0.103)     (0.100)
1(65 < Age <= 75)                         0.493*      0.426       -0.0275     -0.127      0.572*      0.695**     0.0969      0.246
                                          (0.273)     (0.296)     (0.187)     (0.185)     (0.306)     (0.328)     (0.205)     (0.198)
1(Male)                                   -0.195**    -0.234**    -0.0456     -0.0584     -0.131      -0.124      0.0300      -0.0176
                                          (0.0953)    (0.104)     (0.0621)    (0.0609)    (0.0963)    (0.103)     (0.0572)    (0.0559)
1(High Education)                         -0.334**    -0.549***   -0.322***   -0.206**    -0.317**    -0.290**    0.0618      0.000590
                                          (0.130)     (0.143)     (0.0862)    (0.0832)    (0.123)     (0.132)     (0.0740)    (0.0723)
1(Medium Education)                       -0.0813     -0.149      -0.0718     -0.110      -0.341***   -0.307**    0.0357      0.0579
                                          (0.109)     (0.118)     (0.0711)    (0.0702)    (0.114)     (0.121)     (0.0683)    (0.0665)
1(Male) x 1(55 < Age <= 65)               -0.144      -0.0941     0.123       0.0144      -0.197      -0.210      -0.0661     0.0334
                                          (0.170)     (0.186)     (0.117)     (0.113)     (0.176)     (0.189)     (0.110)     (0.107)
1(High Education) x 1(55 < Age <= 65)     -0.387      -0.337      0.0162      0.118       -0.00718    -0.340      -0.513***   -0.310**
                                          (0.239)     (0.264)     (0.162)     (0.155)     (0.249)     (0.272)     (0.160)     (0.153)
1(Medium Education) x 1(55 < Age <= 65)   -0.133      -0.237      -0.182      -0.106      0.344*      0.307       -0.0751     -0.0229
                                          (0.192)     (0.209)     (0.132)     (0.129)     (0.197)     (0.210)     (0.122)     (0.119)
1(Male) x 1(65 < Age <= 75)               0.138       0.357       0.378*      0.239       -0.730**    -0.822**    -0.0770     -0.139
                                          (0.300)     (0.326)     (0.204)     (0.203)     (0.361)     (0.390)     (0.239)     (0.233)
1(High Education) x 1(65 < Age <= 75)     -0.126      0.311       0.574       0.521       0.261       0.0467      -0.116      -0.480
                                          (0.678)     (0.724)     (0.441)     (0.442)     (0.575)     (0.624)     (0.351)     (0.350)
1(Medium Education) x 1(65 < Age <= 75)   -0.355      -0.366      -0.0720     0.0235      0.681*      0.305       -0.277      -0.763***
                                          (0.310)     (0.336)     (0.206)     (0.206)     (0.369)     (0.402)     (0.250)     (0.250)
Constant                                  -1.270***   -1.168***   -0.507***   -0.370***   -1.125***   -1.139***   -0.465***   -0.214***
                                          (0.0978)    (0.103)     (0.0623)    (0.0601)    (0.0976)    (0.103)     (0.0604)    (0.0565)
Observations                              1027        1027                                1018        1018
AIC                                       10697.72    10694.87                            10645.82    10659.85
BIC                                       11013.52    11119.22                            10961.06    11083.45
LR test Chi-squared stat                              46.85                                           29.97
LR test df                                            22                                              22
LR test p-value                                       0.00153                                         0.119

Notes: t-statistics in parentheses (*** p<0.01, ** p<0.05, * p<0.1).
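As a rough consistency check on the model-comparison figures in Table C7, the sketch below recomputes the likelihood ratio p-values and the implied AIC and BIC differences from the reported chi-squared statistics, degrees of freedom, and sample sizes. The function names are illustrative and do not come from the thesis. The sketch assumes that the BIC penalty uses the reported number of observations, and the closed-form chi-squared tail it relies on applies only to even degrees of freedom (which holds for df = 22).

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-squared variable (even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0          # k = 0 term of the Poisson-type sum
    for k in range(1, df // 2):     # add terms k = 1 .. df/2 - 1
        term *= half / k
        total += term
    return math.exp(-half) * total


def compare_models(lr_stat, df, n_obs):
    """LR p-value plus (simple minus enhanced) AIC and BIC differences.

    A positive difference means the enhanced model has the lower criterion.
    """
    return {
        "p_value": chi2_sf_even_df(lr_stat, df),
        "aic_diff": lr_stat - 2 * df,
        "bic_diff": lr_stat - df * math.log(n_obs),
    }


# Inputs taken from Table C7 (LR statistic, df, observations).
pain = compare_models(46.85, 22, 1027)
cognition = compare_models(29.97, 22, 1018)

for name, res in (("pain", pain), ("cognition", cognition)):
    print(name, {key: round(value, 4) for key, value in res.items()})
```

Under these assumptions, the pain domain gives a small positive AIC difference and a large negative BIC difference, in line with the statement that the AIC favors the enhanced model while the BIC prefers the simpler one. For the cognition domain both differences are negative.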
Abstract
This thesis explores questions related to the early-life determinants of human capital and the measurement of one crucial component of human capital: health.

The long-term effects of early-life health shocks on later-life human capital are well-documented, but the reasons why men and women often respond differently to these shocks are less well-studied. In Chapter 2, using data from Mexico, I show that exposure to pollution in the second trimester of gestation leads to significantly lower cognitive ability in adulthood for both men and women. For women only, however, this shock to cognitive ability also leads to lower high school completion and income. I identify two labor market features that explain why women adjust their schooling decisions more than men: (1) women sort into the white-collar sector at higher rates, and (2) schooling and ability are more complementary in the white-collar sector than in the blue-collar sector. I verify the higher degree of complementarity in white-collar jobs by structurally estimating the wage parameters for each sector, using a dynamic discrete choice model of education and occupational choice.

Can investing in children who faced adverse events in early childhood help them catch up? In Chapter 3, Achyuta Adhvaryu, Anant Nyshadham, Jorge Tamayo, and I answer this question using two orthogonal sources of variation, resource availability at birth (local rainfall) and cash incentives for school enrollment, to identify the interaction between early endowments and investments in children. We find that adverse rainfall in the year of birth decreases grade attainment, post-secondary enrollment, and employment outcomes. But children whose families were randomized to receive conditional cash transfers experienced a much smaller decline: each additional year of program exposure during childhood mitigated more than 20 percent of early disadvantage.

Self-reported measures of health are becoming more widely used to study health inequalities both across and within countries, but comparisons of these subjective measures can be distorted by the use of different response thresholds across individuals. In Chapter 4, I use anchoring vignettes from Indonesia, the U.S., England, and China to study the extent to which differences in self-reported health across genders and education levels can be explained by the use of different response thresholds. To determine whether statistically significant differences between groups remain after adjusting thresholds, I calculate standard errors for the simulated probabilities, largely ignored in previous literature. Accounting for reporting heterogeneity reduces the gender gap in many health domains across the four countries, but to varying degrees. Health disparities across education levels persist and even widen after equalizing thresholds across the two groups.
Asset Metadata
Creator
Molina, Teresa (author)
Core Title
The determinants and measurement of human capital
School
College of Letters, Arts and Sciences
Degree
Doctor of Philosophy
Degree Program
Economics
Degree Conferral Date
2017-05
Publication Date
04/17/2017
Defense Date
03/21/2017
Publisher
Los Angeles, California (original), University of Southern California (original), University of Southern California. Libraries (digital)
Tag
cash transfers, early-life shocks, human capital, OAI-PMH Harvest, self-reported health
Format
theses (aat)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Strauss, John (committee chair), Kapteyn, Arie (committee member), Nugent, Jeff (committee member), Sood, Neeraj (committee member)
Creator Email
tmolina13@gmail.com, tsmolina@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-oUC11257809
Unique identifier
UC11257809
Identifier
etd-MolinaTere-5193.pdf (filename)
Legacy Identifier
etd-MolinaTere-5193
Dmrecord
355458
Document Type
Dissertation
Rights
Molina, Teresa
Internet Media Type
application/pdf
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright.
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email
cisadmin@lib.usc.edu