Essays on human capital accumulation -- health and education
(USC Thesis Other)
ESSAYS ON HUMAN CAPITAL ACCUMULATION - HEALTH AND EDUCATION

by
Subha Mani

A Dissertation Presented to the
FACULTY OF THE GRADUATE SCHOOL
UNIVERSITY OF SOUTHERN CALIFORNIA
In Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY (ECONOMICS)

August 2008

Copyright 2008 Subha Mani

Dedication

I dedicate my Ph.D. dissertation to my parents, Mrs. Radha Mani and Mr. K.G.S. Mani, for their love, support, and encouragement through all the ups and downs of life.

Acknowledgments

This dissertation would not have been possible without the support and encouragement of my advisor, the members of my dissertation committee, faculty members and staff at USC, and my friends and family.

First and foremost, I am indebted to my advisor, Professor John Strauss, for introducing me to the field of microeconomics of development. He has patiently provided me with detailed and constructive comments on my work. He has continuously encouraged and supported me throughout these years. I have immensely benefited from his knowledge and experience in the field. I consider myself extremely fortunate to have had the opportunity to complete my dissertation under his guidance.

I am grateful to Professor Jeffrey B. Nugent for his numerous helpful comments and suggestions. He has provided me with useful professional and personal advice. I will remain thankful to Professor Nugent for all this and more. My sincere thanks to Professor John Ham for his critical comments on both my research and presentation; in particular, his feedback on the model section of my paper is invaluable to me.

I express my gratitude to Dr. John Hoddinott for providing me an opportunity to work at the International Food Policy Research Institute (IFPRI) during the summer of 2005. Dr. John Hoddinott is an excellent mentor and a wonderful person. He has provided me with excellent comments and suggestions on my research work. I have admired his knowledge, enthusiasm, and work ethic.
I am thankful for his professional advice and support.

I thank the College of Letters, Arts and Sciences, University of Southern California, and the United Nations University-World Institute for Development Economics Research (UNU-WIDER) for providing me financial support. I am also thankful to Dr. John Hoddinott and the International Food Policy Research Institute (IFPRI) for providing me access to data from the multiple waves of the Ethiopian Rural Household Survey (ERHS).

I have benefited a lot from presenting my work at the student development workshop at USC. I am thankful to Professor Duncan Thomas and numerous seminar participants at the University of Melbourne, Monash University, University of Southern California, Bureau for Research in Economic Analysis and Development (BREAD) summer school, UNU-WIDER meetings, Population Association of America (PAA) meetings, and Northeast Universities Development Consortium (NEUDC) meetings for comments and suggestions. Special thanks to Dr. Basudeb Guha-Khasnobis for giving me an opportunity to work for UNU-WIDER.

My sincere thanks are due to my father-in-law, Dr. Dipankar Dasgupta, and my friend and colleague, Rahul Giri, for their time and advice on the model section of my research work. My research has also benefited from discussions with my colleagues, Rubina Verma, Olga Shemyakina, and Tomoya Matsumoto.

I am especially thankful to two of my closest friends, Rubina Verma and Rahul Giri, for being my family away from home. A special thanks goes to Dana Bhargava, Shiv Sehgal, Ashish Agarwal, Abhijit Chaudhari, Sonam Gupta, and Rajini Parameswaran for being there for me at all times.

I am thankful to Young Miller and Morgan Ponder for patiently answering all administrative concerns. They have gone out of their way to make graduate student life easier for me. They have also played a very important role during the job market process.
My sincere thanks to Professor Juan Carrillo, Professor Geert Ridder, Professor Caroline Betts, Professor Michael Magill, Professor Yong Kim, and Bodhi Ganguli for their time and advice during the job market process.

I owe a lot to my husband, Utteeyo Dasgupta, who has helped me remain focused on my research goals and always kept my professional interests and long-term career goals ahead of everything. He has patiently edited my work and listened to my presentations. His critical comments and feedback have helped me in every stage of the Ph.D. program. His love and support have made the journey of life an enjoyable and memorable experience.

I am thankful to my sister, Sudha Mani, for patiently going over my research work and providing me with useful suggestions. She has encouraged and supported me in all my endeavours. She is my best friend, on whom I can rely for anything and everything.

I have admired my husband and my sister for being methodical and focused on their research work. Their professional achievements have been an inspiration to me. I have looked up to them for advice and support as graduate students and now as assistant professors. I thank my brother-in-law, Anand Ramakrishnan, whom I have always admired for his self-confidence.

I owe my sincere gratitude to my parents-in-law, Dr. Dipankar Dasgupta and Mrs. Sankari Dasgupta, for their constant encouragement, love, and unflinching faith in me. My father-in-law has also provided me support and advice on various professional matters, which have been useful to me.

Last, but not the least, I am most indebted to my parents, Mr. K.G.S. Mani and Mrs. Radha Mani. I admire my father’s passion for work and ambitious nature. He always said to me from childhood, ‘Work is Worship’. My mother is the biggest pillar of strength in my life. She has always taught me to look ahead, move on, and NEVER give up on anything. My mother’s motto, like the Trojans’, is ‘Fight On’!
The love and feelings I have for my parents cannot be expressed in a few words.

Table of Contents

Dedication
Acknowledgments
List of Tables
List of Figures
Abstract
Chapter 1: Introduction
Chapter 2: The role of the household and the community in determining child health
    2.1 Introduction
    2.2 Model and empirical specification
    2.3 Data and variables
        2.3.1 Indonesian Family Life Survey
        2.3.2 Sample size, variables, and descriptive statistics
    2.4 Results
    2.5 Conclusion
Chapter 3: Is there complete, partial, or no recovery from childhood malnutrition? - empirical evidence from Indonesia
    3.1 Introduction
    3.2 Literature review
    3.3 Model
    3.4 Empirical specification and identification
    3.5 Data and variables
        3.5.1 Indonesian Family Life Survey
        3.5.2 Attrition rates
        3.5.3 Sample size, variables, and descriptive statistics
    3.6 Results
        3.6.1 Catch-up effects - complete, partial, or none?
        3.6.2 Test and discussion of weak instruments for the dynamic panel specification
        3.6.3 A test of serial correlation in the error terms
        3.6.4 Role of child, household, and community characteristics in the dynamic conditional health demand function
        3.6.5 Do catch-up effects differ with age?
        3.6.6 Further implications
    3.7 Conclusion
Chapter 4: Determinants of schooling outcomes among children from rural Ethiopia
    4.1 Introduction
    4.2 Model
    4.3 Empirical specification
    4.4 Data
        4.4.1 Ethiopian Rural Household Survey
        4.4.2 Sample composition
        4.4.3 Attrition
        4.4.4 Descriptive statistics
    4.5 Results
        4.5.1 Static results
        4.5.2 Dynamic results
    4.6 Concluding remarks
Chapter 5: Conclusion
Bibliography
Appendices
    Appendix A: Appendix to Chapter 2
    Appendix B: Appendix to Chapter 3
    Appendix C: Appendix to Chapter 4

List of Tables

Table 2.1: Summary statistics on Height-for-age z-score for children between the age of 3 and 59 months in 1993, 1997, and 2000
Table 2.2: Summary statistics on Height-for-age z-score for children between the age of 3 and 59 months in 1993, who are followed through the 1997 and 2000 waves of the IFLS
Table 2.3: Summary statistics of all variables used in the empirical specification
Table 2.4: Determinants of Height-for-age z-score for panel respondents, pooling data from 1993, 1997 and 2000
Table 2.5: Determinants of Height-for-age z-score for male and females separately
Table 3.1: Mean height attained in 2000 for all panel children between the age of 3 and 59 months in 1993
Table 3.2: Summary statistics of all variables used in the empirical specification
Table 3.3: Dynamic health demand function
Table 3.4: Dynamic health demand function with additional interaction terms
Table 4.1: Sample averages using data for primary school children from 1994
Table 4.2: Sample averages using data for primary school children from 1999
Table 4.3: Sample averages using data for primary school children from 2004
Table 4.4: Mean changes in schooling outcomes and other variables between 1994 and 2004
Table 4.5: Determinants of schooling enrollment among primary school age children from 1994
Table 4.6: Determinants of schooling enrollment among primary school age children from 1999
Table 4.7: Determinants of schooling enrollment among primary school age children from 2004
Table 4.8: Determinants of relative grades attained (RGA) among primary school age children from 1994
Table 4.9: Determinants of relative grade attained (RGA) among primary school age children from 1999
Table 4.10: Determinants of relative grade attained (RGA) among primary school age children from 2004
Table 4.11: Coefficient estimates on log (PCE) as reported in Tables 4.5-4.7
Table 4.12: Coefficient estimates on log (PCE) as reported in Tables 4.8-4.10
Table 4.13: Determinants of enrollment and relative grades attained among panel respondents
Table 4.14: Dynamic schooling enrollment demand function
Table 4.15: Dynamic schooling relative grade attainment demand function
Table A.1: First-stage regression results for the preferred estimates reported in column 5 of table 2.4
Table B.1: First-stage results for estimates reported in columns 5 and 6 of table 3.3
Table B.2: Determinants of sample attrition
Table C.1: Determinants of sample attrition
Table C.2: Preferred IV estimates for Boys from 1994, 1999, 2004
Table C.3: Preferred IV estimates for girls from 1994, 1999, 2004

List of Figures

Figure 3.1: Lowess plot on height-for-age z-score against age in months for all panel children
Figure 3.2: Lowess plot on height in cms against age in months for all panel children
Figure 3.3: Catch-up effects
Figure 4.1: Male enrollment rate (%) by age in years
Figure 4.2: Female enrollment rate (%) by age in years
Figure 4.3: Male Relative Grade Attainment
Figure 4.4: Female Relative Grade Attainment

Abstract

Investment in human capital is associated with higher economic and non-economic gains in the future for the individual, the household, and at the aggregate level for the economy. In my dissertation, I examine three different policy oriented questions that are relevant for improving health and education, two important dimensions of human capital.

First, I characterize the socioeconomic factors that determine health status among children. Second, I extend the above question to capture the extent to which poor nutrition during childhood affects an individual’s future physical well-being. The association between nutritional deficiency at young ages and subsequent health status captures the extent to which children can recover from some of the deficits in health status caused by early malnourishment.

To address these questions, empirical evidence is drawn using observations on children between ages 3 and 59 months in 1993, who are followed through the 1997 and 2000 waves of the Indonesian Family Life Survey (IFLS). The results suggest that (1) parents’ genetic endowments, household income, and community infrastructure availability are important for improving children’s health outcomes, and (2) poor nutrition at young ages will cause some, but not severe, retardation in the growth of future height, indicating partial recovery from chronic malnutrition.

Finally, I also outline the socioeconomic determinants of schooling enrollment and relative grade attainments among primary school age children. To address this question, I use data from the 1994, 1999, and 2004 waves of the Ethiopian Rural Household Survey (ERHS). I find that it is household income that has the most important role in explaining schooling outcomes. The results also bring out the impact of past schooling outcomes in explaining current schooling attainments.
There exists a strong positive association between past schooling outcomes and current schooling outcomes.

The empirical methodology followed in this dissertation addresses a number of econometric concerns such as measurement error, omitted variables, attrition, sample selection, endogeneity, and instrument relevance.

Chapter 1
Introduction

Investment in human capital is associated with improvements in an individual’s future economic and social well-being. For instance, Thomas and Strauss (1997) show that taller men and women both earn higher wages even after controlling for education and other dimensions of health status. Similarly, Psacharopoulos and Patrinos (2004) estimate the gain from an additional year of schooling at 10%. At the macro level, these investments foster economic growth and development in a country. The literature suggests that the human capital accumulated over the life course is predetermined at an early age, primarily under the age of 5 years [Martorell (1995, 1999), Waterlow (1988), Hoddinott and Kinsey (2001)]. Hence, it is early life conditions and the socioeconomic environment during childhood that affect an individual’s complete trajectory of future health and educational outcomes.

Despite the well documented benefits associated with investments in human capital, (a) at least 30% of children (under the age of 5 years) from countries like Indonesia, Ethiopia, and India suffer from chronic nutritional deficiencies, and (b) more than 100 million primary school age children remain out of school, affecting schooling enrollments and grade completion. The objective of this dissertation is to provide a better understanding of the human capital accumulation process among children. The aim is to use robust econometric techniques so that the empirical evidence can be used to guide policy initiatives taken to improve health and educational outcomes among children.
The first objective here is to characterize the determinants of nutritional status among children. Policy initiatives can then be taken to influence these determinants in a way that health outcomes can be permanently altered at an early stage of an individual’s life. The second objective is to capture the extent to which childhood malnutrition affects an individual’s subsequent health status. If children were able to recover from some of the deficits in health status caused by early malnutrition, then some of the long-term consequences associated with chronic malnutrition could be mitigated early on. The third objective is not just to outline the socioeconomic determinants of current schooling enrollments and grade progression but also to capture the impact of past schooling inputs and resources in determining an individual’s complete trajectory of future schooling outcomes. This allows policy makers to capture both short-run and long-run determinants of schooling outcomes.

In chapter 2, a static conditional health demand function is estimated to capture the role played by child level, household level, and community level factors in determining child health. The empirical evidence comes from the 1993, 1997 and 2000 waves of the Indonesian Family Life Survey (IFLS). The results suggest that children born to taller (and healthier) parents are likely to be better nourished. In addition, children who live in communities with better infrastructure are better nourished. Finally, children residing in households with higher incomes enjoy better nutritional status.

In chapter 3, a dynamic conditional health demand function is estimated to identify the extent to which individuals are subsequently able to compensate for some of the poor nutritional outcomes from the past. Empirical evidence is drawn using observations on children between the age of 3 and 59 months in 1993 who can also be followed through the 1997 and 2000 waves of the IFLS.
The chapter finds that malnutrition during childhood will cause only some permanent growth retardation in an individual’s physical well-being as measured by height attainments. I find that a malnourished child, in the absence of any recovery, would grow to be 4.15 cm shorter than a well-nourished child. However, in the presence of partial recovery, a malnourished child will grow to be only 0.95 cm shorter than a well-nourished child. This implies that at least some of the negative consequences associated with childhood malnutrition can be mitigated at an early age.

In chapter 4, I estimate both a static and a dynamic conditional schooling demand function to (1) identify factors that explain improvements in schooling outcomes, and (2) capture the role played by past schooling resources in explaining an individual’s complete trajectory of current and future enrollments and grade progression. The static framework uses observations on primary school age children from three waves of the Ethiopian Rural Household Survey (ERHS). The dynamic framework uses observations on primary school age children from 1994, who can also be followed through the 1999 and 2004 waves of the ERHS. The static results suggest that household income has the most important role in determining schooling outcomes among children. The dynamic results suggest that a child who was enrolled in school in the last period is 32 percentage points more likely to be enrolled today compared to a child who was not enrolled in the last period. This suggests that even a one-time level effect in improving children’s schooling enrollments today will translate into improvements in the future. A similar relationship is established between relative grades accumulated in the last period and the current period’s relative grades of schooling.
The main contributions of this dissertation are as follows: (1) it identifies the socioeconomic determinants of nutritional status and schooling outcomes among children, (2) it identifies the extent to which malnutrition during childhood affects an individual’s future physical well-being, relying on weaker stochastic assumptions compared to earlier work in the literature, and (3) it captures the impact of past schooling inputs and resources in determining an individual’s complete trajectory of current and future schooling outcomes. The empirical methodology used in this dissertation addresses a number of econometric concerns such as omitted variables, measurement error bias, sample attrition, selection, endogeneity, and instrument relevance.

Chapter 2
The role of the household and the community in determining child health

2.1 Introduction

Health is an important indicator of an individual’s overall well-being. Improvements in health status are positively associated with greater labor productivity and higher wage earnings. For example, empirical evidence from Brazil suggests that taller men and women both earn higher wages even after controlling for education and other dimensions of health status (Thomas and Strauss, 1997). The nutrition literature suggests that an individual’s pattern of growth in height is predetermined at a young age [Martorell (1999); Waterlow (1988)]. Hence, it is nutritional status at a young age that is most relevant for determining final height attainments, which further affect an individual’s overall economic and social well-being.

There are two additional motivations for focusing on the health status of young children: (1) it is shown that better nutrition during childhood is also positively associated with higher completed grades of schooling and cognitive development [Strauss and Thomas (2008); Alderman et al. (2006); Glewwe and Miguel (2008)], affecting wage earnings and overall well-being.
(2) Most of the permanent deficits in height attainments occur during childhood, when children are most vulnerable to economic shocks and health shocks [Hoddinott and Kinsey (2001); Adair (1999)]. Hence, for individuals to enjoy higher returns from investments in both health and education, the factors that improve nutritional status among children must be well identified and appropriately targeted.

The main objective of this chapter is to characterize the socioeconomic determinants of nutritional status among children. Understanding and analyzing the impact of such factors is essential for guiding policy initiatives that can influence these determinants in a way that health outcomes can be permanently altered at an early stage of an individual’s life.

The most widely used indicators of child health are height-for-age z-score (HAZ) [1], weight-for-height z-score, and weight-for-age z-score [2]. Among the three indicators, HAZ is identified as a long-run indicator of nutritional status as it captures the entire stock of nutrition accumulated since birth (Waterlow, 1988). In addition, anthropometric outcomes are not subject to systematic measurement error, a standard problem encountered while using subjective measures of health status such as self-reported morbidity. Hence anthropometric outcomes are more reliable indicators of long-run nutritional deficiency.

Children with HAZ less than -2 are classified as undernourished and/or stunted by the WHO (World Health Organization). Stunting in young children remains a serious source of concern among policy makers in several developing countries, including Indonesia. For example, the country experienced a period of rapid economic growth between the years 1990 and 1996. During this period, the average growth in GDP per capita remained around 6%. However, even with such high levels of economic growth, 40.6% of children under the age of 5 were malnourished and/or stunted.
Shortly after the period of rapid economic growth, Indonesia suffered a sharp reversal in its economic performance during late 1997 and early 1998. Sudden depreciation of the Indonesian Rupiah led to an increase in the relative price of tradable goods, especially foodstuffs. The nominal price of food increased, resulting in an inflation of about 150% within months. However, by 2000, Indonesia witnessed rapid recovery in the growth rate of GDP per capita along with lower inflation rates. During the recovery period, the country also witnessed significant declines in the percentage of stunted children. However, in absolute terms, the percentage of children suffering from chronic nutritional deficiencies still remains at a high 35.1%, comparable to several poor African nations.

[1] HAZ is standardized height calculated using the 1977 NCHS (National Center for Health Statistics) tables drawn from the United States population, conditional upon age (in months) and sex.
[2] Weight-for-height z-score and weight-for-age z-score are standardized weights calculated using the 1977 NCHS tables drawn from the U.S. population, conditional upon height in cm and age respectively.

The primary goal of this chapter is to examine the impact of the various child level, household level, and community level characteristics in determining child health status as measured by HAZ. Gender differences in health outcomes at a young age result in differences in attained height as an adult, which can potentially manifest into differences in future earnings [Waterlow (1988); Thomas and Strauss (1997)]. Hence, we also examine the gender specific determinants of child health outcomes.

This chapter uses data from the three waves of the Indonesian Family Life Survey (IFLS). I construct a panel dataset for children between 3 and 59 months in 1993 and follow them through the 1997 and 2000 waves of the survey.
A static conditional health demand function is estimated to capture the impact of current socioeconomic factors in explaining current health status.

The static estimation results indicate that it is parental height, household income, provision of electricity, and availability of a paved road in the community that are most important in determining children’s nutritional status. We find that a 1 cm increase in mother’s height is associated with a 0.04 standard deviation (s.d.) improvement in children’s HAZ scores. Similarly, a 1 cm increase in father’s height corresponds to a 0.03 s.d. improvement in z-score. Parents’ genetic endowments have a strong positive effect on child health, whereas parental schooling has little independent influence on child health. Household income has the strongest impact: a 100% increase in real per capita consumption expenditure (PCE) is associated with a 0.24 s.d. improvement in z-scores. Income effects are strong and positive even after controlling for the endogeneity in PCE. Community infrastructure also has some role in determining child health. I also examine whether the socioeconomic determinants of child health vary by gender. The pooling tests on the joint sample (males and females pooled together) suggest that none of the socioeconomic characteristics that affect child health differ by gender.

The chapter contributes to the existing literature on health outcomes in multiple ways. First, growing evidence shows that child health is strongly correlated with adult health outcomes, and hence a contribution to the literature examining the determinants of child health becomes even more relevant [see Strauss and Thomas (2008) for a recent review]. Second, Ghuman et al. (2005) show that correlation between community level unobservables and household specific unobservables can bias the estimated coefficient on the household characteristics.
They show that not accounting for this correlation overestimates or underestimates the estimated coefficient on the family background characteristics by almost 40-50%. To address this issue, the chapter captures the independent effects of the family background characteristics on child health, controlling for community level fixed-effects. This method also addresses the endogeneity in the community level characteristics, allowing us to obtain reliable parameter estimates on both the community level characteristics and household level characteristics. Third, we treat our measure of long-run household income, captured by the logarithm of real per capita household consumption expenditure (PCE), as endogenous and compare the extent and sources of bias between the OLS and IV estimates of PCE.

The rest of the chapter is organized as follows. Section 2.2 outlines the theoretical model and the empirical framework used for estimation purposes. Section 2.3 describes the data and provides summary statistics and other descriptive statistics. The main results of the chapter are described in section 2.4. Concluding remarks follow in section 2.5.

2.2 Model and empirical specification

Parents make investments in children’s health with the aim of improving the child’s current and future economic and physical well-being. Following Strauss and Thomas (1998), the static health production function, $H_t$, can be specified as a function of health inputs, environmental factors, individual demographic characteristics, household background characteristics, genetic endowments, time-varying health shocks, and time-invariant health endowments. It is assumed here that health status observed today is only a function of current period characteristics and that history does not matter [3].

\[
H_t = h(M_t, I_t, D_t, \theta_{ct}, \theta_c, \mu_{ht}, \mu_h, G) \tag{2.1}
\]

$H_t$ is current health status measured by height-for-age z-score.
$M_t$ is the health input at time $t$, which includes food and non-food consumption goods used towards the maintenance and/or improvement of child health. It is assumed that households do not derive any direct utility from the consumption of health inputs except from its indirect use in the accumulation of child health output. $I_t$ characterizes the environment where the child lives, capturing water and sanitation facilities and other infrastructure in the community. $D_t$ captures time-varying demographic characteristics such as the child’s age. $\theta_{ct}$ includes all time-varying health shocks like fever and diarrhea. $\theta_c$ summarizes information about all time-invariant characteristics such as the child’s gender and time-invariant health endowments like the child’s innate ability to absorb nutrients and fight diseases. $\mu_{ht}$ and $\mu_h$ capture household specific time-varying and time-invariant demographics and background characteristics such as parents’ rearing and caring practices. $G$ summarizes information about all genetic endowments, capturing genotype [4] and phenotype [5] influences on child health.

[3] This assumption is relaxed in chapter 3, where I show that the history of health inputs does matter in determining health outcomes.

Households allocate health inputs according to the utility maximization problem described below. The household maximizes utility (2.2), subject to a period specific budget constraint (2.3) and a period specific child health production function (2.4). It is assumed that the utility function is concave and twice differentiable.

\[
\max \; U = u[C_t, H_t, L_t; \theta_{pt}] \tag{2.2}
\]

subject to:

\[
P^c_t C_t + P^m_t M_t = w_t (T_t - L_t) + \pi_t \tag{2.3}
\]

\[
H_t = f(M_t, I_t, D_t, \theta_{ct}, \theta_c, \mu_{ht}, \mu_h, G) \tag{2.4}
\]

The utility function depends upon consumption goods that include food and non-food consumption commodities, $C_t$, leisure, $L_t$, health status of the child, $H_t$, and certain unobserved preference shocks, $\theta_{pt}$. $P^c_t$ is a vector of prices of food and non-food consumption goods.
$P^m_t$ is a vector of prices of health inputs. $w_t$ is the wage rate (price of leisure). $T_t$ is the parents’ total time endowment. Profit income from farm and non-farm activities and all other sources of non-labor income are captured by $\pi_t$.

[4] Genotype influences include genetic endowments that are passed from the parents to the child via their DNA.

[5] Phenotype influences capture all observable characteristics of an individual, such as shape, size, color, and behavior, that result from the interaction of genotype influences with the environment.

Using simple first-order conditions, we can solve for the optimal amount of child health input, $M^*_t$, as follows:

$$M^*_t = m(P^c_t, P^m_t, w_t, I_t, \pi_t, D_t, \theta_{ct}, \theta_c, \mu_{ht}, \mu_h, G, \theta_{pt}) \tag{2.5}$$

The static conditional health demand function (2.6) can be obtained by replacing $M_t$ in equation (2.4) by $M^*_t$ from equation (2.5):

$$H^*_t = h(P^c_t, P^m_t, w_t, I_t, \pi_t, D_t, \theta_{ct}, \theta_c, \mu_{ht}, \mu_h, G, \theta_{pt}) \tag{2.6}$$

The empirical counterpart of the static conditional health demand function (2.6) can be written as follows:

$$H_{it} = \beta_0 + \sum_{j=1}^{R} \beta^X_j X_{jit} + \sum_{j=1}^{S} \beta^Z_j Z_{ji} + \epsilon_c + \upsilon_{it}; \quad \upsilon_{it} = \epsilon_i + \epsilon_h + \epsilon_{it} \tag{2.7}$$

$H_{it}$ is the child’s height-for-age z-score at time $t$, where the subscript $i$ refers to the individual. The $X$s are time-varying regressors and the $Z$s are time-invariant regressors. There are two sources of unobservables in the static specification (equation 2.7), $\upsilon_{it}$ and $\epsilon_c$, where $\upsilon_{it}$ is assumed to be a time-varying i.i.d. term and $\epsilon_c$ is a time-invariant community specific unobservable that can be removed using community fixed-effects. There are not enough observations with at least two children from the same mother or household to separately control for household specific time-invariant unobservables, and hence we treat the time-invariant unobservables at the individual ($\epsilon_i$) and household ($\epsilon_h$) level as random. These time-invariant unobservables can be explicitly addressed in the dynamic model estimated in chapter 3.
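How community fixed-effects remove a time-invariant community unobservable such as the one in equation (2.7) can be illustrated with a small simulation. The sketch below is not the estimation code used in this chapter: the data are simulated, there is a single regressor, and the names `within_transform` and `slope` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def within_transform(values, groups):
    """Subtract each group's mean from its members (the 'within' transformation)."""
    values = np.asarray(values, dtype=float)
    _, group_index = np.unique(np.asarray(groups), return_inverse=True)
    group_means = np.bincount(group_index, weights=values) / np.bincount(group_index)
    return values - group_means[group_index]

def slope(x, y):
    """OLS slope of y on a single regressor x (both are demeaned first)."""
    x = x - x.mean()
    y = y - y.mean()
    return float(x @ y / (x @ x))

# Simulated data (an assumption for illustration, not IFLS data).
rng = np.random.default_rng(0)
n_communities, children_per_community = 200, 10
community = np.repeat(np.arange(n_communities), children_per_community)
eps_c = rng.normal(size=n_communities)[community]   # community unobservable
x = eps_c + rng.normal(size=community.size)         # regressor correlated with eps_c
true_beta = 0.5
haz = true_beta * x + eps_c + 0.5 * rng.normal(size=community.size)

beta_pooled = slope(x, haz)   # ignores eps_c, so it is biased upward in this setup
beta_fe = slope(within_transform(x, community), within_transform(haz, community))
print(f"pooled OLS: {beta_pooled:.3f}, community fixed-effects: {beta_fe:.3f}")
```

In this setup the pooled estimate absorbs the community unobservable, while the within-community estimate stays close to the value used to generate the data. A real application would add the remaining regressors and cluster the standard errors.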
The time-varying child level and household level regressors include the age of the child and a measure of household income, captured by log(real per capita household consumption expenditure). Information on the age of the child and on per capita consumption expenditure is obtained from the household questionnaires. The time-invariant child level regressors include a male dummy that takes a value of 1 if the child is male and 0 otherwise. The time-invariant household level regressors include mother’s height, father’s height, mother’s completed grades of schooling, and father’s completed grades of schooling, which capture genetic endowments and parents’ rearing and caring practices. All height measurements are recorded to the nearest 0.1 cm (1 mm).

The time-varying community characteristics included are: an indicator for whether the individual lives in a rural area, log of real price of rice, log of real price of condensed milk, log of real price of cooking oil, distance to health center in km, a dummy variable indicating the presence of a paved road, percentage of households with electricity, log of real hourly male wage rates, log of real hourly female wage rates, and number of health posts in a community. Prices of food consumption goods such as the price of rice, price of cooking oil, and price of condensed milk are obtained from the community questionnaires. All prices are converted to real terms using the consumer price index and expressed in logs. Hourly male and female wage rates are also converted to real terms and expressed in logs. Information on whether the community has a paved road, the number of health posts located in a community, distance to the health center in km, and the percentage of households with electricity in a community captures community resource availability. The following section describes the data used for estimating equation (2.7).
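The conversion of prices and wages to real terms and logs can be sketched as follows. This is only an illustration: the function name `log_real_price`, the base value of 100 for the price index, and the example numbers are assumptions, not details taken from the IFLS files.

```python
import math

def log_real_price(nominal_price, cpi, base_cpi=100.0):
    """Deflate a nominal price by a consumer price index and return its natural log."""
    if nominal_price <= 0 or cpi <= 0:
        raise ValueError("price and CPI must be positive")
    real_price = nominal_price * base_cpi / cpi   # price expressed in base-period terms
    return math.log(real_price)

# Hypothetical example: a nominal price of 1500 when the CPI stands at 150 (base = 100).
print(log_real_price(1500.0, 150.0))
```

The same transformation would apply to hourly wage rates, with the wage in place of the price.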
2.3 Data and variables

2.3.1 Indonesian Family Life Survey

The data used in this chapter come from the 1993, 1997 and 2000 waves of the Indonesian Family Life Survey (IFLS), a large-scale socioeconomic survey conducted in Indonesia. The IFLS collects extensive information at the individual, the household, and the community level. The survey includes modules on measures of health, household composition, labor and non-labor income, farm and non-farm assets, pregnancy, schooling, consumption expenditure, contraceptive use, sibling information, and immunization [see Frankenberg et al. (1995, 2000) and Strauss et al. (2004) for more details on sample selection and survey instruments].

The IFLS is an ongoing longitudinal survey, the first wave of which was fielded during late 1993 and early 1994 (IFLS1). In IFLS1, 7224 households were interviewed. The first follow-up wave was surveyed during the second half of 1997 (IFLS2), just before the major economic and financial crisis in Indonesia. In IFLS2, 7629 households were interviewed, of which 6752 were original IFLS1 households and 877 were split-off households. The third wave (IFLS2+) was a special follow-up survey fielded in late 1998. A 25% sub-sample of the original IFLS1 households was contacted in late 1998 with the aim of analyzing the immediate impact of the 1997-98 economic and financial crisis. The fourth wave of the IFLS was fielded in 2000 (IFLS3). A total of 10435 households were interviewed in 2000. Of these, 6661 were original IFLS1 households and 3774 were split-off households. The sample surveyed in 1993-94 represented 83% of the Indonesian population living in 13 of Indonesia’s 27 provinces at the time. The 13 provinces are spread across the islands of Java, Bali, Kalimantan, Sumatra, West Nusa Tenggara, and Sulawesi.
Provinces were selected to maximize representation of the population, capture the cultural and socio-economic diversity of Indonesia, and yet be cost-effective given the size and the terrain of the country. A total of 321 enumeration areas (EAs)/communities were selected from these 13 provinces for final survey purposes.

Location information for all respondents is available at four administrative unit levels in Indonesia (from the smallest to the largest): community, kecamatan (subdistrict), kabupaten (municipality), and province. One would ideally like to use the community level code as the location variable to remove any location-specific time-invariant unobservables from the model and also to control for community level time-varying characteristics on the right hand side of the empirical specification. There are two challenges in using the original community codes as the location variable in this study. First, community level data are only available for respondents residing in the 321 original IFLS communities. The IFLS does not provide detailed community level information for mover households, except for some communities in 2000 [see details in the mini-CFS questionnaire from Strauss et al. (2004)]. Second, to estimate location-specific fixed-effects, data must be available on at least 2 children residing in the same community from each of the three waves of the IFLS. It becomes particularly hard to obtain observations on 2 children from the same community during the follow-up surveys, since many households have moved over time into new communities that were not initially surveyed in 1993. These households are followed over time, but typically there is only 1 observation on the mover household from the new community. Hence, in order to be able to match households with community level information in all three waves of the survey, and to estimate fixed-effects models that remove time-invariant community level unobservables, a coarser “location” variable is constructed.
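A fallback rule of this kind for building the “location” identifier can be sketched as follows: a child is assigned the community code when that community contains at least 5 sampled children, and otherwise the code of the next administrative level (kecamatan, then kabupaten, then province). This is a minimal sketch and not the code used for the chapter. The field names, the function `assign_location`, and the choice to count all sampled children at each level are assumptions made for illustration.

```python
from collections import Counter

# Administrative levels ordered from the smallest to the largest unit.
LEVELS = ("community", "kecamatan", "kabupaten", "province")

def assign_location(children, min_children=5):
    """Return a (level, code) pair per child, using the smallest level with enough children."""
    counts = {level: Counter(child[level] for child in children) for level in LEVELS}
    locations = []
    for child in children:
        for level in LEVELS:
            if counts[level][child[level]] >= min_children:
                locations.append((level, child[level]))
                break
        else:
            # No level reaches the threshold: fall back to the province code.
            locations.append(("province", child["province"]))
    return locations

# Hypothetical sample: five children in one community, plus two movers.
sample = [{"community": "c1", "kecamatan": "k1", "kabupaten": "b1", "province": "p1"}
          for _ in range(5)]
sample.append({"community": "c2", "kecamatan": "k1", "kabupaten": "b1", "province": "p1"})
sample.append({"community": "c3", "kecamatan": "k9", "kabupaten": "b9", "province": "p9"})
locations = assign_location(sample)
print(locations[0], locations[5], locations[6])
```

Returning the level together with the code keeps identifiers from different administrative levels from colliding when they are used as fixed-effect groups.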
The following decision rule is used to create the “location” variable. The “location” variable is assigned the community code if there are 5 or more children residing in the same community.[6] In cases where this criterion fails, the “location” variable is assigned the code corresponding to the next level of aggregation, i.e., the kecamatan[7] code, following the same rule. Similarly, the kabupaten and lastly the province codes are assigned to the location variable in order to obtain at least 5 children in each of the newly created locations. This new aggregation of the geographic units helps us combine household level and community level information and also allows the use of fixed-effects estimation techniques at the location level. It is this “location” variable that captures geographic information corresponding to each household in all three waves of the IFLS. All community level characteristics reported in the tables vary at the location level created here and not at the original community id level.

[6] It is usually the case that fewer than 5 children are found only in communities that were not original IFLS1 communities, that is, communities where mover households resided.

[7] The kecamatan and kabupaten codes are based on the BPS (Indonesian central bureau of statistics) codification, which can be easily linked to other nationally representative data like the SUSENAS. The definitions of a kecamatan and a kabupaten continue to change over time. In order to use consistent kecamatan and kabupaten codes over time, I use the 1999 BPS codes, which define the kecamatan and kabupaten codes for all IFLS communities from all three years of the survey.

2.3.2 Sample size, variables, and descriptive statistics

Martorell and Habicht (1986) and Satyanarayana et al. (1980) point out that the decline in height growth during the first few years of life largely determines the small stature exhibited by adults in developing countries. In addition, height measured at young ages is also strongly correlated with attained body size as an adult [Spurr (1988), Martorell (1995)]. Hence, the initial sample is restricted to observations on children less than 5 years of age in 1993. In addition, the sample is restricted to children who are less than 12 years of age in 2000 in order to keep the child health production function time-invariant for the complete sample here.[8] This restriction does not result in the loss of many observations, because the initial sample includes children who are younger than 60 months in 1993 and hence, by 2000, over 99% of the sample is still under 144 months of age. The final sample includes 1819 children for whom complete anthropometric details exist from all three waves of the survey. Table 2.1 shows trends in mean height-for-age z-scores and the percentage of children classified as stunted over the three waves of the IFLS.

Table 2.1: Summary statistics on height-for-age z-score for children between the ages of 3 and 59 months in 1993, 1997, and 2000

Year | Observations | HAZ < -2 (%) | Mean HAZ | Mean difference (years)
--- | --- | --- | --- | ---
1993 | 2203 | 40.26 (0.01) | -1.57 (0.03) | -0.127*** (0.05) (1997-1993)
1997 | 2356 | 41.38 (0.01) | -1.70 (0.03) | 0.272*** (0.04) (2000-1997)
2000 | 3537 | 34.88 (0.008) | -1.43 (0.02) | 0.145*** (0.04) (2000-1993)

- Source: IFLS - 1993, 1997, and 2000; *** significant at 1%, ** significant at 5%, * significant at 10%
- Standard errors reported in parentheses are robust to clustering at the household level

There exists significant improvement in mean height-for-age z-scores over time for children using the repeated cross-sectional[9] data. The statistics indicate that mean height-for-age z-scores worsen until 1997 and then improve during 1997-2000. The percentage of children classified as stunted also increases between 1993 and 1997 and then declines between 1997 and 2000.
In summary, trends in child health status as measured by height-for-age z-scores have improved by the year 2000. A similar trend is seen using observations on the panel respondents[10]; see table 2.2. Summary statistics of the variables used in the regression specifications are outlined in table 2.3.

[8] The child health production function varies between young children and teenagers going through pubescent growth spurts (Waterlow, 1988).

[9] Cross-section data include data for children between the ages of 3 and 59 months in the 1993, 1997, and 2000 waves of the IFLS.

[10] Panel respondents include children initially between the ages of 3 and 59 months in 1993 who are followed through the 1997 and 2000 waves of the IFLS.

Table 2.2: Summary statistics on height-for-age z-score for children between the ages of 3 and 59 months in 1993, who are followed through the 1997 and 2000 waves of the IFLS

Year | Observations | HAZ < -2 (%) | Mean HAZ | Mean difference (years)
--- | --- | --- | --- | ---
1993 | 1819 | 40.62 (0.01) | -1.62 (0.03) | -0.133*** (0.03) (1997-1993)
1997 | 1819 | 42.05 (0.01) | -1.75 (0.02) | 0.077*** (0.01) (2000-1997)
2000 | 1819 | 38.64 (0.01) | -1.68 (0.02) | -0.055*** (0.03) (2000-1993)

- Source: IFLS - 1993, 1997, and 2000; *** significant at 1%, ** significant at 5%, * significant at 10%
- Standard errors reported in parentheses are robust to clustering at the household level

Table 2.3: Summary statistics of all variables used in the empirical specification

Variable | Mean | Std. dev
--- | --- | ---
Height-for-age z-score (HAZ) | -1.68 | 1.30
Height in cm | 105.86 | 19.42
Mother’s height in cm | 150.5 | 5.1
Father’s height in cm | 161.3 | 5.3
Mother’s completed grades of schooling | 5.96 | 3.93
Father’s completed grades of schooling | 6.90 | 4.33
Log of real per capita household consumption expenditure | 9.87 | 0.76
Square root of real per capita household productive assets | 1.51 | 2.60
Square root of real per capita household total assets | 4.48 | 3.78
Distance to the community health center in km | 5.07 | 4.58
Percentage of households with electricity | 76.68 | 26.92
Log of real male wage rate | 6.55 | 0.52
Log of real female wage rate | 6.19 | 0.84
Log of real price of rice | 0.85 | 0.20
Log of real price of condensed milk | 5.17 | 1.51
Log of real price of cooking oil | 1.74 | 0.43
Dummy=1 if the community has paved road | 0.74 | 0.43
Number of health posts in a community | 6.67 | 4.73

- Source: IFLS - 1993, 1997, and 2000; No. of observations - 5457

2.4 Results

The estimation results of equation 2.7 are reported in table 2.4. The location interacted time dummies specified in table 2.4 control for a full set of location level time-varying unobservables. In column 5 of table 2.4 these location interacted time dummies are replaced with actual location level time-varying observable characteristics. The regression coefficients reported in table 2.4 follow an OLS/IV estimation strategy with location fixed-effects. The standard errors reported are adjusted for clustering at the individual level and are also robust to the presence of any arbitrary form of heteroskedasticity.
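Why an OLS and an IV estimate of an income effect can differ when income is measured with error can be illustrated with a small simulation. The sketch below is an illustration under stated assumptions, not the estimation behind table 2.4: the data are simulated, there are no fixed-effects or additional controls, a single instrument is used, and all variable and function names are hypothetical.

```python
import numpy as np

def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x."""
    return float(np.cov(x, y)[0, 1] / np.var(x, ddof=1))

def iv_slope(z, x, y):
    """Just-identified IV estimator with one instrument: cov(z, y) / cov(z, x)."""
    return float(np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1])

# Simulated data (an assumption for illustration, not IFLS data).
rng = np.random.default_rng(1)
n = 20_000
assets = rng.normal(size=n)                        # instrument
true_log_pce = assets + rng.normal(size=n)         # income measure without error
observed_log_pce = true_log_pce + rng.normal(scale=np.sqrt(2.0), size=n)  # noisy measure
true_effect = 0.25
haz = true_effect * true_log_pce + 0.5 * rng.normal(size=n)

beta_ols = ols_slope(observed_log_pce, haz)        # attenuated toward zero
beta_iv = iv_slope(assets, observed_log_pce, haz)  # uses only variation driven by assets
print(f"OLS: {beta_ols:.3f}, IV: {beta_iv:.3f}")
```

In this setup the measurement error is unrelated to the instrument, so the IV estimate is not attenuated while the OLS estimate is. With one endogenous regressor and one instrument this estimator coincides with two-stage least squares.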
Table 2.4: Determinants of height-for-age z-score for panel respondents, pooling data from 1993, 1997 and 2000 (dependent variable: HAZ; standard errors in parentheses)

Covariates | (1) OLS | (2) OLS | (3) OLS | (4) IV | (5) IV
--- | --- | --- | --- | --- | ---
Male dummy | -0.7659*** (0.28) | -0.7528*** (0.28) | -0.7647*** (0.28) | -0.7890*** (0.30) | -0.6848** (0.28)
Spline in age in months (< 24 months) | -0.0780*** (0.009) | -0.0773*** (0.009) | -0.0778*** (0.009) | -0.0793*** (0.01) | -0.0774*** (0.009)
Spline in age in months (>= 24 months) | -0.0013 (0.001) | -0.0012 (0.001) | -0.0013 (0.001) | -0.0015 (0.001) | 0.0017* (0.0008)
Spline in age in months (< 24)*male dummy | 0.0340*** (0.01) | 0.0333*** (0.01) | 0.0338*** (0.01) | 0.0352*** (0.01) | 0.0303** (0.01)
Spline in age in months (>= 24)*male dummy | -0.0029*** (0.001) | -0.0028*** (0.001) | -0.0028*** (0.001) | -0.0030*** (0.001) | -0.0029* (0.001)
Mother’s height | 0.0480*** (0.004) | 0.0482*** (0.004) | 0.0480*** (0.004) | 0.0475*** (0.003) | 0.0473*** (0.003)
Father’s height | 0.0360*** (0.003) | 0.0364*** (0.003) | 0.0357*** (0.003) | 0.0351*** (0.003) | 0.0348*** (0.003)
Mother’s schooling | 0.0154** (0.007) | 0.0187*** (0.006) | 0.0161** (0.007) | 0.0094 (0.006) | 0.0082 (0.007)
Father’s schooling | 0.0026 (0.006) | 0.0051 (0.006) | 0.0024 (0.006) | -0.0018 (0.005) | -0.0029 (0.005)
log(PCE) | 0.0886*** (0.03) | | | 0.2478*** (0.08) | 0.2478*** (0.07)
Productive assets | | -0.0012 (0.007) | | |
Total assets | | | 0.0158*** (0.005) | |
Price of rice | | | | | 0.3038* (0.16)
Price of cooking oil | | | | | -0.0948* (0.04)
Price of condensed milk | | | | | -0.0036 (0.01)
Rural dummy | | | | | 0.0230 (0.18)
Rural dummy*price of rice | | | | | -0.3083* (0.18)
Number of health posts | | | | | 0.0180 (0.01)
Distance to health center | | | | | 0.0070 (0.005)
Electricity | | | | | 0.0025** (0.001)
Dummy for paved road | | | | | 0.1170* (0.06)
Male wage rate | | | | | 0.0127 (0.05)
Female wage rate | | | | | 0.0135 (0.03)
Observations | 5457 | 5457 | 5457 | 5457 | 5457
Location interacted time fixed-effects | Yes | Yes | Yes | Yes | No
Location fixed-effects | No | No | No | No | Yes

- Source: IFLS - 1993, 1997, and 2000; *** significant at 1%, ** significant at 5%, * significant at 10%
- Standard errors reported in parentheses are robust to clustering at the individual level
- In (4), log(PCE) is instrumented with household total assets. The F statistic on the excluded instruments is 161.19
- In (5), log(PCE) is instrumented with total household assets. The F statistic on the excluded instruments is 174.14
- The first-stage regression estimates for column 5 are reported in table A.1 of the appendix
- Also included in the regressions are dummy variables capturing missing observations on mother’s schooling, father’s schooling, mother’s height, and father’s height, where the missing observation was imputed by the sample mean
- Prices of consumption goods and hourly wage rates are converted to real terms and expressed in logs

The coefficient estimates obtained on the child and household characteristics from columns 4 and 5 of table 2.4 are not statistically different from each other, indicating that the choice between location interacted time dummies and location time-varying characteristics is not likely to bias the estimated coefficients on the household characteristics reported in columns 4 and 5 of table 2.4.

The coefficient on the male dummy in table 2.4 has a negative sign, suggesting that female children have better health than male children. This result is striking when compared to other Asian countries like India and Bangladesh, which exhibit comparable levels of stunting but where one finds large and significant gender differentials in favor of boys vis-a-vis girls. For Indonesia this is not very surprising, since the country does not traditionally suffer from large gender differences in human capital accumulation outcomes. The relationship between height-for-age z-score and age in months is non-linear, and the coefficients on the spline variables capture this non-linearity, indicating that z-scores decline until the age of 24 months and then improve and remain steady or unchanged after 48 months.[11] The interaction terms between the spline variables and the male dummy capture the gender specific changes in health outcomes. Overall, females have higher z-scores than their male counterparts.

[11] This is consistent with much of the literature on health outcomes (see Strauss et al., 2004).

Household characteristics included in the regression estimates are parents’ completed grades of schooling, parental height in centimeters, and a measure of household income. The parental schooling variables capture the efficiency with which health inputs are transformed into health output. The coefficient estimates on mother’s completed grades of schooling and father’s completed grades of schooling reported in table 2.4 show the expected positive relationship between parental schooling and child health. Every additional year of mother’s schooling increases z-scores by 0.015 standard deviations (column 1, table 2.4). Father’s schooling has a positive though insignificant impact on z-scores. The IV estimates reported in column 4 of table 2.4, which are also our preferred estimates, indicate that neither of the parental schooling variables has a statistically significant impact on child health. The positive correlation between household per capita consumption expenditure and mother’s schooling is likely to have biased the coefficient estimate on mother’s schooling upwards in column 1, table 2.4. The absence of a significant schooling effect is contrary to much of the evidence in the literature (see Strauss and Thomas, 1995 for a review). The present specification uses a linear measure of completed grades of schooling. To capture the differential impact of the various levels of schooling completion on child health, I split the measure of completed grades of schooling into 4 separate dummy variables. The first dummy variable takes a value equal to 1 if the mother has 6 or fewer grades of schooling and 0 otherwise.
The second dummy variable takes a value equal to 1 if the mother has between 6 and 9 completed grades and 0 otherwise. The third dummy variable takes a value equal to 1 if the mother has between 9 and 12 grades of schooling and 0 otherwise. The last dummy variable takes a value equal to 1 if the mother completes 12 or more grades of schooling and 0 otherwise. Similarly, four separate dummy variables are constructed to capture father’s schooling completion. Replacing parents’ completed grades of schooling with three separate dummy indicators results in non-significant impacts of the parental schooling indicators on child health. A joint test on the newly constructed dummy variables for mother’s schooling gives a chi2 of 3.66 with a p-value of 0.16, suggesting that the impact of mother’s schooling does not differ by her level of schooling completion. A joint test on the dummy variables capturing father’s schooling levels gives a chi2 of 2.44 with a p-value of 0.29, again implying that the impact of father’s schooling on child health does not vary by his level of schooling completion.

Parental height variables capture the impact of genetic endowments in determining current health. Mother’s height in centimeters and father’s height in centimeters both capture the impact of different genetic endowments in ascertaining the child’s current health status.[12] Every 1 centimeter increase in mother’s height improves z-scores by 0.048 standard deviations and every 1 centimeter increase in father’s height improves z-scores by 0.035 standard deviations (column 4, table 2.4).[13] Mother’s height has a larger impact in determining child health than father’s height. This is similar to the results found by Ghuman et al. (2005) and Thomas, Strauss, and Henriques (1992).

[12] See Thomas and Strauss (1992) for a discussion of the role played by parent-specific genetic endowments in explaining current health status.
The final household characteristic included in the regression specification is household income. The logarithm of real per capita household consumption expenditure is used to capture the household’s complete resource availability. OLS estimates of log(PCE) from column 1, table 2.4 can be both biased upwards, due to correlation with time-invariant household-specific unobservables, and biased downwards, due to measurement error in the data. Assets are exogenously determined in a static model and hence log(PCE) is replaced with productive assets and total assets in columns 2 and 3 of table 2.4, respectively. The results indicate that children residing in households with higher income enjoy better health. IV estimates of log(PCE) are reported in column 4 of table 2.4, where log(PCE) is instrumented with the sum of household productive assets, unproductive assets, and unearned income, which are assumed to be exogenous in a static model. The coefficient estimate on log(PCE) increases from 0.089 (column 1, table 2.4) to 0.248 (column 4, table 2.4), showing that the IV estimates of income have a much larger impact on current health status. The increase in the coefficient estimate of log(PCE) from the OLS to the IV regressions indicates that the OLS estimate of log(PCE) is likely to be biased downward due to measurement error rather than biased upwards due to omitted variables.[14] The role of income is largely consistent with most related work examining the determinants of child health.[15]

[13] It takes about 10 years for the average height in a population to increase by 1 cm, and hence the magnitude of these impacts on future heights is small.

[14] The F statistic on the excluded instruments and the Hansen J statistic from the first-stage regression for the IV estimates reported in table 2.4 are appended at the end of table 2.4, and the complete first-stage regression estimates are summarized in table A.1 of the appendix.

[15] Thomas et al. (1991), Thomas and Strauss (1992), Haddad et al. (2003), and Glick and Sahn (1998) all find a strong positive effect of per capita consumption expenditure in determining child health.

Household income can also possibly have non-linear effects on child health. To capture this non-linearity, I include a spline in the measure of household income with a knot at the sample median. The preferred IV specification is re-estimated with the non-linear measures of PCE. The two measures of PCE in the non-linear specification are not significantly different from each other. A chi2 test on the two measures of PCE gives a statistic of 0.48 with a p-value of 0.48, providing no evidence of a non-linear effect of PCE on child health.

The role of community/location time-varying characteristics is also important in determining child health. In the light of endogenous program placement effects, not accounting for the correlation between community infrastructure variables and community level time-invariant unobservables can bias coefficient estimates on the community characteristics [Rosenzweig and Wolpin (1986)]. To address this issue, the preferred IV estimates include location fixed-effects, allowing me to identify the exogenous impact of the time-varying community level characteristics on child health. These estimates are valid under the assumption that the time-varying community level unobservables affecting program placement are uncorrelated with the community level observable characteristics.

Among the community level time-varying characteristics, the chapter controls for prices of consumption goods, health inputs, wage rates, and community infrastructure variables. The prices of consumption goods included are the price of rice, the price of cooking oil, and the price of condensed milk.[16] An increase in the price of rice is associated with improvements in child health in urban areas and has almost no impact in rural areas (column 5, table 2.4). In rural areas, households are more likely to be net producers of rice, and hence an increase in the rice price is likely to have a positive impact or no impact at all on children’s health. As for urban areas, the positive coefficient on the price of rice is still surprising, as residents in urban areas are likely to be net consumers rather than net producers of rice. One possible explanation for this anomaly is that if households had access to cheaper and better substitutes for rice, then the prices of the substitutes would be more important in determining child health than the price of rice.

[16] Prices are converted to real terms and expressed in logs throughout the chapter.

An increase in the price of cooking oil is associated with a decline in child health (column 5, table 2.4). Spending on cooking oil may not be a large proportion of household per capita consumption expenditure, but it reflects spending on essential consumption goods. One important consumption good aimed only at children is condensed milk, which is also included in the regression results. The advantage of condensed milk is that it does not need refrigeration, which matters in a country where not all households own a refrigerator. The price of condensed milk has a negative but insignificant impact on child health (column 5, table 2.4). Because of the many missing values in the price data for other consumption goods, this chapter can only control for the price of rice, the price of cooking oil, and the price of condensed milk among the right hand side variables. It is acknowledged that ideally the prices of a wider range of consumption goods would be included on the right hand side; however, data constraints do not allow us to control for more of them. Also included in the regressions are prices of health inputs, as captured by distance to the health center, and the price of parents’ time, as captured by male and female specific hourly wage rates in a community.
Measures of community infrastructure availability such as the number of health posts (access to health care), presence of a paved road (access to bigger cities), and a measure of electricity (storage facility) are used as additional control variables. The number of health posts in a community has a positive but insignificant impact on child health. The presence of a paved road and the measure of electricity in the community are both positively associated with improvements in child health. Children residing in communities with a paved road have z-scores that are 0.117 standard deviations higher than those of their counterparts from other communities. Similarly, a one percentage point increase in the share of households with electricity in the community is associated with z-scores that are 0.0025 standard deviations higher (column 5, table 2.4).

The next objective is to investigate the gender specific determinants of child health. To capture this, I run separate regressions for male and female children and report the preferred IV estimates for the two separate samples of boys and girls in columns 1 and 2 of table 2.5.
Table 2.5: Determinants of height-for-age z-score for males and females separately (dependent variable: HAZ; standard errors in parentheses)

Covariates | (1) IV Males | (2) IV Females
--- | --- | ---
Spline in age in months (< 24 months) | -0.0483*** (0.008) | -0.0779*** (0.01)
Spline in age in months (>= 24 months) | -0.0011 (0.0009) | 0.0012 (0.001)
Mother’s height | 0.0550*** (0.004) | 0.0444*** (0.005)
Father’s height | 0.0308*** (0.004) | 0.0433*** (0.004)
Mother’s schooling | 0.0119 (0.008) | -0.0040 (0.01)
Father’s schooling | 0.0029 (0.007) | -0.0105 (0.009)
log(PCE) | 0.2504** (0.11) | 0.1882 (0.11)
Price of rice | 0.0115 (0.22) | 0.6840*** (0.22)
Price of cooking oil | -0.0620 (0.06) | -0.1162* (0.06)
Price of condensed milk | -0.0076 (0.02) | -0.0009 (0.02)
Rural dummy | -0.1938 (0.26) | 0.2437 (0.27)
Rural dummy*price of rice | -0.2130 (0.24) | -0.4092 (0.26)
Number of health posts | 0.0339 (0.02) | 0.0035 (0.02)
Distance to health center | 0.1024 (0.006) | 0.0035 (0.008)
Electricity | 0.0038** (0.001) | 0.0017 (0.002)
Dummy for paved road | 0.0184 (0.07) | 0.2004** (0.09)
Male wage rate | 0.0590 (0.06) | -0.0515 (0.08)
Female wage rate | -0.0049 (0.04) | 0.0375 (0.05)
Observations | 5457 | 5457
Location fixed-effects | Yes | Yes

- Source: IFLS - 1993, 1997, and 2000; *** significant at 1%, ** significant at 5%, * significant at 10%
- Standard errors reported in parentheses are adjusted for clustering at the individual level
- In columns (1) and (2), log(PCE) is instrumented with household total assets
- Also included in the regressions are dummy variables capturing missing observations on mother’s schooling, father’s schooling, mother’s height, and father’s height, where the missing observation was imputed by the sample mean
- Prices of consumption goods and hourly wage rates are converted to real terms and expressed in logs

A pooling test on the joint sample of boys and girls gives an overall chi-square of 32.46 (p-value 0.05), which favors separating the sample of boys from that of girls and then estimating the static equation.
However, a chi-square test on all the right hand side variables except the age and gender interacted coefficients gives a statistic of 24.79 (p-value 0.16). This suggests that the determinants of child health vary between boys and girls only because of age and sex specific differences in the growth of height attainments, and not because of a gender differential impact of socioeconomic characteristics on child health. Hence, in this chapter the preferred estimates reported in table 2.4 pool the samples of boys and girls together, controlling for interactions between the male dummy and the age in months variables to capture the gender specific growth patterns in height attainments. In examining mortality rates, Kevane and Levine (2001) find no evidence of ‘missing girls’; that is, daughters are not likely to suffer from higher rates of mortality than sons. Also, Levine and Ames (2003) show that even in the aftermath of the crisis, girls did not fare worse than boys. Hence, we can conclude that there exists almost no evidence to suggest that the determinants of child health vary between boys and girls.

2.5 Conclusion

This chapter (1) examines the socioeconomic determinants of child health and (2) captures the gender specific determinants of nutritional status among children. To address these objectives, we construct a panel data set of children between 3 and 59 months of age in 1993 and follow them through the 1997 and 2000 waves of the Indonesian Family Life Survey. A static conditional health demand function is estimated to obtain the parameter estimates on the various child level, household level, and community level factors that determine nutritional status among children. The main results indicate that mother’s height and father’s height have an important role in determining child health. This confirms the role played by genetic endowments in determining health. Mother’s height continues to have a larger role in determining child health than father’s height.
We find little role of parental schooling in determining child health. Parental schooling could affect child health through a number of other variables like total household income and community resources. Hence, the lack of an independent impact of parental schooling on child health does not suggest there is no impact, only that the mechanism of the impact is potentially unknown. Household income has a strong positive effect on child health. Among the food prices, the price of oil is negatively and the price of rice positively related to child health. The community infrastructure variables, in particular the measures of electricity and paved road, both have a positive impact on child health. We do not find any gender-differential allocation of household and community resources in explaining child health. We find that Indonesia is not plagued with the problem of gender bias, as discussed earlier in the chapter as well.

The findings suggest that it is mother’s height, father’s height, log of real per capita consumption expenditure, prices of consumption goods, and measures of community infrastructure that are important for improving nutritional outcomes among children. The relationship of child health with community infrastructure variables and prices of consumption goods calls attention to programs and policies that focus on community-level infrastructure development and on regulating prices of essential consumption goods. The positive dependence between household income and child health can, under certain conditions, reflect households’ limited access to credit; hence, improving access to credit can also potentially improve children’s health outcomes.

Chapter 3
Is there complete, partial, or no recovery from childhood malnutrition? - empirical evidence from Indonesia

3.1 Introduction

Social scientists from diverse fields such as economics, nutrition, and epidemiology have come to agree that childhood malnutrition affects future well-being by decreasing the total human capital accumulated over an individual’s life course. [1] For example, Alderman et al. (2006) show, using data from Zimbabwe, that undernourishment at young ages lowers both attained height and completed grades of schooling measured in adolescence, of which the decline in educational outcomes is estimated to translate into a 14% reduction in lifetime earnings. [2] However, if individuals are able to recover from some of the deficits in health outcomes caused by nutritional deficiencies at young ages, then some of the negative consequences associated with poor nutrition can be mitigated. The main objective of this chapter is to identify the extent to which individuals are subsequently able to compensate for some of the poor nutritional outcomes from the past. It finds that malnutrition during childhood will cause only some permanent growth retardation in an individual’s physical well-being as measured by height attainments. This implies that at least some of the negative consequences associated with childhood malnutrition can be mitigated at an early age.

[1] See Glewwe and Miguel (2008) for a review of the role played by child health in determining schooling outcomes. See Strauss and Thomas (2008) for a recent review of the association between child health and future health status.
[2] Poor nutrition during childhood affects subsequent health status, thereby affecting future earnings. See, for example, Thomas and Strauss (1997) on the role played by adult height attainments in determining wage earnings, using data from Brazil.

This chapter uses height in cm [3] as indicator of nutritional status.
These measures are particularly advantageous as they have been identified as indicators of chronic malnutrition and long-run physical well-being. [4] In addition, these measures are not confounded by systematic measurement error in data. [5]

The existing literature classifies children with HAZ < -2 as undernourished and/or stunted [Waterlow (1988); Onis et al. (2000)]. Grantham-McGregor et al. (2007) report that as of 2004, over 155 million children suffered from stunting. Their study identifies stunting as one of the primary causes of poor cognitive development and schooling performance. They estimate that childhood stunting and poverty alone keep almost 200 million children from fulfilling their developmental potential. Stunting in young children remains a serious source of concern in developing countries, including Indonesia, as poor nutrition during childhood has a long-lasting impact on an individual’s overall well-being. Table 2.1 from chapter 2 indicates that in the year 2000, 34.8% of children (under age 5) from Indonesia suffered from chronic nutritional deficiencies resulting in stunting (source: Indonesian Family Life Survey (IFLS)). This number is large and comparable to many poor countries of the world (Onis et al., 2000). The degree to which this stunting actually causes severe retardation in the future physical well-being of these children from Indonesia is an empirical question, unknown to policy makers and researchers in the field.

[3] The information on heights is available to the closest 0.1 cm (1 mm).
[4] Waterlow (1988); Tanner (1981); Strauss and Thomas (1995); Martorell (1999); Martorell and Habicht (1986) have all discussed that height-related measures capture cumulative investments in child health. Height-related measures are affected only by long-term health shocks and nutritional deficiencies, such as vitamin A deficiency, and not by short-term illnesses such as diarrhea that lasts 2-3 days.
[5] As an example of systematic measurement error, Thomas and Frankenberg (2002) point out that men in general tend to report themselves as being taller than they actually are, and women tend to report themselves as being lighter than they are.

There exists a vast literature [6] that estimates the extent to which undernourishment at young ages affects subsequent health status [Adair (1999); Fedorov and Sahn (2005); Hoddinott and Kinsey (2001); Alderman et al. (2006)]. A major difficulty in estimating such a relationship comes from the presence of unobservables such as the child’s innate ability to fight diseases, parental preferences toward child health, and community connections. All these unobservables are likely to be correlated with an individual’s past nutritional status, thereby confounding the coefficient estimate on the variable of interest. In addition, random measurement error in anthropometric outcomes makes it difficult to obtain an unbiased estimate on the child’s past health status. Hoddinott and Kinsey (2001), Alderman et al. (2006), and Fedorov and Sahn (2005) are exceptions that have successfully addressed some of these econometric concerns.

The contribution of this chapter is twofold: (1) it ascertains the extent to which children in Indonesia are able to recover from some of the long-run deficits in health outcomes caused by early malnutrition; (2) it identifies the extent to which childhood nutrition affects an individual’s future physical well-being, relying on weaker stochastic assumptions compared to earlier work in the literature. Empirically, lagged childhood nutritional status from earlier periods is identified using time-varying community-level characteristics from the baseline year (1993) in a first-difference framework.

A panel data set is constructed using observations on children between the age of 3 and 59 months in 1993, who are followed through the 1997 and 2000 waves of the Indonesian Family Life Survey (IFLS).
A dynamic conditional health demand function is estimated to capture the extent of recovery, if any, from childhood malnutrition. The extent of recovery from poor nutrition is determined by the coefficient on the one-period lagged health status, also known as the ‘catch-up’ term. A coefficient of zero on the one-period lagged nutritional status in the dynamic function indicates ‘complete catch-up’. A coefficient of one on the one-period lagged health status indicates ‘no catch-up’. A coefficient between zero and one on the one-period lagged health status indicates ‘partial catch-up’ (Hoddinott and Kinsey, 2001). Finally, the chapter introduces an interaction term between the one-period lagged health status and lagged age in months in the dynamic specification to determine if, and to what extent, recovery from poor nutritional outcomes varies by age.

[6] See Section 3.2 of this chapter and Strauss and Thomas (2008) for a more detailed review.

The dynamic specification uses a first-difference GMM estimation strategy, which yields a coefficient estimate of 0.23 on the one-period lagged health status. A coefficient of 0.23 suggests partial catch-up effects; that is, malnutrition during childhood will cause only some permanent retardation in growth in height. Using the same first-difference GMM strategy, we find that younger children have marginally larger catch-up potential than older children.

The above findings suggest that by adolescence, a malnourished child in the absence of any catch-up, that is, with a coefficient of 1 on the lagged health status, would grow to be 4.15 cm shorter than a well-nourished child. However, in the presence of partial catch-up effects, such as the coefficient of 0.23 estimated here, a malnourished child will grow to be only 0.95 cm shorter than a well-nourished child. These results have further implications for schooling attainments.
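As a rough check on the height figures above, the two deficits appear consistent with scaling the no-catch-up deficit by the catch-up coefficient (0.23 × 4.15 ≈ 0.95). The sketch below only illustrates that arithmetic and the three catch-up cases; the function name `remaining_deficit_cm` is hypothetical, and the chapter's own predictions follow the methodology drawn from Alderman et al. (2006), which is not reproduced here.

```python
def remaining_deficit_cm(no_catch_up_deficit_cm: float, catch_up_coef: float) -> float:
    """Scale the deficit that would persist with no catch-up (coefficient 1)
    by the coefficient on the one-period lagged health status."""
    if not 0.0 <= catch_up_coef <= 1.0:
        raise ValueError("catch-up coefficient is expected to lie in [0, 1]")
    return catch_up_coef * no_catch_up_deficit_cm


# Deficit reported in the text for a coefficient of 1 (no catch-up).
NO_CATCH_UP_DEFICIT_CM = 4.15

cases = [
    ("complete catch-up", 0.0),
    ("partial catch-up (estimate reported in the text)", 0.23),
    ("no catch-up", 1.0),
]
for label, coef in cases:
    deficit = remaining_deficit_cm(NO_CATCH_UP_DEFICIT_CM, coef)
    print(f"{label}: coefficient {coef:.2f} -> {deficit:.2f} cm shorter")
```

Treating the deficit as proportional to the coefficient is an assumption made for this illustration; it is offered as a reading aid rather than as the calculation behind the reported numbers.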
For example, Maccini and Yang (2005) have examined the impact of improvements in health status, as measured by height in cm, on schooling attainments using data from the IFLS. Using their predictions, I find that the decline in stature by 0.95 cm here will result in individuals accumulating 0.60 fewer grades of schooling. This estimate would be four times larger if there were no catch-up; that is, childhood malnutrition would lower attained height in adolescence by 4.15 cm and schooling attainment by 2.4 completed grades of schooling. [7]

[7] The methodology used for calculating these predictions is drawn from Alderman et al. (2006).

This chapter contributes to the extant literature in two ways. First, the chapter overall contributes to the larger literature in economic development addressing concerns regarding child health outcomes. It establishes the relationship between current health status and lagged health status, bringing out the permanent effects of childhood malnutrition on an individual’s future physical well-being, which is further correlated with his/her overall economic and social well-being.

Second, the chapter addresses a number of methodological issues that in principle can be applied to any dynamic model. It identifies a range of IV/GMM estimation strategies that can be used to address the endogeneity issues (omitted variables and/or measurement error) and discusses how the estimation strategy adopted depends upon the main source of concern related to the endogeneity problem.
The first-difference GMM strategy adopted here (a) addresses biases arising from time-invariant child-specific (genetic ability), household-specific (parental preferences), and community-specific (political connections) unobservables that are likely to affect both current and lagged health status, (b) corrects for potential biases arising from random measurement error in anthropometric data, and (c) uses instruments that rely neither on lack of serial correlation in the error terms nor on lack of correlation between the instruments and the time-invariant unobservables (for example, genetic endowments) in the empirical specification. The chapter also contributes to the growing discussion on instrument relevance and uses test statistics and hypothesis tests to support the relevance of the instruments used in the first-difference GMM framework. Finally, the results obtained here are also robust to sample attrition, a common problem that arises with the use of longitudinal data.

The rest of the chapter is organized as follows. Section 3.2 provides a brief review of the related literature. Section 3.3 outlines the theoretical model specified to derive the dynamic conditional health demand function. The empirical specification and identification strategy used here are described in section 3.4. Survey instruments, sample composition, summary statistics, and attrition rates are provided in section 3.5. The main regression results are discussed in section 3.6. Concluding remarks follow in section 3.7.

3.2 Literature review

The definition of catch-up effects [8] varies significantly in the literature. Growth retardation and subsequent catching-up in health outcomes depend on whether the shocks that result in growth retardation are transitory or permanent. Transitory factors are likely to inhibit growth in short-run indicators of health outcomes such as weight and hemoglobin, whereas permanent shocks inhibit growth in height attainments.
The focus of this chapter is to investigate the extent of catch-up potential in the more long-run determinants of health, such as height.

[8] See Boersma and Wit (1997) for a range of possible definitions of ‘catch-up’ growth in health outcomes.

The term ‘catch-up’ here signifies the extent to which childhood malnutrition causes permanent retardation in the growth of future health status. ‘Complete catch-up’ implies that childhood malnutrition will not permanently lower the child’s future growth potential and that the child can potentially also follow a higher growth path compared to his/her genetically predetermined growth path. ‘No catch-up’ implies that a child, once classified as undernourished, will be permanently locked into a lower growth trajectory. ‘Partial catch-up’ implies that childhood malnutrition will cause some, but not severe, retardation in the child’s predetermined growth path.

As noted earlier, growth retardation in height attainments, particularly during childhood, if not recuperated at an earlier age, can significantly lower an individual’s total human capital accumulated, affecting his/her overall well-being. Hence, social scientists have attempted to examine the degree to which individuals can recover from some of the deficits in health outcomes caused by childhood malnutrition.

Different lines of inquiry are used to examine the relationship between health during childhood and future health status. The review of this literature begins with a discussion of the important INCAP (Institute of Nutrition of Central America and Panama) study, a nutrition supplementation program started during the late 1960s in rural Guatemala. The main finding of the INCAP study indicates that nutrition during pregnancy and the first few years of life improved health status during childhood and reduced stunting at age 3 [Martorell (1999); Martorell (1995); Habicht et al. (1995)].
The experimental design followed in the INCAP study not only shows that there exists catch-up potential in health outcomes but also suggests that nutritional interventions at early ages contribute towards improvements in child health. More recent work by Engle et al. (2007) identifies a range of development programmes, including nutritional supplementation and conditional cash transfers, as effective ways of ameliorating some of the long-term effects of stunting.

In the absence of an experimental design, Foster (1995), using data from Bangladesh, uses prior-period exogenous changes in weather outcomes to identify the changes in subsequent health, as measured by weight. The study finds that it is the better-off households that were able to reduce the impact of the weather shock (flood) on child health, and that access to credit is one of the important factors that enabled children to overcome some of the adverse economic conditions created by the flood. [9]

[9] See more on this literature in Strauss and Thomas (1998).

Some of the other studies in the literature have used longitudinal data to estimate dynamic models, which are used to identify the extent to which childhood malnutrition affects subsequent health status [Adair (1999); Hoddinott and Kinsey (2001); Fedorov and Sahn (2005); Alderman et al. (2006); Johnston and Macvean (1995)]. Among these, Adair (1999) and Johnston and Macvean (1995) fail to address attrition bias and omitted variables bias. In particular, lagged health status is not treated as endogenous. [10]

Three other closely related and methodologically sounder studies are Fedorov and Sahn (2005), Hoddinott and Kinsey (2001), and Alderman et al. (2006). These studies not only examine the actual extent to which catch-up exists but also employ estimation techniques that address econometric concerns such as attrition bias and endogeneity in the lagged dependent variable.
The three aforementioned papers estimate dynamic conditional health demand functions to capture the coefficient on the lagged dependent variable, that is, the catch-up term. Fedorov and Sahn (2005) specify a dynamic conditional child health demand function in levels, and Hoddinott and Kinsey (2001) and Alderman et al. (2006) use a child growth specification. [11]

[10] Johnston and Macvean (1995) use type of fuel used and number of electrical appliances as right hand side covariates, both of which are likely to be correlated with the household’s socio-economic status. Adair (1999) uses low birth weight, early menarche, and height in the baseline year, all of which are correlated with household- and individual-specific time-invariant unobservables. Almost 50% of the observations are lost to attrition over time [Johnston and Macvean (1995)]. Selection problems are magnified by running regressions on stunted and non-stunted children as classified from the baseline year [Johnston and Macvean (1995); Adair (1999)].
[11] The levels specification used by Fedorov and Sahn (2005) can be specified as:
$$H_{it} = \beta_0 + \beta_1 H_{it-1} + \sum_{j=1}^{R} \beta^X_j X_{jit} + \sum_{j=1}^{S} \beta^Z_j Z_{ji} + \epsilon_i + \epsilon_h + \epsilon_c + \epsilon_{it}$$
The growth specification given by Hoddinott and Kinsey (2001) and Alderman, Hoddinott and Kinsey (2006) can be written as:
$$H_{it} - H_{it-1} = \beta_0 + \beta_G H_{it-1} + \sum_{j=1}^{R} \beta^X_j X_{jit} + \sum_{j=1}^{S} \beta^Z_j Z_{ji} + \epsilon_i + \epsilon_h + \epsilon_m + \epsilon_{it}$$
Adding $H_{it-1}$ to both sides of the growth specification shows that the coefficient $\beta_1$ from a dynamic levels specification is equal to $1 + \beta_G$ from the growth specification. Here $\epsilon_i$ is the child-specific time-invariant unobservable, $\epsilon_h$ is the household-specific time-invariant unobservable, $\epsilon_c$ is the community-specific time-invariant unobservable, $\epsilon_m$ is the mother-specific time-invariant unobservable, and $\epsilon_{it}$ is the random time-varying i.i.d. term.

Fedorov and Sahn (2005) follow both the Arellano-Bond (1991) and Arellano-Bover (1995) type estimation strategies, yielding coefficient estimates of 0.19 and 0.21 on lagged height, respectively. Their results indicate reasonable catch-up potential. The main limitation of their paper is that the results rely on a very strong assumption, that is, lack of serial correlation in the error terms, which is not always satisfied in dynamic panel data models [Deaton (1997); Blundell and Bond (1998); Blundell et al. (2000)]. [12]

Hoddinott and Kinsey (2001) use both two-stage least squares (2SLS) and maternal fixed-effects estimation techniques. Their specification (in levels) yields coefficient estimates of 0.56 and 0.18, respectively, on the catch-up term, reflecting partial catch-up effects. The 2SLS method adopted in Hoddinott and Kinsey (2001) addresses problems arising from random measurement error but may not address omitted variable bias arising from the potential correlation between the instruments and the individual- and household-specific time-invariant unobservables. [13] The maternal fixed-effects estimation strategy adopted by Hoddinott and Kinsey (2001) addresses the omitted variable bias arising from household-specific time-invariant unobservables but does not address biases arising from child-specific time-invariant unobservables. [14] In addition, the estimation strategy adopted cannot address the measurement error bias completely.

Alderman et al. (2006) use a maternal fixed-effects instrumental variable (MFE-IV) estimation strategy, which results in a catch-up coefficient of 0.43 in levels, reflecting partial catch-up effects. They construct two child-specific shock variables as instruments for lagged health status. The first shock variable is calculated as the log of the number of days a child was living prior to 18 August 1980, capturing exposure to the civil war.
The second shock variable is a dummy variable taking the value 1 if the child was observed in 1983 and was between 12 and 24 months of age, or was observed in 1984 and was between 12 and 36 months of age, and 0 otherwise. These shock variables can be argued to be exogenous and hence can address biases coming from measurement error in data and from other household- and community-specific time-invariant unobservables, addressing almost all sources of omitted variables bias and measurement error bias. However, individual-specific time-invariant unobservables, such as the child’s innate ability to fight diseases, are treated as random. The individual-specific time-invariant unobservables, such as the child’s genetic ability to fight diseases and absorb nutrients, could potentially be correlated with the instruments used in the first-stage regressions (number of days the child was living prior to August 1980). Although the estimation strategy adopted by Alderman et al. (2006) addresses biases coming from the correlation between household-specific unobservables and the child’s lagged health status, individual-specific time-invariant unobservables remain a potential source of concern.

[12] It is shown later in the chapter, using a Hausman (1978) type specification test, that the assumption of zero first-order and second-order serial correlation in the error terms is in fact not valid for the data in hand and may not necessarily be valid for other papers with a short time dimension (say, less than 5 periods) as well.
[13] For example, birth weight (an instrument used in Hoddinott and Kinsey, 2001) can itself be endogenous on two accounts. One, children with higher birth weight reflect higher unobserved healthiness/innate ability, and birth weight is hence potentially correlated with other child-specific unobservables in the model. Two, birth weight is usually measured for births taking place in a health facility, reflecting the household’s socioeconomic status (Strauss and Thomas, 2008). This makes birth weight correlated with other household-specific unobservables in the model as well.
[14] See Rosenzweig and Wolpin (1988).

In addition, both Hoddinott and Kinsey (2001) and Alderman et al. (2006) estimate a growth specification, which is likely to magnify the measurement error problem associated with height attainments and biases the estimated coefficient on the lagged dependent variable towards -1, which is equivalent to 0 in the levels specification.

As discussed above, the three papers Fedorov and Sahn (2005), Hoddinott and Kinsey (2001), and Alderman et al. (2006) cannot completely address both omitted variable bias and measurement error bias in data. It is the ability of the first-difference GMM strategy used in this chapter to address both that makes it especially attractive.

3.3 Model

Parents make investments in their children’s health with the aim of improving the child’s overall well-being. Following Fedorov and Sahn (2005) and Strauss and Thomas (1998, 2008), health status in period t can be specified as a function of health inputs, environmental factors, individual demographic characteristics, household background characteristics, genetic endowments, time-varying health shocks, and time-invariant health endowments.

$$H_t = h(M_t, M_{t-1}, \ldots, M_0, I_t, I_{t-1}, \ldots, I_0, D_\sigma, \theta_{c\sigma}, \theta_c, \mu_{h\sigma}, \mu_h, G), \quad \sigma = 0, 1, \ldots, t \tag{3.1}$$

$H_t$ is current health status measured by height in cm. $M_t$ is the health input at time t, which includes food and non-food consumption goods used towards the maintenance and/or improvement of child health. It is assumed that households do not derive any direct utility from the consumption of health inputs except from their indirect use in the accumulation of child health output. $I_t$ characterizes the environment where the child lives, capturing infrastructure availability and the disease environment in the community.
$D_\sigma$ reflects all time-varying demographic characteristics such as the child’s age. $\theta_{c\sigma}$ includes all time-varying health shocks like fever and diarrhea. $\theta_c$ summarizes information about all time-invariant characteristics such as the child’s gender and time-invariant health endowments like the child’s innate ability to absorb nutrients and fight diseases. $\mu_{h\sigma}$ and $\mu_h$ capture household-specific time-varying and time-invariant demographics and background characteristics such as parents’ rearing and caring practices. $G$ summarizes information about all genetic endowments, capturing genotype [15] and phenotype [16] influences that affect child health.

[15] Genotype influences include genetic endowments that are passed from the parents to the child via their DNA.
[16] Phenotype influences capture all observable characteristics of an individual, such as shape, size, color, and behavior, that result from the interaction of genotype influences with the environment.

Following Strauss and Thomas (1992, 1995), the one-period lagged health status is assumed to be a sufficient statistic that captures the impact of all health inputs, environmental factors, and other time-varying characteristics starting from birth up until the last observed period in the sample. By making this assumption we can substitute the one-period lagged health status for all past periods’ determinants of child health in equation (3.1). [17] Redefining equation (3.1), the dynamic child health production function can be re-written as:

$$H_t = f(H_{t-1}, M_t, I_t, D_t, \theta_{ct}, \theta_c, \mu_{ht}, \mu_h, G) \tag{3.2}$$

where health status in the current period is a function of the one-period lagged health status, current period health inputs, environmental factors, demographics, genetic endowments, health shocks, and household characteristics. The optimal choice of health inputs is determined by the household’s utility maximization problem described below.
The household maximizes expected lifetime utility U (3.3), subject to a lifetime budget constraint (3.4), where assets at the end of period T must be equal to the difference between lifetime earnings and lifetime expenditure, and a period-specific dynamic child health production function (3.5). It is assumed that (a) the sub-utility functions ($u_t$) are concave and twice differentiable, and (b) the household can potentially borrow and/or lend against its future in each period t [Deaton and Muellbauer (1980); Fedorov and Sahn (2005); Strauss and Thomas (2008)].

[17] We acknowledge that this assumption is strong, but testing this assumption is beyond the scope of this chapter.

$$\max:\; U = E_t \sum_{t=0}^{T} \beta^t u_t[C_t, H_t, L_t; \theta_{pt}] \tag{3.3}$$

Subject to:

$$A_T = \Big(\prod_{t=0}^{T}(1+r_t)\Big)A_0 + \sum_{t=0}^{T}\Big(\prod_{\tau=t}^{T}(1+r_\tau)\Big)\big(w_t(T_t - L_t) + \pi_t - P^c_t C_t - P^m_t M_t\big) \tag{3.4}$$

$$H_t = f(H_{t-1}, M_t, I_t, D_t, \theta_{ct}, \theta_c, \mu_{ht}, \mu_h, G) \tag{3.5}$$

The sub-utility function ($u_t$) in each period depends upon consumption goods that include food and non-food consumption commodities, $C_t$; leisure, $L_t$; health status of the child, $H_t$; and certain unobserved preference shocks, $\theta_{pt}$. $\beta$ is the subjective discount factor, which captures household preferences for higher utility today vis-a-vis the future. $P^c_t$ is a vector of prices of food and non-food consumption goods. $P^m_t$ is a vector of prices of health inputs. $w_t$ is the wage rate (price of leisure). $T_t$ is the parents’ total time endowment and $A_0$ is the assets the household owns at the beginning of period 0. Profit income from farm and non-farm activities and all other sources of non-labor income is captured by $\pi_t$. $E_t$ is the expectations operator conditional on the information available at time t.
The first-order conditions for the above maximization problem with respect to $C_t$, $L_t$, and $M_t$ can be written as follows:

$$\partial U/\partial C_t = \beta^t u'(C_t, H_t, L_t; \theta_{pt}) - \lambda \prod_{\tau=t}^{T}(1+r_\tau) P^c_t = 0 \tag{3.6}$$

$$\partial U/\partial L_t = \beta^t u'(C_t, H_t, L_t; \theta_{pt}) - \lambda \prod_{\tau=t}^{T}(1+r_\tau) w_t = 0 \tag{3.7}$$

$$\begin{aligned}
\partial U/\partial M_t ={}& \beta^t u'(C_t, H_t, L_t; \theta_{pt})\,\partial H_t/\partial M_t - \lambda \prod_{\tau=t}^{T}(1+r_\tau) P^m_t \\
&+ E_t \beta^{t+1} u'(C_{t+1}, H_{t+1}, L_{t+1}; \theta_{pt+1})(\partial H_{t+1}/\partial H_t)(\partial H_t/\partial M_t) \\
&+ E_t \beta^{t+2} u'(C_{t+2}, H_{t+2}, L_{t+2}; \theta_{pt+2})(\partial H_{t+2}/\partial H_{t+1})(\partial H_{t+1}/\partial H_t)(\partial H_t/\partial M_t) \\
&+ \cdots + E_T \beta^{T} u'(C_T, H_T, L_T; \theta_{pT})(\partial H_T/\partial H_{T-1}) \cdots (\partial H_{t+1}/\partial H_t)(\partial H_t/\partial M_t) = 0
\end{aligned} \tag{3.8}$$

The solution to the above optimization problem provides us with the optimal amount of health input ($M^*_t$) [18] demanded by the household, which can be written as:

$$M^*_t = m(H_{t-1}, P^c_t, P^m_t, w_t, I_t, \lambda, D_t, \theta_{ct}, \theta_c, \mu_{ht}, \mu_h, G, \theta_{pt}, E_t(Z_{t+j})) \tag{3.9}$$

for $j = 1, 2, \ldots, T-t$ and $Z = (P^c_t, P^m_t, w_t, I_t, D_t, \theta_{ct}, \theta_c, \mu_{ht}, \mu_h, \theta_{pt}, G)$.

$M^*_t$ is a function of the one-period lagged health status, prices of consumption goods, prices of other health inputs, wage rates, environmental factors, marginal utility of wealth in period zero ($\lambda$), a set of time-varying and time-invariant child-level and household-level characteristics, and the household’s expectations at date t about all future periods’ prices, incomes, environmental characteristics, and demographic characteristics ($D_t$). Expectations about future periods’ prices of consumption goods and health inputs, wages, and all other factors are captured by the term Z.

[18] See Strauss and Thomas (2008) for a similar, yet even more general, model with a clear exposition of the solution method and assumptions needed to derive such a dynamic model.
The dynamic conditional health demand function (3.10) can be obtained by replacing $M_t$ in equation (3.5) by $M^*_t$ in equation (3.9):

$$H^*_t = h(H_{t-1}, P^c_t, P^m_t, w_t, I_t, \lambda, D_t, \theta_{ct}, \theta_c, \mu_{ht}, \mu_h, G, \theta_{pt}, E_t(Z_{t+j})) \tag{3.10}$$

for $j = 1, 2, \ldots, T-t$ and $Z = (P^c_t, P^m_t, w_t, I_t, D_t, \theta_{ct}, \theta_c, \mu_{ht}, \mu_h, \theta_{pt}, G)$.

The term Z can enter the dynamic conditional health demand function in an unrestricted manner. Additional assumptions necessary for estimating the above equation are described in the following section.

3.4 Empirical specification and identification

The dynamic conditional health demand function (3.10) estimated in this chapter can be written as follows:

$$H_{it} = \beta_0 + \beta_1 H_{it-1} + \sum_{j=1}^{R} \beta^X_j X_{jit} + \sum_{j=1}^{S} \beta^Z_j Z_{ji} + \epsilon_i + \epsilon_h + \epsilon_c + \epsilon_{it} \tag{3.11}$$

$H_{it}$ and $H_{it-1}$ are the child’s height measured in centimeters at time t and t-1 respectively, where subscript i refers to the individual. The Xs are time-varying regressors, which include the child’s age, household income, and community characteristics such as prices of food consumption goods, prices of health inputs, and community infrastructure variables. The Zs include time-invariant regressors such as parents’ completed grades of schooling and parental height in cm.

In the dynamic model, $\lambda$ is known as the marginal utility of wealth in period 0. $\lambda$ is a function of both retrospective information (period 0 to period t-1) and prospective information (period t+1 to period T) on prices, incomes, child characteristics, and household characteristics that enter the demand function through the lifetime budget constraint. Empirically, treating the marginal utility of wealth as a constant would be a strong assumption since it relies on the existence of complete markets, an assumption that is not likely to hold in a developing country setting where most households may be credit constrained. In addition, the household cannot perfectly control for future wealth and changes in the environment where it lives, making $\lambda$ stochastic.
In order to allow $\lambda$ to reflect some of these time-varying changes and dynamics, we use the household’s access to resources in the long run, as measured by the lag of the log of the household’s real per capita consumption expenditure [lag log(PCE)], as an additional control variable on the right hand side.

$E_t(Z_{t+j})$ (from equation 3.10) empirically enters either through the time-invariant household-specific unobservables ($\epsilon_h$) or the time-varying i.i.d. term ($\epsilon_{it}$) given in equation (3.11). Whether the sequence of factors that affect current health status through the term Z enters the empirical specification via $\epsilon_h$ and/or $\epsilon_{it}$ depends upon whether the household assumes some of these expectations to be time-invariant or not. However, we do need to assume that the impact of $E_t(Z_{t+j})$ on $H^*_t$ enters the dynamic reduced form conditional health demand function only additively.

There are four sources of unobservables in the dynamic specification (equation 3.11): $\epsilon_i$, $\epsilon_h$, $\epsilon_c$, and $\epsilon_{it}$. $\epsilon_i$ captures the time-invariant individual-specific unobservables such as the child’s inherent healthiness, which affects his or her ability to absorb nutrients and fight diseases. $\epsilon_h$ captures all time-invariant household-specific unobservables reflecting parental preferences toward child health and the parents’ time discount rate. $\epsilon_c$ captures all time-invariant community-specific unobservables like community endowments and political associations/connections. $\epsilon_{it}$ includes child-specific time-varying unobservables such as expected future health shocks, current health shocks, and expected future prices of consumption goods and health inputs, wage rates, and other household characteristics, some of which are unknown to the child and all of which are unknown to the econometrician at date t.
The condition of zero correlation between the error term and the explanatory variables may never be satisfied once the lagged dependent variable is included on the right hand side [Deaton (1997); Blundell and Bond (1998); Wooldridge (2002)]. Hence, with $H_{it-1}$ endogenous, the standard OLS estimate of $\beta_1$ is likely to be biased and inconsistent. The sources of endogeneity in $H_{it-1}$ deserve careful explanation.

The one-period lagged health status, $H_{it-1}$, is likely to be correlated with time-invariant individual-specific unobservables like the child's ability to fight diseases, which creates an upward bias in the estimated coefficient on the one-period lagged health status, $\beta_1$. The one-period lagged health status is also likely to be positively correlated with time-invariant household-specific unobservables like parental preferences towards child health and the time discount rate, again creating an upward bias in the estimate of $\beta_1$. Parents could also invest more in children who had lower health status in the last period, biasing the estimate of $\beta_1$ downwards. Time-invariant community-specific unobservables, like the political connections of a community, are also likely to be positively correlated with the lagged dependent variable, creating an upward bias in the estimate of $\beta_1$. At the same time, pro-poor policies at the community level can bias the estimate of $\beta_1$ downwards. In addition, the estimate of $\beta_1$ is likely to be biased downwards, towards zero, due to the presence of classical measurement error in height attainments. Given these different sources of potential bias in $H_{it-1}$, it is difficult to assign the net direction of bias in the estimated coefficient on the one-period lagged health status, $\beta_1$.
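The two opposing forces described above can be illustrated with a small simulation. This is only a sketch on made-up data, not IFLS data: the sample size, the true coefficient of 0.5, and all variable names are assumptions chosen for illustration. It shows that a time-invariant unobservable correlated with lagged height pushes the OLS slope upwards, while classical measurement error in lagged height pulls it towards zero.

```python
import numpy as np

def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    x_c = x - x.mean()
    return float(x_c @ (y - y.mean()) / (x_c @ x_c))

rng = np.random.default_rng(0)
n, beta1 = 50_000, 0.5   # hypothetical sample size and true coefficient

# Case 1: a time-invariant child effect enters both lagged and current height.
eps_i = rng.normal(size=n)
h_lag = eps_i + rng.normal(size=n)
h_now = beta1 * h_lag + eps_i + rng.normal(size=n)
b_omitted = ols_slope(h_lag, h_now)      # biased upwards

# Case 2: no child effect, but lagged height is observed with classical error.
h_lag_true = rng.normal(size=n)
h_now2 = beta1 * h_lag_true + rng.normal(size=n)
h_lag_obs = h_lag_true + rng.normal(size=n)
b_meas = ols_slope(h_lag_obs, h_now2)    # attenuated towards zero

print(f"true beta1 = {beta1}")
print(f"OLS with omitted child effect: {b_omitted:.2f}")
print(f"OLS with measurement error:    {b_meas:.2f}")
```

In this stylized setup the first estimate lands well above 0.5 and the second well below it, so when both problems are present the sign of the net bias depends on their relative strength.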
However, one can broadly classify the main sources of endogeneity in the estimated coefficient on the one-period lagged health status as omitted variables and/or random measurement error in data.

It is empirically difficult to correct for both omitted variable bias and measurement error in data. This chapter discusses variants of the IV/GMM estimation strategies that can be used to address omitted variables bias and/or random measurement error in data.

The first IV strategy followed here is a simple two-stage least-squares (2SLS) estimator with province fixed-effects, where the dynamic levels specification (3.11) is estimated using two-period lagged (1993) community characteristics as instruments for lagged height, under the assumption that the community characteristics are exogenous and that the time-invariant individual-specific, location-specific, and household-specific unobservables are random. The 2SLS estimation strategy addresses random measurement error bias, as lagged community characteristics (from 1993) are likely to be uncorrelated with the measurement error in height attainments. However, one cannot rule out correlation between the time-invariant unobservables ($\epsilon_i$, $\epsilon_c$, and $\epsilon_h$) and the instruments used, and hence the estimated coefficient on $H_{it-1}$ is likely to be biased. The direction (upwards/downwards) of the bias depends upon the correlation between the time-invariant unobservables and lagged height.

A simple solution for removing all sources of unobserved heterogeneity ($\epsilon_i$, $\epsilon_c$, and $\epsilon_h$) would be to estimate the dynamic specification (equation 3.11) in first-differences. The advantage of first-differencing is that it removes all time-invariant unobservables from the estimation equation, thereby taking care of one of the potential sources of endogeneity in $\beta_1$, omitted variable bias.
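As a sketch of what first-differencing does to equation (3.11), assuming the coefficients are constant across periods and writing $\Delta$ for the change between two consecutive survey waves, subtracting the equation for period $t-1$ from the equation for period $t$ gives:

$$\Delta H_{it} = \beta_1 \Delta H_{it-1} + \sum_{j=1}^{R} \beta^X_j \Delta X_{jit} + \Delta\epsilon_{it}, \qquad \Delta H_{it} = H_{it} - H_{it-1}, \quad \Delta\epsilon_{it} = \epsilon_{it} - \epsilon_{it-1}$$

The intercept $\beta_0$, the time-invariant regressors $Z_{ji}$, and the unobservables $\epsilon_i$, $\epsilon_h$, and $\epsilon_c$ all cancel because they take the same value in both periods.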
The disadvantage of first-differencing is that it removes much of the potential variation among the right hand side variables. First-differencing alone cannot address biases coming from the correlation between $\Delta H_{it-1}$ and $\Delta\epsilon_{it}$, which stems from the presence of serial correlation in the error terms of a dynamic panel data model and is still to be addressed. Random measurement error in height attainments also remains a source of concern. A simple first-difference method in the presence of random measurement error will create an even larger downward bias in the estimated coefficient on the first-differenced lagged height (see Griliches and Hausman, 1986).

The second estimation strategy adopted here follows an Arellano-Bond (1991) framework, where the first-difference in lagged height is instrumented with community characteristics from 1993 and height from 1993, maintaining the assumptions of no serial correlation in the error terms and exogeneity of the community characteristics. However, in section 3.6.3, I show that the assumption of no serial correlation in the error terms is not satisfied, and hence an Arellano-Bond (1991) type estimator does not generate an unbiased and consistent parameter estimate on the catch-up term.

Other variants of the GMM estimation strategy, like the Arellano-Bover (1995) and the System GMM (Blundell and Bond, 1998) estimators, can potentially address both sources of endogeneity - omitted variables and measurement error. However, these two estimators also rely on the absence of serial correlation in the error terms, which is not satisfied in this chapter, and hence cannot be used to obtain an unbiased and consistent parameter estimate on the catch-up term. [19]
[19] See Blundell et al. (2000) for an outline of the additional restrictions needed for obtaining an unbiased and consistent coefficient on the lagged dependent variable using the Arellano-Bover (1995) and System GMM estimators.

Third, the preferred first-difference GMM strategy adopted here uses only community characteristics from 1993 and their interactions with the child's age and the mother's schooling to identify the changes in height between 1997 and 1993 (first-differenced lagged height). The first-difference GMM strategy relies on the assumption of exogeneity of the community characteristics conditional upon first-differences, which in turn provides us with an unbiased and consistent coefficient estimate on the catch-up term.

Two-period lagged (1993) community characteristics, like the number of health posts in a community and other measures of community infrastructure, are used to identify the changes in height between 1997 and 1993.

Health posts, locally known as posyandus, are located in almost all communities in Indonesia. These posyandus are community-sponsored sub-village health posts which provide basic maternal and child health care to neighborhood groups. They are primarily targeted towards meeting the health care needs of younger children between 0 and 5 years of age, who are most vulnerable to health shocks. Health posts provide immunization services, oral rehydration solution packets, and vitamin supplements on a monthly basis. The health posts also provide food supplements to young children. Health posts in a community actively contribute towards meeting the health care needs of children, and hence the number of health posts present in a community during 1993 can be used as a good identifying variable to explain the subsequent changes in child health between 1993 and 1997. [20]
[20] Frankenberg et al. (2005) show how access to better community infrastructure can improve children's nutritional outcomes. The statistical relevance of the instruments used is discussed in section 3.6.2.

Additionally, interactions between mother's schooling and the number of health posts in 1993, and interactions between the child's age in months in 1993 and the number of health posts in 1993, capture the age-specific and mother-specific returns to the availability of health posts in the community. Electricity in the community reflects infrastructure availability and the disease environment, both of which affect subsequent changes in child height. Taken together, these instruments capture access to preventive measures of health and, to some extent, curative measures of health, both of which affect subsequent changes in child height. Recall that under the assumption that the community characteristics are exogenous, all the above mentioned instruments are valid for identifying the subsequent changes in height attainments among young children. [21]

[21] 1993 measures of all community characteristics can potentially be used as instruments to identify the changes in health status between 1997 and 1993 (first-differenced lagged height). However, there is a severe weak instrument problem associated with using all the community characteristics from 1993 to identify the changes in lagged health status between 1997 and 1993.

So far the potential pros and cons of following the different IV/GMM estimation strategies have been discussed. The results section outlines the actual coefficient estimates on the lagged dependent variable and the direction of bias in the estimated coefficient on the catch-up term. This chapter attempts to choose the estimator that addresses both omitted variables bias and measurement error bias.

3.5 Data and variables

3.5.1 Indonesian Family Life Survey

The data used in this chapter come from the 1993, 1997 and 2000 waves of the Indonesian Family Life Survey (IFLS), a large-scale socio-economic survey conducted in Indonesia. The IFLS collects extensive information at the individual, the household, and the community level. The survey includes modules on measures of health, household composition, labor and non-labor income, farm and non-farm assets, pregnancy, schooling, consumption expenditure, contraceptive use, sibling information, and immunization [see Frankenberg et al. (1995, 2000) and Strauss et al. (2004) for more details on sample selection and survey instruments].

The IFLS is an ongoing longitudinal survey, the first wave of which was fielded during late 1993 and early 1994 (IFLS1). In IFLS1, 7224 households were interviewed. The first follow-up wave was surveyed during the second half of 1997 (IFLS2), just before the major economic and financial crisis in Indonesia. In IFLS2, 7629 households were interviewed, of which 6752 were original IFLS1 households and 877 were split-off households. The third wave (IFLS2+) was a special follow-up survey fielded in late 1998. A 25% sub-sample of the original IFLS1 households was contacted in late 1998 with the aim of analyzing the immediate impact of the 1997-98 economic and financial crisis. The fourth wave of the IFLS was fielded in 2000 (IFLS3). A total of 10435 households were interviewed in 2000. Of these, 6661 were original IFLS1 households and 3774 were split-off households.

The sample surveyed in 1993-94 represented 83% of the Indonesian population living in 13 of Indonesia's 27 provinces at the time. The 13 provinces are spread across the islands of Java, Bali, Kalimantan, Sumatra, West Nusa Tenggara, and Sulawesi. Provinces were selected to maximize representation of the population, capture the cultural and socio-economic diversity of Indonesia, and yet be cost-effective given the size and the terrain of the country. A total of 321 enumeration areas (EAs)/communities were selected from these 13 provinces for final survey purposes.
The IFLS is unique in a number of ways. (1) It links individual, household and community level data, bringing together an enormous amount of information that enables us to better understand the impact of household characteristics on individual level observables, controlling for community infrastructure availability. (2) The IFLS interviews members from different age groups (0-14 years interviewed by proxy, 15-49 years, and 50 years and older), capturing the overall demographic composition of a household. (3) Few other surveys collect health related measures; in particular, height in centimeters is not commonly collected in household surveys. (4) The IFLS is particularly useful for estimating a dynamic panel data model, as estimating such a model requires data from at least two time periods and a large number of exogenous variables that can be used as potential instruments to address the endogeneity of the lagged dependent variable. (5) The IFLS data quality is excellent, as numerous checks were done at the field level and at the data entry level. For example, the IFLS provides best guessed age in years, year of birth, month of birth, and day of birth for all panel and new respondents from all three waves of the survey. Numerous variables are double-checked across waves and across books within the same wave to provide correct information to the user.

Other data descriptions, including details on the location indicators used in this chapter, are the same as described in chapter 2.

3.5.2 Attrition rates

Sample attrition primarily occurs at two levels - individual level and household level. Attrition at the individual level occurs when an individual from the original wave either cannot be followed in the subsequent waves, or information on the dependent variable is missing due to measurement error in data or due to other restrictions imposed by the author.
Attrition can be a problem only if, firstly, observable factors that result in attrition are correlated with the error term in the specification of interest (3.11), and secondly, if unobservables in the attrition equation are correlated with the unobservables in the empirical specification of interest (Fitzgerald et al., 1998). This section provides details on household level and individual level attrition rates using the IFLS and addresses concerns regarding attrition bias.

In IFLS1, 7224 households were interviewed. In IFLS2, 94.3% of all original IFLS1 households were re-contacted. In IFLS3, 94.8% of "target" households (original IFLS1 households, split-offs from IFLS2 and IFLS2+) were re-contacted (Strauss et al., 2004). The follow-up surveys were only designed to target the original IFLS1 households and any split-offs thereof in the subsequent years. Household level attrition rates have declined during the follow-up survey periods. The IFLS follows households that move out of the community in which they were interviewed in the baseline year, keeping household level attrition low [see Thomas et al. (2001) for more details on sample attrition in IFLS]. Details about attrition rates at the individual level are provided below.

From IFLS1, complete information on age in months, sex, and height in cm is available for 2203 children between the age of 3 and 59 months. Of these 2203 children, 1966 were followed in 1997, and 2051 of the original sample were re-contacted in 2000. A total of 1819 children between the age of 3 and 59 months in 1993 can be followed through the 1997 and 2000 waves of the IFLS - this sample excludes observations deleted due to measurement error in height attainments or age in months. Relative to the 2203 children in the 1993 baseline, attrition was 10.76% in 1997 and 6.90% in 2000. Re-contact rates were much lower in 1997 than in 2000. [22]
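The two individual level attrition rates can be reproduced from the counts given above. The short check below is only an arithmetic sketch with hypothetical variable names; it shows that both percentages are measured against the 1993 baseline of 2203 children.

```python
baseline_1993 = 2203      # children aged 3-59 months with complete data in IFLS1
followed_1997 = 1966      # of these, followed in 1997
recontacted_2000 = 2051   # of these, re-contacted in 2000

def attrition_rate(baseline, found):
    """Share of the baseline sample that was not found, in percent."""
    return 100.0 * (baseline - found) / baseline

rate_1997 = attrition_rate(baseline_1993, followed_1997)
rate_2000 = attrition_rate(baseline_1993, recontacted_2000)

print(f"1997: {rate_1997:.2f}%")   # 1997: 10.76%
print(f"2000: {rate_2000:.2f}%")   # 2000: 6.90%
```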
A simple mean test on the difference in height attainments between attriters and those who were followed through all three waves of the survey gives a difference of 0.59 cm with a standard error of 0.67. This difference is not statistically significant, which suggests that attrition is not likely to be related to initial period health status and is more likely to be random. [23] In addition, individual level attrition is not a real concern in this chapter, given the estimation strategy adopted here. First-differencing removes all potential sources of time-invariant unobservables, like the child's genetic endowments, which are likely to be correlated with the observables or unobservables that result in attrition and would otherwise create attrition bias.

[22] In analyzing household level attrition rates, Thomas et al. (2001) also find that attrition rates are higher between 1993 and 1997 as compared to 1997 and 1998. They attribute this decline in the attrition rate to learning by doing in running a large-scale household level survey.

[23] Additionally, a linear probability model of attrition is estimated, where the dependent variable, attrition, is defined equal to 1 if the individual can be followed through the 1993, 1997 and 2000 waves of the IFLS, and zero otherwise. The right hand side regressors include height-for-age z-score, mother's schooling, father's schooling, mother's height, father's height, gender, age in months, a measure of household income, mother's age, father's age, a rural dummy, and location indicators. All the right hand side regressors belong to the baseline survey year, 1993. The coefficient on HAZ from 1993 is 0.002 with a standard error of 0.004, indicating an insignificant impact in determining attrition. Among the other regressors mentioned above, apart from the location indicators, only the rural dummy has a significant impact on attrition. Children residing in rural areas in the baseline year are more likely to be followed than children residing in urban areas. This is similar to the findings of Thomas et al. (2001), who find that household level attrition rates are higher in urban areas than in rural areas. In summary, the OLS estimates verify that attrition is unrelated to endogenous observables like the child's health status from 1993 and the measure of household income. Hence the parameter estimates reported in this chapter are not likely to be confounded by selection issues. See table B.2 in the appendix for complete results of the attrition regression.

In the presence of a first-difference estimation strategy, the only possible remaining source of attrition is that arising from the presence of random health shocks, such as infectious diseases that may affect health status in 1993. But these health shocks from 1993 are also likely to be uncorrelated with the health shocks in 1997 and/or 2000. Hence, attrition arising from the existence of random, time-varying health shocks is not likely to contaminate the parameter estimate on the lagged dependent variable.

3.5.3 Sample size, variables, and descriptive statistics

Martorell and Habicht (1986) and Satyanarayana et al. (1980) point out that the decline in growth in height during the first few years of life largely determines the small stature exhibited by adults in developing countries. In addition, height measured at young ages is also strongly correlated with attained body size as an adult [Spurr (1988), Martorell (1995)]. Hence, in this chapter the initial sample is restricted to children less than 5 years of age in 1993. [24] The sample is restricted to include children who are less than 12 years of age in 2000 in order to keep the child health production function time-invariant for the complete sample here. [25] One would naturally worry about attrition and sample selection related concerns arising from such restrictions.
However, because the initial sample includes children who are between the age of 3 and 59 months in 1993, by 2000 over 99% of the sample is still under 144 months of age (12 years), thereby addressing any selection related concerns. The final sample includes 1819 children for whom there exist complete anthropometric details from all three waves of the survey.

[24] Although some amount of catch-up growth occurs during adolescence, it is not sufficient to overcome the initial loss in the growth in height (Martorell, 1999). Additionally, the catch-up potential in adolescence is limited by maturation. Early maturation also hinders catch-up potential. Almost all children mature somewhere between 11 and 14 years, thereby restricting growth potential. Hence, catch-up growth estimated using the sample of children less than 12 years reflects a possible lower bound on the extent of actual long-run catch-up possible by the time the child is an adult and stature becomes predetermined for life. However, at the same time, maturation during adolescence suggests that this catch-up coefficient is not likely to be a lot smaller than the true lifetime catch-up estimate.

[25] The child health production function varies between young children and teenagers going through pubescent growth spurts (Waterlow, 1988).

The outcome variable of interest in this chapter is height in centimeters. Height in centimeters is used as the dependent variable in estimating a dynamic conditional health demand function as specified by equation (3.11). Figure 3.1 shows that z-scores flatten out by 48 months of age.

[Figure 3.1: Lowess plot of height-for-age z-score against age in months for all panel children]

Also, the majority of children in the dynamic specification are older than 48 months, by which age z-scores flatten out, leaving little scope for any dynamics. [26]
However, height attained in centimeters is not only a long-run indicator of health status but also captures the dynamic effects in health outcomes. Figure 3.2 highlights the relationship between height in centimeters and age in months, depicting continuous changes in height attainments.

[26] It is growth faltering at young ages among children from developing countries that results in the decline in the z-scores. Most of this growth faltering occurs due to poor nutrition and diseases. See Shrimpton et al. (2001) for more discussion on growth faltering in young children.

[Figure 3.2: Lowess plot of height in cm against age in months for all panel children, shown separately for males and females]

The right hand side variables in the regression estimates include the age of the child, a male dummy, the male dummy interacted with age in months, the logarithm of real per capita household consumption expenditure, mother's height in centimeters, father's height in centimeters, mother's completed grades of schooling, and father's completed grades of schooling. The regression estimates also include a series of location level time-varying characteristics: an indicator for whether the individual lives in a rural area, log of real price of rice, log of real price of condensed milk, log of real price of cooking oil, distance to health center in km, a dummy for the presence of a paved road, percentage of households with electricity, log of real hourly male wage rates, log of real hourly female wage rates, and number of health posts in a community. Information on the age of the child, gender, and per capita consumption expenditure is obtained from the household questionnaires.
Table 3.1: Mean height attained in 2000 for all panel children between the age of 3 and 59 months in 1993

                        Male (966)    Female (853)    Difference
Stunted (739)           121.35        122.06          -0.71
                        (0.36)        (0.42)          (0.55)
Non-Stunted (1080)      126.00        125.87          0.12
                        (0.46)        (0.37)          (0.59)

- Source: IFLS - 1993, 1997, and 2000
- Children with HAZ<-2 in 1993 were classified as stunted
- Children with HAZ>=-2 in 1993 were classified as non-stunted
- Standard errors in parentheses
*** significant at 1%, ** significant at 5%, * significant at 10%

Table 3.1 depicts the relationship between levels of stunting during childhood (as measured in 1993) and height attained in centimeters during later stages of life (as measured in 2000). Male children initially classified as stunted in 1993 are 4.65 cm shorter in 2000 than their male counterparts who did not show evidence of long-run malnutrition during childhood. Similarly, female children initially classified as stunted in 1993 are 3.81 cm shorter than their female counterparts who did not suffer from malnutrition during childhood. There is no evidence of gender differences in height attainments among stunted or non-stunted children. The pattern of no gender differentials is also found in another important aspect of human capital accumulation, education, as measured by primary school enrollment rates (Deolalikar, 1993). Also, in examining mortality rates, Kevane and Levine (2001) find no evidence of "missing girls", i.e., daughters are not likely to suffer from higher rates of mortality than sons. Levine and Ames (2003) show that even in the aftermath of the crisis, girls did not fare worse than boys. Most of the literature from Indonesia suggests that there is no evidence of gender bias in favor of male children.
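The height gaps and the gender differences discussed for Table 3.1 can be recomputed from the reported means and standard errors. The check below is a minimal sketch: the dictionary layout and function name are made up for illustration, and the standard error formula assumes that the male and female subsamples are independent.

```python
import math

# Mean height in 2000 (cm) and its standard error, copied from Table 3.1.
table_3_1 = {
    "stunted":     {"male": (121.35, 0.36), "female": (122.06, 0.42)},
    "non_stunted": {"male": (126.00, 0.46), "female": (125.87, 0.37)},
}

def diff_and_se(group_a, group_b):
    """Difference in means and its standard error for two independent groups."""
    (mean_a, se_a), (mean_b, se_b) = group_a, group_b
    return mean_a - mean_b, math.hypot(se_a, se_b)

# Stunting gap within each sex (non-stunted minus stunted).
gap_male = table_3_1["non_stunted"]["male"][0] - table_3_1["stunted"]["male"][0]
gap_female = table_3_1["non_stunted"]["female"][0] - table_3_1["stunted"]["female"][0]

# Gender difference (male minus female) within each stunting group.
d_stunted, se_stunted = diff_and_se(table_3_1["stunted"]["male"],
                                    table_3_1["stunted"]["female"])
d_non, se_non = diff_and_se(table_3_1["non_stunted"]["male"],
                            table_3_1["non_stunted"]["female"])

print(f"stunting gap: boys {gap_male:.2f} cm, girls {gap_female:.2f} cm")
print(f"gender difference, stunted:     {d_stunted:.2f} (se {se_stunted:.2f})")
print(f"gender difference, non-stunted: {d_non:.2f} (se {se_non:.2f})")
```

Both gender differences are less than 1.96 standard errors from zero, which matches the absence of significance stars in the table. The non-stunted difference comes out as 0.13 rather than the reported 0.12, presumably because the table rounds the underlying means.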
56 In this chapter, pooling tests on gender in the first-differenced dynamic instrument variable specification gives an overall chi-square of 55.74 (0.00), which favors separat- ing the sample for boys from girls and then estimating the first-difference equation. However, a chi-square test on all right hand side variables except the age and gen- der interacted coefficients is 10.61 (0.64). This suggests that the differences in height between boys and girls occurs only due to the age and sex specific differences in growth of height attainments and not due to differential catch-up effects between boys and girls 27 or any other socioeconomic characteristics. Hence, in this chapter only coef- ficient estimates from the pooled regressions are reported controlling for interactions between the male dummy and age in months variables to capture the gender specific growth patterns in height attainments. Table 3.2 gives information on the mean and standard deviation of all variables used in the regression specification. Table 3.2: Summary statistics of all variables used in the empirical spec- ification Variables Mean Std. 
dev Height-for-age z-score (HAZ) -1.68 1.30 Height in cm 105.86 19.42 Mother’s height in cm 150.5 5.1 Father’s height in cm 161.3 5.3 Mother’s completed grades of schooling 5.96 3.93 Father’s completed grades of schooling 6.90 4.33 Log of real per capita household consumption expenditure 9.87 0.76 Square root of real per capita household productive assets 1.51 2.60 Square root of real per capita household total assets 4.48 3.78 Distance to the community health center in km 5.07 4.58 Percentage of households with electricity 76.68 26.92 Log of real male wage rate 6.55 0.52 Log of real female wage rate 6.19 0.84 Log of real price of rice 0.85 0.20 Log of real price of condensed milk 5.17 1.51 Log of real price of cooking oil 1.74 0.43 27 A chi-square on the interaction between the first-differenced lagged height and the male dummy from the pooled first-difference GMM specification is 1.00 (0.31) which indicates that there are no gender differential catch-up effects in health outcomes. 57 Table 3.2: Continued Variables Mean Std. dev Dummy=1 if the community has paved road 0.74 0.43 Number of health posts in a community 6.67 4.73 - Source: IFLS - 1993, 1997, and 2000; No. of observations - 5457 3.6 Results 3.6.1 Catch-up effects - complete, partial, or none? The results from estimating a dynamic conditional health demand function using variants of the IV/GMM estimation strategy are reported in table 3.3. OLS estimate on the one-period lagged height is 0.53 (see column 1, table 3.3), this indicates less than partial catch-up in attained height. The OLS estimate is likely to be biased and inconsistent as it suffers from omitted variable bias and measurement error bias - as previously discussed in section 3.4. The coefficient estimate on the one-period lagged height using a simple 2SLS esti- mation strategy is 0.82 (column 2, table 3.3), which is even larger than the OLS parame- ter estimate. 
The 2SLS estimation strategy uses community characteristics from 1993 as instruments for the lagged dependent variable, addressing the downward bias in the catch-up term caused by random measurement error. 58 Table 3.3: Dynamic health demand function Covariates (1) OLS (2) Two-Stage (3) OLS (4) Arellano-Bond (5) F-D (6) F-D (7) F-D Height least-square F-D (first- Height GMM GMM GMM Height difference) without Height Height Height IV’s Height Lagged height or 0.5311*** 0.8241*** -0.1820*** -0.0714* 0.2339* 0.2375* 0.2833** catch-up coefficient (0.02) (0.21) (0.03) (0.03) (0.13) (0.13) (0.11) Male dummy 9.5169*** 3.6775 (3.39) (26.44) Lag age in months 0.4554*** 0.0210 0.4172*** 0.4044*** 0.4290*** 0.4276*** 0.4260*** (0.03) (0.21) (0.13) (0.03) (0.03) (0.03) (0.03) Lag age in months -0.1533*** -0.0106 -0.1806*** -0.1725*** -0.1717*** -0.1692*** -0.1714*** *male dummy (0.05) (0.32) (0.04) (0.04) (0.04) (0.04) (0.05) Duration 0.7941*** 0.2092 0.1737** 0.2312*** 0.4950*** 0.4937*** 0.5059*** (0.06) (0.53) (0.08) (0.08) (0.13) (0.13) (0.12) Duration*male dummy -0.1926** -0.0058 -0.1583 -0.1709 -0.1846 -0.1788 -0.1885 (0.07) (0.77) (0.10) (0.11) (0.12) (0.12) (0.12) Duration*lag age -0.0075*** 0.0018 0.0021*** 0.0009 -0.0036* -0.0036* -0.0044** in months (0.0009) (0.007) (0.0007) (0.0008) (0.002) (0.002) (0.001) Duration*lag age in 0.0030** -0.0014 0.0042*** 0.0038 0.0037*** 0.0037*** 0.0037*** months*male dummy (0.001) (0.009) (0.0008) (0.0008) (0.0009) (0.0009) (0.001) Mother’s height 0.1799*** 0.1211** (0.01) (0.05) Father’s height 0.1340*** 0.0796* (0.01) (0.04) Mother’s schooling 0.0204 -0.0017 (0.02) (0.03) Father’s schooling 0.0215 -0.0027 (0.02) (0.02) Lagged log(PCE) 0.5257*** 0.2479 -0.0114 0.1155 0.2240* 0.2167 (0.12) (0.16) (0.11) (0.11) (0.13) (0.15) Lagged household 0.0237 59 Table 3.3: Continued Covariates (1)OLS (2) Two-Stage (3) OLS F-D (4) Arellano-Bond (5) F-D (6) F-D (7) F-D least-square without IV’s GMM GMM GMM assets (0.03) Price of rice 0.9522 
0.8412 -0.4455 -0.5379 -0.1291 -0.0420 (1.19) (1.06) (0.64) (0.65) (0.74) (0.74) Price of cooking oil -0.2679 -0.2029 0.0924 0.0377 -0.0398 -0.0693 (0.28) (0.31) (0.16) (0.16) (0.19) (0.19) Price of condensed 0.0460 -0.0772 -0.0385 -0.0028 -0.0233 -0.0167 milk (0.09) (0.13) (0.06) (0.06) (0.07) (0.07) Rural dummy -0.2300 1.4347 -0.8436 -1.1385 0.0323 0.0063 (1.23) (0.99) (1.08) (1.10) (1.31) (1.31) Rural dummy*price of -0.7123 -2.3382** 0.2103 0.3830 -0.1465 -0.1216 rice (1.26) (1.20) (0.80) (0.79) (0.91) (0.91) Number of health posts -0.1115 0.0415 -0.0101 -0.0097 -0.00016 0.0004 (0.07) (0.02) (0.01) (0.01) (0.02) (0.01) Distance to health center -0.0332 0.0424** 0.0132 0.0149 -0.0094 -0.0095 (0.02) (0.02) (0.01) (0.01) (0.02) (0.02) Electricity -0.0043 -0.0027 -0.0070 -0.0048 -0.0017 -0.0017 (0.007) (0.005) (0.005) (0.005) (0.005) (0.005) Dummy for paved road -0.0195 0.2091 -0.0349 0.0065 -0.0375 -0.0492 (0.31) (0.25) (0.25) (0.25) (0.27) (0.27) Male wage rate 0.4226 0.3253 -0.1311 -0.0947 0.0169 0.0231 (0.29) (0.31) (0.16) (0.16) (0.21) (0.21) Female wage rate 0.2958 -0.0735 0.0432 0.1568 0.1495 0.1444 (0.19) (0.15) (0.12) (0.11) (0.13) (0.12) observations 5457 3638 1819 1819 1819 1819 1819 Location Yes No No No No No No fixed-effects Province No Yes No No No No No fixed-effects F statistic 3.46 31.90 3.06 3.14 17.54 on the excluded (0.03) (0.00) (0.01) (0.01) (0.00) 60 Table 3.3: Continued (1)OLS (2) Two-Stage (3) OLS F-D (4) Arellano-Bond (5) F-D (6) F-D (7) F-D least-square without IV’s GMM GMM GMM instruments from the first-stage regressions Hansen J statistic 0.019 9.86 2.31 2.12 2.69 (0.88) (0.04) (0.51) (0.54) (0.26) Difference on the 0.30 catch-up coefficients (0.12) between specification 4 and specification 5 Difference on the 0.30 catch-up coefficients (0.12) between specifications 4 and 6 Difference on the -0.29 first-differenced lagged (0.42) log(PCE) obtained using the Hausman specification, to test the orthogonality of the first-differenced lagged 
log(PCE) in specification (5) C statistic testing the 6.54 orthogonality of height (0.01) in 1993 used as instrument for specification (4) C statistic testing the 0.13 orthogonality of the (0.71) first-differenced lagged log(PCE) in specification (5) 61 Table 3.3: Continued - Source: IFLS - 1993, 1997, and 2000; Two-period lagged corresponds to information from the year 1993; *** significant at 1%, ** significant at 5%, * significant at 10% - In column (1), standard errors (reported in parenthesis) are robust to clustering at the individual level - In columns (2)-(7), standard errors (reported in parenthesis) are robust to clustering at the community level - In (2), Instruments (IV’s) used - two-period lagged measure of prevalence of electricity in the community, two-period lagged dummy=1 if the road in the community is paved. - In (4), IV’s used - two-period lagged measure of electricity in the community, two-period lagged no. of health posts in the community, two-period lagged no. of health posts interacted with two-period lagged age in months, two-period lagged no. of health posts interacted with mother’s schooling, and two-period lagged height in cm - In (5), IV’s used - two-period lagged measure of electricity in the community, two-period lagged no. of health posts in the community, two-period lagged no. of health posts interacted with two-period lagged age in months, and two-period lagged no. of health posts interacted with mother’s schooling - In (6), IV’s used - same IV’s as in specification 5 - In (7), IV’s used - two-period lagged log(PCE), two-period lagged no. of health posts interacted with mother’s schooling, and two-period lagged no. of health posts interacted with two-period lagged age in months - Also included in the regressions are dummy variables capturing missing observations for each of the following variables - mothers schooling, fathers schooling, mothers height and fathers height, where the missing observation was imputed by the sample mean. 
- p-values are reported for the F statistic on the excluded instruments and the Hansen J statistic.

But may not address biases arising from the correlation between time-invariant unobservables (ε_i, ε_c, and ε_h) and lagged height and hence, the parameter estimate obtained on the catch-up term using this strategy continues to be biased and inconsistent. The coefficient estimate on the catch-up term reported in column 3, table 3.3 is -0.18 and is biased downwards as compared to the OLS estimate, 0.53 (column 1, table 3.3). An OLS method applied to a first-difference specification creates an even larger downward bias compared to an OLS method applied to a levels specification, magnifying the measurement error problem (see Griliches and Hausman, 1986 for a discussion on this).

The Arellano-Bond (1991) type first-difference GMM strategy uses community characteristics from 1993 and height in cm from 1993 as instruments for the first-differenced one-period lagged height. The coefficient estimate on the first-differenced lagged height for this specification, reported in column 4, table 3.3, is -0.07. The Arellano-Bond (1991) strategy does not address measurement error bias due to the correlation between the time-varying error terms and the two-period lagged height in the instrument set. Results from a Hausman (1978) type specification test, as reported in section 3.6.3, show that the assumption of lack of serial correlation in the time-varying error terms is not valid for this chapter and hence the Arellano-Bond (1991) estimation strategy will also produce a biased and inconsistent coefficient estimate on the catch-up term.

The first-difference GMM specification uses community characteristics from 1993 as instruments for the first-differenced one-period lagged height. This results in a coefficient estimate of 0.23 (column 5, table 3.3) on the catch-up term.
The coefficient on the catch-up term from the first-difference GMM specification indicates larger catch-up effects compared to the coefficient estimate reported in the OLS specification, suggesting an upward bias in the OLS parameter estimate of the catch-up term. The catch-up term of 0.23 indicates more than partial catch-up in height attainments, that is, children with less than average height in 1993 will not continue to obtain less than average height attainments in 2000. This indicates that malnutrition during childhood is not likely to lock these children into lower health status, as measured by height in centimeters, in the future. The catch-up coefficient obtained from the first-difference GMM strategy is our preferred estimate of the catch-up term, as it addresses both omitted variables bias (via first-differencing) and measurement error bias (via instrumental-variable techniques) in the data.

In column 6, table 3.3, an alternate measure of the household’s long-run resource availability is used, where one-period lagged assets (productive and non-productive assets included) replace the one-period lagged log(PCE). This specification verifies the robustness of the catch-up estimate, i.e., whether the use of the two different measures of household resource availability alters the coefficient estimate on the catch-up term.28 The coefficient estimate on the catch-up term reported in column 6, table 3.3 is 0.23 and uses a first-difference GMM strategy with the same instruments as those used in column 5, table 3.3. The coefficient estimates reported on the first-differenced lagged height in columns 5 and 6 of table 3.3 are statistically different from both zero and the ordinary least squares parameter estimate.29 In addition, the coefficient estimates on the catch-up term obtained from columns 5 and 6 of table 3.3 are not statistically different from each other, which suggests that the coefficient estimates on the first-differenced lagged height in columns 5 and 6 of table 3.3 are robust to the variables used to capture the household’s long-run resource availability.

28 Even if we were to treat lagged log(PCE) as endogenous in the first-difference specification, the estimated coefficient on the catch-up term remains unchanged. For instance, I re-estimate the specification from column 5, table 3.3, now treating the first-differenced lagged log(PCE) as endogenous and using two-period lagged log(PCE) as an additional instrument. This results in a coefficient estimate of 0.23 on the catch-up term, which is statistically significant at 5% and not different from the catch-up coefficient obtained in column 5, table 3.3, where the first-differenced lagged log(PCE) is treated as exogenous. Results from a Hausman type specification test of the exogeneity of the first-differenced lagged log(PCE) are reported in column 5, table 3.3. The test results suggest that the null of exogeneity of the first-differenced lagged log(PCE) cannot be rejected.

29 The chi-square statistics testing whether the first-difference GMM coefficient differs from the OLS coefficient estimate are 4.82 (0.02) for the estimates reported in column 5, table 3.3 and 4.69 (0.03) for the estimates reported in column 6, table 3.3, with p-values in parentheses.

If I were to assume that complete markets exist, that is, households can freely borrow and lend in each period, this would imply that no measure of household resource availability should appear on the right hand side of the dynamic empirical specification.
Estimating the dynamic specification using a first-difference GMM estimation strategy that drops lagged log(PCE) (our measure of the household’s access to credit) from the right hand side (RHS), with the same instruments as in column 5, table 3.3, yields a coefficient estimate of 0.24 on the catch-up term, which is statistically significant at 10%.

The catch-up coefficient reported in column 7, table 3.3 is 0.28, indicating partial catch-up effects. This estimate is obtained following a first-difference GMM estimation strategy where the first-differenced lagged height is identified using community characteristics from 1993 and log of real PCE from 1993 as instruments, maintaining the same stochastic assumption as in columns 5 and 6. The dynamic specification estimated in column 7 replaces the community time-varying observables with community interacted time dummies in a first-difference framework. Further, the catch-up estimate in column 7 is not statistically different from those obtained in columns 5 and 6.

I now compare the coefficient estimate on the catch-up term obtained in this chapter with some of the earlier literature. Hoddinott and Kinsey (2001) find a catch-up coefficient of 0.56 using data on children from Zimbabwe. Fedorov and Sahn (2005) report a coefficient of 0.19 on the catch-up term using data on children from Russia. Alderman et al. (2006) estimate a catch-up coefficient of 0.43, again using data on children from Zimbabwe. Children from Russia exhibit higher levels of catch-up potential as compared to children from Zimbabwe, since a smaller coefficient implies greater catch-up. Children from Indonesia too exhibit higher levels of catch-up potential compared to children from Zimbabwe.

3.6.2 Test and discussion of weak instruments for the dynamic panel specification

The preferred IV estimates reported here are also robust to an important econometric concern - instrument validity.
An instrument is defined to be valid only if it satisfies the following two conditions - (1) the excluded instruments must be strongly correlated with the endogenous regressor and (2) the instrument must be uncorrelated with the error term in the second stage regression. In the presence of weak correlation between the instruments and the endogenous regressors, the IV estimates reported here are likely to suffer from a higher bias and inconsistency compared to the bias obtained on the OLS parameter estimate (Blundell, 2005). It is hence important to verify that the IV estimates reported here satisfy the two above mentioned conditions.

Stock et al. (2002) and Staiger and Stock (1997) have discussed some test statistics that can be used to test the relevance of the instruments used in an IV estimation framework. Stock et al. (2002) and Stock and Yogo (2005) define an instrument to be weak based on two criteria. The first is based on the relative two-stage least squares (TSLS) bias: the instrument is deemed to be strong if the Cragg-Donald F statistic is large enough that the TSLS bias relative to the OLS bias is at most, say, x% (5, 10, or 15, depending on the extent of bias the author wants to allow). The second criterion is based on size, i.e., the instruments are defined to be strong if the Cragg-Donald F statistic is large enough that a 5% hypothesis test is rejected no more than, say, x% of the time; otherwise the instruments are weak. The Cragg-Donald F statistic is, however, based on the assumption of lack of first-order and second-order serial correlation in the error terms, which is not valid in the current setting, and hence the Cragg-Donald F statistic is not an appropriate test statistic for the dynamic panel data model estimated in this chapter.

The bias in an IV coefficient estimate relative to an OLS estimate can also be approximated with the inverse of the F statistic on the excluded instruments obtained from the first-stage regressions (Murray, 2006).
Based on the above definition of relative bias, the larger the F, the smaller the relative bias from following an IV strategy compared to an OLS estimation approach. If F = 1, the bias in 2SLS is approximately equal to the bias in the OLS estimates. If F < 1, then the bias in 2SLS is even larger than the bias in the OLS estimate. Staiger and Stock (1997) suggest a simple rule of thumb to test for instrument relevance. They suggest that in the presence of a single endogenous regressor, instruments are deemed to be weak if the first-stage F statistic on the excluded instruments is less than 10. However, the number 10 itself is quite arbitrary in its choice. In general, weak instruments cause two problems: (1) they bring the bias in the 2SLS/IV estimate close to, or even above, the bias in the OLS estimate, and (2) they reduce the standard errors in IV estimates, thereby producing incorrect inferences.

Since there does not exist a precise test statistic to check the relevance of the instruments used in the first-difference GMM estimates reported in columns 5 and 6 of table 3.3, a combination of factors jointly helps to support that the most plausible coefficient estimate on the catch-up term is close to 0.23 and is statistically different from both zero and the OLS parameter estimate. The first-stage F statistics reported in columns 5 and 6 of table 3.3 are 3.06 and 3.14 respectively. These F statistics, if compared to the Staiger and Stock (1997) rule of thumb, would identify the instruments as weak. However, using a different set of lagged community characteristics to identify the exogenous variation in the first-differences in lagged height, while maintaining the same stochastic assumptions as for the estimates reported in columns 5 and 6 of table 3.3, gives a coefficient estimate of 0.25 on the catch-up term with a first-stage F statistic of 8.03, which is closer to 10. This clearly indicates no problem of weak instruments.
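As a rough illustration of the inverse-F approximation discussed above, the following sketch computes 1/F for the first-stage F statistics quoted in this section (3.06, 3.14, and 8.03) and flags whether each falls below the Staiger and Stock (1997) threshold of 10. The function name and dictionary labels are hypothetical, and the approximation is only a rule of thumb, not a formal test.

```python
def approx_relative_bias(first_stage_f: float) -> float:
    """Approximate the bias of IV relative to OLS as the inverse of the first-stage F."""
    if first_stage_f <= 0:
        raise ValueError("F statistic must be positive")
    return 1.0 / first_stage_f


# First-stage F statistics quoted in the text (labels are illustrative)
f_stats = {
    "column 5": 3.06,
    "column 6": 3.14,
    "alternative instrument set": 8.03,
}

for label, f_value in f_stats.items():
    below_ten = f_value < 10  # Staiger and Stock (1997) rule of thumb
    bias = approx_relative_bias(f_value)
    print(f"{label}: F = {f_value:.2f}, approx. relative bias = {bias:.2f}, F < 10: {below_ten}")
```

Under this approximation a larger F mechanically implies a smaller relative bias, which is why the alternative instrument set with F = 8.03 is the more reassuring of the three.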
The standard weak instrument problem does not seem to apply to this case, since neither the significance of the parameter estimates nor their magnitude changes in the presence of a smaller first-stage F statistic.

In addition to the test of strong correlation between the endogenous regressor and the instrument, it must also be the case that the instrument is uncorrelated with the error term in the second stage regression. The Hansen J statistic (1996) of 2.31 with a p-value of 0.51 (column 5, table 3.3) and 2.12 with a p-value of 0.54 (column 6, table 3.3) suggests that we cannot reject the null of instrument validity for the instruments specified in columns 5 and 6 of table 3.3. The Hansen J statistic and the first-stage F test statistic on the excluded instruments are appended at the end of the regression tables.

The two conditions of instrument validity discussed in this section provide additional support for the reliability of the preferred estimates obtained using the first-difference GMM strategy.

3.6.3 A test of serial correlation in the error terms

In this section an attempt is made to determine whether or not there is serial correlation in the error terms of a dynamic panel model. An Arellano-Bond (1991) estimation strategy may not be suitable for the dynamic specification because of the presence of serial correlation in the time-varying error terms; however, this must be tested. A Hausman (1978) type test is applied to the Arellano-Bond (1991) and the first-difference GMM strategies specified in columns 5 and 6 of table 3.3. Under the null that there is no serial correlation in the error terms, the Arellano-Bond (1991) strategy must yield consistent and efficient parameter estimates on the first-differenced lagged height.
However, if this assumption fails, then the alternative first-difference GMM estimate (the preferred estimate of this chapter) must be chosen, which is consistent and efficient under the alternative but not under the null.

The first-difference GMM estimator (column 5, table 3.3) is tested against the Arellano-Bond (1991) estimator (column 4, table 3.3), where two-period lagged height is used as an instrument for the first-difference in lagged height in addition to all the instruments specified in the first-difference GMM specification reported in column 5 of table 3.3. The estimated difference in the catch-up coefficients is 0.30 (standard error 0.12), rejecting the null. The coefficient estimates on the first-differenced lagged height are statistically significant and different under the two estimation strategies, suggesting that the null of zero first-order and second-order serial correlation in the error terms is rejected. This section provides additional support in favor of the first-difference GMM strategy as the preferred estimation strategy in a dynamic model, especially where serial correlation between the error terms is inevitable.30

30 Apart from testing our preferred first-difference GMM estimation strategy against the Arellano-Bond (1991) estimator, I also test the preferred first-difference GMM strategy against the two-stage least squares estimate specified in the levels equation, as reported in column 2, table 3.3, and the simple first-difference estimation strategy, as reported in column 3, table 3.3. First, I use a Hausman specification test to compare the two-stage least squares estimate reported in column 2, table 3.3 against our preferred first-difference GMM estimate reported in column 5, table 3.3. Under the null that the community time-invariant unobservables are random, the coefficient estimates reported in column 2, table 3.3 are both consistent and efficient. However, under the alternative the coefficient estimates reported in column 2, table 3.3 are inconsistent and the coefficient estimates reported in column 5, table 3.3 are both consistent and efficient. The difference in the catch-up coefficients between column 2 and column 5 of table 3.3 is 0.60 with a standard error of 0.12, rejecting the null that the community-specific time-invariant unobservables are random. Second, I use a Hausman specification test to compare the first-difference estimation strategy reported in column 3, table 3.3 against our preferred first-difference GMM estimation strategy reported in column 5, table 3.3. Under the null that there is no serial correlation in the error terms and no measurement error problem in height attainments, the first-difference estimates reported in column 3, table 3.3 are both consistent and efficient. However, under the alternative the coefficient estimates reported in column 5, table 3.3 are both consistent and efficient. The Hausman specification test comparing the coefficient estimates on lagged height from column 3 and column 5 of table 3.3 yields a difference of 0.41 in the catch-up coefficients with a standard error of 0.13, rejecting the null of no serial correlation in the error terms and no measurement error in height attainments. All this provides further support for the first-difference GMM estimation strategy as the preferred estimation strategy used in this chapter.

3.6.4 Role of child, household, and community characteristics in the dynamic conditional health demand function

Table 3.3 reports coefficient estimates from the regression of the dynamic conditional child health demand function specified in equation (3.11).
Column 1 in table 3.3 reports coefficient estimates from a simple OLS estimation strategy. The preferred first-difference GMM estimates are reported in column 5, table 3.3. The coefficient on lag age in months from column 5, table 3.3 captures the positive relationship between age in months and attained height. The interaction term between lag age in months and the male dummy suggests that with age, improvements in height are slightly higher among females. This is similar to the patterns found in the static regression results.

In addition to the age and sex variables included as controls on the right hand side, duration, i.e., the length of the period measured in months between two consecutive survey rounds, controls for the uneven gap between the three survey rounds (1993, 1997, and 2000). For every additional month between survey rounds, there is a 0.49 centimeter increase in attained height between 1997 and 2000 (column 5, table 3.3). The coefficient on the interaction of lag age in months and duration captures the age-differential growth patterns in height. The longer the duration between survey rounds, the slower the changes in height attainments among older cohorts. The interaction terms between lag age in months and both duration and the male dummy capture the age- and sex-differential patterns in growth of height attainments. The longer the duration and the older the child, the larger the growth in height for male children relative to their female counterparts. Child characteristics capture the biological process of growth in height that differs by age and sex. The coefficient estimates on the child characteristics are largely consistent with those found in the literature.

Another household characteristic included in the regression estimates is the one-period lagged household consumption expenditure. Regression estimates from table 3.3 show that the one-period lagged log(PCE) in the dynamic function has a positive effect on current health status.
The coefficient on lagged log(PCE) is 0.51 (column 1, table 3.3) in the OLS specification, indicating a large positive impact of income on current health even after controlling for the one-period lagged health status. The coefficient on lagged log(PCE) in the first-difference GMM specification falls to 0.22 (column 5, table 3.3), indicating a possible upward bias in the OLS coefficient estimate on lagged log(PCE) reported in column 1, table 3.3, resulting from the correlation between time-invariant household-specific unobservables and lagged log(PCE). Income and child health exhibit a strong positive and significant relationship.

Community characteristics play an important role in determining child health outcomes in static models. Little is known about their influence in dynamic settings. Fedorov and Sahn (2005) report coefficient estimates on a series of community characteristics from the estimation of a static and dynamic conditional child health demand function and find that community characteristics have a larger role to play in determining current health in dynamic settings. In this chapter, by contrast, there is no impact of these community characteristics in the dynamic specifications, especially for the preferred estimates reported in column 5, table 3.3. After controlling for the one-period lagged health status, the effect of past community characteristics in determining current health largely diminishes. First-differencing removes all time-invariant variation among the right hand side regressors, and the additional instrumenting of the first-difference specification results in a loss of over-time variation in the right hand side variables. Both these factors explain the small role played by health inputs in determining current health status in the dynamic specification.

3.6.5 Do catch-up effects differ with age?
It is usually hypothesized that younger children will experience larger catch-up effects as compared to older children [Martorell and Habicht (1986); Habicht et al. (1995)]. For example, Schroeder et al. (1995) and Habicht et al. (1995) show that the nutritional intervention program in rural Guatemala had the most significant impact on improving the stature of children less than 3 years of age. This chapter attempts to find similar support by adding an interaction term between the one-period lagged health status and lag age in months in the dynamic specification. A positive and significant coefficient estimate on the interaction term will indicate lower catch-up potential among older children. However, adding the interaction term in the empirical specification increases the endogeneity problem. Columns 1 and 2 of table 3.4 report coefficient estimates on the one-period lagged health status and the interaction term between lagged health status and lag age in months using OLS and first-difference GMM estimation strategies.
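The interaction specification implies an age-specific catch-up coefficient: the coefficient on lagged height plus the interaction coefficient multiplied by lag age in months. The sketch below evaluates this sum using the first-difference GMM point estimates reported in column 2 of table 3.4 (0.2408 on lagged height and 0.0010 on the interaction). The function name and the grid of ages are illustrative assumptions, the calculation ignores sampling uncertainty, and it is not necessarily the exact construction behind figure 3.3.

```python
BETA_LAGGED_HEIGHT = 0.2408  # coefficient on lagged height, column 2 of table 3.4
BETA_INTERACTION = 0.0010    # coefficient on lagged height * lag age in months


def catch_up_coefficient(lag_age_months: float) -> float:
    """Marginal effect of lagged height on current height at a given lag age."""
    return BETA_LAGGED_HEIGHT + BETA_INTERACTION * lag_age_months


# Illustrative grid of ages in months (an assumption, not taken from the data)
for age in (6, 12, 24, 36, 48, 59):
    print(f"lag age {age:2d} months: catch-up coefficient = {catch_up_coefficient(age):.4f}")
```

A larger value means lower catch-up, so the positive interaction term translates into only slightly lower catch-up potential for older children over this age range.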
Table 3.4: Dynamic health demand function with additional interaction terms

Covariates                                  (1) OLS              (2) First-difference GMM
                                            Height               Height, preferred estimates
Lagged height                               0.3675*** (0.02)     0.2408** (0.09)
Lagged height*lag age in months             0.0033*** (0.0003)   0.0010* (0.0006)
Male dummy                                  9.4866**** (3.35)
Lag age in months                           -0.3206*** (0.09)    0.1884 (0.15)
Lag age in months*male dummy                -0.1564**** (0.05)   -0.1660*** (0.05)
Duration                                    0.1948** (0.09)      0.3501** (0.13)
Duration*male dummy                         -0.1890** (0.07)     -0.1633 (0.12)
Duration*lag age in months                  0.0036** (0.001)     -0.0009 (0.002)
Duration*lag age in months*male dummy       0.0030** (0.001)     0.0037*** (0.001)
Mother’s height                             0.1708*** (0.01)
Father’s height                             0.1260*** (0.01)
Mother’s schooling                          0.0170 (0.02)
Father’s schooling                          0.0208 (0.02)
Lagged log(PCE)                             0.5345*** (0.12)     0.2074 (0.14)
Price of rice                               1.5788 (1.25)        0.1964 (0.79)
Price of cooking oil                        -0.2241 (0.27)       -0.0165 (0.20)
Price of condensed milk                     0.0721 (0.09)        -0.0096 (0.07)
Rural dummy                                 0.4186 (1.19)        0.2002 (1.39)
Rural dummy*price of rice                   -1.1380 (1.24)       -0.4095 (0.97)
Number of health posts                      -0.1235 (0.06)       -0.0021 (0.02)
Distance to health center                   -0.0318 (0.02)       -0.0175 (0.02)
Electricity                                 -0.0032 (0.007)      -0.0018 (0.006)
Dummy for paved road                        0.0331 (0.31)        -0.0359 (0.28)
Male wage rate                              0.3254 (0.29)        0.0341 (0.23)
Female wage rate                            0.2717 (0.19)        0.1259 (0.13)
Observations                                5457                 1819
Location fixed-effects                      Yes                  No
F statistic on the excluded instruments                          19.69 (0.00)
  from the first-stage regressions
Hansen J statistic                                               0.42 (0.81)
C statistic testing the orthogonality of                         0.24 (0.61)
  the two-period lagged log(PCE) in
  specification (2)

- Source: IFLS - 1993, 1997, and 2000
- *** significant at 1%, ** significant at 5%, * significant at 10%
- In (1), robust standard errors adjusted for clustering at the individual level are reported in parentheses
- In (2), instruments used - two-period lagged log(PCE), two-period lagged number of health posts in the community, two-period lag age in months, two-period lag age in months interacted with two-period lagged no. of health posts in the community.
- Also included in the OLS regression are dummy variables capturing missing observations for each of the following variables - mother’s schooling, father’s schooling, mother’s height, and father’s height, where the missing observation was imputed by the sample mean.
- P-values are reported for the F statistic on the excluded instruments and the Hansen J statistic.
- The F statistic on the excluded instruments for lagged height*lag age in months is 64.50
- Prices of consumption goods and hourly wage rates are converted to real terms and expressed in logs
- Two-period lagged corresponds to information from the year 1993

The first-difference GMM estimates reported in column 2, table 3.4 indicate a coefficient of 0.0010 on the interaction term, indicating age-differential catch-up effects, i.e., older children experience lower catch-up as compared to younger children. The F statistics on the excluded instruments support instrument relevance and are appended at the end of table 3.4. The Hansen J test also does not reject the null of zero correlation between the error term and the instrument set. Figure 3.3 plots the catch-up effects against age in months based on the regression estimates from column 2, table 3.4. Figure 3.3 indicates that there exist only some age-differential catch-up effects, with younger children exhibiting only marginally higher catch-up potential than older children.

Figure 3.3: Catch-up effects (vertical axis: catch-up effects; horizontal axis: age in months as of 1993)

3.6.6 Further implications

Stunting during early childhood has long-term effects on an individual’s future economic and social well-being. This chapter captures the extent to which stunting in childhood manifests into poor health status in the future.
In the absence of strong causal effects between childhood malnutrition and subsequent health status, some of the negative consequences associated with childhood malnutrition can be mitigated. In this chapter, I find that childhood malnutrition causes some but not significant growth retardation in an individual’s future physical well-being as measured by height attainments. I find that a malnourished child, in the absence of any catch-up potential, would by adolescence grow to be 4.15 cm shorter than a well-nourished child. However, in the presence of partial catch-up effects, i.e., a coefficient of 0.23 as estimated in this chapter, a malnourished child will by adolescence grow to be only 0.95 cm shorter than a well-nourished child. This recovery from childhood stunting also has an impact on an individual’s schooling attainments and other socioeconomic characteristics. For example, Maccini and Yang (2005) examine the association between adult height attainments and schooling attainments using data from the IFLS. Using their estimates of the causal effects of adult height attainments on schooling attainments, combined with the methodology outlined in Alderman et al. (2006), I compute the extent to which the presence of partial catch-up effects affects schooling attainments. I find that a malnourished child, in the presence of partial catch-up effects (0.23) as predicted in this chapter, is by adolescence likely to complete 0.6 fewer grades of schooling compared to a well-nourished child from the same population. In the absence of any catch-up potential this estimate is likely to be four times larger.

3.7 Conclusion

In view of the ever growing concern among development economists for child health, this chapter identifies the extent to which childhood malnutrition affects subsequent health status.
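The height gaps quoted in section 3.6.6 are consistent with scaling the no-catch-up gap of 4.15 cm by the catch-up coefficient of 0.23. The short sketch below reproduces that arithmetic. The names are illustrative, and the simple linear scaling is an assumption inferred from the reported numbers rather than a restatement of the full methodology.

```python
GAP_NO_CATCH_UP_CM = 4.15    # height gap by adolescence without any catch-up (from the text)
CATCH_UP_COEFFICIENT = 0.23  # preferred first-difference GMM estimate


def remaining_gap_cm(initial_gap_cm: float, coefficient: float) -> float:
    """Height gap that persists when a share `coefficient` of the deficit carries over."""
    return coefficient * initial_gap_cm


gap = remaining_gap_cm(GAP_NO_CATCH_UP_CM, CATCH_UP_COEFFICIENT)
print(f"Remaining gap with partial catch-up: {gap:.2f} cm")
print(f"No-catch-up gap relative to remaining gap: {GAP_NO_CATCH_UP_CM / gap:.1f} times")
```

The ratio of the two gaps is the inverse of the catch-up coefficient, which is in line with the statement that outcomes without any catch-up would be roughly four times larger.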
A dynamic conditional health demand function is estimated where the coefficient on the lagged dependent variable captures the extent of recovery, if any, from childhood malnutrition. A coefficient of 0.23 on the one-period lagged health status indicates reasonable catch-up in height attainments. Recall from the introduction that, in the presence of partial catch-up potential, a malnourished child will by adolescence grow to be 0.95 cm shorter than a well-nourished child. In the absence of any catch-up, a malnourished child will by adolescence grow to be 4.15 cm shorter than a well-nourished child. Using the coefficient estimates reported in Maccini and Yang (2005) on the impact of height on various socioeconomic outcomes, I calculate that a decline in stature of 0.95 cm lowers schooling attainments by 0.6 grades of schooling. There is only some evidence showing that catch-up effects are marginally higher among younger children than older cohorts.

From a practical standpoint, the presence of partial catch-up effects and age-differential catch-up effects suggests that continued efforts must be made on the part of households and policy makers towards improving children’s nutritional status at all ages. However, special emphasis must be on younger age groups, as their catch-up potential is still the highest.

It is important that policy prescription is drawn from good empirical work. The first-difference GMM estimation strategy used here relies on much weaker stochastic assumptions than earlier work and addresses omitted variable bias and measurement error bias in the data. The results reported here are, in addition, robust to econometric concerns such as sample attrition and weak instruments.

This chapter and other papers from the earlier literature can be criticized due to the presence of potential regression to the mean effects [Cameron et al. (2005), and Coly et al. (2006)].
In this chapter, I have mitigated some of this problem by addressing issues related to measurement error and sample selection in the data (Cameron et al., 2005). However, the individual-specific time-varying growth spurts in stature also result in regression to the mean effects. Therefore the presence of regression to the mean effects can never be completely ruled out.

To summarize, this chapter uses a dynamic framework to outline the determinants of child health. The dynamic results indicate that there exists catch-up potential in health outcomes, that is, children who suffer from chronic malnutrition during childhood are not likely to remain undernourished forever. The presence of catch-up potential suggests that focused attempts must be made towards improving nutritional outcomes of children at all ages, with special emphasis on the very young.

Chapter 4
Determinants of schooling outcomes among children from rural Ethiopia

4.1 Introduction

Investments in schooling are positively associated with higher economic and non-economic gains in the future for the individual, the household, and at the aggregate level for the economy [see Psacharopoulos and Patrinos (2004) for a recent review]. In the past, economists have identified labor market returns, parental education, household income, and school characteristics as some key determinants of schooling outcomes.1 For instance, Behrman and Knowles (1999) find that a child from Vietnam who belonged to a household whose average income was one standard deviation above the mean, on average, completed 2.2 additional years of schooling and scored 7% higher on examinations. Glewwe and Jacoby (1994) find that in Ghana, adding a library increased math test scores by 1.2 standard deviations.
The existing literature mostly uses cross-sectional data to characterize the socioeconomic determinants of schooling outcomes, which suggest that policy initiatives aimed at directly affecting the demand (household income) and/or supply side factors (availability of schools) of schooling are likely to improve enrollments and completed grades of schooling.

1 See Foster and Rosenzweig (1996) and Deolalikar (1993) for the role played by differences in labor market returns in determining schooling outcomes. Lillard and Willis (1992) and Parish and Willis (1993) bring out the impact of parental education in determining schooling outcomes. Behrman and Knowles (1999) provide a review of the role played by household income in determining educational outcomes. Glewwe and Jacoby (1995) and Glewwe (2002) have focused on supply side determinants of schooling outcomes. Handa and Peterman (2007) and Alderman et al. (2006) examine the impact of childhood malnutrition on future schooling outcomes.

However, schooling outcomes today are not independent of the schooling-related decisions made in the last period. The schooling decision in the current period is a choice between continuing in school and dropping out for those already enrolled; this decision changes to a choice between enrolling or not for children who have never been to school. Hence, schooling decisions today are interconnected with past periods’ schooling decisions, all of which affect an individual’s final schooling attainment as observed today. For instance, Behrman, Sengupta, and Todd (2005) account for past schooling decisions in explaining current period attainments using experimental data from Mexico.
They assess the impact of PROGRESA (a school subsidy program in Mexico) on schooling outcomes, treating the initial distribution of schooling states (not enrolled, enrolled in grade 1, enrolled in grade 2) at each age as given, to estimate a probability transition matrix which specifies the vector of schooling states for the next age. This methodology allows them to capture the association between an individual's enrollment status in the past period and its effect on current-period enrollments. A transition matrix is estimated for children in both the treatment group and the control group, and the difference in these matrices at each age determines the program impacts on schooling enrollments and progression.

Behrman, Sengupta and Todd (2005) account for past schooling outcomes in determining current attainments but do not account for the impact of socioeconomic factors and other unobservables that also play a role in determining the child's complete trajectory of current and future schooling outcomes. For instance, the child's innate genetic ability to perform well in school affects both the child's current state in the schooling transition matrix and his/her state in the last period. Socioeconomic factors like household income and demographic characteristics are also likely to determine the child's current schooling outcome. It is difficult to add these characteristics to a probability transition matrix framework, as doing so poses additional problems in estimation [see Behrman, Sengupta and Todd (2005) for details]. The main objective of this paper is to use an empirical framework that captures the impact of past schooling outcomes in explaining current outcomes while controlling for household socioeconomic factors and demographic characteristics.
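The transition-matrix idea described above can be illustrated with a small sketch. It is not the estimator used by Behrman, Sengupta and Todd (2005): the state labels, the toy observations, and the function name `transition_matrix` are all assumptions made purely for illustration. The sketch counts moves between schooling states from one age to the next, row-normalizes the counts, and takes the treatment-minus-control difference as a crude measure of program impact.

```python
from collections import Counter

# Hypothetical schooling states (labels are illustrative only)
STATES = ["not_enrolled", "grade_1", "grade_2"]

def transition_matrix(pairs, states=STATES):
    """Row-normalized frequencies of (state at age a, state at age a+1) pairs."""
    counts = Counter(pairs)
    matrix = []
    for origin in states:
        row_total = sum(counts[(origin, dest)] for dest in states)
        # Rows with no observations are left as zeros
        row = [counts[(origin, dest)] / row_total if row_total else 0.0
               for dest in states]
        matrix.append(row)
    return matrix

# Made-up observations, not data from any survey
control = ([("not_enrolled", "not_enrolled")] * 6
           + [("not_enrolled", "grade_1")] * 4
           + [("grade_1", "grade_2")] * 7
           + [("grade_1", "not_enrolled")] * 3)
treated = ([("not_enrolled", "not_enrolled")] * 4
           + [("not_enrolled", "grade_1")] * 6
           + [("grade_1", "grade_2")] * 9
           + [("grade_1", "not_enrolled")] * 1)

p_control = transition_matrix(control)
p_treated = transition_matrix(treated)

# Element-wise difference: change in each transition probability
impact = [[t - c for t, c in zip(row_t, row_c)]
          for row_t, row_c in zip(p_treated, p_control)]
```

A framework of this kind has no natural place for household income or unobserved ability, which is the limitation the text raises.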
We specify a dynamic conditional schooling demand function, where the coefficient on the lagged schooling outcome variable captures the extent to which an individual's past schooling inputs can affect his/her final schooling attainments. [2] In this paper, I find evidence which suggests a strong positive association between current and lagged schooling outcomes. This implies that in the absence of such a dynamic specification, the impact of past schooling resources (including short-run policy interventions) that affect lagged outcomes may be underestimated.

[2] Such a dynamic specification capturing past schooling inputs has been estimated previously and is commonly referred to as a value-added specification. Value-added specifications have mostly been restricted to schooling outcomes capturing cognitive achievement as measured by test scores (see Todd and Wolpin, 2007).

Similar to earlier work in the schooling literature, this paper also characterizes the socioeconomic determinants of current schooling outcomes. Empirical evidence from this paper identifies household income and parental schooling as the key factors that explain schooling outcomes. Identifying such factors can help guide reforms and policy initiatives in the future.

To address the above objectives, we draw empirical evidence from rural Ethiopia. Why Ethiopia? First, developing countries in Asia and Africa pose a huge challenge to policy makers who aim to achieve universal primary school education by 2015. Ethiopia is no exception: even today, fewer than 60% of children in each age cohort enter grade 1 and only 60% of these enter grade 4 (Schaffner, 2004). Hence, factors that explain schooling outcomes can be used to guide public investments in the future. Second, despite the very low levels of grade completion, average primary school enrollments almost doubled during 1994-2004.
Much of this increase in schooling outcomes during 1994-1999 was accompanied by substantial increases in average household per capita consumption expenditures. However, the period 1999-2004 saw similar improvements in schooling outcomes with little improvement in household incomes. A dynamic specification as previously described can be useful in this context, as it brings out the role played by past schooling resources, including household income, in explaining future schooling outcomes. Third, our focus is primarily on rural areas, as improving educational outcomes in rural areas is the more difficult objective to achieve. [3]

To address the objectives of this paper, we estimate both a static [4] and a dynamic conditional schooling demand function. The static function is estimated separately for primary school age children using data from each of the 1994, 1999 and 2004 waves of the Ethiopian Rural Household Survey (ERHS). The dynamic function is estimated using observations on children between 7 and 14 in 1994 who are followed through the 1999 and 2004 waves of the ERHS. This paper uses enrollment status and relative grade attainment as the main outcome variables of interest.

[3] See Orazem and King (2008) and Schaffner (2004) for details on rural-urban differences in schooling attainments among school age children from Ethiopia.

[4] A static framework treats the household's decision in each period as independent of decisions in other periods, whereas in a dynamic framework the household's decision in any period is related to the decisions of all past and future periods.

The paper contributes to the existing literature in several ways. First, the paper uses a dynamic conditional schooling demand function to establish the link between past schooling resources and current schooling attainments, controlling for other observable socioeconomic factors and unobservables such as individuals' genetic endowments.
Second, the paper identifies the socioeconomic factors that have contributed towards the improvements in schooling outcomes in Ethiopia over the last decade. Third, the static specification also explicitly addresses some concerns left unaddressed by existing cross-section studies, whose approaches are somewhat limited in scope. For example, most papers in the literature use data on individuals who have already completed their schooling spell and use socioeconomic characteristics from the individual's current period to explain his/her completed schooling attainments (Parish and Willis, 1993). This kind of empirical specification is potentially misspecified, as the right-hand-side demand and supply side characteristics may not map to the year in which the decision regarding schooling outcomes is actually made. [5] We address this concern by using socioeconomic characteristics that appropriately map to the year in which the child's schooling-related decisions are made.

Also, many papers in the literature use data on children with only completed schooling spells, restricting their sample to older children (usually 15 years and above on average). This is likely to create out-migration related selection concerns. For example, boys older than 15 years from poor rural areas are likely to migrate in search of job opportunities, and similarly girls older than 15 are likely to migrate due to early marriage. Hence the analysis sample is likely to be a pool of non-randomly selected individuals who continue to live in the same household. [6] In order to address out-migration related selection concerns, we restrict our sample to include only primary school age (7-14 years) children.
However, household-level attrition could potentially create a non-random sample even among primary school age children. We later show that the low rates of household-level attrition in the ERHS make this data set especially attractive.

[5] Authors often use the same household-level and community-level characteristics to explain the 5 grades of schooling that a 12 year old and an 18 year old have attained, without taking into account that the 18 year old may have completed 5 grades of schooling 6 years ago and that it is the demand and supply side factors from that year which affect his/her completed grades today.

[6] A few papers in the literature have tried to address this issue by re-defining the outcome variable as a relative measure. Most papers assume this problem away and only a very few even mention it [see Tansel (1998), Holmes (1999) for discussion on these issues].

Finally, schooling outcomes today are not just a function of current resources; it is the history of demand and supply side factors that determines an individual's complete trajectory of schooling outcomes. Hence, we must use a dynamic specification to capture such an association.

The static regressions identify household income and parental schooling as the key determinants of enrollments and relative grades accumulated. [7] Our preferred estimates of household income suggest that in 1994 income had almost no role in explaining schooling enrollments and only a very small role in explaining relative grades. However, during 1999-2004 income effects gained sizable importance. In 2004 a 100% increase in real per capita consumption expenditure increased the probability of a 7-14 year old being enrolled by 0.17. A similar impact is seen on relative grades, where a 100% increase in real per capita consumption expenditure increases relative grade accumulation by 0.08.
The increasingly strong association between household income and schooling in Ethiopia suggests that more income in the hands of rural households will improve schooling among the next generation. The preferred (IV) estimate on household income reported here is three times larger than the OLS parameter estimate, suggesting that not accounting for the endogeneity of the income variable would result in a large downward bias in the OLS estimate. The IV coefficient estimates of income reported in this paper are also robust to the problem of weak instruments.

[7] Relative grade is calculated as actual grades divided by potential grades, where potential grade is the total number of grades accumulated if the individual completed one grade of schooling by age 7 and accumulated one additional grade in each subsequent year.

The dynamic estimation results indicate that current schooling is strongly associated with past schooling resources, where the impact of past schooling resources is captured by the coefficient on the lagged dependent variable. We find that children who were enrolled in the last period are 32 percentage points more likely to be enrolled today than children who were not enrolled in the last period. There also exists an association between relative grades attained today and relative grades accumulated in the last period; the magnitude of this coefficient is 0.25, suggesting that individuals are able to compensate for at least some of the loss in grades incurred during the initial years.

To summarize, this paper contributes to the existing literature in multiple ways: (1) We use socioeconomic characteristics that appropriately map to the year in which the child's schooling-related decisions were made. (2) In order to address migration-related selection concerns, we restrict our sample to include only primary school age (7-14 years) children.
However, household-level attrition could potentially create a non-random sample even among primary school age children. We later show that the low rates of household-level attrition in the ERHS make this data set especially attractive. (3) We treat our measure of income (household per capita consumption expenditure, PCE) as endogenous and assess the extent and sources of bias in the PCE coefficient by comparing the OLS and IV estimates. This has previously been addressed by several papers in the literature, but not all, owing to the lack of available data on possible instruments for PCE [see Behrman and Knowles (1999) for a review]. (4) The coefficient estimates on the household characteristics reported in this paper are robust to the inclusion of actual village-level supply side factors or village fixed effects. (5) We establish the relationship between current and lagged schooling outcomes. (6) We use variants of the GMM estimation strategy to deal with the endogeneity problem in the lagged schooling outcome variable. In particular, the estimation strategy adopted here addresses all sources of time-invariant heterogeneity, at the individual, household, and village level, in determining the impact of lagged schooling outcomes on current schooling outcomes.

The rest of the chapter is organized as follows. Section 4.2 outlines the theoretical model. Section 4.3 outlines the empirical specification used for estimation purposes. Section 4.4 describes the data, provides descriptive statistics, and gives details on the variables constructed. Section 4.5 discusses the results obtained from the static and dynamic regressions. Concluding remarks follow in Section 4.6.

4.2 Model

Parental investment in schooling is guided by either altruistic preferences or economic returns. In the former case, schooling is treated as a pure consumption good from which the parent derives utility.
In the latter case, by contrast, schooling does not separately enter the utility function; it is only the expected future returns from schooling (wage earnings) that affect parents' utility [Mincer (1958), Schultz (1960), Bommier and Lambert (2000), Brown and Park (2002)]. [8] This framework does not account for any non-pecuniary benefits associated with schooling. [9]

[8] Brown and Park (2002) define parents as being altruistic as long as parents care as much about their children as themselves.

[9] There is another household framework, where parents' investments in schooling are higher among the relatively less endowed children. Parents here aim to equalize the expected future returns from schooling among all children in a household [Behrman, Pollak and Taubman (1982)]. A good measure of ability is needed to identify the need for differential schooling investments among children. The ERHS does not collect data that measure the child's ability, and hence we do not dwell any further on these models.

Following the health production function specified in Sahn and Fedorov (2005); Strauss and Thomas (1995, 1998, 2008); Foster (1995), we specify our schooling production function (4.1), where schooling in period t, $S_t$, is a function of schooling inputs, community resources, individual demographics, child characteristics, household characteristics, and genetic endowments.

$$S_t = s(M_t, M_{t-1}, \ldots, M_0, I_t, I_{t-1}, \ldots, I_0, D_\sigma, \theta_{c\sigma}, \theta_c, \mu_{h\sigma}, \mu_h, G), \qquad \sigma = 0, 1, \ldots, t \tag{4.1}$$

$S_t$ is measured as enrollment status, completed grades or relative grade attainment. Schooling inputs, $M_t$, include books, school uniforms, and home inputs which affect the accumulation of schooling outcomes. It is assumed that the household does not derive any direct utility from the consumption of $M_t$ except via its impact on $S_t$.
Environmental characteristics are important in determining schooling outcomes, as they affect the age at which the child first starts schooling and continue to affect enrollment in every subsequent period. $I_t$ characterizes the environment where the child lives, capturing school resource availability in the community. $D_\sigma$ includes time-varying household demographic characteristics such as mother's age and age of the head of the household, capturing household experience. $G$ captures genetic endowments that pass from parent to child, affecting the child's overall cognitive development and learning. $\theta_{c\sigma}$ and $\theta_c$ include child-specific time-varying and time-invariant observables, such as the child's sex and age, which capture age- and gender-specific differences in the accumulation of schooling. $\theta_{c\sigma}$ and $\theta_c$ also include time-varying and time-invariant unobservables such as the child's own innate ability to perform well in school. $\mu_h$ and $\mu_{h\sigma}$ capture household-specific time-invariant and time-varying rearing and caring practices, as captured by parental schooling variables.

Households decide to allocate schooling inputs based on the following inter-temporal utility maximization problem defined over T time periods. Expected lifetime utility, U (4.2), is maximized subject to a lifetime budget constraint (4.3) and a dynamic schooling production function (4.4). [10]

$$\max \; U = E_t \sum_{t=0}^{T} \beta^t u_t[C_t, S_t, L_t; \theta_{pt}] \tag{4.2}$$

Subject to:

$$A_T = \left(\prod_{t=0}^{T}(1+r_t)\right)A_0 + \sum_{t=0}^{T}\left(\prod_{\tau=t}^{T}(1+r_\tau)\right)\left(w_t(T_t - L_t) + \pi_t - P^c_t C_t - P^m_t M_t\right) \tag{4.3}$$

$$S_t = f(S_{t-1}, M_t, I_t, D_t, \theta_{ct}, \theta_c, \mu_{ht}, \mu_h, G) \tag{4.4}$$

The utility function specified here assumes the existence of a unitary household, that is, each member of the household shares the total available resources in equal measure. We are aware that there exists little empirical validation for the existence of a unitary household model (Haddad et al., 1995).
However, limited data availability does not allow us to use a collective approach to model household behavior. In addition, given the data restrictions, a 'unitary' versus 'collective' approach to modeling household behavior will not alter the empirical specification.

[10] We substitute the one-period lagged schooling outcome for all past periods' schooling inputs and environmental factors in equation (4.1). Redefining equation (4.1) in this way yields the dynamic child schooling production function (4.4).

The sub-utility function ($u_t$) in each period depends upon consumption goods that include food and non-food commodities, $C_t$, leisure, $L_t$, and the child's educational status, $S_t$. $E_t$ is the expectations operator conditional upon information at time t. $\beta$ is the subjective discount factor, which captures household preferences for higher utility today vis-a-vis the future. $P^c_t$ is a vector of prices of food and non-food consumption goods. $P^m_t$ is a vector of prices of school inputs. $w_t$ is the wage rate (price of leisure). $T_t$ is the parents' total time endowment and $A_0$ is the assets the household owns at the beginning of period 0. Profit income from farm and non-farm activities and all other sources of non-labor income are captured by $\pi_t$.

The solution to this optimization problem relies on the following assumptions: (a) The household's utility function is additively separable over time. (b) The sub-utility functions are concave and twice differentiable. (c) The one-period lagged schooling outcome is sufficient to capture the impact of all past schooling inputs, environmental factors and other characteristics from birth up until the last observed period in the sample. (d) The household can potentially borrow and/or lend against its future in each period [Deaton and Muellbauer (1980), Strauss and Thomas (2008)]. Under these assumptions, we can solve for the optimal conditional schooling input demand function, $M^*_t$. [11]

$$M^*_t = m(S_{t-1}, P^c_t, P^m_t, w_t, I_t, \lambda, D_t, \theta_{ct}, \theta_c, \mu_{ht}, \mu_h, G, \theta_{pt}, E_t(Z_{t+j})) \tag{4.5}$$

for $j = 1, 2, \ldots, T-t$ and $Z = P^c_t, P^m_t, w_t, I_t, D_t, \theta_{ct}, \theta_c, \mu_{ht}, \mu_h, \theta_{pt}, G$.

The dynamic conditional child schooling demand function (4.6) can be derived by substituting $M^*_t$ for $M_t$ in equation (4.4):

$$S^*_t = f(S_{t-1}, P^c_t, P^m_t, w_t, I_t, \lambda, D_t, \theta_{ct}, \theta_c, \mu_{ht}, \mu_h, G, \theta_{pt}, E_t(Z_{t+j})) \tag{4.6}$$

for $j = 1, 2, \ldots, T-t$ and $Z = P^c_t, P^m_t, w_t, I_t, D_t, \theta_{ct}, \theta_c, \mu_{ht}, \mu_h, \theta_{pt}, G$.

[11] See the derivation of first-order conditions obtained for a similar dynamic model in chapter 3.

The optimal dynamic conditional child schooling demand function derived in (4.6) is expressed as a function of the one-period lagged schooling outcome, the price of consumption (food and non-food) goods, the price of schooling inputs, the wage rate, community infrastructure, child characteristics, household characteristics, and the marginal utility of wealth in period 0 ($\lambda$). Expectations at time t about future periods' prices of consumption goods and schooling inputs, wage rates, community resources, household characteristics and child characteristics are all captured by the term Z. Theoretically, the term Z enters the dynamic schooling demand function in an unrestricted manner. However, we assume that the term Z (in equation 4.6) enters the dynamic empirical specification only additively.

In a static model, the optimal school input ($M^*_t$) is determined by maximizing utility in the current period subject to a period-specific budget constraint and a static schooling production function. [12] The optimal school inputs are then substituted into the static schooling production function to obtain the optimal schooling demand function, as specified below:

$$S^*_t = f(P^c_t, P^m_t, w_t, I_t, D_t, \pi_t, \theta_{ct}, \theta_c, \mu_{ht}, \mu_h, G, \theta_{pt}) \tag{4.7}$$

In a static framework, current schooling is a function of current-period prices of consumption goods, price of leisure, price of school inputs, environment characteristics and household income. The theoretical model specified in this section guides the choice of the right-hand-side variables used in the empirical specification.

[12] See Mani (2007) for a derivation of the static conditional child health demand function. An identical framework is followed to write out the static conditional child schooling demand function specified here.

4.3 Empirical specification

The empirical counterparts of the static [equation 4.7] and dynamic [equation 4.6] conditional schooling demand functions can be written as follows:

$$S_{it} = \beta_0 + \sum_{j=1}^{R} \beta^X_j X_{jit} + \sum_{j=1}^{S} \beta^Z_j Z_{ji} + \epsilon_v + \upsilon_{it}; \qquad \upsilon_{it} = \epsilon_i + \epsilon_h + \epsilon_{it} \tag{4.8}$$

$$S_{it} = \beta_0 + \beta_1 S_{it-1} + \sum_{j=1}^{R} \beta^X_j X_{jit} + \sum_{j=1}^{S} \beta^Z_j Z_{ji} + \epsilon_i + \epsilon_h + \epsilon_v + \epsilon_{it} \tag{4.9}$$

$S_{it}$ and $S_{it-1}$ are the child's enrollment status or relative grades accumulated at time t and t-1 respectively, where subscript i refers to the child. Enrollment status is a dummy variable equal to 1 if the child is enrolled in school at the time of the survey and 0 otherwise. [13] Relative grades are defined as actual grades divided by potential grades, where potential grades are calculated as the total number of grades accumulated had the individual completed one grade of schooling by age 7 and continued to accumulate one additional grade of schooling in each subsequent year.

The time-invariant regressors, the Zs, include a male dummy, measures of parental schooling, and mother's age. The male dummy is coded as 1 for males and 0 for females and captures gender differences in schooling outcomes.
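The outcome definitions above can be written out as a short sketch. The function names are hypothetical, and reading potential grades as age minus 6 (one grade completed by age 7, plus one for each subsequent year) is an interpretation of the definition in the text rather than something taken from the survey documentation.

```python
def potential_grades(age_years):
    """Grades held if one grade was completed by age 7 and one more in
    every later year (assumed reading of the definition: age - 6)."""
    if age_years < 7:
        raise ValueError("potential grades are defined from age 7 onwards")
    return age_years - 6

def relative_grade(actual_grades, age_years):
    """Actual grades divided by potential grades."""
    return actual_grades / potential_grades(age_years)

def enrollment_dummy(enrolled_at_survey):
    """1 if the child is enrolled in school at the time of the survey, else 0."""
    return 1 if enrolled_at_survey else 0
```

Under this reading, a 10 year old with 2 completed grades has a relative grade of 0.5, while a child who has kept pace with the one-grade-per-year benchmark has a relative grade of 1.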
We include measures for both mother's schooling and father's schooling. Our measure of mother's schooling is a dummy equal to 1 if the mother has completed non-zero years of schooling and 0 otherwise. Our measure of father's schooling is constructed using the same rule. Parental schooling variables reflect parental rearing and caring practices that affect schooling outcomes. Our measures of parental schooling are dummy variables because the average number of grades accumulated among parents is about 1 grade in 1994 and increases to only 2 grades by 2004; there is only marginal variation in completed grades of schooling among parents. Hence, we use a categorical variable to describe parental schooling attainments. Mother's age captures the mother's experience in efficiently allocating schooling inputs among school age children. Identical time-invariant regressors are specified in the static and dynamic specifications (equation 4.8 and equation 4.9).

[13] In addition to enrollment in regular school, some of the children are also enrolled in religious schools. We are interested in the child's actual human capital accumulation, which comes only through learning subjects like mathematics, science or social science and is directly related to the child's future earnings potential; none of these subjects is taught in religious schools. Hence, for our purpose in this paper, we treat children enrolled in religious schools as not enrolled.

The Xs include time-varying regressors at the individual, household, and village level. Note that some of the time-varying regressors in the static specification differ from those in the dynamic specification. In the static specification, we control for time-varying child-level regressors such as age of the child, where age is specified using a separate dummy variable for each year between 7 and 14 years.
The age dummies capture the age-differential impact of schooling. In the dynamic specification, we follow panel respondents who were between 7 and 14 in 1994. Hence we use a spline in lagged age in years with the cut-off at 14.99 years. [14] The age-gender specific determinants of schooling are captured using interaction terms between the age variables and the gender dummies.

[14] This cut-off was determined using a lowess plot of schooling outcomes against age in months.

The time-varying household-level regressors in the static specification include household composition variables such as the number of adult (>18 years) males and the number of adult (>18 years) females, capturing household demographic composition. Age of the head of the household is an additional regressor used to capture household experience and life-cycle position. In the dynamic specification we use one-period lagged measures of the same household composition variables, as current-period household composition variables would be endogenously determined in the dynamic model. Using lagged measures reduces the number of endogenous regressors on the right-hand side.

Time-varying village-specific characteristics can only be included in the static model, since information on village characteristics is only available for 2004. To control for schooling infrastructure and environment in the community, we include distance to primary school in km, a dummy for availability of electricity, and a dummy for availability of piped water.

The static specification in equation 4.8 must also include a measure of household wage and non-wage income. Households tend to smooth consumption, and hence consumption is likely to be a better measure of income than current information on wage and non-wage income. We therefore use the logarithm of real per capita consumption expenditure (PCE) as a measure of household income [Behrman and Knowles (1999)].
Total household consumption expenditure is constructed as the sum of the value of food items consumed (the questionnaire included details on 33 specific food items), including both purchased and non-purchased consumption goods (consumption out of own stock), and the value of non-investment type non-food items purchased. Non-food items include consumables such as matches, batteries, and kerosene but exclude expenditure on durables such as housing [Dercon, Hoddinott and Woldehanna (2005)]. Consumption levels are valued using prices obtained from a market survey fielded at the same time as the household surveys. Total household consumption expenditure is then divided by household size to capture per-person resource availability in the household. The nominal per capita consumption values are converted to real per capita consumption expenditure using the food price index. Finally, the logarithm of real per capita household consumption expenditure is taken to account for non-linearity in the per capita consumption variable.

In the static specification, we treat our measure of household income, log(real PCE), as endogenously determined. The OLS estimate on PCE is likely to be biased and inconsistent due to (1) the potential correlation between household-specific time-invariant unobservables (parents' preferences and time discount rate) and PCE, resulting in omitted variables bias, and (2) the presence of random measurement error in the data, which is likely to additionally bias the estimated coefficient on PCE towards zero. We use two-stage least squares to address the endogeneity problem in the per capita consumption variable.
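The measurement-error argument above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the chapter's estimation: the data are simulated, the instrument `z` is hypothetical, the true coefficient of 0.5 is arbitrary, and only the measurement-error channel is modeled (omitted variables are left out). With a single regressor and a single instrument, two-stage least squares reduces to the two simple regressions shown.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)

random.seed(0)
n = 20000
true_beta = 0.5  # assumed effect of log PCE on the schooling outcome

z = [random.gauss(0, 1) for _ in range(n)]                 # hypothetical instrument
log_pce_true = [zi + random.gauss(0, 1) for zi in z]       # true log PCE
log_pce_obs = [x + random.gauss(0, 1) for x in log_pce_true]  # observed with error
school = [true_beta * x + random.gauss(0, 1) for x in log_pce_true]

# OLS on the mismeasured regressor: attenuated towards zero
beta_ols = cov(log_pce_obs, school) / cov(log_pce_obs, log_pce_obs)

# First stage: project observed log PCE on the instrument
pi_hat = cov(z, log_pce_obs) / cov(z, z)
fitted = [pi_hat * zi for zi in z]

# Second stage: regress the outcome on the first-stage fitted values
beta_2sls = cov(fitted, school) / cov(fitted, fitted)
```

In this design the OLS slope is pulled towards zero by the noise in observed log PCE, while the two-stage estimate stays close to the assumed value because the instrument is unrelated to the measurement error.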
The dynamic model includes $\lambda$, the marginal utility of wealth in period 0, which is a function of both retrospective information (period 0 to period t-1) and prospective information (period t+1 to period T) on prices, incomes, child characteristics, and household characteristics that enter the demand function through the lifetime budget constraint. Empirically, treating the marginal utility of wealth as a constant would be a strong assumption, since it relies on the existence of complete markets, an assumption that is unlikely to hold in a developing-country setting where most households may be credit constrained. In addition, the household cannot perfectly control its future wealth or changes in the environment where it lives, making $\lambda$ stochastic. To allow $\lambda$ to reflect some of these time-varying changes and dynamics, we use the household's access to resources in the long run, measured by the lag of the log of the household's real per capita consumption expenditure [lag log (PCE)], as an additional control variable on the right-hand side.

The sequence of expected future household characteristics, prices, incomes, and other factors affecting current schooling through $E_t(Z_{t+j})$ empirically enters either through the time-invariant household-specific unobservables ($\epsilon_h$) or the time-varying i.i.d. term ($\epsilon_{it}$) given in equation (4.9). We do need to assume that $E_t(Z_{t+j})$ enters the dynamic reduced-form conditional schooling demand function only additively.

In addition to the observable characteristics included in the regressions, there are four sources of unobservables in this model: $\epsilon_i$, $\epsilon_h$, $\epsilon_v$, and $\epsilon_{it}$. $\epsilon_i$ captures time-invariant individual-specific unobservables such as the child's innate ability. $\epsilon_h$ captures time-invariant household-specific unobservables which reflect parental preferences toward schooling and parents' time discount rate. $\epsilon_v$ captures all time-invariant village-specific unobservables like political connections.
$\epsilon_{it}$ includes time-varying unobservables such as price shocks, income shocks, and expectations at date t about future periods' income, prices, and other characteristics, all of which are unknown to the econometrician at date t. We assume $\epsilon_{it}$ to be random.

The condition of zero correlation between the error term and the lagged dependent variable may never be satisfied due to the presence of time-invariant and time-varying unobservables. The one-period lagged schooling outcome, $S_{it-1}$, is likely to be correlated with time-invariant individual-specific unobservables like the child's ability to perform well in school, which creates an upward bias in the estimated coefficient on the one-period lagged schooling status, $\beta_1$. Time-invariant household-specific unobservables, like parental preferences towards child schooling and the time discount rate, are also likely to create an upward bias in the estimated $\beta_1$. However, parents could also invest more in children who have lower genetic ability, thus biasing $\beta_1$ downwards. Time-invariant community-specific unobservables, like the political connections of a community, are also likely to be positively correlated with the lagged dependent variable, creating an upward bias in the estimated $\beta_1$. At the same time, pro-poor policies at the community level can bias the estimated $\beta_1$ downwards. In addition, $\beta_1$ is likely to be biased downwards, towards zero, due to the presence of classical measurement error in schooling outcomes.

Given the different sources of potential bias in $S_{it-1}$, it is difficult to assign the net direction of bias on $\beta_1$. However, one can broadly classify the main sources of endogeneity in the estimated coefficient on the lagged dependent variable as omitted variables and/or measurement error in the data.
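The upward bias from time-invariant unobservables described above can be seen in a small simulation. Everything here is an assumption for illustration: the data are simulated, the true $\beta_1$ is set to 0.25, and only the individual effect $\epsilon_i$ is modeled. The first-difference instrumental-variable estimator at the end (in the spirit of Anderson and Hsiao, using the twice-lagged level as an instrument) is a simple stand-in for GMM-type estimators, not the estimator used in this chapter.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)

random.seed(1)
n_children, beta1, burn_in = 20000, 0.25, 10  # assumed values

s_lag2, s_lag1, s_now = [], [], []
for _ in range(n_children):
    ability = random.gauss(0, 1)  # time-invariant unobservable (epsilon_i)
    s, path = 0.0, []
    for _ in range(burn_in + 3):
        s = beta1 * s + ability + random.gauss(0, 1)  # epsilon_it is i.i.d.
        path.append(s)
    s_lag2.append(path[-3])
    s_lag1.append(path[-2])
    s_now.append(path[-1])

# OLS of S_t on S_{t-1}: the omitted ability term pushes the slope upwards
beta_ols = cov(s_lag1, s_now) / cov(s_lag1, s_lag1)

# First differences remove ability; S_{t-2} instruments the differenced lag
d_now = [a - b for a, b in zip(s_now, s_lag1)]
d_lag = [a - b for a, b in zip(s_lag1, s_lag2)]
beta_fd_iv = cov(s_lag2, d_now) / cov(s_lag2, d_lag)
```

In this design the OLS slope lies well above the assumed 0.25, while differencing and instrumenting recovers a value close to it. Classical measurement error, which works in the opposite direction, is not modeled here.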
The results section discusses variants of the GMM estimation strategy that can be used to address the potential sources of endogeneity in the lagged dependent variable.

4.4 Data

4.4.1 Ethiopian Rural Household Survey

Ethiopia is divided into 11 regions; each region is sub-divided into zones, and zones into woredas. [15] Each woreda is further sub-divided into peasant associations. The smallest administrative unit in Ethiopia is the 'peasant association', which is sometimes equivalent to one village or a cluster of villages. The data used in this paper come from the 1994, 1999, and 2004 waves of the Ethiopian Rural Household Survey (ERHS). The ERHS is a large-scale socioeconomic survey which collected individual-level, household-level, and occasionally community-level data from selected rural peasant associations in Ethiopia during 1989-2004. [16]

[15] A woreda in Ethiopia is roughly equivalent to a county in the U.S.

[16] Peasant associations were first set up in 1974, in the aftermath of the revolution. We use the terms "villages" and "peasant associations" interchangeably throughout this paper.

The first wave of the ERHS was fielded in 1989, during which households from 7 farming villages in central and southern Ethiopia were surveyed. In 1989, only a narrow set of questions was included in the survey, as at the time there was no intention of creating a longitudinal data set. In 1994, 6 of the 7 original villages from 1989 (one of the villages could not be re-visited due to civil unrest) and 9 new villages that suffered most from the 1984-85 and 1987-89 droughts were selected for survey purposes. A total of 15 rural villages were surveyed in 1994 with the aim of constructing a longitudinal data set. The 15 rural villages included in the 1994 survey are representative of the diverse farming systems practiced across the thousands of rural villages in Ethiopia.
In 1994, two waves of the ERHS were administered, the first during January-March and the second during August-October. The ERHS subsequently followed households residing in these 15 rural villages during 1995, 1997, 1999 and 2004 [see Dercon and Hoddinott (2004) and Dercon et al. (2006) for more details on survey design].

The ERHS provides extensive information on household composition, income, consumption expenditure, farm and non-farm assets, ownership and value of land and livestock units, anthropometrics, harvest use and schooling outcomes. In 1997 and 2004, the ERHS also collected detailed community level information on infrastructure availability, prices of consumption goods, and community level shock variables.

4.4.2 Sample composition

This paper uses cross-sectional and panel data methods to outline the socioeconomic determinants of schooling outcomes in a static and dynamic context. The repeated cross-sectional regressions use observations on children aged 7-14 years from each of the 1994, 1999, and 2004 waves of the ERHS. The panel data regressions use observations on children who were between 7 and 14 years old in 1994 and could be followed through the 1999 and 2004 waves of the ERHS.

The choice of drawing empirical evidence from only the 1994, 1999 and 2004 waves deserves explanation. A larger pool of villages was surveyed only from 1994 onward, and details on enrollment and completed grades were first collected in 1994. In addition, to keep the interpretation of our cross-sectional coefficient estimates straightforward, we keep the interval between consecutive survey rounds constant at five years, which excludes observations from the 1995 and 1997 waves of the ERHS from our final sample.

Why restrict the sample to include only primary school age children?
First, only a third of all school age children from rural Ethiopia have at least one completed grade of schooling, and less than 10% of these children have completed primary schooling. Hence it is the socioeconomic environment that primary school age children face that determines their complete trajectory of current and future schooling attainments. Second, restricting the sample to primary school age children addresses selection concerns related to out-migration. Out-migration is substantial among high school age children because of early marriage. For instance, Ezra and Kiros (2001) find that 79% of female Ethiopian migrants identified marriage as the primary reason for out-migration, with an average age at marriage of around 16 years. Also, Fafchamps and Quisumbing (2005) document that the average age of a bride at first marriage in rural Ethiopia is 17 years.

4.4.3 Attrition

The long time line of the ERHS raises concerns about attrition. Sample attrition occurs at two levels: household and individual. If attrition is random and unrelated to the outcome variable of interest, then it is not a source of inconsistency in the estimated parameters. However, if sample attrition is related to the outcome variable of interest, either through observables or unobservables, then the estimated coefficients are likely to suffer from attrition bias [Fitzgerald et al. (1998)]. In this section we address attrition related concerns at both the household and the individual level.

The focus on primary school age children alone allows us to deal naturally with individual level selection issues, such as migration, that can potentially contaminate our cross-sectional parameter estimates. However, households could migrate in search of better schooling opportunities for their children. This can result in a non-random sample of school age children, creating attrition bias even among the 7-14 year olds.
The low levels of household attrition in the ERHS (described below) address this selection concern too.

In 1994, the ERHS surveyed a total of 1477 households from 15 rural villages in Ethiopia. In 1999, 1371 of the original 1477 households were re-contacted for interview purposes, so that even after 5 years 92.82% of the original households were re-contacted. In 2004, 1304 of the original 1477 households were re-interviewed. Between 1999 and 2004, the re-contact rate is 95.1%. After almost a decade, the re-contact rate was as high as 88.2%, with a total household level attrition rate of 11.8% between 1994 and 2004. Household level attrition is minimal in these rural areas, a finding supported by other studies using the ERHS. [17] The Ethiopian economy is primarily agrarian, and 85% of the total working age population depends on agricultural income for survival. Land is owned by the government, and households cannot obtain land if they decide to move to another location. This severely constrains household mobility and keeps household level attrition rates low.

Our dynamic regressions use observations on children who were between 7 and 14 in 1994 and can be followed through the 1999 and 2004 waves of the ERHS. By 2004, we lose more than 50% of the initial sample due to attrition at the individual level. A simple test of means shows that the difference in average completed grades of schooling between attriters and panel respondents is 0.18 (standard error = 0.06), which is statistically significant at 5%. Attriters have higher completed grades of schooling than children who can be followed over time. This indicates that attrition is related to schooling outcomes from the initial period.

Further, to determine the extent to which initial period schooling outcomes affect attrition, we estimate a linear probability model of attrition, where the dependent variable takes the value 1 if the individual can be followed through all three waves of the ERHS and 0 otherwise. We control for age, mother’s schooling, father’s schooling, log of real household per capita consumption expenditure, number of adult males, number of adult females, and village dummies, and allow for interactions between the age variables and the sex dummy. The results in column 1 of table C.1 (appendix C) indicate that the higher the accumulated grades of schooling in 1994, the lower the probability that the individual can be followed through time. Column 2 captures a similar relationship, where attrition is negatively associated with grade progression. The third column captures the relationship between attrition and enrollment status in 1994. Enrollment status in the initial period is unrelated to sample attrition.

All three columns in the table bring out the strong negative relationship between age and attrition. The older the individual, the less likely he/she is to be observed over time. The interaction terms between the age variables and the sex dummy suggest that male children are more likely to remain in the sample than females; this observation is consistent with marriage related migration among females. Sample attrition here is also related to household income: children who lived in households with higher per capita incomes were less likely to be followed over time. The strong association between attrition and relative grades indicates that attrition is not likely to be random. However, the more relevant concern here is to determine the extent to which the non-random nature of sample attrition is likely to contaminate our preferred coefficient estimates in the dynamic specification.

[17] Dercon and Hoddinott (2004) report that sample attrition rates are as low as 7% using data from the 1989 and 1994 waves of the ERHS. Dercon et al. (2006) report household level attrition between 1994 and 2004 to be around 12.4%, which is very close to the numbers reported in this paper on household level attrition.
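A linear probability model of this kind can be sketched as follows. The data below are synthetic and the data-generating process is purely hypothetical (the coefficients are invented for illustration and are not the estimates in table C.1); the sketch also omits most of the controls listed above and only shows the mechanics of regressing a 0/1 "followed" indicator on initial-period characteristics by OLS.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# Synthetic initial-period (1994) characteristics; all values are hypothetical.
age = rng.integers(7, 15, n)        # ages 7 to 14 inclusive
male = rng.integers(0, 2, n)        # 1 = male, 0 = female
grades = rng.poisson(0.6, n)        # completed grades of schooling

# Hypothetical process: older children and children with more completed
# grades are less likely to be followed; boys are more likely to be followed.
p_followed = np.clip(
    0.9 - 0.04 * (age - 7) - 0.05 * grades + 0.05 * male, 0.05, 0.95
)
followed = (rng.random(n) < p_followed).astype(float)  # 1 = observed in all waves

# Linear probability model: OLS of the 0/1 indicator on the regressors.
X = np.column_stack([np.ones(n), grades, age, male])
coef, *_ = np.linalg.lstsq(X, followed, rcond=None)

for name, b in zip(["constant", "grades", "age", "male"], coef):
    print(f"{name:>8}: {b: .4f}")
```

Each slope is read as the change in the probability of being followed associated with a one-unit change in the regressor, which is why negative coefficients on grades and age correspond to higher attrition among older and more schooled children.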
Our preferred specification uses a first-difference IV estimation strategy, which removes the time-invariant unobservables from the empirical specification and thereby addresses some of the concerns relating to attrition bias. The time-varying unobservables in the empirical specification can also potentially cause attrition bias. However, because we lack valid instruments, we are not able to use any selection correction methods to address this source of attrition bias.

4.4.4 Descriptive statistics

During 1994-2004, rural Ethiopia witnessed large improvements in its primary school enrollment rates. In 1994 only 12.7% of primary school age children were enrolled in school. A decade later, in 2004, the percentage of primary school age children enrolled had more than tripled, to 45.49%. Figures 4.1 and 4.2 depict school enrollment rates (in %) among boys and girls for all primary school age children during this period.

[Figure 4.1: Male enrollment rate (%) by age in years. Vertical axis: percentage enrolled; horizontal axis: age 7-14; series: 1994, 1999, 2004.]

We find a steep increase in the percentage of children enrolled between 1994 and 1999, with a relatively smaller increase during 1999-2004.

[Figure 4.2: Female enrollment rate (%) by age in years. Vertical axis: percentage enrolled; horizontal axis: age 7-14; series: 1994, 1999, 2004.]

For children aged 7-9 years, the increase in enrollment rates occurs during 1994-1999; there is practically no increase in enrollment among this younger age group during 1999-2004. The low levels of enrollment in 1994 combined with the initial increase in enrollment during 1994-1999 reflect mostly new enrollments, whereas improvements between 1999 and 2004 can be attributed to both new and continued enrollments. There is a strong association between age and enrollment. In all three years, the enrollment rate is lowest at age 7 and peaks only after age 11. There is some indication of a declining enrollment rate among girls after age 12.
For boys, enrollment and age have an almost linear relationship. During 1994-1999 the improvement in enrollment is higher among male children; however, between 1999 and 2004 the initial gender differences decline.

[Figure 4.3: Male Relative Grade Attainment. Vertical axis: relative grade attained; horizontal axis: age 7-14; series: 1994, 1999, 2004.]

Figures 4.3 and 4.4 depict trends in relative grade attainment among all primary school age children. Early enrollments in 1999 are reflected in the steep increase in relative grades accumulated by age 7. The pattern of improvement in relative grades is similar to the pattern in enrollment rates. The male-female differences in relative grades also continue to decline.

[Figure 4.4: Female Relative Grade Attainment. Vertical axis: relative grade attained; horizontal axis: age 7-14; series: 1994, 1999, 2004.]

Tables 4.1-4.3 provide sample averages and standard deviations of the dependent variables and the regressors used in the cross-sectional regression specifications.

Table 4.1: Sample averages using data for primary school children from 1994

Variable                                                        Mean  Std. dev
Enroll (=1 if currently enrolled in school, 0 otherwise)        0.13      0.34
Completed grades of schooling                                   0.60      1.43
Relative grade attained*                                        0.13      0.38
Household size                                                  8.29      3.17
Log real per capita household consumption expenditure (PCE)     3.79      0.73
Mother’s schooling                                              0.17      0.37
Father’s schooling                                              0.39      0.49
Male dummy                                                      0.50      0.50
Age in years                                                   10.80      2.29
Land in hectares per adult member                               0.57      0.56
Livestock units                                                 3.16      4.12
No. of adult males                                              1.65      1.18
No. of adult females                                            1.71      1.05
Mother’s age                                                   38.64      9.60
Age of the head of the household                                48.7     13.49
Observations                                                    2047         -
*Relative grade attained = actual grade completed/potential grade.

Table 4.2: Sample averages using data for primary school children from 1999

Variable                                                        Mean  Std. dev
Enroll (=1 if currently enrolled in school, 0 otherwise)        0.38      0.48
Completed grades of schooling                                   1.13      1.70
Relative grade attained*                                        0.27      0.47
Household size                                                  7.30      2.90
Log real per capita household consumption expenditure (PCE)     4.05      0.76
Mother’s schooling                                              0.22      0.42
Father’s schooling                                              0.44      0.49
Male dummy                                                      0.50      0.50
Age in years                                                   10.81      2.21
Land in hectares per adult member                               0.49      0.50
Livestock units                                                 3.32      3.10
No. of adult males                                              1.58      1.15
No. of adult females                                            1.70      1.03
Mother’s age                                                   39.76      9.91
Age of the head of the household                               49.27     12.64
Observations                                                    1877         -
*Relative grade attained = actual grade completed/potential grade.

Table 4.3: Sample averages using data for primary school children from 2004

Variable                                                        Mean  Std. dev
Enroll (=1 if currently enrolled in school, 0 otherwise)        0.45      0.49
Completed grades of schooling                                   1.17      1.67
Relative grade attained*                                        0.24      0.41
Household size                                                  7.15      2.33
Log real per capita household consumption expenditure (PCE)     4.05      0.75
Mother’s schooling                                              0.31      0.46
Father’s schooling                                              0.51      0.50
Male dummy                                                      0.51      0.50
Age in years                                                   10.70      2.41
Land in hectares per adult member                               0.64      0.56
Livestock units                                                 3.42      3.61
No. of adult males                                              1.49      0.95
No. of adult females                                             1.5      0.83
Mother’s age                                                   40.05      8.65
Age of the head of the household                               49.32     12.37
Observations                                                    1629         -
*Relative grade attained = actual grade completed/potential grade.

Table 4.4 captures the change in the outcome variables and all other regressors between 1994 and 2004.

Table 4.4: Mean changes in schooling outcomes and other variables between 1994 and 2004

Variable                                                        Mean difference (2004-1994)  Std. error
Enroll (=1 if currently enrolled in school, 0 otherwise)                               0.33        0.01
Completed grades of schooling                                                          0.58        0.05
Relative grade attained*                                                               0.11        0.01
Log real per capita household consumption expenditure (PCE)                            0.25        0.02
Mother’s schooling                                                                     0.15        0.01
Father’s schooling                                                                     0.13        0.01
Land in hectares per adult member                                                      0.06        0.02
Livestock units                                                                        0.21        0.12
No. of adult males                                                                    -0.17        0.03
No. of adult females                                                                  -0.19        0.03
Mother’s age                                                                           1.41        0.38
Age of the head of the household                                                       0.53        0.52
*Relative grade attained = actual grade completed/potential grade.

4.5 Results

4.5.1 Static results

Before we outline the main regression results, we clarify the choice of the econometric strategies used in this paper.

Enrollment status is a limited dependent variable, and hence most papers in the literature use a probit specification to characterize the determinants of enrollment [Dostie and Jayaraman (2006), Pal (2004), Tansel (1998)]. Alternatively, one could specify a linear probability model (LPM), which can be estimated using an ordinary least squares (OLS) estimation strategy. The OLS estimation strategy provides consistent and unbiased parameter estimates (Maddala, 1981). However, the presence of heteroskedastic errors results in incorrect inference, which can be corrected by applying robust standard errors [see pg 454, Wooldridge (2002)] to the OLS estimates. The only limitation associated with applying OLS to the enrollment regression is that the predicted probability of enrollment is not necessarily restricted to the 0 to 1 interval, in which case it cannot be interpreted as a probability.

One of the most commonly used outcome variables in this literature is completed grades of schooling. There are a number of econometric difficulties associated with using completed grades of schooling. First, many children in the sample have not yet been enrolled in school, and hence a large number of observations are censored at zero.
Second, observations on completed grades of schooling will also be right-censored for children currently enrolled in school. Both sources of censoring result in inconsistent parameter estimates. In addition, OLS estimation techniques cannot be applied to completed grades of schooling, since the outcome variable is not continuous.

There are several approaches used in the literature to address the above issues. One way is to restrict the sample to include only observations on children with completed schooling spells. Such a sample can be estimated using ordered probit techniques to obtain unbiased and consistent parameter estimates. However, this is likely to create selection concerns related to out-migration. Also, the right hand side variables used to characterize the determinants of the schooling outcome may not be representative of the actual socioeconomic environment that affected the schooling decision.

An alternative is to create a relative measure of completed grades of schooling. For example, Birdsall (1982) defines schooling as actual grades divided by the mean grades for the relevant age-sex category. Behrman (1984) used actual grades divided by potential grades as a relative measure of schooling. Some authors have also used grades completed per year as a relative measure of schooling attainment. The advantage of using relative measures of schooling is that a continuous outcome variable is created, making OLS estimates consistent. The relative measure also accounts for delays in enrollment and grade attainment: relative grades control for differences in the time taken to complete ‘x’ grades of schooling. Individuals with the same completed grades of schooling are treated differently depending upon their age, except if the actual completed grade is zero. Hence, sample censoring at zero continues to remain a concern.

The third approach is to estimate a censored ordered probit specification [King and Lillard (1983, 1987)].
In this specification, a maximum likelihood framework is used in which children who have completed their entire schooling spell (uncensored observations) and children who have not completed their entire schooling spell (censored observations) enter the likelihood function separately. This estimation strategy addresses both sources of censoring bias, producing consistent parameter estimates. However, this specification may not be attractive in this context, as it relies on the strong assumption that children who belong to the uncensored category do not re-enter school.

Another important econometric issue addressed here is clustering. Individuals residing in the same village share common unobserved village level characteristics, and hence the error terms are correlated across individuals residing in the same village. Any such correlation in the error term violates the standard OLS assumptions, producing incorrect inference. [18] The most common way to address this is to cluster the standard errors at the village level. The specifications reported in this paper, in particular our preferred estimates obtained using an IV estimation strategy, include village fixed-effects. The application of village fixed-effects in our cross-sectional data removes all possible sources of unobserved correlation between individuals residing in the same village [Wooldridge (2002, 2003)]. Hence we need not adjust the standard errors for clustering at the village level. We do correct the standard errors in our specifications for the presence of any arbitrary form of heteroskedasticity using the White (1980) formulation [see Wooldridge (2002)]. Hence the standard errors reported here are reliable and can be readily used to draw inferences.

[18] The drawback is identical to the problems caused by heteroskedastic errors. We assume away any potential correlation between individuals residing in two different villages.
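The combination of village fixed-effects and heteroskedasticity-robust standard errors described above can be sketched on synthetic data. This is a simplified OLS illustration under stated assumptions, not the IV estimator used for the preferred estimates: the helper names (`within_demean`, `ols_white`), the true slope of 0.05, and the simulated village effects are hypothetical, and no degrees-of-freedom correction for the absorbed village means is applied.

```python
import numpy as np

def within_demean(values, groups):
    """Subtract group means from each observation (village fixed-effects transform)."""
    values = np.asarray(values, dtype=float)
    out = values.copy()
    for g in np.unique(groups):
        mask = groups == g
        out[mask] -= values[mask].mean(axis=0)
    return out

def ols_white(X, y):
    """OLS coefficients with White (1980) heteroskedasticity-robust (HC0) standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    meat = (X * resid[:, None] ** 2).T @ X      # X' diag(e^2) X
    cov = XtX_inv @ meat @ XtX_inv              # sandwich estimator
    return beta, np.sqrt(np.diag(cov))

rng = np.random.default_rng(2)
n_villages, n_per_village = 15, 200
village = np.repeat(np.arange(n_villages), n_per_village)

# Synthetic data: a village-level unobservable shifts both the regressor
# and the outcome, and the error variance depends on the regressor.
village_effect = rng.normal(0.0, 1.0, n_villages)[village]
log_pce = 4.0 + 0.5 * village_effect + rng.normal(0.0, 0.7, village.size)
noise = rng.normal(0.0, 1.0, village.size) * (0.2 + 0.1 * np.abs(log_pce - 4.0))
outcome = 0.05 * log_pce + 0.3 * village_effect + noise

X = within_demean(log_pce[:, None], village)    # regressor net of village means
y = within_demean(outcome, village)             # outcome net of village means
beta, se = ols_white(X, y)

print(f"slope on log PCE: {beta[0]:.4f} (robust s.e. {se[0]:.4f})")
```

Demeaning within villages removes anything common to all children in a village, which is the sense in which the fixed-effects absorb the shared village-level unobservables, while the sandwich formula keeps inference valid when the error variance differs across observations.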
The static conditional schooling demand function is estimated separately for the 1994, 1999, and 2004 waves of the ERHS. Each year’s regression uses right hand side characteristics from that year alone. In table 4.5, a dummy for enrollment is regressed upon a set of child level and household level characteristics using observations on pri- mary school children from the 1994 wave of the ERHS. Similar regression estimates are reported in tables 4.6 and 4.7, using data from the 1999 and 2004 waves of the ERHS respectively. Table 4.5: Determinants of schooling enrollment among primary school age children from 1994 Covariates (1) OLS (2) OLS (3) IV (4) IV Enroll Enroll Enroll Enroll Mother’s schooling 0.0852** 0.0931** 0.0985** 0.0985** (0.03) (0.03) (0.04) (0.04) Father’s schooling 0.1079*** 0.1088*** 0.1072*** 0.1072*** (0.02) (0.02) (0.02) (0.02) Log (real pce) 0.0319*** -0.0172 -0.0172 (0.01) (0.05) (0.05) Land -0.0218 (0.01) Livestock units 0.0018 (0.002) Male dummy 0.0175 0.0136 0.0146 0.0146 (0.02) (0.02) (0.02) (0.02) dummy=1 if the child 0.0299 0.0298 0.0300 0.0300 is 8 years (0.02) (0.02) (0.02) 0.02 dummy=1 if the child 0.0833*** 0.0792*** 0.0799*** 0.0799*** is 9 years (0.02) (0.02) (0.02) (0.02) dummy=1 if the child 0.0589** 0.0587** 0.0591** 0.0591** is 10 years (0.02) (0.02) (0.02) (0.02) dummy=1 if the child 0.0936*** 0.0900*** 0.0909*** 0.0909*** is 11 years (0.03) (0.03) (0.03) (0.03) dummy=1 if the child 0.1374*** 0.1380*** 0.1398*** 0.1398*** 108 Table 4.5: Continued Covariates (1) OLS (2) OLS (3) IV (4) IV is 12 years (0.03) (0.03) (0.03) (0.03) dummy=1 if the child 0.2042*** 0.1995*** 0.2014*** 0.2014*** is 13 years (0.03) (0.03) (0.03) (0.03) dummy=1 if the child 0.0922*** 0.0920*** 0.0910*** 0.0910*** is 14 years (0.03) (0.03) (0.03) (0.03) Male dummy*dummy=1 if -0.0156 -0.0121 -0.0122 -0.0122 the child is 8 years (0.03) (0.03) (0.03) (0.03) Male dummy*dummy=1 if -0.0189 -0.0159 -0.0194 -0.0194 the child is 9 years (0.04) (0.04) (0.04) (0.04) 
Male dummy*dummy=1 if 0.0411 0.0417 0.0408 0.0408 the child is 10 years (0.04) (0.04) (0.04) (0.04) Male dummy*dummy=1 if 0.0822 0.0875*** 0.0855* 0.0855* the child is 11 years (0.05) (0.05) (0.05) (0.05) Male dummy*dummy=1 if -0.0002 0.0019 0.0007 0.0007 the child is 12 years (0.04) (0.04) (0.04) (0.04) Male dummy*dummy=1 if -0.0162 -0.0076 -0.0094 -0.0094 the child is 13 years (0.05) (0.05) (0.05) (0.05) Male dummy*dummy=1 if 0.1545*** 0.1597*** 0.1604*** 0.1604*** the child is 14 years (0.05) (0.05) (0.05) (0.05) Number of adult males 0.0141 0.0130 0.0159* 0.0159* (0.008) (0.009) (0.008) (0.008) Number of adult females -0.0175** -0.0207** -0.0201** -0.0201** (0.008) (0.008) (0.008) (0.008) Mother’s age 0.0010 0.0010 0.0011 0.0011 (0.0008) (0.0008) (0.0008) (0.0008) Age of the head of -0.0001 -0.0001 -0.0002 -0.0002 the household (0.0005) (0.0006) (0.0006) (0.0006) Observations 2047 2047 2047 2047 Village fixed-effects Yes Yes Yes Yes F statistic on the 14.84 11.17 excluded instruments from (0.00) (0.00) the first-stage regression Hansen J statistic 4.195 (0.12) 4.20 (0.24) - Robust standard errors in parentheses - *** significant at 1%; ** significant at 5%; * significant at 10% - p-values are reported for the F statistic on the excluded instruments and - the Hansen J statistic - In column 3, PCE is instrumented with land and livestock units - In column 4, PCE is instrumented with land, livestock units, and rainfall*land - omitted age dummy - 7 years, omitted sex dummy - female - Also included in the regressions are dummy variables capturing missing observations for each particular variable parental schooling, land, household composition variables where the missing observations were imputed by the sample mean 109 Table 4.6: Determinants of schooling enrollment among primary school age children from 1999 Covariates (1) OLS (2) OLS (3) IV (4) IV Enroll Enroll Enroll Enroll Mother’s schooling 0.0620 0.0613 0.0441 0.0413 (0.03) (0.03) (0.04) (0.04) Father’s 
schooling 0.0930*** 0.0967*** 0.0746** 0.0717** (0.02) (0.02) (0.03) (0.03) Log (real pce) 0.0417** 0.2400* 0.2712** (0.01) (0.13) (0.13) Land -0.0383 (0.02) Livestock units 0.0124*** (0.004) Male dummy -0.0689 -0.0666 -0.0740 -0.0748 (0.04) (0.04) (0.05) (0.05) dummy=1 if the child 0.0568 0.0594 0.0601 0.06061 is 8 years (0.05) (0.05) (0.05) (0.05) dummy=1 if the child 0.1824*** 0.1892*** 0.1698*** 0.1678*** is 9 years (0.05) (0.05) (0.05) (0.05) dummy=1 if the child 0.2021*** 0.2103*** 0.1786*** 0.1749*** is 10 years (0.05) (0.05) (0.05) (0.05) dummy=1 if the child 0.2583*** 0.2684*** 0.2241*** 0.2187*** is 11 years (0.05) (0.05) (0.06) (0.06) dummy=1 if the child 0.3124*** 0.3126*** 0.3119*** 0.3118*** is 12 years (0.05) (0.05) (0.05) (0.05) dummy=1 if the child 0.3380*** 0.3434*** 0.3165*** 0.3131*** is 13 years (0.05) (0.05) (0.05) (0.05) dummy=1 if the child 0.3008*** 0.3074*** 0.2833*** 0.2805*** is 14 years (0.05) (0.05) (0.06) (0.06) Male dummy*dummy=1 if 0.1746** 0.1735** 0.1548** 0.1516* the child is 8 years (0.07) (0.07) (0.07) (0.07) Male dummy*dummy=1 if 0.0515 0.0479 0.0353 0.0327 the child is 9 years (0.07) (0.07) (0.07) (0.07) Male dummy*dummy=1 if 0.2670*** 0.2666*** 0.2805*** 0.2826*** the child is 10 years (0.07) (0.07) (0.07) (0.07) Male dummy*dummy=1 if 0.2142*** 0.1964** 0.2573*** 0.2640*** the child is 11 years (0.07) (0.07) (0.08) (0.08) Male dummy*dummy=1 if 0.1454* 0.1488** 0.1286 0.1260 the child is 12 years (0.07) (0.07) (0.07) (0.07) Male dummy*dummy=1 if 0.0388 0.0353 0.0391 0.0391 the child is 13 years (0.08) (0.08) (0.08) (0.08) Male dummy*dummy=1 if 0.1651** 0.1579* 0.1808** 0.1833** the child is 14 years (0.08) (0.08) (0.08) (0.08) Number of adult males -0.0032 -0.0173 0.0115 0.0138 (0.01) (0.01) (0.01) (0.01) Number of adult females 0.0087 -0.0019 0.0232 0.0255* (0.01) (0.01) (0.01) (0.01) Mother’s age -0.0035*** -0.0035*** -0.0039*** -0.0040*** 110 Table 4.6: Continued Covariates (1) OLS (2) OLS (3) IV (4) IV (0.001) (0.001) 
(0.001) (0.001) Age of the head of 0.0001 0.0002 -0.0003 -0.0003 the household (0.001) (0.001) (0.001) (0.001) Observations 1877 1877 1877 1877 Village fixed-effects Yes Yes Yes Yes F statistic on the 14.26 9.95 excluded instruments from (0.00) (0.00) the first-stage regressions Hansen J statistic 8.45 (0.003) 8.97 (0.011) - Robust standard errors in parentheses - *** significant at 1%; ** significant at 5%; * significant at 10% - p-values are reported for the F statistic on the excluded instruments and the Hansen J statistic - In column 3, PCE is instrumented with land and livestock units - In column 4, PCE is instrumented with land, livestock units, and rainfall*land - omitted age dummy - 7 years, omitted sex dummy - female - Also included in the regressions are dummy variables capturing missing observations for each particular variable parental schooling, land, household composition variables where the missing observations were imputed by the sample mean 111 Table 4.7: Determinants of schooling enrollment among primary school age children from 2004 Covariates (1) OLS (2) OLS (3) IV (4) IV (5) IV (6) IV Enroll Enroll Enroll Enroll Enroll Enroll Mother’s schooling 0.0775** 0.0839** 0.0661* 0.0646* 0.1012*** 0.1014*** (0.03) (0.03) (0.03) (0.03) (0.03) (0.03) Father’s schooling 0.0775** 0.0800*** 0.0663** 0.0649** 0.0843*** 0.0843*** (0.03) (0.03) (0.03) (0.03) (0.02) (0.02) Log (real pce) 0.0608*** 0.1645* 0.1780** 0.1650*** 0.1641*** (0.01) (0.08) (0.08) (0.05) (0.05) Land 0.0477* (0.02) Livestock units 0.0055 (0.004) Male dummy 0.0099 0.0030 0.0222 0.0238 0.0197 0.0196 (0.04) (0.04) (0.04) (0.04) (0.04) (0.04) dummy=1 if the child 0.1207** 0.1225** 0.1183** 0.1180** 0.1171** 0.1172** is 8 years (0.05) (0.05) (0.05) (0.05) (0.05) (0.05) dummy=1 if the child 0.2474*** 0.2471*** 0.2548*** 0.2558*** 0.2612*** 0.2611*** is 9 years (0.05) (0.05) (0.05) (0.05) (0.05) (0.05) dummy=1 if the child 0.4152*** 0.4107*** 0.4196*** 0.4201*** 0.4223*** 0.4223*** is 10 years 
(0.05) (0.05) (0.05) (0.05) (0.05) (0.05) dummy=1 if the child 0.4318*** 0.4274*** 0.4428*** 0.4443*** 0.4488*** 0.4488*** is 11 years (0.06) (0.06) (0.06) (0.06) (0.06) (0.06) dummy=1 if the child 0.5288*** 0.5213*** 0.5424*** 0.5442*** 0.5541*** 0.5540*** is 12 years (0.05) (0.05) (0.05) (0.05) (0.05) (0.05) dummy=1 if the child 0.4588*** 0.4510*** 0.4643*** 0.4650*** 0.4570*** 0.4569*** is 13 years (0.06) (0.06) (0.06) (0.06) (0.06) (0.06) dummy=1 if the child 0.4403*** 0.4389*** 0.4461*** 0.4469*** 0.4553*** 0.4553*** is 14 years (0.05) (0.05) (0.05) (0.05) (0.05) (0.05) Male dummy*dummy=1 if 0.0351 0.0299 0.0520 0.0542 0.0520 0.0519 the child is 8 years (0.07) (0.07) (0.07) (0.07) (0.07) (0.07) Male dummy*dummy=1 if 0.0322 0.0310 0.0174 0.0155 0.0097 0.0099 112 Table 4.7: Continued Covariates (1) OLS (2) OLS (3) IV (4) IV (5) IV (6) IV the child is 9 years (0.07) (0.07) (0.07) (0.07) (0.07) (0.07) Male dummy*dummy=1 if -0.0648 -0.0574 -0.0757 -0.0771 -0.0766 -0.0765 the child is 10 years (0.07) (0.08) (0.08) (0.08) (0.08) (0.08) Male dummy*dummy=1 if 0.0172 0.0221 0.0011 -0.0009 0.0046 0.0047 the child is 11 years (0.08) (0.08) (0.08) (0.08) (0.08) (0.08) Male dummy*dummy=1 if -0.0659 -0.0562 -0.0803 -0.0821 -0.0868 -0.0866 the child is 12 years (0.07) (0.07) (0.07) (0.07) (0.07) (0.07) Male dummy*dummy=1 if 0.0494 0.0638 0.0296 0.0270 0.0330 0.0332 the child is 13 years (0.08) (0.08) (0.08) (0.08) (0.08) (0.08) Male dummy *dummy=1 if 0.1527* 0.1580** 0.1462* 0.1453* 0.1338* 0.1339** the child is 14 years (0.07) (0.07) (0.07) (0.07) (0.07) (0.07) Number of adult males 0.0010 0.0003 0.0005 0.0005 0.0004 0.0004 (0.01) (0.01) (0.01) (0.01) (0.01) (0.01) Number of adult females 0.0298* 0.0289* 0.0406** 0.0420** 0.0600*** 0.0600*** (0.01) (0.01) (0.01) (0.01) (0.01) (0.01) Mother’s age -0.0022 -0.0019 -0.0022 -0.0022 -0.0018 -0.0018 (0.001) (0.001) (0.001) (0.001) (0.001) (0.001) Age of the head of -0.0003 -0.0002 -0.0009 -0.0010 -0.0006 -0.0006 the household 
(0.001) (0.001) (0.001) (0.001) (0.001) (0.001) Distance to primary -0.0205 -0.0204 school in km (0.01) (0.01) Dummy=1 if the village 0.0403 0.0409 has electricity (0.05) (0.05) Dummy=1 if the village 0.0026 0.0027 as piped water (0.02) (0.02) Observations 1629 1629 1629 1629 1629 1629 Village fixed-effects Yes Yes Yes Yes No No F statistic on the 44.08 32.03 108.52 72.28 excluded instruments from the (0.00) (0.00) (0.00) (0.00) first-stage regressions Hansen J statistic 2.88 (0.08) 3.64 (0.16) 0.31 (0.57) 0.80 (0.66) 113 Table 4.7: Continued - Robust standard errors in parentheses - *** significant at 1%; ** significant at 5%; * significant at 10% - p-values are reported for the F statistic on the excluded instruments and the Hansen J statistic - In columns (3) and (5) PCE is instrumented with land and livestock units - In columns (4) and (6) PCE is instrumented with land, livestock units, and rainfall*land - omitted age dummy - 7 years, omitted sex dummy - female - Also included in the regressions are dummy variables capturing missing observations for each particular variable (parental schooling, land, household composition variables) where the missing observation was imputed by the sample mean 114 In table 4.8, relative grade attainment is regressed upon a set of child level and house- hold level characteristics using the 1994 data. In tables 4.9 and 4.10, similar regression estimates are reported using data from 1999 and 2004 waves of the ERHS respectively. 
Table 4.8: Determinants of relative grades attained (RGA) among primary school age children from 1994
Covariates | (1) OLS, RGA | (2) OLS, RGA | (3) IV, RGA | (4) IV, RGA
Mother’s schooling | 0.0738 (0.04) | 0.0820* (0.04) | 0.0517 (0.04) | 0.05204 (0.04)
Father’s schooling | 0.1135*** (0.02) | 0.1162*** (0.02) | 0.1147*** (0.02) | 0.1147*** (0.02)
Log (real pce) | 0.0242** (0.01) | | 0.1058** (0.04) | 0.1046** (0.04)
Land | | 0.0032 (0.01) | |
Livestock units | | 0.0050** (0.002) | |
Male dummy | 0.0565 (0.07) | 0.0552 (0.07) | 0.0614 (0.07) | 0.0613 (0.07)
dummy=1 if the child is 8 years | 0.0356 (0.04) | 0.0368 (0.04) | 0.0354 (0.04) | 0.0354 (0.04)
dummy=1 if the child is 9 years | 0.0091 (0.04) | 0.0090 (0.04) | 0.0146 (0.04) | 0.0145 (0.04)
dummy=1 if the child is 10 years | 0.0046 (0.03) | 0.0068 (0.03) | 0.0042 (0.03) | 0.0042 (0.03)
dummy=1 if the child is 11 years | 0.0198 (0.04) | 0.0211 (0.04) | 0.0243 (0.04) | 0.0242 (0.04)
dummy=1 if the child is 12 years | 0.0426 (0.04) | 0.0421 (0.04) | 0.0386 (0.04) | 0.0387 (0.04)
dummy=1 if the child is 13 years | 0.0566 (0.04) | 0.0524 (0.04) | 0.0613 (0.04) | 0.0612 (0.04)
dummy=1 if the child is 14 years | 0.0404 (0.04) | 0.0426 (0.04) | 0.0423 (0.04) | 0.0423 (0.04)
Male dummy*dummy=1 if the child is 8 years | -0.0786 (0.08) | -0.0769 (0.08) | -0.0844 (0.08) | -0.0843 (0.08)
Male dummy*dummy=1 if the child is 9 years | 0.0075 (0.08) | 0.0055 (0.08) | 0.0082 (0.08) | 0.0082 (0.08)
Male dummy*dummy=1 if the child is 10 years | -0.0197 (0.08) | -0.0229 (0.08) | -0.0191 (0.08) | -0.0191 (0.08)
Male dummy*dummy=1 if the child is 11 years | 0.0229 (0.08) | 0.0214 (0.08) | 0.0175 (0.08) | 0.0176 (0.08)
Male dummy*dummy=1 if the child is 12 years | -0.0431 (0.08) | -0.0425 (0.08) | -0.0445 (0.08) | -0.0445 (0.08)
Male dummy*dummy=1 if the child is 13 years | -0.0395 (0.08) | -0.0341 (0.08) | -0.0507 (0.08) | -0.0506 (0.08)
Male dummy*dummy=1 if the child is 14 years | 0.0451 (0.08) | 0.0406 (0.08) | 0.0353 (0.08) | 0.0355 (0.08)
Number of adult males | 0.0022 (0.007) | 0.00007 (0.008) | -0.0007 (0.007) | -0.0007 (0.007)
Number of adult females | -0.0173* (0.009) | -0.0191** (0.009) | -0.0131 (0.009) | -0.0132 (0.009)
Mother’s age | 0.0010 (0.0009) | 0.0011 (0.0009) | 0.0007 (0.0009) | 0.0007 (0.0009)
Age of the head of the household | 0.0001 (0.0005) | 0.0002 (0.0005) | 0.0003 (0.0005) | 0.0003 (0.0005)
Observations | 2047 | 2047 | 2047 | 2047
Village fixed-effects | Yes | Yes | Yes | Yes
F statistic on the excluded instruments from the first-stage regressions | | | 19.99 (0.00) | 15.11 (0.00)
Hansen J statistic | | | 4.31 (0.11) | 5.34 (0.14)
- Robust standard errors in parentheses
- *** significant at 1%; ** significant at 5%; * significant at 10%
- p-values are reported for the F statistic on the excluded instruments and the Hansen J statistic
- In column 3, PCE is instrumented with land and livestock units
- In column 4, PCE is instrumented with land, livestock units, and rainfall*land
- omitted age dummy - 7 years, omitted sex dummy - female
- Also included in the regressions are dummy variables capturing missing observations for each particular variable (parental schooling, land, household composition variables) where the missing observations were imputed by the sample mean; this holds for table 4.9 too

Table 4.9: Determinants of relative grade attained (RGA) among primary school age children from 1999
Covariates | (1) OLS, RGA | (2) OLS, RGA | (3) IV, RGA | (4) IV, RGA
Mother’s schooling | 0.0061 (0.03) | 0.0062 (0.03) | -0.0112 (0.04) | -0.0108 (0.04)
Father’s schooling | 0.1066*** (0.03) | 0.1106*** (0.03) | 0.0888*** (0.03) | 0.0892*** (0.03)
Log (real pce) | 0.0452*** (0.01) | | 0.2364* (0.12) | 0.2322* (0.12)
Land | | -0.0301* (0.01) | |
Livestock units | | 0.0115*** (0.004) | |
Male dummy | -0.0916 (0.12) | -0.0888 (0.12) | -0.0964 (0.12) | -0.0963 (0.12)
dummy=1 if the child is 8 years | -0.2256*** (0.08) | -0.2231*** (0.08) | -0.2225*** (0.08) | -0.2225*** (0.08)
dummy=1 if the child is 9 years | -0.1166 (0.08) | -0.1100 (0.08) | -0.1287 (0.08) | -0.1284 (0.08)
dummy=1 if the child is 10 years | -0.1999** (0.08) | -0.1916** (0.08) | -0.2225*** (0.08) | -0.2220*** (0.08)
dummy=1 if the child is 11 years | -0.1603** (0.08) | -0.1498* (0.08) | -0.1933** (0.08) | -0.1925** (0.08)
dummy=1 if the child is 12 years | -0.1806** (0.08) | -0.1802** (0.08) | -0.1812** (0.08) | -0.1812** (0.08)
dummy=1 if the child is 13 years | -0.1680** (0.07) | -0.1623** (0.07) | -0.1888** (0.07) | -0.1883** (0.07)
dummy=1 if the child is 14 years | -0.1948** (0.08) | -0.1881** (0.08) | -0.2116*** (0.08) | -0.2113*** (0.08)
Male dummy*dummy=1 if the child is 8 years | 0.1751 (0.13) | 0.1740 (0.13) | 0.1560 (0.13) | 0.1564 (0.13)
Male dummy*dummy=1 if the child is 9 years | -0.0002 (0.13) | -0.0031 (0.13) | -0.0159 (0.13) | -0.0155 (0.13)
Male dummy*dummy=1 if the child is 10 years | 0.1907 (0.12) | 0.1894 (0.12) | 0.2037 (0.12) | 0.2034 (0.12)
Male dummy*dummy=1 if the child is 11 years | 0.1332 (0.13) | 0.1149 (0.13) | 0.1747 (0.13) | 0.1738 (0.13)
Male dummy*dummy=1 if the child is 12 years | 0.1876 (0.12) | 0.1909 (0.12) | 0.1715 (0.12) | 0.1718 (0.12)
Male dummy*dummy=1 if the child is 13 years | 0.1048 (0.12) | 0.1011 (0.12) | 0.1051 (0.12) | 0.1050 (0.12)
Male dummy*dummy=1 if the child is 14 years | 0.1317 (0.12) | 0.1234 (0.12) | 0.1468 (0.12) | 0.1465 (0.12)
Number of adult males | 0.0176 (0.01) | 0.0045 (0.01) | 0.0317** (0.01) | 0.0314** (0.01)
Number of adult females | -0.0251** (0.01) | -0.0350*** (0.01) | -0.0111 (0.01) | -0.0114 (0.01)
Mother’s age | -0.0002 (0.007) | -0.0002 (0.006) | -0.0006 (0.01) | -0.0006 (0.01)
Age of the head of the household | 0.0019 (0.001) | 0.0021 (0.001) | 0.0016 (0.001) | 0.0016 (0.001)
Observations | 1877 | 1877 | 1877 | 1877
Village fixed-effects | Yes | Yes | Yes | Yes
F statistic on the excluded instruments from the first-stage regressions | | | 32.63 (0.00) | 22.98 (0.00)
Hansen J statistic | | | 10.97 (0.0009) | 11.45 (0.003)
- Robust standard errors in parentheses
- *** significant at 1%; ** significant at 5%; * significant at 10%
- p-values are reported for the F statistic on the excluded instruments and the Hansen J statistic
- In column 3, PCE is instrumented with land and livestock units
- In column 4, PCE is instrumented with land, livestock units, and rainfall*land
- omitted age dummy - 7 years, omitted sex dummy - female

Table 4.10: Determinants of relative grade attained (RGA) among primary school age children from 2004
Covariates | (1) OLS, RGA | (2) OLS, RGA | (3) IV, RGA | (4) IV, RGA | (5) IV, RGA | (6) IV, RGA
Mother’s schooling | 0.0812** (0.03) | 0.0857** (0.03) | 0.0765* (0.03) | 0.0763* (0.03) | 0.0918** (0.03) | 0.0916** (0.03)
Father’s schooling | 0.0467 (0.03) | 0.0491 (0.03) | 0.0421 (0.03) | 0.0418 (0.03) | 0.0456 (0.03) | 0.0455 (0.03)
Log (real pce) | 0.0417** (0.01) | | 0.0848 (0.07) | 0.0869 (0.07) | 0.1091** (0.04) | 0.1098** (0.04)
Land | | 0.0248 (0.02) | | | |
Livestock units | | 0.0028 (0.003) | | | |
Male dummy | 0.0281 (0.07) | 0.0315 (0.07) | 0.0267 (0.07) | 0.0366 (0.07) | 0.0369 (0.07) | 0.0453 (0.07)
dummy=1 if the child is 8 years | 0.0547 (0.07) | 0.0559 (0.07) | 0.0538 (0.07) | 0.0537 (0.07) | 0.0645 (0.07) | 0.0645 (0.07)
dummy=1 if the child is 9 years | 0.0012 (0.06) | 0.0003 (0.06) | 0.0043 (0.06) | 0.0045 (0.06) | 0.0106 (0.06) | 0.0107 (0.06)
dummy=1 if the child is 10 years | 0.1057 (0.06) | 0.1029 (0.06) | 0.1075* (0.06) | 0.1076* (0.06) | 0.1138* (0.06) | 0.1138* (0.06)
dummy=1 if the child is 11 years | 0.1141 (0.07) | 0.1107 (0.07) | 0.1187 (0.07) | 0.1189 (0.07) | 0.1314* (0.07) | 0.1314* (0.07)
dummy=1 if the child is 12 years | 0.1599** (0.06) | 0.1546** (0.06) | 0.1655** (0.06) | 0.1658** (0.06) | 0.1763*** (0.06) | 0.1764*** (0.06)
dummy=1 if the child is 13 years | 0.0892 (0.06) | 0.0846 (0.06) | 0.0915 (0.06) | 0.0916 (0.06) | 0.0988 (0.06) | 0.0989 (0.06)
dummy=1 if the child is 14 years | 0.1291** (0.06) | 0.1278** (0.06) | 0.1315** (0.06) | 0.1316** (0.06) | 0.1355** (0.06) | 0.1355** (0.06)
Male dummy*dummy=1 if the child is 8 years | 0.0139 (0.10) | 0.0095 (0.10) | 0.0209 (0.10) | 0.0212 (0.10) | 0.0096 (0.10) | 0.0097 (0.10)
Male dummy*dummy=1 if the child is 9 years | 0.0101 (0.09) | 0.0110 (0.08) | 0.0040 (0.09) | 0.0037 (0.09) | -0.0049 (0.09) | -0.0051 (0.09)
Male dummy*dummy=1 if the child is 10 years | -0.0244 (0.09) | -0.0194 (0.09) | -0.0289 (0.08) | -0.0291 (0.08) | -0.0330 (0.08) | -0.0331 (0.08)
Male dummy*dummy=1 if the child is 11 years | -0.0163 (0.09) | -0.0122 (0.09) | -0.0230 (0.09) | -0.0233 (0.09) | -0.0292 (0.09) | -0.0293 (0.09)
Male dummy*dummy=1 if the child is 12 years | -0.0793 (0.08) | -0.0728 (0.09) | -0.0852 (0.08) | -0.0855 (0.08) | -0.0957 (0.08) | -0.0958 (0.08)
Male dummy*dummy=1 if the child is 13 years | 0.0515 (0.08) | 0.0609 (0.08) | 0.0432 (0.08) | 0.0428 (0.08) | 0.0311 (0.08) | 0.0310 (0.08)
Male dummy*dummy=1 if the child is 14 years | 0.0319 (0.08) | 0.0353 (0.08) | 0.0292 (0.08) | 0.0291 (0.08) | 0.0259 (0.08) | 0.0258 (0.08)
Number of adult males | -0.0061 (0.01) | -0.0064 (0.01) | -0.0063 (0.01) | -0.0064 (0.01) | -0.0087 (0.01) | -0.00876 (0.01)
Number of adult females | 0.0082 (0.01) | 0.0067 (0.01) | 0.0127 (0.01) | 0.0129 (0.01) | 0.0224* (0.01) | 0.0225* (0.01)
Mother’s age | -0.0013 (0.002) | -0.0011 (0.002) | -0.0013 (0.002) | -0.0013 (0.002) | -0.0010 (0.002) | -0.0010 (0.002)
Age of the head of the household | coefficients: 0.0011, 0.0007, 0.0007, 0.0008, 0.0008 [only five of the six coefficients appear in the source; column assignment unclear] | standard errors: (0.001) (0.001) (0.001) (0.0009) (0.0009) (0.0009)
Distance to primary school in km | | | | | -0.0212* (0.01) | -0.0214* (0.01)
Dummy=1 if the village has electricity | | | | | 0.1279** (0.06) | 0.1273** (0.06)
Dummy=1 if the village has piped water | | | | | -0.0092 (0.02) | -0.0093 (0.02)
Observations | 1629 | 1629 | 1629 | 1629 | 1629 | 1629
Village fixed-effects | Yes | Yes | Yes | Yes | No | No
F statistic on the excluded instruments from the first-stage regressions | | | 61.97 (0.00) | 45.16 (0.00) | 116.78 (0.00) | 78.03 (0.00)
Hansen J statistic | | | 0.76 (0.38) | 0.82 (0.66) | 0.00 (0.99) | 0.25 (0.88)
- Robust standard errors in parentheses
- *** significant at 1%; ** significant at 5%; * significant at 10%
- p-values are reported for the F statistic on the excluded instruments and the Hansen J statistic
- In columns (3) and (5) PCE is instrumented with land and livestock units
- In columns (4) and (6) PCE is instrumented with land, livestock units, and rainfall*land
- omitted age dummy - 7 years, omitted sex dummy - female
- Also included in the regressions are dummy variables capturing missing observations for each particular variable (parental schooling, land, household composition variables) where the missing observation was imputed by the sample mean
- Relative grade attained = actual grade completed/potential grade.

In tables 4.7 and 4.10, we also control for community-level characteristics, which are available only for the 2004 wave of the ERHS. [19]

Child characteristics and gender differences

Our preferred estimates on enrollment are reported in column 4 of tables 4.5, 4.6, and 4.7. Age dummies, a male dummy, and age-interacted male dummies are included in the regression specifications to account for age, gender, and age-gender specific differences in schooling outcomes. The coefficient estimates on the age dummies from all three years indicate a strong positive association between age and enrollment: the older the child, the more likely he or she is to be enrolled compared to a 7-year-old. In 1994, the coefficient estimates on the age dummies reflect delayed enrollments. The parameter estimates from the 1999 regressions depict an increase in enrollment probabilities among all age groups, suggesting both timely and continued enrollments. By 2004, more children are likely to be enrolled by age 8, and the probability of being enrolled tends to peak at a much earlier age, 12 years.

Age-gender specific differences in enrollments are captured by the parameter estimates on the age-interacted male dummies. The interaction terms in column 4 of tables 4.5, 4.6, and 4.7 suggest that there are few gender differences in enrollment in 1994, with enrollment probabilities only marginally higher among boys.
By 1999 the gender differences are magnified: boys at all ages are more likely to be enrolled than their female counterparts. These gender differences in school enrollment narrow by 2004.

[19] A pooling test combining the samples from the 1994, 1999, and 2004 waves yields an F statistic of 4.33, statistically significant at 1%, rejecting the null that the samples from all three waves can be pooled. Hence the determinants of schooling outcomes are estimated separately for the 1994, 1999, and 2004 waves of the ERHS. We also test whether the coefficients on the socioeconomic characteristics controlled for in the regression specifications vary by gender. A joint test on the interactions between the gender dummy and the socioeconomic characteristics from 1994 yields an F statistic of 1.17 (p-value 0.31), which is statistically insignificant. A similar test on the sample from 1999 yields an F statistic of 1.39 (p-value 0.20), and on the sample from 2004 an F statistic of 0.39 (p-value 0.91). We therefore conclude that the coefficients on the socioeconomic characteristics included in the empirical specifications do not vary by gender, and we estimate our static model pooling boys and girls together. Our pooled specifications allow for age-gender specific differences in schooling attainments.

Our preferred estimates from the relative grade attainment regressions are reported in column 4 of tables 4.8, 4.9, and 4.10. In 1994, only 13% of primary school age children were enrolled, and an even smaller percentage of these children had non-zero completed grades of schooling. There is little variation in the outcome variable by age and hence no significant relationship between age and relative grades. It is only in 1999 that the coefficient estimates on the age dummies suggest a strong negative relationship between age and relative grades, suggesting delayed enrollments that contribute to slower accumulation of schooling grades.
By 2004, relative grades systematically improved among all ages and yet none of the coefficient estimates on the age dummies are statistically significant. There is no evidence of age-specific gender differences in relative grades attained.

Parental characteristics

Parental schooling variables capture the efficiency with which schooling inputs are transformed into actual schooling outcomes. The coefficient estimates on the parental schooling variables reported in column 4 of tables 4.5-4.7 indicate a strong positive relationship between measures of parental schooling and children’s enrollment status. [20] A child whose mother has non-zero grades of schooling is 9 percentage points more likely to be enrolled in 1994, 4 percentage points more likely to be enrolled in 1999, and 6 percentage points more likely to be enrolled in 2004 than a child whose mother has zero accumulated grades. A child whose father has non-zero accumulated grades is 10 percentage points more likely to be enrolled in 1994, 7 percentage points more likely to be enrolled in 1999, and 6 percentage points more likely to be enrolled in 2004 than a child whose father has zero accumulated grades of schooling.

In this paper, we treat parental schooling variables as exogenous. Parental schooling variables can potentially be endogenous because unmeasured innate ability is correlated across generations and is likely to affect both the child’s and the parent’s schooling attainments. However, data restrictions do not allow us to explicitly address this potential source of endogeneity. [21]

In addition to the independent effects of parental schooling, we add interaction terms between the age dummies and the parental schooling variables to capture the age-differential impact of parental schooling on child schooling. The interaction terms are jointly insignificant in all three enrollment and relative grade attainment regressions. [22]

[20] Behrman and Wolfe (1984), Birdsall (1985), Alderman et al. (1997), and Parish and Willis (1993) all find that parental education has an important role in determining child schooling outcomes.

[21] Lillard and Willis (1994) explicitly control for the correlation between parents’ unobservables and child-specific unobservables. Behrman and Rosenzweig (2002, 2005) show that the impact of parental schooling declines once we control for the unobservables that affect both parent and child schooling outcomes.

[22] F statistics on the interaction terms for 1994, 1999, and 2004 respectively, with p-values in parentheses:
- Enrollment regressions, age dummies and mother’s schooling: 0.69 (0.67), 1.32 (0.23), and 1.27 (0.26)
- Enrollment regressions, age dummies and father’s schooling: 1.41 (0.19), 0.91 (0.50), and 0.69 (0.64)
- Enrollment regressions, age dummies and pce: 0.47 (0.85), 1.77 (0.08), and 1.07 (0.38)
- Enrollment regressions, joint F statistic on all three sets of interactions: 0.86 (0.63), 1.30 (0.18), and 1.14 (0.29)
- Relative grade attainment regressions, age dummies and mother’s schooling: 0.64 (0.72), 1.17 (0.31), and 0.98 (0.44)
- Relative grade attainment regressions, age dummies and father’s schooling: 0.94 (0.47), 2.28 (0.02), and 0.84 (0.55)
- Relative grade attainment regressions, age dummies and pce: 1.25 (0.27), 1.57 (0.14), and 1.68 (0.11)
- Relative grade attainment regressions, joint F statistic on all three sets of interactions: 1.02 (0.43), 1.37 (0.12), and 1.15 (0.29)

Household income

Schooling is considered a normal good, and hence an increase in income is likely to have a positive impact on schooling attainments. In this paper we use the logarithm of real per capita household consumption expenditure (PCE) as our measure of full income. As discussed earlier, potential correlation between unobservables and the per capita consumption variable is likely to bias the coefficient estimate on PCE. To address the endogeneity of the income variable, we use an instrumental variable estimation strategy to obtain our preferred, consistent estimates on household income.

We need instruments that are correlated with the per capita consumption variable. The ERHS provides details on land holdings and livestock units, two forms of assets that households own. In a static framework, assets can be treated as exogenously determined. In addition, the land distribution policy in rural Ethiopia is such that land is owned only by the government. The allocation of land to households is determined outside the household’s schooling investment decision. Hence, we use land and livestock units as excluded instruments for PCE in the first-stage regressions.

Ethiopia is primarily an agrarian economy, with more than 80% of the working age population employed in the agricultural sector. Rural households are largely dependent on rainfall for agricultural output. Hence, household food consumption and rainfall are likely to be correlated.
Rainfall can also be treated as exogenous and can therefore be used as an additional instrument for PCE. We construct our measure of rainfall as the average rainfall in mm over the main harvest months in the village. This measure varies only at the village level. To create household-level variation in the impact of rainfall, land is interacted with rainfall, and the interaction term is used as an additional instrument for PCE in our preferred IV estimates reported in column 4 of tables 4.5-4.10. There still remains the question of whether rainfall can be appropriately excluded from the second-stage regressions. Rainfall does not affect schooling directly in the theoretical model described earlier. Hence, the interaction term between land and rainfall can be used as an excluded instrument for the per capita consumption variable.

The preferred IV estimates of log (PCE) are reported in column 4 of tables 4.5-4.10 and are summarized below in tables 4.11 and 4.12. Also included are the IV estimates on log (PCE) reported in column 6 of tables 4.7 and 4.10, where the village fixed-effects are replaced with the actual village-level supply side determinants of schooling.
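The instrumenting strategy described above can be illustrated with a minimal two-stage least squares sketch on simulated data. This is not the estimation code behind tables 4.5-4.10: the data-generating process, the coefficient values, and the names `two_stage_least_squares` and `simulate_and_estimate` are assumptions made purely for illustration, and village fixed-effects and robust standard errors are omitted. Observed income is simulated as a noisy measure of true income, so OLS is attenuated, while land, livestock, and rainfall*land serve as excluded instruments.

```python
import numpy as np


def two_stage_least_squares(y, x_endog, X_exog, Z_excl):
    """Return 2SLS coefficients ordered as [endogenous regressor, exogenous controls...]."""
    n = y.shape[0]
    X = np.column_stack([x_endog.reshape(n, -1), X_exog])  # second-stage regressors
    Z = np.column_stack([Z_excl, X_exog])                  # excluded + included instruments
    # First stage: project every regressor on the full instrument set.
    first_stage_coef, *_ = np.linalg.lstsq(Z, X, rcond=None)
    X_hat = Z @ first_stage_coef
    # Second stage: regress the outcome on the fitted regressors.
    beta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    return beta


def simulate_and_estimate(n=20000, true_effect=0.2, seed=0):
    """Simulate hypothetical data and return (OLS, 2SLS) estimates of the income effect."""
    rng = np.random.default_rng(seed)
    land = rng.gamma(2.0, 1.0, n)
    livestock = rng.gamma(2.0, 1.5, n)
    village = rng.integers(0, 15, n)
    rain = rng.uniform(0.5, 1.5, 15)[village]   # rainfall varies only across villages
    rain_land = rain * land                     # interaction varies across households
    log_pce_true = 0.3 * land + 0.2 * livestock + 0.2 * rain_land + rng.normal(0, 1, n)
    log_pce_obs = log_pce_true + rng.normal(0, 1, n)   # income measured with error
    outcome = true_effect * log_pce_true + rng.normal(0, 1, n)

    const = np.ones((n, 1))
    ols, *_ = np.linalg.lstsq(np.column_stack([log_pce_obs, const]), outcome, rcond=None)
    Z_excl = np.column_stack([land, livestock, rain_land])
    iv = two_stage_least_squares(outcome, log_pce_obs, const, Z_excl)
    return ols[0], iv[0]


if __name__ == "__main__":
    ols_est, iv_est = simulate_and_estimate()
    print(f"OLS: {ols_est:.3f}  2SLS: {iv_est:.3f}")
```

In this simulated setting the OLS coefficient is pulled toward zero by the measurement error, while the 2SLS coefficient should lie close to the assumed true effect of 0.2.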
Table 4.11: Coefficient estimates on log (PCE) as reported in Tables 4.5-4.7
Coefficient estimates on log (PCE) | 1994 | 1999 | 2004
Without IV, column 1 | 0.03*** | 0.04** | 0.06***
With IV, column 4 | -0.01 | 0.27** | 0.17**
With IV, column 6, including actual supply characteristics in the right hand side | | | 0.16***
- *** significant at 1%, ** significant at 5%, * significant at 10%

Table 4.12: Coefficient estimates on log (PCE) as reported in Tables 4.8-4.10
Coefficient estimates on log (PCE) | 1994 | 1999 | 2004
Without IV, column 1 | 0.02** | 0.04*** | 0.04**
With IV, column 4 | 0.10** | 0.23* | 0.08
With IV, column 6, including actual supply characteristics in the right hand side | | | 0.10**
- *** significant at 1%, ** significant at 5%, * significant at 10%

The OLS estimates reported in table 4.11 indicate that a 10% increase in household income increases the enrollment probability by 0.3 percentage points in 1994, rising to a maximum of 0.6 percentage points in 2004. The IV estimate for 2004 suggests that a 10% increase in income increases the enrollment probability by 1.7 percentage points. The OLS estimates reported in table 4.12 indicate that a 10% increase in income increases relative grades by 0.2 percentage points in 1994 and by 0.4 percentage points in 2004, whereas the IV estimates of PCE show that a 10% increase in income is associated with a 1 percentage point increase in relative grades in 1994 and a 0.8 percentage point increase in 2004.

The importance of household income in determining schooling outcomes is re-established by our preferred estimates on PCE. Improvements in household income are positively associated with improvements in schooling outcomes. In the enrollment regressions, the parameter estimate on PCE went from almost 0 in 1994 to 0.17 in 2004, highlighting the differential impact of income in explaining schooling outcomes among the two cohorts of primary school age children.
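The percentage-point readings of the coefficients on log (PCE) follow from the usual level-log approximation. The sketch below assumes a linear specification in which the schooling outcome S is regressed on ln(PCE) with coefficient β; the symbols S, X, β, and γ are notation introduced here for illustration, not taken from the estimated model.

```latex
% Assumed level-log specification: S_i = \alpha + \beta \ln(\mathrm{PCE}_i) + X_i'\gamma + \varepsilon_i
\begin{align*}
\Delta S &= \beta \left[ \ln(1.10 \cdot \mathrm{PCE}) - \ln(\mathrm{PCE}) \right]
          = \beta \ln(1.10) \approx 0.0953\,\beta \approx 0.1\,\beta \\
% Example: 2004 IV enrollment estimate from table 4.11, \beta = 0.17
\Delta S &\approx 0.1 \times 0.17 = 0.017
          \quad \text{(about 1.7 percentage points; } 0.0953 \times 0.17 \approx 0.016 \text{ without rounding)}
\end{align*}
```

The same arithmetic gives roughly 0.3 percentage points for a coefficient of 0.03 and roughly 1 percentage point for a coefficient of 0.10.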
We find that income effects are likely to have a persistent role in explaining the improvements in schooling outcomes among children from rural Ethiopia. In the long run, household income will remain one of the key determinants of schooling outcomes in rural Ethiopia.

The preferred IV estimate of PCE is almost three times higher than the OLS estimate. The large differences between the OLS and IV estimates capture the magnitude of the bias in the OLS parameter estimate. The increase in the coefficient estimate on PCE from OLS to IV suggests that the downward bias in the OLS estimate due to measurement error dominates any upward bias due to omitted variables. Some papers in the literature have not accounted for the endogeneity of the PCE variable, and hence their estimated coefficient on PCE is likely to be both biased and inconsistent.

Our preferred IV estimates are also robust to an important econometric concern, namely instrument validity. An instrument is valid only if it satisfies the following two conditions: (1) the excluded instruments must be strongly correlated with the endogenous regressor; (2) the instruments must be uncorrelated with the error term in the second-stage regression. When the correlation between the instruments and the endogenous regressors is weak, the IV estimates are likely to suffer from greater bias and inconsistency than the OLS parameter estimate. It is therefore important to verify that the IV estimates reported here satisfy both conditions.

Stock et al. (2002) and Staiger and Stock (1997) discuss a test statistic that can be used to test the relevance of the instruments used in IV estimation. Stock et al. (2002) and Stock and Yogo (2005) define an instrument to be weak based on two criteria. The first is based on the relative two-stage least squares (TSLS) bias: the instrument is deemed strong if the Cragg-Donald F statistic is large enough that the TSLS bias relative to the OLS bias is at most x% (say 5, 10, or 15, depending on the extent of bias the author wants to allow). The second criterion is based on size: the instruments are deemed strong only if the Cragg-Donald F statistic is large enough that a 5% hypothesis test rejects no more than x% of the time; otherwise the instruments are deemed weak. The main limitation of the Cragg-Donald test statistic is that it relies on the assumption of i.i.d. errors and hence is not robust to heteroskedasticity in the error term. There appears to be no clear consensus in the literature on an alternative test statistic for weak instruments with non-i.i.d. errors. In that case, it is suggested that the robust F statistic from the first-stage regression be used as a test for the presence of weak instruments.

Staiger and Stock (1997) suggest a simple rule of thumb to test for instrument relevance: with a single endogenous regressor, instruments are deemed weak if the first-stage F statistic on the excluded instruments is less than 10. The F statistic on the excluded instruments is well over 10 in all the IV specifications, rejecting the null of a weak correlation between the instruments and the endogenous regressor.

The second condition for instrument validity is the lack of correlation between the errors in the second stage and the excluded instruments from the first-stage regressions.
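As an aside on the first condition, the rule-of-thumb check on instrument relevance can be sketched as follows. The function below computes the plain homoskedastic F statistic on the excluded instruments by comparing first-stage regressions with and without them; it is not the heteroskedasticity-robust F statistic reported in the tables, and the names `first_stage_f` and `demo` as well as the simulated data are assumptions made for illustration.

```python
import numpy as np


def first_stage_f(x_endog, X_exog, Z_excl):
    """Homoskedastic F statistic for the joint significance of the excluded instruments."""
    n = x_endog.shape[0]
    X_exog = X_exog.reshape(n, -1)
    Z_excl = Z_excl.reshape(n, -1)

    def rss(regressors):
        # Residual sum of squares from regressing the endogenous variable on `regressors`.
        coef, *_ = np.linalg.lstsq(regressors, x_endog, rcond=None)
        resid = x_endog - regressors @ coef
        return float(resid @ resid)

    full = np.column_stack([X_exog, Z_excl])
    rss_restricted = rss(X_exog)     # first stage without the excluded instruments
    rss_unrestricted = rss(full)     # first stage with the excluded instruments
    q = Z_excl.shape[1]              # number of excluded instruments
    dof = n - full.shape[1]          # residual degrees of freedom
    return ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / dof)


def demo(n=1000, seed=1):
    """Return first-stage F statistics for a relevant and an irrelevant instrument set."""
    rng = np.random.default_rng(seed)
    const = np.ones((n, 1))
    z = rng.normal(size=(n, 2))
    noise = rng.normal(size=n)
    x_related = 0.5 * z[:, 0] + 0.5 * z[:, 1] + noise   # instruments shift the regressor
    x_unrelated = noise                                 # instruments carry no information
    return first_stage_f(x_related, const, z), first_stage_f(x_unrelated, const, z)


if __name__ == "__main__":
    f_related, f_unrelated = demo()
    print(f"relevant instruments: F = {f_related:.1f}; irrelevant instruments: F = {f_unrelated:.2f}")
```

Under the rule of thumb, the first value would clear the threshold of 10 and the second would not.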
The Hansen J statistic is a joint test of the lack of correlation between the errors in the second-stage regression and the excluded instruments, and of the correct exclusion of the instruments from the second-stage regressions. Under the null, both of these conditions hold. The Hansen J statistic is reported in tables 4.5-4.10. In most specifications we cannot reject the null, i.e. the instruments are uncorrelated with the error term and appropriately excluded from the second-stage regression specification; the 1999 relative grade attainment regressions in table 4.9, with p-values of 0.0009 and 0.003, are an exception.

The preferred estimates of PCE reported in this paper are robust to an important econometric concern, the weak instrument problem. The weak instrument problem has received very little attention in the schooling literature. Hence, our preferred estimates are more robust than the parameter estimates reported earlier in this literature.

In earlier work, Behrman and Knowles (1999) review the role of household income using 42 studies covering 21 countries and show that there is a significant association between schooling outcomes and income, as also established in this paper. They find that about three-fifths of the schooling indicators show significant associations between schooling and income. They compute the median income elasticity to be 0.07, with the highest being 0.20 for low income countries - Ghana, Cote d’Ivoire, China and Nepal. The income elasticities reported in Behrman and Knowles (1999) are likely to be a lower bound on the true estimate, since most of the studies used to compute the elasticity do not treat measures of PCE as endogenous. Additionally, rural areas in general have higher income effects than urban areas. The strong association between income and schooling outcomes also often reflects other unobservable factors such as poor credit markets, parental preferences, and the household’s ability to pay fees and substitute for farm labor. Since the income measure reported here also potentially captures the impact of unobservable factors, policy makers must target more than one determinant of household income so that improvements in income can have a large impact on schooling outcomes.

Household composition variables and supply side factors

We use the number of adult males and the number of adult females as additional right hand side variables to control for the impact of household composition on schooling outcomes. The coefficient estimates on adult males, adult females, mother’s age, and age of the head of the household do not suggest any systematic pattern in their impact on schooling outcomes.

The usual supply side characteristics that determine schooling outcomes include the number of schools available, distance to school, availability of water and sanitation facilities in the community, teacher-pupil ratio, number of blackboards, teaching materials, and quality of roads in the village. A number of papers find that distance to school, school characteristics (number of blackboards, number of desks, teacher-pupil ratio, leaking classrooms), and community resources have a significant role in determining schooling outcomes [Glewwe and Jacoby (1994), Glewwe and Jacoby (1995), Dostie and Jayaraman (2006), Lavy (1996), Schaffner (2004)]. However, the magnitude and the role of the supply side factors have not been systematically established in the literature.

The ERHS collected detailed supply side information only in 2004, not in all three waves of the survey data used in this paper. To establish the importance of the supply side characteristics, we replace the village-level fixed-effects with the actual supply side characteristics in columns 5 and 6 of tables 4.7 and 4.10.
The supply side factors included are distance to primary school measured in km, a dummy for access to electricity, and a dummy for availability of piped water. We find that the coefficient estimates on the supply side factors reported in column 6 of table 4.7 are all statistically insignificant and have little impact on school enrollment. The coefficient estimates reported in column 6 of table 4.10 indicate that distance to primary school in km and the availability of electricity in the village both have a statistically significant impact on relative grades. Distance to school has a negative impact on the child’s relative grades, consistent with most other work in the literature [Lavy (1996), Schaffner (2004)]. Children residing in villages that have access to electricity have higher relative grades than children who live in villages without access to electricity.

There is evidence to suggest that the correlation between village-specific unobservables and the supply side characteristics biases the coefficient estimates on the supply side variables [Rosenzweig and Wolpin (1986)]. Ghuman et al. (2005) show that the potential correlation between household-specific observables and village-specific unobservables can bias the parameter estimates on the household characteristics as well. It is therefore useful to gauge the extent of bias in the coefficients on the household characteristics by comparing estimates with village fixed-effects to estimates in which the village dummies are replaced with the actual supply side variables. In the regressions using the 2004 data, we make this comparison to determine the extent of bias (if any) in the coefficients on the household characteristics when the village fixed-effects are excluded.
We find that the inclusion of the village supply side characteristics does not change the parameter estimates on the demand side variables (for instance, parental schooling and household income) reported in column 4 of tables 4.7 and 4.10. This indicates that replacing the village fixed-effects with the actual supply side factors is not likely to bias the demand side coefficient estimates.

In addition to the role played by household income, the impacts of child and household characteristics reported in this paper are all consistent with related work by Chaudhury et al. (2006) and Schaffner (2004) using cross-sectional data from Ethiopia. These papers additionally emphasize the role played by school supply side characteristics and village-level characteristics in determining schooling improvements in rural Ethiopia. In our work, due to limited data on community characteristics, we have no direct evidence on the role played by village characteristics in improving schooling outcomes. The main limitation of the analysis of supply side factors in the aforementioned papers is that they neither account for endogenous program placement effects nor acknowledge the potential correlation between village-specific unobservables and household-specific observables, which can bias the coefficient estimates on the household characteristics as well.

4.5.2 Dynamic results

The static results discussed so far capture only the impact of current socioeconomic factors in explaining current schooling outcomes. However, the schooling outcome in any period t is not just a function of current-period factors and resources. It is the demand and supply side factors from all periods (0 to t) that determine an individual’s complete trajectory of schooling outcomes.
To capture the impact of all factors that led to the choice of current schooling outcomes, we estimate a dynamic conditional schooling demand function, where the coefficient on the one-period lagged schooling outcome captures the history of all demand and supply side determinants of schooling outcomes.

We observe that in rural Ethiopia the majority of the improvements in household income took place between 1994 and 1999, with little change in income between 1999 and 2004. Despite the small change in household income during 1999-2004, schooling outcomes continued to improve. This section establishes the relationship between the demand for schooling in the last period and its continued impact on current schooling outcomes. To estimate a dynamic conditional schooling demand function, we construct a panel of primary school age children aged 7-14 years in 1994 and follow them through the 1999 and 2004 waves of the ERHS.

We first estimate a static schooling demand function using observations on the panel respondents, pooling data from the 1994, 1999, and 2004 waves of the ERHS. In table 4.13, preferred IV estimates are reported in column 1, where the enrollment dummy is regressed on a set of child-level, household-level, and community-level characteristics using observations on the panel respondents.
Table 4.13: Determinants of enrollment and relative grades attained among panel respondents

Covariates                                  (1) IV      (2) IV
                                            Enroll      RGA
Male dummy                                  -0.0961     0.0003
                                            (0.62)      (0.32)
Spline if age<=14.99                        -0.0071     -0.0117
                                            (0.03)      (0.01)
Spline if 14.99<age<17.99                   -0.0328     -0.0098
                                            (0.02)      (0.01)
Spline if age>=17.99                        -0.0309*    0.0078
                                            (0.01)      (0.009)
Male dummy*spline if age<=14.99             0.0108      0.0037
                                            (0.04)      (0.02)
Male dummy*spline if 14.99<age<17.99        0.0214      0.0083
                                            (0.03)      (0.01)
Male dummy*spline if age>=17.99             -0.0206     -0.0021
                                            (0.02)      (0.01)
Mother’s schooling                          0.1406***   0.0547**
                                            (0.05)      (0.02)
Father’s schooling                          0.0816**    0.0557**
                                            (0.03)      (0.02)
Log (real pce)                              0.5097**    0.2494*
                                            (0.25)      (0.14)
Number of adult males                       0.0160      0.0102
                                            (0.01)      (0.008)
Number of adult females                     0.0801***   0.0243
                                            (0.02)      (0.01)
Mother’s age                                -0.0043***  -0.0006
                                            (0.001)     (0.0008)
Age of the head of the household            -0.0026     -0.0021*
                                            (0.002)     (0.001)
Observations                                1618        1618
Village*time fixed-effects                  Yes         Yes
F statistic on the excluded instruments     6.12        6.12
  from the first-stage regressions          (0.00)      (0.00)
Hansen J statistic                          0.85        2.97
                                            (0.65)      (0.22)

- Robust standard errors in parentheses
- *** significant at 1%; ** significant at 5%; * significant at 10%
- p-values are reported for the F statistic on the excluded instruments and the Hansen J statistic
- In columns (1) and (2), PCE is instrumented with two-period lagged measures of land and livestock units
- Also included in the regressions are dummy variables capturing missing observations for each particular variable (parental schooling, land, household composition variables) where the missing observation was imputed by the sample mean

In column 2, similar estimates are reported for relative grade attainment. The coefficient estimates on parental schooling and household income indicate a positive relationship with enrollment and relative grade attainment, in line with most of the results outlined in the previous section.
The actual parameter estimates are now slightly higher for the parental schooling variables and particularly higher for household income. The difference in the magnitude of the parameter estimates obtained from the cross-sectional and panel regressions for the static model indicates strong sample composition effects; that is, the sample itself is changing over time, and hence the relative impact of parental schooling and household income differs between the cross-sectional observations and the panel observations.

In the dynamic specification, the coefficient estimate on lagged enrollment status/lagged relative grade attainment is of primary interest. OLS coefficient estimates of the lagged dependent variable are likely to be biased due to the presence of omitted variables and random measurement error in the data. Omitted variable bias stems from the positive correlation between the lagged schooling outcome and other child and household specific time-invariant unobservables, which creates an upward bias in the estimated coefficient on the lagged dependent variable. Random measurement error in the data additionally biases the estimated coefficient on the lagged outcome variable towards zero. The lagged PCE variable included on the right hand side of the dynamic specification captures the household’s access to resources in the long run. This measure of household income is also treated as endogenously determined due to both omitted variables and measurement error in household income. We use variants of the IV/GMM estimation strategy to deal with the endogeneity in the lagged schooling variable and lagged PCE. The coefficient estimates from the dynamic regressions are reported in tables 4.14 and 4.15.
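The two opposing sources of bias described above can be illustrated with a small simulation. This is only a sketch on artificial data, not the ERHS sample: the persistence parameter `rho`, the variance settings, and the function name `ols_slope_on_lagged_outcome` are all assumptions chosen for illustration.

```python
import numpy as np


def ols_slope_on_lagged_outcome(rho=0.3, n=20000, sd_effect=1.0, sd_noise=0.0, seed=0):
    """OLS slope from regressing the current outcome on the observed lagged outcome."""
    rng = np.random.default_rng(seed)
    # Time-invariant unobservable (for example, ability); hypothetical data.
    effect = rng.normal(0.0, sd_effect, n)
    lagged = effect + rng.normal(0.0, 1.0, n)                   # true lagged outcome
    current = rho * lagged + effect + rng.normal(0.0, 1.0, n)   # true persistence is rho
    lagged_observed = lagged + rng.normal(0.0, sd_noise, n)     # random reporting error
    cov = np.cov(lagged_observed, current)
    return cov[0, 1] / cov[0, 0]


no_bias = ols_slope_on_lagged_outcome(sd_effect=0.0, sd_noise=0.0)     # close to rho
upward = ols_slope_on_lagged_outcome(sd_effect=1.0, sd_noise=0.0)      # omitted variable only
attenuated = ols_slope_on_lagged_outcome(sd_effect=0.0, sd_noise=1.0)  # measurement error only
both = ols_slope_on_lagged_outcome(sd_effect=1.0, sd_noise=1.0)        # the two partly offset
print(no_bias, upward, attenuated, both)
```

With both sources present, the OLS slope lies between the two one-sided cases, so the net bias depends on their relative strength in the data at hand.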
Table 4.14: Dynamic schooling enrollment demand function

Covariates                                  (1) OLS     (2) IV      (3) IV      (4) IV      (5) IV
                                            Enroll      Enroll      Enroll      Enroll      Enroll
Lagged enrollment                           0.3422***   0.3220***   0.3961***   0.2466***   0.2970***
                                            (0.02)      (0.09)      (0.04)      (0.09)      (0.09)
Lagged log (real pce)                       0.0046      -0.0311     -0.0169     -0.0330     -0.0487*
                                            (0.01)      (0.04)      (0.03)      (0.04)      (0.02)
Male dummy                                  0.0288                  0.0912
                                            (0.11)                  (0.45)
Spline at lag age<14.99 years               -0.0249***  -0.4292     -0.0181     -0.4263     -0.4125
                                            (0.008)     (0.55)      (0.02)      (0.53)      (0.54)
Spline at lag age>=14.99 years              -0.0339**   -0.4288     -0.0396**   -0.4323     -0.4140
                                            (0.01)      (0.54)      (0.01)      (0.52)      (0.54)
Male dummy*Spline at lag age<14.99 years    0.0029      0.0006      -0.0026     0.0008      0.0007
                                            (0.009)     (0.01)      (0.03)      (0.01)      (0.01)
Male dummy*Spline at lag age>=14.99 years   -0.0090     -0.0135     -0.0037     -0.0123     -0.0127
                                            (0.02)      (0.02)      (0.02)      (0.02)      (0.02)
Mother’s schooling                          0.0983**                0.07423
                                            (0.04)                  (0.05)
Father’s schooling                          0.0627**                0.0224
                                            (0.03)                  (0.04)
Lagged no. of adult males                   0.0054      -0.0114     -0.0105     -0.0091     -0.0108
                                            (0.01)      (0.02)      (0.01)      (0.02)      (0.02)
Lagged no. of adult females                 0.0302**    0.0186      0.0353***   0.0195      0.0172
                                            (0.01)      (0.02)      (0.01)      (0.02)      (0.02)
Mother’s age                                -0.0024*                -0.0030*
                                            (0.001)                 (0.001)
Lagged age of the head of the household     -0.0013     -0.0044     0.0008      -0.0039     -0.0040
                                            (0.001)     (0.003)     (0.001)     (0.003)     (0.003)
Observations                                1618        809         809         809         809
Village*time fixed-effects                  Yes         Yes         Yes         Yes         Yes
F statistic on the excluded instruments                 111.34      728.82      61.03       116.78
  from the first-stage regressions                      (0.00)      (0.00)      (0.00)      (0.00)
Hansen J statistic                                      0.00        0.00        11.60       2.27
                                                                                (0.003)     (0.13)
Difference on the coefficient estimate                                                      0.01
  on first-differenced lagged log (PCE)                                                     (0.03)
  obtained using a Hausman specification

- Robust standard errors in parentheses
- *** significant at 1%; ** significant at 5%; * significant at 10%
- p-values are reported for the F statistic on the excluded instruments and the Hansen J statistic
- In column (2), first-differences in lagged PCE and lagged enrollment are instrumented with two-period lagged PCE and two-period lagged enrollment
- In column (3), lagged enrollment and lagged PCE are instrumented with the first-differences in lagged PCE and first-differences in lagged enrollment
- In column (4), first-differences in lagged enrollment and lagged PCE are instrumented with two-period lagged enrollment, two-period lagged PCE, two-period lagged PCE*two-period lagged rainfall, two-period lagged rainfall*mother’s schooling
- In column (5), first-differences in lagged enrollment is instrumented with two-period lagged enrollment, two-period lagged rainfall*mother’s schooling, first-differenced lagged pce is now treated as exogenous
- Also included in the regressions are dummy variables capturing missing observations for each particular variable (parental schooling, land, household composition variables) where the missing observation was imputed by the sample mean

Table 4.15: Dynamic schooling relative grade attainment demand function

Covariates                                  (1) OLS     (2) IV      (3) IV      (4) IV      (5) IV
                                            RGA         RGA         RGA         RGA         RGA
Lagged relative grade attainment (RGA)      0.6200***   0.2568***   0.7303***   0.2579***   0.2566***
                                            (0.02)      (0.06)      (0.06)      (0.06)      (0.06)
Lagged log (real pce)                       0.0213***   -0.0030     0.0081      -0.0029     0.0051
                                            (0.007)     (0.01)      (0.01)      (0.01)      (0.007)
Male dummy                                  0.0134                  -0.0670
                                            (0.05)                  (0.16)
Spline at lag age<14.99 years               -0.0096**   -0.0494     0.0133      -0.0495     -0.0567
                                            (0.004)     (0.14)      (0.009)     (0.14)      (0.14)
Spline at lag age>=14.99 years              0.0141*     -0.0298     0.0022      -0.0299     -0.0373
                                            (0.007)     (0.14)      (0.008)     (0.14)      (0.14)
Male dummy*Spline at lag age<14.99 years    0.0022      0.0019      0.0069      0.0019      0.0018
                                            (0.004)     (0.003)     (0.01)      (0.003)     (0.003)
Male dummy*Spline at lag age>=14.99 years   0.0005      0.0011      0.0018      0.0011      0.0009
                                            (0.009)     (0.007)     (0.01)      (0.007)     (0.007)
Mother’s schooling                          0.0329*                 0.0468**
                                            (0.01)                  (0.02)
Father’s schooling                          0.0130                  0.0129
                                            (0.01)                  (0.01)
Lagged no. of adult males                   0.0111**    0.0112      0.0125**    0.0112      0.0115*
                                            (0.004)     (0.006)     (0.0058)    (0.006)     (0.006)
Lagged no. of adult females                 0.0091*     0.0150*     0.0148**    0.0150*     0.0158*
                                            (0.005)     (0.008)     (0.006)     (0.008)     (0.008)
Lagged mother’s age                         -0.0003                 -0.0002
                                            (0.0006)                (0.0006)
Lagged age of the head of the household     -0.0008     0.0009      -0.0001     0.0009      0.0008
                                            (0.0005)    (0.0009)    (0.0007)    (0.0009)    (0.0009)
Observations                                1618        809         809         809
Village*time fixed-effects                  Yes         Yes         Yes         Yes         Yes
F statistic on the excluded instruments                 145.86      30.43       95.18       141.77
  from the first-stage regressions                      (0.00)      (0.00)      (0.00)      (0.00)
Hansen J statistic                                      0.00        0.00        0.02        0.05
                                                                                (0.88)      (0.81)
Difference on the coefficient estimate                                                      -0.008
  on first-differenced lagged log (PCE)                                                     (0.01)
  obtained using a Hausman specification

- Robust standard errors in parentheses
- *** significant at 1%; ** significant at 5%; * significant at 10%
- p-values are reported for the F statistic on the excluded instruments and the Hansen J statistic
- In column (2), first-differences in lagged PCE and lagged RGA are instrumented with two-period lagged PCE and two-period lagged RGA
- In column (3), lagged RGA and lagged PCE are instrumented with the first-differences in lagged PCE and first-differences in lagged RGA
- In column (4), first-differences in lagged RGA and lagged PCE are instrumented with two-period lagged RGA, two-period lagged PCE, two-period lagged PCE*two-period lagged rainfall, two-period lagged rainfall*mother’s schooling
- In column (5), first-differences in lagged RGA are instrumented with two-period lagged RGA, two-period lagged rainfall*mother’s schooling, first-differenced lagged pce is now treated as exogenous
- Also included in the regressions are dummy variables capturing missing observations for each particular variable (parental schooling, land, household composition variables) where the missing observation was imputed by the sample mean

We follow two variants of the GMM estimation strategy - Arellano-Bond (1991), and Arellano-Bover (1995). Using the approach outlined in Arellano-Bond (1991) we instrument the first-differences in lagged schooling outcome (enrollment and relative grade attainment) variable and lagged PCE with the two-period lagged schooling outcome variable and two-period lagged PCE respectively, assuming that the measurement error in schooling outcomes and household pce are serially uncorrelated over time.

Following an Arellano-Bover (1995) estimation strategy, we instrument the levels in lagged PCE and lagged schooling outcome variables with the first-differences in lagged PCE and first-differences in lagged schooling outcome (enrollment and relative grade attainment) variables under the assumption that the first-differenced variables are uncorrelated with the time-invariant unobservables.
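The first-difference instrumenting step described above can be sketched on a simulated three-wave panel. This is a simplified, just-identified illustration with one regressor and one instrument, not the estimator actually run on the ERHS data. The data generating process, the value of `rho`, and the helper names are assumptions made for this example.

```python
import numpy as np


def simulate_three_wave_panel(rho=0.3, n=100000, seed=0):
    """Hypothetical panel: outcome in three waves with a time-invariant child effect."""
    rng = np.random.default_rng(seed)
    effect = rng.normal(size=n)                        # unobserved, fixed over time
    y0 = effect + rng.normal(size=n)                   # two-period lagged outcome
    y1 = rho * y0 + effect + rng.normal(size=n)        # one-period lagged outcome
    y2 = rho * y1 + effect + rng.normal(size=n)        # current outcome
    return y0, y1, y2


def ols_slope(x, y):
    """Slope from a simple regression of y on x."""
    cov = np.cov(x, y)
    return cov[0, 1] / cov[0, 0]


def iv_slope(z, x, y):
    """Just-identified IV slope: cov(z, y) / cov(z, x)."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]


y0, y1, y2 = simulate_three_wave_panel()

# OLS in levels is contaminated by the time-invariant effect.
ols_levels = ols_slope(y1, y2)

# First-differencing removes the effect; the differenced lag (y1 - y0) is then
# instrumented with the two-period lagged level y0, which is uncorrelated with
# the differenced error when the errors are serially uncorrelated.
iv_first_diff = iv_slope(y0, y1 - y0, y2 - y1)

print(ols_levels, iv_first_diff)
```

In this artificial setting the levels OLS slope overstates persistence, while the differenced IV slope is close to the assumed `rho`. The sketch relies on the same no-serial-correlation assumption that the text flags as strong.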
Lastly, we follow another variant of the Arellano-Bond estimator, where we instrument the first-differences in lagged PCE and first-differences in lagged schooling outcome (enrollment and relative grade attainment) variables with the two-period lagged schooling outcome, two-period lagged PCE interacted with the two-period lagged rainfall, and the two-period lagged rainfall interacted with mother’s schooling.

All three estimation strategies above rely on the assumption of no first-order and second-order serial correlation in the error terms for the schooling outcome. This is a strong assumption and may not necessarily be satisfied in a dynamic panel model [Deaton (1997), Blundell and Bond (1998)]. However, little can be done to test this assumption with a short panel, especially when we do not have instruments that do not rely on this assumption.

Our preferred IV estimates on the lagged dependent variable are reported in columns 2 and 4 of tables 4.14 and 4.15. The parameter estimate on the lagged outcome variable indicates a strong positive association between current schooling outcomes and lagged schooling outcomes. The dynamic enrollment regressions indicate that a child who was enrolled in the last period is 32 percentage points more likely to be enrolled today compared to a child who was not enrolled in the last period. This suggests that even a one-time policy initiative taken on the part of governments will translate into continued enrollments in subsequent periods, as captured by the coefficient estimate on the lagged enrollment variable. We find that the OLS coefficient estimates from the dynamic enrollment regressions in table 4.14 are quite close to the results obtained using the different GMM estimation strategies.
This suggests that the usual upward bias expected in the OLS coefficient estimate of the lagged dependent variable is offset by the downward bias caused by random measurement error, thereby bringing the total bias close to zero. It is also possible that the bias caused by random measurement error is minimal and that ability bias (see footnote 23) is not an important source of bias in estimating dynamic enrollment regressions.

Footnote 23: Ability bias is the bias caused by the correlation between an individual’s innate ability and his lagged schooling outcome. Children with higher ability are more likely to be enrolled in school. They are also likely to accumulate more completed grades of schooling on average.

We also examine the relationship between current relative grade attainment and lagged relative grade attainment. The dynamic regression results for relative grade attainment are reported in table 4.15. We find that higher grade attainment in the last period is positively associated with grade attainment in the current period. Our preferred estimate of 0.25 on lagged relative grade attainment indicates that the lower the relative grades accumulated in the initial period, the greater will be the difference between actual grades and potential grades accumulated over the life course. The coefficient estimate on the lagged dependent variable falls from 0.62 under OLS to 0.25 (table 4.15) following an Arellano-Bond estimation strategy. Measurement error in grade attainment is much lower than in the data on enrollment, as extensive effort was put into ensuring that the data on completed grades of schooling were consistent. However, for enrollment we had to rely simply on the individual’s responses, and few double checks were possible to reduce measurement error. The low measurement error and potentially larger ‘ability bias’ in grade attainment make the OLS coefficient estimates on lagged relative grade attainment biased upwards.
First-differencing eliminates the sources of the upward bias on the lagged dependent variable, as seen in the parameter estimates obtained using Arellano-Bond.

In column 5 of tables 4.14 and 4.15, we also report coefficient estimates on the lagged schooling outcome variable using our preferred Arellano-Bond type estimation strategy, now treating lagged pce as exogenous. We also report in tables 4.14 and 4.15 the difference in the coefficient estimates on lagged pce, obtained from a Hausman specification test of whether treating lagged pce as exogenous is valid. The estimated differences reported in tables 4.14 and 4.15 are 0.01 and -0.008 respectively, and neither is statistically significant, which implies that the null of exogeneity of the lagged consumption variable is not rejected.

The IV estimates obtained in the dynamic specifications satisfy the conditions necessary for valid instruments outlined earlier, as indicated by the F statistic on the excluded instruments and the Hansen J statistic reported at the end of tables 4.14 and 4.15. The coefficients on the lagged schooling outcome variables indicate a strong positive relationship between current and lagged schooling outcomes. The magnitude and the sources of bias between the OLS and IV estimates differ by the measure of schooling outcome used in the regression specification. For example, the OLS estimates are close to the IV estimates in the enrollment regressions but quite different from the IV estimates in the relative grade attainment regressions.
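The exogeneity check on lagged pce discussed above can be made concrete with a back-of-the-envelope calculation from the last rows of tables 4.14 and 4.15. The sketch below assumes that the figures in parentheses under the reported differences are standard errors and uses a normal approximation; neither assumption is stated in the text, and the function name is hypothetical.

```python
import math


def two_sided_p_value(difference, standard_error):
    """z-ratio and two-sided p-value under a standard normal approximation."""
    z = difference / standard_error
    return z, math.erfc(abs(z) / math.sqrt(2.0))


# Differences and (assumed) standard errors as printed in tables 4.14 and 4.15.
reported = {
    "enrollment (table 4.14)": (0.01, 0.03),
    "relative grade attainment (table 4.15)": (-0.008, 0.01),
}

results = {name: two_sided_p_value(d, se) for name, (d, se) in reported.items()}
for name, (z, p) in results.items():
    print(f"{name}: z = {z:.2f}, p = {p:.2f}")
```

Under these assumptions both p-values are well above 0.10, which is in line with the statement that the null of exogeneity is not rejected.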
4.6 Concluding remarks

In this paper, we examine the impact of individual level and household level characteristics in determining current schooling outcomes, as measured by enrollment status and relative grade attainment, using both cross-sectional and panel data methods. Mother’s education, father’s education, log of real per capita consumption expenditure, age of the child, and gender of the child are identified as important determinants of schooling outcomes. We find that the role played by household income in explaining the improvements in schooling attainment is strong and has substantially increased during the last decade. Our preferred estimates on household income increase from 0.06 in 1994 to 0.17 in 2004, capturing the role played by income during both low income periods (1994) and high income periods (2004). We treat PCE as being endogenously determined and show that inference based on OLS estimates of household income can be potentially misleading. The IV coefficient estimates of income reported in this paper are also robust to the problem of weak instruments.

In addition to the static regressions, we also estimate a dynamic conditional schooling demand function to determine the association between current schooling and lagged schooling. We find strong associations between current and lagged schooling outcomes, which are omitted from the static regression results. We find that children who were enrolled in the last period are 32 percentage points more likely to be enrolled today than children who were not enrolled in the last period. We also find that an individual’s past schooling resources contribute towards the child’s current relative grade attainment. The coefficient on the lagged schooling outcome explains the continued improvements in schooling outcomes.
The coefficient estimates reported in the dynamic specification are robust to omitted variables and measurement error in data, under the assumption that the random time-varying error term in the schooling outcome variable is serially uncorrelated over time.

The coefficient estimates reported in this paper are robust to a number of methodological and econometric concerns. We have shown earlier in the paper that the results obtained are not likely to be contaminated by concerns regarding non-random sample selection, attrition, omitted variables bias, measurement error bias, and weak instrument bias.

Chapter 5

Conclusion

In view of the ever growing concern among development economists for improving children’s health and education outcomes, this dissertation identifies the socioeconomic determinants of nutritional outcomes and schooling outcomes among children from developing countries. The findings of the second chapter of my dissertation suggest that it is parental height, household income, price of consumption goods, and measures of community infrastructure that are most important for improving health status among children. The results outlined here call attention to programs and policies that focus on community level infrastructure development, regulating prices of essential consumption goods, and providing access to credit. All these together can decrease childhood malnutrition in the population, which can further improve average grades of schooling completed and wage earnings in the long run.

The third chapter identifies the extent to which childhood malnutrition affects an individual’s subsequent health status. I find evidence to suggest that children are able to partially recover from the deficits in health status caused by chronic malnutrition during childhood; that is, malnutrition during childhood is not likely to lock these children into lower health status, as measured by height in centimeters, in the future.
As found in this dissertation, in the presence of partial recovery from chronic malnutrition, by adolescence, a malnourished child will grow to be 0.95 cm shorter than a well-nourished child. In the absence of any recovery potential, by adolescence, a malnourished child will grow to be 4.15 cm shorter than a well-nourished child. Further implications calculated in chapter 3 suggest that a decline in stature of 0.95 cm lowers schooling attainment by 0.6 grades of schooling. The third chapter also examines whether the potential for recovery from chronic malnutrition differs with age. I find only some evidence that the recovery potential is marginally higher among younger children than among older cohorts. From a practical standpoint, the presence of partial catch-up effects and age-differential catch-up effects suggests that continued efforts must be made on the part of households and policy makers towards improving children’s nutritional status at all ages. However, special emphasis must be placed on younger age groups, as their catch-up potential is still the highest.

The fourth chapter identifies the socioeconomic determinants of schooling enrollments and relative grades. The results from this chapter suggest that it is parental schooling and household income that have the most important role in explaining schooling outcomes. Hence programs and policies focused on improving parental education and household income will translate into improvements in enrollments and relative grades. The main focus of the fourth chapter is to capture the association between schooling decisions in the last period and current schooling outcomes. I find that a child who was enrolled in school in the last period is 32 percentage points more likely to be enrolled today compared to his/her counterpart who was not enrolled in the last period.
There also exists a strong association between relative grades accumulated in the last period and relative grades today. The dependence on past schooling outcomes suggests that even a one-time policy initiative targeted towards improving household incomes in rural Ethiopia will not only improve children’s schooling outcomes today but also translate into improvements in the child’s complete trajectory of future schooling attainment.

It is important that policy prescription is drawn from good empirical work. The estimation strategies used in this dissertation address both omitted variables bias and measurement error bias. The results reported also address a number of econometric concerns such as sample attrition, selection problems, and weak instruments.

To summarize, this dissertation uses both static and dynamic frameworks to outline the determinants of child health and education. The static results characterize the factors that must be targeted to (1) improve nutritional status among children in Indonesia and (2) improve schooling enrollments and grade attainment among children from rural Ethiopia. The dynamic results, on the other hand, indicate two things. (1) There exists catch-up potential in health outcomes; that is, children who suffered from chronic malnutrition during childhood are not likely to remain undernourished forever. The presence of catch-up potential suggests that focused attempts must be made towards improving nutritional outcomes of children at all ages, with special emphasis on the very young. (2) The strong association between past schooling and current schooling suggests that even a one-time policy initiative targeted towards improving household incomes in rural Ethiopia will not only improve children’s schooling outcomes today but also translate into improvements in the child’s complete trajectory of future schooling attainment.

Bibliography

1. Adair, L. S. 1999. Filipino children exhibit catch-up growth from age 2 to 12 years.
Journal of Nutrition, 129: 1140-48.
2. Alderman, H., Hoddinott, J., and Kinsey, B. 2006. Long-term consequences of early childhood malnutrition. Oxford Economic Papers, 58(3): 450-474.
3. Alderman, H., Behrman, J. R., and Menon, R. 2001. Child health and school enrollment: A longitudinal analysis. The Journal of Human Resources, 56: 185-205.
4. Arellano, M. and Bover, O. 1995. Another look at the instrumental variable estimation of error-components models. Journal of Econometrics, 68: 29-51.
5. Arellano, M. and Bond, S. 1991. Some tests of specification for panel data: Monte Carlo evidence and an application to employment equations. Review of Economic Studies, 58: 277-297.
6. Barrera, A. 1990. The role of maternal schooling and its interaction with public health programs in child health production function. Journal of Development Economics, 32: 69-91.
7. Becker, G. S., and Tomes, N. 1976. Child endowments and the quality and quantity of children. Journal of Political Economy, 84: 143-62.
8. Behrman, J. R. and Rosenzweig, M. R. 2002. Does increasing women’s schooling raise the schooling of the next generation? American Economic Review, 92: 323-334.
9. Behrman, J. R., Sengupta, P., and Todd, P. 2005. Progressing through PROGRESA: An impact assessment of a school subsidy experiment in rural Mexico. Economic Development and Cultural Change, 54: 237-275.
10. Behrman, J.R. and Wolfe, B. L. 1984. The socioeconomic impact of schooling in a developing economy. Review of Economics and Statistics, 66(2): 296-303.
11. Behrman, J.R., and Knowles, J. C. 1999. Household income and child schooling in Vietnam. The World Bank Economic Review, 13 (2): 211-256.
12. Behrman, J. R., and Rosenzweig, M. R. 2002. Does increasing women’s schooling raise the schooling of the next generation? American Economic Review, 92: 323-334.
13. Behrman, J. R. and Rosenzweig, M. R. 2005. Does increasing women’s schooling raise the schooling of the next generation? Comment.
American Economic Review, 95: 1745-1751.
14. Birdsall, N. 1985. Public inputs and child schooling in Brazil. Journal of Development Economics, 18 (1): 67-86.
15. Blundell, R. and S. Bond. 1998. Initial conditions and moment restrictions in dynamic panel data models. Journal of Econometrics, 87: 115-143.
16. Blundell, R., Bond, S., and Windmeijer, F. 2000. Estimation in dynamic panel data models: Improving on the performance of the standard GMM estimators. Working Paper 00/12, The Institute for Fiscal Studies.
17. Blundell, R. and MaCurdy, T. 1999. Labor supply: A review of alternative approaches. Handbook of Labor Economics, III: 1560-169.
18. Blundell, R. 2005. Dynamic panel data methods. Micro econometric lecture notes.
19. Brown, P. H., and Park, A. 2002. Education and poverty in rural China. Economics of Education Review, 21: 523-41.
20. Boersma, B. and Wit, J. M. 1997. Catch-up growth. Endocrine Reviews, 18 (5): 646-661.
21. Card, D. 2001. Estimating the return to schooling: progress on some persistent econometric problems. Econometrica, 69(5): 1127-1160.
22. Cameron, N., Preece, M. A., and Cole T. J. 2005. Catch-up growth or regression to the mean? Recovery from stunting revisited. American Journal of Human Biology, 17: 412-417.
23. Cebu study team. 1992. A child health production function estimated from longitudinal data. Journal of Development Economics, 38: 323-51.
24. Chaudhury N., Christiansen, L., and Asadullah, M. 2006. School, household risks and gender: Determinants of child schooling in Ethiopia, working paper, World Bank.
25. Coly, A. N., Miler, J., Diallo, A., Ndiaye, T., Benefice, E., Simondon, F., Wade, S., and Simondon, K.B. 2006. Preschool stunting, adolescent migration, catch-up growth, and adult height in young Senegalese men and women of rural origin. Journal of Nutrition, pp 2412-2420.
26. Deaton, A. 1997. The analysis of household surveys: A microeconometric approach in development policy.
Baltimore: Johns Hopkins University Press.
27. Deaton, A. and Muellbauer, J. 1980. Economics and consumer behavior. Cambridge University Press.
28. Deolalikar, A. 1993. Gender differences in the returns to schooling and in schooling enrollments in Indonesia. Journal of Human Resources, 28(4): 899-932.
29. Deolalikar, A. 1996. Child nutritional status and child growth in Kenya: Socioeconomic determinants. Journal of International Development, 8: 375-94.
30. Dercon, S., and Hoddinott, J. 2003. Health, shocks and poverty persistence. United Nations University-World Institute for Development Economics Research.
31. Dercon, S., and Hoddinott, J. 2004. The Ethiopian rural household survey: Introduction. mimeo, Washington D.C: International Food Policy Research Institute.
32. Dercon, S., Gilligan, D. O., Hoddinott, J. and Woldehanna, T. 2006. The impact of roads and agricultural extension on consumption growth and poverty in fifteen Ethiopian villages, mimeo, Washington D.C: International Food Policy Research Institute.
33. Dostie, B., and Jayaraman, R. 2006. Determinants of school enrollment in Indian villages. Economic Development and Cultural Change, 54: 405-421.
34. Fedorov, L. and Sahn, D. 2005. Socioeconomic determinants of children’s health in Russia: A longitudinal survey. Economic Development and Cultural Change, pp 479-500.
35. Filmer, D., and Pritchett, L. 1999. The effect of household wealth on educational attainment: Evidence from 35 countries. Population and Development Review, 25(1): 85-120.
36. Fitzgerald, J., Gottschalk, P., and Moffitt, R. 1998. An analysis of sample attrition in panel data. Journal of Human Resources, 33(2): 251-299.
37. Foster, A. 1995. Prices, credit markets, and child growth in low-income rural areas. Economic Journal, pp 551-70.
38. Foster, A., and Rosenzweig, M. 1996. Technical change and human capital returns and investments: Evidence from the Green Revolution. American Economic Review, 86(4): 931-53.
39. Frankenberg, E.
and Thomas, D. 2000. The Indonesia family life survey (IFLS): Study design and results from Waves 1 and 2. DRU-2238/1-NIA/NICHD.
40. Frankenberg, E. and Karoly, L. 1995. The 1993 Indonesian family life survey: Overview and field report. RAND. DRU-1195/1-NICHD/AID.
41. Frankenberg, E., Thomas, D., and Suriastini, W. 2005. Can expanding access to basic health care improve children’s health status? Lessons from Indonesia’s midwife in the village program. Population Studies, vol 59.
42. Ghuman, S., Behrman, J., Borja, J., Gultiano, S., and King, E. 2005. Family background, service providers, and early childhood development in Philippines: proxies and interactions. Economic Development and Cultural Change, pp 129-164.
43. Glewwe, P., and Miguel, E. 2008. The impact of child health and nutrition on education in less developed countries. Forthcoming in the Handbook of Development Economics, vol. 4, edited by T. Paul Schultz and John Strauss, Amsterdam: North Holland Press.
44. Glewwe, P., and Jacoby, H. 1994. Student achievement and schooling choice in low-income countries: Evidence from Ghana. Journal of Human Resources, 29(3): 843-864.
45. Glewwe, P., and Jacoby, H. 1995. An economic analysis of delayed primary school enrollment in a low income country: The role of early childhood nutrition. The Review of Economics and Statistics, 77: 156-69.
46. Glewwe, P. 2002. Schools and Skills in developing countries: Education policies and socioeconomic outcomes. Journal of Economic Literature, XI: 463-82.
47. Glick, P. and Sahn, D. E. 1998. Maternal labor supply and child nutrition in West Africa. Oxford Bulletin of Economics and Statistics, 60 (August): 325-55.
48. Griliches, Z. and Hausman, J. A. 1986. Errors-in-Variables in panel data. Journal of Econometrics, 31: 93-118.
49. Grossman, M. 1972. On the concept of human capital and demand for health. Journal of Political Economy, 80(2): 223-255.
50.
Haddad, L., Alderman, H., Appleton, S., Song, L., and Yohannes, Y. 2003. Reducing child malnutrition: How far does income growth take us? World Bank Economic Review, 17(June): 107-31.
51. Habicht, J. P., Martorell, R., and Rivera, J. 1995. Nutritional impact of supplementation in the INCAP longitudinal study: Analytic strategies and inferences. Journal of Nutrition, 125(4S): 1042S-1050S.
52. Handa, S., and Peterman, A. 2007. Child health and school enrollment: A replication. Journal of Human Resources, forthcoming.
53. Hansen, L.P., J. Heaton, and A. Yaron. 1996. Finite sample properties of some alternative GMM estimators. Journal of Business and Economic Statistics, 14(3): 262-280.
54. Hausman, J. A. 1978. Specification tests in econometrics. Econometrica, 46: 1251-1271.
55. Heller, P. and Drake, W. 1979. Malnutrition, child morbidity, and the family decision process. Journal of Development Economics, 6: 203-235.
56. Horton, S. 1986. Child nutrition and family size in the Philippines. Journal of Development Economics, 23: 161-176.
57. Hoddinott, J. and Kinsey, B. 2001. Child growth in the time of drought. Oxford Bulletin of Economics and Statistics, 63(4): 409-436.
58. Hoddinott, J. and Kinsey, B. 1999. Adult health at the time of drought. Washington D.C: International Food Policy Research Institute, mimeo.
59. Jacoby, H. and Skoufias, E. 1997. Risk, financial markets, and human capital in a developing country. Review of Economic Studies, 64: 311-335.
60. Holmes, J. 1999. Measuring the determinants of school completion in Pakistan: Analysis of censoring and selection bias, working papers, Economic Growth Center, Yale University.
61. Johnston, F. E. and Macvean, R. B. 1995. Growth-faltering and catch-up growth in relation to environmental change in children of disadvantaged community from Guatemala city. American Journal of Human Biology, 7: 731-740.
62. Joshi, S., and Schultz, T. P. 2007.
Family planning as an investment in develop- ment: Evaluation of a program’s consequences in Matlab: Bangladesh, working paper, Economic Growth Center, Yale University. 151 63. Kevane, M. and Levine, D. 2001. The changing status of daughters in Indonesia, mimeo. 64. Lavy, V . 1996. School supply constraints and children’s educational outcomes in rural Ghana. Journal of Development Economics, 51: 291-314. 65. Levine, D. and Anes, M. 2003. Gender bias and the Indonesian financial crisis: Were girls hit hardest? Center for International and Development Economics Research. 66. Lillard, L. and Willis, R. J. 1994. Intergenerational educational mobility: Effects of family and state in Malaysia. The Journal of Human Resources, 29: 1126-66. 67. Maccini, S. and Yang, D. 2005. Returns to health: Evidence from exogenous height variation in Indonesia, mimeo 68. MaCurgy, T. 1981. An empirical model of labor supply in a life-cycle setting. Journal of Political Economy, 89(6): 1059-1085. 69. Maddala, G. S. 1983. Limited-dependent and qualitative variables in economet- rics. Cambridge: Cambridge University Press. 70. Mani, S. 2007. Role of the household and the community in determining child health. WIDER Research paper No. 2007/X, Helsinki: UNU-WIDER. 71. Mankiw, G. N., Romer, D., and Weil, D. N. 1992. A Contribution to the empirics of economic growth. The Quarterly Journal of Economics, 107(2): 407-437. 72. Martorell, R. 1995. Promoting healthy growth: Rationale and benefits. In P. Pinstrup-Andersen, D. Pelletier, and H. Alderman (Eds.), Child Growth and Nutri- tion in Developing Countries: Priorities for Action, pp. 15-31, Ithaca: Cornell University Press. 73. Martorell, R. 1999. The nature of child malnutrition and its long-term implica- tions. Food and Nutrition Bulletin, 20: 288-92. 74. Martorell, R. and Habicht, J. 1986. Growth in early childhood in developing countries. In F. Falkner and J.M. 
Tanner, eds., Human Growth, Methodology, Eco- logical, Genetic, and Nutritional Effects on Growth, second edition, New York: Plenum, 241-262. 75. Martorell, R., Habicht, J. P., and Rivera, J. 1995. History and design of the INCAP Longitudinal Study (1969-77) and its follow-up (1988-89). Journal of Nutrition, 125(4S): 1027S-1041S. 152 76. Mincer, J. 1974. Schooling, experience, and earnings, New York: National Bureau of Economic Research. 77. Moretti, E. 2004. Estimating the social return to higher education: Evidence from longitudinal and repeated cross-sectional data. Journal of Econometrics 121: 175- 212. 78. Murray, M. P. 2006. Avoiding invalid instruments and coping with weak instru- ments. Journal of Economic Perspectives, 20(4): 111-132. 79. Onis, de M., Frongillo, E., and Blossner, M. 2000. Is malnutrition declining? An analysis of changes in levels of child malnutrition since 1980. Bulletin of the World Health Organization, 78 (10). 80. Orazem, P.F., King, E.M. 2008. Schooling in developing countries: The roles of supply, demand and government policy, forthcoming, T. P. Schultz and J. Strauss, eds. Handbook of Development Economics, V olume 4. Amsterdam: North Hol- land 81. Rosenzweig, M.R. and Wolpin, K.J. 1986. Evaluating the effects of optimally distributed public programs. American Economic Review, 76(3): 470-87. 82. Parish, W. L., and Willis, R. J. 1993. Daughters, education, and family budgets: Taiwan experiences. Journal of Human Resources, 28: 863-98. 83. Psacharopoulos G., and Patrinos, H. A. 2004. Returns to investment in education: A further update. Education Economics, 12(2): 111-134. 84. Sahn, D. 1994. The contribution of income to improved nutrition in Cote de Ivoire. Journal of African Economics, 3: 29-61. 85. Satyanarayana, K., Radhaiah, G., Murali, M. R., Thimmayama, B.V .S., Pralhad, R. N., and Narasinga, R.B.S. 1989. 
The adolescent growth spurt of height among rural Indian boys in relation to childhood nutritional background: An 18 year longitudinal study. Annals Human Biology, 16: 289-300. 86. Schaffner J., 2004. Determinants of schooling enrollment among primary school aged children in Ethiopia, mimeo. 87. Schultz T. W. 1960. Capital formation by education. Journal of Political Econ- omy, 68(12): 571-583. 88. Schultz, T. P. 2003. Evidence of returns to schooling in Africa from household surveys: Monitoring and restructuring the market for education, Yale University, mimeo. 153 89. Shrimpton, S., Victora, C.G., Onis, M. de, Lima, R.C., Blossner, M., and Clugston, G. 2001. Worldwide timing of growth faltering: Implications nutri- tional interventions. Pediatrics, 107 (5): e75. 90. Spurr, G. B. 1988. Body size, physical work capacity, and productivity in hard work: Is bigger better? In Linear Growth Retardation in Less Developed Coun- tries (Waterlow, J. C., ed.), Nestle Nutrition Workshop Series V olume 14. Raven Press, New York, NY . 91. Staiger, D. and Stock, J. H. 1997. Instrumental variables regression with weak instruments. Econometrica, 65(May): 557-86. 92. Stock, J. and Yogo, M. 2004. Testing for weak instruments in linear IV regression, mimeo, Harvard University. 93. Stock, J., Wright, J., and Yogo, M. 2002. A survey of weak instruments and weak identification in generalized method of moments. Journal of Business and Economic Statistics, 20(4): 518-529. 94. Stock, J.H. and Yogo, M. 2005. Testing for weak instruments in linear IV regres- sion. In D.W.K. Andrews and J.H. Stock, eds. Identification and Inference for Econometric Models: Essays in Honor of Thomas Rothenberg. Cambridge: Cam- bridge University Press, pp. 80-108. 95. Strauss, J., Lavy, V ., and Thomas, D. 1996. Public policy and anthropometric outcomes in the Cote d’Ivoire. Journal of Public Economics, 61: 155-192. 96. Strauss, J. 1986. 
Does better nutrition raise farm productivity, Journal of Political Economy, 94(2): 297-320. 97. Strauss, J. and Thomas, D. 1995. Human resources: empirical modeling of house- hold and family decisions. Handbook of Development Economics, vol. 3, edited by Jere R. Behrman and T.N. Srinivasan. 98. Strauss, J. and Thomas, D. 1998. Health, Nutrition and Economic Development, Journal of Economic Literature, 36(2): 766-817. 99. Strauss, J., Beegle, K., Sikoki, B., Dwiyanto, A., Herawati, Y . and Witoelar, F. 2004. The third wave of the Indonesia Family Life Survey (IFLS3): Overview and field report, WR-144/1-NIA/NICHD. 100. Strauss, J., Beegle, K., Dwiyanto, A., Herawati, Y ., Pattinasarany, D., Satriawan, E., Sikoki, B., Sukamdi and Witoelar, F. 2004. Indonesian living standards - before and after the financial crisis, Rand Corporation. 154 101. Strauss, J. and Thomas, D. 2008. Health over the life course, forthcoming in the Handbook of Development Economics, vol. 4, edited by T. Paul Schultz and John Strauss eds., Amsterdam: North Holland Press. 102. Tansel, A. 1998. Determinants of school attainment of boys and girls in Turkey, working paper, Economic Growth Center, Yale University. 103. Tanner, J. M. 1981. A history of the study of human growth, Cambridge university press. 104. Thomas, D., Strauss, J., and Henriques, M. 1990. Child survival, height for age, and household characteristics in Brazil. Journal of Development Economics, 33: 197-234. 105. Thomas, D., Strauss, J., and Henriques, M. 1991. How does mother’s education affect child height. Journal of Human Resources, 26(2): 183-211. 106. Thomas, D. and Strauss, J. 1992. Prices, infrastructure, household characteristics, and child height. Journal of Development Economics, 32: 301-331. 107. Thomas, D. and Strauss, J. 1997. Health and wages: Evidence on men and women in urban Brazil. Journal of Econometrics, 77(1): 159-185. 108. Thomas, D., Frankenberg, E., and Smith, J. 2001. 
Lost but not Forgotten: Attrition and follow up in the Indonesian family life survey. Journal of Human Resources, 36 (3): 556-592 109. Thomas, D. and Frankenberg, E. 2002. The measurement and interpretation of health in social surveys, mimeo. 110. Thomas, D., Frankenberg, E., Beegle, K. and Teruel, G. Household budgets, household composition, and the crisis in Indonesia: Evidence at a longitudinal survey data, mimeo. 111. Waterlow, J. 1988. Linear growth retardation in less developed countries. Nestle Nutrition workshop series vol. 14, chapters 1 and 2. 112. WHO. 2000. Global Data base on Child Growth and Malnutrition. Geneva: WHO. 113. White, H. 1980. A Heteroskedasticity-Consistent Covariance Matrix and a direct test for Heteroskedasticity. Econometrica 48: 817-38. 114. Wolfe, B. and Behrman, J.. 1982. Determinants of child mortality, health, and nutrition in a developing Country. Journal of Development Economics, 11: 163- 193. 155 115. Wooldridge, J. 2002. Econometric Analysis of Cross Section and Panel Data, Cambridge: MIT Press. 116. Wooldridge, J. M. 2003. Cluster-sample methods in Applied Econometrics. American Economic Review, 93: 133-138. 117. Zayats, Y . 2005. Schooling, wages, and the role of unobserved ability in the Philippines, mimeo, North Carolina: University of North Carolina, Chapel Hill. 
Appendix A
Appendix to Chapter 2

Table A.1: First-stage regression results for the preferred estimates reported in column 5 of table 2.4

Excluded and included instruments           Coefficient estimates on the
from the first-stage regressions            first-stage regression variables
                                            reported in column 5, table 2.4

Excluded instruments
Total assets                                0.06*** (0.004)

Included instruments
Male dummy                                  0.05 (0.08)
Spline in age in months (< 24 months)       0.007** (0.002)
Spline in age in months (>= 24 months)      -0.001*** (0.0004)
Spline in age in months (<24)*male dummy    -0.003 (0.004)
Spline in age in months (>=24)*male dummy   0.0005 (0.0004)
Mother’s height                             0.002 (0.001)
Father’s height                             0.002 (0.001)
Mother’s schooling                          0.02*** (0.003)
Father’s schooling                          0.01*** (0.003)
Price of rice                               -0.22*** (0.07)
Price of cooking oil                        0.14*** (0.02)
Price of condensed milk                     0.003 (0.007)
Rural dummy                                 -0.32*** (0.08)
Rural dummy*price of rice                   0.15* (0.08)
Number of health posts                      -0.0003 (0.002)
Distance to health center                   -0.007 (0.002)
Electricity                                 -0.0002 (0.0005)
Dummy for paved road                        0.004 (0.02)
Male wage rate                              0.06** (0.02)
Female wage rate                            0.03** (0.01)

Observations                                5457
Location fixed-effects                      Yes
F statistic on the excluded                 174.14
instruments from the first-stage
regressions

- Source: IFLS - 1993, 1997, and 2000
- Standard errors in parentheses
- *** significant at 1%, ** significant at 5%, * significant at 10%

Appendix B
Appendix to Chapter 3

Table B.1: First-stage results for estimates reported in columns 5 and 6 of table 3.3

Excluded and included               Coefficient estimates on the
instruments from the                first-stage regression variables
first-stage regressions             Column 5,     Column 6,
                                    table 3.3     table 3.3

Excluded instruments
Two-period lagged electricity       0.003         0.004
                                    (0.005)       (0.05)
Two-period lagged no. of            0.12*         0.12*
health posts                        (0.06)        (0.06)
Two-period lagged no. of health     -0.003**      -0.003**
posts*two-period lagged             (0.001)       (0.001)
age in months
Two-period lagged no. of health     0.007*        0.007*
posts*mother’s schooling            (0.003)       (0.003)

Included instruments
First-difference in lag age         0.04          0.04
in months                           (0.06)        (0.06)
First-difference in lag age         -0.05         -0.05
in months*male dummy                (0.05)        (0.05)
First-difference                    -0.60***      -0.59***
in duration                         (0.14)        (0.14)
First-difference in                 -0.04         -0.05
duration*male dummy                 (0.14)        (0.15)
First-difference in duration        0.01***       0.01***
*lag age in months                  (0.001)       (0.001)
First-difference in duration*       0.001         0.001
lag age in months*male dummy        (0.001)       (0.001)
First-difference in lagged          -0.44**
log(PCE)                            (0.17)
First-difference in lagged                        -0.03
total assets                                      (0.04)
First-difference in price           -0.70         -0.85
of rice                             (0.86)        (0.85)
First-difference in price           0.34          0.39
of cooking oil                      (0.28)        (0.28)
First-difference in price of        -0.04         -0.05
condensed milk                      (0.09)        (0.08)
First-difference in rural           -1.81         -1.75
dummy                               (1.06)        (1.04)
First-difference in                 0.52          0.45
rural dummy*price of rice           (0.70)        (0.71)
First-difference in no.             -0.01         -0.01
health posts                        (0.03)        (0.03)
First-difference in male            -0.29         -0.29
wage rate                           (0.34)        (0.33)
First-difference in female          -0.09         -0.08
wage rate                           (0.21)        (0.21)
First-difference in distance        0.05**        0.05**
to health center                    (0.02)        (0.02)
First-difference in                 -0.009        -0.009
electricity                         (0.006)       (0.006)
First-difference in dummy           0.07          0.08
for paved road                      (0.39)        (0.38)

F statistic on the excluded         3.06          3.14
instruments from the
first-stage regressions
Hansen J statistic                  2.31          2.12
                                    (0.51)        (0.54)

- Source: IFLS - 1993, 1997, and 2000
- Two-period lagged corresponds to information from the year 1993

Table B.2: Determinants of sample attrition

Covariates                  OLS attrition

Height-for-age z-score      0.002 (0.004)
Male dummy                  -0.0181 (0.013)
Age in months               -0.0006 (0.0004)
Mother’s schooling          0.0027 (0.002)
Father’s schooling          -0.0020 (0.002)
Mother’s height             0.0006 (0.001)
Father’s height             -0.002 (0.001)
log(PCE)                    0.0002 (0.01)
Mother’s age                -0.0007 (0.001)
Father’s age                -0.0008 (0.001)
Rural dummy                 0.1428** (0.06)

Location fixed-effects      Yes
Observations                2203

- Source: IFLS - 1993; robust standard errors reported in parentheses
- *** significant at 1%, ** significant at 5%, * significant at 10%
- Attrition = 1 if the individual can be followed through the 1993, 1997, and 2000 waves of the IFLS and zero otherwise

Appendix C
Appendix to Chapter 4

Table C.1: Determinants of sample attrition

Covariates                      (1) OLS       (2) OLS       (3) OLS
                                attrition     attrition     attrition

Completed grades of schooling   -0.0128*
                                (0.007)
Relative grades attained                      -0.0713***
                                              (0.02)
Enrollment status                                           0.0343
                                                            (0.03)
Mother’s schooling              0.0577        0.0579        0.0497
                                (0.04)        (0.04)        (0.04)
Father’s schooling              -0.0284       -0.0263       -0.0380
                                (0.02)        (0.02)        (0.02)
Log of real pce                 -0.0322**     -0.0321**     -0.0349**
                                (0.01)        (0.01)        (0.01)
Male dummy                      0.0201        0.0237        0.0191
                                (0.05)        (0.05)        (0.05)
dummy=1 if the child            -0.1085*      -0.1080*      -0.1116**
is 8 years                      (0.05)        (0.05)        (0.05)
dummy=1 if the child            -0.1548***    -0.1565***    -0.1600***
is 9 years                      (0.05)        (0.05)        (0.05)
dummy=1 if the child            -0.1246**     -0.1276**     -0.1299**
is 10 years                     (0.05)        (0.05)        (0.05)
dummy=1 if the child            -0.2469***    -0.2515***    -0.2561***
is 11 years                     (0.06)        (0.05)        (0.06)
dummy=1 if the child            -0.2770***    -0.2824***    -0.2902***
is 12 years                     (0.05)        (0.05)        (0.05)
dummy=1 if the child            -0.3551***    -0.3623***    -0.3733***
is 13 years                     (0.05)        (0.05)        (0.05)
dummy=1 if the child            -0.3586***    -0.3670***    -0.3731***
is 14 years                     (0.05)        (0.05)        (0.05)
Male dummy*dummy=1 if           0.1018        0.0975        0.1036
the child is 8 years            (0.08)        (0.08)        (0.08)
Male dummy*dummy=1 if           0.1308        0.1292        0.1293
the child is 9 years            (0.08)        (0.08)        (0.08)
Male dummy*dummy=1 if           0.0404        0.0376        0.0376
the child is 10 years           (0.07)        (0.07)        (0.07)
Male dummy*dummy=1 if           0.1428*       0.1404*       0.1360
the child is 11 years           (0.08)        (0.08)        (0.08)
Male dummy*dummy=1 if           0.1850**      0.1818**      0.1849**
the child is 12 years           (0.07)        (0.07)        (0.07)
Male dummy*dummy=1 if           0.2466***     0.2422***     0.2456***
the child is 13 years           (0.07)        (0.07)        (0.07)
Male dummy*dummy=1 if           0.0964        0.0902        0.0817
the child is 14 years           (0.07)        (0.07)        (0.07)
Number of adult males           -0.0132       -0.0135       -0.0142
                                (0.009)       (0.009)       (0.009)
Number of adult females         -0.0135       -0.0140       -0.0121
                                (0.01)        (0.01)        (0.01)
Mother’s age                    -0.0006       -0.0005       -0.0012
                                (0.001)       (0.0008)      (0.001)
Age of the head                 -0.0012       -0.0012       -0.0006
of the household                (0.0008)      (0.001)       (0.0008)

Village fixed-effects           Yes           Yes           Yes
Observations                    2047          2047          2047

- Source: ERHS - 1994; robust standard errors reported in parentheses
- *** significant at 1%, ** significant at 5%, * significant at 10%
- Attrition = 1 if the individual can be followed through the 1994, 1999, and 2004 waves of the ERHS and zero otherwise

Table C.2: Preferred IV estimates for boys from 1994, 1999, 2004

Covariates              (1) IV      (2) IV      (3) IV      (4) IV      (5) IV      (6) IV
                        Enroll      RGA         Enroll      RGA         Enroll      RGA
                        1994        1994        1999        1999        2004        2004

Mother’s schooling      0.0909*     -0.0368     0.0883      0.0198      0.0446      0.0823
                        (0.05)      (0.05)      (0.05)      (0.05)      (0.05)      (0.06)
Father’s schooling      0.1331***   0.1243***   0.1208***   0.1292***   0.0904**    0.0821*
                        (0.03)      (0.03)      (0.04)      (0.04)      (0.04)      (0.04)
Log (real pce)          0.0106      0.1051      -0.0830     -0.0352     0.2111*     0.0844
                        (0.09)      (0.08)      (0.18)      (0.14)      (0.11)      (0.10)
dummy=1 if the          0.0121      -0.0474     0.2381***   -0.0466     0.1916***   0.0743
child is 8 years        (0.02)      (0.07)      (0.05)      (0.11)      (0.05)      (0.07)
dummy=1 if the          0.0662*     0.0221      0.2574***   -0.0871     0.2829***   0.0062
child is 9 years        (0.03)      (0.07)      (0.05)      (0.11)      (0.05)      (0.06)
dummy=1 if the          0.1064***   -0.0089     0.4878***   0.0066      0.3503***   0.0788
child is 10 years       (0.03)      (0.07)      (0.05)      (0.10)      (0.05)      (0.06)
dummy=1 if the          0.1758***   0.0394      0.4895***   -0.0111     0.4618***   0.0960
child is 11 years       (0.03)      (0.07)      (0.05)      (0.10)      (0.06)      (0.06)
dummy=1 if the          0.1504***   0.0055      0.4808***   0.0224      0.4820***   0.0743
child is 12 years       (0.03)      (0.07)      (0.05)      (0.11)      (0.05)      (0.06)
dummy=1 if the          0.1897***   0.0134      0.4259***   -0.0298     0.5015***   0.1390**
child is 13 years       (0.04)      (0.07)      (0.06)      (0.10)      (0.05)      (0.06)
dummy=1 if the          0.2567***   0.0857      0.4930***   -0.0368     0.6088***   0.1619***
child is 14 years       (0.04)      (0.07)      (0.05)      (0.10)      (0.05)      (0.06)
Number of adult         0.0044      -0.0181     -0.0035     0.0211      -0.0001     0.0075
males                   (0.01)      (0.01)      (0.02)      (0.02)      (0.02)      (0.01)
Number of adult         -0.0127     -0.0090     0.0084      -0.0289     0.0379*     -0.0023
females                 (0.01)      (0.01)      (0.02)      (0.01)      (0.01)      (0.01)
Mother’s age            0.0019      -0.0000     -0.0052***  -0.0021     -0.0038*    -0.0024
                        (0.001)     (0.001)     (0.001)     (0.001)     (0.002)     (0.003)
Age of the head         -0.0011     0.0004      -0.0014     0.0023      -0.0012     0.0021
of the household        (0.0008)    (0.0006)    (0.001)     (0.002)     (0.001)     (0.001)

Observations            1033        1033        942         942         843         843
Village fixed-effects   Yes         Yes         Yes         Yes         Yes         Yes

- Robust standard errors in parentheses
- *** significant at 1%; ** significant at 5%; * significant at 10%
- In columns 2-6, PCE is instrumented with land, livestock units, and rainfall*land
- omitted age dummy - 7 years
- Also included in the regressions are dummy variables capturing missing observations for each particular variable (parental schooling, land, household composition variables) where the missing observation was imputed by the sample mean

Table C.3: Preferred IV estimates for girls from 1994, 1999, 2004

Covariates              (1) IV      (2) IV      (3) IV      (4) IV      (5) IV      (6) IV
                        Enroll      RGA         Enroll      RGA         Enroll      RGA
                        1994        1994        1999        1999        2004        2004

Mother’s schooling      0.1242**    0.1477*     -0.0091     -0.0449     0.0784      0.0733*
                        (0.06)      (0.07)      (0.06)      (0.06)      (0.05)      (0.04)
Father’s schooling      0.0820***   0.0999***   0.0143      0.0346      0.0443      -0.0034
                        (0.02)      (0.02)      (0.05)      (0.04)      (0.04)      (0.03)
Log (real pce)          -0.0709     0.1260*     0.6134***   0.4948**    0.1599      0.1183
                        (0.06)      (0.06)      (0.20)      (0.20)      (0.12)      (0.10)
dummy=1 if the          0.0303      0.0363      0.0691      -0.2054**   0.1234**    0.0450
child is 8 years        (0.02)      (0.04)      (0.07)      (0.08)      (0.05)      (0.07)
dummy=1 if the          0.0738***   0.0108      0.1554**    -0.1265     0.2564***   0.0068
child is 9 years        (0.02)      (0.04)      (0.06)      (0.08)      (0.05)      (0.06)
dummy=1 if the          0.0563**    0.0054      0.1322*     -0.2411***  0.4184***   0.1083*
child is 10 years       (0.02)      (0.03)      (0.07)      (0.08)      (0.05)      (0.06)
dummy=1 if the          0.0897**    0.0313      0.1625*     -0.2246**   0.4252***   0.1089
child is 11 years       (0.03)      (0.04)      (0.08)      (0.08)      (0.06)      (0.07)
dummy=1 if the          0.1362***   0.0250      0.2978***   -0.1686**   0.5184***   0.1553**
child is 12 years       (0.03)      (0.04)      (0.06)      (0.08)      (0.05)      (0.06)
dummy=1 if the          0.1894***   0.0550      0.2761***   -0.2043**   0.4522***   0.0792
child is 13 years       (0.03)      (0.04)      (0.07)      (0.08)      (0.06)      (0.06)
dummy=1 if the          0.0787**    0.0294      0.2492***   -0.2244**   0.4240***   0.1270**
child is 14 years       (0.03)      (0.04)      (0.07)      (0.08)      (0.05)      (0.06)
Number of adult         0.0217*     0.0115      0.0280      0.0413**    -0.0008     -0.0177
males                   (0.01)      (0.01)      (0.02)      (0.02)      (0.01)      (0.01)
Number of adult         -0.0223**   -0.0093     0.0551**    0.0173      0.0535*     0.0423*
females                 (0.01)      (0.01)      (0.02)      (0.02)      (0.02)      (0.02)
Mother’s age            0.0003      0.0014      -0.0032     0.0004      -0.0001     0.0004
                        (0.001)     (0.001)     (0.002)     (0.001)     (0.002)     (0.001)
Age of the head         0.0006      0.0001      -0.0001     0.00001     -0.0010     -0.0009
of the household        (0.0008)    (0.0007)    (0.001)     (0.001)     (0.001)     (0.001)

Observations            1014        1014        935         935         786         786
Village fixed-effects   Yes         Yes         Yes         Yes         Yes         Yes

- Robust standard errors in parentheses
- *** significant at 1%; ** significant at 5%; * significant at 10%
- p-values are reported for the F statistic on the excluded instruments and the Hansen J statistic
- In columns 2-6, PCE is instrumented with land, livestock units, and rainfall*land
- omitted age dummy - 7 years
- Also included in the regressions are dummy variables capturing missing observations for each particular variable (parental schooling, land, household composition variables) where the missing observation was imputed by the sample mean
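The notes to Tables C.2 and C.3 describe how missing covariates are handled: a missing observation is replaced by the sample mean, and a dummy variable flags the imputed observation. The sketch below illustrates that idea only; it is not the code used to produce the estimates, and the function name and the schooling values are hypothetical.

```python
from statistics import mean


def impute_with_missing_dummy(values):
    """Fill missing entries (None) with the mean of the observed entries.

    Returns the imputed series and a 0/1 dummy marking which entries
    were originally missing, so both can enter a regression together.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to compute a sample mean from")
    sample_mean = mean(observed)
    imputed = [sample_mean if v is None else v for v in values]
    missing_dummy = [1 if v is None else 0 for v in values]
    return imputed, missing_dummy


# Hypothetical years of father's schooling; None marks a missing report.
fathers_schooling = [4, None, 0, 8, None]
imputed, missing = impute_with_missing_dummy(fathers_schooling)
print(imputed)
print(missing)
```

Entering the dummy alongside the imputed variable lets the regression absorb any systematic difference between households with and without a reported value, rather than dropping those observations.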
Asset Metadata
Creator
Mani, Subha
(author)
Core Title
Essays on human capital accumulation -- health and education
School
College of Letters, Arts and Sciences
Degree
Doctor of Philosophy
Degree Program
Economics
Publication Date
06/30/2008
Defense Date
04/04/2008
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
child health, first-difference, Indonesia, lagged dependent variable, OAI-PMH Harvest
Place Name
Ethiopia (countries), Indonesia (countries)
Language
English
Advisor
Strauss, John A. (committee chair), Ham, John C. (committee member), Hoddinott, John (committee member), Melnick, Glenn (committee member), Nugent, Jeffrey B. (committee member)
Creator Email
smani@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-m1303
Unique identifier
UC1101277
Identifier
etd-Mani-20080630 (filename),usctheses-m40 (legacy collection record id),usctheses-c127-83157 (legacy record id),usctheses-m1303 (legacy record id)
Legacy Identifier
etd-Mani-20080630.pdf
Dmrecord
83157
Document Type
Dissertation
Rights
Mani, Subha
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Repository Name
Libraries, University of Southern California
Repository Location
Los Angeles, California
Repository Email
cisadmin@lib.usc.edu