Essays on economics of education
ESSAYS ON ECONOMICS OF EDUCATION

by Bo Min Kim

A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (ECONOMICS)

August 2013

Copyright 2013 Bo Min Kim

To my parents, Daejung Kim and Hyunsuk Chin, and my lovely wife, Woonwha Ju, and my precious angel, Leah Kim

Acknowledgments

I feel fortunate to have had the guidance and encouragement of Geert Ridder. I would definitely not be where I am without his unconditional support, encouragement, and patience. His critical guidance, invaluable advice, and sharp questions have challenged me to become a better researcher. Moreover, his quiet encouragement helped me through a few very difficult months this past winter.

I would also like to thank my dissertation committee members, Tatiana Melguizo, Hyungsik Roger Moon, John Ham, and John Strauss, for their invaluable comments and numerous suggestions. I am particularly grateful to Tatiana Melguizo for her generous provision of the data used in Chapter 2 of this study.

I would also like to thank my parents, Daejung Kim and Hyunsuk Chin, and my sister, Taeyoung Kim. They supported me and encouraged me with their best wishes from the very start.

Finally, I would like to express my gratitude to my wife, Woonwha, and our daughter, Leah. Woonwha's care and support for me and our daughter were indispensable to the completion of this dissertation. Her positive and optimistic attitude lifted me through my bad days, and Leah always brought constant joy to my life.

Table of Contents

Acknowledgments
List of Tables
List of Figures
Abstract
Chapter 1: Estimating Returns to Vocational Education at High Schools in Korea
  1.1 Introduction
  1.2 Vocational High School in Korea
    1.2.1 High-School System in Korea
    1.2.2 The Role of Vocational High Schools
  1.3 Data
  1.4 Preliminary Analysis via OLS
    1.4.1 Less College
    1.4.2 College graduates
  1.5 Controlling Endogeneity
    1.5.1 Sources of Identification
    1.5.2 Estimation Result
  1.6 Conclusion
Chapter 2: Do Developmental Mathematics Courses Develop the Mathematics? Addressing Missing Outcome Problem in Regression Discontinuity Design
  2.1 Introduction
  2.2 Developmental Math Program in Community Colleges
    2.2.1 Developmental Education in Community Colleges
    2.2.2 A Sequence of Mathematics Courses
    2.2.3 Algebra
    2.2.4 Previous Literature and their Limitations
  2.3 Empirical Strategy: Bounding Approach in Regression Discontinuity Design
    2.3.1 Regression Discontinuity Design
    2.3.2 Missing Outcome Problem
    2.3.3 Bounding the Causal Effects
    2.3.4 Computation of Bounds by Local Linear Regression
  2.4 Data Description
    2.4.1 One Community College (OCCSC)
    2.4.2 Sample Criteria
    2.4.3 Measures of Academic Achievement
    2.4.4 Descriptive Statistics
  2.5 Results
    2.5.1 Differences in Enrollment
    2.5.2 Main Results: GPA on the Main Course
    2.5.3 Main Results: Time to Complete the Main Course
    2.5.4 Sensitivity to the Choice of Bandwidths
  2.6 Validity of Regression Discontinuity Design
    2.6.1 Discontinuities in Covariates
    2.6.2 Jumps in the Distribution of the Test Scores
    2.6.3 Discussion of Discontinuities in Multiple Measure Points and Density of Test Scores
  2.7 Conclusion
Bibliography

List of Tables

1.1 Education Level Completed
1.2 Summary Statistics of the Employed between 1998 and 2010
1.3 OLS Results for High School Graduates without College Degrees
1.4 OLS Result for College Graduates
1.5 First-stage Regression Estimation
1.6 IV Estimation Results for High School Graduates without College Degrees
2.1 Descriptive Statistics of the Entering Students who were Assessed between 2005/6 and 2007/2008 in OCCSC
2.2 Estimated Difference in the Enrollment in the Main Course between the Group Assigned to the Prerequisite and the Group Assigned Directly to the Main
2.3 Effects of Prerequisite Course on the Average Grade Points of the Main Course
2.4 Effects of Prerequisite Course on the Time to Complete the Main Course
2.5 Estimated Discontinuity in Covariates
2.6 McCrary Manipulation Test Log Discontinuity Estimates

List of Figures

1.1 Ratio of Vocational High School
1.2 Ratio of Vocational High School: Male
1.3 Ratio of Vocational High School: Female
2.1 Cutoff Policy of OCCSC between 2005/6 and 2007/8
2.2 A Sequence of Developmental Mathematics Courses in OCCSC
2.3 The Proportion of the Assignment to the Prerequisite Courses and the Enrollment in the Prerequisite Assignment
2.4 The Proportion of Enrollment in the Main Course
2.5 Finishing the Main Course and Mean Grade on the Main Course
2.6 Completion of the Main Course and Time to Complete the Main Course
2.7 Bandwidths and the Estimated Prerequisite Effect on the Mean Grade
2.8 Bandwidths and the Estimated Prerequisite Effect on the Time to Complete the Main Course
2.9 Density of the Test Scores

Abstract

This dissertation analyzes the effects of educational programs for students who need special care in secondary and postsecondary school.
These educational programs present serious endogeneity problems to a researcher estimating their causal effects, because most students tried to avoid these programs. I extend the current literature in the context of applied econometrics. First, I search for valid instruments based on exogenous policy changes when either appropriate proxy variables for ability or traditional instrumental variables are unavailable, and I examine the validity of the instrumental variable estimation results. Second, I address the missing outcome problem in regression discontinuity design without invoking additional assumptions, which makes it possible to obtain bounds for the estimates when the missing outcome problem is present in regression discontinuity design.

Chapter 1 evaluates the impact of vocational high school on labor-market success in Korea. As a measure of the efficacy of vocational high school, I use wage data from the Korean Labor and Income Panel Study (KLIPS). Restricting the dataset to high-school graduates only, I compare general high schools and vocational high schools. To address the endogeneity problem in the choice of school type, the preset capacity of each school is used as an instrumental variable for the choice of school. I find that vocational high school gives greater returns to graduates in the sense of local average treatment effects: enrollment in vocational high school gives about 30 percent higher wages than enrollment in general high school. This study also shows that the usual OLS estimates underestimate the effect of vocational high school on wages, while the IV estimates eliminate the downward bias generated by the selection problem. The result of this chapter is contrary to previous studies showing that vocational high school scarcely affects wages.
Chapter 2 investigates the effects of developmental math courses offered at community colleges, addressing the missing outcome problem in quasi-experimental studies. Many students are unprepared for college-level math in spite of many attempts to improve the math skills of high-school students. In community colleges, developmental mathematics courses are designed to help those students make up for the gaps in high-school math. However, there are few studies on the effect of developmental mathematics on mathematics achievement, despite the vast quantity of research on the courses' effects on various other outcomes. Developmental mathematics consists of various courses in a tight sequence where course assignments are determined by a rigid placement rule based on students' test scores, and in which students must master the assigned course before taking the next level of math. A course's effectiveness can be measured by the letter grade or other test scores in its subsequent course. However, such an effect is difficult to investigate because of missing outcome problems: achievement in the subsequent course is only observed for those who enrolled in and finished it. Enrollment may be affected by assignment to a prerequisite course, since those assigned to the prerequisite are less likely to enroll in the subsequent course than those assigned directly to the subsequent course. In regression discontinuity design (RDD), usual methods such as the control function approach cannot address these missing outcome problems, because the outcome's propensity to be observable is also discontinuous. Applying a bounding approach in RDD, this study partially identifies the causal effects of developmental mathematics and computes their bounds. Using data from a community college in California, I find that assignment to developmental courses would increase achievement and learning efficiency in the subsequent math courses.
Chapter 1
Estimating Returns to Vocational Education at High Schools in Korea

1.1 Introduction

During the last half century, Korea has experienced a large expansion in secondary as well as primary education. Unlike the U.S., Korea has two different institutions for secondary education: general high school and vocational high school. Vocational high schools have trained a large number of students, and their graduates have played important roles in the manufacturing and commercial sectors. Meanwhile, their counterpart in the secondary education system, general high schools, focused on developing the academic ability required in higher education. Until the mid-1990s, demand for vocational high school education remained large. Since the 1990s, however, the number of applicants for vocational high school has decreased radically, and in 1999 these schools could not fill their assigned capacities. In 1995, 42.2 percent of high-school students went to vocational high school, but only 30.7 percent were vocational high school students in 2003.

To analyze these phenomena, a few economists in Korea recently tried to estimate the returns on investment in vocational high school (Chae, 2004; Nam, 2005). Both studies, which use the dataset from the Korean Youth Panel Surveys that started in 2003, show that there is no significant evidence of returns to vocational high school compared to general high school, particularly in the case of workers who graduated from high school and did not go to college. Moreover, the sign of the coefficients in relation to returns to vocational education is negative, even if the effects are found to be insignificant.
Nam (2005) argues that, given the greater expenditure on vocational high schools than on general high schools needed to maintain expensive machinery and other facilities, vocational education at the high-school level is so inefficient that a reform of the secondary education system in Korea should be considered. But these two studies did not control for the endogeneity problem arising from self-selection, nor for ability bias, which has been the most frequently observed and considered problem in returns-to-education studies. The primary reason for these studies' failure to deal with ability bias, in particular, lies in the lack of data.

According to Griliches (1977), Willis and Rosen (1979), and Card (1995a, 1999, 2001), a primary empirical challenge arises when estimating wage differentials created by different education levels: controlling for selection bias or ability bias. Estimating the earnings effect of graduating from vocational high school is more complicated than simply comparing the earnings of graduates of general high school and vocational high school, because of the endogeneity problems caused by this selection bias or ability bias. One selection effect that might lead to an understatement of the benefit of vocational high school, or an overstatement of the benefit of general high school, is that many vocational high-school students could not or would not attend a general high school because they had a low GPA, did not get a high score on the middle-school graduation examination, or found it profitable to go to vocational school and participate directly in the labor market without a college degree. For simplicity's sake, one can say that more able students went to general high school while less able students went to vocational high school, a conclusion reported in the vocational education literature in Korea (Kang, 1999).
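The direction of this selection effect can be illustrated with a small simulation. The sketch below is not based on KLIPS or any other real data: it assumes a hypothetical ability variable that raises wages and lowers the probability of attending a vocational high school, and every parameter value (a true return of 0.10, an ability coefficient of 0.3, the noise scales) is an illustrative assumption.

```python
import numpy as np


def simulate_selection(n=100_000, true_return=0.10, ability_effect=0.3, seed=0):
    """Simulate wages when less able students select into vocational school.

    All parameter values are illustrative assumptions, not estimates.
    Returns (naive_gap, controlled_estimate).
    """
    rng = np.random.default_rng(seed)
    ability = rng.normal(size=n)  # unobserved in practice
    # Lower ability makes vocational high school (vh = 1) more likely.
    vh = (ability + rng.normal(size=n) < 0).astype(float)
    log_wage = (1.5 + true_return * vh + ability_effect * ability
                + rng.normal(scale=0.5, size=n))

    # Naive comparison of mean log wages, ignoring ability.
    naive_gap = log_wage[vh == 1].mean() - log_wage[vh == 0].mean()

    # Infeasible benchmark: a regression that controls for ability.
    X = np.column_stack([np.ones(n), vh, ability])
    coef, *_ = np.linalg.lstsq(X, log_wage, rcond=None)
    return naive_gap, coef[1]


naive_gap, controlled = simulate_selection()
print(f"naive gap: {naive_gap:.3f}, controlling for ability: {controlled:.3f}")
```

In this setup the naive gap understates the assumed return because the vocational group has lower average ability. The regression that includes ability recovers the assumed return only because the simulation observes ability, which is not the case in the data used in this chapter.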
Unfortunately, direct information on ability (such as exam scores and IQ tests) is strictly limited in Korea, and is not included in the dataset used for this chapter. This limitation of the dataset makes estimation of returns to vocational high school more difficult. As argued by Griliches and Mason (1972) and Griliches (1977), an explicit inclusion of an ability measure (for example, IQ tests) can eliminate the bias caused by omitted-variable problems. Even if such data were available, however, the measurement-error problem would severely constrain the usefulness of ability measures. An ability measure that is collected or reported in transcripts is not a perfect proxy that can capture every unobservable ability. Blackburn and Neumark (1993) suggest an upward bias in the least squares estimates if one includes an explicit proxy for ability. Rather, that proxy has been used for controlling some ability bias in the tradition of the self-selection model that has been used since Willis and Rosen (1979).

In this situation, a convincing analysis of the causal link between education and earnings should seek an exogenous source of variation in education choices. Natural variations in data caused by exogenous influences on the schooling decision have been used in many papers in the last two decades (Angrist and Krueger, 1991; Kane and Rouse, 1993; Butcher and Case, 1994; Card, 1995b). Angrist and Krueger (1991) employed a natural-experiment instrument strategy, assuming that quarter of birth is uncorrelated with wages except through its effect on education via school-start age policy and compulsory school-attendance laws. Card (1995b) and Kane and Rouse (1993) used geographic proximity to educational institutions on the grounds that people living near a college are more likely to avail themselves of the facility than those living farther from colleges.
In addition, Butcher and Case (1994) use sibling composition in the household to provide instruments for schooling for women. These studies, and others using similar approaches, suggest that the estimated returns to education in the U.S. are biased downward in an ordinary least squares (OLS) regression.

In this chapter, I adopt the same IV approach used by the above authors. I rely on the exogenous changes in the educational distribution of individuals caused by the variation in the aggregate enrollment rates, which have been set and controlled by the Korean Ministry of Education and Human Resources Development. Using data from the fourth wave of the Korean Labor and Income Panel Study (KLIPS), I find that, when instrumental variables (IV) estimation is used, persons who went to vocational high schools in a school district with more vocational high schools than other districts have significantly higher wages than workers in the same school district whose secondary education was completed in general high schools. Furthermore, IV estimates of returns to vocational high school are larger than OLS estimates by more than 45-50 percent, which is consistent with previous studies using IV to estimate returns to education. Contrary to the widespread skepticism regarding the value of vocational education in secondary-education institutions, the estimated returns to vocational high schools are remarkably high compared to those of general high schools if one restricts the sample to high-school graduates with no college degree.

Section 1.2 provides information about the Korean secondary education system and describes the role of vocational high schools. Section 1.3 outlines the dataset used in the chapter, and the preliminary estimates by OLS regression are given in Section 1.4. Section 1.5 justifies the identification source used to control for the endogeneity problem in this chapter and shows the estimated coefficients obtained through IV estimation.
Section 1.6 concludes.

1.2 Vocational High School in Korea

1.2.1 High-School System in Korea

In Korea, students in grades 10 through 12 attend high schools. Middle-school graduates or those with an equivalent academic background, usually about age 15, are admitted to high schools. Students bear the expenses of their high-school education, which lasts three years. A student may, however, choose which liberal arts classes he or she wishes to take. High schools in South Korea may also have subject specialty tracks.

There are two types of high schools in the Republic of Korea: general and vocational. University-bound students may choose to go to Inmun-Gyae godeung hakgyo, a general high school, while other students may choose Silup-Gyae godeung hakgyo, a vocational high school, which emphasizes agriculture, commerce, or technical trade curricula.

Applicants for vocational high schools (covering agriculture, engineering, business and maritime studies) have a choice of schools and are admitted through examinations administered by each school. The curriculum at vocational high schools is usually 40-60 percent general courses, with the remainder being vocational courses. As of 2002, there were 741 vocational high schools with 535,363 students.

Among general high schools, there are several specialized high schools in the areas of arts, physical education, science, and foreign languages. Courses at general high schools tend to center on preparing students to enter universities. As of 2002, there were 1,254 general high schools with 1.22 million students. When one considers both types of high school together, the proportion of middle-school graduates advancing to high school was 99.5 percent in 2002.

Traditionally, entrance examinations to individual high schools were largely symbolic for most students. Examinations were extremely challenging only for those who
The progress of a middle-school graduate to high school was largely a function of his or her parents’ financial ability. This kind of entrance examination by individual high schools was, however, abolished in 1974. Instead, admission was based on middle-school grade point average (GPA) and records and on the scores on the national qualifying or “selection” examination, established in order to limit the number of high-school students. While successful applicants for general high schools are assigned to a school by a lottery system, appli- cants to vocational high schools compete based on the school’s own examination or the student’s middle-school record. 1.2.2 The Role of Vocational High Schools The role of vocational high schools in the labor market is important, considering the path from school to work in Korea. First, vocational high schools and junior colleges are more active in assisting their graduates to find jobs compared to universities. Since the public employment service plays an almost negligible role in assisting students to find employment in Korea, the training program given by vocational high school is more effective for new workers with only high-school certificates. For example, most of the commercial vocational high schools design their programs to prepare their stu- dents for information-processing jobs in the computerized work environment. The vocational high school implements a better school-to-industry transition path than the general high school. When a student completes the yearlong work-experience pro- gram in the third year of vocational high school, she or he can obtain a high-school diploma. In the sense that work experience before graduation, including fieldwork practice, is an important factor in finding jobs, the education in a vocational high school may produce better results than that in a general high school. 
In addition, a sur- vey developed by the Korea Research Institute for Vocational Education and Training 6 (KRIVET) reported that among the government-funded re-employment training pro- grams, those run by the higher education institutions were found to do the least work to find jobs for the trainees (Kang, 1999). Thus vocational education in secondary school has been thought to play a more important role in teaching new skills for some sectors such as manufacturing. As a cornerstone of upper secondary education, voca- tional education effectively served the rapidly developing economy during the initial stages of industrialization in Korea. In 1990, the Korean government passed a law obligating local authorities to increase enrollment in vocational high schools equal to that of academically oriented institutions in spite of the rising higher-educational aspirations of the people. The gov- ernment strongly emphasized this policy in order to supply a workforce from voca- tional high schools to the manufacturing and construction industries. As a result, vocational high-school enrollment figures increased from 35.5% in 1990 to 42.2% in 1995. However, the government policy has been sharply criticized by educators and some industrial-policy experts because of some structural problems vocational high schools faced. A closed and unchanging teaching force has been, over a decade, the major obstacle to much-needed changes in curriculum. Programs and curricular con- tent in a vocational education have not adapted themselves to changing industrial demands. As a result, the dropout rate in vocational high schools continues to increase. Also, the employment capacity in the sectors which had once demanded a large num- ber of vocational high-school graduates gradually shrank, and vocational education delivered in secondary schools rapidly lost its appeal. 
Thus, vocational high schools became the failing half of upper secondary education, undermining the foundation of the government policy since 1990 to expand vocational high schools in their current form (Nam, 2005).

1.3 Data

The data used in this chapter are from the Korean Labor and Income Panel Study (KLIPS) conducted by the Korea Labor Institute (KLI). It is a longitudinal survey that originally sampled 5,000 households and the 13,321 individuals belonging to those households in 1998. The survey has been conducted annually, and currently eight waves (1998-2005) have been released. It mainly focuses on the income, wealth and expenditure of the household and on the labor status and demographic information of individuals. However, this survey has the disadvantage that the information about education is not detailed enough to reveal the type of high school the respondents attended. This problem is commonly found in all the surveys done in Korea. Even though type of high school was surveyed in the first two years of KLIPS, the coverage is limited: few high-school graduates report whether they attended a vocational or general high school. Only in the fourth wave, in 2001, did the study provide information about type of high school, the region where the high school was located, and year of entry. This is why I have chosen to use data from the 2001 fourth-wave survey. The fourth-wave data consist of 4,248 households and 11,651 individuals. A more serious problem, however, is that, unlike the National Longitudinal Survey of Youth (NLSY) or the National Longitudinal Survey of the High School Class of 1972 (NLS-72), KLIPS has no direct information on ability such as exam scores, GPA, or IQ tests. Korean law restricts the collection of this type of information, and thus data related to ability measures are not included in my dataset.

Several criteria govern the creation of the sample. First, I limited the analysis to those who were working in 2001.
Table 1.1 shows the highest education level completed by the employed, in percentages. 71.8 percent of people had attended high school. In particular, 27.5 percent of the whole sample attended vocational high schools. Compared to 32.7 percent of college graduates, the ratio of vocational high-school graduates in the total sample is large enough to justify the study of the effects of vocational education at the high-school level.

Table 1.1: Education Level Completed

                                 Mean (%)
Over college                       32.7
  General college                  20.7
  Vocational junior college        11.7
  Other college                     0.3
Over high school                   71.8
  General high school              43.6
  Vocational high school           27.5
  GE                                0.7
Less high school                   28.2
N                                  5046

Note: Table reports the percentage of the highest schooling level for the individuals who were working in 2001.

Second, I only include those sample members who attended any high school, since I seek to find the effect of vocational high school alone. If respondents with less than a high-school education are considered in the analysis, the estimation is more complicated than one which focuses only on the wage differentials between the treatment group (vocational high-school graduates) and the comparison group (general high-school graduates). For convenience, I use the respondents who have finished their high-school education.
The key variables used in this chapter are reported in Table 1.2, where Column (1) contains the summary statistics of the whole sample, Column (2) has the summary statistics of the workers who never went to college, and Column (3) shows the summary statistics of the workers who entered college.

Table 1.2: Summary Statistics of the Employed between 1998 and 2010

                                  (1)                        (2)                        (3)
                            All       VH       GH      All       VH       GH      All       VH       GH
Hourly wage               8.484    7.666    9.000    7.185    7.181    7.190   10.073    9.184   10.257
(1,000 won)            (12.758) (13.729) (12.080) (12.832) (14.311) (10.862) (12.545) (11.608) (12.711)
Wage worker               0.798    0.806    0.792    0.753    0.781    0.722    0.852    0.887    0.841
                        (0.402)  (0.395)  (0.406)  (0.431)  (0.414)  (0.448)  (0.355)  (0.317)  (0.365)
Nonwage worker            0.202    0.194    0.208    0.247    0.219    0.278    0.148    0.113    0.159
                        (0.402)  (0.395)  (0.406)  (0.431)  (0.414)  (0.448)  (0.355)  (0.317)  (0.365)
Age                      36.415   35.937   36.717   38.157   36.588   40.000   34.297   33.896   34.445
                       (10.431) (10.807) (10.177) (10.609) (10.725) (10.176)  (9.823) (10.827)  (9.542)
Female                    0.336    0.330    0.339    0.321    0.330    0.310    0.352    0.330    0.359
                        (0.472)  (0.471)  (0.473)  (0.467)  (0.471)  (0.463)  (0.478)  (0.471)  (0.480)
Seoul                     0.269    0.223    0.298    0.241    0.214    0.273    0.299    0.253    0.315
                        (0.443)  (0.417)  (0.457)  (0.428)  (0.410)  (0.446)  (0.458)  (0.435)  (0.465)
Part time                 0.123    0.136    0.114    0.145    0.145    0.145    0.096    0.107    0.092
                        (0.328)  (0.343)  (0.318)  (0.352)  (0.353)  (0.352)  (0.294)  (0.310)  (0.290)
Married                   0.681    0.670    0.687    0.744    0.708    0.786    0.606    0.551    0.619
                        (0.466)  (0.470)  (0.464)  (0.436)  (0.455)  (0.410)  (0.489)  (0.498)  (0.486)
College                   0.455    0.242    0.590
                        (0.498)  (0.428)  (0.492)
VC                        0.164    0.138    0.180                               0.363    0.568    0.306
                        (0.370)  (0.345)  (0.385)                             (0.481)  (0.496)  (0.466)
GC                        0.287    0.104    0.403                               0.637    0.432    0.694
                        (0.453)  (0.306)  (0.491)                             (0.481)  (0.496)  (0.466)
N                          3590     1389     2201     1955     1053      902     1620      336     1299
%                                   38.7     61.3              53.9     46.1              20.7     80.2

Note: Table reports means, with standard deviations in parentheses, for the employed between 1998 and 2010 who were surveyed in KLIPS. VH indicates graduates of vocational high school and VC indicates graduates of vocational junior college. GH indicates graduates of general high school and GC indicates graduates of general college. Column (1) is for the whole sample, Column (2) is for the group that attended only high school, and Column (3) is for the group that entered college.

The mean wage of the vocational high-school graduates is smaller than that of the general high-school graduates. As shown in Column (2), however, when one considers only high-school graduates who did not go to college, the mean wages of the vocational and general high-school graduates are almost the same. Among those who entered high school but did not attain a college-level education, the graduates of vocational high schools earn as much as those of general high schools.

The rates of entering college also differ between general high schools and vocational high schools. As shown in Section 1.2, the aims of the two school types are different, and the students selected into each do not have the same levels of academic or nonacademic ability. It is noticeable that about 60 percent of students in general high school advanced to higher education, while only a quarter of vocational high-school students obtained admission to a college. It should also be noted that vocational high-school students were more likely to go to vocational junior colleges rather than general colleges if they entered post-secondary institutions at all.

Comparing the average wage between college graduates in Column (3) and the workers who completed only secondary education in Column (2) reveals large differentials in wages. Since more general high-school students went to college and college graduates earned more, the gap in wage level can be clearly seen in the whole sample in Column (1).
To keep differences in college attendance from confounding the comparison, I divide the whole sample in Column (1) into the group in Column (2) and the group in Column (3) when analyzing the effects of vocational high school on wages.

1.4 Preliminary Analysis via OLS

In this section, the log wage regression to be estimated is

$$y_i = \beta' x_i + \delta D_i + u_i \tag{1.1}$$

where $y_i$ is the logarithm of the hourly wage of individual $i$, $x_i$ is the vector of covariates including a constant, and $D_i$ is a dummy variable equal to 1 if person $i$ graduated from a vocational high school and 0 otherwise. For now, $u_i$ is assumed to be independent of $x_i$ and $D_i$. This is a typical Mincerian equation, which ordinarily includes years of schooling, experience, and the square of experience. However, I have replaced years of schooling with the dummy $D_i$, so the coefficient $\delta$ estimates the return to, or the loss from, vocational high school relative to general high school. Moreover, instead of experience, I use age and the square of age to capture seniority. Other covariates include a female dummy, a part-time status dummy, marital status, and residence in a metropolitan area (Seoul dummy).

1.4.1 Less College

First, I look at the group consisting of those who went only to high school and attained no higher education. Table 1.3 gives the OLS results. In every specification (1)-(5), the estimated return to vocational high school is not significantly different from zero. This coincides with the finding in Column (2) of Table 1.2 that average wage levels differ little between vocational and general high-school graduates. It means that even when I control for the other covariates correlated with the wage, the wage does not depend on the type of high school from which workers graduated.
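A minimal sketch of how a regression of the form (1.1) can be estimated by OLS. The data are synthetic, not KLIPS, and the function names `ols` and `simulate_wages`, the coefficient values, and the random assignment of the vocational dummy are all assumptions made for illustration only.

```python
import numpy as np

def ols(y, X):
    """Return OLS coefficients for y = X @ b + u via least squares."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def simulate_wages(n=5000, delta=0.05, seed=0):
    """Synthetic log wages with a vocational dummy D (hypothetical values)."""
    rng = np.random.default_rng(seed)
    age = rng.uniform(20, 60, n)
    female = rng.integers(0, 2, n)
    D = rng.integers(0, 2, n)          # randomly assigned here, so OLS is unbiased
    u = rng.normal(0, 0.3, n)          # unexplained part of the log wage
    y = 0.4 + 0.07 * age - 0.0008 * age**2 - 0.35 * female + delta * D + u
    # Columns: constant, age, age squared, female dummy, vocational dummy.
    X = np.column_stack([np.ones(n), age, age**2, female, D])
    return y, X

y, X = simulate_wages(delta=0.05)
coef = ols(y, X)
delta_hat = coef[-1]                   # estimated coefficient on D
print(f"estimated coefficient on D: {delta_hat:.3f}")
```

Because the dummy is randomly assigned in this simulation, OLS recovers the coefficient on D; the chapter's concern is precisely that actual school choice is not random.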
If a student ends his or her education after secondary school and enters the labor market, the vocational education acquired in the vocational high school grants no more productivity than does a general academic education, which emphasizes the academic ability necessary for higher education over skills like using factory machinery or computers.

Table 1.3: OLS Results for High School Graduates without College Degrees
Dependent Variable: log of hourly wage

                (1)         (2)         (3)         (4)         (5)
VH              -0.026      -0.017       0.032       0.025       0.026
                (0.030)     (0.029)     (0.031)     (0.029)     (0.029)
Female                      -0.437***               -0.388***   -0.365***
                            (0.030)                 (0.030)     (0.030)
Age                                      0.083***    0.074***    0.066***
                                        (0.011)     (0.011)     (0.012)
Age²                                    -0.090***   -0.084***   -0.075***
                                        (0.014)     (0.014)     (0.016)
Part Time                                                       -0.251***
                                                                (0.037)
Married                                                          0.082**
                                                                (0.040)
Seoul                                                            0.041
                                                                (0.033)
Constant         1.685***    1.821***   -0.098       0.274       0.415*
                (0.022)     (0.023)     (0.190)     (0.194)     (0.217)
R-Square         0.001       0.0925      0.0662      0.135       0.154
N                1955        1955        1955        1955        1955

* indicates significance at the 10% level, ** at the 5% level, and *** at the 1% level.
Note: Columns represent OLS coefficients. Robust standard errors are reported in parentheses. VH is the indicator of graduates of vocational high school.

This equalization of wages or productivity between the two types of high schools has two possible explanations. First, it is possible that the vocational education given to high-school students did not work as intended. If this is true, the public expenditure on vocational high schools seems inefficient. Under those circumstances, education policymakers should merge vocational high schools and general high schools into one uniform high school, as is done in the U.S., and should not operate vocational programs at the high-school level. However, that policy implication is reasonable only where the assignment of high-school students to vocational high school is random.
Yet school assignment is based on student choice, and thus cannot be random.

Second, we can assume that this equalization results from self-selection, which is typically referred to as nonignorability of the treatment assignment (Rubin, 1978) or as the selection or endogeneity problem (Heckman and Robb, 1985). Less able or less productive students went to vocational high schools, while more productive students who expected to be able, academically or financially, to go to college entered general high schools. Vocational high schools train students to acquire skills required for specific jobs, especially those that do not require college credentials. Meanwhile, general high schools teach students the academic skills necessary to complete higher education. These skills are not useful unless matched with higher academic training, because lower-level jobs make no use of them. Thus more able middle-school graduates are willing to go to general high schools, but if they cannot or will not go to college, their productivity can be equal to that of vocational high-school graduates. If the second explanation is reasonable, OLS cannot identify the true parameter, the return to vocational high school, because its exogeneity assumption fails. It is necessary to look for a remedy for this endogeneity problem.

1.4.2 College graduates

I also investigate the effects of vocational high school among college graduates, adopting the same empirical specification as above. First, I consider only vocational high school as the main education factor affecting hourly wages or productivity. The estimated wage differential then ranges roughly from -15% to -9% in Panel A of Table 1.4. Column (1) gives a simple estimate of the return to, or cost of, vocational high school: -14%, which is the same as the difference in the average logarithm of hourly wages between vocational high-school alumni and general high-school alumni in 2001.
However, this differential cannot be interpreted as the effect of vocational high school. Rather, it may reflect another factor influencing wages, the usual omitted variable problem. To address this problem, the estimation includes the dummy variable $D_{vc,i}$, which indicates whether a worker attended a vocational junior college, as well as the vocational high-school dummy $D_{vh,i}$:

$$y_i = \beta' x_i + \delta_1 D_{vh,i} + \delta_2 D_{vc,i} + u_i \tag{1.2}$$

The estimation result is reported in Panel B of Table 1.4. In Panel B, once the vocational junior-college dummy is included, the effects of vocational high school shown in Panel A become insignificant. Instead, the loss from vocational junior college is large and significant: it ranges from -30% to -15%. These large losses arise because a student enrolled in a vocational junior college needs fewer years to complete the required courses than a student completing a degree at a general college. The shorter schooling brings about lower wages or lower productivity because of less investment in human capital.

From the findings in Panel B, we can argue that one of the reasons for the significant wage differentials between vocational high school and general high school in Panel A is that many vocational high-school students went to vocational junior colleges, and those college alumni earn less than general college graduates. Thus, the losses from vocational high school in Panel A are absorbed into the negative effects of vocational junior college shown in Panel B.
In other words, the effect of vocational high school on wages is not significant when controlling for type of college. When students enter college and complete higher education, the type of high school they attended does not matter to their wages or productivity.

Table 1.4: OLS Result for College Graduates
Dependent Variable: log of hourly wage

Panel A
                (1)         (2)         (3)         (4)         (5)
VH              -0.148***   -0.158***   -0.095***   -0.098***   -0.088***
                (0.042)     (0.041)     (0.036)     (0.036)     (0.036)
Female                      -0.344***               -0.049      -0.041
                            (0.034)                 (0.034)     (0.034)
Age                                      0.126***    0.122***    0.087***
                                        (0.009)     (0.009)     (0.011)
Age²                                    -0.123***   -0.119***   -0.083***
                                        (0.011)     (0.011)     (0.013)
Part Time                                                       -0.215***
                                                                (0.052)
Married                                                          0.201***
                                                                (0.043)
Seoul                                                            0.136***
                                                                (0.032)
Constant         2.051***    2.174***   -0.716***   -0.617***   -0.004
                (0.019)     (0.022)     (0.164)     (0.178)     (0.203)
R-Square         0.008       0.065       0.249       0.251       0.2749
N                1620        1620        1620        1620        1620

Panel B
                (6)         (7)         (8)         (9)         (10)
VH              -0.069*     -0.088**    -0.054      -0.057      -0.050
                (0.042)     (0.041)     (0.037)     (0.037)     (0.037)
VC              -0.305***   -0.266***   -0.165***   -0.164***   -0.152***
                (0.035)     (0.034)     (0.032)     (0.032)     (0.032)
Female                      -0.313***               -0.043      -0.035
                            (0.034)                 (0.034)     (0.034)
Age                                      0.123***    0.120***    0.086***
                                        (0.009)     (0.009)     (0.011)
Age²                                    -0.121***   -0.118***   -0.084***
                                        (0.011)     (0.011)     (0.012)
Part Time                                                       -0.226***
                                                                (0.052)
Married                                                          0.190***
                                                                (0.042)
Seoul                                                            0.120***
                                                                (0.032)
Constant         2.145***    1.979***   -0.588***   -0.503***    0.093
                (0.022)     (0.033)     (0.164)     (0.178)     (0.203)
R-Square         0.051       0.098       0.262       0.263       0.285
N                1620        1620        1620        1620        1620

* indicates significance at the 10% level, ** at the 5% level, and *** at the 1% level.
Note: Columns represent OLS coefficients. Robust standard errors are reported in parentheses. VH is the indicator of graduates of vocational high school. VC is the indicator of graduates of vocational college.

It should be noted that the cost of vocational junior college cannot be interpreted as a negative causal effect of such an education on wages.
Rather, it reflects the shorter period (two years) of human capital investment in these institutions compared to four years in general colleges. In addition, as in the previous subsection, considerable care should be taken with the endogeneity problems that arise from selection on unobserved ability or from omitting important covariates such as ability measures, since attendance at a vocational high school or vocational junior college is not assigned randomly. However, the problem is more complicated here than for the group who did not go to college. In order to analyze the students who attended any type of college while paying due attention to the selection problem, one must consider the dynamic structure of schooling decisions, as in studies using structural estimation (Keane and Wolpin, 1997; Taber, 2001; Belzil and Hansen, 2002), or use control function estimation or multiple treatment effects estimation (Garen, 1984; Blundell, Dearden, and Sianesi, 2005).

1.5 Controlling Endogeneity

From the previous section, we can infer that estimation of the monetary return to vocational high school, or the cost or benefit of graduating from a vocational school, is complicated by two problems. The first is the selection problem referred to in the previous section. It implies that comparing average earnings for the treatment and control groups, that is, those who were vocational high-school students and those who were general high-school students, would not lead to credible estimates of returns to vocational high school.

The second problem is the lack of a large survey-based dataset containing information on both the type of high school from which respondents graduated and ability measures such as exam scores or IQ tests.
Although there are several large cross-sectional datasets and some recent panel datasets that have information on earnings and schooling years, few surveys asked questions on education histories detailed enough to identify the type of high school the respondents attended, or collected any ability measure.

Measurement error would, however, severely constrain the usefulness of ability measures even if such data were available. An ability measure is not a good proxy for all unobservable ability because of serious measurement error. Rather, it has been used to control for some part of ability in the tradition of the self-selection model since Willis and Rosen (1979). Since the dataset has no information on academic ability, I try to solve the endogeneity problem arising from selection without an ability measure: I exploit exogenous variation on the supply side of education.

1.5.1 Sources of Identification

I attempt to address the selection or endogeneity problem by focusing on changes in the aggregate number of new vocational high-school students across school districts and over time. Although there is considerable evidence that the decision to enter a vocational high school is endogenous, we assume that the average percentage of vocational high-school students among those born in a specific year and residing in a specific school district is not; that is, the average percentage in a birth year-school district cohort is exogenous. For example, say that Student A was born in 1980 and lived in Seoul. In 1996, she could apply to any high school in Seoul, but the average percentage of vocational high-school students in the school district of Seoul is exogenous, and hence she must take this capacity, 0.372, as given.
It can be stated as the following assumption:

$$\bar{D}_i \perp u_i \tag{1.3}$$

where $\bar{D}_i$ represents the ratio of the capacity of vocational high schools to that of all high schools and $u_i$ is the error term in the estimation equation (1.1). For this argument, it is necessary to assume that the capacities of general high schools and vocational high schools are exogenously determined by the educational authorities. The Korean education authorities, including the Ministry of Education and Human Resources Development, handed the right to set capacities over to the colleges in the mid-1990s, but the secondary education system is still controlled directly by the government, from textbook publishing to school capacities.

The second assumption is the monotonicity of treatment:

$$D_i(\bar{D}_i) > D_i(\bar{D}_i') \quad \text{if } \bar{D}_i > \bar{D}_i' \tag{1.4}$$

where $\bar{D}_i$ again represents the ratio of the capacity of vocational high schools to that of all high schools, and $D_i(\cdot)$ denotes the high-school type of student $i$ as a function of that ratio. For simplicity, consider an extreme example: every female student wants to go to a vocational high school because graduating from college is not attractive to female students. Then, if Student A wishes to attend a vocational high school, her best course of action is to achieve a high GPA and a high score on the general entrance examination. She can gain admission to a vocational high school if her GPA is high enough. But the capacity of vocational high schools must affect her decision or the result of her efforts. If the number of slots in a given vocational high school is limited, she cannot go to that school unless she is an excellent student. The likelihood of entering a vocational high school thus depends on the ratio of the capacity of vocational high schools to that of all high schools. In the other extreme case, where every student prefers to go to a general high school, the type of high school any given student attends depends on the ratio, too.
In any case, Student A’s likelihood of going to a vocational high school will be higher if she moves from Seoul to Pusan, because the ratio in Pusan is 0.451, higher than Seoul’s 0.372.

Another assumption is that an individual student cannot apply to a school in another school district. Unless this assumption is satisfied, the ratio of the number of freshmen in vocational high schools to those in all high schools in the district cannot affect the student’s choice of high-school type. In the 1960s, the authorities outlawed applying to schools outside one’s own district, and the law was strengthened after the high-school entrance exam was abolished. Thus this assumption holds in practice in Korea.

One might expect little variation over time in the proportion of vocational high-school students. Since 1970, however, the ratio of incoming general high-school students to those in vocational high schools has changed across school districts and over time. The ratio in each year and each district is calculated from the Statistical Yearbook of Education published by the Korean Educational Development Institute and the Ministry of Education and Human Resources Development. However, the Ministry began to publish these statistical yearbooks only in 1964, and the data available online date only from 1971. Figure 1.1, Figure 1.2, and Figure 1.3 show the ratios and patterns calculated from those yearbooks.

The natural instrumental variable is the target fraction of the capacities that the educational authorities set exogenously.
A problem with this strategy is that we do not know the exact capacity number targeted, but as a close approximation we use the actual number or percentage of incoming vocational students as a fraction of all new high-school students in a birth year-school district cohort.

[Figure 1.1: Ratio of Vocational High School. Plot of the ratio (vertical axis, 0 to 0.8) by year (horizontal axis, 1971 to 2003) for each school district (seoul, pusan, daegoo, inchon, kwangjoo, taejon, ulsan, gyeong-gi, gangwon, chung-book, chung-nam, jeon-book, jeon-nam, gyeong-book, gyeong-nam, jeju) and for all districts combined.]

[Figure 1.2: Ratio of Vocational High School: Male. Same axes and series as Figure 1.1.]

[Figure 1.3: Ratio of Vocational High School: Female. Same axes and series as Figure 1.1.]

1.5.2 Estimation Result

As in Section 1.4, the log earnings regression to be estimated follows the linear specification

$$y_i = \beta' x_i + \delta D_i + u_i$$

where $y_i$ is the log of the hourly wage of individual $i$, $x_i$ is a covariate vector including a constant, and $D_i$ is a dummy variable equal to 1 if person $i$ graduated from a vocational high school and 0 otherwise. Now, contrary to the preliminary OLS analysis, I do not assume that the unexplained part of log earnings, $u_i$, is independent of the dummy variable $D_i$ indicating graduation from a vocational high school.
As discussed in Subsection 1.5.1, the ratio of new students in vocational high schools to those in all high schools in a birth year-school district cohort will be used as an instrument for the indicator of vocational high-school attendance. To construct $\bar{D}_{j,t}$, I define $NV_{j,t}$ and $NH_{j,t}$ to be the number of new students in vocational high schools and the total number of new students in all high schools, respectively, in year $t$ and school district $j$:

$$\bar{D}_{j,t} = \frac{NV_{j,t}}{NH_{j,t}}$$

The instrument used in the IV regressions is $\bar{D}_i = \bar{D}_{j,t}$ if individual $i$ entered high school in school district $j$ in year $t$.

Now I exploit the third assumption, that an individual student cannot apply to a school in another school district. The school district in which individual $i$ resided when choosing a high school can serve as another instrumental variable, $R_i$, a vector of dummy variables:

$$R_{j,i} = \begin{cases} 1 & \text{if } i \text{ lived in school district } j \\ 0 & \text{otherwise} \end{cases} \qquad R_i = (R_{1,i}, \ldots, R_{15,i})$$

Consider the first-stage regression of the endogenous regressor, the vocational high-school graduate indicator $D_i$, on the covariates $x_i$, the instrument $\bar{D}_i$, and the other instrumental variables $R_i$:

$$D_i = \pi' x_i + \gamma \bar{D}_i + \lambda' R_i + \varepsilon_i \tag{1.5}$$

Table 1.5 reports the first-stage regression results, which describe the relationship between the vocational high-school dummy and the ratio of the number of new students in vocational high schools to the number of all new high-school students in the year when the respondent entered high school and in the school district where she or he resided. Conditional on the place where the workers resided when they were high-school students and on other covariates, the ratio of the capacity of vocational high schools to that of all high schools significantly affects the choice of high-school type.
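The construction of the capacity-ratio instrument and the two-stage procedure can be sketched as follows. This is a minimal illustration on synthetic data, not the chapter's estimation: the function name `two_stage_least_squares`, the cell counts, the ability variable `v`, and the true coefficient 0.3 are all hypothetical, and only the ratio (not the district dummies) is used as an excluded instrument.

```python
import numpy as np

def two_stage_least_squares(y, X_exog, d, Z_excl):
    """2SLS for y = X_exog @ b + delta * d + u, instrumenting d with Z_excl."""
    Z = np.column_stack([X_exog, Z_excl])        # all instruments
    pi, *_ = np.linalg.lstsq(Z, d, rcond=None)   # first stage: d on instruments
    d_hat = Z @ pi                               # fitted treatment
    X2 = np.column_stack([X_exog, d_hat])        # second stage regressors
    coef, *_ = np.linalg.lstsq(X2, y, rcond=None)
    return coef                                  # last entry is delta

rng = np.random.default_rng(0)
n_cells = 200                                    # hypothetical district-year cells
NH = rng.integers(500, 2000, n_cells)            # all new high-school students
NV = (NH * rng.uniform(0.2, 0.6, n_cells)).astype(int)  # new vocational students
ratio = NV / NH                                  # capacity ratio per cell

n = 20000
cell = rng.integers(0, n_cells, n)               # each person's district-year cell
D_bar = ratio[cell]                              # instrument for person i
v = rng.uniform(0, 1, n)                         # unobserved ability rank
D = (v < D_bar).astype(float)                    # less able students fill vocational slots
u = 0.5 * (v - 0.5) + rng.normal(0, 0.2, n)      # ability raises wages, so cov(D, u) < 0
delta_true = 0.3
y = 1.0 + delta_true * D + u

const = np.ones((n, 1))
ols_coef, *_ = np.linalg.lstsq(np.column_stack([const, D]), y, rcond=None)
iv_coef = two_stage_least_squares(y, const, D, D_bar.reshape(-1, 1))
delta_ols, delta_iv = ols_coef[-1], iv_coef[-1]
print(f"OLS: {delta_ols:.3f}  IV: {delta_iv:.3f}  true: {delta_true}")
```

In this simulated setting OLS understates the coefficient because the less able select into treatment, while the ratio instrument, which is independent of ability by construction, moves the estimate back toward the true value. Standard errors from the second-stage regression would not be valid as computed here; a dedicated IV routine would be needed for inference.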
In fact, the coefficient on the ratio of new students in vocational high schools to those in all high schools in a birth year-school district cohort has a positive sign. This is consistent with the monotonicity of treatment assumption in equation (1.4).

Table 1.5: First-stage Regression Estimation
Dependent Variable: VH

                (1)         (2)         (3)         (4)         (5)
The Ratio of     0.138       0.525***    0.487***    0.485***    0.507***
Vocational      (0.169)     (0.132)     (0.181)     (0.182)     (0.181)
High School
Female                       0.041                   0.011       0.007
                            (0.027)                 (0.027)     (0.028)
Age                                     -0.029*     -0.028**    -0.038**
                                        (0.014)     (0.014)     (0.015)
Age²                                     0.026       0.026       0.036
                                        (0.021)     (0.021)     (0.023)
Part Time                                                        0.023
                                                                (0.036)
Married                                                          0.048
                                                                (0.038)
Seoul                                                           -0.068*
                                                                (0.036)
Constant         0.593**     0.567**     1.140***    1.121***    1.250***
                (0.237)     (0.243)     (0.325)     (0.330)     (0.343)
School region    Yes         Yes         Yes         Yes         Yes
Dummy
R-Square         0.040       0.045       0.069       0.070       0.074
N                1539        1539        1539        1539        1539

* indicates significance at the 10% level, ** at the 5% level, and *** at the 1% level.
Note: Columns represent estimated coefficients. Robust standard errors are reported in parentheses. VH is the indicator of graduates of vocational high school.

Table 1.6 reports IV estimates, adopting the same empirical specification used in Subsection 1.4.1. In Columns (3), (4) and (5) of Table 1.6, the IV estimates of returns to vocational high school are significantly positive.
This also implies that the IV estimates of returns to vocational high school are larger than the OLS estimates, which are insignificantly different from zero, by roughly 0.25 to 0.30 log points. Even though the estimated coefficients are not significant when one controls for no covariates other than the female dummy, in Columns (1) and (2) of Table 1.6, they have positive signs and are much greater than the OLS estimates reported in Columns (1) and (2) of Table 1.3.

Table 1.6: IV Estimation Results for High School Graduates without College Degrees
Dependent Variable: log of hourly wage

                (1)         (2)         (3)         (4)         (5)
VH               0.168       0.114       0.328**     0.274*      0.310**
                (0.171)     (0.161)     (0.156)     (0.148)     (0.147)
Female                      -0.402***               -0.348***   -0.329***
                            (0.034)                 (0.035)     (0.034)
Age                                      0.084***    0.061***    0.055***
                                        (0.019)     (0.019)     (0.021)
Age²                                    -0.090***   -0.062**    -0.055*
                                        (0.029)     (0.029)     (0.030)
Part Time                                                       -0.296***
                                                                (0.041)
Married                                                          0.061
                                                                (0.044)
Seoul                                                            0.094**
                                                                (0.039)
Constant         1.559***    1.737***   -0.317       0.293       0.376
                (0.098)     (0.093)     (0.359)     (0.362)     (0.394)
R-Square         0.001       0.096       0.072       0.141       0.172
N                1539        1539        1539        1539        1539

* indicates significance at the 10% level, ** at the 5% level, and *** at the 1% level.
Note: Columns represent estimated coefficients. Robust standard errors are reported in parentheses. VH is the indicator of graduates of vocational high school. Every column uses the ratio of vocational high schools to all high schools and the dummy variables indicating the school district as instruments.

These IV estimates show that workers trained in vocational secondary education earn about 30 percent more than general high-school graduates who did not attend higher education, as long as the instrumental variables are truly exogenous to the error term $u_i$ in the empirical specification. Under that condition, vocational education has a clear effect on productivity and hourly wages.
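The direction of the gap between the OLS and IV estimates can be read off the textbook probability limits. The following is a minimal sketch for the case without covariates and with a constant effect, writing $\delta$ for the coefficient on $D_i$ in equation (1.1) and $\bar{D}_i$ for the capacity-ratio instrument; the symbol $\delta$ and the no-covariate simplification are assumptions made here for illustration, not the chapter's full specification.

```latex
\begin{align*}
\operatorname{plim}\hat{\delta}_{OLS}
  &= \frac{\operatorname{cov}(D_i, y_i)}{\operatorname{var}(D_i)}
   = \delta + \frac{\operatorname{cov}(D_i, u_i)}{\operatorname{var}(D_i)}, \\
\operatorname{plim}\hat{\delta}_{IV}
  &= \frac{\operatorname{cov}(\bar{D}_i, y_i)}{\operatorname{cov}(\bar{D}_i, D_i)}
   = \delta + \frac{\operatorname{cov}(\bar{D}_i, u_i)}{\operatorname{cov}(\bar{D}_i, D_i)}.
\end{align*}
```

If less able students select into vocational high schools, so that $\operatorname{cov}(D_i, u_i) < 0$, the OLS limit lies below $\delta$. If the capacity ratio is uncorrelated with $u_i$ and $\operatorname{cov}(\bar{D}_i, D_i) \neq 0$, the second term of the IV limit vanishes and the IV limit equals $\delta$.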
The finding that the IV estimates are greater than the OLS estimates is consistent with previous studies that estimate returns to schooling by IV, in which the OLS estimates also turn out to be smaller than the IV estimates. It is evident from Card (1999)’s survey and the papers reviewed in its introduction that the IV estimates of returns to schooling typically exceed the corresponding OLS estimates, often by about 30%. However, those papers presume that OLS leads to upward-biased estimates of the true returns to education. For example, they reason that a more able person is likely to invest more in human capital and thus to stay in school longer. Under that kind of assumption, the even larger IV estimates present something of a puzzle.

Contrary to the above cases, in the case of vocational high school in Korea, the selection bias runs in the direction opposite to what is usually believed. If every middle-school student is willing to go on to high school, the treatment is assigned to the less able students:

$$\operatorname{cov}(D_i, u_i) < 0 \tag{1.6}$$

Then the OLS estimates underestimate the effect of vocational education in secondary school on the logarithm of wages or the productivity of a worker. As long as the ratio $\bar{D}_i$ is truly orthogonal to the error term $u_i$, that is, to unobservable ability or to any financial constraints the student faces, the IV estimates reported in Table 1.6 remain consistent because they eliminate the selection bias.

In other words, we can give the following interpretation. Less able or less productive students went to vocational high schools, but more productive students, who expected to be able to go to college, entered general high schools for academic skills. The vocational high schools give students productive training programs, and these students acquire skills demanded by specific jobs, especially those that do not require college credentials.
Meanwhile, general high schools teach students the academic skills necessary to complete higher education. However, some students in general high schools fail to enter institutions of higher education, and they participate in the labor market without the skills imparted by vocational high schools. These academic skills are not useful for jobs that do not require college degrees. Even if general high-school students are potentially productive, the education they acquired in their school years does not meaningfully increase their productivity.

On average, the productivity of general high-school alumni without college degrees seems to be equal to that of vocational high-school graduates. The OLS estimates show no significant effect of vocational education at the high-school level. But the IV estimates control for the selection problem and eliminate the bias, showing a significant effect of vocational high school on productivity. Contrary to the widespread skepticism regarding the value of vocational education in secondary institutions, the estimated returns to vocational high school are positive and significantly high if we consider only those high-school graduates who did not go on to college.

Note that these results can be interpreted as the Local Average Treatment Effects (LATE) introduced by Imbens and Angrist (1994), which can explain why the magnitude of the estimates is so large. The IV estimates describe only those students at the academic margin: those who are neither remarkably high achievers nor profoundly low achievers, and who could thus be shifted from one type of school to the other by small changes in capacity. The students who were induced to attend vocational high schools by the higher capacity of those schools earn more than the comparable students who took general education in high school. Both groups of students have the same academic ability and the same unobservable characteristics except for their choice of high school.
The skills acquired through the vocational courses are better matched with non-college-level jobs than is the academic ability developed by general courses.

1.6 Conclusion

This chapter examines the relative benefits of vocational training in high school in the context of a rapidly developing economy that has experienced major technological and institutional change. I exploit changes in the capacities of vocational high schools, set by the educational authorities, to avoid the selection bias that arises because less able students are more likely to enroll in vocational high schools. The capacity of vocational high schools must affect the decisions of students choosing where to receive their secondary education. If the capacity of vocational high schools were larger, the likelihood of any given student’s going to a vocational high school would be higher. The changes in capacity were exogenous and shifted students from general high schools to vocational high schools every year. For marginal students, this policy change did not significantly affect other academic skills or unobservable characteristics; it affected only their choice between attending a vocational school and attending a general school. Using the KLIPS data from 1999-2010 and the Statistical Yearbook of Education from 1970-1999, I find evidence that men who were affected by the changes in the capacity of vocational high schools earn more than their similarly qualified general high-school counterparts.

These findings are in sharp contrast to the results of previous studies, which found that vocational training programs in secondary schools are ineffective. The relationship between vocational training and labor-market returns highlighted by previous studies may be largely a consequence of selection.
It is important to note that these estimates are more relevant for the marginal student, shifted from general to vocational education, than for the average student whose decision is unaffected by changes in capacity. But this group of marginal students is probably the most likely to be affected by any change in capacity that encourages general education over vocational training.

Chapter 2

Do Developmental Mathematics Courses Develop the Mathematics? Addressing Missing Outcome Problem in Regression Discontinuity Design

2.1 Introduction

The importance of mathematics cannot be emphasized too much. The importance of mathematics taught in secondary school has been shown in many studies of wages and productivity. Among all the subjects taught in secondary school, mathematics and science matter most to the productivity of individuals and national economies. Individual achievement in high-school mathematics¹ correlates with the wages of high-school graduates. The estimated effects of mathematics achievement on an individual’s productivity are stronger than those of any other subject, such as English reading or writing (Rose and Betts, 2004; Goodman, 2012; Altonji, Blom, and Meghir, 2012)². Also, average student performance in mathematics³ has been shown to contribute more to economic growth than performance in any other subject when one controls for the quantity of education or years of schooling and restricts the results to the developed countries (Hanushek and Kimko, 2000; Barro, 2001; Hanushek and Woessmann, 2008).

¹ As measured by i) indicator variables of whether the individual completed the advanced math course and ii) the number of math courses completed by the individual.
² Altonji (1995) first attempts to show systematically the effects of high-school curriculum on wages. The effects he finds are weak, though mathematics has a larger effect than any other subject. Similarly, Levine and Zimmerman (1995) try to estimate such effects, restricting their study to math and science courses, and their estimation results are stronger than Altonji (1995)’s.
³ As gauged by scores on standardized international tests such as PISA.

Differences in math achievement might contribute to the wage gap, and lower performance in math might impair economic growth. In fact, in the U.S., the lower the economic and social status of an individual student, the lower his or her math achievement is likely to be. In addition, American students perform more poorly than their peers in other industrialized countries on standardized math exams. Aware of the importance of math in high school and of the weakness of education in the U.S., many studies have focused on the determinants of high-school math achievement and on how to raise it through reforms in graduation requirements and curriculum standards. In particular, recent studies have shown that algebra courses play an important role in math achievement at the secondary level (Schneider, Swanson, and Riegle-Crumb, 1997; Gamoran and Hannigan, 2000), and hence are a key factor in performance in postsecondary-level math (Adelman, 2006; Long, Iatarola, and Conger, 2009; Long, Conger, and Iatarola, 2012) and, finally, in labor-market outcomes (Rose and Betts, 2004; Goodman, 2012; Altonji, Blom, and Meghir, 2012). Moreover, many attempts have been made to improve low achievement in algebra courses; for example, an acceleration of Algebra 1 and a universal algebra policy were implemented in California.

Yet in spite of many interventions at the early stage in secondary school, many students graduate from high school with insufficient math skills. To those students, community colleges have generously granted a second chance through the open admission policy and a sequence of developmental mathematics courses.
Many students are assessed as lacking skills in algebra or high-school math, and they are barred from college-level mathematics such as trigonometry and calculus. Instead, they are assigned to one of a variety of courses in developmental mathematics. Only about 9% of community-college students were assigned to non-developmental mathematics courses in California (Serban et al., 2005), and over 40% of community-college students need to be educated at the high-school level nationwide (Adelman, 2004). Since most of these students come from minority backgrounds, developmental mathematics may be an excellent way to decrease the wage gap by improving their math skills. However, the material covered in developmental mathematics at community colleges is identical to that of high-school math. Some opponents argue that this is a typical example of wasted public resources and that providing almost costless second chances can demoralize high-school mathematics education.

Despite its prevalence and the controversy surrounding it, few careful studies have examined the effect of the developmental mathematics offered at community colleges. Most studies of community colleges' developmental mathematics are descriptive analyses. Two notable studies of general developmental education estimate the effect of developmental mathematics on various outcomes (Calcagno and Long, 2008; Martorell and McFarlin, 2011). To address the endogeneity or self-selection problem, both exploit the assignment rule based on test scores in a regression discontinuity design. One drawback is that these studies pay attention not to academic performance in mathematics itself, but to general long-term outcomes such as credential attainment, transfer to four-year institutions, and graduation with a degree. These outcomes are far from a direct measure of math achievement. Both studies neglect the fundamental function of developmental education, which is that a developmental course should help a student make up for what he or she lacks and be ready for the next-level course.

Instead of long-term outcomes, I examine a short-term outcome: performance in a math course⁴. I investigate the effect of developmental math on the student's performance in the next-level math course. For each developmental math course, the corresponding outcome is defined by i) the grade point average (GPA) in its subsequent course and ii) the time to complete that subsequent course. The corresponding control group is the students assigned directly to this subsequent course. Because of my interest in improving algebra achievement, I look at performance in two algebra courses: elementary algebra and intermediate algebra. Their corresponding treatments are their prerequisite courses, pre-algebra and elementary algebra respectively. Thus, studying the effect of the developmental math sequence program is equivalent to studying the effect of a prerequisite course on its one specific subsequent course.

⁴ For a developmental English program, Moss and Yeaton (2006) use the letter grade in the first college-level course as the measure of performance. The treatment is the course one level below the college English course. They use regression discontinuity design to address the endogeneity of course assignment. However, their interpretation of the estimated effects is unclear and they ignore the missing outcome problem.

Using a longitudinal dataset from one community college in southern California, this study estimates the effect of a developmental math course on performance in its subsequent math course. I also rely on regression discontinuity design, since the placement policy of the chosen community college assigns students to a specific course based on their scores on the assessment test. Unlike the other studies using regression discontinuity design, however, one serious difficulty arises here: the missing outcome problem. Achievement in a math course can be observed and defined only if a student completes it. A conventional approach to missing outcomes (or the sample selection problem) is to generate control function variables that correct the bias from the missing outcomes (or sample selection) by exploiting exclusion restrictions or instrumental variables. But such an approach cannot be used here, since the observability of the outcome is discontinuous at the cutoff point⁵. Due to the structure of the developmental sequence, observability differs drastically between the control and treatment groups, even when the study is restricted to the students whose test scores are close to the cutoff point. Those who are assigned to the prerequisite course are significantly less likely to proceed to the next-level course and finish it, because assignment to the prerequisite course requires more time and higher opportunity costs.

The main contribution of this study is to compute bounds for the treatment effect of the prerequisite math course on the next math course in a sequence of developmental math courses, addressing the missing outcome problem in regression discontinuity design, a problem that cannot be handled by the conventional method. By adapting the bounding approach for missing outcome problems (Horowitz and Manski, 1995, 2000; Lee, 2009) to regression discontinuity design, I can solve the problem of the structural difference in observability between the control and the treatment groups.
Applying this bounding approach in regression discontinuity design, I find that assignment to developmental courses would increase achievement in the subsequent math courses and reduce the time to complete the main course. The estimated effects of some developmental courses are found to be insignificant, but these results are confounded with a discontinuity in high-school math achievement, which is measured by the multiple measure points calculated by the selected community college. After adjusting for this discontinuity, the insignificance of the estimates is shown to be due to downward bias. This result contrasts with the ineffectiveness of developmental mathematics on long-term outcomes such as transferring to four-year colleges and labor-market outcomes (Calcagno and Long, 2008; Martorell and McFarlin, 2011).

⁵ Another reason is that reliable exclusion restrictions or instrumental variables cannot be found.

The rest of this chapter is organized as follows. Section 2.2 describes developmental education and the sequences of mathematics courses, and gives a summary of the previous literature. Section 2.3 begins with a brief description of the estimation method in regression discontinuity design. It is followed by an explanation of the nature of the missing outcome problem that appears in this study. The bounding approach is then derived in the context of this study, taking both the missing outcome problem and regression discontinuity design into account. In Section 2.4, I describe the sample and the outcomes used for the analysis. Section 2.5 reports the results from the empirical analysis of the chosen community college, and Section 2.6 discusses the validity tests for regression discontinuity design. Section 2.7 concludes.

2.2 Developmental Math Program in Community Colleges

2.2.1 Developmental Education in Community Colleges

One of the primary roles of community colleges is to offer developmental, remedial or preparatory education (Cohen and Brawer, 2008; Grubb, 2004).
The definition of developmental coursework is straightforward. Developmental education in community colleges is defined as coursework below college level offered at postsecondary institutions⁶. In the process of developmental education, students learn the academic skills and knowledge that should have been acquired in high school.

⁶ Developmental or remedial programs in K-12 are different from the ones used by community colleges or postsecondary schools. For example, summer school and grade retention are designed to help disadvantaged students reach the minimum standard at those schools.

The reason developmental education is so widely practiced in community colleges is that such colleges adopt an open admission policy. The open admission policy admits anyone who wants to enroll in a community college, without entrance requirements. Due to the open admission policy, however, there is wide variation in students' academic preparation. In particular, the most poorly prepared high-school graduates attend community colleges, and they want to go on to four-year colleges. Developmental education is designed to give those students the chance to make up their deficiencies in skill. Because of developmental education, community colleges have been called the most important "second-chance" institutions and "people's colleges" (Grubb, 2004). Most community colleges offer developmental education in two fundamental subjects: English reading/writing⁷ and mathematics.

To determine whether a student should enroll in developmental coursework, he or she is assessed through placement tests when entering the community college. Placement tests assess how much students learned in high school and determine what courses are appropriate for them. The placement test can reveal how many students need developmental education. According to Serban et al. (2005), only about 9% of students were assigned to non-developmental mathematics courses and about 27% of students were assigned to non-developmental English courses in the California Community College System⁸.

⁷ English as a second language (ESL) programs can belong to developmental reading/writing education.
⁸ Nationwide, more than 40% of community-college students need to be educated at the high-school level (Adelman, 2004).

2.2.2 A Sequence of Mathematics Courses

In community colleges, developmental mathematics takes priority over other developmental education. On average, community colleges offer one more developmental course in mathematics than in English reading or writing (Parsad, Lewis, and Greene, 2003). At the level of individuals, Adelman (2004) finds that, among freshmen enrolled in any developmental education at community colleges, the proportion taking only developmental mathematics is at least 25% higher than the proportion taking other developmental courses. Moreover, a developmental mathematics course costs more than other developmental courses or regular college courses because of the large number of students in the developmental courses and the very high rates of withdrawal.

Developmental mathematics courses are differentiated and sequentially organized. Typically, mathematics is organized as a cumulative and linear sequence of topics. These sequences are designed so that a student must master certain concepts and skills in an assigned course before advancing to a course one level higher. If the student does not master the given concepts and skills, he or she cannot enroll in higher-level courses such as college-level mathematics. Thus, individual courses are part of a larger unified subject that is minimally necessary for learning college-level mathematics. These courses are taught with progressive levels of difficulty throughout the developmental sequence.
The sequence of developmental mathematics courses is organized hierarchically by topic and ability tracking.

Most of these properties of developmental mathematics are shared with secondary schools' mathematics sequences⁹. The courses taught in the developmental mathematics sequence of community colleges are equivalent to the ones in the high-school mathematics sequence. The most common courses of the developmental sequence are 1) arithmetic, 2) pre-algebra, 3) elementary algebra, and 4) intermediate algebra. Arithmetic is generally the lowest level of mathematics. It reviews the fundamentals of arithmetic that are essential to success in the other mathematics courses, and it covers the material of pre-8th-grade mathematics. A pre-algebra course bridges the gap between arithmetic and general algebra. An elementary algebra course is for those who have not taken algebra 1 in high school or whose preparation is deficient, while an intermediate algebra course covers the material of algebra 2 in high-school math. The distinctive feature of developmental math courses offered in community colleges is that they teach students high-school mathematics within one unique sequence. However, taking high-school math in community colleges could be a waste of time and resources for some students who are assigned to it in spite of having already taken it.

⁹ High-school mathematics sequences are described in Schneider, Swanson, and Riegle-Crumb (1997).

2.2.3 Algebra

In the high-school curriculum especially, algebra courses are regarded as the most important. Intermediate algebra or algebra 2 is a key factor in academic achievement at the college level nationwide (Adelman, 2006; Long, Iatarola, and Conger, 2009; Long, Conger, and Iatarola, 2012) and in labor-market outcomes (Rose and Betts, 2004; Goodman, 2012; Altonji, Blom, and Meghir, 2012). The largest gains occur at algebra 2.
Although taking elementary algebra or algebra 1 (or pre-algebra) alone does not guarantee any improvement in readiness for college-level math or in labor-market outcomes, it is identified as the gateway to success in the algebra sequence. Access to algebra courses has been raised as a serious equity and civil-rights issue, and hence many policies accelerating algebra instruction into middle school¹⁰ have been implemented to enhance student success in algebra (Gamoran and Hannigan, 2000; Loveless, 2008; Clotfelter, Ladd, and Vigdor, 2012).

¹⁰ A kind of early intervention.

Unlike accelerated algebra instruction, algebra courses in the developmental mathematics sequences offered in community colleges are a kind of late intervention. They are intended for students who are deficient in algebra or who have not attempted it in secondary school. The developmental sequence is intended to help those students not only prepare for college-level math but also develop skills and knowledge of algebra. For example, knowledge of elementary algebra is weak and the rate of completion of intermediate algebra is low among the students enrolled in community colleges in California (Serban et al., 2005), even though early algebra-taking rates exceeded 59% in California, higher than in the other states¹¹ (Loveless, 2008). Many attempts to improve algebra skills in high school have been made, but they have been shown to be ineffective (Clotfelter, Ladd, and Vigdor, 2012). With little effect from this early intervention, late interventions such as developmental education in community colleges can play an important role. The disadvantage of late interventions is that they would doubly waste resources unless they are effective in improving mathematics skills in those who did not benefit from intervention at an early age.
So far, studies have not investigated whether the developmental mathematics sequence has helped students who are deficient in algebra make up for their lack of knowledge.

¹¹ This result may come from the fact that the mathematics requirement for graduation is not strict in California. Completing one year of algebra 1 is the minimum requirement, though the other course is necessary for postsecondary success.

2.2.4 Previous Literature and their Limitations

While most early studies of developmental mathematics suffer from endogeneity or selection problems because math enrollments are not randomized, the recent studies address selection bias well by using regression discontinuity design (Calcagno and Long, 2008; Martorell and McFarlin, 2011) or instrumental variable estimation (Bettinger and Long, 2009). One drawback of these recent studies is that they are interested only in general academic outcomes such as credential attainment, transfer to four-year institutions, or graduation with a degree, which are somewhat removed from any direct measure of mathematics achievement¹². They do not pay attention to academic performance in mathematics itself. Moreover, they do not consider the detailed structure of developmental sequences, which assign students to various levels. In contrast to the previous studies, Bailey, Jeong, and Cho (2010) examine the relationship between the initially assigned mathematics course and an interesting outcome: the highest level that a student reaches in the developmental sequence. They show that the lower the level at which a student is placed, the less likely he or she is to complete the developmental sequence, but this cannot be taken as firm evidence of causality because of the limitations of their descriptive method.

The essence of any developmental mathematics sequence is that a course in the sequence should be designed to help a student make up for what he or she lacks and be ready for the next-level course.
Any given course is the prerequisite to the next-level course in any developmental sequence. Most studies of developmental mathematics have not investigated whether the aim of developmental math programs is attained, i.e., whether the assigned courses in a developmental math sequence are effective in developing skill in their subsequent courses. Estimating the effect of the developmental math sequence program is therefore equivalent to estimating a prerequisite course's effect on its subsequent course¹³.

¹² The studies of developmental mathematics using other measures of outcomes and other methodologies are well summarized in Bahr (2008), but most are descriptive analyses.
¹³ In this study, algebra courses and their relevant courses are of interest.

As in other studies using regression discontinuity design, the enforced assignment rule based on test scores provides a credible regression discontinuity design that addresses concerns regarding selection into courses on the basis of unobserved characteristics when studying the effect of the prerequisite course on achievement in its subsequent course. Although regression discontinuity design controls for the endogeneity problem in the study of math achievement itself, one serious difficulty arises: the missing outcome problem. Many of those who were assigned to a low-level math course do not proceed to the next level, even when they completed their assigned course¹⁴. The lower the level to which a student is assigned, the more time he or she spends there and the more it costs him or her to be in a community college. Those students are more likely to leave. In addition, easy access to community colleges through the open admission policy and low tuition makes it easier not only to enter but also to leave the institutions. Restricting the sample to those students who complete the main course would introduce a sample selection problem.
The missing outcome (or sample selection) problem that occurs in the study of community colleges' developmental mathematics sequence is much more difficult to handle than the one that arises in other studies, because the assignment itself creates a discontinuity in the proportion of missing outcomes between the treatment group and the control group. In this case, it is impossible to correct or adjust for the bias from sample selection in the context of regression discontinuity design, even if exclusion restrictions can be found. In the next section, I suggest how to address missing outcome (or sample selection) problems in regression discontinuity design.

¹⁴ The same pattern is frequently observed in the high-school math sequence (Schneider, Swanson, and Riegle-Crumb, 1997).

2.3 Empirical Strategy: Bounding Approach in Regression Discontinuity Design

2.3.1 Regression Discontinuity Design

This subsection presents an econometric model for the regression discontinuity design. It is understood in the context of Rubin's potential outcome model. I simplify the situation of community colleges, assuming that there is one main math course and one prerequisite course; e.g., the main math course is elementary algebra and the prerequisite is pre-algebra. The latter is a treatment to improve achievement in the main course. The treatment group consists of the students who are assigned to the prerequisite, while the control group consists of the students who are assigned to the main course directly. $Y_{i,1}$ is what a given student $i$ would achieve in the main course if he or she were assigned to the prerequisite. $Y_{i,0}$ is what a given student $i$ would achieve in the main course if he or she were assigned to the main course. Both outcomes cannot be observed simultaneously for the same student $i$. Denote a binary indicator for taking the prerequisite mathematics by $T_i$.
$$T_i = \begin{cases} 1 & \text{if student } i \text{ is assigned to the prerequisite} \\ 0 & \text{otherwise} \end{cases}$$

Then the observable achievement in the main math course $Y_i$ for a student $i$ is expressed in the following equation.

$$Y_i = T_i Y_{i,1} + (1 - T_i) Y_{i,0} \tag{2.1}$$

The individual causal effect of the prerequisite is the difference in the two potential outcomes, $\tau_i = Y_{i,1} - Y_{i,0}$. Then the average treatment effect is identified as the difference in two conditional expectations, $E(\tau_i) = E(Y_{i,1}) - E(Y_{i,0}) = E(Y_{i,1} \mid T_i = 1) - E(Y_{i,0} \mid T_i = 0)$, if $T_i$ is randomly assigned, i.e., $(Y_{i,1}, Y_{i,0}) \perp T_i$. However, the prerequisite course is not randomly assigned in the real world. This causes a problem in identifying the causal effect of the prerequisite course.

Regression discontinuity design takes advantage of cutoff policy rules to estimate the causal effect of the prerequisite course on achievement in the main math course. A usual assignment rule is a cutoff policy based on the student's assessment test score. A student is assigned to a prerequisite course if her or his score on the assessment test is less than the exogenously determined cutoff score. When looking at the students whose test scores are close to a preset cutoff point, regression discontinuity design is similar to a random experiment in which a prerequisite course is assigned by a randomization process.

Let $X_i$ be the student's assessment test score. The cutoff point $c$ is set by the community college. Then the treatment, or the assignment to the prerequisite course, $T_i$ is a deterministic function of the student's test score $X_i$ in the following way¹⁵: $T_i = 1(X_i < c)$. The assignment, however, is not random, as the test score may be correlated with the educational outcome.
Since students who must take a prerequisite may differ from those who are directly assigned to the main math course, the comparison of achievements in the main math course between the two groups yields a biased estimator of the effect of the prerequisite on main course achievement. Yet it is reasonable to consider that students whose test scores are close to the cutoff score are similar. The idea that two groups whose scores are close to the cutoff score are similar is equivalent to the idea that they are similar to each other in terms of potential outcomes. It implies that the outcomes would be the same among the students who score close to the cutoff point on the assessment test, were it not for the assignment to the prerequisite course. It can be rephrased in the following assumption.

¹⁵ It is implicitly assumed that all students always follow the placement result. A student who is assigned to the prerequisite course must take the prerequisite and always takes it. This case is called a sharp regression discontinuity.

Assumption 2.1. i) $E(Y_{i,1} \mid X_i = x)$ is continuous in $x$ at $c$, and ii) $E(Y_{i,0} \mid X_i = x)$ is continuous in $x$ at $c$.

If this is true, the two groups whose test scores are close to the cutoff score can be thought of as randomly assigned. Then the effect of the prerequisite can be identified by a comparison of outcomes between the two groups whose test scores are close to the cutoff. This is the main idea of regression discontinuity design. Under Assumption 2.1, the effect of the prerequisite is identified by the difference in achievement in the main course between the students who score just below the cutoff and the students who score just above the cutoff.

$$E(\tau_i \mid X_i = c) = \lim_{x \uparrow c} E[Y_i \mid X_i = x] - \lim_{x \downarrow c} E[Y_i \mid X_i = x] \tag{2.2}$$

Without a further assumption such as the common effect assumption, $\tau_i = \tau$ for all $i$, treatment effects can be identified only at the cutoff score $x = c$.
Compared to a randomized experiment, the disadvantage of a regression discontinuity design is that only treatment effects near the cutoff score $c$ can be known.

Local Linear Regression Estimation

The estimation of equation (2.2) may be accomplished in various ways. The most often used estimators are global polynomial regressions (Black, Galdo, and Smith, 2007; Lee and Card, 2008; Lee and Lemieux, 2010)¹⁶ and local linear regression (Hahn, Todd, and van der Klaauw, 2001; Porter, 2003; Imbens and Lemieux, 2008). These two estimation approaches are generally competitive, with differing strengths and weaknesses. Since the first approach is more sensitive to outcomes far from the cutoff than the second one, I use the second procedure to estimate the effect of the prerequisite on achievement in the main math course.

¹⁶ Global polynomial regression estimations are also thought to be nonparametric as they are variants of series estimations (Lee and Lemieux, 2010).

Local linear regressions provide a nonparametric way of consistently estimating the treatment effect in (2.2). Following Imbens and Lemieux (2008), who derive a special case of Hahn, Todd, and van der Klaauw (2001), a simple version of the local linear regression estimator can be presented as follows. Define the conditional mean on the left-hand side of $x_0$ as in equation (2.3) and the conditional mean on the right-hand side of $x_0$ as in equation (2.4).

$$\mu_l(x_0) = \lim_{x \uparrow x_0} E(Y_i \mid X_i = x) \tag{2.3}$$

$$\mu_r(x_0) = \lim_{x \downarrow x_0} E(Y_i \mid X_i = x) \tag{2.4}$$

Then the estimand of interest is $E(\tau_i \mid X_i = c) = \mu_l(c) - \mu_r(c)$, denoted by $\tau$. I can fit linear regression functions to the observations within a given bandwidth $h$ on either side of the discontinuity point $x = c$, applying a rectangular (uniform) kernel to the estimator of Hahn, Todd, and van der Klaauw (2001).

$$\min_{\alpha_l, \beta_l} \sum_{i:\, c - h < X_i < c} \left( Y_i - \alpha_l - \beta_l (X_i - c) \right)^2 \tag{2.5}$$

$$\min_{\alpha_r, \beta_r} \sum_{i:\, c \le X_i < c + h} \left( Y_i - \alpha_r - \beta_r (X_i - c) \right)^2 \tag{2.6}$$
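The two least-squares problems in (2.5) and (2.6) can be sketched in a few lines of code. The following is only an illustrative sketch, not the estimation code used in this chapter: the function name `local_linear_rd`, the simulated scores and grades, the cutoff of 40, and the bandwidth of 5 are all assumptions made for the example. It fits a separate line on each side of the cutoff with a uniform kernel and returns the left intercept minus the right intercept, since treatment is assigned below the cutoff.

```python
import numpy as np


def local_linear_rd(x, y, c, h):
    """Sharp RD estimate at cutoff c with a uniform kernel and bandwidth h.

    Treatment is assigned when x < c, so the estimate is the fitted
    intercept on the left of c minus the fitted intercept on the right.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    def intercept(mask):
        # OLS of y on a constant and (x - c), using only the masked window.
        design = np.column_stack([np.ones(mask.sum()), x[mask] - c])
        coef, *_ = np.linalg.lstsq(design, y[mask], rcond=None)
        return coef[0]

    left = (x > c - h) & (x < c)      # c - h < X_i < c: assigned to the prerequisite
    right = (x >= c) & (x < c + h)    # c <= X_i < c + h: assigned to the main course
    return intercept(left) - intercept(right)


if __name__ == "__main__":
    # Hypothetical data: test scores on a grid and a jump in grades below the cutoff.
    scores = np.linspace(20.0, 60.0, 401)
    grades = 2.0 + 0.02 * (scores - 40.0) + 0.3 * (scores < 40.0)
    print(local_linear_rd(scores, grades, c=40.0, h=5.0))
```

Centering the regressor at the cutoff is what makes each fitted intercept equal to the fitted value at $x = c$, so no extrapolation step is needed after the fit.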
The estimate of $\mu_l(c)$ is $\hat{\mu}_l(c) = \hat{\alpha}_l + \hat{\beta}_l (c - c) = \hat{\alpha}_l$, and the estimate of $\mu_r(c)$ is $\hat{\mu}_r(c) = \hat{\alpha}_r + \hat{\beta}_r (c - c) = \hat{\alpha}_r$. Then the estimated treatment effect is

$$\hat{\tau} = \hat{\alpha}_l - \hat{\alpha}_r \tag{2.7}$$

With the additional assumption of undersmoothing the bandwidth, $h \propto N^{-\delta}$ for $1/5 < \delta < 2/5$,

$$\sqrt{Nh}\,(\hat{\tau} - \tau) \xrightarrow{d} N\!\left(0, \frac{4(\sigma_l^2 + \sigma_r^2)}{f_X(c)}\right) \tag{2.8}$$

where $\sigma_l^2 = \lim_{x \uparrow c} \mathrm{Var}(Y_i \mid X_i = x)$, $\sigma_r^2 = \lim_{x \downarrow c} \mathrm{Var}(Y_i \mid X_i = x)$, and $f_X$ is the density function of $X_i$.

2.3.2 Missing Outcome Problem

The critical problem, the missing outcome problem, arises because a student's achievement in the main course can be observed only if that student completes the course. One of the reasons a student may not complete the course is withdrawal. The frequencies of withdrawal from the main course might not differ much between the treatment group and the control group if the sample is restricted to the students who enroll in the main course. Another reason for missing outcomes is that many students do not enroll in the main course, so that their achievements $Y_i$ in the main course cannot be observed. There could be a large difference in the proportion of enrollment in the main course between the two groups, compared to small differences in the proportion of withdrawal. The students who are assigned to the prerequisite course are less likely to enroll in the main course even though most have successfully completed the prerequisite. The propensity to enroll in the main course can be said to differ systematically between the two groups. Following the general sample selection model of Lee (2009), outcome observability can be modeled in Rubin's potential outcome setting, which allows the treatment to cause a difference in the observability of the outcome between the treatment group and the control group. $S_{i,1}$ and $S_{i,0}$ are potential observability indicators for the treatment and control states, respectively. Denote the indicator of observability by $S_i$. Then the model in (2.1) can be presented in the following way.
$$Y_i = T_i Y_{i,1} + (1 - T_i) Y_{i,0} \tag{2.9}$$

$$S_i = T_i S_{i,1} + (1 - T_i) S_{i,0} \tag{2.10}$$

$Y_i$ is observed if $S_i = 1$, and $Y_i$ is missing if $S_i = 0$. In addition to Assumption 2.1, a continuity assumption for observability is necessary for identification in regression discontinuity design. It is given in the following.

Assumption 2.2. i) $E(S_{i,1} \mid X_i = x)$ is continuous in $x$ at $c$, and ii) $E(S_{i,0} \mid X_i = x)$ is continuous in $x$ at $c$.

This implies that the observability of the outcomes would not differ between the treatment group and the control group when the students in both groups score close to the cutoff point on the assessment test, were it not for the assignment to the prerequisite course.

The first estimand of interest is $E(\tau_i \mid X_i = c) = E(Y_{i,1} - Y_{i,0} \mid X_i = c)$, but it is impossible to identify it by way of (2.2) when there is a structural difference in the observability of the outcomes between the two groups. Only when observability does not depend on the assignment, i.e., $S_i = S_{i,1} = S_{i,0}$, can the estimand be bounded via the method of Horowitz and Manski (2000), additionally assuming that the outcome $Y$ is bounded.

$$\tau^{L} \le E(\tau_i \mid X_i = c) \le \tau^{U} \tag{2.11}$$

where

$$\begin{aligned} \tau^{U} ={}& \lim_{x \uparrow c} \left[ E(Y_i \mid X_i = x, S_i = 1) P(S_i = 1 \mid X_i = x) + Y^{\max} P(S_i = 0 \mid X_i = x) \right] \\ &- \lim_{x \downarrow c} \left[ E(Y_i \mid X_i = x, S_i = 1) P(S_i = 1 \mid X_i = x) + Y^{\min} P(S_i = 0 \mid X_i = x) \right] \end{aligned} \tag{2.12}$$

and

$$\begin{aligned} \tau^{L} ={}& \lim_{x \uparrow c} \left[ E(Y_i \mid X_i = x, S_i = 1) P(S_i = 1 \mid X_i = x) + Y^{\min} P(S_i = 0 \mid X_i = x) \right] \\ &- \lim_{x \downarrow c} \left[ E(Y_i \mid X_i = x, S_i = 1) P(S_i = 1 \mid X_i = x) + Y^{\max} P(S_i = 0 \mid X_i = x) \right] \end{aligned} \tag{2.13}$$

For the same reason, the previous studies (McCrary and Royer, 2011; Martorell and McFarlin, 2011) that include an additively separable control function to handle the missing outcome problem in regression discontinuity design fail to identify the treatment effect when there is a structural difference in the observability of the outcomes.
Instead of using the potential outcome model (2.9) with the observability equation (2.10), those studies model the selection process as follows:

$$Y_i = \beta T_i + m(X_i) + U_i \tag{2.14}$$

$$S_i = 1\left( \gamma T_i + n(X_i) + V_i \ge 0 \right) \tag{2.15}$$

$Y_i$ is observed if $S_i = 1$, and $Y_i$ is missing if $S_i = 0$. Along the lines suggested by Heckman (1976, 1979), they assume bivariate normality of $(U_i, V_i)$, or rely on an exclusion restriction for the sample selection (observability) equation, generate the control function known as the inverse Mills ratio, and include it in the main model (2.14) to estimate the treatment effect. The control function is, however, discontinuous at the cutoff point when the observability of the outcomes varies structurally between the treatment group and the control group. Thus, their strategy using the exclusion restriction or the bivariate normality of $(U_i, V_i)$ cannot identify the treatment effect if there is a systematic difference in the observability of the outcomes.

2.3.3 Bounding the Causal Effects

It is necessary to invoke an additional assumption to address the structural difference in the observability of outcomes. The students who are assigned to the prerequisite course are less likely to enroll in the main course than the students who are allowed to take the main course without the prerequisite, because most students try to avoid staying longer in school. As a result, assignment to the prerequisite course can only reduce the observability of the outcome. This is summarized in the following assumption:

Assumption 2.3 (Monotonicity). $S_{i,1} \le S_{i,0}$ with probability 1.

It implies that treatment assignment can affect observability in only one direction. A student who is assigned to the prerequisite and completes the main course would also enroll in the main course and complete it if he or she were not required to take the prerequisite.
Conversely, a student who is allowed to take the main course directly and completes it might not enroll in the course, and thus fail to complete it, if he or she had to take the prerequisite.

Invoking the monotonicity assumption (Assumption 2.3), the limits at $c$, from below and from above, of the conditional expectation of the non-missing outcomes $Y$ are given by the following equations.

$$\lim_{x \uparrow c} E(Y \mid X = x, S = 1) = E(Y_1 \mid X = c, S_1 = 1) = E(Y_1 \mid X = c, S_1 = 1, S_0 = 1) \tag{2.16}$$

$$\lim_{x \downarrow c} E(Y \mid X = x, S = 1) = E(Y_0 \mid X = c, S_0 = 1) = P(S_1 = 1 \mid X = c, S_0 = 1) E(Y_0 \mid X = c, S_1 = 1, S_0 = 1) + P(S_1 = 0 \mid X = c, S_0 = 1) E(Y_0 \mid X = c, S_1 = 0, S_0 = 1) \tag{2.17}$$

The limit from below in (2.16) exactly identifies the conditional mean of $Y_1$ for the single group $\{i : S_1 = 1, S_0 = 1\}$. In contrast, the limit from above in (2.17) does not identify the outcomes of a single group. It is a mixture of the means of two groups, $E(Y_0 \mid X = c, S_1 = 1, S_0 = 1)$ and $E(Y_0 \mid X = c, S_1 = 0, S_0 = 1)$. The difference between the two limits identifies the treatment effect for the subgroup $\{i : S_1 = 1, S_0 = 1\}$ if $P(S_1 = 0 \mid X = c, S_0 = 1) = P(S_1 = 0, S_0 = 1 \mid X = c, S_0 = 1) = 0$. This would be the case if the propensity to finish or complete the main course, so that the outcome is observable, were the same irrespective of the assignment to the prerequisite course, that is, $S_0 = S_1$ with probability 1. However, the probability of $\{i : S_1 = 0, S_0 = 1\}$ would be positive for those who score barely higher than the cutoff of the prerequisite. Among those students, some would not enroll in the main course if they were forced to take the prerequisite course first, whereas they would take the main course if they were allowed to take it without the prerequisite.
If it were possible to identify and discard the subgroup $\lim_{x \downarrow c} \{i : S_1 = 0, S_0 = 1, X = x\}$ from the control group $\lim_{x \downarrow c} \{i : S_0 = 1, X = x\}$, then the remainder would be $\lim_{x \downarrow c} \{i : S_1 = 1, S_0 = 1, X = x\}$, which would be comparable to the treatment group, $\lim_{x \uparrow c} \{i : S_1 = 1, X = x\} = \lim_{x \uparrow c} \{i : S_1 = 1, S_0 = 1, X = x\}$, at the cutoff point. However, it is impossible to identify and disentangle only the subgroup $\lim_{x \downarrow c} \{i : S_1 = 1, S_0 = 1, X = x\}$ from the control group. Moreover, it is important to note that only $E(\tau_i \mid S_1 = 1, S_0 = 1, X = c)$ can be identified at best, since the monotonicity assumption identifies only $\lim_{x \uparrow c} \{i : S_1 = 1, S_0 = 1, X = x\}$ from the treatment group.

For convenience, denote the probability $P(S_1 = 0 \mid X = c, S_0 = 1)$ by $\pi$, so that $P(S_1 = 1 \mid X = c, S_0 = 1) = 1 - \pi$. Note that, under the monotonicity assumption, $\pi$ can be identified from the data by $\lim_{x \downarrow c} E(S \mid X = x)$ and $\lim_{x \uparrow c} E(S \mid X = x)$:

$$\pi = P(S_1 = 0 \mid X = c, S_0 = 1) = \frac{P(S_0 = 1, S_1 = 0 \mid X = c)}{P(S_0 = 1 \mid X = c)} = \frac{P(S_0 = 1 \mid X = c) - P(S_0 = 1, S_1 = 1 \mid X = c)}{P(S_0 = 1 \mid X = c)} = \frac{P(S_0 = 1 \mid X = c) - P(S_1 = 1 \mid X = c)}{P(S_0 = 1 \mid X = c)} = \frac{\lim_{x \downarrow c} E(S \mid X = x) - \lim_{x \uparrow c} E(S \mid X = x)}{\lim_{x \downarrow c} E(S \mid X = x)}$$

$\pi$ is the proportion of the students whose outcomes in the main course are observable because they are assigned to the main course directly, but whose outcomes would not be observable if they were made to take the prerequisite course before the main course. In terms of Imbens and Angrist (1994), $\pi$ can be interpreted as the proportion of the marginal students who are induced to enroll in the main course and finish it, so that their outcomes are finally observed. The identification result and its form are also similar to those of their LATE. Using this notation, the limit from above in (2.17) is expressed as:

$$E(Y_0 \mid X = c, S_0 = 1) = (1 - \pi) E(Y_0 \mid X = c, S_1 = 1, S_0 = 1) + \pi E(Y_0 \mid X = c, S_1 = 0, S_0 = 1) \tag{2.18}$$

Recall that it is impossible to distinguish the two subgroups $\lim_{x \downarrow c} \{i : S_1 = 1, S_0 = 1, X = x\}$ and $\lim_{x \downarrow c} \{i : S_1 = 0, S_0 = 1, X = x\}$ within the control group without additional assumptions.
Instead of invoking additional assumptions to separate those two subgroups within the control group, an extreme situation can be imagined. Consider the potential achievements $Y_0$ in the main course when the prerequisite course is not taken. Without the help of the prerequisite course ($T = 0$), the potential achievements $Y_0$ of those who would always take the main course even when required to take the prerequisite course ($\lim_{x \downarrow c} \{i : S_0 = 1, S_1 = 1, X = x\}$) are always higher (or lower) than the maximum (or minimum) achievement in the main course of those who would not proceed to the main course if they scored barely below the cutoff point and were assigned to the prerequisite ($\lim_{x \downarrow c} \{i : S_1 = 0, S_0 = 1, X = x\}$):

$$\inf\{Y_0 \mid X = c, S_1 = 1, S_0 = 1\} \ge \sup\{Y_0 \mid X = c, S_1 = 0, S_0 = 1\} \text{ with probability 1}$$

or

$$\sup\{Y_0 \mid X = c, S_1 = 1, S_0 = 1\} \le \inf\{Y_0 \mid X = c, S_1 = 0, S_0 = 1\} \text{ with probability 1}$$

Since the proportion $\pi$ of $\{i : X = c, S_1 = 0, S_0 = 1\}$ among the control group $\{i : X = c, S_0 = 1\}$ can be identified from the data, an upper bound for $E(Y_0 \mid X = c, S_1 = 1, S_0 = 1)$ can be obtained by trimming the lower tail of the $Y_0$ distribution by the proportion $\pi$. Similarly, a lower bound for $E(Y_0 \mid X = c, S_1 = 1, S_0 = 1)$ can be obtained by trimming the upper tail of the $Y_0$ distribution by the proportion $\pi$.

It is necessary to look at the distribution of the observed outcome $Y$ of those students who score just above the cutoff $c$ and are assigned to the main math course, and to find the $q$th quantile $y_q$; for a given $q$, $y_q = H^{-1}(q)$ with $H(y) = P(Y_0 \le y \mid X = c, S_0 = 1)$ [17]. Using the notation of the $q$th quantile, the upper bound and the lower bound for $E(Y_0 \mid X = c, S_1 = 1, S_0 = 1)$ can be obtained, and they are proven to be sharp [18]. They are expressed in the following equations.
$$E(Y_0 \mid X = c, S_0 = 1, S_1 = 1) \le E(Y_0 \mid X = c, S_0 = 1, Y_0 > y_{\pi})$$

$$E(Y_0 \mid X = c, S_0 = 1, S_1 = 1) \ge E(Y_0 \mid X = c, S_0 = 1, Y_0 \le y_{1-\pi})$$

Consequently, both the lower bound $\tau_L$ and the upper bound $\tau_U$ for $E(\tau_i \mid X = c, S_1 = 1, S_0 = 1)$ can be obtained, and both are sharp.

$$\tau_L = E(Y_1 \mid X = c, S_1 = 1, S_0 = 1) - E(Y_0 \mid X = c, S_0 = 1, Y_0 > y_{\pi}) = \lim_{x \uparrow c} E(Y \mid X = x, S = 1) - \lim_{x \downarrow c} E(Y \mid X = x, S = 1, Y > y_{\pi})$$

$$\tau_U = E(Y_1 \mid X = c, S_1 = 1, S_0 = 1) - E(Y_0 \mid X = c, S_0 = 1, Y_0 \le y_{1-\pi}) = \lim_{x \uparrow c} E(Y \mid X = x, S = 1) - \lim_{x \downarrow c} E(Y \mid X = x, S = 1, Y \le y_{1-\pi})$$

Note that only $E(\tau_i \mid S_1 = 1, S_0 = 1, X = c) = E(Y_1 - Y_0 \mid S_1 = 1, S_0 = 1, X = c)$ can be partially identified at best. Other parameters, such as $E(\tau_i \mid X = c)$ and $E(\tau_i \mid S = 1, X = c)$, cannot even be partially identified under Assumptions 2.1 through 2.3.

[17] This distribution is identified by $\lim_{x \downarrow c} P(Y \le y \mid X = x, S = 1)$.
[18] Horowitz and Manski (1995) formally prove that the expectation of the outcome after truncating the tails is the sharp upper or lower bound (Horowitz and Manski, 1995, Corollary 4.1), and Lee (2009) applies it to the context of missing outcome problems in the treatment effects.

2.3.4 Computation of Bounds by Local Linear Regression

Since the usual nonparametric kernel estimator suffers from a boundary problem when applied to the regression discontinuity design, both the lower and the upper bounds are estimated by local linear regression [19]. The estimation strategy is as follows.

First, $\mu_{s,r}(c) = \lim_{x \downarrow c} E(S \mid X = x)$ and $\mu_{s,l}(c) = \lim_{x \uparrow c} E(S \mid X = x)$ are estimated by local linear regression.

$$(\hat{\alpha}_{s,r}, \hat{\beta}_{s,r}) = \arg\min_{\alpha_{s,r}, \beta_{s,r}} \sum_{i : c \le X_i < c + h} (S_i - \alpha_{s,r} - \beta_{s,r}(X_i - c))^2$$

$$(\hat{\alpha}_{s,l}, \hat{\beta}_{s,l}) = \arg\min_{\alpha_{s,l}, \beta_{s,l}} \sum_{i : c - h \le X_i < c} (S_i - \alpha_{s,l} - \beta_{s,l}(X_i - c))^2$$

The estimate of $\mu_{s,r}(c)$ is $\hat{\mu}_{s,r}(c) = \hat{\alpha}_{s,r}$, and the estimate of $\mu_{s,l}(c)$ is $\hat{\mu}_{s,l}(c) = \hat{\alpha}_{s,l}$. Then the estimator of $\pi$ can be obtained in the following way.
$$\hat{\pi} = \frac{\hat{\mu}_{s,r}(c) - \hat{\mu}_{s,l}(c)}{\hat{\mu}_{s,r}(c)} \tag{2.19}$$

Second, the $\hat{\pi}$th quantile and the $(1 - \hat{\pi})$th quantile of $Y$ are estimated conditional on $S_0 = 1$ and around the cutoff $X = c$, i.e., restricting the sample to $\{i : c \le X_i < c + h\}$, given the bandwidth $h$ used in the estimation of $\pi$.

$$\hat{y}_{\hat{\pi},h} = \inf\{y : \hat{\pi} \le \hat{H}_h(y)\} \quad \text{with} \quad \hat{H}_h(y) = \frac{\sum_i 1(Y_i \le y, \, c \le X_i < c + h, \, S_i = 1)}{\sum_i 1(c \le X_i < c + h, \, S_i = 1)} \tag{2.20}$$

[19] Lee (2009) shows the consistency and the asymptotic normality of the kernel estimators by the generalized method of moments.

Third, the estimands of the upper bound and the lower bound for $E(Y_0 \mid X = c, S_1 = 1, S_0 = 1)$ are $\mu_{r,U}(c) = \lim_{x \downarrow c} E(Y \mid X = x, S = 1, Y > \hat{y}_{\hat{\pi},h})$ and $\mu_{r,L}(c) = \lim_{x \downarrow c} E(Y \mid X = x, S = 1, Y \le \hat{y}_{1-\hat{\pi},h})$, respectively. They are also estimated by local linear regression, using the same bandwidth $h$ as in the estimation of $\pi$.

$$\min_{\alpha_{r,U}, \beta_{r,U}} \sum_{i : c \le X_i < c + h, \; Y_i > \hat{y}_{\hat{\pi},h}} (Y_i - \alpha_{r,U} - \beta_{r,U}(X_i - c))^2$$

$$\min_{\alpha_{r,L}, \beta_{r,L}} \sum_{i : c \le X_i < c + h, \; Y_i \le \hat{y}_{1-\hat{\pi},h}} (Y_i - \alpha_{r,L} - \beta_{r,L}(X_i - c))^2$$

The estimate of $\mu_{r,U}(c)$ is $\hat{\mu}_{r,U}(c) = \hat{\alpha}_{r,U}$, and the estimate of $\mu_{r,L}(c)$ is $\hat{\mu}_{r,L}(c) = \hat{\alpha}_{r,L}$.

Finally, the upper bound and the lower bound for the treatment effect are computed in the following way.

$$\hat{\tau}_U = \hat{\mu}_l - \hat{\mu}_{r,L} \tag{2.21}$$

$$\hat{\tau}_L = \hat{\mu}_l - \hat{\mu}_{r,U} \tag{2.22}$$

2.4 Data Description

2.4.1 One Community College (OCCSC)

Unlike Florida and Texas (Calcagno and Long, 2008; Martorell and McFarlin, 2011, respectively), California has not maintained a single universal assignment policy across all the community colleges in the state, and hence a state-level analysis using the regression discontinuity design is impossible. Since each college in California has its own assignment policy, it is sensible to choose one community college when estimating the effect of the developmental mathematics sequence by regression discontinuity design.
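The four estimation steps of Section 2.3.4, equations (2.19) to (2.22), can be sketched in code. The following is a minimal illustration only, not the implementation used in this study: it assumes a rectangular kernel (so that local linear regression reduces to ordinary least squares inside the window), and the function names `fit_at_cutoff`, `quantile`, and `trimming_bounds` are hypothetical.

```python
def fit_at_cutoff(xs, ys, c):
    """Intercept of an OLS line of y on (x - c): a local linear fit at c
    with a rectangular kernel, given points already inside the window."""
    n = len(xs)
    d = [x - c for x in xs]
    d_bar, y_bar = sum(d) / n, sum(ys) / n
    sxx = sum((di - d_bar) ** 2 for di in d)
    if sxx == 0.0:  # all x identical: fall back to the sample mean
        return y_bar
    slope = sum((di - d_bar) * (yi - y_bar) for di, yi in zip(d, ys)) / sxx
    return y_bar - slope * d_bar


def quantile(values, q):
    """inf{y : q <= H(y)} for the empirical CDF H, as in (2.20)."""
    v = sorted(values)
    n = len(v)
    for k, y in enumerate(v, start=1):
        if k >= q * n - 1e-9:  # small tolerance for floating-point error
            return y
    return v[-1]


def trimming_bounds(X, Y, S, c, h):
    """Return (pi_hat, tau_L, tau_U). Y may be None where S == 0."""
    right = [(x, y, s) for x, y, s in zip(X, Y, S) if c <= x < c + h]
    left = [(x, y, s) for x, y, s in zip(X, Y, S) if c - h <= x < c]

    # Step 1: observability on each side of the cutoff, then (2.19).
    mu_s_r = fit_at_cutoff([x for x, _, _ in right], [s for _, _, s in right], c)
    mu_s_l = fit_at_cutoff([x for x, _, _ in left], [s for _, _, s in left], c)
    pi_hat = (mu_s_r - mu_s_l) / mu_s_r

    # Step 2: quantiles of the observed control-group outcomes, (2.20).
    obs_r = [(x, y) for x, y, s in right if s == 1]
    obs_l = [(x, y) for x, y, s in left if s == 1]
    ys_r = [y for _, y in obs_r]
    y_lo = quantile(ys_r, pi_hat)
    y_hi = quantile(ys_r, 1.0 - pi_hat)

    # Step 3: local linear fits on the trimmed control samples.
    upper = [(x, y) for x, y in obs_r if y > y_lo] if pi_hat > 0 else obs_r
    lower = [(x, y) for x, y in obs_r if y <= y_hi]
    mu_r_U = fit_at_cutoff([x for x, _ in upper], [y for _, y in upper], c)
    mu_r_L = fit_at_cutoff([x for x, _ in lower], [y for _, y in lower], c)
    mu_l = fit_at_cutoff([x for x, _ in obs_l], [y for _, y in obs_l], c)

    # Step 4: bounds on the treatment effect, (2.21) and (2.22).
    return pi_hat, mu_l - mu_r_U, mu_l - mu_r_L
```

The sketch does no input checking; if a trimmed sample is empty, the fit fails, which would signal that the bandwidth is too narrow for the data at hand.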
The chosen college is located in an urban area of southern California; it is a large state institution with an annual freshman enrollment of around 3,000 students and an annual total enrollment of around 20,000 students. It is referred to here as one of the community colleges in southern California (henceforth OCCSC).

All students entering OCCSC are required to take the assessment test so that the administration can determine their level of mathematics skill. The level of mathematics course a student must take is determined by his or her score on the assessment test and by the cutoff points set by OCCSC. The assessment test used in OCCSC is the ACCUPLACER test developed by the College Board. The ACCUPLACER mathematics test is not a single-subject test. It consists of three subtests: 1) an arithmetic test (ACCUPLACER AR), 2) an elementary algebra test (ACCUPLACER EA), and 3) a college-level mathematics test (ACCUPLACER CLM).

Based on the background questionnaire completed by an individual student, the computerized administration system chooses the beginning subject test for that student. Every student begins the ACCUPLACER test in one specific subject. Students might finish the ACCUPLACER mathematics test in the same subject area in which they began, and be placed into some mathematics course. However, students sometimes proceed to another subject if their scores on the first subject test are too low or too high. As a result, they could take more than one subject test and finish the ACCUPLACER mathematics test in a subject area different from the beginning subject test. The assignment result depends on the score that a student receives in the last stage of the assessment test.
Figure 2.1 shows the detailed cutoff policy that was used between academic years 2005/6 and 2007/8 in OCCSC. From the fall semester of 2005 to spring 2008, the cutoff scores for the assignments did not change.

Figure 2.1: Cutoff Policy of OCCSC between 2005/6 and 2007/8

(a) Placement by ACCUPLACER AR Test
AR < 35: Arithmetic
35 ≤ AR < 65: Pre-algebra
AR ≥ 65: ACCUPLACER EA Ref

(b) Placement by ACCUPLACER EA Test
EA < 28: ACCUPLACER AR Ref
28 ≤ EA < 50: Pre-algebra
50 ≤ EA < 76: Elementary Algebra
76 ≤ EA < 109: Intermediate Algebra
EA ≥ 109: ACCUPLACER CLM Ref

(c) Placement by ACCUPLACER CLM Test
CLM < 43: ACCUPLACER EA Ref
43 ≤ CLM < 63: Intermediate Algebra
CLM ≥ 63: College Level Math

Note: AR means the score on ACCUPLACER AR, EA means the score on ACCUPLACER EA, and CLM means the score on ACCUPLACER CLM. ACCUPLACER AR Ref, ACCUPLACER EA Ref, and ACCUPLACER CLM Ref mean that a student is referred to the ACCUPLACER AR, EA, and CLM test, respectively.

Note that multiple measure points, which are calculated from the students' background questionnaire, are automatically added to all test scores in order to protect minorities. Multiple measure points are calculated based on the quantity and quality of the high-school math courses that students previously took. Multiple measure points range from 0 to 4.

In OCCSC, the sequence of developmental math courses consists of four levels of mathematics, as shown in Figure 2.2: 1) arithmetic, 2) pre-algebra, 3) elementary algebra, and 4) intermediate algebra. Each course in the sequence was described in Section 2.2.2, and the aim of the developmental sequence is to prepare students for college-level mathematics through instruction in high-school-level courses.
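The cutoff policy in Figure 2.1 amounts to a simple lookup on the last subtest taken. The sketch below transcribes the cutoffs from the figure; the function name `place`, the returned labels, and the `multiple_measure_points` argument are illustrative, and it is an assumption here that the multiple measure points are added to the raw score before it is compared with the cutoffs.

```python
def place(subject, score, multiple_measure_points=0):
    """Placement or referral implied by Figure 2.1 for one subtest.

    subject: "AR", "EA", or "CLM" (the ACCUPLACER subtest taken).
    score: raw score on that subtest.
    multiple_measure_points: 0 to 4, assumed to be added to the raw
    score before applying the cutoffs.
    """
    s = score + multiple_measure_points
    if subject == "AR":
        if s < 35:
            return "Arithmetic"
        if s < 65:
            return "Pre-algebra"
        return "Refer to ACCUPLACER EA"
    if subject == "EA":
        if s < 28:
            return "Refer to ACCUPLACER AR"
        if s < 50:
            return "Pre-algebra"
        if s < 76:
            return "Elementary Algebra"
        if s < 109:
            return "Intermediate Algebra"
        return "Refer to ACCUPLACER CLM"
    if subject == "CLM":
        if s < 43:
            return "Refer to ACCUPLACER EA"
        if s < 63:
            return "Intermediate Algebra"
        return "College Level Math"
    raise ValueError("subject must be 'AR', 'EA', or 'CLM'")
```

For example, `place("EA", 49)` gives pre-algebra while `place("EA", 50)` gives elementary algebra, which is the discontinuity at the EA cutoff of 50 between the pre-algebra and elementary algebra assignments.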
Not until a student completes the required intermediate algebra courses can he or she take any college-level math course, unless he or she is assigned to college-level math. The rule of enrollment is that, to take a specific course, a student must either complete its prerequisite course or be placed into that course.

Figure 2.2: A Sequence of Developmental Mathematics Courses in OCCSC. [Flowchart with the following nodes for each course level: Assigned to AR, PA, EA, IA; Enrolled in / Not Enrolled in PA, EA, IA; Complete / Not Complete AR, PA, EA, IA.] Note: AR means arithmetic course, PA means pre-algebra course, EA means elementary algebra course, and IA means intermediate algebra course.

2.4.2 Sample Criteria

I examine the students who took the assessment test between academic years 2005/6 and 2007/8 in OCCSC. In this period, the assessment policy was stable, and 10,874 students were assessed in the area of mathematics. 19% of students were assigned to arithmetic, 39% to pre-algebra, 20% to elementary algebra, 18% to intermediate algebra, and only 4% to college-level math. Since the study's main interest is in algebra courses, the sample was restricted to the students whose last subject during the math assessment test was ACCUPLACER EA; 7,419 students were selected. If a student's last subject during the assessment is ACCUPLACER EA, then his or her score on ACCUPLACER EA assigns the student to one of the algebra courses. When the outcomes of interest are those of elementary algebra (intermediate algebra), the corresponding treatment is assignment to pre-algebra (elementary algebra). The control group then consists of the students who are assigned to elementary algebra (intermediate algebra), while the treatment group consists of the students who are assigned to pre-algebra (elementary algebra).
The first motive of this study is to examine whether a developmental math sequence in community colleges can make up for a lack of the mathematical skill that should have been imparted in domestic high schools. The students in the sample should therefore have completed the high-school mathematics sequence in the U.S. rather than in foreign countries. Moreover, the students in the sample are restricted to those of about the same age as the average college student. Their placement results can show the effectiveness of the high-school sequence offered recently, without depreciation in math knowledge. These restrictions give meaning to the question of whether developmental mathematics in community colleges can help students catch up with their peers in four-year colleges. Students were excluded from the sample if 1) they were concurrently enrolled in high school, 2) they graduated from foreign high schools, or 3) they were older than 22 years at the assessment.

Two additional but important criteria generate the final sample. The first criterion is to choose students who took the assessment test and enrolled in any math course. A student is said to participate in the developmental mathematics sequence if he or she enrolls in any math course. The students who did not enroll in any math course cannot have any meaningful outcome except the decision not to enroll in math. Although the assessment test was compulsory, registration in OCCSC and its math program was up to the individual student. There were many students who did not enroll in any math course after the assessment test. Those who did not enroll in a math course, however, are not of interest, because the aim of the developmental course is not to induce them into the developmental sequence but to develop the mathematics skills of those who participate in the program.

The second criterion is to choose the students who did not retest.
The primary reason is that, for those who retested, it is difficult to construct the outcomes regarding the developmental sequence, because they might stop taking the assigned course and then retest to be placed at a level higher than their first assignment. The rule on retesting is rigid. Without a strong excuse, a student cannot retake the assessment test within three years of the first assessment. Nonetheless, a few students retested despite the rule. If those students were more motivated than the others, excluding them would create some bias in the estimation. The assignment results from the retests, however, were not different from the first results, and the individual characteristics of those who retested were not different from those of the students who did not retest. Thus, excluding those who retested does not seem to create a bias.

2.4.3 Measures of Academic Achievement

Before presenting the descriptive statistics, it is necessary to define the appropriate outcomes in the study. The outcome of interest is achievement in a given mathematics course. Regarding the measurement of a student's achievement, however, one important problem arises. There are no standardized end-of-course tests, and hence no standardized measure of achievement. Instead, the grade point average (GPA) on a course is used as a measure of academic achievement in that course. The first reason to use the average as the measure of achievement is that a course might consist of two semesters; e.g., elementary algebra consists of Math 113 and 114 in OCCSC. The other reason is that in many cases a student repeats a course. The average adjusts for the time wasted in repeating the course. The calculation of the GPA on a course includes the letter grade of failure as well as the letter grade of withdrawal.

Another measure is the time (or semesters) to complete the main course.
Completing the main course means that a student earns at least a D in all the courses that make up the main course. If a course consists of two semester courses, the completion of the course means the completion of both semester courses. The time to complete the main course can measure the efficiency of producing meaningful achievement from the main course.

Note that the GPA on the main course cannot be observed if a student withdraws from it or does not enroll. It is observable only if he or she finishes at least one course and receives a letter grade, including F. Similarly, the time to complete the main course is observable only if a student completes it, that is, obtains at least a D in all the courses of the main course. Enrollment in the main course is related to the observability of its outcomes. Unless a student enrolls in the course, he or she can neither finish nor complete it.

2.4.4 Descriptive Statistics

Table 2.1 reports descriptive statistics of the selected sample. After selecting the sample by the criteria summarized above, the sample size is reduced to 2,483. The most important reason for the decrease in the sample size is that many students did not enroll in any math course. The first column of the table reports all the students whose last subject during the assessment test was ACCUPLACER EA. These students were assigned to one of three algebra courses: 1) pre-algebra, 2) elementary algebra, and 3) intermediate algebra. The second column corresponds to the students who were placed in pre-algebra and the third column corresponds to those placed in elementary algebra. The final column describes those who were assigned to intermediate algebra.
Table 2.1: Descriptive Statistics of the Entering Students who were Assessed between 2005/6 and 2007/2008 in OCCSC

| | All | Assigned to PA | Assigned to EA | Assigned to IA |
|---|---|---|---|---|
| Age at the Assessment | 19.0 (1.2) | 19.1 (1.2) | 18.9 (1.1) | 19.0 (1.2) |
| Female | 0.55 | 0.58 | 0.55 | 0.49 |
| Black/Hispanic | 0.71 | 0.79 | 0.71 | 0.52 |
| Non U.S. Citizen | 0.28 | 0.24 | 0.30 | 0.33 |
| English is NOT Primary Language | 0.42 | 0.43 | 0.41 | 0.44 |
| Test Score | 56.6 (20.9) | 38.5 (5.9) | 62.5 (7.2) | 90.3 (10.7) |
| Multiple Measure Points | 2.28 (0.86) | 2.11 (0.82) | 2.36 (0.85) | 2.57 (0.87) |
| Assigned to PA | 0.47 | | | |
| Assigned to EA | 0.34 | | | |
| Assigned to IA | 0.19 | | | |
| Enroll in the Assignment | 0.96 | 0.96 | 0.96 | 0.95 |
| Enrolled in PA | 0.46 | 0.96 | | |
| Finish PA | 0.37 | 0.76 | | |
| Mean Grade† on PA | 1.45 (1.27) | 1.43 (1.27) | | |
| Complete PA | 0.26 | 0.54 | | |
| Semesters‡ to Complete PA | 1.25 (0.52) | 1.26 (0.52) | | |
| Enrolled in EA | 0.54 | 0.42 | 0.96 | |
| Finish EA | 0.43 | 0.33 | 0.76 | |
| Mean Grade† on EA | 1.59 (1.20) | 1.38 (1.13) | 1.69 (1.22) | |
| Complete EA | 0.32 | 0.25 | 0.57 | |
| Semesters‡ to Complete EA | 1.42 (0.76) | 1.57 (0.90) | 1.34 (0.66) | |
| Enrolled in IA | 0.43 | 0.19 | 0.47 | 0.95 |
| Finish IA | 0.34 | 0.14 | 0.39 | 0.74 |
| Mean Grade† on IA | 1.63 (1.14) | 1.44 (1.05) | 1.60 (1.14) | 1.75 (1.17) |
| Complete IA | 0.27 | 0.11 | 0.32 | 0.59 |
| Semesters‡ to Complete IA | 1.36 (0.68) | 1.38 (0.71) | 1.41 (0.74) | 1.30 (0.59) |
| Number of Observations | 2483 | 1157 | 851 | 475 |

Note: The table reports means, with standard deviations in parentheses, for the entering students who were assessed between 2005/6 and 2007/2008 in OCCSC and whose last subject during the assessment test was elementary algebra. See text for details of sample selection.
†: The mean grade on the course can be obtained if a student finishes it or gets a letter grade on it.
‡: The semesters to complete the course can be obtained if a student completes it.

Three important features of the data are worth mentioning. First, the lower the levels to which students are assigned, the worse their outcomes.
Second, students assigned to the lower levels differ in their backgrounds; they are more likely to be African American or Hispanic, and they have lower multiple measure points. But it cannot be said that the assignment itself causes the results. Rather, the students with lower baseline characteristics produce worse outcomes, and assignment status is correlated with these factors. Finally, the observability of the outcomes varies among the three groups. Mean grades on the main course are observable only if students finish it and receive letter grades, while the time to complete the main course is observable only if students complete it. Thus, the indicators of finishing the course and of completing the course are the observability indicators for its mean grade and for the time to complete it, respectively. The lower the level of the assigned course, the lower the probability that the outcome is observable. Enrollment in the main course is important for the observability of the outcomes. The rate of enrollment in the main course shows the same pattern as the finishing rate and the completion rate.

Descriptive statistics show that almost every student (95%) followed the assignment result. There are no differences across assignment statuses in the likelihood of complying with the assignment results. In addition, Figure 2.3 shows the proportion of the students who were assigned to the prerequisite course and the proportion of the students who actually took that course as a prerequisite. The assignment results seem to align perfectly with the placement rules, while compliance with the assignment results does not seem to be perfect. However, very few students did not follow the course assignments, and it is reasonable to regard compliance with the assignments as almost perfect.
Contrary to the other studies of community colleges using regression discontinuity design (Calcagno and Long, 2008; Martorell and McFarlin, 2011), I do not have to use fuzzy regression discontinuity and instrumental variables defined by the assignment in order to control for the noncompliance problem. Only the results based on sharp regression discontinuity will be shown.

Figure 2.3: The Proportion of the Assignment to the Prerequisite Courses and the Enrollment in the Prerequisite Assignment. (a) Prerequisite: Pre-algebra (PA), Main: Elementary Algebra (EA). (b) Prerequisite: Elementary Algebra (EA), Main: Intermediate Algebra (IA).

2.5 Results

2.5.1 Differences in Enrollment

Enrollment in a course is an important indicator of the observability of achievement, though not all those who enrolled in a course finished or completed it. Figure 2.4 plots the likelihood of enrollment in the main course for two cases: 1) where pre-algebra is a prerequisite to elementary algebra and 2) where elementary algebra is a prerequisite to intermediate algebra. Both cases give the same result: the rate of enrollment in the main course is discontinuous at the cutoff point between the prerequisite course assignment and the main course assignment. Half of those assigned to the prerequisite course do not enroll in the main course. Because the relationship between test scores and enrollment rates looks very flat except at the cutoff point, test scores themselves do not seem to affect the likelihood of enrollment in the main course. The difference in enrollment can be due only to the difference in the course assignments.

Figure 2.4: The Proportion of Enrollment in the Main Course. (a) Prerequisite: Pre-algebra (PA), Main: Elementary Algebra (EA). (b) Prerequisite: Elementary Algebra (EA), Main: Intermediate Algebra (IA).

Table 2.2 reports the estimated prerequisite assignment effects on the enrollment in the main course.
$\hat{\mu}_{s,l}$ (or $\hat{\mu}_{s,r}$) is the estimate of the proportion of enrollment in the main course for those assigned to the prerequisite (or those directly assigned to the main course). All the estimates are obtained by local linear regression. The difference $\hat{\mu}_{s,l} - \hat{\mu}_{s,r}$ is the estimate of the causal effect of the prerequisite course assignment. Each column presents a different bandwidth used in local linear regression, and reports its corresponding result.

Table 2.2: Estimated Difference in the Enrollment in the Main Course between the Group Assigned to the Prerequisite and the Group Assigned Directly to the Main.

A. The main course is elementary algebra (EA); the prerequisite is pre-algebra (PA)

| | (1) CV | (2) ROT | (3) Medium |
|---|---|---|---|
| Bandwidth | 17.5 | 2.9 | 10 |
| $\hat{\mu}_{s,r}$ | 0.976 | 0.959 | 0.991 |
| $\hat{\mu}_{s,l}$ | 0.530 | 0.523 | 0.562 |
| $\hat{\mu}_{s,l} - \hat{\mu}_{s,r}$ | -0.446*** | -0.436*** | -0.429*** |
| Standard Error | 0.038 | 0.100 | 0.053 |

B. The main course is intermediate algebra (IA); the prerequisite is elementary algebra (EA)

| | (1) CV | (2) ROT | (3) Medium |
|---|---|---|---|
| Bandwidth | 11.4 | 2.7 | 7 |
| $\hat{\mu}_{s,r}$ | 0.951 | 0.921 | 0.935 |
| $\hat{\mu}_{s,l}$ | 0.517 | 0.522 | 0.514 |
| $\hat{\mu}_{s,l} - \hat{\mu}_{s,r}$ | -0.433*** | -0.399*** | -0.421*** |
| Standard Error | 0.064 | 0.133 | 0.084 |

* indicates significance at the 10% level, ** at the 5% level, and *** at the 1% level.
Note: $\mu_{s,r} = \lim_{x \downarrow c} E(S \mid X = x)$ is the fraction of the students who enroll in the main course among those who score barely above the cutoff so that they do not have to take the prerequisite course (control group). $\mu_{s,l} = \lim_{x \uparrow c} E(S \mid X = x)$ is the fraction of the students who enroll in the main course among those who score barely below the cutoff so that they must take the prerequisite course (treatment group). The estimators $\hat{\mu}_{s,l}$ and $\hat{\mu}_{s,r}$ are obtained by local linear regression in (2.5) and (2.6), respectively. $\hat{\mu}_{s,l} - \hat{\mu}_{s,r}$ is the difference in the fraction of the students who enroll in the main course, and it measures how many students at the margin do not enroll in the main course because of the assignment to the prerequisite course. Each column corresponds to a method of obtaining the bandwidth $h$. In column (1), $h$ is obtained by the modified cross-validation (CV) method suggested by Imbens and Lemieux (2008) and Ludwig and Miller (2005), discarding the 95% of observations in tails. In column (2), $h$ is obtained by the rule of thumb (ROT) derived by Fan and Gijbels (1996), assuming the rectangular kernel. In column (3), $h$ is arbitrarily set. Standard errors of $\hat{\mu}_{s,l} - \hat{\mu}_{s,r}$ are estimated by (2.8).

An important issue is the choice of the smoothing parameter, the bandwidth $h$. There are many automatic bandwidth selectors for nonparametric regression, but two methods are used here. The first is the modified cross-validation procedure of Ludwig and Miller (2005) and Imbens and Lemieux (2008). The modified cross-validation procedure discards observations close to both tails when calculating the cross-validation criterion. It chooses the optimal bandwidth $h_{opt}$, which minimizes the modified cross-validation criterion. I discard 95% of observations when choosing the optimal bandwidth $h_{opt}$.

The second is a simple automatic procedure that Fan and Gijbels (1996, Section 4.2) provide. This procedure fits a fourth-order global polynomial separately on the left and the right of the cutoff point. For either side, the rule-of-thumb (ROT) bandwidth is

$$c \left[ \frac{\hat{\sigma}^2 \left( \max\{X_i\} - \min\{X_i\} \right)}{\sum_i \hat{m}''(X_i)^2} \right]^{1/5},$$

where $\hat{m}''(X_i)$ is the estimated second derivative of the global polynomial evaluated at $X_i$, $\hat{\sigma}^2$ is the mean squared error for the regression, $\max\{X_i\} - \min\{X_i\}$ is the range of $X_i$, and the constant $c = 2.702$ is specific to the rectangular kernel used here. Of the two ROT bandwidths, I choose the smaller one.

The results are robust not only to the choice of bandwidth, but also to the kinds of courses.
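The rule-of-thumb bandwidth described above can be sketched as follows. This is an illustrative version only, not the code used for Table 2.2: it uses the squared second derivative in the denominator as in Fan and Gijbels (1996), takes the mean squared error as the residual sum of squares divided by n - 5 (an assumption), rescales the test score to [-1, 1] before fitting for numerical stability, and calls the kernel constant `kappa` to avoid a clash with the cutoff. All function names are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_poly(u, y, degree=4):
    """Least-squares polynomial coefficients a_0..a_degree (normal equations)."""
    p = degree + 1
    A = [[sum(ui ** (j + k) for ui in u) for k in range(p)] for j in range(p)]
    b = [sum(yi * ui ** j for ui, yi in zip(u, y)) for j in range(p)]
    return solve_linear(A, b)


def rot_bandwidth(x, y, kappa=2.702):
    """Rule-of-thumb bandwidth for one side of the cutoff (rectangular kernel)."""
    lo, hi = min(x), max(x)
    mid, half = (lo + hi) / 2.0, (hi - lo) / 2.0
    u = [(xi - mid) / half for xi in x]          # rescale x to [-1, 1]
    a = fit_poly(u, y, 4)                         # global quartic fit
    fitted = [sum(ak * ui ** k for k, ak in enumerate(a)) for ui in u]
    sigma2 = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / (len(x) - 5)
    # Second derivative with respect to x (chain rule: divide by half**2).
    curv = [sum(k * (k - 1) * ak * ui ** (k - 2)
                for k, ak in enumerate(a) if k >= 2) / half ** 2 for ui in u]
    return kappa * (sigma2 * (hi - lo) / sum(m * m for m in curv)) ** 0.2


def rot_bandwidth_both_sides(x, y, cutoff):
    """Smaller of the left-side and right-side ROT bandwidths."""
    left = [(xi, yi) for xi, yi in zip(x, y) if xi < cutoff]
    right = [(xi, yi) for xi, yi in zip(x, y) if xi >= cutoff]
    return min(rot_bandwidth(*zip(*left)), rot_bandwidth(*zip(*right)))
```

Each side needs more than five observations with distinct scores, and an outcome that is exactly polynomial gives a zero residual variance and hence a zero bandwidth, so the sketch is only meaningful for noisy data.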
If a student was assigned to pre-algebra, he or she was 43 to 45 percentage points less likely to enroll in elementary algebra than a student who could enroll in it directly. Similarly, a student who was assigned to elementary algebra was 40 to 43 percentage points less likely to take intermediate algebra than a student assigned to intermediate algebra. In addition, the flatness of the conditional expectation of enrollment in the main course implies that once a student was assigned to the prerequisite course, he or she was 40 to 50 percentage points less likely to take the main course irrespective of test scores and the kind of prerequisite. It can be inferred that students do not enroll in the next-level course simply because of the prerequisite requirement itself. Whenever developmental math courses are differentiated and sequentially organized, the same problem occurs. The main course's outcomes cannot be observed for some of those assigned to the prerequisite course. How to address the missing outcome problem is therefore a salient issue in the evaluation of developmental mathematics offered at community colleges.

2.5.2 Main Results: GPA on the Main Course

I now turn to the results for the main outcome, the GPA on the main course. Figure 2.5 shows the proportion of students finishing the main course and the conditional expectation of the mean grade on the main course. There is evidence that the rate of finishing the main course is also discontinuous at the cutoff point between the prerequisite course and its subsequent course, as seen in Figures 2.5a and 2.5b. Figure 2.5c shows that those who barely missed the cutoff score, and hence were assigned to the prerequisite course pre-algebra, outperformed their counterparts assigned to elementary algebra directly.
But Figure 2.5d shows no discontinuity at the cutoff point between elementary algebra and intermediate algebra. Note that Figures 2.5c and 2.5d can plot only the observable outcomes, and thus the discontinuity shown could overestimate or underestimate the true effects on the GPA on the main course.

Figure 2.5: Finishing the Main Course and Mean Grade on the Main Course. (a) Finishing the Main Course; Prerequisite: Pre-algebra (PA), Main: Elementary Algebra (EA). (b) Finishing the Main Course; Prerequisite: Elementary Algebra (EA), Main: Intermediate Algebra (IA). (c) Mean Grade on the Main Course; Prerequisite: Pre-algebra (PA), Main: Elementary Algebra (EA). (d) Mean Grade on the Main Course; Prerequisite: Elementary Algebra (EA), Main: Intermediate Algebra (IA).

Table 2.3 reports the estimates of the effect of the assignment to the prerequisite course on the GPA on the subsequent main course. Two estimation procedures are used. Panel I reports the result of the local linear regression estimation conditioning on the observable GPA on the main course. Panel II reports the lower (or upper) bound for the treatment effects, obtained by discarding some portion of the highest (or lowest) outcomes of the control group. The procedure in panel I corresponds to Figures 2.5c and 2.5d and serves as the benchmark for the bounding procedure in panel II, though it yields biased estimates due to the sample selection problem.

There are two issues to be discussed before presenting the results. First, the choice of bandwidth is not yet clear, because the same bandwidth should be used for the estimation of $\pi$ and the computation of the bounds, $\tau_L$ and $\tau_U$. The optimal bandwidth can be obtained from either the estimation of the effect on observability or the estimation of the treatment effect on the outcome conditioning on the observable outcomes. The curvatures of the two outcomes differ, so the corresponding bandwidths are different.
I choose the bandwidth derived from the estimation of the effect on the outcome. It is reasonable to think that, despite the possible bias, the curvature of the true outcomes is more similar to that of the observable outcomes than to that of the observability indicators.

The second issue is inference on the treatment effects as well as the bounds. Imbens and Manski (2004) suggest a way to compute a 95% confidence interval for the parameter of interest, the effect of the prerequisite course on the GPA on the next course. The interval

$$\left[\hat{\Delta}_L - \bar{C}_n \frac{\hat{\sigma}_L}{\sqrt{n}},\; \hat{\Delta}_U + \bar{C}_n \frac{\hat{\sigma}_U}{\sqrt{n}}\right]$$

contains the parameter $E(Y_1 - Y_0 \mid S_1 = 1, S_0 = 1, X = c)$ with a probability of at least 0.95, where $n$ is the sample size, $\hat{\sigma}_L$ and $\hat{\sigma}_U$ are the standard errors of the lower bound and the upper bound, respectively, and $\bar{C}_n$ satisfies

$$\Phi\left(\bar{C}_n + \sqrt{n}\,\frac{\hat{\Delta}_U - \hat{\Delta}_L}{\max(\hat{\sigma}_L, \hat{\sigma}_U)}\right) - \Phi\left(-\bar{C}_n\right) = 0.95.$$

However, the variances $\sigma_L^2$ and $\sigma_U^2$ have not been discussed, although the identification and estimation of the bounds $\Delta_L$ and $\Delta_U$ are shown in Section 2.3. Instead of deriving the analytic asymptotic variances (see footnote 20), bootstrapping is used to estimate the variance of the bounds (Horowitz, 2001; Horowitz and Manski, 2000). When bootstrapping the standard errors of the bounds, sampling is done at the level of the test score $X_i$, given the bandwidth.

The left side of panel I in Table 2.3 contains the benchmark results for the impact of pre-algebra on the GPA on elementary algebra. Even when one restricts the sample to the students who finished elementary algebra, pre-algebra seems to improve students' skills in elementary algebra. Note that this estimate can exaggerate the effect if those assigned to pre-algebra did not enroll in elementary algebra because they were believed to be inferior in math. Otherwise, it can be biased downward. Given the data and assumptions, there is no telling whether the estimated effects in panel I are overestimated or underestimated.
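The Imbens and Manski (2004) interval described above can be computed numerically once the two bound estimates and their bootstrapped standard errors are available. The following is a minimal sketch rather than the code used for Table 2.3: the function name `imbens_manski_interval` and its arguments are hypothetical, the standard errors are assumed to be already on the scale of the estimates (so that the square-root-of-n factors cancel), and the critical value is found by simple bisection.

```python
from statistics import NormalDist


def imbens_manski_interval(lower, upper, se_lower, se_upper, level=0.95):
    """Confidence interval for a partially identified parameter.

    lower, upper       : estimated lower and upper bounds
    se_lower, se_upper : their standard errors (assumed already scaled,
                         i.e. sigma / sqrt(n) in the notation of the text)
    Returns (ci_low, ci_high, critical_value).
    """
    phi = NormalDist().cdf
    # Width of the identified set relative to the larger standard error.
    width = max(upper - lower, 0.0) / max(se_lower, se_upper)

    # Solve Phi(c + width) - Phi(-c) = level for c.
    # The left-hand side is increasing in c, so bisection is enough.
    lo, hi = 0.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if phi(mid + width) - phi(-mid) < level:
            lo = mid
        else:
            hi = mid
    c = (lo + hi) / 2.0
    return lower - c * se_lower, upper + c * se_upper, c


# Hypothetical inputs, for illustration only (not taken from Table 2.3).
ci_low, ci_high, crit = imbens_manski_interval(-0.5, 1.0, 0.3, 0.3)
```

When the two bounds coincide, the critical value approaches the usual two-sided value of about 1.96; when the bounds are far apart relative to their standard errors, it approaches the one-sided value of about 1.645.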
20 This approach is unattractive because the expressions for the asymptotic variance are very lengthy and thus tedious to implement. Even the simplest case with no covariates is very complicated (see Lee, 2009, Proposition 3 and its proof).

Table 2.3: Effects of Prerequisite Course on the Average Grade Points of the Main Course

I. Local Linear Regression Estimation, Conditioning on the Observable Outcomes

                  A. Main course: EA                 B. Main course: IA
                     Prerequisite: PA                   Prerequisite: EA
                  (1) CV    (2) ROT   (3) Medium     (1) CV    (2) ROT   (3) Medium
Bandwidth         11.6      4.7       8              12        4.6       8
Effect estimate   0.687***  0.676**   0.664***       0.166     0.082     0.201
                  (0.210)   (0.325)   (0.251)        (0.245)   (0.406)   (0.312)
N                 1022      1022      1022           684       684       684

II. Local Linear Regression Estimation of the Bounds

                  A. Main course: EA                 B. Main course: IA
                     Prerequisite: PA                   Prerequisite: EA
                  (1) CV    (2) ROT   (3) Medium     (1) CV    (2) ROT   (3) Medium
Proportion        0.334***  0.460***  0.344***       0.381***  0.435***  0.405***
                  (0.076)   (0.115)   (0.094)        (0.085)   (0.157)   (0.104)
Lower bound       -0.007    -0.030    -0.055         -0.685**  -0.922    -0.725*
                  (0.258)   (0.478)   (0.321)        (0.328)   (0.735)   (0.416)
Upper bound       0.966***  1.530***  0.947**        0.763***  0.905**   1.048***
                  (0.271)   (0.520)   (0.371)        (0.252)   (0.427)   (0.346)
Lower 95%         -0.021    -0.072    -0.076         -0.707    -1.001    -0.759
Upper 95%         0.980     1.576     0.972          0.780     0.950     1.077
N                 2008      2008      2008           1326      1326      1326

* indicates significance at the 10% level, ** at the 5% level, and *** at the 1% level.
Note: Standard errors are reported in parentheses. In panel I, the effect estimate is the effect of the prerequisite on the main course, conditioning on the students who finish the main course. It is estimated by local linear regression in (2.5), (2.6) and (2.7). Its standard error is calculated by (2.8). In column (1), h is obtained by the modified cross-validation (CV) method suggested by Imbens and Lemieux (2008) and Ludwig and Miller (2005), discarding the 95% of observations in the tails. In column (2), h is obtained by the rule of thumb (ROT) derived by Fan and Gijbels (1996), assuming the rectangular kernel. In column (3), h is arbitrarily set. In panel II, the estimated bounds and the related results are reported. The same bandwidths as in panel I are used for the estimation of the bounds. The proportion is the share of students who finish the main course because of direct assignment to the main course, but would not finish the main course if they were required to take the prerequisite before the main course. It is computed from the local linear estimates on the left and right of the cutoff defined in (2.19). The lower and upper bounds are the bounds for the effect of the prerequisite course on the average grades on the main course, restricting to the students who always finish the main course irrespective of the assignment result. Both are estimated by (2.21) and (2.22) and their preceding procedures. Standard errors of the proportion and of the lower and upper bounds are calculated via 500 bootstrap replications, where sampling is done at the test score level.

Instead of point identification of the effects, I employ bounding procedures to compute the effect of pre-algebra in panel II. The upper-bound estimates mean that the assignment to pre-algebra would increase the GPA on elementary algebra by 0.95–1.3 points in the best case. The lower-bound estimates imply that there are no significant effects on elementary algebra in the worst case. The estimated proportion implies that 33–44% of those who were assigned to elementary algebra and finished it would not have finished elementary algebra if they had been required to take pre-algebra. Some would drop out of the developmental program because they think that they have already mastered pre-algebra and that taking it would be a waste of time. In contrast, some would exit the developmental math sequence because they are not confident of passing pre-algebra.
The lower bound corresponds to the case where all the students in $\lim_{x \downarrow c}\{i : S_0 = 0, S_1 = 1, X = x\}$ are of the first type, while the upper bound corresponds to the case where all the students in $\lim_{x \downarrow c}\{i : S_0 = 0, S_1 = 1, X = x\}$ are of the second type. It is an extreme case for $\lim_{x \downarrow c}\{i : S_0 = 0, S_1 = 1, X = x\}$ to consist of only one type. In particular, the lower bound can be attained only if all the students who would not finish elementary algebra were assigned to pre-algebra, and the corresponding estimate is almost zero. As a result, the true effect can be significantly positive, though it can be lower than not only the upper bound but also the estimates from conditioning on the observable outcomes.

In contrast to the effect of pre-algebra, the estimated effects of elementary algebra on the mean grade on intermediate algebra are easy to interpret. First, restricting the sample to the students who finish intermediate algebra, it appears that elementary algebra does not raise intermediate algebra skill. Second, the lower-bound estimates are significantly negative and the upper-bound estimates are significantly positive, irrespective of the choice of bandwidth. In addition, the median value of the two bounds is close to zero. This implies that the effect of elementary algebra is much more likely to be insignificant.

2.5.3 Main Results: Time to Complete the Main Course

The second main outcome is the time to complete the main course, measured in semesters. This outcome might reflect learning efficiency. Figure 2.6 shows the conditional expectation of the time (in semesters) to complete the main course and the proportion of observable outcomes, i.e., the proportion of students completing the main courses. Compared to the rate of finishing the main course, the rate of completion of the main course looks less discontinuous at the cutoff point, as seen in Figures 2.6a and 2.6b.
Even if the proportion of students getting letter grades on the main course is discontinuously higher for those directly assigned to the main course, the likelihood that those students finally obtain positive grades on the main course is not so high. This implies that students who are assigned to prerequisite courses are more likely to get positive grades on the main courses, at least when the sample is restricted to students who finish the main course or get a letter grade. Neither Figure 2.6c nor Figure 2.6d gives visible evidence of a discontinuity at the cutoff point between the prerequisite and the main course. Note that Figures 2.6c and 2.6d can plot only the observable outcomes, and thus the graphs shown may not reflect the true patterns.

Table 2.4 reports the effects of the prerequisite course on the time to complete the main course. Panel I reports the local linear regression estimates, conditioning on the observable time to complete the main course. This panel corresponds to Figures 2.6c and 2.6d. The estimates in panel I may be biased due to the sample selection problem. Panel II reports the lower (or upper) bound for the treatment effects, obtained by discarding the highest (or lowest) outcomes of the control group.

Table 2.4: Effects of Prerequisite Course on the Time to Complete the Main Course

I. Local Linear Regression Estimation, Conditioning on the Observable Outcomes

                  A. Main course: EA                 B. Main course: IA
                     Prerequisite: PA                   Prerequisite: EA
                  (1) CV    (2) ROT   (3) Large      (1) CV    (2) ROT   (3) Large
Bandwidth         5.3       2.6       8              7.9       2.5       12
Effect estimate   -0.512**  -0.652**  -0.141         0.260     0.329     -0.110
                  (0.219)   (0.314)   (0.179)        (0.228)   (0.415)   (0.180)
N                 775       775       775            548       548       548

II. Local Linear Regression Estimation of the Bounds

                  A. Main course: EA                 B. Main course: IA
                     Prerequisite: PA                   Prerequisite: EA
                  (1) CV    (2) ROT   (3) Large      (1) CV    (2) ROT   (3) Large
Proportion        0.349**   0.234     0.233*         0.269*    0.388     0.283**
                  (0.155)   (0.344)   (0.135)        (0.160)   (0.287)   (0.137)
Lower bound       -0.988*** -1.099*** -0.703**       -0.414    0.236     -0.798*
                  (0.253)   (0.412)   (0.279)        (0.382)   (0.993)   (0.328)
Upper bound       0.165     -0.386    0.105          0.518**   0.729     0.344*
                  (0.265)   (0.351)   (0.257)        (0.229)   (0.465)   (0.203)
Lower 95%         -1.009    -1.147    -0.722         -0.445    0.093     -0.820
Upper 95%         0.187     -0.345    0.122          0.537     0.796     0.357
N                 2008      2008      2008           1326      1326      1326

* indicates significance at the 10% level, ** at the 5% level, and *** at the 1% level.
Note: Standard errors are reported in parentheses. In panel I, the effect estimate is the effect of the prerequisite on the main course, conditioning on the students who complete the main course. It is estimated by local linear regression in (2.5), (2.6) and (2.7). Its standard error is calculated by (2.8). In column (1), h is obtained by the modified cross-validation (CV) method suggested by Imbens and Lemieux (2008) and Ludwig and Miller (2005), discarding the 95% of observations in the tails. In column (2), h is obtained by the rule of thumb (ROT) derived by Fan and Gijbels (1996), assuming the rectangular kernel. In column (3), h is arbitrarily set. In panel II, the estimated bounds and the related results are reported. The same bandwidths as in panel I are used for the estimation of the bounds. The proportion is the share of students who complete the main course because of direct assignment to the main course, but would not complete the main course if they were required to take the prerequisite before the main course. It is computed from the local linear estimates on the left and right of the cutoff defined in (2.19). The lower and upper bounds are the bounds for the effect of the prerequisite course on the time to complete the main course, restricting to the students who always complete the main course irrespective of the assignment result. Both are estimated by (2.21) and (2.22) and their preceding procedures. Standard errors of the proportion and of the lower and upper bounds are calculated via 500 bootstrap replications, where sampling is done at the test score level.

Figure 2.6: Completion of the Main Course and Time to Complete the Main Course
(a) Completion of the Main Course. Prerequisite: Pre-algebra (PA); Main: Elementary Algebra (EA)
(b) Completion of the Main Course. Prerequisite: Elementary Algebra (EA); Main: Intermediate Algebra (IA)
(c) Time to Complete the Main Course. Prerequisite: Pre-algebra (PA); Main: Elementary Algebra (EA)
(d) Time to Complete the Main Course. Prerequisite: Elementary Algebra (EA); Main: Intermediate Algebra (IA)

First, I look at the effect of the pre-algebra course on the time to complete elementary algebra. In contrast to the effects on the GPA, the estimates are sensitive to the choice of bandwidth when restricting to observable outcomes. The estimate conditioning on the observable outcomes ranges from -0.65 to -0.51 when using the optimal bandwidths selected by the rule-of-thumb and cross-validation procedures. Meanwhile, the estimate using the large bandwidth is not significant. The bounds are less sensitive to the choice of bandwidth. The upper bounds are always insignificant. Although the variation in the lower bounds' magnitude is a little larger than in the case of the effect on the mean grades, the lower bounds are always significantly negative. Applying the same reasoning as in interpreting the bounds of the effects on the GPA, the assignment to pre-algebra seems to reduce the time to complete elementary algebra. In particular, the computed bounds and the 95% interval lie in the negative region when using the ROT bandwidth. This result supports the conclusion that pre-algebra would reduce the time to complete elementary algebra, and hence accelerate learning it.
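The notes to Tables 2.3 and 2.4 state that the standard errors of the bounds come from 500 bootstrap replications with sampling done at the test score level. A minimal sketch of such a resampling scheme is given here, under the assumption that all students sharing a test score are drawn together as one block. The function name, its arguments, and the toy estimator in the example are illustrative and are not taken from the original analysis.

```python
import random
from collections import defaultdict
from statistics import stdev


def score_level_bootstrap_se(scores, outcomes, estimator, reps=500, seed=0):
    """Bootstrap standard error with resampling at the test score level.

    scores    : test score of each student
    outcomes  : outcome of each student (same order as scores)
    estimator : function (scores, outcomes) -> float, e.g. a bound estimate
    """
    rng = random.Random(seed)

    # Group students by test score so that each score forms one block.
    blocks = defaultdict(list)
    for s, y in zip(scores, outcomes):
        blocks[s].append(y)
    distinct_scores = sorted(blocks)

    draws = []
    for _ in range(reps):
        sample_scores, sample_outcomes = [], []
        # Draw score values with replacement; keep every student in the block.
        for s in rng.choices(distinct_scores, k=len(distinct_scores)):
            sample_scores.extend([s] * len(blocks[s]))
            sample_outcomes.extend(blocks[s])
        draws.append(estimator(sample_scores, sample_outcomes))

    # The standard deviation across replications is the bootstrap standard error.
    return stdev(draws)


# Toy example with a sample mean as the estimator (hypothetical data).
toy_scores = [1, 1, 2, 2, 3, 3]
toy_outcomes = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0]
toy_se = score_level_bootstrap_se(
    toy_scores, toy_outcomes, lambda x, y: sum(y) / len(y)
)
```

Resampling whole score blocks, rather than individual students, preserves any dependence among students who share the same value of the running variable.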
Second, the effects of elementary algebra on the time to complete intermediate algebra are examined. The estimates conditioning on the observable outcomes are always insignificant regardless of bandwidth. However, the estimated bounds are sensitive to the choice of bandwidth. Using the ROT bandwidth (2.5), elementary algebra seems instead to increase the time to complete intermediate algebra. Even though the computed bounds are not significantly different from zero, the 95% interval lies in the positive region. Using the bandwidth (7.9) from the cross-validation method, the upper bound is significantly positive and the lower bound is insignificant (see footnote 21). But the large bandwidth gives the opposite result: the estimates from the large bandwidth imply that the elementary algebra assignment would reduce the time to complete the main course. Thus, no conclusion can yet be drawn about the effect of the assignment to elementary algebra on the time to complete intermediate algebra.

21 The median of the two bounds seems to be zero, and hence it implies no significant effect of elementary algebra.

2.5.4 Sensitivity to the Choice of Bandwidths

Using all the available bandwidths, I check the estimates' robustness to the choice of bandwidth. Figure 2.7 shows the estimated effects on the GPA on the main course across bandwidths. The estimated bounds are stable for both i) the pre-algebra assignment's effect on the GPA on elementary algebra and ii) the elementary algebra assignment's effect on the GPA on intermediate algebra. Pre-algebra would increase the GPA on elementary algebra, while elementary algebra would not significantly increase the GPA on intermediate algebra. There is no difficulty in interpreting the results for the prerequisite effects on the GPA on the main course. Figure 2.8 shows the estimated effects on the time (in semesters) it took students to complete the main course across bandwidths.
It is notable that the estimated effects are very strong when using very small bandwidths: i) the effects of the assignment to pre-algebra on elementary algebra are strongly negative, and ii) the effects of the assignment to elementary algebra on intermediate algebra are strongly positive. With wider bandwidths, the effects become progressively less significant. Another noticeable point is that the sign of the effect of elementary algebra on the time to complete intermediate algebra flips when the bandwidth exceeds about 9. In contrast to the estimation of the effect on the GPA, the prerequisite effects on the time to complete the main course are more sensitive to the bandwidth choice, since the estimand in a regression discontinuity design is a limit at the cutoff point and the curvatures of the conditional expectations of the time to complete the main course change drastically along the test scores, as seen in Figures 2.6c and 2.6d. Despite the sensitivity to the bandwidth choice, the preferred results are the bounds computed with smaller bandwidths. Thus, it follows that pre-algebra would increase the efficiency of learning elementary algebra, by reducing the time to complete elementary algebra by half a semester. However, the assignment to elementary algebra would slow down learning intermediate algebra.

Figure 2.7: Bandwidths and the Estimated Prerequisite Effect on the Mean Grade
(a) Prerequisite: Pre-algebra (PA); Main: Elementary Algebra (EA)
(b) Prerequisite: Elementary Algebra (EA); Main: Intermediate Algebra (IA)

Figure 2.8: Bandwidths and the Estimated Prerequisite Effect on the Time to Complete the Main Course
(a) Prerequisite: Pre-algebra (PA); Main: Elementary Algebra (EA)
(b) Prerequisite: Elementary Algebra (EA); Main: Intermediate Algebra (IA)

2.6 Validity of Regression Discontinuity Design

So far I have estimated the effect of the prerequisite on achievement in the subsequent course.
The estimation results are reliable as long as Assumptions 2.1 and 2.2 hold. The two assumptions imply that students who are assigned to the prerequisite and students who are not required to take the prerequisite are similar in all the other characteristics that determine achievement outcomes when one restricts the sample to students who were close to the cutoff point for the prerequisite course. The assumptions required for the validity of the regression discontinuity design thus yield two testable hypotheses: (1) the conditional expectation of predetermined covariates and (2) the density of the test scores should be continuous at the cutoff (Lee, 2008; McCrary, 2008). In this section, I test these important hypotheses.

2.6.1 Discontinuities in Covariates

The continuity Assumptions 2.1 and 2.2 imply that all predetermined characteristics at the time of the assessment should be similar for the groups of students barely failing and barely passing the cutoff. Although it is impossible to test those assumptions directly, Lee (2008) derives a necessary condition by invoking additional assumptions. While it cannot be certain that the unobservable characteristics of students are continuous at the cutoff point, the validity of this assumption can be tested by checking that the conditional expectations of the observable characteristics do not vary discontinuously in the neighborhood of the cutoff score. Table 2.5 reports the estimated discontinuities in the covariates available from the data set, using the same local linear regression procedures as for the treatment effects. All the bandwidths correspond to the ones in Table 2.3 and Table 2.4.

Table 2.5: Estimated Discontinuity in Covariates

A. Discontinuity at the Cutoff Point between Pre-algebra and Elementary Algebra

                                   Bandwidth
Variable                           (1) 11.6  (2) 4.7   (3) 8     (4) 5.3   (5) 2.6   (6) 8
Age at the Assessment              0.08      -0.05     -0.07     -0.04     -0.15     -0.07
                                   (0.15)    (0.26)    (0.19)    (0.24)    (0.35)    (0.19)
Female                             0.12*     0.13      0.13      0.14      0.02      0.13
                                   (0.06)    (0.11)    (0.08)    (0.10)    (0.14)    (0.08)
Black/Hispanic                     -0.03     0.03      -0.04     0.05      -0.02     -0.04
                                   (0.06)    (0.09)    (0.07)    (0.08)    (0.12)    (0.07)
Non U.S. Citizen                   -0.04     -0.06     -0.09     -0.09     -0.04     -0.09
                                   (0.06)    (0.09)    (0.07)    (0.09)    (0.12)    (0.07)
English is NOT Primary Language    0.03      -0.05     -0.03     -0.04     -0.12     -0.03
                                   (0.06)    (0.10)    (0.08)    (0.10)    (0.14)    (0.08)
Multiple Measure Points            -0.01     0.10      -0.01     0.10      0.11      -0.01
                                   (0.11)    (0.18)    (0.14)    (0.17)    (0.26)    (0.14)
Number of Observations             2008

B. Discontinuity at the Cutoff Point between Elementary Algebra and Intermediate Algebra

                                   Bandwidth
Variable                           (1) 12    (2) 4.6   (3) 8     (4) 7.9   (5) 2.5   (6) 12
Age at the Assessment              0.03      0.29      0.03      0.05      0.24      0.03
                                   (0.19)    (0.30)    (0.23)    (0.23)    (0.43)    (0.19)
Female                             -0.04     -0.08     -0.05     -0.05     -0.15     -0.04
                                   (0.08)    (0.13)    (0.10)    (0.10)    (0.18)    (0.08)
Black/Hispanic                     -0.08     -0.09     -0.04     -0.05     0.05      -0.08
                                   (0.08)    (0.13)    (0.10)    (0.10)    (0.17)    (0.08)
Non U.S. Citizen                   0.09      0.08      0.07      0.08      0.22      0.09
                                   (0.07)    (0.12)    (0.09)    (0.09)    (0.16)    (0.07)
English is NOT Primary Language    0.00      0.13      0.01      0.01      0.23      0.00
                                   (0.08)    (0.13)    (0.10)    (0.10)    (0.18)    (0.08)
Multiple Measure Points            -0.22*    -0.43**   -0.30*    -0.32**   -0.71**   -0.22*
                                   (0.14)    (0.21)    (0.16)    (0.16)    (0.31)    (0.14)
Number of Observations             1326

* indicates significance at the 10% level, ** at the 5% level, and *** at the 1% level.
Note: Standard errors are reported in parentheses. All the estimated discontinuities are obtained by local linear regression. The corresponding bandwidths come from Table 2.3 and Table 2.4.
First, consider the case in which pre-algebra is assigned as the prerequisite course and elementary algebra is the main course. The estimated discontinuities for most of the covariates are tiny, as seen in the table. It is particularly notable that multiple measure points are nearly identical on the two sides of the cutoff. Multiple measure points are calculated based on the quantity and quality of the high-school math that students previously took. Unlike the other covariates, multiple measure points are academic performance measures related to community-college math. If they were discontinuous at the cutoff point between pre-algebra and elementary algebra, the estimated effect of pre-algebra would not be reliable because of confounding. Moreover, particular attention should be paid to multiple measure points since they are directly added to the test score (ACCUPLACER) to form the adjusted test score, which is actually used for course placement. The absence of a difference in multiple measure points around the cutoff ensures that the assignment to pre-algebra is locally randomized by the placement policy and the test score (ACCUPLACER).

For those who are close to the cutoff point between elementary algebra and intermediate algebra, all the covariates are continuous at the cutoff point except the most important covariate: multiple measure points. Those assigned to elementary algebra have smaller multiple measure points (by 0.29–0.36 points) than those who were assigned to intermediate algebra without the requirement of elementary algebra. This means that those assigned to elementary algebra are significantly less likely to have completed either algebra 2 or beyond-algebra math. From the standpoint of measuring outcomes in a regression discontinuity design, this difference in multiple measure points poses a potential problem. It implies that the estimated effect of elementary algebra on intermediate algebra outcomes might not be unbiased.
The estimated effect captures the effect of the difference in high-school math achievement between the two groups. How much this discontinuity biases the estimates of the effect of elementary algebra on the outcomes in the subsequent math course is of interest, and it is connected with the cause of the discontinuity. I discuss the cause of the discontinuity and the direction of the bias due to this discontinuity after checking the discontinuity in the density of test scores.

2.6.2 Jumps in the Distribution of the Test Scores

As emphasized by Lee (2008) and McCrary (2008), manipulation of test scores can be a critical threat to the validity of a regression discontinuity design. Assignment to the prerequisite course around the pass-fail threshold would be randomized as long as test scores cannot be perfectly manipulated by students, teachers, etc. In the present case manipulation seems quite unlikely, because the test used at OCCSC is ACCUPLACER. ACCUPLACER is a computerized test system, so there is no chance to manipulate test scores. If the test scores were manipulated, there would be an observable discontinuity in the density of baseline test scores at the cutoff. I test for a discontinuity in the density function of the ACCUPLACER EA test (see footnote 22) using the test proposed by McCrary (2008), which is called the McCrary test.

22 Actually, the test score for the placement rule is the sum of the test score from the ACCUPLACER test and multiple measure points.

Table 2.6: McCrary Manipulation Test, Log Discontinuity Estimates

A. Students Enrolled in Math, who were selected to the sample in Table 2.1

                         Discontinuity at the Cutoff      Discontinuity at the Cutoff
                         Point between Pre-algebra        Point between Elementary Algebra
                         and Elementary Algebra           and Intermediate Algebra
Estimates                0.019                            -0.313*
Standard Errors          0.144                            0.194
Bin Size                 0.84                             0.84
Bandwidth                12.83                            10.71
Number of Observations   2483

B. All Students who were assessed and whose last test was ACCUPLACER EA

                         Discontinuity at the Cutoff      Discontinuity at the Cutoff
                         Point between Pre-algebra        Point between Elementary Algebra
                         and Elementary Algebra           and Intermediate Algebra
Estimates                0.027                            0.014
Standard Errors          0.098                            0.115
Bin Size                 0.54                             0.54
Bandwidth                10.92                            10.61
Number of Observations   7419

* indicates significance at the 10% level, ** at the 5% level, and *** at the 1% level.
Note: Standard errors are reported in parentheses, and they are derived from McCrary (2008).

Figure 2.9: Density of the Test Scores
(a) Students Enrolled in Math. Prerequisite: Pre-algebra (PA); Main: Elementary Algebra (EA)
(b) Students Enrolled in Math. Prerequisite: Elementary Algebra (EA); Main: Intermediate Algebra (IA)
(c) All Assessed Students. Prerequisite: Pre-algebra (PA); Main: Elementary Algebra (EA)
(d) All Assessed Students. Prerequisite: Elementary Algebra (EA); Main: Intermediate Algebra (IA)

Table 2.6 confirms that no statistically significant discontinuities are evident in the log of the test score density for the ACCUPLACER EA test in the case where the prerequisite is pre-algebra and the main course is elementary algebra. Figure 2.9 shows the corresponding plots of the log of the density of ACCUPLACER EA scores. Note, however, that there is a noticeable difference in the log of the density when the sample is restricted to the students who stand between elementary algebra and intermediate algebra; the log of the density of the test scores of students who score just below the cutoff score, 76, is remarkably smaller than that of their counterparts. This is seen in Figure 2.9b. But it cannot be said that this discontinuity results from the manipulation of test scores. Rather, it is the result of restricting the sample to the students who participate in the developmental math program offered at OCCSC.
For the unrestricted sample, that is, all the students who were assessed at OCCSC, the McCrary test shows no discontinuity at the cutoff between those who were assigned to elementary algebra and those who were not required to take elementary algebra, as seen in Figure 2.9d and in the right column of panel B of Table 2.6.

2.6.3 Discussion of Discontinuities in Multiple Measure Points and Density of Test Scores

Discontinuities in 1) multiple measure points and 2) the density of test scores (ACCUPLACER) are found at the cutoff between the elementary algebra assignment and intermediate algebra. I now examine why these discontinuities occur when I restrict the sample to the students who barely pass or fail the cutoff point between elementary algebra and intermediate algebra. The conjecture is that, among the students whose test scores are close to the cutoff point between elementary algebra and intermediate algebra, those who completed higher-level math in high school but are assigned to elementary algebra are more likely to be out of the sample than any others, when the sample is restricted to those who participate in the developmental math sequence offered at OCCSC. Consider three students. Two students completed all the algebra courses and beyond in high school and have the same high multiple measure points. One student did not complete them and hence has no multiple measure points. Between the two students who completed higher-level math, 1) one student barely passes the cutoff and is assigned to intermediate algebra, while 2) the other barely fails and must take elementary algebra before intermediate algebra. And 3) the student who did not complete higher-level math is assigned to elementary algebra. If the second student is significantly likely to refuse to participate in the developmental mathematics sequence while the first and the third participate in the sequence, a discontinuity in multiple measure points will be seen.
As a result, the density of test scores will be seen to be discontinuous at the cutoff point, too.

These discontinuities appear only when assigning students to elementary algebra or intermediate algebra. The first reason is that in California intermediate algebra (algebra 2) is an optional course in high school, unlike elementary algebra (algebra 1). Those who completed intermediate algebra in high school are inclined to be proud of having completed the optional, higher-level course. If they barely fail and are assigned to elementary algebra, they may be too discouraged to participate in the developmental math sequence. The second reason is that those who are close to the cutoff point between pre-algebra and elementary algebra may be more homogeneous than those who are close to the cutoff point between elementary algebra and intermediate algebra. Since multiple measure points can be acquired only if higher algebra and beyond is taken, those who stand between pre-algebra and elementary algebra obtain almost zero multiple measure points. And they might not be disappointed even if they were assigned to pre-algebra.

If the above reasoning explains the discontinuity in multiple measure points, these discontinuities can in turn explain the insignificance of the estimated effects of elementary algebra on the outcomes in the subsequent intermediate algebra course. The estimated effects of elementary algebra on the GPA on intermediate algebra (or the time to complete it) are biased downward (or upward). Among the students assigned to elementary algebra, the students who completed algebra 2 are more likely to refuse the developmental math sequence or community-college education than any others. Those who are out of the sample can be considered to have higher potential to make up for their lack of knowledge of intermediate algebra, or to develop their skills.
Because those students drop out, the placement policy based on the test score fails to deliver the local randomization required in a regression discontinuity design. As a result, the insignificant effects are unreliable when investigating elementary algebra's effect on the subsequent math course, intermediate algebra.

2.7 Conclusion

This chapter focuses on an important issue in evaluating the effects of developmental mathematics: the missing outcome problem. It is a serious issue even when the assignment to the prerequisite course is believed to be locally randomized close to the cutoff point between the assignment to the prerequisite and the assignment to no prerequisite. Existing sample-selection correction methods are not feasible due to the discontinuity in the observability of the outcomes. In order to estimate the effects of the prerequisite course on achievement in the subsequent course, this chapter bounds the treatment effects in a regression discontinuity design in the presence of missing-outcome problems. An appealing feature of the method is that the assumptions for identification and the continuity of potential outcomes are typically already made in standard models of regression discontinuity design. The additional assumption of monotonicity of observability reflects the properties of developmental education offered at community colleges. In the case of regression discontinuity design, the continuity assumption is shown to be satisfied in the check of the validity of the regression discontinuity design, as illustrated in the previous section.

The analysis using the proposed bounds points to two substantive conclusions about the developmental mathematics offered at community colleges. First, the evidence shows that assignment to developmental courses should increase the student's GPA on the subsequent math courses. Second, prerequisite courses should reduce the time to complete the subsequent courses, and increase learning efficiency.
Thus, the magnitudes found in this analysis of developmental mathematics support the opinion that developmental mathematics can help students who are unprepared for college-level math make up for their lack of skill in high-school mathematics.
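The idea of bounding a treatment effect when outcomes are missing, summarized in the conclusion of Chapter 2, can be illustrated with a generic trimming bound. This is a minimal sketch under stated assumptions, not the estimator derived in the chapter: it assumes monotonicity of observability, compares simple means of observations in narrow windows on each side of the cutoff rather than local polynomial fits, and requires each group to have at least one observed outcome. All function and argument names are hypothetical.

```python
from statistics import mean


def trimmed_means(values, share):
    """Return (mean after dropping the top `share` of values,
    mean after dropping the bottom `share` of values)."""
    ordered = sorted(values)
    k = int(round(share * len(ordered)))  # number of observations to trim
    if k >= len(ordered):
        raise ValueError("trimming share removes every observation")
    low = mean(ordered[:len(ordered) - k])  # top k dropped
    high = mean(ordered[k:])                # bottom k dropped
    return low, high


def trimming_bounds(y_treat, n_treat, y_ctrl, n_ctrl):
    """Lower and upper bounds on the treated-minus-control mean difference.

    y_treat, y_ctrl: observed outcomes near the cutoff on each side.
    n_treat, n_ctrl: number of students assigned to each side, including
    those whose outcome is missing.
    """
    p_t = len(y_treat) / n_treat  # share of treated with an observed outcome
    p_c = len(y_ctrl) / n_ctrl    # share of controls with an observed outcome
    if p_t >= p_c:
        # Treated outcomes are observed more often: trim the treated group.
        share = (p_t - p_c) / p_t
        low_t, high_t = trimmed_means(y_treat, share)
        m_c = mean(y_ctrl)
        return low_t - m_c, high_t - m_c
    # Control outcomes are observed more often: trim the control group.
    share = (p_c - p_t) / p_c
    low_c, high_c = trimmed_means(y_ctrl, share)
    m_t = mean(y_treat)
    return m_t - high_c, m_t - low_c
```

When the two observation rates are equal, no trimming occurs and the bounds collapse to the ordinary difference in means. The wider the gap between the observation rates on the two sides of the cutoff, the wider the resulting interval.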
Abstract
This dissertation analyzes the effects of educational programs for students who need special care in secondary and postsecondary school. These programs present serious endogeneity problems to a researcher estimating their causal effects, because most students try to avoid them. I extend the current literature in applied econometrics in two ways. First, I search for valid instruments based on exogenous policy changes when neither appropriate proxy variables for ability nor traditional instrumental variables are available, and I examine the validity of the instrumental variable estimates. Second, I address the missing outcome problem in regression discontinuity design without invoking the additional assumption, which makes it possible to obtain bounds for the estimates when outcomes are missing in a regression discontinuity design.

Chapter 1 evaluates the impact of vocational high school on labor-market success in Korea. As a measure of the efficacy of vocational high school, I use wage data from the Korean Labor and Income Panel Study (KLIPS). Restricting the dataset to high-school graduates, I compare general high schools and vocational high schools. To address the endogeneity of the choice of school type, the preset capacity of each school is used as an instrumental variable for that choice. I find that vocational high school gives greater returns to graduates in the sense of local average treatment effects: enrollment in vocational high school yields about 30 percent higher wages than enrollment in general high school. This study also shows that the usual OLS estimates underestimate the effect of vocational high school on wages, while the IV estimates eliminate the downward bias generated by the selection problem. This result is contrary to previous studies showing that vocational high school scarcely affects wages.

Chapter 2 investigates the effects of developmental math courses offered at community colleges, addressing the missing outcome problem in quasi-experimental studies. Many students are unprepared for college-level math in spite of many attempts to improve the math skills of high-school students. In community colleges, developmental mathematics courses are designed to help those students make up for gaps in high-school math. However, there are few studies on the effect of developmental mathematics on mathematics achievement, despite the vast quantity of research on the courses' effects on various other outcomes. Developmental mathematics consists of various courses in a tight sequence in which course assignments are determined by a rigid placement rule based on students' test scores, and in which students must master the assigned course before taking the next level of math. A course's effectiveness can be measured by the letter grade or other test scores in its subsequent course. However, such an effect is difficult to investigate because of missing outcome problems
Asset Metadata
Creator: Kim, Bo Min (author)
Core Title: Essays on economics of education
School: College of Letters, Arts and Sciences
Degree: Doctor of Philosophy
Degree Program: Economics
Publication Date: 07/02/2014
Defense Date: 04/22/2013
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: bounding, Community colleges, developmental courses, economics of education, instrumental variables, local treatment effects, Mathematics education, missing outcome problem, OAI-PMH Harvest, partial identification, program evaluation, regression discontinuity design, treatment effects, vocational high schools
Format: application/pdf (imt)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Ridder, Geert (committee chair), Melguizo, Tatiana (committee member), Moon, Hyungsik Roger (committee member)
Creator Email: ball556@gmail.com, bokim@usc.edu
Permanent Link (DOI): https://doi.org/10.25549/usctheses-c3-282639
Unique identifier: UC11293815
Identifier: etd-KimBoMin-1730.pdf (filename), usctheses-c3-282639 (legacy record id)
Legacy Identifier: etd-KimBoMin-1730.pdf
Dmrecord: 282639
Document Type: Dissertation
Rights: Kim, Bo Min
Type: texts
Source: University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA