Essays on health insurance programs and policies
(USC Thesis Other)
ESSAYS ON HEALTH INSURANCE PROGRAMS AND POLICIES
by Hongming Wang

A dissertation presented to the faculty of the University of Southern California Graduate School in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Economics

August 2019

Copyright 2019 Hongming Wang

Abstract

Health is a key aspect of human capital. In modern societies, health insurance programs play important roles in promoting health and health investments, particularly for the vulnerable population. In the US, expenditures on government-sponsored health insurance programs accounted for 8.1% of GDP in 2017, a 3.5% increase from 2016.[1] On the other hand, the extent to which the spending benefits enrollees instead of providers, and the effectiveness of the spending in improving health and human capital, are more nuanced but important questions for policy design. This dissertation analyzes the benefits and beneficiaries of policy interventions in three health insurance programs.

Chapter 2 looks at the value-based payment reform in the private Medicare market that took effect in 2012. The reform cut the payment to private Medicare insurers, but increased quality-adjusted payments to insurers with higher quality ratings. I find that the bonus payment did not reduce the out-of-pocket costs for enrollees: the pass-through of the bonus payment to enrollee premium and deductible is indistinguishable from zero. Nor did enrollees in high-quality contracts report greater improvement in health. Rather, I find that high-quality contracts decreased (increased) premium in low (high) risk counties, differentially enrolling low-risk enrollees. Examining the rubric of the quality rating, I note that health outcome measures are not adjusted by diagnosis codes prior to enrollment, but receive the greatest weight in the computation of quality.
The risk selection therefore has the unintended consequence of restricting access to quality among the sicker population, who face higher premiums to purchase high-quality contracts.

Chapter 3 seeks to understand the rationale for expanding formal health insurance with policies such as premium subsidy and mandate penalty. The classic justification for the insurance mandate is adverse selection. In the US context, because the uninsured are not turned away at the emergency room, the social cost of uncompensated care provides additional motivation for expanding formal insurance. To understand the joint and relative relevance of the two arguments, I turn to the 2006-2007 health insurance reform in Massachusetts. I focus on two policy instruments that expanded formal insurance in the state: a subsidy to private insurance premiums and a penalty on the high-income uninsured. I derive and calculate the welfare effects, exploiting behavioral responses to policy incentives and the resulting externality on premium, uncompensated cost, and tax-subsidy transfers. I find that the rationale of the mandate penalty rests entirely on the selection effect on premium: shutting down the premium response, uncompensated care alone does not offset the cost of the penalty. Premium subsidy, by contrast, is mostly motivated by the high cost of uncompensated care in the low-income uninsured. Including a modest effect on premium (realistic even in the presence of mark-up adjustment to price-linked subsidy) generates a net positive return on subsidy dollars purely from an efficiency standpoint.

[1] https://www.cms.gov/research-statistics-data-and-systems/statistics-trends-and-reports/nationalhealthexpenddata/downloads/highlights.pdf

Chapter 4 looks at the long-run impact of in-utero investment response to the Children’s Health Insurance Program (CHIP).
I find pregnant mothers impacted by CHIP onset during pregnancy reduced smoking and drinking, and their children have higher birth weight and a lower chance of cognitive difficulty in their teenage years. The forward-looking behavior implies that public investment in children can “crowd-in” private investment that precedes program participation. Accounting for the short-term benefits at birth and the long-run benefits on later-life outcomes increases the cost-effectiveness of public insurance programs for children.

Acknowledgement

The development of the dissertation benefited tremendously from a wonderful group of mentors, colleagues, and friends. I would like to thank my committee members, Jeffrey Nugent, Alice Chen, and in particular, committee chair John Strauss, for providing constant support and guidance over my projects. I learned from them not only the skills to be a better researcher, but also the potential for academic research to effect real changes in the world through evidence and logic. I would also like to thank Geert Ridder and Cheng Hsiao for providing the necessary econometric background for causal inference analysis and panel data analysis, both vital in the toolkit of an empirical economist. I am further grateful for healthcare faculty from the Schaeffer Center, who introduced me to interesting topics in health economics and beyond. The first chapter of the dissertation, Quality, Quality Payments, and Risk Selection in Private Medicare, is joint work with fellow student Michele Fioretti, whose devotion and thoughtful comments significantly improved the paper. Lastly, I recall lively discussions with seminar speakers at the Department of Economics, CESR, and the Schaeffer Center. Their scholarship and diverse interests are inspirational for young generations of researchers.

Contents

1 Introduction
2 Quality, Quality Payments, and Risk Selection in Private Medicare
  2.1 Introduction
  2.2 Medicare Advantage
  2.3 Subsidies Before the Quality Based Payment Demonstration
    2.3.1 Subsidies After the Quality Based Payment Demonstration
    2.3.2 Measures of quality
  2.4 Data Summary
  2.5 Contract-Level Evidence
    2.5.1 Risk Score
    2.5.2 Market Characteristics
    2.5.3 Bid, Rebate, and Pricing
  2.6 Within-Contract Cross-County Evidence
    2.6.1 Robustness Checks
  2.7 Why Does QBP Induce Risk Selection?
    2.7.1 A Model of Risk, Quality, and Insurer Profit
    2.7.2 Empirical Evidence on the Risk-Quality Mechanism
    2.7.3 Characterizing the Risk-Outcome Correlation
  2.8 Discussion
  2.9 Conclusion
3 Expanding Health Insurance with Mandate and Subsidy: Theory and Evidence
  3.1 Introduction
  3.2 Massachusetts health insurance reform
    3.2.1 Mandate
    3.2.2 Subsidized insurance
    3.2.3 Rating regulation
    3.2.4 Uncompensated care programs
  3.3 Motivating insurance expansion
    3.3.1 Environment
    3.3.2 Adverse selection
    3.3.3 Uncompensated care
  3.4 Incremental expansion in Massachusetts
    3.4.1 Setting
    3.4.2 Welfare
    3.4.3 Premium subsidy
    3.4.4 Mandate penalty
    3.4.5 Model discussion
  3.5 Estimation
    3.5.1 Sample summary
    3.5.2 Subsidy rate
    3.5.3 Empirical strategy
    3.5.4 Simulated generosity
    3.5.5 Econometric model
    3.5.6 Results
    3.5.7 Robustness
  3.6 Calculation
  3.7 Welfare
    3.7.1 Subsidy
    3.7.2 Mandate penalty
    3.7.3 Robustness
    3.7.4 Fiscal cost of subsidy
    3.7.5 Uncompensated cost saving
    3.7.6 Reform rationale
    3.7.7 Discussion
  3.8 Conclusion
4 Long-run impact of in-utero investment response to CHIP
  4.1 Introduction
  4.2 Background and Previous Literature
  4.3 Data
  4.4 In-utero Investment Response
    4.4.1 Event Study
    4.4.2 Robustness and Alternative Mechanisms
    4.4.3 Simulated Eligibility
    4.4.4 Birth Weight Production Functions
  4.5 Long-run Impact
    4.5.1 Event Study
    4.5.2 Simulated Eligibility
  4.6 Conclusion
5 Conclusion
Bibliography
Appendices
A Appendix to Chapter 2
  A.1 Appendix Tables
  A.2 Appendix Figures
B Appendix to Chapter 3
  B.1 Appendix Proofs
    B.1.1 Risk-based pricing
    B.1.2 Basic trade-off
    B.1.3 Risk preference heterogeneity
    B.1.4 Labor productivity and non-labor income
    B.1.5 Tax penalty
    B.1.6 Uncompensated care
    B.1.7 Welfare formula
  B.2 Appendix Tables
  B.3 Appendix Figures
C Appendix to Chapter 4
  C.1 Appendix Tables
  C.2 Appendix Figures

1 Introduction

Payments to health insurance programs constitute a large fraction of GDP in the US. These programs aim to improve the health of vulnerable groups by improving their access to healthcare. The success of insurance programs depends on the ability to target individuals most likely to benefit from health services, and the effectiveness of these investments on health. I study the effect of policy reforms that either expanded access or linked payments to performance measures on health.

I first examine the effect of Quality Bonus Payment on private Medicare insurers and enrollees. The reform represents a general shift from volume-based payment models to value-based models across the US health system. Insurers receiving a higher quality rating, driven primarily by enrollee health outcomes, receive larger bonus payments. The intention is that the financial incentive should spur quality investments that ultimately benefit enrollee health.
On the contrary, I do not find substantial pass-through of bonus payments to enrollees, either in terms of lower out-of-pocket costs or greater health improvements. Moreover, the ultimate incidence of bonus payments is mediated by the price setting of private insurers. In this case, because health outcome measures drive the quality rating, failing to adjust for the baseline severity of measured conditions and the presence of other complicating conditions creates incentives for insurers to favorably select healthier individuals. Although pass-through of bonus payments to enrollees is minimal on average, across service areas it increased for healthier enrollees in low-risk counties but decreased for those in high-risk counties. This distributional effect negatively affects the well-being of high-risk enrollees if access to quality is important for the health outcomes of vulnerable patient groups.

I then consider the cost-effectiveness of policies that induce health insurance take-up among the uninsured. I focus on the premium subsidy and the tax penalty implemented in the ACA pilot in Massachusetts. These policy instruments lowered the out-of-pocket cost of insurance for the uninsured, but increased government spending in healthcare. In the case of ACA, for example, healthcare spending is projected to exceed 20% of GDP in 2020. On the other hand, the benefits of health insurance coverage to the uninsured and potential other parties in society are less transparent but important inputs to the analysis of expansion policies.

I show that the effectiveness of insurance expansion is specific to the status quo coverage options available to the uninsured. Compared to the de facto safety net coverage through the emergency room, enrollment in formal insurance lowers the social cost of charity care. This is desirable if the availability of charity care lowers the demand for insurance below own cost, resulting in inefficiently low take-up.
Alternatively, if the safety net program is intended as a means of redistribution to the poor, then subsidized health insurance should achieve a better public financing of healthcare costs than the status quo safety-net mandate.

I exploit policy variation in Massachusetts to understand take-up and the cost changes in the insured and the uninsured population. I use the cost changes to characterize the pricing benefits on the insurance premium and the social burden of charity care. I compare the benefits with the fiscal externality of the program, which depends on behavioral responses. I bring together evidence on choice, cost, and prices to numerically inform the cost-benefit analysis. I find that the effectiveness of the subsidy policy depends critically on the cost saving in the charity care program, which roughly offsets the efficiency cost of the subsidy transfer. On the other hand, a near complete pass-through from cost to premium is needed for the mandate penalty to effectively address adverse selection in pricing.

Finally, I consider a case where behavioral responses to policy magnified the health benefits over the long run. It occurred in the context of early life insurance coverage for children. I compare cohorts impacted in-utero by the onset of the Children’s Health Insurance Program (CHIP). I find pregnant mothers impacted by the roll-out reduced smoking and drinking during pregnancy, and their children had higher birth weight. Fourteen years later, the same children are less likely to experience cognitive difficulties at school. The health benefits at birth are driven by parental investment responses that complement public investments in children. Over the long run, the investment increases the cost-effectiveness of public insurance for children.

2 Quality, Quality Payments, and Risk Selection in Private Medicare

2.1 Introduction

Growing healthcare costs and insufficient quality are major issues for the US healthcare system.
To this end, the Affordable Care Act (ACA), as well as large private insurers, have recently introduced several performance payment schemes to efficiently stimulate the provision of high quality care.[2] This is often accomplished through subsidies and penalties tied to certain quality targets. Yet, there is limited evidence on the effectiveness of such incentive schemes in health insurance markets (McClellan, 2011). Further, difficulties in measuring care quality and insurers’ private information can constrain an insurer’s effort to the dimensions of quality that are rewarded (Holmstrom and Milgrom, 1991), and limit the provision of care to certain individuals (Finkelstein and Poterba, 2004). For these reasons, the Congressional Budget Office advocates a smaller role for performance pay in health insurance markets (Congressional Budget Office, 2018). This trade-off is central for Medicare, the primary source of health insurance for the US elderly, whose costs increase 5% per year.[3]

This paper studies a national value-added policy adopted in private Medicare, or Medicare Advantage (MA), as part of the legislative changes introduced in the Quality Bonus Payment (QBP) demonstration. Every year, each plan requests a subsidy from CMS through a bidding system. To encourage low bids, CMS compares them to a benchmark: if a bid is lower than its benchmark, CMS grants the plan a portion of the difference to be passed along to consumers in terms of better services or lower premiums. Moving from volume to value of care, QBP awarded financial incentives in terms of greater benchmarks and rebates according to a quality scoring system called star rating.

We examine the effect of differential payments to quality over the QBP period relative to a pre-QBP baseline where payment is indiscriminate of quality. On average, the reform did not differentially affect the way high- and low-quality plans priced their services.
Nevertheless, we show that high-quality plans improved their risk pool in the post-QBP period as a result of risk selection: high-quality plans charged greater premiums in risky counties after the reform. Further, we provide evidence that the selection arises from how quality is measured. We demonstrate that the star rating carries information not only on the effectiveness of the medical treatments but also on enrollees’ pre-existing conditions. This result stresses the importance of carefully measuring value-added in policy design to avoid unintended consequences.[4]

[2] For instance, many large insurance companies such as Aetna switched to value-based care reimbursement contracts to achieve cost reductions. Source: https://healthpayerintelligence.com/news/private-payers-follow-cms-lead-adopt-value-based-care-payment.

[3] Medicare enrolled more than 45 million beneficiaries in 2015. Spending on Medicare reached $500 bn in 2010, or 20% of total health spending, and continues to grow at a rate of 5% per year (CMS, 2016).

Methodologically, our difference-in-differences strategy tracks baseline high- and low-quality contracts over time. The absence of an effect on average pricing masks substantial pricing variation facing enrollees across locations. Looking within the contract market set, we uncover significant pricing differentials, namely a higher incidence of zero-premium pricing and enrollment in counties with lower fee-for-service (FFS) risk score, for baseline high-quality contracts but not for baseline low-quality contracts. These differentials explain the risk improvement of high quality contracts despite a null effect on average premium. Consistent with the risk selection mechanism (e.g., Brown et al., 2014, Newhouse et al., 2012), we find baseline high quality contracts with lower FFS risk score across service areas experience even greater reduction in enrollee risk after QBP.
This result complements existing evidence on contract pricing and benefit design as potential mechanisms of advantageous selection in high quality MA plans (Decarolis et al., 2017, Decarolis and Guglielmo, 2017).

There is no strong evidence for other mechanisms that may lead to an improved risk profile of high quality contracts. For example, market coverage is stable for treatment and control groups: high quality contracts did not differentially exit counties with higher FFS risk score, or enter counties with higher benchmark; nor did they vary the number of plans offered, or the number of counties served. Within contract, the data show no significant differences in prescription drug deductibles across service areas, although differentials in other aspects of benefit design which are not observed cannot be ruled out. Overall, evidence points to the importance of the premium variation across markets in the advantageous selection of high quality MA contracts.

[4] The issue that we highlight in this paper appears also in other domains. For instance, how to optimally pay teachers is a central question in the economics of education literature (see Chetty et al. (2014) for measures of value-added in education and Biasi (2018) for an empirical study of the impact of performance pay in education).

To understand why quality payments incentivize risk selection, and ultimately how the selection can be undone, the final part of the paper investigates the relationship between enrollee risk and contract quality. Conceptually, if lower enrollee risk contributes to higher quality, then quality payments can lead insurers to favor enrollments from low-risk regions, with stronger incentive on high quality contracts eligible for larger bonuses. This is consistent with the finding that only baseline high quality contracts engage in risk selection after QBP. The paper exploits the same difference-in-differences variation to characterize the correlation between risk score and quality, and the associated selection incentive. In particular, we focus on a subdomain measure used to compute the star rating called “patient outcome measures,” which includes details on an enrollee’s chronic conditions. We show that contracts with higher risk score in the baseline are less likely to perform well in patient outcome measures in the quality rating. Furthermore, high quality contracts with high baseline risk scores experience smaller improvements in outcome ratings relative to low-quality contracts with a more favorable risk pool, consistent with the selection incentive. We further document a negative correlation between contemporaneous risk score and outcome rating as well as the final rating, particularly for baseline high quality contracts. For these contracts, performance in the patient outcome domain is the most predictive of the final rating.

The selection suggests an outcome-based quality rating is biased due to baseline differences in health conditions: contracts enrolling patients with more complicated diagnoses perform less well in outcome measures, and are disadvantaged in the quality rating. A better measurement of quality would compare outcomes only across patients with a similar case-mix of diagnoses, or adjust outcome by baseline risk. The adjustment is lacking in the current rating for outcome measures in the “managing chronic conditions” domain. Removing the risk bias in quality rating will lessen the selection incentive associated with quality payments.

The selection result contributes to a nascent literature that recognizes the limitation of standard risk adjustment on all aspects of “cream-skimming” contract design (Einav et al., 2016, Veiga and Weyl, 2016).
For example, drugs receiving low payments relative to costs are more likely to be placed on a higher cost-sharing tier (Geruso et al., 2016), and so are drugs treating diagnoses rendered unprofitable by technological change (Carey, 2017, 2018). Similarly, the design of drug formularies is also used to select enrollees upon risk (Lavetti and Simon, 2017). In this paper, we highlight a novel aspect of the insurance contract, i.e., quality rating, as a potential source of risk selection.

This paper also contributes to a growing literature on the economic incidence of government payments to health insurers and providers (e.g., Dafny, 2005, Clemens and Gottlieb, 2014). After QBP, high quality contracts increased bids by nearly the full size of the bonus benchmark, and as a result, do not pass on more rebates to enrollees relative to low-quality contracts. In the MA context, a near zero pass-through to enrollees is striking, but not unprecedented (e.g., Cabral et al., 2018). Looking at 2007-2011, Duggan et al. (2016) find a near zero pass-through across neighboring urban and rural counties. We continue this line of research by showing that the recent quality payments similarly have minimal pass-through to enrollees.

Taken as a whole, our evidence from private Medicare emphasizes the role of insurer selection on the distributional incidence of quality payments, with mixed implications for welfare.[5] Since high quality service is made less (more) costly to enrollees in low (high) risk counties, the benefit of bonus payments is not evenly felt across beneficiaries.[6] The overall welfare effect, however, depends on the economic incidence between enrollee benefits and costs to the government, intermediated by insurer selection and pass-through. Curto et al. (2014) estimate that during 2006-2011 two thirds of the payment surplus is in the form of insurer profit and one third goes to enrollees who suffer some disutility from managed care.
The benchmark and rebate variation in QBP further complicates the split between insurers and enrollees, and among enrollees, those with high and low potential gains from quality. Hence we view evidence in this paper as constructive inputs to a normative characterization of value-based payments in health insurance markets, which we leave for future work.

The remainder of the paper is organized as follows. The QBP reform is introduced in Section 2.2. Section 2.4 describes the data. Sections 2.5 and 2.6 report the contract- and plan-level analyses respectively. The risk-selection mechanism is described both theoretically and empirically in Section 2.7. Section 2.8 discusses the results. Section 2.9 concludes.

[5] In related work, Gupta (2017) discusses the welfare implications of supply-side selection in hospital readmissions.

[6] Echoing the disparity concern, there is evidence that plans serving enrollees with disability (MedPAC, 2015), in low socio-economic status (SES) (NQF, 2014) and certain geography (Soria-Saucedo et al., 2016) are disfavored in the quality rating.

2.2 Medicare Advantage

Government subsidies are the largest source of revenue for MA insurers (Newhouse and McGuire, 2014). This section describes how insurers applied for the subsidy before QBP, and how QBP tied the subsidy to care quality.

2.3 Subsidies Before the Quality Based Payment Demonstration

Before QBP, subsidies were disbursed based on a competitive bidding model introduced in the Medicare Modernization Act (MMA) of 2003. The bid reflects the projected cost for each MA enrollee plus an administrative load. Under this model, MA insurers submit bids (denoted $b$) to CMS individually for each plan they own. Bids are evaluated against a plan-specific benchmark (denoted $B$). To determine this benchmark, CMS sets statutory benchmark capitation rates for each county from historic FFS costs. $B$ is assigned to each plan as a weighted average of the county benchmarks in its service area.
If a bid is below its benchmark, then CMS pays $b$ and, in addition, returns a fraction of the cost saving below the benchmark (under MMA, 75% of $(B - b)$) to insurers as rebate. The rebate is then passed on to enrollees in the form of premium reduction or extra benefits. When the bid exceeds the benchmark, the insurer is paid the benchmark from the government but receives no rebate. The excess cost $b - B$ is passed on to enrollees as extra premium. For MA enrollees with an average FFS risk score, the rebate to plan $i$ is

\[
\text{rebate}^{MMA}_i =
\begin{cases}
0 & \text{if } b_i \geq B_i, \\
0.75 \times (B_i - b_i) & \text{if } b_i < B_i.
\end{cases}
\]

The vast majority of plans submit a bid below their benchmark, providing enrollees with more generous coverage than traditional Medicare at little extra cost above Part B premium.[7] Hence government spending per enrollee is capped at the benchmark, and plans with more cost saving offer more generous coverage. In practice, bid, benchmark, and rebate are multiplied by the enrollee risk score to calculate the final payment. The risk adjustment is intended to make different risk types equally profitable for plan payment (Brown et al., 2014; Newhouse et al., 2012).

2.3.1 Subsidies After the Quality Based Payment Demonstration

As part of the 2010 Affordable Care Act, QBP was signed into law in November 2010 and revised the ACA bonus payments in the demonstration period from 2012 to 2014.[8] To this end, the rebate to a plan whose bid is below the benchmark was set as a function of the star rating using two tools. First, the rebate percentage ($\rho(q)$) is no longer 75%, but it monotonically increases with quality ($q$). Second, QBP introduced a bonus payment ($\beta(q)$) to adjust a plan’s overall benchmark for quality.
Hence the QBP rebate to an insurer of quality $q_i$ serving enrollees comparable to the FFS risk pool is given by

\[
\text{rebate}^{QBP}_i =
\begin{cases}
0 & \text{if } b_i \geq \beta(q_i) B_i, \\
\rho(q_i) \times (\beta(q_i) B_i - b_i) & \text{if } b_i < \beta(q_i) B_i.
\end{cases}
\]

In this paper, we refer to “bonus payments” as the sum of extra payments from either the bonus benchmark or the bonus rebate to higher quality plans. Table 2.1 shows the variation of bonus benchmarks, $\beta(q_i)$, and rebate percentages, $\rho(q_i)$, over the star rating.

[7] In our data 41% of the plans charge zero premium (above standard Part B premium), and 84% require no deductible for prescription drugs (see Table 2.2).

[8] Total bonus payments are more generous under the demonstration. In 2013-2014, payments under the demonstration are blended with the ACA levels, and bonus payments decreased over time. In 2015, payments fully transition to the lower ACA levels.

Although rebates are lower than in the 2009-2011 period, the reduction is smaller for higher quality contracts due to more generous quality bonus payments. In addition to the quality payments, QBP gradually reduced the benchmark faced by MA contracts to a level closer to FFS spending. The new benchmarks range from 95% of FFS cost in counties in the top quartile of FFS cost, to 115% in those in the lowest quartile. Moreover, QBP revised MMA benchmark bonuses by designating a set of double bonus counties.[9] While our main focus is on the bonus payment differential at the contract level, we also examine if high quality contracts differentially entered double bonus counties after QBP, and control for both the ACA and the QBP benchmark in the cross-county analysis.
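The two rebate rules described above can be illustrated with a minimal sketch. It assumes an enrollee with an average FFS risk score (so no risk adjustment is applied) and hard-codes the 2012 bonus and rebate percentages from Table 2.1. The function names, the dictionary layout, and the dollar amounts in the usage note are hypothetical and chosen only for illustration; they are not taken from the dissertation or from CMS.

```python
# Illustrative sketch only: names and example amounts are assumptions.
# 2012 values transcribed from Table 2.1, keyed by star rating.
# Ratings from 1 to 2.5 share one column in the table, so they map to key 2.5.
BONUS_2012 = {2.5: 0.0, 3.0: 0.03, 3.5: 0.035, 4.0: 0.04, 4.5: 0.04, 5.0: 0.05}
REBATE_SHARE_2012 = {2.5: 0.667, 3.0: 0.667, 3.5: 0.717,
                     4.0: 0.717, 4.5: 0.733, 5.0: 0.733}


def rebate_mma(bid: float, benchmark: float, share: float = 0.75) -> float:
    """Pre-QBP rebate: a fixed share of the saving below the benchmark, else zero."""
    return share * max(benchmark - bid, 0.0)


def rebate_qbp(bid: float, benchmark: float, stars: float) -> float:
    """QBP rebate in 2012: the benchmark is scaled up by the quality bonus,
    and the share of the saving returned as rebate depends on the star rating."""
    key = max(stars, 2.5)  # collapse ratings below 2.5 into the first table column
    adjusted_benchmark = (1.0 + BONUS_2012[key]) * benchmark
    return REBATE_SHARE_2012[key] * max(adjusted_benchmark - bid, 0.0)
```

For a hypothetical bid of 700 against a benchmark of 800, `rebate_mma(700.0, 800.0)` returns 75.0, while `rebate_qbp(700.0, 800.0, 4.0)` returns about 94.6 and `rebate_qbp(700.0, 800.0, 2.0)` about 66.7. Holding the bid fixed, the sketch shows how the 2012 schedule raises the rebate for a 4-star plan and lowers it for a low-rated plan relative to the flat 75% rule.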
Table 2.1: Bonus and rebates by quality scores for the period 2009-2014

                                Star Rating
Year         1-2.5     3       3.5     4       4.5     5
Benchmark Bonus, (q_j) = 1 + %
2009/11      0.0%     0.0%    0.0%    0.0%    0.0%    0.0%
2012         0.0%     3.0%    3.5%    4.0%    4.0%    5.0%
2013         0.0%     3.0%    3.5%    4.0%    4.0%    5.0%
2014         0.0%     3.0%    3.5%    5.0%    5.0%    5.0%
Rebate Percentage, (q_j)
2009/11      75.0%    75.0%   75.0%   75.0%   75.0%   75.0%
2012         66.7%    66.7%   71.7%   71.7%   73.3%   73.3%
2013         58.3%    58.3%   68.3%   68.3%   71.7%   71.7%
2014         50.0%    50.0%   65.0%   65.0%   70.0%   70.0%

2.3.2 Measures of quality

While more disaggregated quality information was previously available to potential enrollees in the "Medicare and You" handbook (CMS, 2008), 2009 is the first year when performances on multiple domains are aggregated in a single star rating reported to consumers (on a scale of 1 to 5). 10 Previous research identified a modest effect of the quality rating on enrollment in 2009, but no significant effect in 2010 (Darden and McCarthy, 2015).

9 For example, contracts eligible for a 5% benchmark bonus receive a 10% bonus in these counties. Extra bonus is awarded to counties based on a number of criteria, including population size, MA penetration rate, and FFS costs relative to the national average. Layton and Ryan (2015) find that extra bonus counties (greater than 80%) are not associated with higher quality, but with a greater number of plans offered.

10 This initiative was supported by theoretical analysis showing that a functioning quality reporting system is as important as risk adjustment in correcting market inefficiencies (Glazer and McGuire, 2006).

The weaker demand effect in later years is partly due to the supply side pricing response (McCarthy and Darden, 2017), a mechanism we show has intensified when payment is later linked to quality, or due to consumer inattention to quality information after the policy phase-in.
11 The star rating summarizes overall plan performance across eight domains, five concerning Part C coverage and three concerning prescription drug coverage. To calculate the domain rating and the final star rating, plans receive scores on specific quality measures within domains. Measure scores are based on performance data from a number of CMS administrative datasets. 12 Depending on the percentile rank, each measure is assigned a star rating. The measure ratings are then averaged to generate the final rating, or within domain to generate the domain rating. The composition of measures in the final rating changed from year to year. Continuing measures are subject to higher quality standards. 13 Broadly speaking, in 2011-2014, quality measures are divided into the following eight domains (the data source is in parentheses):

1. Staying Healthy: screening, vaccine, BMI (HEDIS), and self-reported physical and mental health (HOS);
2. Managing Chronic Conditions: percent of enrollees diagnosed with diabetes, high blood sugar and cholesterol, etc., who have the condition controlled (HEDIS);
3. Plan Responsiveness: ease of getting needed care, setting up appointments, etc. (CAHPS);
4. Consumer Complaint: rate of complaints received, number of enrollees leaving the plan and reported difficulty in care access (CTM);
5. Timely Service: call center availability, timely response and satisfactory resolution on consumer appeal (IRE);
6. Part D Timely Service: call center availability, timely response and satisfactory resolution on consumer appeal (IRE) and timely enrollment in drug plan (MARx);
7. Part D Experience: ease of getting information on drug coverage and the needed drug from the plan, and member rating of the plan (CAHPS);
8. Part D Safety and Adherence: percent choosing a high-risk drug instead of a safer option and percent taking drugs as directed (PDE).

11 For example, a 2011 poll by Kaiser Permanente shows that almost 60% of Medicare eligible seniors are unaware of the 5 Star Ratings (Harris Interactive, 2011).

12 Sources of performance data include the Healthcare Effectiveness Data and Information Set (HEDIS), the Consumer Assessment of Healthcare Providers and Systems (CAHPS), the Health Outcomes Survey (HOS), the Complaints Tracking Module (CTM), the Independent Review Entity (IRE), the Medicare Beneficiary Database Suite of Systems (MBDSS), the Call Center, the Medicare Advantage and Prescription Drug System (MARx), and the Prescription Drug Event (PDE), among others. Most measures reflect previous year plan quality, with most of the health outcome measures further lagged by two years. For a detailed list of measures in the 2013 rating, the source of each measure and the time frame of measurement, see Appendix Tables A.1.12 and A.1.13.

13 For example, the final rating in 2011 is based on 51 measure stars, 50 measure stars in 2012, and 49 measure stars in 2013. The measure "access to primary care doctor", in particular, is dropped in 2013 because nearly all MA plans meet high quality status (above 85% for 4.0 star and 95% for 5.0 star), and the measure "plan all-cause (30-day) re-admission" revised the threshold for a 5.0 rating from below 5% in 2012 to below 3% in 2013; only a handful of local coordinated care plans (CCP) with very small enrollments ever obtained a 5.0 rating on this measure, and average readmission is 15% for MA enrollees (or 20% in FFS).

Measure stars are aggregated to the final rating through a weighting procedure that assigns higher weights to outcome measures and penalizes high variance across measures. 14 Possible weights are 1.0, 1.5 and 3.0. Measures of patient satisfaction and access typically receive the 1.5 weight, and measures of medical process, such as screening, testing and vaccination, receive the 1.0 weight. Patient outcome measures, on the other hand, receive the highest weight (3.0), and are important predictors of the final rating.
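To see why the 3.0 weight makes outcome measures influential, consider a minimal sketch of a weighted average of measure stars. It is a simplification, not the CMS procedure: it omits the variance penalty mentioned above and any rounding to half-stars, and the three measures and their star values are hypothetical.

```python
def weighted_star_rating(measures):
    """Weighted average of measure-level stars.

    `measures` is a list of (stars, weight) pairs, with weights of
    1.0 (process), 1.5 (satisfaction/access) or 3.0 (outcome).
    Simplified: no variance adjustment and no rounding to half-stars.
    """
    total_weight = sum(weight for _, weight in measures)
    if total_weight == 0:
        raise ValueError("at least one measure with positive weight is required")
    return sum(stars * weight for stars, weight in measures) / total_weight


if __name__ == "__main__":
    # Hypothetical contract: one measure of each type
    measures = [
        (3.0, 1.0),  # process measure, e.g. screening
        (4.0, 1.5),  # patient access measure
        (5.0, 3.0),  # outcome measure
    ]
    unweighted = sum(stars for stars, _ in measures) / len(measures)
    print("unweighted mean:", unweighted)
    print("weighted mean:  ", round(weighted_star_rating(measures), 2))
```

In this example the weighted rating exceeds the unweighted mean because the single outcome measure carries more than half of the total weight.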
15 Due to the data collection effort, the quality rating for enrollment period $t$, released in the fall of $t-1$, is based on performance measured over $t-2$, especially for outcome measures in "staying healthy" and "managing chronic conditions". These measures capture the improvement in health conditions relative to a baseline. 16 Intended as a measure of quality, patient outcome is biased due to differences in baseline risk. While outcomes for low-risk enrollees with milder, more manageable conditions tend to speak to high quality of care, high-risk enrollees with more complicated conditions may see smaller improvement in outcomes despite the high quality care applied in the treatment.

For measures of quality in a different context, consider the case-mix index (CMI) intensely applied in the payment adjustment to hospital discharges. Patients are categorized by their "severity-adjusted diagnosis-related group (DRG)", which considers up to eight co-morbidities in addition to the principal diagnosis, up to six procedures performed in hospital, and adjusts for socio-demographic factors such as age and sex. The case-mix adjustment for baseline severity is nearly non-existent in clinical outcome measures in the MA quality rating. 17

14 The weighting procedure started in 2012, although nearly all measures are already present in 2011.

15 For instance, in 2013 only outcome measures received the 3.0 weight. These measures are "improving physical health" and "improving mental health" in the "staying healthy" domain, management of blood sugar, blood pressure, and cholesterol, and the all-cause re-admission measure in the "managing chronic conditions" domain, and all the drug safety and adherence measures in the "Part D safety and adherence" domain.

16 For example, measures of "maintaining and improving physical (mental) health" in the "staying healthy" domain come from respondent self-reports in the Health Outcomes Survey (HOS), adjusted by socio-demographic factors. Outcome improvement measures in the "managing chronic conditions" domain come from clinical records in the Healthcare Effectiveness Data and Information Set (HEDIS), unadjusted by baseline case-mix or socio-demographic factors.

2.4 Data Summary

Our data come from the administrative CMS registry of all MA-PD plans offered over 2009-2014 (the Landscape File). The data contain detailed insurer (or contract) information, such as quality ratings, and within-insurer plan availability across service areas (counties). Moreover, within each county, plan characteristics, such as premium and drug deductibles, are observed. A separate enrollment file contains plan-county-month enrollment counts, which have been aggregated into plan-county-year counts.

Because the quality rating varies at the level of contracts, to understand how quality-linked payments affect pricing and product design, we aggregate plan-county characteristics to the contract level, weighted by enrollment shares. Plans in counties with fewer than 10 enrollees are dropped, since CMS masks the exact enrollment count in these cases. We further restrict contracts to those with at least a 3.0 rating in the baseline (2009-2010). Because contracts failing to obtain a 3.0 or above rating for three consecutive years are subject to suspension, low performing contracts face an additional incentive for risk selection not generalizable to higher performing contracts.

Table 2.2 summarizes key contract-year statistics in Panel A for contracts with non-missing quality rating in the previous year. 18 The treated group is defined as high quality contracts with at least a 4.0 rating in both 2009 and 2010, and the control group as low quality contracts with at most a 3.5 (but no less than 3.0) rating in 2009-2010.
Columns (1)-(2) pool over both treated and control contracts: an average MA-PD contract has 3 plans serving over 25 counties. Baseline high quality contracts are more likely to remain high quality (star ≥ 4.0) in the sample period, bid closer to the benchmark, and receive smaller rebates. 19 They are also more likely to charge higher premium, and less likely to offer zero-premium plans. Differences in drug deductible, on the other hand, are small and not significant.

17 Girotti et al. (2013) present a case study of how adjusting for complication severity can meaningfully alter the quality ranking in the context of vascular surgeries.

18 For the main analysis we restrict attention to continuing contracts, since the bonus payment for enrollment year $t$ is determined by the year $t-1$ quality rating. New contracts, or contracts with missing previous rating, are eligible for bonus payments according to a different rule. In robustness analysis we examine how the entry of new contracts may respond to region characteristics (FFS risk and benchmark rate, for example) after QBP. We do not find a differential entry response at the contract level.

19 Standard errors are clustered at the level of contract linked over time.

Table 2.2: Summary statistics

                         (I)       (II)     (III)     (IV)     (V)       (VI)     (VII)
                         Full Sample        Low Quality        High Quality       (V)-(III)
                         Mean      S.E.     Mean      S.E.     Mean      S.E.     p-value
Panel A: contract-year observations
Risk Score               0.97      0.0075   0.97      0.0093   0.96      0.012    0.55
Star ≥ 4.0 (%)           0.35      0.025    0.17      0.018    0.75      0.032    0.00
Star Score               3.55      0.031    3.35      0.027    3.99      0.041    0.00
# County                 25.09     5.40     28.19     7.74     18.18     2.21     0.22
# Plan                   3.40      0.23     3.53      0.31     3.12      0.28     0.33
Enrollment (k)           334.75    34.95    328.35    39.19    349.06    71.56    0.80
Benchmark                874.10    5.72     883.08    6.52     854.06    10.87    0.023
Bid                      763.38    6.28     763.65    7.58     762.80    11.25    0.95
Benchmark-Bid            110.72    5.55     119.43    6.90     91.27     8.68     0.012
Rebate                   78.37     3.73     83.55     4.68     66.80     5.74     0.025
Premium                  49.07     3.38     35.25     3.59     79.93     5.70     0.00
Zero Premium (%)         0.41      0.029    0.51      0.035    0.19      0.035    0.00
Drug Deduc               32.62     4.42     32.85     5.72     32.11     6.40     0.93
Zero Drug Deduc (%)      0.84      0.019    0.85      0.024    0.83      0.031    0.68
N                        1,122              775                347
Panel B: contract-year-location observations
Enrollment (k)           18.25     2.35     17.00     2.48     21.57     4.64     0.35
# Plan                   1.76      0.073    1.59      0.088    2.22      0.093    0.00
Premium                  52.69     3.78     42.93     4.21     78.55     6.71     0.00
Zero Premium (%)         0.33      0.036    0.39      0.047    0.16      0.039    0.00
Drug Deduc               28.65     5.88     30.05     7.69     24.92     6.23     0.60
Zero Drug Deduc (%)      0.85      0.030    0.84      0.040    0.87      0.031    0.44
N                        20,472             14,861             5,611

Notes: Table shows summary statistics for the full sample (columns 1-2), the treated (baseline high quality, columns 5-6) contracts and the control (baseline low quality, columns 3-4) contracts. Plan characteristics are aggregated to the contract-year level in Panel A, and to the contract-year-county level in Panel B, both weighted by enrollment. Standard errors are clustered at the level of contracts in Panel A, and two-way clustered at the level of contract and county in Panel B. Details are in the text.

Panel B shows more disaggregated variation at the level of contract, year, and location (county). Because contracts can design plan characteristics differentially across service areas, the within-contract cross-location variation is one margin of selection overlooked in cross-contract comparisons. We cluster standard errors two-way at the level of contract and county in Panel B.
We therefore allow a given contract to be arbitrarily correlated over time within county, and a given county arbitrarily correlated over time within contracts. As in Panel A, high quality contracts charge higher premium and somewhat lower drug deductible, and the difference in deductible is not statistically significant.

2.5 Contract-Level Evidence

We start the empirical analysis with a contract-level difference-in-differences model. Baseline high quality contracts experience higher bonus payments after the reform, and they form the treated group. Because the ACA was signed into law in April 2010, and MA contracts do not submit bid and benefit design for 2011 enrollment until June 2010, insurer response to quality payment incentives may already be detectable in 2011 contract design and enrollment outcomes. We hence define the variable $post = 1$ for year 2011 and after, and inspect the timing of the effect in detail in event studies below. The difference-in-differences model is given by

$$y_{ct} = \beta\,(\mathit{high}_c \times \mathit{post}_t) + \alpha_c + \tau_t + \epsilon_{ct},$$

where we compare contracts ($c$) with differential baseline quality ratings over time ($t$). We assume that the trends of high and low quality contracts would be parallel absent the policy. To sharpen the identification of trends, we include dummies for longitudinal contract id's ($\alpha_c$), and use within-contract variation over time to isolate the effect of bonus payments. The contract fixed effects importantly sweep out baseline heterogeneity across contracts, such as differences in service area, enrollee characteristics and provider networks, among others. However, to the extent that confounding factors may vary around the same time as the reform, the difference-in-differences estimate will be biased. The absence of time-varying confounds is not directly testable. As with most difference-in-differences analyses, we rely on visual inspection of parallel trends before the reform to assess the validity of the model.
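As an illustration of what the interaction coefficient measures, the sketch below computes the simple two-by-two difference-in-differences of group means. In a balanced panel with a single reform date, this quantity coincides with the coefficient on the high-by-post interaction in a regression with contract and year fixed effects; the actual estimation sample need not be balanced and uses clustered standard errors, which this sketch ignores. All data values and names here are hypothetical.

```python
from statistics import mean


def did_estimate(rows):
    """Two-by-two difference-in-differences of cell means.

    `rows` is an iterable of (high, post, y) tuples, where `high` and `post`
    are 0/1 indicators and `y` is the outcome (e.g. contract risk score).
    """
    cells = {(h, p): [] for h in (0, 1) for p in (0, 1)}
    for high, post, y in rows:
        cells[(high, post)].append(y)
    m = {key: mean(values) for key, values in cells.items()}
    change_high = m[(1, 1)] - m[(1, 0)]  # change over time, treated group
    change_low = m[(0, 1)] - m[(0, 0)]   # change over time, control group
    return change_high - change_low


if __name__ == "__main__":
    # Hypothetical risk scores: two high and two low quality contract-years per period
    rows = [
        (1, 0, 0.98), (1, 0, 0.96), (1, 1, 0.94), (1, 1, 0.92),
        (0, 0, 0.97), (0, 0, 0.97), (0, 1, 0.97), (0, 1, 0.96),
    ]
    print("DiD estimate:", round(did_estimate(rows), 3))
```

A negative estimate means the outcome fell more for the high quality group than for the low quality group after the reform date.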
We then apply the model to study the effect of bonus payment on risk score (Section 2.5.1), market characteristics (Section 2.5.2) and pricing (Section 2.5.3).

2.5.1 Risk Score

Table 2.3 shows the effect on risk score following the reform in 2011. We show the robustness of the result to different controls and levels of analysis. Columns (1)-(2) measure risk score aggregated at the contract level using plan enrollment weights. Columns (3)-(4) measure risk score as the unweighted average of plan risk scores. Columns (5)-(6) measure risk score at the raw plan level. For each measure, we study the robustness of results with and without contract or plan level fixed effects.

Table 2.3: Effect of QBP on the risk score

                 (I)         (II)        (III)       (IV)        (V)         (VI)
High × Post      -0.026***   -0.041***   -0.035***   -0.042***   -0.020***   -0.045***
                 (0.0082)    (0.014)     (0.012)     (0.015)     (0.0074)    (0.015)
Weights          plan enrollment         equal weights           unweighted
y mean           0.97        0.97        0.97        0.97        0.96        0.96
Fixed Effects    contract                contract                plan
R2               0.86        0.0068      0.76        0.012       0.79        0.0089
N                1,122       1,127       1,122       1,122       4,549       4,549

*** p < 0.01, ** p < 0.05, * p < 0.10

Notes: Table shows difference-in-differences estimates on risk score, aggregated at the contract level using enrollment weights in columns (1)-(2), using equal weights in columns (3)-(4), and measured at the raw plan level in columns (5)-(6). For each measure we show results with and without fixed effects. Standard errors clustered at the contract level (columns 1-4) or plan level (columns 5-6) in parentheses.

We find significant improvement in the risk pool of higher quality contracts, relative to the control. The preferred specification in column (1) shows a 2.6 percentage point reduction in risk score by high quality contracts, and the effect is similar if not larger under alternative measures. Figure 2.1 examines the validity of the difference-in-differences design and the timing of the effect.
Before the proposal of quality bonus payments became law in 2010, risk scores for both treatment and control groups stay parallel. The trend departs visibly in 2011: for the first time high quality contracts have lower risk score than low quality contracts, and the gap widens in later years. Similarly, in the event study, although the effect in 2011 is not statistically significant, the effect becomes stronger and more significant over time.

Figure 2.1: Effect on risk score, event study. Panel (a): Raw Trend (risk score by year, 2009-2014, low quality vs. high quality contracts, with ACA and QBP dates marked). Panel (b): Event Study (estimates by year, 2009-2014, with ACA and QBP dates marked).

Notes: The left panel shows the raw trend of risk score for baseline high and low quality contracts. The right panel shows the event study estimates of the difference-in-differences model, controlling for contract and year fixed effects, with 95% confidence intervals based on robust standard errors clustered at the level of contract. Risk score is aggregated at the contract level weighted by plan enrollment.

We further look at the changes in the distribution of risk scores. Figure 2.2 plots the kernel density of risk scores separately for high and low quality contracts, both before (2009-2010) and after (2011-2014) the QBP. 20 Compared to the mean effects shown in Table 2.3, the distributional effect clearly suggests that high quality contracts decreased the share of median-to-high risk enrollees, but enrolled more low-risk individuals after the reform; at the same time, the risk score distribution for low-quality contracts is largely unchanged.

20 We further show the year-disaggregated density plot in Appendix Figure A.2.1, and the plan-level risk score distribution in Appendix Figure A.2.2.

Figure 2.3 plots the distributional effect for each decile of risk score, based on quantile difference-in-differences estimates in Panel (a) and changes-in-changes estimates (Athey and Imbens, 2006) in Panel (b).
Both sets of estimates suggest that the reduction is concentrated between the 20% and 40% quantiles, where risk score decreased by 0.06 for high quality contracts.

Figure 2.2: Effect of the QBP reform on risk score, kernel density. Panel (a): Low-Quality Contracts. Panel (b): High-Quality Contracts. (Horizontal axis: risk score; series: 2009-2010 and 2011-2014.)

Notes: Graph plots the kernel density of risk scores for low-quality contracts (left) and high-quality contracts (right). In each case, the solid line plots the density in the pre (2009-2010) period. The dotted line plots the density in the post (2011-2014) period.

We then turn to the potential mechanism of risk selection by high quality contracts. Although the risk pool on average improves, contracts facing different market structures and enrollee bases may differ in their ability to risk-select. If the improvement in the risk pool is concentrated among the set of contracts with certain characteristics, then the heterogeneous effect is indicative of the mechanism of risk selection. We consider two dimensions of heterogeneity based on baseline characteristics: risk composition across service areas, and market competition measured by the Herfindahl-Hirschman index (HHI).

To proceed, we first define the market set for each contract as the union of counties served in the baseline 2009-2010. For each county in the market set, we attach the average FFS risk score over the baseline as a measure of potential gain from risk selection in this county: a lower FFS risk score implies that new enrollments into private Medicare likely have lower risk, making the county more advantageous for risk selection.
We then average over counties to derive a risk selection measure at the contract level. The median across contracts is 0.99, and the 15th (85th) percentile is 0.90 (1.07). We hence posit that high quality contracts with service area risk score below the median are more likely to succeed in risk selection, relative to low quality contracts with service area risk score above the median. Furthermore, we expect to see even stronger effects at the 15% tails, where we compare high quality contracts below the 15th percentile of service area risk with low quality contracts above the 85th percentile. 21

Columns (1)-(2) in Table 2.4 show a significant reduction in the risk score of high quality contracts more advantageous for risk selection: those in the lower 15% of the service area risk distribution decreased risk score by 4.5 percentage points, relative to low quality contracts in the upper 15% of the distribution. Figure A.2.3 shows that the effect is visible starting in year 2011, and becomes stronger and more significant in later years.

Alternatively, one might believe contracts with greater market power are better able to risk select. To derive a measure of market power at the contract level, we first compute the HHI for each county over the baseline, and then average over the market set for contracts. The median HHI in the baseline is 0.44, and the 15th (85th) percentile is 0.31 (0.61). Columns (3)-(4) in Table 2.4 show the same effect by market competition. In the full sample, baseline high quality contracts with high market power reduced their risk score by 1.8 percentage points relative to low quality contracts with low market power, although the effect is not significant. The result is more tenuous when comparing the 15% tails. Figure A.2.4 shows the corresponding raw trends and event study estimates. 22

Overall, unlike service area risk score, there is no clear evidence that market competition has a strong bearing on risk selection in this context. Therefore, in the within-contract analysis in Section 2.6, we utilize the risk score variation across service areas to detect any differential pricing response that may have contributed to changes in the risk pool.

21 Bauhoff (2012) conducted a field experiment to test supply-side selection among private insurers in Germany. He finds that plans are less likely to follow up on applications from high-risk regions.

Figure 2.3: Effect of the QBP reform on risk score, distributional effect. Panel (a): Differences-in-Differences. Panel (b): Changes-in-Changes. (Horizontal axis: quantile (%), 10 to 90.)

Notes: Graph plots the quantile difference-in-differences estimates for each decile of risk score on the left, and changes-in-changes estimates on the right. The deciles are compared across four groups: high vs. low quality, before and after the QBP. 95% confidence intervals are plotted.

2.5.2 Market Characteristics

A possible explanation for the increase in risk selection in the previous section concerns a change in the characteristics of the service area served by high and low quality contracts following the reform. For example, high quality contracts may have expanded their service areas to include more counties with low risk scores, or may have increased the number of plans offered in these counties. The exact mechanism of risk selection has implications for the empirical strategy suitable for the analysis.
In particular, if characteristics of the service area responded endogenously to bonus payment incentives, then strategies exploiting the within-contract cross-location differences in risk may yield biased estimates of the selection response of insurers.

22 Comparing high quality contracts with HHI below the 15th percentile with low quality contracts with HHI above the 85th percentile also renders an insignificant estimate.

Table 2.4: Effect of QBP on risk score, by service area risk and competition

                 (I)          (II)         (III)        (IV)
Treat × Post     -0.036***    -0.045**     -0.018       0.0094
                 (0.011)      (0.019)      (0.012)      (0.018)
Treat            low risk, high quality    high HHI, high quality
Control          high risk, low quality    low HHI, low quality
Sample           +/-median    15% tails    +/-median    15% tails
y mean           1.00         1.02         0.98         0.99
R2               0.88         0.90         0.85         0.89
N                534          211          506          191

*** p < 0.01, ** p < 0.05, * p < 0.10

Notes: Table shows difference-in-differences estimates on risk score, aggregated at the contract level using enrollment weights. In column (1), the treated group is baseline high quality contracts with service area risk below the median, and the control group is baseline low quality contracts with service area risk above the median. In column (3), the treated group is baseline high quality contracts with HHI above the median, and the control group is baseline low quality contracts with HHI below the median. Columns (2) and (4) further limit the sample to the 15% tails. Construction of the baseline characteristics measures is described in the main text. All regressions include contract level fixed effects. Standard errors clustered at the contract level in parentheses.

To detect changes in market characteristics attributable to changes in the market set, we replace yearly county characteristics with values in 2012, and then average over the market set for contract-year observations. The resulting variable captures the effect of market composition on contract characteristics, rather than the temporal variation in these characteristics.
For example, if the service area FFS risk score improved, then there is reason to believe contracts may have entered low risk counties or exited high risk counties, depending on the change in market size.

Table 2.5 shows little change in market set characteristics. There is some evidence of high quality contracts expanding their market set over time, but the effect is not significant. The number of plans offered also did not change (column 5). Importantly, service area risk (column 2) barely changed for high quality contracts after the reform, suggesting contracts did not differentially enter or exit counties based on the baseline risk. While counties with low FFS risk score are more advantageous for selecting low cost enrollees, there is no evidence of extensive margin selection whereby high quality contracts expand their market set to include more of these low risk counties.

In addition, since the QBP also varied the county benchmark rate by quality rating, we check if contracts differentially select into counties with a higher ACA benchmark (column 3) unadjusted by quality, or double-bonus counties (column 4) under the QBP where the benchmark bonus to 5-star contracts is over 8%, or more than a 60% top-off. We see no evidence of differential selection by high quality contracts along these margins.

Table 2.5: Effect of QBP on market characteristics

                 (I)          (II)       (III)        (IV)          (V)
                 # Counties   Risk       Benchmark    High-Bonus    # Plans
                                                      County
High × Post      8.70         0.0024     1.80         -0.020        -0.17
                 (8.39)       (0.0024)   (2.94)       (0.021)       (0.23)
y mean           25.09        0.99       799.15       0.72          3.40
R2               0.73         0.98       0.96         0.90          0.87
N                1,122        1,122      1,122        1,122         1,122

*** p < 0.01, ** p < 0.05, * p < 0.10

Notes: Table shows difference-in-differences estimates on market size and characteristics. Outcome is at the contract-year level.
Numbers of counties and plans are counted within contract-year. Risk and benchmark are contract-year averages of 2012 characteristics over the market set, and hence reflect differential county entry or exit by these characteristics. If a county later on receives an above 8% bonus benchmark for 5-star contracts, it is assigned the high-bonus status. Column (4) looks at whether contracts cover more of these counties after the reform. All regressions include contract level fixed effects. Standard errors clustered at the contract level in parentheses.

2.5.3 Bid, Rebate, and Pricing

Since market characteristics did not change substantially to explain the risk selection result, we now turn to assess how pricing responded to quality bonus payments. Under the law, higher quality contracts face a higher benchmark. In principle, contracts can increase the bid to receive higher payments without reducing the rebate to enrollees. Furthermore, the rebate bonus to quality allows the bid to increase more than the benchmark for a fixed amount of rebate. Of course, rebates need not stay constant if part of the bonus payment is passed on to enrollees in the form of lower premium or cost-sharing.

Table 2.6 studies the effect of QBP on bids and rebates. As a result of the bonuses introduced by QBP, benchmarks for high quality contracts increased by $27.84 (cf. Table 2.1). In response, high quality contracts raised their bids by $37.01, resulting in a net narrowing of the benchmark-bid gap by $9.17. However, adjusting for the rebate bonus in the QBP, the final rebate accruing to enrollees did not change significantly. At the contract level, enrollees in high quality contracts received $0.39 more in rebate, but this effect is not significant.
23

Table 2.6: Effect of QBP on bidding and rebate

                 (I)          (II)        (III)            (IV)
                 Benchmark    Bid         Benchmark-Bid    Rebate
High × Post      27.84***     37.01***    -9.17            0.39
                 (7.10)       (7.50)      (6.07)           (3.68)
y mean           874.10       763.38      110.72           78.37
R2               0.83         0.84        0.83             0.87
N                1,122        1,122       1,122            1,122

*** p < 0.01, ** p < 0.05, * p < 0.10

Notes: Table shows difference-in-differences estimates on benchmark, bid and rebate. Outcome is at the contract-year level. We aggregate plan level benchmark (an enrollment-weighted average of county benchmarks, higher for higher quality contracts after QBP), bid, and rebate (inclusive of rebate bonus after QBP) to the contract level using enrollment weights. All regressions include contract level fixed effects. Standard errors clustered at the contract level in parentheses.

Absent large changes in the rebate, changes in premium and cost-sharing are also small. Table 2.7 shows null effects on premium charged for the Part C coverage, prescription drug coverage (Part D), and total premium (column 1). The effect on drug deductible is also small. Specifically, since the vast majority of contracts have zero deductible, we show that the mass of zero deductible plans did not change after the reform. Average drug deductible decreased by half. On the raw trend, however, the difference appears to be driven by an early increase in deductible by low quality contracts in 2011: starting in 2012 both high and low quality contracts increased deductible at a similar rate (Figure A.2.5). The effect, moreover, is only marginally significant.

Therefore, aggregated at the level of contracts, we do not find significant pricing variation consistent with the risk selection effect. In particular, high quality contracts did not increase or decrease premium or deductibles of prescription drugs, and the rebate to enrollees is unchanged.

The contract-level comparison, however, does not capture the within-contract cross-location pricing and benefit design.
The within-contract selection can potentially alter the risk pool composition without revealing any marked changes in contract-level pricing, if, for example, price increases in higher risk areas are offset by decreases in lower risk areas. To further probe this possibility, in the analysis below, we investigate how pricing varies within contract across counties above and below the median risk county in the market set.

23 Since the QBP reform is essentially a supply-side shock directly affecting revenues rather than marginal costs, the estimates in Table 2.6 suggest that bids do not only depend on marginal costs (Song et al., 2012, 2013), a finding highlighted in Curto et al. (2014) as well.

Table 2.7: Effect of QBP on premium and drug deductible

                 (I)        (II)            (III)         (IV)
                 Premium    Zero Premium    Drug Deduc    Zero Deduc
High × Post      3.14       0.032           -16.98*       0.051
                 (3.56)     (0.025)         (8.98)        (0.045)
y mean           49.07      0.41            32.62         0.84
R2               0.91       0.88            0.69          0.63
N                1,122      1,122           1,122         1,122

*** p < 0.01, ** p < 0.05, * p < 0.10

Notes: Table shows difference-in-differences estimates on premium and drug deductible. Outcome is at the contract-year level. We aggregate plan level data to the contract level using enrollment weights. All regressions include contract level fixed effects. Standard errors clustered at the contract level in parentheses.

2.6 Within-Contract Cross-County Evidence

This section investigates the nature of the risk selection that emerged in the previous section by looking inside the market set. We examine if contracts are more likely to deploy high-premium, high-deductible plans in service areas less advantageous for risk selection. For identification, we assume that absent the policy, the distribution of insurance pricing across risk regions is parallel for both high and low-quality contracts.
As with most difference-in-differences analyses, we assess the plausibility of the identifying assumption by inspecting the pre-trend commonality for high and low risk regions within and between high and low-quality contracts.

To measure average prices at the contract-year-location level, we weight plan premiums and deductibles by enrollment. To measure a county’s relative standing in the overall risk composition across service areas, we measure the distance of county risk to the median risk in the market set, and use the distance to median as the driving variation in the within-contract analysis. 24

24 In particular, we use baseline market set characteristics in 2009-2010 to derive the distance measure. We rank all counties served by the same contract in the baseline by their baseline FFS risk score. Comparing counties with the median county gives the distance-to-median measure. Note that this measure is fixed for a given contract-location pair, and does not vary over time.

We estimate the following triple-difference design

$$ y_{clt} = \beta_0\, \mathrm{risk}_{cl} \times \mathrm{high}_c \times \mathrm{post}_t + \beta_1\, \mathrm{risk}_{cl} \times \mathrm{post}_t + \beta_2\, \mathrm{high}_c \times \mathrm{post}_t + X_{lt}\gamma + \alpha_{cl} + \delta_t + \varepsilon_{clt}, $$

where the unit of observation is at the level of contract $c$, location (county) $l$, and year $t$. $\mathrm{risk}_{cl}$ is the distance-to-median measure. We include year indicators $\delta_t$ to control for common temporal shocks, and contract-county indicators $\alpha_{cl}$ to absorb unobserved heterogeneity across contracts and service areas, and the baseline selection between the two. The fixed effects would not be adequate to address a time-varying selection response if contracts were shown to enter or exit service areas based on local risk factors after the reform. This is not the case, however, as discussed in Section 2.5.2. In addition, we control for time-varying location-specific factors in $X_{lt}$.
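As a rough illustration of how the variables entering the triple-difference specification above could be assembled, the sketch below computes an enrollment-weighted premium for one contract-county-year cell, the time-invariant distance-to-median risk measure, and the three policy interactions. The function names, county labels, and numbers are hypothetical and are not taken from the data used in this chapter; the fixed effects, the controls, and the regression itself are omitted.

```python
from statistics import median


def weighted_mean(values, weights):
    """Enrollment-weighted average, e.g. of plan premiums in one contract-county-year."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total


def distance_to_median(baseline_risk):
    """Baseline county risk minus the median risk in the contract's market set.

    The measure is computed once from baseline data, so it is fixed for each
    contract-county pair and does not vary over time.
    """
    med = median(baseline_risk.values())
    return {county: risk - med for county, risk in baseline_risk.items()}


def policy_regressors(risk, high, post):
    """The three interaction terms of the triple-difference specification."""
    return {
        "risk_x_high_x_post": risk * high * post,
        "risk_x_post": risk * post,
        "high_x_post": high * post,
    }


# Hypothetical baseline market set of one high-quality contract.
baseline_risk = {"county_a": 0.90, "county_b": 1.00, "county_c": 1.15}
dist = distance_to_median(baseline_risk)

# Two hypothetical plans offered in county_c in a post-reform year:
# premiums of 40 and 100 with 300 and 100 enrollees, respectively.
premium_c = weighted_mean(values=[40.0, 100.0], weights=[300, 100])
row_c = policy_regressors(risk=dist["county_c"], high=1, post=1)
```

Because the distance measure is built from baseline data only, post-reform enrollment shifts change the weighted premium but not the risk regressor.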
Most notably, since the raw county benchmark is time-varying, and since QBP increased the bonus benchmark for some counties (commonly known as “double-bonus counties”), the set of which is also time-varying, we include the raw benchmark, the bonus payment rate, and their interaction. 25 We hence allow for a separate pricing response to benchmark variation in local markets.

25 The interaction measures the maximum benchmark faced by 5-star contracts serving the county in a given year.

We cluster standard errors two-way at the level of county and contract: observations of the same contract in different counties may be correlated, and so may observations of the same county in different contracts’ service areas, while observations that share neither a contract nor a county are assumed to be independent. Clustering at the intersection of county and contract gives similar standard errors.

Table 2.8 displays the premium response to QBP across risk regions for baseline low quality (column 1) and high quality (column 2) contracts, and the differential response by high quality contracts (column 3). Low quality contracts increased premium in lower risk counties, whereas high quality contracts increased premium in higher risk counties. Both effects are visually perceptible on the raw trend (Figure A.2.6, Panel a), where the market set of each contract is divided into high (above median) and low (below median) risk regions. Hence the results are not driven by the parametric assumption that effects are linear in the deviation from median. The triple-difference estimate suggests that high quality contracts increased premium by $0.31 per one percentage point increase in risk above the median. The event study shows parallel pre-trends: high and low-quality contracts charged premium similarly across risk regions. After the passage of QBP, high quality contracts significantly increased premium in higher risk counties, whereas the response by low-quality contracts is small and not significant in most years (Figure A.2.6, Panel b). The same pattern holds when we only include counties in the lower or upper 15% of the risk distribution within a contract’s market set (columns 4-6).

Table 2.8: Effect of QBP on premium, within-contract cross-county variation

                      (I)         (II)        (III)       (IV)        (V)         (VI)
Risk × High × Post                            30.51**                             35.61**
                                              (12.54)                             (14.26)
Risk × Post           –14.99*     20.99*      –14.75*     –13.49      21.64       –14.66*
                      (8.73)      (11.20)     (8.66)      (8.87)      (13.37)     (8.60)
High × Post                                   –1.25                               –1.16
                                              (4.86)                              (4.83)
Counties              all         all         all         15% tails   15% tails   15% tails
Sample                low         high        full        low         high        full
y mean                42.93       78.55       52.69       42.44       75.47       51.42
R²                    0.85        0.85        0.87        0.85        0.86        0.87
N                     14,861      5,611       20,472      4,393       1,641       6,034

*** p < 0.01, ** p < 0.05, * p < 0.10
Notes: Table shows the within-contract premium variation across risk regions. Columns 1-2 show the difference-in-differences estimates on premium across risk regions for baseline low-quality (column 1) and high quality (column 2) contracts. Column 3 shows the triple-difference estimate, which gives the differential pricing response by high quality contracts in higher risk counties. Columns 4-6 repeat the analysis, but only include counties in the lower and upper 15% of the risk distribution within a contract’s market set. All regressions include contract-county fixed effects. Standard errors, clustered two-way at the contract and county level, are in parentheses.

We further investigate the impact of quality payments on the Part C and Part D premiums. While we do not find significant pricing variation for Part C (see Table A.1.1 in Appendix), for Part D, we do find that high-quality contracts significantly increased premium in higher risk counties relative to low-quality contracts (see Table A.1.2 in Appendix). Specifically, the premium increased by $1.24 per ten percentage point increase in county risk. 26 We then investigate any differential pricing in drug deductible.
The contract-level analysis suggests that both low and high quality contracts increased deductibles after QBP (see Table 2.7 and Figure A.2.5, Panel c). Looking within contract across risk regions, Table 2.9 shows that both types of contracts raised deductibles more in regions less advantageous for risk selection, and that this expansion is not significantly larger for high quality contracts (see also the raw trend in Figure A.2.7). We similarly do not detect any significant pricing differential by quality when looking at the 15% tails of the market set.

26 The effect becomes stronger once we focus on the extreme tails of the risk distribution: in columns 4-6, we only include counties with risk score in the lower and upper 15% tails of a contract’s service area. Consistent with the hypothesis, the Part D premium increased more ($1.70 per ten percentage point increase in risk score) in these counties.

Table 2.9: Effect of QBP on drug deductible, within-contract cross-county variation

                      (I)         (II)        (III)       (IV)        (V)         (VI)
Risk × High × Post                            –7.06                               –22.48
                                              (46.09)                             (52.21)
Risk × Post           36.54*      48.45       39.25**     33.62**     29.33       35.50**
                      (18.78)     (46.03)     (19.46)     (16.87)     (51.01)     (17.26)
High × Post                                   –12.10                              –13.42
                                              (10.02)                             (9.52)
Counties              all         all         all         15% tails   15% tails   15% tails
Sample                low         high        full        low         high        full
y mean                30.05       24.92       28.65       28.62       24.91       27.61
R²                    0.70        0.61        0.67        0.68        0.67        0.68
N                     14,861      5,611       20,472      4,393       1,641       6,034

*** p < 0.01, ** p < 0.05, * p < 0.10
Notes: Table shows the within-contract variation in drug deductible across risk regions. Columns 1-2 show the difference-in-differences estimates on drug deductible across risk regions for baseline low-quality (column 1) and high quality (column 2) contracts. Column 3 shows the triple-difference estimate, which gives the differential pricing response by high quality contracts in higher risk counties. Columns 4-6 repeat the analysis, but only include counties in the lower and upper 15% of the risk distribution within a contract’s market set. All regressions include contract-county fixed effects. Standard errors, clustered two-way at the contract and county level, are in parentheses.

Put together, we find evidence for within-contract pricing adjustments across risk regions, which might explain the risk pool improvement for high quality contracts absent any significant change in average prices (including rebate) at the contract level. The adjustment mostly affects premiums, not drug deductibles, although it may also affect other contract characteristics outside our sample. The premium adjustment, in particular, illustrates one potential mechanism of favorable selection into Medicare Advantage: premium is higher in markets where new enrollees have higher risk, and lower in markets with lower risk. Because premium did not vary at the contract level but varied across service regions, the rebate effectively varied across enrollees in high versus low risk counties. Specifically, low-risk enrollees obtained more rebate in the form of cheaper drug coverage than high-risk enrollees, even if average rebate pass-through is indistinguishable from zero.

2.6.1 Robustness Checks

One concern with the analysis in Section 2.6 is that, as premium decreases, more individuals in good health are likely to enroll. The adjustment by enrollment is necessary for analyzing the incidence of public payments. However, it can bias the analysis of insurer price-setting behavior across locations if the enrollment weights amplify the price changes. Appendix Tables A.1.3 to A.1.5 show that results are largely unchanged if we purge the enrollment effect from the measure of contract-county level premium and deductible. Pricing variation unweighted by enrollment indicates a significant increase in Part D premium in high-risk counties, of similar magnitude to that in Table A.1.2, supporting the conclusion of the previous section. 27

Moreover, the results are robust to alternative measures of risk variation across service areas. The main analysis focuses on the distance to median.
Alternatively, Appendix Tables A.1.6 to A.1.8 use the distance to the mean county risk, and find similar variation in Part D premium and insignificant effects on Part C premium and drug deductibles. Appendix Figure A.2.13 plots the event study trends for this set of estimates. Across service areas, instead of looking at the upper and lower percentiles of county risk, Appendix Table A.1.9 focuses on counties more than one standard deviation away from the mean county risk. The effects are comparable to those in the main analysis across the 15% tails. Finally, Appendix Table A.1.10 looks at the offering of zero-premium and zero-deductible plans across service areas. Consistent with results from the main analysis, high-quality contracts are significantly less likely to offer plans with zero Part D premium in high-risk counties, and enrollment decreased in these counties (Appendix Table A.1.11). We do not find a similar effect on plans with zero Part C premium or zero drug deductible, or on enrollment in these plans. The differential pricing and enrollment effect specific to Part D is evident on the raw trend (Appendix Figure A.2.14).

27 The corresponding event study and raw trends are in Appendix Figures A.2.8 to A.2.12.

2.7 Why Does QBP Induce Risk Selection?

So far we have shown that high quality contracts significantly improved their risk pool after QBP, and linked the improvement to differential premium pricing across risk regions. We suggested mechanisms for how risk selection is accomplished, but have been silent on why risk selection is incentivized in the first place: what are the design features of QBP that made it profitable for high quality contracts to select low risk enrollees? This section suggests the incentive may lie with the design of the quality rating system, and is activated by the financial reward to high quality introduced in QBP.
When high quality contracts are able to keep most of the quality bonus payment as profit rather than rebate to consumers, risk selection becomes profitable if lower enrollee risk contributes to higher quality and hence continued bonus payment. We highlight the linkage between risk, quality and insurer profit in a stylized model below.

We then empirically characterize the correlation between risk and quality. We exploit the same difference-in-differences variation as in Section 2.5 to show that contracts with higher risk score in the baseline are less likely to perform well in patient outcome measures in the quality rating. Furthermore, high quality contracts with high baseline risk scores experience smaller improvements in outcome ratings relative to low quality contracts with a more favorable risk pool. We further document a negative correlation between risk score and the Star score, particularly for baseline high quality contracts. For these contracts, performance in outcome-related domains is the most predictive of the final rating.

2.7.1 A Model of Risk, Quality, and Insurer Profit

To illustrate how the risk-quality linkage can affect insurer pricing, we build a simple two-period model where the risk pool in the first period affects quality rating and payment in the second period. We focus on the decision of baseline high quality contracts. Revenue in each period is the sum of a benchmark and a premium charged to the enrollees. The benchmark $B$ is fixed in the first period. In the second period, high quality contracts receive a higher benchmark, $B_h > B_l$. Contract quality is not constant: with probability $\pi(h; r)$, baseline high quality contracts remain high quality in the second period. We model the risk-quality linkage by allowing the transition probability to depend on the risk score $r$ from period one. If a low risk score contributes to future high quality, then there is an incentive to attract low risk enrollees in the current period.
We assume risk adjustment is perfect, so that risk selection would have no bearing on insurer profit absent the dynamic linkage to quality and quality payments. A high quality contract chooses premiums $(p, p')$ in the two periods to maximize profits

$$ \Pi(p, p') = (p - c + B)\, s_h(p) + \sum_{j \in \{l, h\}} \pi(j; r)\, (p' - c + B_j)\, s_j(p'), $$

where the insurer faces constant marginal cost $c$, and demand is allowed to differ by quality in $s_j(p)$, $j \in \{l, h\}$. The optimal premium set in the first period is

$$ p = (c - B) \left[ 1 - \frac{1}{|\eta_h|} \left( 1 + \frac{\Delta\Pi^{\prime\star}}{s_h(p)}\, \underbrace{\frac{d\pi}{dr}}_{\text{Policy}}\, \underbrace{\frac{dr}{dp}}_{\text{Selection}} \right) \right]^{-1}, \tag{1} $$

where $\Delta\Pi^{\prime\star} > 0$ is the optimized profit difference between high and low quality in the second period, 28 $\pi$ is shorthand for $\pi(h; r)$, and $\eta_h < 0$ is the premium elasticity of demand for high quality contracts. Equation (1) rearranges the first-order condition $s_h(p) + (p - c + B)\, s_h'(p) + \Delta\Pi^{\prime\star}\, \frac{d\pi}{dr} \frac{dr}{dp} = 0$.

Absent the risk-quality linkage, we have the standard result that the optimal premium equals marginal cost (net of the benchmark) plus a mark-up inversely related to the demand elasticity. When $\frac{d\pi}{dr} \neq 0$, however, the optimal premium responds to the selection term $\frac{dr}{dp}$. Specifically, a market is more advantageous to risk selection if a lower premium attracts enrollees below the average risk of the contract, or $\frac{dr}{dp} > 0$. In these markets, when enrollee risk lowers contract quality, the term $\frac{\Delta\Pi^{\prime\star}}{s_h(p)} \frac{d\pi}{dr} \frac{dr}{dp}$ is signed negative, pushing the premium below the standard level where $\frac{d\pi}{dr} = 0$. 29

Hence, the observed premium variation is consistent with insurer risk selection if lower enrollee risk contributes to a higher quality rating. Under this configuration, we should expect lower (higher) premium in lower (higher) risk regions, relative to the baseline period where $\frac{d\pi}{dr} = 0$. We then examine the risk-quality linkage in detail, signing $\frac{d\pi}{dr}$ empirically.

2.7.2 Empirical Evidence on the Risk-Quality Mechanism

Before empirically characterizing the risk-quality correlation, we present difference-in-differences evidence on the nature of the correlation using similar variation as in the previous sections.
One challenge is that, because the rating algorithm underwent substantial revision in 2011, the same year quality bonus payment was introduced, differential trending in the quality rating after the reform may reflect mechanical differences in the rating computation, rather than insurer risk selection response to payments. On the other hand, the selection response implies that low risk enrollees with fewer diagnoses are associated with higher quality, possibly through improvements in outcome measures in the quality rating. We hence focus on outcome measures, and document the relationship between ratings in these measures and insurer risk profiles in the baseline. One particular advantage of this strategy is that outcome measures are relatively stable over the sample period, allowing for a difference-in-differences characterization of the risk-outcome channel unaffected by changes in rating measurement or computation.

28 That is, $\Delta\Pi^{\prime\star} = (p'_h - c + B_h)\, s_h(p'_h) - (p'_l - c + B_l)\, s_l(p'_l) > 0$, where $\{p'_h, p'_l\}$ is the vector of optimal premiums to be charged in the second period. In addition, we assume that high quality firms have higher profits than low quality ones.

29 Alternatively, in markets less advantageous to risk selection with $\frac{dr}{dp} < 0$, premium is set higher than the benchmark level.

Specifically, we focus on outcome measures that are consistently measured from 2009 to 2014. They are improving physical health, improving mental health, diabetes controlled–blood sugar, diabetes controlled–cholesterol, and blood pressure controlled from Part C. 30 We average over these measures to derive a summary star rating of patient outcome. When we regress the final rating on the constructed outcome rating, the coefficient before the outcome rating mechanically increases after 2012 (Table 2.10).
Although the estimate may also reflect changes in the rating computation other than the weighting, the importance of outcome in the final rating generally increased over the 2012-2014 period (Figure A.2.15).

30 Part D outcome measures in the “drug safety and adherence” domain are not consistently present over the sample period. In particular, three new outcome measures (medication adherence for diabetes, hypertension and high cholesterol, respectively) are added in 2012.

Table 2.10: Weight increase of outcome measures in quality rating

                  (I)         (II)        (III)       (IV)        (V)         (VI)
                  Star Rating             4.0 Star                4.5 Star
Outcome × Post    0.18***     0.24***     0.14***     0.19***     0.18***     0.20***
                  (0.030)     (0.031)     (0.036)     (0.033)     (0.024)     (0.030)
y mean            3.59        3.59        0.35        0.35        0.16        0.16
Post              2011        2012        2011        2012        2011        2012
R²                0.77        0.78        0.57        0.58        0.57        0.57
N                 1,080       1,080       1,080       1,080       1,080       1,080

*** p < 0.01, ** p < 0.05, * p < 0.10
Notes: Table shows the difference-in-differences estimate of the contribution of the average outcome measure rating (outcome) to the final star rating before and after the reform year, taken to be 2011 (odd columns) or 2012 (even columns). All outcome measures averaged in outcome are present in the rating and measured consistently throughout 2009-2014. Starting in 2012, these measures receive a 3.0 weight in the computation of the final rating. Columns 1-2 estimate the effect of the weight increase on the final star rating, which ranges from 1.5 stars to 5.0 stars in 0.5 star increments. Columns 3-4 estimate the effect on the binary outcome of having at least 4.0 stars. Columns 5-6 estimate the effect on having at least 4.5 stars. Unlike the other difference-in-differences analyses in the paper, outcome is measured in the same period as the dependent variable, and is not fixed at its baseline (2009-2010) value. The purpose is to confirm the mechanical weight increase in the rating computation. All regressions include contract and year fixed effects. Standard errors, clustered at the contract level, are in parentheses.
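To see why a larger weight on outcome measures mechanically raises their influence on the final rating, the following sketch computes a weighted average of measure-level stars under two weighting schemes. It is a simplified illustration, not the CMS rating algorithm: the assumption that every non-outcome measure carries a weight of 1.0, the number of non-outcome measures, and all star values are hypothetical. Only the 3.0 weight on outcome measures from 2012 onward comes from the notes to Table 2.10.

```python
def summary_rating(outcome_stars, other_stars, outcome_weight):
    """Weighted average of measure-level stars.

    Assumption (hypothetical simplification): all non-outcome measures
    have a weight of 1.0.
    """
    weighted_sum = outcome_weight * sum(outcome_stars) + sum(other_stars)
    total_weight = outcome_weight * len(outcome_stars) + len(other_stars)
    return weighted_sum / total_weight


def outcome_share(n_outcome, n_other, outcome_weight):
    """Share of the total weight carried by outcome measures."""
    return outcome_weight * n_outcome / (outcome_weight * n_outcome + n_other)


outcome = [4.0] * 5   # five outcome measures, each rated 4 stars (hypothetical)
other = [3.0] * 10    # ten hypothetical non-outcome measures, each rated 3 stars

before = summary_rating(outcome, other, outcome_weight=1.0)  # equal weights
after = summary_rating(outcome, other, outcome_weight=3.0)   # 3.0 weight on outcomes
```

In this example the outcome measures account for one third of the total weight under equal weighting and 60% once they are weighted 3.0, so the same measure-level stars produce a final average closer to the outcome rating.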
We then examine the correlation between risk and outcome measures using the difference-in-differences variation, comparing the outcome rating of contracts serving low versus high-risk enrollees in the baseline. Table 2.11 shows that a ten percentage point (0.10) increase in the baseline risk score in 2009-2010 reduces the outcome rating by 0.122 stars, or 12.2 percentage points of a star (column 1). When we group outcomes by health improvement measures in the Health Outcome Survey (HOS, column 2) and chronic condition measures in the Healthcare Effectiveness Data and Information Set (HEDIS, column 3), the risk-outcome correlation appears entirely driven by the chronic condition measures, with significant advantage to contracts serving low-risk enrollees in the baseline (Figure A.2.16).

To better understand the correlation, we zoom in on the three HEDIS measures of chronic conditions: Diabetes Care – Blood Sugar Controlled, Diabetes Care – Cholesterol Controlled, and Controlling Blood Pressure. The two diabetes measures are based on MA enrollees diagnosed with diabetes (the denominator). In the case of blood sugar control, HEDIS calculates the share of diabetic MA enrollees “whose most recent HbA1c level is greater than 9%, or who were not tested during the measurement year (numerator)”, and then subtracts the ratio from 100 to calculate the percent of high blood sugar conditions managed. In the 2013 rating, the threshold for a 4-star (5-star) rating in this measure is 80% (88%). 31 Cholesterol Controlled is measured by the percent of diabetic enrollees with an LDL-C test result below 100 mg/dL during the enrollment period. Controlling Blood Pressure is measured by the percent of enrollees diagnosed with hypertension who lowered blood pressure below 140/90 during enrollment.

Although all three measures focus on the improvement in outcomes, the clinical thresholds do not account for the baseline severity of measured conditions, or the presence of other diagnoses that complicate the management of the measured conditions.
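The blood sugar measure described above can be sketched as follows. The function mirrors the quoted definition (enrollees whose most recent HbA1c exceeds 9%, or who were not tested, count against the contract) and uses the 2013 cut-offs of 80% and 88% for 4 and 5 stars. The enrollee values are hypothetical, the cut-offs are treated as "at least" thresholds by assumption, and cut-offs for lower star levels are not stated in the text, so the sketch does not assign them.

```python
def blood_sugar_controlled_pct(hba1c_levels):
    """Percent of diabetic enrollees whose blood sugar is managed.

    hba1c_levels: most recent HbA1c (in percent) for each diabetic enrollee,
    or None if the enrollee was not tested during the measurement year.
    """
    if not hba1c_levels:
        raise ValueError("the measure needs at least one diabetic enrollee")
    poor = sum(1 for level in hba1c_levels if level is None or level > 9.0)
    return 100.0 - 100.0 * poor / len(hba1c_levels)


def star_2013(pct):
    """Map the measure to stars using only the two 2013 cut-offs given in the text.

    Assumption: a contract reaches a star level when its rate is at least the cut-off.
    """
    if pct >= 88.0:
        return "5 stars"
    if pct >= 80.0:
        return "4 stars"
    return "below 4 stars"


# Two hypothetical contracts with the same number of diabetic enrollees.
milder_pool = [6.5, 7.0, 7.2, 7.8, 8.0, 8.4, 6.9, 7.5, 9.4, 7.1]
sicker_pool = [6.5, 9.8, 7.2, None, 8.0, 10.3, 6.9, 7.5, 9.4, 7.1]

milder_pct = blood_sugar_controlled_pct(milder_pool)
sicker_pct = blood_sugar_controlled_pct(sicker_pool)
```

Nothing in the calculation conditions on how severe the enrollees' diabetes was at enrollment, which is why a pool with milder baseline conditions can score higher without any difference in the care delivered.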
Specifically, the denominator in each measure is the simple count of enrollees with diabetes or hypertension, unadjusted for other diagnoses in the risk score or the severity of these conditions. Given uniform clinical thresholds in the numerator, enrollees with fewer diagnoses and milder conditions are associated with higher outcome ratings, and are favorably selected by insurers.

31 Details of the 2013 star ratings are available in the Technical Notes, accessible at https://www.cms.gov/Medicare/Prescription-Drug-Coverage/PrescriptionDrugCovGenIn/Downloads/2013-Part-C-and-D-Preview-2-Technical-Notes-v090612-.pdf

Table 2.11: Risk score and outcome rating

                (I)             (II)            (III)
                Outcome         Health          Diabetes &
                Mean            Improved        Blood Pressure
Risk × Post     –1.22**         –0.11           –1.37**
                (0.48)          (0.27)          (0.58)
y mean          3.45            3.28            3.60
R²              0.63            0.22            0.69
N               997             888             991

*** p < 0.01, ** p < 0.05, * p < 0.10
Notes: Table shows the difference-in-differences estimates on outcome rating, across contracts with different enrollee risk scores in the baseline (2009-2010). Column 1 looks at the effect of baseline risk on the average rating of outcome measures. Columns 2 and 3 divide the outcome measures by data source. Self-reported improvement in physical and mental health, collected from HOS, is the focus of column 2. HEDIS measures of having diabetic conditions and high blood pressure controlled are the focus of column 3. All regressions include contract and year fixed effects. Standard errors, clustered at the contract level, are in parentheses.

We further inspect the risk-outcome correlation by baseline quality status. Specifically, the treated group is the set of baseline high quality contracts where enrollee risk score over 2009-2010 is above the median of all contracts, and the control is baseline low quality contracts with risk scores below the median. 32 If high risk enrollees are associated with worse outcomes measured in the quality rating, then a more favorable risk pool may help baseline low quality contracts obtain high quality standing. High quality contracts serving high risk enrollees, on the other hand, risk losing bonus payments if outscored in outcome measures by control contracts with low risk enrollees.

32 Baseline risk score ranges from 0.74 to 1.46 in 2009-2010 for treated and control contracts, with a median of 0.97.

Table 2.12 suggests that the risk-outcome correlation disadvantages high quality contracts with high risk enrollees, where loss of high quality standing to low-risk low-quality contracts is more likely. In odd columns, we compare high and low quality without interacting with baseline risk. Overall, high quality contracts are less likely to retain a high outcome rating than low quality contracts are to obtain it, in particular due to the difficulty in consistently managing chronic conditions related to blood pressure and diabetes. The difference is more striking once we compare baseline high-risk high-quality contracts with low-risk low-quality contracts in even columns: falling in the upper half of the risk score distribution exposed high quality contracts to an additional 20 percentage point slippage in outcome rating, relative to low-quality competitors in the more favorable half of the risk distribution (Figure A.2.17).
Table 2.12: Quality, risk and outcome rating

               (I)            (II)           (III)          (IV)           (V)            (VI)
               Outcome Mean                  Health Improved               Diabetes & Blood Pressure
High × Post    –0.26***       –0.44***       0.026          0.043          –0.42***       –0.62***
               (0.074)        (0.10)         (0.044)        (0.070)        (0.098)        (0.13)
y mean         3.45           3.38           3.28           3.30           3.59           3.49
Treated        high quality   + high risk    high quality   + high risk    high quality   + high risk
Control        low quality    + low risk     low quality    + low risk     low quality    + low risk
R²             0.63           0.65           0.23           0.20           0.70           0.71
N              1,089          525            952            456            1,083          522

*** p < 0.01, ** p < 0.05, * p < 0.10
Notes: Table shows the difference-in-differences estimates on outcome rating, across contracts with different baseline quality in odd columns, and further across baseline enrollee risk scores in even columns. Specifically, in even columns, treated contracts are baseline high quality with enrollee risk score higher than the median (0.97), and control contracts are baseline low-quality with risk score below the median. Columns 1-2 look at the quality and risk differential on average outcome ratings. Columns 3-4 look at the effect on self-reported improvement in physical and mental health collected from HOS. Columns 5-6 look at the effect on diabetes and high blood pressure control measures from HEDIS. All regressions include contract and year fixed effects. Standard errors, clustered at the contract level, are in parentheses.

2.7.3 Characterizing the Risk-Outcome Correlation

The difference-in-differences evidence suggests a strong negative correlation between enrollee risk score and outcome rating, in particular the rating in managing chronic conditions such as diabetes and blood pressure measured in HEDIS. Instead of using baseline enrollee risk, in this section we directly characterize the correlation between contemporaneous risk score and outcome rating. To do this, we note that outcome measures relevant for the year $t$ quality rating, announced in the fall of year $t-1$, are collected from patients enrolled in the contract two years prior, in $t-2$. Due to the timing, we expect a strong negative correlation only between the year $t$ rating and the year $t-2$ risk score, but not across other lag or lead periods.

Table 2.13 shows the correlation between the year $t$ outcome rating in diabetes and blood pressure control and risk scores from multiple periods. That is, in addition to the contemporaneous correlation, we also examine the correlation with risk score one year in lag ($\mathrm{riskscore}_{t-3}$) up to two years in lead ($\mathrm{riskscore}_{t}$). We further stratify the exercise by baseline contract quality. The risk-outcome correlation is weak among low quality contracts (baseline 3.0-3.5 stars), but becomes more negative and significant as we restrict the sample to higher quality contracts. For the set of contracts achieving at least a 4.5 star rating in the baseline, a ten percentage point increase in enrollee risk score is associated with a 33 percentage point (0.33 star) decrease in chronic outcome rating. There is, however, no clear and consistent pattern between risk score and outcome for different lag and lead periods. Similar results hold for the correlation between risk score and final star rating (Table 2.14).
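The timing argument above amounts to pairing each contract's year-t rating with its risk score from an earlier or later year. A minimal sketch of that alignment is given below, with a hypothetical contract identifier and illustrative values; `offset=2` reproduces the contemporaneous pairing (rating in t, risk score in t-2), while offsets of 3, 1, and 0 give the lag and lead pairings. The regressions themselves, with contract and year fixed effects, are not reproduced here.

```python
def align_rating_with_risk(ratings, risk_scores, offset):
    """Pair the year-t rating with the risk score from year t - offset.

    ratings, risk_scores: dicts keyed by (contract_id, year).
    Returns a list of (contract_id, year, rating, risk_score) tuples sorted by
    contract and year; contract-years without a matching risk score are dropped.
    """
    pairs = []
    for (contract, year), rating in sorted(ratings.items()):
        key = (contract, year - offset)
        if key in risk_scores:
            pairs.append((contract, year, rating, risk_scores[key]))
    return pairs


# Hypothetical contract "H0001"; all values are illustrative only.
ratings = {("H0001", 2013): 4.0, ("H0001", 2014): 3.5}
risk_scores = {
    ("H0001", 2011): 0.95,
    ("H0001", 2012): 1.02,
    ("H0001", 2013): 1.05,
}

contemporaneous = align_rating_with_risk(ratings, risk_scores, offset=2)
lagged = align_rating_with_risk(ratings, risk_scores, offset=3)
lead = align_rating_with_risk(ratings, risk_scores, offset=0)
```

Longer lags lose contract-years at the start of the panel, and leads lose them at the end, so the sample size differs across pairings.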
Table 2.13: Risk-outcome correlation across periods

Column    Risk score in    Baseline star    Coefficient    (s.e.)    R²      N
(I)       t-3              3.0-3.5          0.90           (1.14)    0.67    336
(II)      t-3              4.0              1.56*          (0.90)    0.85    146
(III)     t-3              4.5              0.97           (1.20)    0.66    46
(IV)      t-2              3.0-3.5          0.39           (0.75)    0.65    472
(V)       t-2              4.0              –1.07          (1.09)    0.84    210
(VI)      t-2              4.5              –3.34**        (1.44)    0.84    70
(VII)     t-1              3.0-3.5          1.30*          (0.69)    0.63    611
(VIII)    t-1              4.0              –0.67          (0.73)    0.78    269
(IX)      t-1              4.5              0.68           (1.08)    0.80    91
(X)       t                3.0-3.5          1.53***        (0.53)    0.63    760
(XI)      t                4.0              –0.47          (0.71)    0.75    340
(XII)     t                4.5              –1.54          (1.32)    0.74    126

*** p < 0.01, ** p < 0.05, * p < 0.10
Notes: Table shows OLS-estimated correlation between outcome rating in diabetes and blood pressure management and enrollee risk score across multiple periods. Contemporaneous correlation occurs between the year t outcome rating and the year t-2 risk score. Correlation with the lag risk score in t-3 and lead risk scores up to year t is also examined. For each time pair, the table shows separate correlations for baseline low-quality (3.0-3.5 stars), high quality (4.0 stars and above) and very high quality (4.5 stars and above) contracts. All regressions include contract and year fixed effects. Standard errors, clustered at the contract level, are in parentheses.

Table 2.14: Risk-quality correlation across periods

Column    Risk score in    Baseline star    Coefficient    (s.e.)    R²      N
(I)       t-3              3.0-3.5          0.94*          (0.55)    0.73    337
(II)      t-3              4.0              1.15           (0.81)    0.66    136
(III)     t-3              4.5              1.72           (1.61)    0.14    38
(IV)      t-2              3.0-3.5          –0.18          (0.49)    0.67    479
(V)       t-2              4.0              0.28           (1.00)    0.64    202
(VI)      t-2              4.5              –2.76***       (0.80)    0.71    64
(VII)     t-1              3.0-3.5          0.49           (0.37)    0.65    618
(VIII)    t-1              4.0              1.54**         (0.59)    0.62    259
(IX)      t-1              4.5              –1.23*         (0.60)    0.73    82
(X)       t                3.0-3.5          0.44           (0.29)    0.66    792
(XI)      t                4.0              0.88           (0.54)    0.57    338
(XII)     t                4.5              –1.06          (0.77)    0.66    118

*** p < 0.01, ** p < 0.05, * p < 0.10
Notes: Table shows OLS-estimated correlation between final star rating and enrollee risk score across multiple periods. Contemporaneous correlation occurs between the year t star rating and the year t-2 risk score. Correlation with the lag risk score in t-3 and lead risk scores up to year t is also examined. For each time pair, the table shows separate correlations for baseline low-quality (3.0-3.5 stars), high quality (4.0 stars and above) and very high quality (4.5 stars and above) contracts. All regressions include contract and year fixed effects. Standard errors, clustered at the contract level, are in parentheses.

2.8 Discussion

Central to our finding is the premium variation within contract across risk regions. Specifically, high quality contracts differentially attract low-risk enrollees with lower premium in low-risk counties, and improve their risk score relative to low quality contracts. Absent changes in average prices, selection implies that within contract, rebates are transferred from high risk to low risk enrollees. The insurer selection thus calls into question the distributional incidence of quality payment, with immediate policy and welfare implications.

First, the benefit of quality improvement may be weighed down by the social cost of unequal access to quality, as high-performing contracts disproportionately serve low-risk enrollees in low-risk counties. To the extent that enrollees in worse health benefit more from quality care, the return to care is higher if enrollment increases more in areas with higher baseline risk. In this context, disparity is worsened by a negative correlation between risk and quality, which tends to penalize contracts serving high-risk enrollees. Alternatively, a positive risk-quality correlation may improve equity by encouraging more high-quality entry in high risk counties, although it is not clear why a quality measure should respond to the risk composition.

More fundamentally, one may question the role of baseline health in measures of plan quality: if quality reflects the “value-added” of health care on health, with higher quality improving health more, then baseline health should be swept out of quality ratings as a fixed effect.
In that sense, the empirical correlation with pre-enrollment risk measures should be tenuous. However, using health improvement as an indicator of quality may still fall short if baseline health affects treatment efficacy interactively. This would be the case if, for example, patients with milder conditions recover sooner than those with more severe conditions, and the difference narrows but does not disappear at higher quality.

Therefore, to effectively take out the risk confound in quality measures, the current outcome measures need to be adjusted by the severity of baseline health conditions. The idea is that, conditional on a similar case-mix of diagnoses and potential interactions, differential improvements in patient outcome are more plausibly attributable to health care quality rather than baseline health. A simple fix may begin with better use of risk scores in the quality measures, to adjust for co-existing complications that hamper the management of diabetic conditions. More sophisticated adjustments should also control for the severity of baseline conditions, similar to the case-mix index that adjusts for hospital quality used in payments.

In our analysis, we find the risk-quality correlation operates through the outcome measures in the “managing chronic conditions” domain. Most of these measures treat health improvement, i.e., having chronic conditions controlled, as an indicator of quality. For example, a measure of blood pressure control is given by the fraction of baseline hypertension patients (denominator) with blood pressure below 140/90 (numerator) in the measurement period. However, these measures do not adjust for the severity of baseline conditions. This allows the baseline risk score, a crude measure of severity, to correlate significantly and negatively with performance in chronic outcome measures.
By contrast, self-reported health improvement in HOS is adjusted by respondent socio-demographic characteristics, and we find little residual correlation between risk score and health improvement measures in the “staying healthy” domain of the quality rating. Hence our results suggest that the risk-quality correlation, as well as its perverse incentive on risk selection, is largely suppressed if chronic outcome measures in HEDIS are appropriately adjusted for baseline severity of conditions. Currently, any such adjustment is lacking for these outcomes.

An alternative approach is to adjust for enrollee risk score ex-post at the stage of quality payment. In practice, it would require policy makers to form the right belief as to the bias in quality rating (the derivative of the quality rating with respect to the risk score $r$ in Equation 1) and the magnitude of the selection response ($\frac{dr}{dp}$) to effectively offset the selection incentive. We therefore argue that adjusting for ex-ante risk in the quality rating is simple to implement, recovers a less biased measure of quality, and can go a long way in reducing the selection incentives associated with quality payments.

2.9 Conclusion

We examine the insurer selection response to quality bonus payments in the Medicare Advantage (MA) market. The 2012 onset of the Quality Bonus Payment (QBP) demonstration varied payment generosity by quality rating, which we exploit in the difference-in-difference analysis. We find that pass-through of the bonus payments to enrollees is minimal: high-quality contracts eligible for higher bonus payments increased bids by nearly the full amount of the bonus, leaving the rebate to enrollees unchanged. Correspondingly, premium did not differentially decrease, or generosity increase, for enrollees in high-quality contracts.

Within contracts, however, we uncover significant premium variation across high- and low-risk counties for high-quality contracts, but not for low-quality contracts.
Correspondingly, risk pools improved significantly for high-quality contracts after the payment reform. We provide evidence that a negative correlation between enrollee risk score and patient health outcome can explain the selection response. In this case, low-risk enrollees contribute to continued high-quality rating and bonus payments.

Our results have important normative implications for quality bonus payments in the MA market and similar value-based models elsewhere. One fundamental issue is the measurement of quality. To take out the risk confound, the current outcome measures need to be adjusted for the case-mix and severity of baseline conditions and ideally be made conditional on risk scores. Failure to do so raises complicated distributional issues. For example, high-quality care is made less accessible to the low-income, high-risk population, who potentially benefit more from quality. The equity concern among inframarginal enrollees and a near-zero pass-through on average illustrate the subtle but critical role of insurer selection in the welfare incidence of value-based payments.

3 Expanding Health Insurance with Mandate and Subsidy: Theory and Evidence

3.1 Introduction

Governments worldwide invest large amounts of resources in the health, education, and well-being of their citizens. In the United States, health care spending exceeded 18% of GDP in 2017. The 2010 Affordable Care Act expanded health insurance coverage with premium subsidy and mandate penalty, which is expected to increase health care spending to 20% of GDP by 2020 (Keehan et al., 2011). In contrast, the benefits and beneficiaries of insurance expansion, and the motivation they present for government mandate and subsidization of health insurance relative to cost, remain less understood.

The canonical motivation for government intervention in the insurance market is adverse selection (Akerlof 1970; Rothschild and Stiglitz 1976).
Selection drives out low- cost individuals and increases insurance premium. In private insurance markets, welfare loss from the pricing inefficiency alone tends to be small, however, and does not fully justify government subsidy of premium (Einav et al., 2010). Moreover, while information asymmetry can potentially explain the lack of private insurance for certain risks (Hendren, 2017), the sweeping nature of a mandate also limits the empirical variation useful for understanding the role of adverse selection, if any, in these contexts. A second motivation recognizes the fact that the uninsured do not bear the full financial risk of health care expenditure; much of the cost is borne by third-party payers (Finkelstein et al., 2015). When the government cannot pre-commit to not providing care to the uninsured—effectively implementing an implicit social insurance mandate, demand for formal insurance falls below own expected cost imposed on society (Finkelstein et al., 2017). Expanding formal insurance lowers the social cost of uncompensated care, and depending on the distribution of the cost across economy, may outperform the implicit safety-net coverage in the public finance of the mandate. This paper evaluates both inefficiencies in determining the desirable scope of social insurance. I first examine the motivations they present for a universal insurance mandate, under the stylized setting of perfect competition and zero behavioral responses. In this case, the pricing externalities are only subject to the resource cost of insurance. Adverse selection alone presents weak justification for a mandate, unless full redistribution across risk in addition to income is desirable (Blomqvist and Horn, 1984). Formalizing an implicit safety-net mandate with tax-financed subsidy always improves welfare, if the tax base achieves better distribution of healthcare costs than the risk. In lieu of uncompensated 39 care, a subsidized universal mandate maximizes welfare with equity. 
Although both adverse selection and uncompensated care potentially motivate an insurance mandate, I assess their empirical relevance in rationalizing the 2006-2007 insurance expansion in Massachusetts. I focus on two policy instruments that expanded formal insurance in the state: subsidy of insurance premium, and tax penalty on uninsured individuals ineligible for subsidy. I develop a welfare framework where the incentive effect on take-up, the selection effect on cost, and the resulting pricing effect on premium versus the surcharge burden of uncompensated care, provide sufficient statistics to characterize and compare the motivating benefits. The net benefit relative to cost indicates the desirability of incremental expansion of insurance using policy generosities.

The cost-benefit framework traces out the social externality of policy incentives across the economy. The take-up response changes the cost composition in formal insurance and uncompensated care. Adverse selection implies that expanding formal insurance lowers the average cost in both. The cost change is passed on to insurance premium and the charity surcharge, according to premium regulations and the public finance of charity care. In practice, I consider a range of pass-through from costs to prices. In particular, I shut down the effect on either premium or charity surcharge to examine the relevant rationale across expansion groups.

The flexible range of pass-through nests behavioral responses from insurers and enrollees to policy incentives. Insurer capture of policy generosity weakens the link between cost and premium. I calibrate the mark-up response to subsidy over costs using estimates in Jaffe and Shepard (2018). Spending increase in formal insurance lowers the net saving in uncompensated care. The spending increase is approximately 25% based on moral hazard estimates in Massachusetts (Chandra et al., 2011), although I consider larger spending responses across contexts.
The pricing benefits are subject to the cost of expanding insurance using policy incentives, or the fiscal externality on the government budget. With price-linked subsidy and penalty, the fiscal cost decreases with the premium benefit.

Although behavioral responses to subsidies and taxes are well examined in the literature, hence allowing for a pure calibration exercise on welfare, I estimate incentive effects specific to the Massachusetts expansion for non-elderly adults in the American Community Survey (ACS). The estimates serve two purposes. First, they show that incentive effects in Massachusetts fall within the range of existing evidence under similar contexts. The cost-benefit calculation provides informative bounds on the potential net benefit that applies in Massachusetts. Second, they reveal heterogeneous effects across sub-groups.

I exploit differences in subsidy generosity across income groups, within rating communities where premium does not vary by individuals. Given premium, lower-income individuals are eligible for subsidy at a higher percent of premium. Following standard practice, I correct for the selection response to subsidy with simulated generosity from a pre-reform reference sample unaffected by the subsidy (Currie and Gruber 1996a; Currie and Gruber 1996b). The simulation isolates generosity differences across demographic groups within rating communities. I use the simulated measure to instrument for the endogenous subsidy exposure of Massachusetts enrollees. The cross-group variation in subsidy (and hence premium) has been widely applied to estimate demand in the individual market (Tebaldi, 2017), and the incentive effect on take-up (Frean et al., 2017).

Based on the simulated measure, increasing subsidy generosity by 10 percentage points (above the 70% baseline) is estimated to increase take-up by 1 percentage point over 2008-2011, and by 1.7 percentage points in 2011.
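The logic of instrumenting endogenous subsidy exposure with a simulated measure can be illustrated on synthetic data. The sketch below is not the estimation used in this chapter: the data-generating process, the variable names, and the true effect of 0.1 are all hypothetical, chosen only so that unobserved selection raises exposure while lowering take-up. Under these assumptions a naive regression on actual exposure is biased (here even wrong-signed), while the instrumented estimate recovers the assumed effect.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def simulate(n=20000, true_effect=0.1, seed=0):
    """Hypothetical data: z = simulated generosity (instrument),
    u = unobserved selection, x = actual exposure, y = take-up index."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)
        ui = rng.gauss(0, 1)
        xi = zi + ui + rng.gauss(0, 1)                 # exposure is contaminated by selection
        yi = true_effect * xi - ui + rng.gauss(0, 1)   # selection also lowers take-up
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


z, x, y = simulate()
ols_estimate = cov(x, y) / cov(x, x)   # naive slope on endogenous exposure
iv_estimate = cov(z, y) / cov(z, x)    # Wald / IV slope using the simulated measure

print(f"OLS on actual exposure: {ols_estimate:.3f}")
print(f"IV with simulated measure: {iv_estimate:.3f}")
```

With a single instrument and a single endogenous regressor, the ratio of covariances is the standard IV estimator, which is why no matrix algebra is needed in this toy case.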
The magnitude aligns well with the take-up response around subsidy discontinuities in Massachusetts (Finkelstein et al., 2017). Employment response to subsidy is indistinguishable from zero, but increases significantly with subsidy in the near-elderly (55-64). Take-up increases the most with subsidy at younger ages (below 30). In both groups, coverage by employer-sponsored insurance (ESI) decreases with subsidy. By contrast, estimates based on the endogenous exposure measure are wrong-signed for the effect on take-up, and indicate substantially larger non-employment effect than the existing literature. Combining behavioral responses with the composition effect on costs, I calculate the pricing benefits of a dollar increase in policy incentives. The trade-off with the fiscal externality evaluates the desirability of further expansion using stronger incentives. I approach the trade-off first from a pure efficiency standpoint: expanding formal insurance alleviates the adverse selection in premium and the social cost of uncompensated care, but does not serve redistribution purposes through the tax-subsidy incentives or the resulting effects on prices. In this case, net benefit from incremental expansion is close to zero for the range of estimates considered. If only the pricing inefficiencies are considered, then at 95% insurance rate in Massachusetts, benefits from additional take-up do not recover the cost of expansion with more generous incentives. Between the two motivations, uncompensated cost saving amounts to 7 cents per dollar increase in subsidy. Allowing for substantial mark-up capture, the premium benefit that is around 2 cents on a dollar of subsidy (or 30% of the full benefit without mark-up capture) would still recover the fiscal externality on pure efficiency grounds. Accounting for redis- tribution preferences for the low-income, the direct benefit on subsidy recipients increases, lowering the fiscal externality term. 
Expanding subsidy generosity and take-up in the low-income group becomes desirable even with a small dosage of equity consideration. By contrast, the mandate penalty is primarily motivated by the selection effect on 41 premium. Enrollees responsive to the penalty are 70% less costly than existing enrollees in unsubsidized individual plans. The potential premium benefit, absent any mark-up capture, is around 2.6 cents on a dollar of penalty. Avoided uncompensated cost is 1 cent on a dollar, driven by high insurance rate and low uninsured cost in the unsubsidized population. Government collects little revenue from the penalty due to the take-up response. Shutting down the access to uncompensated care in this population, premium benefit alone recovers the cost of penalty on the uninsured, if the mark-up capture is close to zero. Factoring in uncompensated care benefits and equity, desirability of a higher tax penalty depends on the relative welfare weights between the uninsured and the insured patients bearing excess burden, and is less clear-cut than the subsidy transfer to the low- income. In general, the efficiency argument for the penalty decreases with the premium benefit, and the equity argument weakens with smaller social cost of uncompensated care. Earlier studies have separately identified the benefit to uncompensated payers in the subsidized market (Finkelstein et al., 2017), and the selection effect on premium in the individual market (Hackmann et al., 2015). This paper develops a framework to understand the desirable scope of social insurance, motivating both benefits as potential rationales. I then empirically assess the welfare implications using incremental expansion in Massachusetts. Quantified welfare effects reveal the desirability of expanding current scope of coverage, the relevant rationale, and the relative importance of efficiency versus equity arguments. 
Differences across expansion groups have further implications for the global design of social insurance.

This paper is more broadly related to understanding the universal nature of government-mandated social insurance. The trade-off with private options has long received theoretical attention (Vickers and Yarrow 1991; Hart et al. 1997), with rising empirical interest following the privatization of social insurance around the world (Gruber 2017; Reynolds 2013). Landais et al. (2017) show that risk-based selection alone does not rationalize mandated unemployment insurance in Sweden. This paper suggests a universal health insurance mandate may ultimately involve redistribution arguments, but does not directly test for its desirability. Instead, incremental expansion featuring flexible policy incentives potentially improves welfare on pure efficiency grounds.

3.2 Massachusetts health insurance reform

Massachusetts enacted its comprehensive health reform law, Chapter 58 of the Acts of 2006, in April 2006. The law brings together individuals, private insurers and employers to improve health care access in the state, and is the blueprint of the Patient Protection and Affordable Care Act (ACA) enacted nationwide in 2010. Key provisions of the law are an individual and employer mandate, subsidized insurance to the low-income through Medicaid and private insurance markets, and rating regulation. Together, the structure of the reform resembles a “three-legged stool”. I introduce each component of the law below.

3.2.1 Mandate

Compared to subsidized insurance and rating regulation, the individual mandate is an innovative and controversial part of the law without a historic precedent. The mandate requires individuals over eighteen years old to purchase health insurance, or face a tax penalty.
The insurance purchased should meet the “minimum creditable coverage” standards, satisfied by most public and commercial health insurance but not by plans that only cover catastrophic events. The penalty was first implemented in 2007. Individuals without proof of eligible insurance by December 2007 lose their personal income tax exemption. In later years, the tax penalty equals 50% of the monthly premium in the lowest-cost plan available for the individual, adjusted by the number of uninsured months during the year. The penalty is waived for low-income groups below 150% FPL, a population eligible for free insurance either from Medicaid or from the Exchange marketplace. For middle- to high-income groups, the penalty increases the cost of remaining uninsured, increasing the willingness to pay for insurance. Focusing exclusively on the market of unsubsidized private insurance, Hackmann et al. (2015) find that the mandate increased the take-up rate to near-universal, and resulted in lower enrollee cost and insurer price.

A second mandate imposes fines on employers who fail to provide adequate funds for employee health insurance. The motivation is that uninsured workers generate excessive uncompensated care costs financed by society. Specifically, firms with more than 10 full-time workers are charged a “fair share contribution” fee of $295 per uninsured worker, where the dollar amount reflects the expected charity care cost incurred by these uninsured workers. In addition, for employers whose workers generate particularly high costs of charity care (more than $50,000 annually), and firms who do not provide health insurance on a pre-tax basis, an additional “free rider surcharge” applies. The exact amount of penalty depends on a formula that weighs the employer cost of sponsorship against the charity cost to the state. Due to administrative costs to firms, however, the employer mandate was repealed in July 2013.
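The individual penalty rule for the later years can be sketched in a few lines. This is only an illustration under stated assumptions: the function name and arguments are hypothetical, "adjusted by the number of uninsured months" is read as a simple multiplication by months uninsured, and the waiver is applied strictly below 150% FPL. The statutory schedule may differ in details not described in the text.

```python
def individual_penalty(lowest_monthly_premium, uninsured_months, income_pct_fpl):
    """Hypothetical sketch of the post-2007 individual mandate penalty.

    Assumptions (not from the statute): the penalty accrues at 50% of the
    lowest-cost monthly premium for each uninsured month, and is waived
    for incomes strictly below 150% of the federal poverty level (FPL).
    """
    if not 0 <= uninsured_months <= 12:
        raise ValueError("uninsured_months must be between 0 and 12")
    if income_pct_fpl < 150:
        return 0.0  # waived: this group is eligible for free coverage
    return 0.5 * lowest_monthly_premium * uninsured_months


# Illustrative inputs only: a $200 lowest-cost monthly premium
print(individual_penalty(200.0, 12, 400))  # uninsured all year
print(individual_penalty(200.0, 3, 400))   # uninsured for three months
print(individual_penalty(200.0, 12, 140))  # low-income waiver
```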
3.2.2 Subsidized insurance

The second component of the law addresses the financial difficulty of purchasing health insurance that impedes take-up in the low-income group. Previously, individuals with income below 133% FPL were covered in the Medicaid program (MassHealth). Enrollment in MassHealth also needed to satisfy certain demographic criteria that typically left childless adults ineligible for coverage, regardless of income. To make insurance more affordable for more individuals, including those in the low- to middle-income groups, the state instituted the Exchange program, called Commonwealth Care (CommCare), that subsidizes the insurance premium for eligible individuals below 300% FPL. Recipients are not eligible for Medicaid coverage or employer sponsorship.

To determine subsidy, enrollees are first assigned an “affordable” amount of out-of-pocket cost for purchasing insurance. Individuals are not required to pay for insurance premium above their affordability limit. Affordability is set as an increasing step function over ranges of income in percent of FPL. For instance, for individuals with income less than 150% FPL, affordability is zero, and premium is fully subsidized. Affordability then increases with income. In 2011, for instance, it increases to $39 per month in the 150-200% bracket, $77 per month for the 200-250% bracket, and $116 per month for the 250-300% bracket. For individuals with income over 300% FPL, subsidy no longer applies. A separate program, called Commonwealth Choice, provides private insurance plans for individuals who wish to purchase from the Exchange but are ineligible for subsidy.

The difference between enrollee cost (affordability) and the premium price set by insurers is the subsidy. In percent of premium, subsidy is roughly 90% of premium in the 150-200% income bracket, 80% for the 200-250% bracket, and 70% for the 250-300% bracket.
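The affordability schedule and the implied subsidy share can be written out as a short sketch. The 2011 dollar amounts are taken from the text; the treatment of bracket boundaries, the function names, and the $390 monthly premium are assumptions made only for illustration (a premium of that size happens to reproduce the approximate 90/80/70% shares quoted above).

```python
def affordability_2011(income_pct_fpl):
    """Monthly enrollee contribution under the 2011 schedule described in the text.

    Returns None above 300% FPL, where no subsidy applies. How the schedule
    treats incomes exactly at a bracket boundary is an assumption here.
    """
    if income_pct_fpl < 150:
        return 0.0
    if income_pct_fpl < 200:
        return 39.0
    if income_pct_fpl < 250:
        return 77.0
    if income_pct_fpl <= 300:
        return 116.0
    return None


def subsidy_share(monthly_premium, income_pct_fpl):
    """Subsidy as a fraction of premium: (premium - affordability) / premium."""
    contribution = affordability_2011(income_pct_fpl)
    if contribution is None:
        return 0.0
    return max(0.0, monthly_premium - contribution) / monthly_premium


HYPOTHETICAL_PREMIUM = 390.0  # illustrative value, not a figure from the text
for income in (120, 175, 225, 275, 350):
    share = subsidy_share(HYPOTHETICAL_PREMIUM, income)
    print(f"{income}% FPL: subsidy share = {share:.2f}")
```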
The increased generosity of subsidy explains the substantial coverage expansion in the low-income eligible population. As Figure 3.1 shows, the below-300% FPL group had a significantly lower coverage rate before the reform, but its coverage rate increased sharply after. Specifically, coverage in the low-income group surged to catch up with the average rate in Massachusetts, whereas the trends remain parallel across income groups in the national sample. 33

33 Similarly, the Massachusetts Health Reform Survey estimates that uninsurance below 300% FPL is 23.8% in fall 2006, and decreased to 12.9% in fall 2007. Above 300% FPL, uninsurance decreased from 5.2% in 2006 to 2.9% in 2007.

Figure 3.1: Insurance coverage trends. Panels: (a) Massachusetts; (b) Non-MA states. Series: all income, low income. Horizontal axis: year, 2000-2012. Vertical axis: coverage rate, 0.7 to 1.

Notes. Figure compares coverage trends in Massachusetts (panel a) with the rest of the US states (panel b), for the full sample and the low-income group where family income is less than or equal to 300% FPL. Coverage rates are aggregated from micro data for the 27-64 age group in the CPS March supplement, adjusted by insurance weights. 95% confidence intervals are plotted.

3.2.3 Rating regulation

The final piece of the law is regulation on premium. Although formally a component of the law, the major regulation on premium, such as guarantee issue and community rating, was already in effect in Massachusetts before the passage of Chapter 58. Guarantee issue ensures that individuals with pre-existing conditions are not denied coverage. Community rating in Massachusetts allows premium to differ by enrollee age and location of residence, but not by other demographics. Building on existing regulation, Chapter 58 further requires that the maximum premium variation across age does not exceed a ratio of 2, further limiting the extent of premium discrimination.
In addition, Chapter 58 merged the small-group and non-group risk pools in July 2007. Premiums of individual plans and small-group plans are subject to the same set of regulations based on the joint risk pool. Since the individual market previously had low market share and high costs, the merger significantly reduced the premium of individual plans without meaningfully increasing the rate for group plans (Graves and Gruber, 2012). I take the rating regulation as given for my 2008-2013 sample period, but exploit policy variation in subsidy and penalty to analyze the effects on enrollees and society in general.

Table 3.1 gives a sense of the relative contribution of different programs to the increase in coverage rate in Massachusetts. Consistent with Figure 3.1, Commonwealth Care is the leading contributor to new enrollments in the first two years of the reform: of the 442,000 new enrollees by July 2008, around 40% received premium assistance from the program. Including subsidized enrollees in the free MassHealth program, more than half (68%) of new enrollees received some premium assistance before take-up.

Table 3.1: New enrollment by source of coverage

Source of coverage    6/30/2006   12/31/2006  6/30/2007   12/31/2007  6/30/2008   Diff. from 6/30/06
Private Group         4,274,000   4,338,000   4,378,000   4,406,000   4,421,000   147,000
Individual Purchase      40,000      39,000      36,000      65,000      80,000    40,000
MassHealth              705,000     741,000     732,000     765,000     785,000    80,000
Commonwealth Care             0      18,000      80,000     158,000     176,000   176,000
Total                 5,020,000   5,136,000   5,226,000   5,394,000   5,462,000   442,000

Notes: Table shows administrative enrollment counts published in Health Care in Massachusetts: Key Indicators, November 2008. These numbers exclude Medicare enrollees; the MassHealth category only includes enrollees who list MassHealth as the primary insurer. For more details on the administrative records used in compiling the numbers, see the original report at http://archives.lib.state.ma.us/bitstream/handle/2452/36763/ocn232606916-2008-11.pdf?sequence=1&isAllowed=y.

3.2.4 Uncompensated care programs

Outside the main frame of Chapter 58, the state’s Uncompensated Care Pool (UCP) is a safety net insurance program that disburses the cost of care incurred by the uninsured population. The program is relevant for the motivation of Chapter 58, because a key premise of the reform is that formal insurance subsidy lowers the social cost of charity care, solving a “free-rider problem”. The problem originates from the 1986 Emergency Medical Treatment and Active Labor Act (EMTALA), which mandated hospitals and ambulatory services to provide emergency care to the uninsured regardless of ability to pay. When the care receiver does not pay for the medical costs she incurs, the cost is borne by third-party members. Before Chapter 58, the UCP was the program that financed charity care to the uninsured in the state.

The UCP charges assessments on providers and private sector employers who contribute to the financing of charity care. The assessment can be considered as a tax on the revenues of firms. In 2005, UCP was billed $739 million for uncompensated care, and paid out $530 million through assessment and general fund appropriation. 34

Chapter 58 renamed the UCP program to the Health Safety Net (HSN), and reformed the financing of charity care. First, Chapter 58 received permission from the federal government in 2005 to redirect the funding for uncompensated care programs to premium subsidy on the Commonwealth, under the premise that the cost of expanding subsidy programs is partially offset by savings in uncompensated care. The additional costs are to be shared evenly between the state and the federal government. The reduction in the charity care cost is significant. Within five years of the reform, charity cost decreased by $243 million from $739 million in 2005 to $496 million in 2011.
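Looking back at Table 3.1, the contribution of each program to new enrollment can be recomputed directly from the last column. The sketch below uses only the counts reported in the table; the variable names are illustrative. On these counts Commonwealth Care accounts for about 40% of the 442,000 new enrollees, and Commonwealth Care together with MassHealth for about 58%, so any larger combined figure would have to rest on information beyond this table.

```python
# Change in enrollment between 6/30/2006 and 6/30/2008, from Table 3.1
new_enrollment = {
    "Private Group": 147_000,
    "Individual Purchase": 40_000,
    "MassHealth": 80_000,
    "Commonwealth Care": 176_000,
}
reported_total = 442_000  # total as reported in the table (rows are rounded)

shares = {source: count / reported_total for source, count in new_enrollment.items()}
commcare_share = shares["Commonwealth Care"]
subsidized_share = shares["Commonwealth Care"] + shares["MassHealth"]

for source, share in shares.items():
    print(f"{source}: {share:.1%}")
print(f"Commonwealth Care + MassHealth: {subsidized_share:.1%}")
```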
34 Number based on Health Safety Net (Uncompensated Care Pool) annual reports, available at http://archives.lib.state.ma.us/handle/2452/392113

35 UCP paid $530 million in 2005, and $412 million in 2011. In addition to reduced UCP spending, hospitals bear smaller profit loss for the residual uncompensated care cost not paid for by the government.

Under Chapter 58, HSN reimburses providers of uncompensated care with a mix of assessments, surcharges and revenues from the general fund. A service surcharge is applied to payments made by insured patients for medical services. Assessments on hospitals are uncompensated profit loss for treating the uninsured. In 2011, for example, service surcharge on payments to hospitals and assessment on hospital profits each contribute $160 million to the fund, with an additional $100 million from other revenue sources.

Although displaced funding from the charity care program lowers the financing burden of subsidy, the Commonwealth Care program still requires generous contribution from the state and federal government. The additional cost of replacing safety net coverage with subsidized insurance represents the efficiency cost of the reform, or the fiscal externality due to behavioral responses. In 2011, $913 million is budgeted for CommCare subsidies to 176,500 enrollees. 36 Over time, as enrollment increased, CommCare spending increased from $628 million (with a $472 million budget) in 2008 37 to $872 million (with a $913 million budget) in 2011.

3.3 Motivating insurance expansion

Although the motivations for expanding health insurance programs are a subject of constant debate, many social insurance programs are universal in scope. Individuals contribute to the financing of the program and receive transfer benefits once eligible. Participation is often mandatory or assumed by default. One typical example is the unemployment insurance program.
Given universal enrollment, researchers are interested in the optimal benefit design that balances equity and efficiency for the economy. The case of health insurance expansion offers a unique opportunity to examine the desirability of universal coverage, and to shed light on the optimal scope of social insurance when take-up is voluntary and incomplete.

This section develops a stylized framework to understand the desirability of a health insurance mandate. I focus on the corner solution where individuals with zero demand for insurance are required to enroll. When premium reflects the average cost of enrollees, a universal mandate mitigates adverse selection and lowers premium for high-cost individuals. The desirability of a mandate then depends on the society’s redistribution preferences across risk. Moreover, when the society cannot avoid providing uncompensated care to the uninsured, a formal insurance mandate improves upon the status quo if it achieves better distribution of healthcare costs in the economy. I delineate conditions under which formal insurance broadens the base of redistribution to the uninsured, and lowers the excess financing burden on third-party payers including patients.

36 The budget combines the Commonwealth Care and the Commonwealth Care Bridge, a similarly structured subsidy program enrolling eligible legal immigrants. Source: http://budget.digital.mass.gov/bb/h1/fy11h1/exec_11/hbuddevhc.htm

37 Source: http://budget.digital.mass.gov/bb/h1/fy10h1/exec10/hbuddevhc.htm

3.3.1 Environment

Consider a unit mass of individuals heterogeneous in health type $\theta$ and labor productivity $\eta$. $\theta \in [0,1]$ is the probability of staying healthy, and $1-\theta$ is the risk of illness, in which case medical care such as hospitalization is required to restore health. The cost of the medical care is $M$. Type $\theta$ has expected cost $(1-\theta)M$. Workers produce output of value $w$.
The opportunity cost of working, such as the value of lost leisure and home production, is captured in $g(1/\eta)$ and is heterogeneous in productivity type $\eta \in [0,1]$. I assume $g'(\cdot) > 0$ and $g''(\cdot) > 0$, so that the cost of working decreases (and decreases faster) with productivity. I further assume that $g(1) = 0$ and $g(+\infty) = +\infty$, so that the highest productivity type always works, and the lowest type never works.

The distribution of types follows the density function $f(\eta, \theta)$, where $\theta$ denotes the health type. I assume $f(\eta, \theta)$ is strictly positive everywhere, and continuously differentiable in both arguments. The correlation between health and productivity type is arbitrary.

Risk-averse individuals have von Neumann-Morgenstern utility over consumption. Consumption varies with medical cost in the probabilistic health event. Individuals can insure against the uncertainty with health insurance. Insurance requires premium payment in both states of the world, but no additional payment for medical cost $M$ when sick. Uninsured patients access similar medical services, and the cost is paid through a combination of individual earning (if any) and government transfer payments.

Government determines the boundary of the insurance market. $hi(\eta, \theta) \in \{0, 1\}$ gives the insurance state of type $(\eta, \theta)$. A universal mandate implies $hi(\eta, \theta) = 1$. Given the market size, perfectly competitive insurers offer premium $p(\eta, \theta)$. Depending on the information feasible in the pricing, premium can reflect individual or average risk in the market. Government determines income transfer $t(\eta, \theta)$ and uncompensated care transfer $\tilde{t}(\eta, \theta)$ specific to uninsured patients. Type $(\eta, \theta)$ has expected utility

$$U(\eta, \theta) = \theta\, u\big(c^H(\eta, \theta)\big) + (1-\theta)\, u\big(c^S(\eta, \theta)\big) - e(\eta, \theta)\, g(1/\eta),$$

where consumption in the healthy state is $c^H(\eta, \theta) = e(\eta, \theta)\, w + t(\eta, \theta) - hi(\eta, \theta)\, p(\eta, \theta)$, and $e(\eta, \theta) \in \{0, 1\}$ indicates employment choice. Consumption in the unhealthy state is $c^S(\eta, \theta) = c^H(\eta, \theta) + \tilde{t}(\eta, \theta) - M + hi(\eta, \theta)\, M$, where $\tilde{t}(\eta, \theta)$ is the additional transfer needed to smooth consumption when the patient is uninsured, or the cost of uncompensated care. $\tilde{t}(\eta, \theta) = 0$ if $hi(\eta, \theta) = 1$.
State utility $u(\cdot)$ satisfies $u'(\cdot) > 0 > u''(\cdot)$. In addition, utility satisfies the Inada condition $u'(0) = +\infty$: marginal utility rises to infinity when the agent consumes very little, and so does the social value of income redistribution to the very poor. The concavity implies that income support $t(\eta, \theta)$ and uncompensated care $\tilde{t}$ are more generous to the low-consumption groups.

3.3.2 Adverse selection

I first examine adverse selection as a potential motivation for universal insurance. Adverse selection occurs either because risk is private information, or because price discrimination based on medical records is prohibited. 38 For simplicity, I assume perfect competition among insurers, so that premium equals the average cost of enrollees. I consider mark-up response to cost changes in the empirical analysis below.

It is easy to see that take-up is incomplete for low-risk individuals. The lowest risk type, in particular, has zero WTP for insurance. They do not enroll unless subsidy reduces the cost of premium to zero. With asymmetric information, however, risk-based subsidy is not feasible. Premium drives a wedge between the marginal and the average cost, resulting in inefficiently low take-up in the low-cost population.

In lieu of risk-based transfer, government arranges for means-tested transfers based on observed employment choice. The (low-income) non-employed individuals receive premium subsidy that lowers premium by a fraction $s_p$, in addition to a cash transfer $A$. $A$ is the benefit level of the unemployment insurance (UI) program, assumed mandatory in nature. Subsidized premium cost is $(1 - s_p)\,p$. Subsidy and cash transfers are financed with a lump-sum payroll tax on the (high-income) employed.

I analyze the desirability of extending health insurance to the lowest demand types, holding fixed the benefit design of the UI program. In this setting, the lowest demand types are the perfect health types who have zero medical expenditure risk. A mandate that enrolls the zero-demand types lowers the average cost of insurance in the economy, benefiting infra-marginal enrollees with higher demand for insurance. The social benefit is weighed against the private cost of insurance on the perfect health margin, governed by the society’s redistribution preferences across risk. Specifically, enrollment lowers the
A mandate that enrolls the zero-demand types lowers the average cost of insurance in the economy, benefiting infra-marginal enrollees with higher demand for insurance. The social benefit is weighed against the private cost of insurance on the perfect health margin, governed by the society’s redistribution preferences across risk. Specifically, enrollment lowers the 38 The premium regulation reflects some redistribution preferences for the sicker population potentially facing unaffordable level of premium due to pre-existing conditions. When premium is perfectly discrimina- tory in costs, Appendix B.1.1 shows that mandatory formal insurance priced at expected cost dominates uncompensated care. 49 average risk in the insurance programr by dr dn n=1 = r i F (1;1) wherer =E[1jhi(;) = 1] is the average risk,i =E[hi(;)] =F(1;n) the insurance rate, andF (1;1) the mass the ultra-health types. 39 The price change lowers the total cost of insurance by iM dr dn n=1 = pF (1;1). Numerically it is equal to the cost of premium borne by marginal enrollees in perfect health. A mandate reaching the ultra-margin can be considered as a transfer to infra- marginal enrollers with higher cost of coverage. Whether the transfer is desirable depends on the society’s preference for redistribution across risk. The marginal versus infra- marginal trade-off can be expressed as dW dn n=1 / pu 0 (c e ) | {z } social benefit +u(c e )u(c e +p) | {z } marginal utility loss > 0 where the cost of premium in utility terms,u(c e )u(c e +p), is compared with the premium savingpu 0 (c e ) to infra-marginal enrollees. 40 For a given pricep, the trade-off boils down to the relative value ofp to different members of society. In the most simplistic setting, standard result applies that egalitarian outcome is desirable with concave utility. Insurance is universal, and premium is equal to the average cost of the economy ¯ p =E[1]. 
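The sign of the marginal versus infra-marginal trade-off above can be checked numerically. The snippet below is a minimal sketch, not part of the original analysis: it assumes CRRA utility, and the function names, the consumption level and the premium are hypothetical values chosen for illustration. It evaluates the premium saving to infra-marginal enrollees, p u'(c_e), plus the utility cost of premium to the marginal enrollee, u(c_e) - u(c_e + p).

```python
import math


def crra_utility(c, gamma=2.0):
    """CRRA utility (an assumed functional form); gamma = 1 is log utility."""
    if gamma == 1.0:
        return math.log(c)
    return c ** (1.0 - gamma) / (1.0 - gamma)


def crra_marginal_utility(c, gamma=2.0):
    """Derivative of the CRRA utility above."""
    return c ** (-gamma)


def mandate_tradeoff(c_e, p, gamma=2.0):
    """Social benefit p*u'(c_e) plus the utility cost u(c_e) - u(c_e + p)."""
    social_benefit = p * crra_marginal_utility(c_e, gamma)
    marginal_utility_loss = crra_utility(c_e, gamma) - crra_utility(c_e + p, gamma)
    return social_benefit + marginal_utility_loss


# Hypothetical values: consumption of 10 and a premium of 1.
print(mandate_tradeoff(c_e=10.0, p=1.0, gamma=2.0))  # strictly concave utility
print(mandate_tradeoff(c_e=10.0, p=1.0, gamma=0.0))  # linear (risk-neutral) utility
```

With strictly concave utility the sum is positive, because u(c_e + p) - u(c_e) is smaller than p u'(c_e). With linear utility the two terms cancel, so the case for enrolling the perfect-health margin rests on risk aversion.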
In particular, for a given set of redistribution preferences, the allocation can be implemented either through rating regulation that prohibits price discrimination, or through risk-based subsidy on premium that varies freely with risk. 41 Formally, enrolling the highest health type = 1 (when lower health types have already enrolled) is at least locally optimal, and is matched with full premium subsidy regardless of the type distributionf (;), employmente(;), or cash transferA: Proposition 1. For an arbitrary income transferA and employment choicee(;), 39 F is the marginal density function of. 40 Subscripte denotes the employed population. The trade-off applies specifically to the employed when unemployed individuals receive fully subsidized insurance financed by an earning tax on workers. 41 Of course, different pricing and subsidy regulations may have different efficiency implications depending on behavioral responses. In this section, I conceptualize the pricing benefits and costs of expanding insurance to the ultra-margin (i.e., a mandate), but abstract from behavioral responses to price changes on the infra- margin. For instance, employment rate is assumed invariant to changes in the tax burden of subsidy. The empirical framework in Section 3.4 fully accounts for these behavioral responses and efficiency costs. 50 (a) with full insurance subsidy p = 1, enrolling = 1 increases welfare: dW dn n=1 > 0 (b) universal insurance implies full insurance subsidy: dW d p n=1 0 I provide the proof in Appendix B.1.2. With utilitarian social welfare and risk averse individuals, complete redistribution across risk implemented by a mandate is socially desirable. Absent the mandate, adverse selection leads to incomplete take-up and premium priced above ¯ p, an inefficient outcome when enrolling lower-risk individuals would have generated positive social surplus ( dW dn n=1 > 0). 
42 With more complicated preference structures, complete redistribution across risk may not be desirable. For example, if risk tolerance increases with risk, then the low-risk population can efficiently bear more risk without purchasing insurance (Appendix B.1.3). When redistribution is evaluated by the marginal utility of consumption, heterogeneity in earnings and consumption complicates the redistribution across risk through the correlation with risk (Appendix B.1.4). Moreover, tax incentives, such as a mandate penalty, have additional redistribution consequences that interact with an insurance mandate (Appendix B.1.5). I introduce these additional elements step by step in the Appendix.

42 With risk-based pricing, positive social surplus only requires WTP higher than marginal cost, which is satisfied given risk aversion. With risk pooling, it requires that the social benefit offset the marginal utility loss.

3.3.3 Uncompensated care

Suppose the government provides uncompensated care to uninsured patients, and enrolls the uninsured in formal insurance at the site of care. The non-employed can either purchase formal insurance with full premium subsidy, or remain uninsured but utilize uncompensated care free of charge. The government finances the premium subsidy with a linear tax on payroll, and a tax penalty imposed on the working uninsured. Specifically, let the tax penalty be kp, where k gives the level of penalty relative to premium.

The government finances the uncompensated care by levying surcharge fees on paying customers in healthcare. In the conceptual framework, I focus on a premium surcharge on enrollees in formal insurance, and a potential service surcharge on patients. When there is negative correlation between risk and productivity, patients on average have lower income than healthy individuals, implying a greater burden of premium surcharge on patients. With a service surcharge, the excess burden on patients further increases.
Relative to the implicit insurance of uncompensated care, expanding formal insurance with subsidy is desirable if it improves the distribution of healthcare costs in the economy. 43 This is the case if subsidy is financed progressively over the tax base, and lowers the excess burden of uncompensated care on low-income groups. The following proposition formulates the distribution implications.

Proposition 2. If productivity is negatively correlated with risk (Cov[;]> 0), then tax-financed subsidy reduces the excess burden on patients, and replacing uncompensated care with subsidized insurance always improves welfare.

Appendix B.1.6 provides the proof. The negative correlation implies patients on average have lower income than taxpayers. Expanding formal insurance shifts the cost from patients to workers, reducing the excess burden on low-income individuals. Financed by a linear tax over payroll, subsidy further distributes healthcare costs progressively over income. Therefore social insurance financed by tax-based subsidy achieves a more equitable distribution of healthcare costs, improving on the de facto coverage of uncompensated care. More generally, Appendix B.1.6 shows that tax-based subsidies on premium achieve better public finance of the safety-net mandate, if subsidies broaden the base of public finance and if redistribution is more valuable over productivity than health.

Relative to adverse selection, uncompensated care provides stronger motivation for universal health insurance. Although both motivations justify enrolling the lowest-demand individuals, enrollment in formal insurance is desirable at any level of coverage with uncompensated care. Subsidized universal insurance is generally the global optimum. The stronger motivation reflects the fact that, in addition to the premium benefit, the distribution of healthcare costs improves when the subsidy is financed over the broader tax base that lowers the burden on vulnerable individuals.
3.4 Incremental expansion in Massachusetts The idea of subsidized universal health insurance has its closest analogue in Massachusetts following the 2006-2007 reform. Instead of a direct mandate, the state achieved near- universal insurance coverage using policy instruments such as premium subsidy and tax 43 The conceptual analysis assumes away behavioral responses that might increase the cost and utilization of individuals who take-up formal insurance. I consider both behavioral responses and distribution (equity) in the empirical analysis. 52 penalty. Exploiting behavioral responses to policy as “sufficient statistics” (Chetty 2006; Chetty and Finkelstein 2013), I formulate and calculate the pricing benefits and costs of expanding current scope of insurance in an empirical framework of the Massachusetts reform. I compare motivations for policy incentives targeting different expansion groups. I present the key elements of the framework below. 3.4.1 Setting The framework adapts the optimal UI benefit design in Chetty (2006) to the case of health insurance subsidies and taxes. In a continuous time economy, individuals enter period t2 [0; 1] with state vector! t . They choose employmente t (! t )2f0; 1g and health insurance hi t (! t )2f0; 1; 2g. e t = 1 for employees. Uninsured individuals havehi t = 0. hi t = 1 for ESI enrollees, and equals 2 for subsidized enrollees in the Medicaid and the Exchange program. Standard in discrete choice models, I assume individuals are subject to an employment shock t each period. The binary employment outcomee t is determined from a single- index equationm e t (s t ; t ;! t ), wheres t is the continuous search effort chosen optimally by individuals in state! t . 44 Similarly, individuals are subject to a taste shock t in the insurance choicehi t . Specifically, the taste shock enters the WTP for insurance. Depending on the sign of t , individuals can either over- or under-insure, for a given state and cost of coverage. 
Empirically, the taste shock can represent behavioral features (Kling et al. 2012; Barseghyan et al. 2013) and optimization frictions (Handel and Kolstad 2015; Bhargava et al. 2017) associated with low take-up rates in targeted groups. In each state, forward-looking individuals make optimal employment and insurance choices subject to friction t and t . 45 The state vector! t contains choices and outcomes in previous periods, and price variables such as insurance premium, subsidy and a lump-sum payroll tax which individuals take as given. I assume a uniform subsidy generosity that covers p of premium, so that enrollees cost is (1 p )p, wherep is the premium price before subsidy. 46 The subsidy is financed by a small tax burden on payroll, averaging pb per worker, and the penaltykp per uninsured individual ineligible for subsidy. Let i =Prfhi t (! t ) =ig,i = 0;1;2 denote insurance rates in the population. Lete = 44 Employmente t = 1 if the underlying equation crosses a threshold, for example,m e t (s t ; t ;! t ) 0. The continuous search efforts t allows Envelope Theorem to be applied when employment outcome is binary and not continuously differentiable. 45 That is, choices of job search effort and WTP for insurance are optimal in expectation of the friction shocks t and t . 46 The assumption implies the income composition of recipients stays approximately the same when subsidy becomes more generous. Consistent with this view, I do not find significant employment responses to generosity variation in Section 3.5. 53 E[e t (! t )] denote the employment rate. The public finance of premium subsidy implies e pb + 0 kp = 2 p p + 1 1 p (BC:1) where 1 is the tax deduction of ESI. Budget constraint states that a lump-sum tax on payroll, net of ESI deduction and the tax penalty, finances the subsidy expenditures in public insurance programs. 
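The accounting in Equation BC.1 can be illustrated with a short calculation. This is a sketch under one reading of the constraint as described in the text: payroll tax revenue plus penalty revenue equals subsidy outlays plus the ESI tax deduction. The function name, argument names and all numeric values are hypothetical and are not estimates from the dissertation.

```python
def payroll_tax_per_worker(employment_rate, uninsured_rate, subsidized_rate,
                           esi_rate, subsidy_share, esi_deduction,
                           penalty_ratio, premium):
    """Per-worker tax that balances the budget under the assumed reading of BC.1.

    employment_rate * tax + uninsured_rate * penalty_ratio * premium
        = subsidized_rate * subsidy_share * premium
          + esi_rate * esi_deduction * premium
    """
    subsidy_outlay = subsidized_rate * subsidy_share * premium
    esi_tax_expenditure = esi_rate * esi_deduction * premium
    penalty_revenue = uninsured_rate * penalty_ratio * premium
    return (subsidy_outlay + esi_tax_expenditure - penalty_revenue) / employment_rate


# Hypothetical economy: 80% employed, 5% uninsured, 25% subsidized, 70% on ESI.
base = payroll_tax_per_worker(0.8, 0.05, 0.25, 0.70,
                              subsidy_share=0.7, esi_deduction=0.3,
                              penalty_ratio=0.5, premium=5000.0)
more_generous = payroll_tax_per_worker(0.8, 0.05, 0.25, 0.70,
                                       subsidy_share=0.8, esi_deduction=0.3,
                                       penalty_ratio=0.5, premium=5000.0)
print(base, more_generous)
```

The comparison holds enrollment and employment fixed, so it captures only the mechanical effect of a more generous subsidy on the tax burden. The behavioral responses that the framework emphasizes would enter through changes in the rates themselves.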
The condition relates an increase in subsidy generosity p to the implied change in tax burden pb , holding constant other insurance policies (k and 1 ). Specifically, the relationship d pb d p is derived from total differentiation of EquationBC:1. Let e 1 measure employees enrolled in ESI. They purchase insurance for own coverage and for non-employed dependent enrollees. The premium payment is deductible from the tax. The following condition captures the transfer payment in ESI e 1 pr = (1 1 ) 1 p (BC:2) where pr is the private transfer made by ESI payers. I assume ESI payers are limited to working enrollees in the main analysis. I allow for increasing contribution by dependent enrollees in the robustness analysis. Uncompensated care in Massachusetts is financed by a surcharge fee on paying patients and an assessment on providers. I model the share of total uncompensated cost financed by patient surcharge to be. Per capita excess burdenuc p is given by 0 >0 uc p = 0 0 gOOC (BC:3) where 0 0 ( 0 >0 ) measures uninsured (insured) patients in hospital settings. OOC is the enrollee cost sharing in formal insurance. Informal coverage is less generous, reflected in parameterg 1. 47 Individuals can choose to finance the cost sharing from medical debtdb t (! t ). The uninsured can face a higher interest rate and cost of borrowing, if they accumulate greater amount of medical debt, an externality noted in (Miller, 2016). 48 I consider the potential gains to credit access as an additional benefit of formal insurance. Premium deviates from the average cost in markets with few insurers. I model the market power of insurers using the mark-up parameter( p ;k). Depending on the market 47 Higher spending in formal insurance is also consistent with the moral hazard effect of take-up, which I use to parametrizeg for the main analysis. 48 A growing body of evidence shows public insurance programs improve financial well-being across contexts. See Allen et al. (2017), Brevoort et al. 
(2017), Argys et al. (2017), Dranove et al. (2016), Gross and Notowidigdo (2011), Gallagher et al. (2018), Hu et al. (2016), among others. 54 power, premium is set above the average cost at p = h 1 + p ;k i r( 0 ) (BC:4) where average costr( 0 ) varies depending on the selection into formal insurance. Adverse selection implies lower average cost in formal insurance as take-up increases. Since subsidy and penalty are linked to insurer-set prices, is allowed to vary with policy parameters directly as a result of pricing responses to policy (Jaffe and Shepard, 2018). 49 3.4.2 Welfare Period utility u t (! t ) = 1 X i=0 2 X j=0 Pr ij t (! t ) E g u(c ijg t (! t )) 1 fi=1g g 1 (2) reflects expected utility from consumption over health stateg, weighted by probabilistic employmenti and insurancej outcome inPr ij t (! t ) from friction t and t . Workers pay a fixed cost of employment decreasing in. Across periods, individuals maximize life-cycle utilityU = R t R ! t u t (! t )dF t (! t )dt with optimal choices of saving, medical debt, employment and insurance, subject to the in- tertemporal budget constraint and friction shocks. Let the maximized individual utility be V . The government then chooses policy parameters K = ( p ;k) to maximize a weighted sum ofV and the burden of uncompensated care on providers (1) 0 0 gOOC. I assume the government is only concerned with minimizing the third-party cost of an unfunded mandate, but is otherwise uninterested in preserving industry profit. Assuming the relative weight on consumer welfare is, social welfare W =V (1) 0 0 gOOC (3) The decision structure implies a small policy change affects welfare only through prices and the friction, with no first-order impact through choice variables (Envelope Theorem). Behavioral responses to policy affect welfare through 1) the effect on prices, and 2), the effect on the distribution of employment and insurance outcomes through friction. In general, the following proposition is true. 
49 I focus on the average premium charged by insurers in Massachusetts. The vast majority of Exchange enrollees (90%) are served by three insurers with near-identical premium. 55 Proposition 3. The welfare effect of a small increase in policy K equals dW dK = dV dK ij + dV dK u 0 d(1) 0 0 dK gOOC (4) where ij = E t E ! t 1 fe t =i;hi t =jg is the distribition of employment (i) and insurance (j) outcomes. In first-order approximation, dV dK u 0 ij (K) d ij dK u ij (K) (5) where d ij dK is the policy effect on outcome, of which fraction ij (K) moved due to the policy effect on friction, experiencing utility differenceu. I discuss details in the proof in Appendix B.1.7. Intuitively, dV dK ij captures the pricing externality on consumption through the budget constraints. In addition to changes in the state utility, the state distribution varies both because of individual choice and the policy effect on friction. For a policy effect d ij dK estimated from data, friction accounts for ij (K) of the total effect. Equation 5 captures the first-order utility change for frictional movers in ij (K) d ij dK . From Equation 4, I derive closed-form formulas for the pricing benefits on premium and the burden of uncompensated care. I similarly derive the fiscal cost on the government for implementing the policy, and the fiscal externality net of the recipient valuation. Comparing the pricing benefits with the fiscal externality determines the desirability of expanding current scope of insurance, 50 and the motivations for different expansion policies. I formulate the welfare benefits and costs for premium subsidy and the mandate penalty below. 3.4.3 Premium subsidy Consider a small increase in subsidy generosity d p . Enrollee cost decreases byp d p . New enrolleesd 0 change the average costr( 0 ) in formal insurance. Adverse selection implies lower enrollee cost with higher insurance rate, or" r; 0 > 0. 
In addition to the cost composition effect, mark-up set by insurers responds to subsidy generosity. Total effect on 50 I subject the evaluation to potential over-insurance from behavioral friction as an additional source of negative externality. 56 premium (BC:4) is given by dlogp d p = " r; 0 0 d 0 d p + 1 1 + @ @ p (6) The pricing effect applies to all payers of insurance premium, adjusted by the out-of-pocket share and the payer’s marginal utility. Standardizing the welfare metric in terms of a dollar increase in worker earning, or = 1 u 0 (c 1 ) , the premium benefit from a dollar increase in subsidy equals dW p d p p = dlogp d p 2 u 0 (c 2 ) u 0 (c 1 ) (1 p ) + p ! + 1 u 0 (c 11 ) u 0 (c 1 ) (1 1 ) + 1 ! + 0 k u 0 (c 0 ) u 0 (c 1 ) 1 ! (7) Enrollmentd 0 reduces the social cost of uncompensated care bygri( 0 ) 1 +" ri; 0 d 0 , 51 of which fraction 1 accrues to providers as avoided profit loss. The remainder lowers patient service surcharge by duc p . In total the saving is valued at dW UC =gri 1 +" ri; 0 d 0 " u 0 (c >0;0 ) u 0 (c 1 ) 1 # gri 1 +" ri; 0 d 0 | {z } excess burden u 0 (c >0;0 ) u 0 (c 1 ) gri 0 1 0 " r; 0 ! d 0 | {z } selection (8) where u 0 (c >0;0 ) u 0 (c 1 ) 1 captures excess burden on patients. The third term adjusts for the selection effect on per capita burden. Normalized in dollar units of subsidy, the welfare benefit on the social cost of uncompensated care equals dW UC d p p . Recipients receive a mechanic reduction in premium cost from an increase in p . The benefit per dollar equals dW B d p p = u 0 (c 2 ) u 0 (c 1 ) 2 (9) the standard valuation of a dollar transfer to beneficiaries from workers. Raising the subsidy transfer through taxes imposes efficiency costs. In addition to the mechanic cost of 51 ri( 0 ) is the average cost of uninsured if enrolled in formal insurance. Cost of charity care to third-party payers is lower at a fractiong. 57 current beneficiaries 2 p d p , new enrollees raise the cost by ( p +k)p d 0 . 
Switchers from ESI raise the government cost by ( p 1 )p d 1 , and adjust the burden of private transfer. Employment response e adjusts the tax burden. In total, the behavioral responses imply the fiscal cost of raising a subsidy dollar equals dW C d p p = 2 + ( p +k) d 0 d p | {z } new enrollees + 2 6 6 6 6 4 p 1 private transfer d pr d p z }| { u 0 (c 11 ) u 0 (c 1 ) (1 1 ) 3 7 7 7 7 5 d 1 d p + u 0 (c 11 ) u 0 (c 1 ) (1 1 ) 1 e 1 d e 1 d p | {z } selection from ESI + pb p de d p (10) The difference between the fiscal cost of subsidy (Equation 10) and the recipient valuation (Equation 9) gives the fiscal externality. Weighing the fiscal externality with the pricing benefits, the welfare effect of premium subsidy is dW d p p = dW P d p p + dW UC d p p + dW B d p p + dW C d p p + p d d p ! + p dV d p u 0 (11) where d d p evaluates the credit benefit of insurance, and dV d p u 0 the utility loss from friction. 3.4.4 Mandate penalty Consider a small increase in penalty dk. Suppose new enrolleesd 0 affect the average cost in formal insurance according to elasticity" k r; 0 . Allowing for mark-up adjustments, the effect on premium is dlogp dk = " k r; 0 k d 0 dk + 1 1 + @ @k (12) Across payers, the premium benefit is valued at dW p dkp = dlogp dk 2 u 0 (c 2 ) u 0 (c 1 ) (1 p ) + p ! + 1 u 0 (c 11 ) u 0 (c 1 ) (1 1 ) + 1 ! + 0 k u 0 (c 0 ) u 0 (c 1 ) 1 ! (13) 58 New enrollees lower the social cost of uncompensated care according to Equation 8. The benefit per dollar of penalty is dW UC dkp . The cost of penalty on the uninsured is valued at dW B dkp = u 0 (c 0 ) u 0 (c 1 ) 0 (14) where u 0 (c 0 ) u 0 (c 1 ) adjusts for the utility loss from taxing the uninsured (relative to workers) to increase government revenue. In addition, the government loses revenue ( p +k)p d 2 when the uninsured take-up subsidized insurance, and gives out tax rebate ( 1 +k)p d 1 when the uninsured take-up ESI. 
Total fiscal cost of increasing the mandate penalty is, dW C dkp = 0 ( p +k) d 2 dk | {z } new subsidy enrollees 1 +k + u 0 (c 11 ) u 0 (c 1 ) (1 1 ) d 1 dk + u 0 (c 11 ) u 0 (c 1 ) (1 1 ) 1 e 1 d e 1 dk | {z } new ESI enrollees + pb p de dk (15) The mechanic burden (Equation 14) net of the behavioral effects in Equation 15 gives the fiscal externality of tax penalty. The total welfare effect of raising a dollar penalty is dW dkp = dW P dkp + dW UC dkp + dW B dkp + dW C dkp + p d dk ! + p dV dk u 0 (16) 3.4.5 Model discussion The framework exploits policy incentives to understand the social implications of in- surance expansion on price, cost and welfare. The evaluation is internally valid for incremental expansion from the near-universal baseline in Massachusetts, but does not directly assess the desirability of a universal mandate. The evaluation is also specific to the analysis of externality, specifically the mechanism and distribution of effects across economy. I discuss some of these modeling assumptions below. Uncompensated care – Uncompensated Care Pool in Massachusetts is funded by an assessment on hospitals and a service surcharge on paying patients. When total cost exceeds the program budget, hospitals bear extra cost as profit loss. The share of assessment in the finance of uncompensated care is available in the program reports. The balance sheet may not accurately reflect the economic incidence of charity costs. In general, there is strong evidence that hospitals experience substantial uncompensated cost saving immediately after Medicaid expansions (Dranove et al. 2016; Blavin 2016). 59 The effect on profit, however, depends on the incidence on insurers and enrollees who also bear the burden of uncompensated care. If service fees are irresponsive to the total cost ( duc p dK = 0), then benefits accrue solely to hospitals as higher profit, and = 0. 
In Massachusetts, the surcharge burden is set to match the financing need of the charity care program, and went down from 2.90% in 2005 to 1.87% in 2013, suggesting at least some benefits to patients. Nonetheless, I vary between [0; 1] in the valuation of Equation 8. In addition to the service surcharge, hospitals can negotiate higher rebates from insur- ers to lower the assessment burden on profit. The loss of rebate increases the enrollee cost of premium. With complete pass-through of assessment on enrollees, hospital profit is not affected by uncompensated care. The benefit to enrollees and patients is dW UC =gri " (1) u 0 (c >0 ) u 0 (c 1 ) 1 1 0 +" ri; 0 ! + u 0 (c >00 ) u 0 (c 1 ) 1 1 0 +" ri; 0 " r; 0 !# d 0 (17) I discuss the robustness to different distribution of uncompensated care costs in the welfare calculation. Private transfer – In the model, private transfer refers to ESI coverage of non-employees who receive premium transfer from current employees on the job, rather than subsidy from the government. The distinction is relevant for understanding the fiscal externality of subsidy, which affects welfare through interaction with private insurance transfers. Measuring private transfer is empirically challenging. Enrollees in extended ESI cover- age receive subsidy from the Medical Security Program at similar rate as Commonwealth Care. 52 Since these enrollees would have paid full premium, reported ESI need not repre- sent private transfer to non-employees. This results in an over (under) estimate of private transfer (subsidy) recipients. In particular, change in private transfer recipients d 1e 1 dK = (1#) [ d 1e 1 dK d 1e 1 d# dK where# is the share of subsidy recipients among self-reported ESI enrollees d 1e 1 . The bias in [ d 1e 1 dK increases with#. I discuss alternative valuation of d 1e 1 dK in the empirical analysis. Friction – The welfare framework admits two sources of friction shock. 
The first shock t affects employment outcome through demand-side behavioral response to policy. The second shock t represents behavioral features in the WTP for insurance that deviate from the expected utility (EU) decision model.

Overall, there is little evidence of a strong employment effect of subsidy, either in Massachusetts or more recently under the ACA. In the welfare calculation, I therefore focus on the behavioral friction on insurance choice. More specifically, to form a stress test on the net benefits of insurance expansion, I focus on the case where t increases with policy generosity and generates over-demand for insurance. The sub-optimal insurance take-up by low-risk individuals results in utility loss in dV dKp u 0 , which I quantify using taste-shock estimates in Saltzman (2018).

52 Unemployed individuals not eligible for ESI extension can also purchase subsidized insurance directly from the Medical Security Program.

3.5 Estimation

I apply the framework to conduct a cost-benefit analysis of premium subsidy p and mandate penalty k. To estimate the incentive effect of subsidy generosity, I exploit differential subsidy eligibility across demographics within pricing communities. 53 I quantify the incentive effect of mandate penalty based on enrollment and cost variation in the unsubsidized individual market (Hackmann et al., 2015).

This section estimates the incentive effect of subsidy generosity. I instrument subsidy exposure in Massachusetts with simulated generosity measures from a reference national sample not subject to the policy. Estimated effects are comparable to those found under similar contexts in the literature, and a pure calibration exercise would leave welfare analysis largely unchanged. Still, the micro evidence is directly relevant for the policy impact in Massachusetts, as I discuss below.

3.5.1 Sample summary

Estimation sample consists of adults aged 27-64 living in Massachusetts in the 2008-2011 waves of the American Community Survey (ACS).
Health insurance variables were added to the survey in 2008, including any insurance, Medicare, Medicaid, other types of public insurance, employer sponsored insurance (ESI), and privately purchased insurance plans. I assume ESI is the primary insurance whenever it is reported.

Individuals report labor force participation (either employed or searching for employment) and current employment status. I focus on any insurance, insurance type (ESI or other), employment, and ESI interacted with employment as main outcome variables relevant for welfare.

Table 3.2 summarizes the estimation sample. 54 A quarter of the sample does not have ESI, and may qualify for subsidized insurance from Medicaid or the individual market. These individuals are relatively young, single, more likely to be ethnic minorities, and have lower education and income. Average subsidy rate in this group is 68% (69% excluding the uninsured): enrollees pay about 30% of the lowest premium applicable.

53 Similar identifying strategy has been adopted to study insurer pricing response in Commonwealth Care (Jaffe and Shepard, 2018), and subsidy design in the ACA Exchange (Tebaldi, 2017).

54 I exclude young adults below age 27 eligible for dependent coverage from the ACA dependent mandate (Akosa Antwi et al., 2013). I also exclude 2,863 group-quarter inmates (2.1% of the 27-64 sample). Family income relevant for tax-filing and subsidy application is difficult to construct for these individuals.

Table 3.2: Summary Statistics

                                    Full Sample (N=132,360)    No ESI (N=30,389)
                                    mean       std. error      mean       std. error
Demographics
  age                               45.39      0.034           44.81      0.074
  male                              0.48       0.0016          0.48       0.0035
  race = White                      0.83       0.0013          0.73       0.0032
  race = Black                      0.061      0.00086         0.095      0.0021
  race = other                      0.11       0.0011          0.18       0.0028
  Hispanic                          0.080      0.0010          0.16       0.0027
  education = less than high school 0.072      0.00093         0.18       0.0028
  education = high school           0.30       0.0015          0.41       0.0034
  education = some college          0.62       0.0016          0.41       0.0034
  married                           0.60       0.0016          0.39       0.0033
  have child below 18               0.38       0.0016          0.32       0.0032
Insurance outcome
  have any insurance                0.95       0.00084         0.80       0.0029
  have ESI                          0.74       0.0015          0          –
Labor/insurance outcome
  in labor force                    0.83       0.0012          0.64       0.0033
  employed                          0.77       0.0014          0.51       0.0035
  worked last year                  0.83       0.0012          0.62       0.0033
  in labor force + ESI              0.66       0.0016          0          –
  in labor force + no ESI           0.17       0.0013          0.64       0.0033
  not in labor force + ESI          0.077      0.00081         0          –
  not in labor force + no ESI       0.094      0.0010          0.36       0.0033
  income in % FPL                   567.21     1.73            267.06     2.33
  subsidy rate                      0.29       0.0014          0.68       0.0028
  simulated subsidy rate            0.33       0.00074         0.48       0.0016

Notes: Table summarizes estimation sample of non-institutionalized Massachusetts residents aged 27-64 in 2008-2011, adjusted by ACS sampling weights. Income as percentage of FPL is calculated by summing individual income in a tax filing unit, or nuclear families of parents/care-takers and dependent children below age 18. I then apply subsidy schedules to these units to calculate subsidy rate. Simulated subsidy rate is calculated to reflect a schedule’s generosity over a fixed national sample. Details are explained in the main text.

3.5.2 Subsidy rate

I calculate subsidy exposure for all individuals in the estimation sample using the Schedule HC Worksheets and Tables. The document is the official guideline for determining mandate penalty, affordability, and subsidy on Commonwealth Care. Figure 3.2 shows a snapshot of relevant tables in 2011. The left side determines affordability based on family income. The right side gives the lowest unsubsidized premium across age band and region. Premium contributed by subsidy enrollees equals affordability. 55 I therefore construct subsidy rate

$$\text{subs} = 1 - \frac{\text{affordability}}{\text{market rate}},$$

or the discount provided by the subsidy relative to the full price. I discuss the measurement of the numerator and the denominator below.

Numerator: affordability – Affordability is zero for low-income individuals below 150% FPL. In 2011, single individuals with income less than $16,344 and married couples with family income less than $22,068 fall in this range. 56 At higher income, affordability increases at 200%, 250% and 300% FPL, beyond which no subsidy applies.

To determine income as percentage FPL, I identify family units relevant for tax and subsidy filing using relationship pointers developed by Ruggles et al. (2018). Because adult children living with parents cannot be claimed as dependents, I split multi-generation households into single-generation families. I combine total personal income reported by parents to construct family income, 57 which I transform to percentage FPL based on poverty guidelines. 58 Affordability is assigned according to the Worksheets and Tables.

55 Premium contribution is zero for the below 150% FPL group. Between 150-300% FPL, Commonwealth Care provides at least one plan (CeltiCare) that charges subsidized premium at affordability level.

56 Married couples living in the same household are instructed to combine individual income when calculating affordability and subsidy.
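The subsidy rate defined above, one minus affordability over the market premium, can be sketched in a few lines. The function name is hypothetical, and clamping the rate to the unit interval is an assumption added here so that an affordability level above the market premium yields a zero subsidy. The dollar figures are the lowest-priced-plan values from the administrative records quoted in the text ($46 affordability against a $405 premium in 2011).

```python
def subsidy_rate(affordability, market_premium):
    """Discount relative to the full price: 1 - affordability / market premium.

    The result is clamped to [0, 1]; the clamp is an assumption of this
    sketch, not a rule stated in the Schedule HC.
    """
    if market_premium <= 0:
        raise ValueError("market_premium must be positive")
    rate = 1.0 - affordability / market_premium
    return min(1.0, max(0.0, rate))


print(round(subsidy_rate(46.0, 405.0), 2))  # 0.89
print(subsidy_rate(0.0, 405.0))             # zero affordability: fully subsidized
```

Individuals below 150% FPL have zero affordability and therefore a subsidy rate of one, whatever the market premium in their region and age band.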
The instruction on the Schedule HC reads: "If married filing separately and living in the same household, each spouse must combine their income figures from their separate U.S. returns when completing this worksheet." The instruction applies to Line 11: Eligibility for Government Subsidized Health Insurance.
57 Reported personal income in ACS includes total pre-tax income and losses from all sources, and is used to approximate federal adjusted gross income (AGI). It may differ from AGI because capital losses are not fully deductible from AGI, some social security income is tax-exempt and not included in AGI, and because other AGI deductions are not measured in ACS.
58 Poverty guidelines are published by the Department of Health and Human Services.

Figure 3.2: Affordability and Premium in 2010 Schedule HC Worksheets and Tables
Note: snapshot of page 3 in 2011 Schedule HC Worksheets and Tables, available at https://www.mass.gov/files/documents/2016/08/sr/sch-hc-wksht-tables.pdf

Denominator: market premium rate – Premium is community-rated and varies only by region and age band in a year. In 2011, premium is highest in the Berkshire-Franklin-Hampshire region. 59 Within region, premium is twice as large for the near-elderly (55+) as for young adults (27-29). Married couples combine individual premiums for family coverage.

I assign premium to individuals based on year, location and age. I match rating regions to public use micro-data areas (PUMA) in ACS. PUMAs are contiguous geographic units built on census tracts. 60 The 52 PUMAs in Massachusetts are mapped imperfectly to 14 counties, and then assigned to one of the three rating regions. When a PUMA straddles two rating regions, I calculate the average premium weighted by population share across regions, and assign the average to all individuals in the PUMA. 61 A similar strategy is adopted in Frean et al. (2017).
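The two measurement steps above, the subsidy rate and the population-weighted premium for a PUMA that straddles rating regions, can be sketched in a few lines. This is a minimal illustration rather than the code behind the estimates: the function names, region labels, dollar amounts, population shares, and the floor at zero are all assumptions made for the example.

```python
def weighted_premium(premium_by_region, pop_share_by_region):
    """Population-weighted average premium for a PUMA spanning several rating regions.

    Both arguments are dicts keyed by a (hypothetical) region label.
    Shares are renormalized so they need not sum exactly to one.
    """
    total_share = sum(pop_share_by_region.values())
    return sum(
        premium_by_region[region] * share / total_share
        for region, share in pop_share_by_region.items()
    )


def subsidy_rate(affordability, market_rate):
    """subs = 1 - affordability / market rate.

    The floor at zero (no subsidy once affordability reaches the full
    price) is an assumption of this sketch.
    """
    return max(0.0, 1.0 - affordability / market_rate)


# Hypothetical PUMA: 25% of residents in region "A", 75% in region "B".
puma_premium = weighted_premium({"A": 400.0, "B": 300.0}, {"A": 0.25, "B": 0.75})

# Hypothetical enrollee who owes $65 per month at the affordability level.
rate = subsidy_rate(65.0, puma_premium)
print(puma_premium, rate)
```

Everyone in the straddling PUMA receives the same averaged premium, so the subsidy rate within the PUMA still varies only through affordability.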
In the Appendix, I show that results are unaffected if I assign the PUMA to the larger region, or simply drop these PUMAs from the sample.

Comparing with administrative records – Based on the calculated subsidy rate in the estimation sample, Commonwealth Care enrollees receive an average subsidy rate of 91% in 2011. 62 Based on administrative records, in 2011, the subsidy rate calculated using the lowest priced plan (subsidized premium equals affordability) is 1 - $46/$405 = 89%, fairly similar to the rate calculated from ACS. 63 The small difference may be attributable to young adults 19-26 years of age not included in the estimation sample, and to noisier income measures in ACS.

3.5.3 Empirical strategy

The denominator in the subsidy rate captures policy variation in community rating. The numerator captures policy variation in affordability. Within community, insurer premium does not vary; enrollee cost differs by income according to the affordability schedule. The joint variation suggests that the incentive effect of subsidy policy is identified off income and demographic differences in subsidy exposure within rating communities, as implemented in Jaffe and Shepard (2018) and Tebaldi (2017).

In the regression analysis, I absorb premium variation in the denominator with community fixed effects, and affordability variation in the numerator with main effects of income. Interactive variation between premium and affordability identifies the causal
59 The region has the lowest average premium in 2010, and the median premium across regions in 2009.
60 PUMAs differ widely in geographic size, but generally have population between 100,000 and 200,000.
61 Although residents' exact location within PUMA is unknown, the split of PUMA population by county is available from https://usa.ipums.org/usa/volii/2000pumas.shtml. The weighting generates 7 new rating regions affecting 14% of the state population.
62 I identify CommCare enrollees in ACS as having income below 300% FPL, insured from a non-ESI source, and not eligible for Medicaid (below 133% FPL with children).
63 See Table 1 in Finkelstein et al. (2017).

impact of subsidy exposure.

However, with selection responses to affordability, the subsidy rate calculated from realized income does not isolate exogenous policy incentives, resulting in biased estimates due to reverse causality. Selection over affordability may occur when reported income exhibits excess mass on one side of subsidy eligibility. 64 Among eligibles, selection occurs when behavioral responses lead to changes in income and realized subsidy. For example, if subsidy relieves the “job lock” of near-elderly eligibles, lower retirement income qualifies them for higher subsidy. Furthermore, equilibrium effects on wage, employment and ESI offer bias the incentive measure based on the realized income distribution.

I address the selection bias by exploiting the fact that the remaining states did not undergo the reform: the subsidy rate calculated from the national sample is exempt from the selection responses in the Massachusetts sample. The simulation quantifies exogenous, sometimes complex, policy variation in a single parameter (Currie and Gruber 1996a; Currie and Gruber 1996b; Cutler and Gruber 1996). I discuss the application to subsidy generosity below.

3.5.4 Simulated generosity

I simulate two instruments to measure generosity. The main instrument captures community rating variation and affordability across demographics. Specifically, I construct the instrument $subiv$ as follows:

\[ subiv_{dapt} = 1 - \frac{1}{|N_{da}|} \sum_{i \in N_{da}} \frac{\text{affordability}_{it}}{\text{market rate}_{apt}}. \]

Applying a PUMA($p$)-year($t$) schedule to the entire simulation sample, I assign market premium based on age, and affordability based on income. I average individual rates within demographic($d$)-age($a$) cells to generate $subiv_{dapt}$.
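The construction of the main instrument can be illustrated for a single PUMA-year schedule. The sketch below is only a toy version under stated assumptions: the affordability step amounts, the premium, the three-person sample, and all names (`affordability`, `simulate_subiv`, the `demo` labels) are hypothetical, and the actual instrument repeats this for every PUMA-year schedule over the 2005-2006 national sample.

```python
from collections import defaultdict


def affordability(income_fpl, market_rate):
    """Hypothetical affordability schedule (monthly dollars) for one year.

    Zero below 150% FPL, stepping up at 150%, 200% and 250% FPL, and equal
    to the full premium at 300% FPL and above (no subsidy). The dollar
    amounts are made up for illustration.
    """
    if income_fpl < 150:
        return 0.0
    if income_fpl < 200:
        return 40.0
    if income_fpl < 250:
        return 80.0
    if income_fpl < 300:
        return 120.0
    return market_rate


def simulate_subiv(sample, market_rate_by_age):
    """Average 1 - affordability / market rate within demographic-age cells.

    `sample` is a list of dicts with keys 'demo', 'age_band', 'income_fpl',
    drawn from a fixed simulation sample (here: toy data).
    """
    ratio_sum = defaultdict(float)
    cell_size = defaultdict(int)
    for person in sample:
        cell = (person["demo"], person["age_band"])
        rate = market_rate_by_age[person["age_band"]]
        ratio_sum[cell] += affordability(person["income_fpl"], rate) / rate
        cell_size[cell] += 1
    return {cell: 1.0 - ratio_sum[cell] / cell_size[cell] for cell in ratio_sum}


toy_sample = [
    {"demo": "g1", "age_band": "27-29", "income_fpl": 100},
    {"demo": "g1", "age_band": "27-29", "income_fpl": 225},
    {"demo": "g2", "age_band": "27-29", "income_fpl": 400},
]
subiv = simulate_subiv(toy_sample, {"27-29": 300.0})
print(subiv)
```

Because the cell averages are taken over a sample that never faced the schedule, the resulting generosity measure depends on the policy parameters and on baseline income differences across cells, not on income responses to the policy.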
The idea is to capture affordability variation through baseline income differences across demographics, rather than selected income responses in Massachusetts. 65

The simulation sample consists of continental US states in 2005-2006, the baseline economy unaffected by subsidy incentives. 66 Within community $apt$, generosity varies across 144 demographic groups in $d$. 67 Table 3.3 shows subsidy is more generous for minorities, the less educated, and singles. Generosity varies substantially more across groups. In Berkshire-Franklin-Hampshire, for example, college-educated White males 30-34 years of age, married without children, receive a 7.5% subsidy in 2011 (baseline income 743% FPL). In this age group, African American single mothers with less than high-school education receive a 98% subsidy (income 63% FPL).

64 Using tax return data, Heim et al. (2017) find bunching below 400% FPL, the phase-out threshold of the ACA premium subsidy. At the subsidy phase-in level (138% FPL) in non-expansion states, Kucko et al. (2018) find bunching above the threshold, but do not detect meaningful labor adjustments by wage and salary workers.
65 For a similar strategy exploiting demographic differences in assets to quantify bankruptcy protection laws, see Mahoney (2015).
66 I choose this time period to avoid the confounding effects of the economic downturn.

Despite the rich variation, causal interpretation needs to assume that demographic outcomes would have trended similarly absent the policy. If differences in outcomes are explained by unobserved factors correlated with policy, the instrument is invalid. 68 To assess the extent of omitted variable bias, I consider a second instrument

\[ sublean_{apt} = 1 - \frac{1}{|N_{a}|} \sum_{i \in N_{a}} \frac{\text{affordability}_{it}}{\text{market rate}_{apt}}, \]

where affordability varies over age as with the rating regulation. Within age band, generosity does not differ by demographics.
The instrument then identifies the policy incentive under a weaker assumption than the main instrument, and is plausibly more exogenous.

Between the two instruments, the demographic variation provides over-identification. In specification tests, if the policy incentive appears uncorrelated with unobserved differences by age and demographics, then both instruments are likely exogenous. For power concerns I focus more on the main instrument, although results are qualitatively similar using either or both instruments.

I use the 2005-2006 US sample to simulate both $subiv$ and $sublean$. In principle, instruments simulated from the pre-reform Massachusetts sample could similarly address the selection response to subsidy, and possibly provide a stronger correlation with the endogenous exposure in the first stage. In practice, however, because the pre-reform sample does not contain sufficient observations for all demographic groups, the main instrument $subiv$ cannot be computed solely from Massachusetts. I compute the second instrument $sublean_{ma}$ using the 2005-2006 sample in Massachusetts. I show over-identified estimates using $sublean_{ma}$ as the second instrument in Appendix Table B.2.4.

67 I include gender, race (White, Black, other), Hispanic origin, education levels (< high school, high school, some college), marital status and presence of children in the simulation. Hispanic origin leads to a small number of observations in certain cells. Dropping Hispanic origin barely changes results.
68 In this case, fixed effects may control for baseline differences across demographics, but cannot address time-varying confounds correlated with policy.

Table 3.3: Demographic variation in subsidy rate

                                         subsidy rate          simulated subsidy rate
                          Observation    mean    s.d. error    mean    s.d. error
age
  27-29                   8,454          0.40    0.0057        0.45    0.0028
  30-34                   14,340         0.33    0.0043        0.39    0.0024
  35-39                   15,407         0.30    0.0041        0.35    0.0022
  40-44                   18,400         0.28    0.0038        0.33    0.0019
  45-49                   20,440         0.26    0.0035        0.30    0.0018
  50-54                   20,423         0.26    0.0035        0.29    0.0018
  55-64                   34,896         0.27    0.0027        0.32    0.0014
male                      62,612         0.27    0.0020        0.32    0.00097
female                    69,748         0.31    0.0020        0.35    0.0011
race
  =White                  113,212        0.25    0.0015        0.30    0.00074
  =Black                  6,518          0.50    0.0066        0.52    0.0032
  =other                  12,630         0.46    0.0048        0.47    0.0025
Hispanic origin           8,163          0.59    0.0057        0.61    0.0027
non-Hispanic origin       124,197        0.26    0.0014        0.31    0.00071
education
  =less than high school  7,831          0.69    0.0055        0.74    0.0019
  =high school            38,556         0.41    0.0027        0.46    0.0012
  =some college           85,973         0.18    0.0015        0.23    0.00057
married                   85,843         0.18    0.0015        0.22    0.00065
not married               46,517         0.46    0.0025        0.51    0.0011
have dependent children   51,822         0.28    0.0022        0.32    0.0013
no dependent children     80,538         0.30    0.0018        0.34    0.00091
Notes: Table shows subsidy rate by demographics in the estimation sample. Endogenous subsidy rate is calculated from reported family income. Simulated subsidy rate is calculated from a baseline national sample, and assigned to Massachusetts individuals based on rating community and demographics. Details of the simulation are in the main text.

3.5.5 Econometric model

I instrument endogenous exposure $subs_{ipt}$ with generosity $subiv_{dpt}$. In the reduced form,

\[
\begin{aligned}
y_{iapt} ={}& \beta\, subiv_{d(i)apt} + \gamma_1\, incb_{d(i)} + \alpha_a + \alpha_p + \alpha_t + \alpha_{b(a) \times r(p) \times t} \\
&+ \gamma_{b(a) \times r(p)}\, incb_{d(i)} + \gamma_{r(p) \times t}\, incb_{d(i)} + \gamma_{b(a) \times t}\, incb_{d(i)} + \theta_{r(p) \times t}\, X_{d(i)} \\
&+ \delta_1\, UE_{b(a)t} + \delta_{r(p) \times t}\, X_{d(i)}\, UE_{b(a)t} + \epsilon_{iapt} \qquad (18)
\end{aligned}
\]

I control for the main effects of age $\alpha_a$, PUMA $\alpha_p$, year $\alpha_t$, and baseline income by demographics $incb_{d(i)}$. I further control for all three-way interactions across the four, 69 generating community-level fixed effects and differential income trends over region-year, age-year, and age-region. To alleviate omitted variable bias, I control for unemployment rate at the same level of the instrument.
70 Specifically, I include the age-specific unemployment rate $UE_{b(a)t}$, interacted with level effects of demographic variables $X_{d(i)}$ and region-year indicators. Less aggressive controls of employment shocks yield similar estimates.

Table 3.4 shows the first stage. All columns control for main effects of PUMA, age, year, and income, demographic variables, and region-year fixed effects. Columns 3-4 in addition include the unemployment rate interacted with demographic variables. Identifying power in $sublean$ weakens with added controls. Column 5 corresponds to the main specification with the full set of interactions, 71 where the F-statistic is above 500. Because the first-stage estimate is nearly one, I focus on reduced-form results in what follows.

3.5.6 Results

Table 3.5 shows the estimated effect of subsidy incentives. In Panel A, endogenous exposure is significantly associated with lower insurance coverage and employment. In particular, employment would drop by 4 percentage points following a ten percentage point increase in subsidy.

In Panel B, reduced-form estimates suggest a modest effect on insurance take-up, and a null effect on employment. Labor participation is similarly unresponsive to subsidy. Stratified by year, the take-up effect is small in the first two years but increases over time. 72 In 2011, a ten percentage point

69 I use the more aggregated levels of age band $b(a)$ and region $r(p)$ in the interaction.
70 If the macroeconomic impact across demographics is intermediated by the subsidy policy (for example, individuals with higher subsidy are less likely to lose coverage when they lose jobs), then the instrument is correlated with employment shocks in the error term.
71 In this specification, community-level fixed effects sweep out the variation in the lean instrument. I hence base over-identified results on specifications in columns 3-4.
72 Studying the ACA Exchange, Frean et al.
(2017) show that subsidy increased take-up by 0.051 percentage

Table 3.4: First stage: endogenous subsidy predicted by simulated generosity

                      (I)       (II)       (III)     (IV)       (V)
subiv                           0.94***              0.94***    0.99***
                                (0.044)              (0.045)    (0.042)
sublean               1.04**    0.24       0.86*     0.085
                      (0.42)    (0.43)     (0.47)    (0.48)
region-year FE        Y         Y          Y         Y          Y
region-year-age FE                                              Y
UE                                         Y         Y          Y
F-stat                6.06      228.36     3.41      227.99     549.49
R^2                   0.28      0.29       0.28      0.29       0.29
*** p < 0.01, ** p < 0.05, * p < 0.10
Notes: Table shows the first-stage regression of endogenous subsidy rate on simulated instruments. In all specifications I include main effects of PUMA, age, year, and income, region-year fixed effects, and demographic variables. Columns 3-4 additionally include the age-band unemployment rate interacted with demographic variables. Column 5 corresponds to the main specification with full interaction terms. Robust standard errors clustered at the level of PUMA in parentheses.

increase in subsidy raises coverage by 1.7 percentage points. Employment outcomes vary slightly over years, but on average, evidence of a significant labor supply and employment effect of subsidy is minimal.

ESI coverage of workers decreases by 0.26 percentage point per subsidy percent. Relative to the OLS estimate, the reduced form suggests larger selection on ESI benefit rather than employment. Appendix Table B.2.1 further suggests ESI benefit decreased more for young workers than middle-age workers. For the near-elderly, employment selection becomes significant, and ESI decreases more. The overall pattern is consistent with sorting of young workers to small firms less affected by the mandate (Aizawa, 2017), and with the “retirement lock” of ESI alleviated by public subsidy.

ESI decreases more for the non-employed. To the extent that insurance is sponsored by previous employers for which unemployed enrollees pay full premium, or receive state subsidy, actual selection from private transfer is smaller.
Measuring 1e 1 as ESI enrollees out of work for at least a year (and hence less likely enrolled in ESI continuation plans), estimated selection is -0.28 percentage point per subsidy percent. For enrollees out of work for five years, the estimate is further lower at -0.19 (Appendix Table B.2.2). I calculate welfare based on different estimates of private transfer crowd-out. 3.5.7 Robustness The instruments pass the joint exogeneity test in Panel C of Table 3.5. The test is based on the specification with main effects of location, year, age, and demographic variables interacted with unemployment rate. Appendix Table B.2.3 presents similar results for basic specification without unemployment controls, and separate 2SLS estimates using either instrument. Although the weaker instrument produces larger and less precise estimates, over-identified estimates appear similar across specifications. Appendix Table B.2.4 examines the robustness of over-identified results when the second instrument is simulated from the Massachusetts sample. Due to sample size limita- tion, I only simulate generosity across age, location, and year, but not across demographics. When instrumented jointly withsublean, the effects are larger but none significant, similar to results in Appendix Table B.2.3. When instrumented jointly withsubiv, effects on insurance and employment outcomes are comparable to those in Panel C, Table 3.5. Effect on ESI selection is concentrated in the non-employed, but insignificant for workers, differ- ent from the main results in Table 3.5. For the calculation of welfare, I follow the main estimates that likely over-state the selection effect on private insurance. I then vary the point per subsidy percent in 2014, and 0.089 in 2015. 
Table 3.5: Estimated incentive effect of subsidy

                 (I)             (II)        (III)            (IV)        (V)
                 any insurance   employed    in labor force   ESI +       ESI +
                                                              employed    not employed
Panel A: OLS
subs             -0.071***       -0.41***    -0.30***         -0.55***    0.046***
                 (0.0032)        (0.0071)    (0.0066)         (0.0071)    (0.0045)
R^2              0.083           0.21        0.18             0.29        0.054
Panel B: reduced form
subiv            0.10***         -0.055      -0.056           -0.26***    -0.33***
                 (0.025)         (0.053)     (0.044)          (0.058)     (0.023)
R^2              0.071           0.090       0.10             0.13        0.054
2008             0.034           -0.049      -0.017           -0.27***    -0.37***
                 (0.051)         (0.088)     (0.084)          (0.10)      (0.044)
2009             0.072*          -0.0063     -0.075           -0.22***    -0.38***
                 (0.041)         (0.081)     (0.062)          (0.081)     (0.042)
2010             0.11***         -0.095      -0.074           -0.30***    -0.29***
                 (0.037)         (0.063)     (0.055)          (0.067)     (0.036)
2011             0.17***         -0.059      -0.053           -0.25***    -0.31***
                 (0.047)         (0.083)     (0.070)          (0.087)     (0.036)
Panel C: over-identified 2SLS
subs             0.16***         -0.056      -0.046           -0.32***    -0.26***
                 (0.037)         (0.057)     (0.047)          (0.046)     (0.029)
F-stat           227.99          227.99      227.99           227.99      227.99
p-value          0.92            0.46        0.92             0.19        0.44
y mean           0.95            0.77        0.83             0.64        0.10
*** p < 0.01, ** p < 0.05, * p < 0.10
Notes: Table shows the estimated incentive effect of premium subsidy. Panel A shows OLS estimates using the endogenous subsidy rate subs. Panel B shows the reduced-form effect of simulated generosity subiv, and year-specific effects interacting subiv with year dummies from a separate regression. Panels A and B are based on the main specification with full interaction terms. Panel C shows two-stage least squares estimates instrumenting subs with subiv and sublean, based on a specification with main effects of PUMA, age, year, income, and demographic variables interacted with unemployment rate. The p-value from the Hansen over-identification test is reported. Robust standard errors clustered at the level of PUMA in parentheses.

size of selection and the case where subsidy does not affect ESI coverage in the robustness analysis.

Appendix Table B.2.5 shows robustness to alternative treatment of border PUMAs.
The main analysis first averages premium across rating regions before calculating subsidy. Results are nearly identical if subsidy is calculated for each region and then averaged. Assigning the PUMA to region with larger population yields similar estimates. Moreover, results remain significant at conventional level with standard error clustered at the level of region and age band. I conduct permutation tests over the remaining 50 states that did not undergo the reform. Specifically, I randomly assign Massachusetts rating communities across location, year and age band in control states, and estimate pseudo policy effect using simulated generositysublean. Appendix Figure B.3.1 plots the empirical distribution of reduced- form estimates across states. Appendix Figure B.3.2 plots the distribution based onsubiv, where I also permute Massachusetts affordability across income (demographics). In most cases, the true effect in Massachusetts is significant at 95% level. 3.6 Calculation I calibrate the enrollment effect on formal insurance pricing" r; 0 and uncompensated care" ri; 0 using administrative data on marginal and average costs. I calibrate insurer pricing distortion due to subsidy using simulated evidence from Jaffe and Shepard (2018), the credit channel from Brevoort et al. (2017), and consumption from the Consumer Expenditure Survey (CEX). Pricing effect of subsidy – I focus on the Commonwealth Care program for the effect of subsidy on pricing. Enrollees face discrete changes in subsidy when income crosses thresholds at 150% FPL, 200% FPL, and 250% FPL. In 2011, larger subsidy below 150% FPL increased take-up from 70% to 94%, and lowered average monthly cost from $380 to $333 (?). Cost of new enrollees is $33394%$38070% 94%70% = $196 at 150% FPL. Similarly, cost of new enrollees is $268.2 at 200% FPL, and $280.86 at 250% FPL. 
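The cost of new enrollees at a threshold follows from how the average cost moves as take-up expands. The short check below reproduces the $196 figure at 150% FPL from the numbers quoted above. The arithmetic operators are lost in this transcript, so the formula is an inferred reading that matches the reported result, and the function name is hypothetical.

```python
def new_enrollee_cost(avg_cost_after, takeup_after, avg_cost_before, takeup_before):
    """Average cost of the enrollees added when take-up rises.

    Total cost after expansion minus total cost before, divided by the
    change in take-up (all per eligible person).
    """
    added_cost = avg_cost_after * takeup_after - avg_cost_before * takeup_before
    return added_cost / (takeup_after - takeup_before)


# 2011, 150% FPL threshold: take-up rises from 70% to 94%,
# average monthly cost falls from $380 to $333.
cost_150 = new_enrollee_cost(333.0, 0.94, 380.0, 0.70)
print(round(cost_150, 2))
```

New enrollees cost roughly half as much as existing ones at this threshold, which is why the average cost falls as subsidy draws them in.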
Across thresholds, new enrollees lower average cost by dr = $377:3 $232:5 0:05% 94% 0:05% $377:3 = $0:07 where $377.3 is the cost of existing enrollees, and $232.5 the cost of new enrollees. At 94% 73 coverage baseline in the 19-64 population, implied cost elasticity is small " p r; 0 = dr d 0 0 r = $0:07 0:05% 6% $377:3 = 0:023 Recall from Equation 6 that premium also responds to mark-up adjustment subsidy: dp d p = " p r; 0 r 0 " 1 + +r @ @r # d 0 d p + r @ @ p Two pieces of evidence potentially inform the magnitude of @ @r and @ @ p . Jaffe and Shepard (2018) shows price-linked subsidy increased average CommCare premium by 5.4%, rela- tive to a fixed subsidy. The distortion is largest for the cheapest plan directly linked to sub- sidy, where mark-up increased from 17.6% to 25%, the maximum allowed by law. 73 Aver- age mark-up increased from 12.40% to 13.55% with linkage, implying @ @ p = 1:15% 71% = 0:016. Hackmann et al. (2015) finds both mark-up and cost decreased in the unsubsidized individual market, implying @ @r > 0. However, risk pooling with the small-group market may also lower mark-up in the individual market. 74 Evaluated at the lowest premium in 2011, where mark-up binds at the maximum 25%, and @ @r = @ @ p = 0 by the rating regulation, price effect dp d p = 0:023 $323:04 6% 1 + 25% (0:17) =$26:31 Implied saving is dp d p p =0:065 per subsidy dollar. 75 For average enrollee premium ($424) and mark-up (11%), premium saving is similarly small at 0:023 6% (0:17) + 0:016 1+11% =0:051 per subsidy dollar, assuming @ @r = 0. I shut down the premium response and calibrate dlogp d p = 0 in the baseline analysis. I re-calibrate the premium benefit and compare with uncompensated care for a range of mark-up responses in the discussion of reform rationales. Pricing effect of mandate – Enrollment in unsubsidized individual market increased from 1% in 2006 to 3% by 2010. 
76 Both the mandate penalty and the rating regulation 73 The minimal medical loss ratio law requires at least 80% of premium spent on medical claims, or a maximum mark-up of 25%. 74 As part of the 2006-2007 reform, Massachusetts pooled the risk base of small-group and non-group market. The pooling significantly reduced the premium of individual coverage, with much smaller effect on group insurance premium. 75 Premium of cheapest plan (CeltiCare) is $403.8 per month, with cost $323.04 (Jaffe and Shepard, 2018). 76 I divide enrollment counts in Key Indicators, available at http://archives.lib.state.ma.us/ bitstream/handle/2452/112747/ocn232606916-2011-05.pdf, by population estimate from ACS. 74 potentially contribute to the increase with lower (net) cost of coverage. The maximum effect attributable to penalty is d 0 dk = 2% 50% =0:04, where penalty is half of the cheapest plan premium. Average cost decreased by 8.7% when take-up increased by 26.5 percentage points in the individual market (Hackmann et al., 2015). 77 New enrollees cost dr d 0 = dlogr d 0 r = 8:7% 26:5% $5;720 12 = $144:18, one-third the baseline average costr = $5;720 12 = $439:17, and 40% less costly than new enrollees in the subsidized program. Absent the mandate, if high-income low-cost individuals did not enroll, average cost would be higher by $377:394%$144:182% 94%2% $377:3 = $5:07. Cost elasticity of mandate" k r; 0 = $5:07 2% 6% $377:3 = 0:040. Assuming penalty has no direct effect on pricing ( @ @k = 0), and the smaller mark-up is completely due to risk pooling (hence @ @r = 0), premium decreases by dp dkp = " k r; 0 0 d 0 dk = 0:040 6% (0:04) =0:027 per subsidy dollar. The baseline analysis calibrates dlogp dk = 0. I introduce the premium benefit and compare with uncompensated care in the discussion of reform rationales. 
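The mandate calibration above chains four small calculations, which the sketch below reproduces step by step. The operators and signs are garbled in this transcript, so they are inferred here from the reported results, and the variable names are hypothetical. The monthly baseline cost of $439.17 is taken as given; it corresponds to an annual figure of $5,270 rather than the printed $5,720.

```python
# Inputs quoted in the text.
baseline_monthly_cost = 439.17    # baseline average monthly cost (taken as given)
cost_drop = 0.087                 # average cost fell by 8.7% ...
takeup_rise = 0.265               # ... when take-up rose by 26.5 percentage points
existing_cost = 377.3             # monthly cost of existing enrollees
insured_share = 0.94              # coverage baseline
mandate_enrollment = 0.02         # enrollment attributed to the mandate
uninsured_share = 0.06
enrollment_per_penalty = -0.04    # -2% / 50%, maximum effect of the penalty

# Step 1: monthly cost of enrollees brought in by the mandate.
new_enrollee_cost = cost_drop / takeup_rise * baseline_monthly_cost

# Step 2: average cost if these low-cost individuals had not enrolled.
cost_without_mandate = (
    existing_cost * insured_share - new_enrollee_cost * mandate_enrollment
) / (insured_share - mandate_enrollment)
cost_increase = cost_without_mandate - existing_cost

# Step 3: implied cost elasticity of the mandate.
elasticity = cost_increase / mandate_enrollment * uninsured_share / existing_cost

# Step 4: premium change per dollar, with no direct mark-up response.
premium_effect = elasticity / uninsured_share * enrollment_per_penalty

print(round(new_enrollee_cost, 2), round(cost_increase, 2),
      round(elasticity, 3), round(premium_effect, 3))
```

Each printed value can be compared with the corresponding figure in the paragraph above ($144.18, $5.07, 0.040 and -0.027).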
Uncompensated care – In 2011, uncompensated care to the uninsured and the under- insured totals $496 million, of which 90% is spend on the 19-64 population, and 85% of beneficiaries have no primary source of insurance. 78 Average uncompensated cost is $133.47 per uninsured. 79 Assuming a 25% moral hazard of insurance, or 80% actuarial value of uncompensated care (g = 0:8), 80 average cost of enrolling the uninsured isri = $133:47 0:8 = $166:84. Subsidy enrollment lowers the cost by $166:846%+$232:50:05% 6%+0:05% $166:84 = $0:54. Cost elasticity " p ri; 0 = $0:54 0:05% 6% $166:84 = 0:39. Mandate penalty lowers the cost by $166:846%+$144:182% 6%+2% $166:84 =$5:67, with cost elasticity" k ri; 0 = $5:67 2% 6% $166:84 =0:10. Around one-third of uncompensated cost is financed by patient surcharge uc p = 0 1 0 ri r gOOC, where = 0:32, 81 and 0 1 0 ri r = 0:028 is the cost share per patient. For- mal insurance enrollment lowers patient surcharge by 40% every ten percent subsidy, 82 whereas the effect on premium is only dlogp d p 0:10 =0:5%. A ten percent penalty lowers 77 The take-up estimate is specific to eligible individual market enrollees not offered ESI. 78 Detailed accounts of cost and funding sources are in the annual report of Health Safety Net (HSN), the primary payer of uncompensated care in Massachusetts. The 2011 report is available athttps://www. mass.gov/files/documents/2016/07/tp/hsn11-ar.pdf. 15% of the HSN cost is supplemental benefit to Medicaid, CommCare and private insurance enrollees. Adjusting for the supplement would increase the cost of an average insured by $1.48. 79 Total cost $496M 90% 85% divided by uninsured months 12 236;902. 80 The estimate is consistent with spending response to cost-sharing in Commcare (Chandra et al., 2014), and experimental evidence from Medicaid lottery (Finkelstein et al., 2012). 81 Statutory contribution of service charge is $160 million, about $160 $496 = 32% of total cost. 
82 Note that dloguc p dK = dloguc p d 0 d 0 dK , where dloguc p d 0 = 1 0 1 1 0 +" K ri; 0 " K r; 0 . 75 surcharge by 6.2% and premium by 0.27%, where new enrollees are less costly than the average uninsured. The remaining cost is financed by state and federal funds (up to $100 million), and assessment on provider. 83 Avoided externality on government budget and hospital profit is (1)gri (1 +" ri; 0 ) per enrollee, or 39% (6%) per ten percent subsidy (penalty). Credit channel – Health insurance reduces medical debt, and improves consumption opportunity cross periods through the household balance sheet. 84 Moreover, interest payment on non-medical debt decreases as credit rating improves. I calibrate both effects based on Brevoort et al. (2017). With optimal choice of medical debt, repayment generates no social benefit. Nonethe- less, liquidity benefit to households is non-trivial. In the ACA context, quarterly saving of new medical debt is $13:50 in states that expanded Medicaid, an increase in liquidity of $13:50 3 1 4:4% 0:17 $424 = $17:39 $424 = 0:041 per subsidy dollar. 85 The externality on non-medical interest payment is welfare-relevant, when individuals do not internalize the cost on credit rating and price. Rating improvement from Medicaid expansion lowers annual interest payment by $14:60, a monthly saving of $14:60 12 1 4:4% 0:17 $424 = 0:011 per subsidy dollar. The calibration does not incorporate dynamic effect on net worth, 86 and may understate the credit benefit. Consumption – To monetize welfare, price externalities are weighted by marginal utility relative to the reference group (workers). Assuming constant relative risk aversion (CRRA),u 0 (c) =c , welfare weight depends on consumption ratio given calibration of . I measure consumption of 27-64 Massachusetts residents in the 2011 CEX panel. I infer insurance (ESI, public, uninsured) from premium expenditure, and hospital utilization (g = 0) from medical expenses. 
87 Appendix Table B.2.6 shows average non-medical consumption for beneficiary groups. Based on the statistics, private transfer externality is weighted by 1:03 , subsidy by 0:64 , penalty by 0:57 , surcharge by 0:77 , and interest saving by 0:89 . 83 I only consider the mechanic cost externality on third-party payers. If the externality distorts hospital mark-up on reimbursed services, or crowds-out government funds that generate greater social value–e.g., investment in early education and transfer program to the poor, then the social cost of uncompensated care is under-estimated. 84 Consumption increases less than the liquidity ofbd t (e ! t ), depending on non-medical borrowing adjust- ments ˙ A t (e ! t ). 85 The average saving is driven by a 4.4% take-up, adjusted by the subsidy effect on take-up. 86 The calculation in Brevoort et al. (2017) holds fixed the contract term and the debt structure, simplifying from dynamic adjustments. When choices of medical and non-medical borrowing are assumed optimal, dynamic externality from interest saving is likely small. 87 CEX estimate of insurance is 94%, and 12-month employment rate 82%, both very similar to ACS estimates (Table 3.2). Hospital utilization in CEX is 7.7%. 76 More generally, society may approach insurance externalities based on concerns other than differences in marginal utility of consumption. In this case, a different set of general social welfare weights applies (Saez and Stantcheva, 2016). I discuss implications of different motivations and measurements of welfare weights. 88 Friction – I focus on micro-level friction that places a wedge between consumption smoothing benefit of insurance (the welfare metric) and private utility generating observed patterns of take-up. Individuals optimize based on private utility, but welfare may increase or decrease depending on the direction of friction response to policy. 
I focus on the case of over-insurance, or potential welfare loss for marginal enrollees, to assess the robustness of standard welfare calculation from a frictionless model. I assume over-insurance is driven by “taste for complicance” that increases with policy generosity for a set of low-risk individuals. Saltzman (2018) estimates the taste shock in terms of WTP: the existence of mandate penalty increases WTP for insurance by $13 (3% of premium) in the Washington Exchange, and $64 (15% of premium) in California. The taste subsidy on demand is 3% 71% = 4:2% the magnitude of premium subsidy in Washington, and 21% in California. Assuming >0 p 2 [4:2%; 21%] of the total effect on take-up is driven by taste, welfare loss is >0 p d 0 d p 0 B B B B B B B B B @ u 0 (c 0 ) (1 p k) >0 p u 0 (c 1 ) + u 0 (c 01 ) (1) bd t +uc p (1g)nM p >0 p u 0 (c 1 ) 1 C C C C C C C C C A where I Taylor-expand the utility of taste-based enrollees at the optimal consumption when uninsured. Friction reduces consumption by the net cost of subsidized premium (1 p k)p. Utility change in the sick state is small if taste-based enrollees are characterized with low risk and low medical debt. 89 Focusing on the premium cost on consumption, the welfare loss is >0 p d 0 d p u 0 (c 0 ) 1 p k >0 p u 0 (c 1 ) = 21% 0:17 0:63 1 2 = 0:018 0:63 (19) where (non-medical) consumption is evaluated at the uninsured average, subsidy assumed zero, and >0 p = 21% to over-state the welfare loss. For mandate penalty, a 15% taste subsidy is >0 k = 15% 50% = 30% the magnitude of penalty 88 For example, using food expenditure variation to reflect redistributive preference for subsistence consumption, social weight of externality is generally smaller (Appendix Table B.2.6). 89 Whenbd t >0 p 0, because formal insurance is more generous, (1g)nM>uc p , consumption increases for taste-based enrollees in bad health state. 77 on premium in the unsubsidized population. 
Friction is calculated as

>0 k d 0 dk u'(c_0) 1 p k >0 k u'(c_1) = 30% × 0.04 × 1/2 × 0.37^{-γ} = 0.006 × 0.37^{-γ}   (20)

using non-medical consumption, where γ is the curvature of utility.

3.7 Welfare

3.7.1 Subsidy

Absent friction to marginal enrollees, subsidy affects welfare through premium assistance to recipients, uncompensated care to third-party payers, interest charges on the uninsured, and the fiscal cost to taxpayers. Table 3.6 calculates welfare ignoring the efficiency gain on insurance premium. Welfare weights vary by the curvature in the utility function (γ), reflecting the value society derives from moving a dollar from high-income to low-income individuals. Given γ, society may value transfers that cover subsistence needs more than additional transfers; as an example, I consider food consumption.

Table 3.6: Welfare effect of premium subsidy

           (I)        (II)       (III)      (IV)       (V)               (VI)       (VII)
           Enrollee   Uncomp.    Interest   Fiscal     (1)+(2)+(3)+(4)   Friction   Total
           benefit    care       saving     cost
           (dW_B)     (dW_UC)               (dW_C)                       (dW_F)     (dW)
Panel A: non-medical consumption
γ = 0      0.232      0.074      0.011      -0.338     -0.021            -0.018     -0.039
γ = 1      0.368      0.082      0.013      -0.342      0.121            -0.028      0.093
γ = 2      0.585      0.092      0.014      -0.346      0.345            -0.045      0.3
γ = 3      0.928      0.105      0.016      -0.350      0.699            -0.071      0.628
Panel B: food consumption
γ = 1      0.249      0.079      0.011      -0.339      0                -0.018     -0.018
γ = 2      0.268      0.085      0.012      -0.339      0.026            -0.019      0.007
γ = 3      0.288      0.092      0.012      -0.340      0.052            -0.021      0.031

Notes: Table calculates the benefit of subsidy dollar to enrollees in column 1 (Equation 9), to uncompensated care payers in column 2 (Equation 8), interest saving in column 3 (Brevoort et al., 2017), and the fiscal cost of financing in column 4 (Equation 10). Column 5 calculates the net benefit in column 1 to 4. Column 6 calculates the utility loss from over-insurance (Equation 19), where the preference shock for compliance increases with subsidy according to Saltzman (2018). Total welfare effect in column 7 is intended as a lower bound.
Welfare weights vary by the curvature parameter in the utility, and by the consumption base considered appropriate for redistribution.

With linear utility in consumption, the social value of redistribution is zero. Compared to the fiscal cost of public funds, and the moral hazard of formal insurance, benefits to subsidy recipients and uncompensated care payers do not outweigh the cost. Further accounting for the external benefit on credit still yields a small welfare loss of 0.02 per dollar.

The baseline calculation does not consider the premium response to enrollment, and over-states the efficiency cost of subsidy. Nonetheless, a modest dose of equity consideration offsets the baseline cost. Fixing the curvature parameter at γ = 1, when redistribution is based on total non-medical consumption, recipient valuation of a subsidy dollar, net of the welfare loss to taste-based enrollees, outweighs the fiscal cost.

When redistribution is based on food consumption only (Panel B), the social benefit to recipients is perceived to be less than the cost to taxpayers, and redistribution is not a motivating rationale for subsidy. In this case, the social cost of uncompensated care becomes a crucial element in the justification. In Panel B, saving in patient surcharge and provider assessment generates about one-third the direct benefit to enrollees, and roughly closes the gap with fiscal cost at low preference for redistribution (γ = 2). Accounting for the interest saving to the uninsured rationalizes subsidy cost for γ = 1, although welfare loss from taste-based enrollment potentially offsets the benefit.

Appendix Table B.2.7 varies the incidence of uncompensated cost. Welfare benefit is largest with complete pass-through to patients, and smaller when uncompensated cost is financed by a premium tax on enrollees. Regardless of incidence, the total welfare effect (including friction) balances out under conservative welfare weights for γ = 2.
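The friction terms in Equations 19 and 20 and the column sums in Table 3.6 can be reproduced with a few lines of arithmetic. The sketch below is a check on the reported figures, not the author's code. It assumes that the trailing 0.63 and 0.37 in the two equations are consumption ratios raised to the power -γ, with γ the curvature parameter; this reading is an assumption, adopted because it reproduces the Panel A friction columns of Tables 3.6 and 3.7. All function and variable names are illustrative.

```python
# Check of Equations 19-20 and Table 3.6, Panel A.
# Assumption: the friction loss scales with consumption_ratio ** (-gamma),
# where gamma is the curvature of utility. Names are illustrative only.

def friction_loss(taste_share, takeup_response, net_premium_share,
                  consumption_ratio, gamma):
    """Welfare loss to taste-based enrollees per policy dollar."""
    base = taste_share * takeup_response * net_premium_share
    return base * consumption_ratio ** (-gamma)

# Equation 19 (subsidy): 21% x 0.17 x 1/2, consumption ratio 0.63.
subsidy_friction = [round(friction_loss(0.21, 0.17, 0.5, 0.63, g), 3)
                    for g in range(4)]

# Equation 20 (penalty): 30% x 0.04 x 1/2, consumption ratio 0.37.
penalty_friction = [round(friction_loss(0.30, 0.04, 0.5, 0.37, g), 3)
                    for g in range(4)]

# Table 3.6, Panel A: columns (I)-(IV) for gamma = 0, 1, 2, 3.
panel_a = {
    0: (0.232, 0.074, 0.011, -0.338),
    1: (0.368, 0.082, 0.013, -0.342),
    2: (0.585, 0.092, 0.014, -0.346),
    3: (0.928, 0.105, 0.016, -0.350),
}

# Column (V) sums columns (I)-(IV); column (VII) subtracts the friction loss.
net_benefit = {g: round(sum(cols), 3) for g, cols in panel_a.items()}
total_welfare = {
    g: round(sum(cols) - friction_loss(0.21, 0.17, 0.5, 0.63, g), 3)
    for g, cols in panel_a.items()
}

for g in panel_a:
    print(g, net_benefit[g], -subsidy_friction[g], total_welfare[g])
```

Under these assumptions the subsidy friction comes out as 0.018, 0.028, 0.045, and 0.071 for γ = 0 to 3, and the penalty friction as 0.006, 0.016, 0.044, and 0.118, in line with the Panel A friction columns of Tables 3.6 and 3.7.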
3.7.2 Mandate penalty

I calculate the welfare effect of mandate penalty for the unsubsidized population (income above 300% FPL), where subsidy is zero and penalty is half the lowest premium (k = 1/2). As with subsidy, the baseline calculation ignores the premium response to enrollment, and over-states the efficiency cost of policy. For the unsubsidized group, I further ignore any credit benefit to the uninsured. 90

Table 3.7 calculates the direct cost of penalty on the uninsured in column 1, saving to uncompensated care payers in column 2, and the fiscal impact on government budget in column 3. 91 Absent redistribution concerns (curvature γ = 0), a dollar increase in penalty raises government revenue by 0.003 dollar. Efficiency cost of new enrollees k d 0 dk = 0.02 nearly offsets the penalty increase on the uninsured. 92

90 Credit cost borne by the uninsured is less than 20% the medical cost borne by third-party payers. In the high-income group, social cost of uncompensated care is calculated to be small, and the effect on credit is negligible.
91 See Appendix Table B.2.8 for consumption and associated welfare weights in this sample.

Benefit to third-party payers is small relative to the subsidized population, reflecting the low cost of coverage in the high-income uninsured. Including this benefit still leaves a net welfare loss of 0.01 per dollar. When uncompensated care is more restricted to the low-income uninsured, the efficiency loss increases.

With γ > 1, welfare further trades off the dollar cost of public funds on the uninsured versus taxpayers. If a payroll tax on workers is deemed more desirable than penalty, or if the uninsured receive higher welfare weights (based on non-medical consumption in Table 3.7), then welfare decreases with penalty. In contrast, food consumption implies public funds are slightly less costly if financed by penalty.
The small redistribution value, however, does not offset efficiency cost: net effect on welfare is a 0.006 loss per dollar for curvature γ = 2, similar to the case when γ = 0.

Therefore equity concerns and uncompensated care do not rationalize the baseline efficiency cost of penalty. The remaining gap (0.015 net of friction when γ = 0) is small, and is potentially offset by the efficiency gain in insurance premium excluded from the baseline analysis. I discuss the relative importance of uncompensated care and premium response to the efficiency argument of policies below. Appendix Table B.2.9 shows the baseline calculation is robust to alternative incidences of uncompensated care.

3.7.3 Robustness

The baseline analysis calculates welfare for the 27-64 age group, combining ACS estimates with calibrated values from the literature. The robustness analysis conducts a full calibration exercise for the 18-64 age group based on (the range of) estimates in the literature. I continue to exclude any premium response to enrollment, but re-calculate the fiscal cost varying the incentive effects on insurance and employment, and the uncompensated care benefit varying the spending response to formal insurance.

3.7.4 Fiscal cost of subsidy

The fiscal cost of a dollar subsidy is calculated based on Equation 10. In the main analysis in Table 3.6, I calibrate de d p = 0. Alternatively, I benchmark the employment response to the effect on take-up, and consider cases where zero (baseline), 50%, and 100% of new enrollees reduced employment. Calibrated de d p = 0, -0.08, and -0.16, respectively. 93

92 In practice, contribution of mandate penalty to the Commonwealth Care trust fund is less than 1.5%.
93 Estimated effect of subsidy on employment tends to be small, except for sub-groups such as the near-elderly. One of the larger estimates suggests childless adults losing Medicaid eligibility in Tennessee (Garthwaite et al., 2014) increased employment by 4.6 percentage points.
In the Massachusetts context,

Table 3.7: Welfare effect of mandate penalty

           (I)          (II)       (III)      (IV)           (V)        (VI)
           Penalty on   Uncomp.    Fiscal     (1)+(2)+(3)    Friction   Total
           uninsured    care       effect
           (dW_B)       (dW_UC)    (dW_C)                    (dW_F)     (dW)
Panel A: non-medical consumption
γ = 0      -0.023       0.011      0.003      -0.009         -0.006     -0.015
γ = 1      -0.062       0.013      0.003      -0.046         -0.016     -0.062
γ = 2      -0.168       0.014      0.003      -0.151         -0.044     -0.195
γ = 3      -0.454       0.016      0.003      -0.435         -0.118     -0.553
Panel B: food consumption
γ = 1      -0.022       0.012      0.003      -0.007         -0.006     -0.013
γ = 2      -0.021       0.012      0.003      -0.006         -0.006     -0.012
γ = 3      -0.020       0.013      0.003      -0.004         -0.005     -0.009

Notes: Table calculates the welfare effect of mandate, restricted to the unsubsidized population with income above 300% FPL. Rows vary the curvature parameter γ. Column 1 calculates the penalty cost to the uninsured (Equation 14), column 2 calculates the benefit to uncompensated care payers (Equation 8), and column 3 the fiscal cost to the government (Equation 15). Column 4 gives the net benefit in column 1 to 3. Column 5 calculates the welfare loss from over-insurance (Equation 20), where the preference shock for compliance increases with penalty according to Saltzman (2018). Total welfare effect in column 6 is intended as a lower bound. Welfare weights vary by the curvature parameter in the utility, and by the consumption base considered appropriate for redistribution.

I calibrate d 0 d p based on the take-up response in the low-income eligible population not enrolled in ESI (?). Take-up is 19.3% lower across income thresholds where subsidy decreases by ten percentage points. Implied incentive effect is 19.3%/10% = 1.93 per subsidy percent in the eligible population, and d 0 d p = 19.3% × 8.3%/10% = 0.16 in the 19-64 population, where 8.3% are eligible for subsidy. 94 The calibration is similar to the ACS estimate in 2011 (-0.17). Implied cost of new enrollees is ( p + k) d 0 d p = 0.139. 95

Moreover, selection from ESI increases taxpayer cost by p 1 , and decreases private payer cost by 1 1 .
With welfare weight u 0 (c 11 ) u 0 (c 1 ) 1, cost of insurance to society is lower by 1 p , the cost sharing of subsidized enrollees. Switchers who are payers of private transfer ( e 1 ) increase the transfer cost pr on remaining payers, mitigated by larger tax rebate 1 . 1 is calibrated at 62% in the main analysis, based on average income of Massachusetts ESI payers ($70;000 in 2011) and simulated ESI exemption in Gruber (2010). 96 Calibrated 1 is similar for average workers or ESI enrollees in Massachusetts. d e 1 d p =0:25 in the ACS sample, d 1e 1 d p =0:27 for ESI beneficiaries (enrollees not in labor force), and d 1 d p =0:52 in the main analysis. 97 Given null effect on employment, I calculate cost under alternative crowd-out in the robustness analysis. Appendix Table B.2.10 calculates welfare effect of subsidy, varying the cost calibration. With null effect on employment, result is comparable to the 27-64 group in Table 3.6. Employment response half the size of take-up increases taxpayer cost by 0.064 per subsidy dollar. Generous welfare weights are needed to balance the cost implied by an employment losing subsidy on average decreases employment by 1 2 74 0:16 = 5:92 percentage points when employment effect is half the size of take-up, already larger than most estimates in the literature. 94 Following ?, I construct eligible population in ACS as having income between 135% and 300% FPL and not enrolled in ESI. Eligibles are 333;964 4;007;524 = 8:3% of the 19-64 population. Average subsidy is 74% for those who enroll. 95 Note thatk = 1 2 (1 p ) = 0:13 by regulation. Penalty is set at half of affordability, and subsidy lowers enrollee contribution to equal affordability. 96 ESI exemption is also calculated in Tax Expenditures for Health Care submitted to the Senate Commission of Finance for the hearing “Health Benefits in the Tax Code: The Right Incentives” (https://www.jct. gov/publications.html?func=startdown&id=1193). 
ESI return to the $ 50,000-$74,999 income group is $3,106, or $3;106 $42412 = 61% of premium. 97 ESI selection on the extensive margin is0:74 0:52 =0:38 per subsidy enrollment. The magnitude is in the middle-to-lower range of estimates surveyed in Gruber and Simon (2008), possibly due to the employer mandate. The remaining selection reflects (small) firm ESI offer and worker sorting across firm size and benefits (Aizawa, 2017). Sommers et al. (2018) provides empirical evidence from Massachusetts consistent with firm response to mandate and subsidy, and worker sorting (low ESI take-up) when eligible for subsidy (see also Lyons (2017) for increased subsidy take-up in Massachusetts small firms). Including ESI enrollees (59%) as eligibles (?), take-up inclusive of selection is approximately (1 59%) 19:3% 10% = 0:79 across income. Restricting selection to below 500% FPL (64% of 19-64 population), for example, implied take-up is 64% (1 59%) 19:3% 10% = 0:52. 82 effect the full size of take-up, although actual employment effect is likely well below the calibrations considered. I then shut down ESI selection from workers, non-workers, or both. Fiscal cost increases the most when selection is driven by ESI payers ( d 1e 1 d p = 0), with the effect similar to an employment response half the size of take-up. By contrast, selection from private transfer beneficiaries lowers fiscal cost with increased enrollee share of premium. In this case, welfare effects from take-up and selection response roughly balance out, implying a near-zero efficiency cost (not including the external benefit on credit) when = 0. Shutting down all interaction with private transfer yields a cost of 0.381 per subsidy dollar, 98 which balances out benefits for between 2 and 3 under conservative weights. Additional selection improves welfare if it increases enrollee premium contribution with- out increasing the transfer burden on ESI payers. 
Welfare gain is smaller, however, if private beneficiaries already bear some cost of premium, and the fiscal saving from the selection is smaller.

3.7.5 Uncompensated cost saving

Since sizable benefits of (formal) insurance expansion accrue to third-party payers, I next assess the sensitivity of uncompensated cost saving to a range of moral hazard estimates, modeled as the relative generosity of informal insurance g. In the main analysis, g = 0.8 implies spending increases by 25% with formal insurance. In the robustness analysis, I re-calibrate uninsured cost elasticity ε ri, 0 assuming g = 1 (no moral hazard) or g = 0.7 (43% spending increase).

Absent moral hazard, avoided uncompensated cost is 0.09 per subsidy dollar (Appendix Table B.2.11). Including interest savings, benefits roughly equal the fiscal cost at curvature γ = 0, implying a small efficiency loss of subsidy. The efficiency cost increases with moral hazard. A 43% (25%) spending response lowers uncompensated cost saving by 0.027 (0.015) per subsidy dollar. In both cases, benefits outweigh the cost under conservative welfare weights and mild redistribution preference (γ = 2).

Moral hazard similarly has a modest impact on the welfare of mandate penalty (Appendix Table B.2.12). For the range of spending response considered, the net welfare effect is near zero for γ = 3 under conservative weights, but becomes more negative as the uninsured receive larger weights (consistent with lower consumption). Although moral hazard generally lowers the net benefit of formal insurance, the welfare conclusion remains robust to a large spending response in the calibration.

98 Including only mechanical cost (0.242) and new enrollee cost (0.139).

3.7.6 Reform rationale

The previous cost-benefit analysis assumes expansion has zero effect on premium. I find that the cost-effectiveness of premium subsidy does not particularly rely on the premium benefit. For mandate penalty, the premium benefit potentially plays a larger role.
Here, to formally understand the relative importance of premium versus uncompensated care for policies targeting different expansion groups, I include the premium benefit dW_P. Table 3.8 adjusts the fiscal cost of subsidy by the maximum premium benefit calculated from Equation 7 assuming zero mark-up response. I highlight the potential extent of cost saving attributable to the composition effect on premium.

Absent equity considerations (curvature γ = 0), saving in enrollee premium is 6.5% × 95% = 0.062 per subsidy dollar, or 0.062/0.074 = 84% of the saving to uncompensated care payers. The joint benefit on formal insurance pricing and uncompensated care is sufficient to dominate the fiscal cost, without resort to the credit benefit or redistribution preferences.

Table 3.8: Welfare effect of premium subsidy, allowing for premium benefit dW_P

Sub-columns of (IV) and (V): premium response to subsidy of 0 (full mark-up response) and of 6.5% (zero mark-up response).

           (I)        (II)       (III)      (IV) Fiscal cost     Premium    (V) (1)+(2)+(3)+(4)   (VI)
           Enrollee   Uncomp.    Interest   (dW_C)               benefit                          Friction
           benefit    care       saving     0         6.5%       (dW_P)     0         6.5%
Panel A: non-medical consumption
γ = 0      0.232      0.074      0.011      -0.338    -0.276     0.062      -0.021    0.041       -0.018
γ = 1      0.368      0.082      0.013      -0.342    -0.278     0.064       0.121    0.185       -0.028
γ = 2      0.585      0.092      0.014      -0.346    -0.278     0.068       0.345    0.413       -0.045
γ = 3      0.928      0.105      0.016      -0.350    -0.280     0.070       0.699    0.769       -0.071
Panel B: food consumption
γ = 1      0.249      0.079      0.011      -0.339    -0.277     0.062       0        0.062       -0.018
γ = 2      0.268      0.085      0.012      -0.339    -0.277     0.062       0.026    0.088       -0.019
γ = 3      0.288      0.092      0.012      -0.340    -0.278     0.062       0.052    0.114       -0.021

Notes: Table calculates the benefit of subsidy dollar to recipients in column 1, uncompensated care payers in column 2, and interest payment in column 3. Column 4 calculates fiscal cost assuming full mark-up response to cost saving (premium response of 0, as in the main analysis), or zero mark-up response implying a premium response of 6.5%. The difference in cost, dW_P, gives the risk pooling benefit of premium. Column 5 calculates net benefit with and without the premium benefit.
Column 6 shows the friction cost from the main analysis.

Adjusting for welfare weights increases the return to subsidy. The equity gain reflects rising benefit to low-income recipients relative to payers. The basic trade-off, however, centers around the efficiency gain of insurance pricing versus the enrollment cost. Although the premium benefit is weakened in less competitive markets, one-third of the potential benefit offsets the baseline efficiency loss without friction (0.021), and two-thirds of the benefit offsets the loss including friction (0.039). Corresponding mark-up adjustments range from ∂ ∂ p = 0.029 to 0.049. For the small mark-up increase (0.016) in CommCare, including the premium benefit rationalizes subsidy on pure efficiency grounds.

The welfare argument for mandate penalty strengthens with the risk pooling benefit in premium (Table 3.9). Without the premium effect (Equation 13), uncompensated care alone does not justify penalty: the 0.02 dollar net cost on the uninsured is twice the benefit to uncompensated care. Without uncompensated care, the premium benefit (2.7% × 97.7% = 0.026 per penalty dollar) alone offsets the cost on the uninsured from an efficiency standpoint. Adding equity concerns, unless society places substantially higher weights on the uninsured, return to penalty remains positive.
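The premium-benefit figures quoted in this subsection follow from simple products and ratios. The sketch below reproduces them from numbers stated in the text. Variable names are illustrative, and the 95% and 97.7% scaling factors are taken as given from the text without further interpretation.

```python
# Arithmetic behind the premium benefit figures; all inputs come from the text.
# Variable names are illustrative and not part of the original analysis.

# Subsidy: premium saving per subsidy dollar under zero mark-up response.
subsidy_premium_benefit = round(0.065 * 0.95, 3)

# Saving to uncompensated care payers at zero curvature (Table 3.6, column II).
uncompensated_care_saving = 0.074
relative_to_uc = round(subsidy_premium_benefit / uncompensated_care_saving, 2)

# Share of the premium benefit needed to offset the baseline efficiency loss,
# without friction (0.021) and including friction (0.039).
share_without_friction = round(0.021 / subsidy_premium_benefit, 2)
share_with_friction = round(0.039 / subsidy_premium_benefit, 2)

# Penalty: premium saving per penalty dollar under zero mark-up response.
penalty_premium_benefit = round(0.027 * 0.977, 3)

print(subsidy_premium_benefit, relative_to_uc)
print(share_without_friction, share_with_friction)
print(penalty_premium_benefit)
```

The shares of roughly 0.34 and 0.63 correspond to the "one-third" and "two-thirds" statements in the text, and the 0.026 penalty figure is the premium benefit reported for zero curvature in Table 3.9.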
Table 3.9: Welfare effect of mandate penalty, allowing for premium benefit dW_P

Sub-columns of (III) and (IV): premium response to penalty of 0 (full mark-up response) and of 2.7% (zero mark-up response).

           (I)          (II)       (III) Fiscal effect   Premium    (IV) (1)+(2)+(3)     (V)
           Penalty on   Uncomp.    (dW_C)                benefit                         Friction
           uninsured    care       0         2.7%        (dW_P)     0         2.7%
Panel A: non-medical consumption
γ = 0      -0.023       0.011      0.003     0.029       0.026      -0.009     0.017     -0.018
γ = 1      -0.062       0.013      0.003     0.029       0.026      -0.046    -0.020     -0.028
γ = 2      -0.168       0.014      0.003     0.030       0.027      -0.151    -0.124     -0.045
γ = 3      -0.454       0.016      0.003     0.034       0.031      -0.435    -0.404     -0.071
Panel B: food consumption
γ = 1      -0.022       0.012      0.003     0.029       0.026      -0.007     0.019     -0.018
γ = 2      -0.021       0.012      0.003     0.028       0.025      -0.006     0.019     -0.019
γ = 3      -0.020       0.013      0.003     0.028       0.025      -0.004     0.021     -0.021

Notes: Table calculates the penalty cost to beneficiaries (the uninsured) in column 1 and benefit to uncompensated care payers in column 2. Rows vary the curvature parameter γ. Column 3 calculates the fiscal effect on government budget assuming full mark-up response to cost saving (premium response of 0, as in the main analysis), or zero mark-up response implying a premium response of 2.7%. The difference in cost, dW_P, gives the risk pooling benefit of premium. Column 4 calculates the net benefit with or without the premium benefit. Column 5 shows the friction cost from the main analysis. Sample is restricted to the unsubsidized population above 300% FPL.

The baseline efficiency gain roughly offsets the friction loss of taste-based enrollees. Smaller benefit to third-party payers, consistent with more limited access to uncompensated care in the high-income group, lowers the efficiency gain. Overall desirability of mandate penalty relies on the risk pooling benefit on premium, whereas uncompensated care is the leading motivation for subsidy. When both rationales are considered, including realistic friction in insurance pricing and take-up, subsidy and penalty in Massachusetts are justified on largely efficiency grounds for a range of plausible effect sizes.

3.7.7 Discussion

The efficiency argument follows from the incentive-based policy design.
When subsidy is less than full, part of the social cost of insurance is internalized as enrollee cost sharing. With optimal take-up response, subject to only a small degree of behavioral preferences, the efficiency cost on the margin of choice is negligible. Unlike a universal mandate that trades off marginal and infra-marginal benefits, the choice-based design is evaluated based on the net social benefit given behavioral responses.

The net benefit compares the efficiency gains of formal insurance with the implied fiscal cost of expansion using policy instruments. Perhaps not very surprisingly, four years into the reform, at a 95% insurance rate, the net benefit of further increasing the premium subsidy and the mandate penalty falls close to zero for a range of estimates. Put differently, efforts to achieve a higher coverage rate with the current set of incentives are unlikely to be cost-effective.

While social externalities may motivate government expansion of health insurance, universal health insurance is hard to justify on pure efficiency grounds. In the cost-benefit framework, an increase in subsidy increases take-up, but also decreases the cost of social insurance internalized as enrollee cost-sharing. For policy generosities consistent with (near-)universal insurance, 99 the social benefit decreases below the current level whereas the social cost increases, implying universal insurance may not be desirable in this setting without some preference for redistribution. 100

3.8 Conclusion

This paper analyzes two potential rationales for expanding health insurance in the US: adverse selection and uncompensated care. With adverse selection, social insurance is constrained by the desirable scope of redistribution across risk: a mandate maximizes welfare only when complete pooling is socially desirable. With uncompensated care, formal insurance improves welfare through the distribution of health care cost in the economy.
When the scope and value of redistribution are larger across income than risk, a tax-financed formal insurance mandate maximizes welfare.

I assess the welfare implications of the two rationales exploiting policy incentives that incrementally expanded formal insurance in Massachusetts. The quantified welfare trade-off suggests the social cost of uncompensated care motivates premium subsidy without substantial redistribution preference for low-income recipients. Mandate penalty is motivated solely by the pooling benefit of premium, and the motivation is weakened by the friction loss to taste-based enrollees. Without strong redistribution preferences for the already insured, a universal mandate based on risk pooling alone may not be desirable.

99 Based on WTP for insurance in the low-income uninsured, a 90% subsidy would still leave 1% of the population uninsured (Finkelstein et al., 2017).
100 Additional efficiency cost may arise with higher tax rates to finance the subsidy increase.

The analysis showcases the trade-off in the global design of social insurance. Relative to flexible incentives across expansion groups, a universal mandate relies more on redistribution arguments and may be less viable in practice. Including both rationales, the current level of penalty and subsidy in Massachusetts improves welfare from a pure efficiency standpoint.

The analysis focused on the desirable scope of formal insurance. Additional trade-offs over health care quality and delivery, and contract designs when social insurance is delegated to private insurers, are important topics for future research.

4 Long-run impact of in-utero investment response to CHIP

4.1 Introduction

The importance of health care access in the human capital development of the future generation has come under increasing attention from scholars and policy makers.
In the US, since the seminal Medicaid reform in the 1980s that significantly increased coverage eligibility for pregnant women, social transfer programs have continually targeted mothers and children in early critical periods of life. Researchers have documented strong, positive effects on health as well as human capital acquisition both contemporaneously (Currie and Gruber 1996a; Currie and Gruber 1996b) and over the longer life span (Wherry and Meyer 2016; Brown et al. 2015; Cohodes et al. 2016; Levine and Schanzenbach 2009). The consistency of these findings suggests that there is substantial benefit to insuring low-income children, and that the benefit persists well into the long run. In this paper, I study the in-utero and long-run investment impact of the Children’s Health Insurance Program (CHIP), one of the largest public insurance programs since Medicaid to cover specifically low-income children. Unlike previous Medicaid expansions, adult coverage eligibility barely changes during the roll-out of the program, making it an ideal context to study dynamic investment inputs in the next generation. I first investigate whether increased investment opportunities and returns that accrue to the child in later ages affect parental in-utero investment and birth outcomes. I identify program impact utilizing the year-month, state-by-state roll-out of CHIP in 1997-2001. I estimate birth weight production function to evaluate the influence of different investments on health at birth. I then investigate whether or not in-utero investment induced by CHIP has significant long-run effects on the cognitive skill of school age children. I present evidence that in-utero exposure to CHIP during the first and second trimester has lasting impact consistent with dynamic complementarity between fetal and early childhood investments. I make several novel contributions to the literature. 
First, I add to the growing literature on the long-run impact of early life exposure to social welfare programs. Compared to extreme episodic shocks such as heat, flood or famine, the positive impact of the social safety net has only recently gained attention in the literature. Second, I inspect closely the types of health investments parents make during pregnancy, and find smoking cessation to be most responsible for the improved birth outcomes. That a public insurance program can “crowd-in” private investments that improve the health outcomes for children is novel in the literature. Third, I show the long-run benefits of in-utero investments. Unlike previous studies that find joint importance of the fetal and early childhood period, I am able to isolate the effect of investments in just the fetal period. Lastly, I suggest that the mechanism behind this long-run influence is consistent with the dynamic complementarity between the fetal and early childhood investments: greater investments in-utero amount to greater return to early childhood investments, but do not have an independent effect on long-run cognitive outcome. Although the multiplier effect is often mentioned as a potential pathway, here I provide direct evidence of its relevance.

This finding is important for the evaluation of social insurance programs: as opposed to the often discussed idea of “crowd-out”, public investment can in fact “crowd-in” private investment in the next generation, and this may happen even before the child is born and eligible for enrollment in the program. This dynamic, complementary interaction between social program and parental investment further magnifies the long-run welfare gain of social insurance for children.

I proceed using two empirical strategies. In an event study design, I compare cohorts differentially exposed to CHIP onset in-utero and their birth and later life outcomes.
I control for location-specific trends to address potentially endogenous timing of program establishment. My main sample is composed of non-college educated, single mothers who are most likely to utilize CHIP in future investment in the child. For birth outcomes I additionally stratify the sample by county characteristics in 1996: impact is much larger on low educated mothers residing in traditionally high welfare transfer counties.

The suggestive evidence from the event study need not be causal for a number of reasons. Most prominently, natality data contain only live births. If CHIP also affects fertility and fetal death rates, then the estimates are biased by sample selection. Furthermore, gestation can vary endogenously to the program if mothers time birth around the program onset date. Longer gestation automatically brings about more opportunities to invest in-utero, biasing estimated treatment effects. Finally, county-specific trends may not adequately control for the non-random temporal and spatial roll-out of CHIP.

To address all these concerns, and in addition, to better understand the dose response to program generosity as opposed to the binary onset, I simulate projected CHIP eligibility over early (age 0-5) and later (age 6-18) childhood, using in-utero program rules over a fixed length of gestation. In the simulation, I track the income distribution of all women by age and demographic cells in the 2000 decennial census, assuming each woman were to deliver a child at the age when sampled. I assign corresponding eligibility for the hypothetical child by adding up mother’s age over the life course. Although the simulation strategy has been adopted in numerous studies in the literature, this paper emphasizes that following the entire female sample, rather than just mothers or children, serves to purge any residual bias from selective fertility and live birth.
The constructed measure captures the belief of CHIP eligibility held by a woman of a given demographic group for her child at later ages.

Both methods provide consistent evidence that in-utero investment responds to future insurance provision to the child, and birth outcomes improve in terms of birth weight, Apgar score, and gestation length. The effect is generally largest for those impacted in the first trimester. There is little change in the number of doctor visits, although care seeking is more intensive if impacted in the third trimester, and the onset of pre-natal care is in fact later with CHIP. Most of the improvement in birth outcomes is then attributed to changes in health behavior: incidence and intensity of smoking, and to a lesser extent, drinking, are lower after CHIP, and weight gain during pregnancy is larger. Coverage expansion for older children between age 6 and 18 is most responsible for the improvement in most birth outcomes, although reduced smoking and drinking during pregnancy seem more driven by coverage expansion for small children below age 5. Doctor visits, on the other hand, are most responsive to mother’s own coverage eligibility, which barely changes over the roll-out period of CHIP.

To formally explore the relationship between health inputs such as smoking and care utilization and the birth outcome, I estimate a birth weight production function using exogenous CHIP eligibility as instruments. While a large literature estimates the causal effect of health care utilization on birth weight, with recent evidence leaning to a null result, not all studies consider health behavior and care utilization jointly. Identifying off differential investment response to CHIP and maternal eligibility, estimates from a Cobb-Douglas production function suggest greater importance of smoking to birth weight, and complementarity between the inputs: the return to doctor visits diminishes at higher intensity of smoking.
The model survives the specification test when I use interactive eligibility for over-identification. Turning to long-run impacts, I look at school-age children sampled in the 2010-2014 waves of the American Community Survey (ACS). I use the state of birth and year-quarter of birth to link cohorts to the historic CHIP roll-out. I rely on reported difficulty in remembering as a proxy measure of low cognitive skill. I find a large and significant long-run impact of CHIP exposure in the first and second trimester, when the development of the fetal brain is most rapid, and the effect is strong only for children born to low-educated mothers. To understand the mechanism of the long-run impact, I codify realized CHIP eligibility for children at different ages, in addition to CHIP eligibility as perceived in-utero. I allow level effects of eligibility in-utero, in ages 0-5, in ages 6-18, and all two-way interactions to enter the equation of the long-run outcome. This way I detect complementarity between investment across different stages of childhood, over and above any level effect that persists into later ages. Interestingly, with interaction terms, the only level effect that remains significant is CHIP eligibility in ages 0-5, whereas contemporaneous and in-utero eligibility have virtually no effect on cognitive ability in later childhood. On top of the level effect, there is significant complementarity between eligibility in-utero and in ages 0-5, but the relation disappears in later childhood. The stronger complementarity in the early critical period of development is consistent with the theoretical models in Cunha and Heckman (2007). Hence the long-lasting impact of fetal investment materializes entirely through added investment return in early childhood, rather than having an independent effect immutable to future efforts. The distinction is often not made clear in previous studies of fetal conditions and long-run impact. The remainder of the paper proceeds as follows.
Section 4.2 introduces the background surrounding the roll-out of CHIP and previous evaluations of the program. Section 4.3 introduces the data sources and provides descriptive statistics of key explanatory and outcome variables in the empirical analysis. Section 4.4 establishes the in-utero investment response to CHIP and its effect on birth outcomes, presenting event study estimates followed by instrumental variable estimates using simulated eligibility in-utero. Section 4.5 documents the long-run impact on cognitive ability, again using event study and simulated eligibility over the course of childhood. Section 4.6 wraps up the discussion and concludes.

4.2 Background and Previous Literature

Because my research design exploits public insurance expansion that differentially affects otherwise similar children across states and over time, this section provides a historic overview of the federal Medicaid program and state options in expanding coverage to pregnant women and children since the first major expansion of this kind in the 1980s. I then focus on coverage options immediately before and after the legal establishment of CHIP in 1997. I survey earlier literature on the effects of these expansions where relevant. Medicaid, established by the Social Security Amendments of 1965, is the major public insurance program that provides medical services to the low-income population in the US. Coverage was initially tied to cash assistance eligibility in the Aid to Families with Dependent Children (AFDC) program. The two programs were de-linked in 1986, and between 1987 and 1992, a number of legislative options were enacted to expand Medicaid coverage of pregnant women and small children.
More notably, the Omnibus Budget Reconciliation Act (OBRA) of 1989 enforces the federal mandate that all states cover pregnant women and children less than 6 years old in families with income less than 133% of the federal poverty level (FPL), and that states already adopting a higher eligibility level must maintain that level. OBRA 1990 further requires states to cover all children below age 19 who were born after Sep. 30th, 1983 and live below the federal poverty level. The law significantly increased the cumulative Medicaid eligibility years during childhood, and was largely unanticipated by families of eligible children at the time. The short-run impacts of these coverage expansions have been closely examined in the literature. Cutler and Gruber (1996) estimate that because of these expansions, nearly one in every three children became eligible for public insurance coverage by 1992. Eligible low-income pregnant women are more likely to utilize technology such as ultrasound, fetal monitoring and C-section delivery (Currie and Gruber, 2001), practice timely and adequate pre-natal care, and have children with lower probability of low birth weight (Currie and Gruber, 1996b), lower mortality rates (Currie and Gruber, 1996a) and lower incidence of avoidable hospitalization (Dafny, 2005). Using the coverage discontinuity around Sep. 30th, 1983, Card and Shore-Sheppard (2004) find a modest increase in the insurance rate. The same strategy is adopted by Wherry and Meyer (2016), who find reduced teen mortality among black children who gained coverage due to the policy. Because the above Medicaid expansions are relatively dated, recent studies are able to present long-run impacts on adult outcomes. Currie et al. (2008) suggest early childhood Medicaid coverage improves the health of teenagers. Wherry et al.
(2015) focus more specifically on the in-utero period and first year of life, and find a significant positive effect on adult metabolic and immune disorders, obesity, and educational attainment. These effects are attributed jointly to maternal and infant coverage because the two groups shared the same eligibility historically until the onset of CHIP. For this reason it is difficult to interpret the effect as solely from in-utero investment in the child, when own coverage also encourages greater health care utilization. Cohodes et al. (2016) similarly find improvements in schooling outcomes. The labor market payoff of early life Medicaid coverage is substantial: using the tax return registry, Brown et al. (2015) show that those eligible for Medicaid in childhood in fact pay more in taxes in adulthood, increasing the long-run efficiency of the Medicaid program. Prior to the Balanced Budget Act (BBA) of 1997, states relied on Section 1902(r)(2) options and Section 1115 waivers to cover more pregnant women and children beyond the federal mandate in Medicaid. Title XXI of BBA 1997, effective Oct. 1st, 1997, offers states the option to further expand coverage to children either by raising the eligibility limit in current Medicaid programs or by setting up a separate program for children: the State Children’s Health Insurance Program (SCHIP), later simply known as CHIP. Some combination of the two is also possible. To this end, the federal government earmarked an annual amount of $4 billion to states that opted to expand coverage to children over the first five years of Title XXI. During this period only children could be covered with these funds. As a result, children’s coverage eligibility increased significantly over the period 1997-2002. Many states continually expanded eligibility following the initial onset of the program.
These later expansions tend to target older children who were previously ineligible for Medicaid, or who faced a much lower income limit relative to smaller children. Appendix Table C.1.1 tabulates the enrollment start date of CHIP, and the associated eligibility change in childhood. Prior to CHIP, Medicaid eligibility for pregnant women and infants was identical in almost all states. While some states also increased eligibility for pregnant mothers, which I mark with an asterisk in the infant column, the majority of states did not. My event study focuses on the latter set of states, but I include all states in the simulated eligibility strategy where I directly control for maternal eligibility. 101 In the birth sample, I attach relevant Medicaid/CHIP eligibility over the 19 years of childhood to birth cohorts differentiated by conception year-month, gestation, and state. I average simulated eligibility over childhood ages and gestation months to derive the in-utero exposure to CHIP by cohorts (Figure 4.1). Eligibility is simulated using the entire female sample, stratified by integer ages and other key demographics. Over time we see more states covering more children in childhood: in the beginning of 1997, only a handful of states were covering newborns for more than 28% of childhood, or a cumulative eligibility of at least 5.32 years (0.28 × 19), but the majority of the states were able to do so by the end of 2000. Nation-wide eligibility doubled from 0.16 (3 years of childhood) to 0.32 (6 years) over this period. Given this large increase in childhood coverage eligibility, it is interesting to see how in-utero investments respond; in particular, whether mothers seek to magnify the long-run benefit by engaging in investments that improve birth outcomes. This is distinct from studies on the contemporaneous effects on coverage gain (Sasso and Buchmueller, 2004) or health care access (Joyce and Racine, 2003). A related paper is Levine and Schanzenbach (2009).
They find a long-run impact of CHIP on verbal test scores, and they mention that higher birth weight, or better health in general, can be an operative channel. However, CHIP is unlikely to have a direct impact on birth weight except through in-utero parental responses, which they do not discuss. Nor do they explore the dynamics between in-utero exposure and later life investment in shaping the long-run outcome. Although their motivation is very different from that in this study, I perceive the effects shown in Levine and Schanzenbach (2009) as complementary to the findings in this paper.

101 I exclude Tennessee throughout because the program contracted in 2002 and before 2002 there is no definite upper income limit for enrollment.

Figure 4.1: CHIP Eligibility In-utero. Panels: Jan-1997, Jan-1998, Dec-1999, and Dec-2000 conception cohorts. Legend (eligibility bins): [.05,.16], (.16,.28], (.28,.39], (.39,.51].
Notes. To compile the map I average childhood CHIP eligibility as perceived in utero, calculated from program rules in effect during the gestation period, over all children born in the state who are conceived in the same year-month as indicated in the title. Eligibility is simulated using the entire female sample in the 2000 census, reflecting the income distribution over the life cycle and differences by key demographics.

This paper is also related to a growing literature on the long-run impact of social welfare programs from a historic period where government transfers were more often in cash (including, e.g., food stamps) than in kind (such as insurance). Hoynes et al. (2016) use the roll-out of the food stamp program in the 1960s and find that in-utero to early childhood family resources have significant impact on later life health and economic self-sufficiency.
The pattern that early life exposure to the social safety net meaningfully changes adult outcomes similarly emerges from looking at the roll-out of Medicaid in the 1960s (Boudreaux et al., 2016), and the first government cash transfer program in the early twentieth century (Aizer et al., 2016). More broadly, policies affecting pollution levels (Isen et al., 2017) and alcohol availability (Nilsson, 2017) early in life have also been shown to exhibit long-run impact on health and earnings of the next generation. As social insurance becomes an ever more important component of the safety net, behavioral responses to insurance programs provide timely and forward-looking evidence on the role of government policies in promoting the well-being of future generations.

4.3 Data

I use the US natality data for information on prenatal investments and birth outcomes. The datasets contain birth certificate records of all live births to US citizens in a given calendar year. I restrict the sample to births occurring in the 50 states and the District of Columbia over the period 1996 to 2002. From these birth records I derive conception year-month using birth date and estimated gestation length, and keep only cohorts conceived between 1996 and 2001 whose mother is between age 21 and 40 at the time of delivery. This is the birth sample that forms the basis for more specific samples used in the event study and simulated eligibility analysis of in-utero responses. I establish in-utero exposure to CHIP by linking conception cohorts to state-specific CHIP onset based on mother’s state of residence in the birth sample. Program parameters including expansion dates and changes in eligibility for small and older children and for pregnant women, if applicable, are collected from the National Governors Association (NGA) website and multiple years of Maternal and Child Health (MCH) Updates.
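The derivation of conception year-month from birth date and gestation length, described in the Data section above, can be sketched as follows. This is only an illustration: the function name is hypothetical, and converting gestation weeks to whole months with an average month of 30.4375 days is an assumption made here, not a rule stated in the text.

```python
def conception_year_month(birth_year, birth_month, gestation_weeks):
    """Back out conception (year, month) from birth year-month and gestation.

    Gestation in weeks is converted to whole months using an average month
    length of 30.4375 days (an assumption of this sketch).
    """
    gestation_months = round(gestation_weeks * 7 / 30.4375)
    # Work with a running month index so that year boundaries are handled.
    index = birth_year * 12 + (birth_month - 1) - gestation_months
    return index // 12, index % 12 + 1


# A birth in March 1998 after 39 weeks of gestation maps to a June 1997 conception.
example = conception_year_month(1998, 3, 39)
```

Keeping only cohorts whose conception year falls between 1996 and 2001 then reproduces the sample window described in the text.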
My long-run model also controls for realized childhood eligibility over the period 2002 to 2014, and for this information I additionally consult historic CHIP/Medicaid eligibility charts compiled by the Kaiser Family Foundation. Although the roll-out is at the state level, I merge in annual economic characteristics from the Regional Resource File and the monthly unemployment rate published by the Bureau of Labor Statistics (BLS) to counties identified in the birth sample. Larger counties with over 100,000 residents are identified. I attach the state average to the set of unidentified counties in a state. County-level controls are meant to sharpen the research design, but are not themselves essential to identification: using only state variation does not substantially change the results. Results are more credible, however, if one additionally addresses any pre-existing heterogeneity and trending by local areas. In the end, I have a total of 497 county-equivalent areas identified in the birth sample. This rich locational variation is useful for identifying the set of mothers most likely to enroll their children in CHIP once they are born. The income eligibility of the program suggests children born to mothers of low socio-economic status (SES) are more likely to qualify for CHIP. Conditional on maternal SES, children born in counties with traditionally large welfare transfers are more likely to take up the program if eligible. Because of the high program penetration rate in these counties, intent-to-treat (ITT) estimates tend to be larger and closer to the treatment-on-the-treated (TOT) estimates, which are generally more informative. In the data I define high welfare-dependence counties as having per capita welfare transfer above the median in 1996 ($3237). Table 4.1 displays means of key outcome and explanatory variables in the full and higher impact samples, averaged over all conception cohorts between Jan. 1996 and Dec. 2001.
Birth outcomes worsen down the SES gradient: birth weight is lower and gestation shorter among non-college-educated single mothers in high welfare transfer counties. In-utero investments show a similar pattern. A higher fraction of low SES mothers are smokers and drinkers during pregnancy, and their daily consumption of tobacco and alcohol is also higher. Patterns of care utilization, care onset and overall care adequacy are worse for low SES mothers. I also characterize in-utero exposure to other ambient variables, such as county-level unemployment rates and state tobacco tax, 102 both of which have been shown in previous literature to influence later life outcomes. I average these time-varying variables over a fixed eleven-month window with the sixth month coinciding with the mid-month of the actual gestation. This is to purge any endogenous timing of fertility and birth in anticipation or avoidance of these specific environments. I provide more details in Section 4. Because these variables should only vary by location, the mean differences suggest counties with more low SES mothers on welfare often have higher unemployment rates and somewhat higher tobacco tax rates.

102 State tobacco tax rates are collected from The Tax Burden on Tobacco, Historic Compilation, 2014.

Table 4.1: Descriptive Statistics, Birth Sample

| Variable | Full Sample: Observations | Full Sample: Mean | Low SES: Observations | Low SES: Mean | Low SES+High Welfare: Observations | Low SES+High Welfare: Mean |
|---|---|---|---|---|---|---|
| Birth Outcomes: | | | | | | |
| birth weight (gram) | 18,954,719 | 3339.32 | 3,265,720 | 3219.63 | 1,915,086 | 3210.50 |
| low birth weight (<2500 grams) | 18,954,719 | 0.072 | 3,265,720 | 0.10 | 1,915,086 | 0.10 |
| very low birth weight (<1500 grams) | 18,954,719 | 0.013 | 3,265,720 | 0.019 | 1,915,086 | 0.021 |
| Apgar score (0-10) | 14,798,767 | 8.92 | 2,505,048 | 8.88 | 1,565,110 | 8.90 |
| gestation (weeks) | 18,965,232 | 38.80 | 3,267,758 | 38.64 | 1,916,224 | 38.61 |
| pre-term birth (<38 weeks) | 18,965,232 | 0.19 | 3,267,758 | 0.23 | 1,916,224 | 0.23 |
| male child | 18,965,232 | 0.51 | 3,267,758 | 0.51 | 1,916,224 | 0.51 |
| singleton birth | 18,965,232 | 0.97 | 3,267,758 | 0.97 | 1,916,224 | 0.97 |
| In-utero Investment, health behavior: | | | | | | |
| smoking | 15,891,668 | 0.11 | 2,709,431 | 0.25 | 1,556,515 | 0.26 |
| # cigarettes per day | 15,713,250 | 1.14 | 2,639,474 | 2.59 | 1,516,104 | 2.71 |
| drinking | 16,263,817 | 0.010 | 2,761,668 | 0.021 | 1,591,532 | 0.021 |
| # drinks per day | 16,216,695 | 0.022 | 2,743,725 | 0.057 | 1,580,709 | 0.060 |
| weight gain (lbs.) | 15,271,641 | 30.46 | 2,520,171 | 29.73 | 1,462,125 | 29.61 |
| In-utero Investment, healthcare utilization: | | | | | | |
| # doctor visits | 18,379,938 | 11.75 | 3,131,916 | 10.54 | 1,836,010 | 10.53 |
| month care started | 18,538,897 | 2.43 | 3,167,966 | 3.07 | 1,850,451 | 3.05 |
| care started in first trimester | 18,538,897 | 0.86 | 3,167,966 | 0.71 | 1,850,451 | 0.72 |
| care is adequate | 18,233,239 | 0.79 | 3,098,107 | 0.62 | 1,812,779 | 0.63 |
| In-utero Environment: | | | | | | |
| avg. unemployment rate (%) | 18,965,232 | 4.63 | 3,267,758 | 4.98 | 1,916,224 | 5.60 |
| avg. tobacco tax (dollar) | 18,965,232 | 0.45 | 3,267,758 | 0.45 | 1,916,224 | 0.50 |
| Mother Demographics: | | | | | | |
| age | 18,965,232 | 28.70 | 3,267,758 | 26.09 | 1,916,224 | 26.24 |
| race = white | 18,965,232 | 0.81 | 3,267,758 | 0.66 | 1,916,224 | 0.62 |
| race = black | 18,965,232 | 0.13 | 3,267,758 | 0.30 | 1,916,224 | 0.35 |
| race = other | 18,965,232 | 0.062 | 3,267,758 | 0.039 | 1,916,224 | 0.031 |
| Hispanic origin | 18,760,367 | 0.19 | 3,243,384 | 0.30 | 1,899,671 | 0.29 |
| married | 18,965,232 | 0.75 | 3,267,758 | 0 | 1,916,224 | 0 |
| education = less than high school | 18,707,336 | 0.15 | 3,267,758 | 0.40 | 1,916,224 | 0.40 |
| education = high school | 18,707,336 | 0.31 | 3,267,758 | 0.60 | 1,916,224 | 0.60 |
| education = some college | 18,707,336 | 0.54 | 3,267,758 | 0 | 1,916,224 | 0 |
| birth order = 1 | 18,891,491 | 0.34 | 3,256,993 | 0.27 | 1,910,810 | 0.26 |
| birth order = 2 | 18,891,491 | 0.35 | 3,256,993 | 0.31 | 1,910,810 | 0.31 |
| birth order > 2 | 18,891,491 | 0.31 | 3,256,993 | 0.41 | 1,910,810 | 0.43 |

| Aggregate Rates | Full Sample: Observations | Full Sample: Mean | High Welfare: Observations | High Welfare: Mean |
|---|---|---|---|---|
| fetal death rate | 298,262 | 0.013 | 162,338 | 0.014 |
| fertility rate | 298,256 | 0.0081 | 162,333 | 0.0080 |

Notes. Tabulation based on micro data on cohorts conceived between 1996 and 2001, to mothers between age 21 and 40. The low SES sample consists of mothers with less than college education who are single at the time of delivery. The high welfare sample is restricted to mothers residing in counties with above median welfare transfer in 1996. Calculation of aggregate rates utilizes population count data published by SEER, and live birth and fetal death counts in the natality and linked perinatal mortality data, 1996-2002. Not all counties and demographic cells are identified: see text for more detail.

To evaluate the extent to which fertility and fetal death rates have changed in response to CHIP, both of which introduce potential bias to the event study, I aggregate count data of live births and fetal deaths by mother’s age band, race, and ethnicity for each year-month conception cohort and county. I consult the Linked Birth and Perinatal Mortality Data 1996-2002 for the universe of recorded fetal deaths. I define fertility as the conception rate of women, with the numerator summing over live births and fetal deaths. The denominator is the population count of women in each county stratified by age-race-ethnicity cell.
I obtain annual counts from the Surveillance, Epidemiology, and End Results (SEER) program, and let cohorts conceived in the same year share the same denominator. Because the linked perinatal mortality data mask counties with fewer than 250,000 residents, and due to a few missing counties in the denominator file, I am able to calculate fertility and fetal death rates for 247 county-equivalent areas. Lacking count data by education and marital status, I only show comparisons by location for aggregate rates in Table 4.1. The fertility rate in this period translates to about 1.90 (=0.008*(1-0.013)*12*20) births per woman between age 21 and 40, slightly lower than the World Bank statistic of 2.01 births per woman, which also includes fertility in teenage and older ages. Turning to long-run outcomes, I utilize birth state and birth quarter identifiers in ACS to recover in-utero CHIP exposure of school-age children between age 8 and 18 observed in years 2010-2014. These are cohorts conceived 7 quarters before and up to 4 quarters after CHIP onset. 103 Even the oldest cohort is yet to enter the labor market. The average age (12.70) in the sample also precludes using educational attainment as a measure of human capital. I therefore rely on functional limitation, in particular whether the child experiences any difficulty remembering, as a proxy of the child’s cognitive skill. The measure cannot speak to changes in average skill level, but highlights the far left tail of the skill distribution where programs like CHIP should have the largest impact. I also locate mothers in the family unit and attach maternal characteristics to the child sample. This is important not only because children born to higher SES parents acquire more skills, but also because CHIP eligibility is determined by the income distribution of parents over the life cycle, not children. For the matching, I use the mother locator developed by Ruggles et al. (2018) available through the Integrated Public Use Microdata Series (IPUMS).
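The back-of-envelope conversion from the monthly conception rate to births per woman quoted above, 0.008*(1-0.013)*12*20, can be checked with a few lines of Python. The function and argument names are illustrative only; the inputs are the rounded aggregate rates from Table 4.1.

```python
def implied_births_per_woman(monthly_conception_rate, fetal_death_rate, years):
    """Live births per woman implied by a monthly conception rate.

    Conceptions that end in fetal death are netted out, and the monthly
    rate is accumulated over 12 months for each year in the age range.
    """
    return monthly_conception_rate * (1 - fetal_death_rate) * 12 * years


# Ages 21 to 40 span 20 years of exposure.
births = implied_births_per_woman(0.008, 0.013, 20)
```

With these inputs the result is roughly 1.895, which rounds to the 1.90 births per woman reported in the text.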
I derive mother’s age at the time of delivery, and limit my long-run sample to cases where the birth of the child occurs between age 21 and 40. The restriction reduces matching errors, and maintains consistent composition across the birth and the long-run sample.

103 I assume gestation is 3 quarters.

Table 4.2: Descriptive Statistics, Long-run Sample

| Variable | Full Sample: Observations | Full Sample: Mean | Low SES: Observations | Low SES: Mean |
|---|---|---|---|---|
| Child Characteristics: | | | | |
| male | 823,328 | 0.51 | 308,910 | 0.51 |
| age | 823,328 | 12.69 | 308,910 | 12.70 |
| race = white | 823,328 | 0.73 | 308,910 | 0.68 |
| race = black | 823,328 | 0.12 | 308,910 | 0.13 |
| race = other | 823,328 | 0.15 | 308,910 | 0.18 |
| Hispanic origin | 823,328 | 0.20 | 308,910 | 0.32 |
| Mother Characteristics: | | | | |
| age at fertility | 823,328 | 28.25 | 308,910 | 27.27 |
| race = white | 823,328 | 0.75 | 308,910 | 0.70 |
| race = black | 823,328 | 0.12 | 308,910 | 0.13 |
| race = other | 823,328 | 0.13 | 308,910 | 0.17 |
| Hispanic origin | 823,328 | 0.18 | 308,910 | 0.31 |
| education = less than high school | 823,328 | 0.10 | 308,910 | 0.25 |
| education = high school | 823,328 | 0.29 | 308,910 | 0.75 |
| education = some college | 823,328 | 0.61 | 308,910 | 0 |
| Child Outcome: | | | | |
| cognitive difficulty | 823,328 | 0.039 | 308,910 | 0.047 |
| physical difficulty | 823,328 | 0.023 | 308,910 | 0.028 |
| attending school | 823,328 | 0.98 | 308,910 | 0.98 |
| mover | 823,328 | 0.18 | 308,910 | 0.15 |
| In-utero Environment: | | | | |
| avg. childhood CHIP eligibility | 823,328 | 0.22 | 308,910 | 0.36 |
| avg. maternal eligibility | 823,328 | 0.31 | 308,910 | 0.48 |
| avg. unemployment rate (%) | 823,328 | 4.69 | 308,910 | 4.77 |
| avg. tobacco tax (dollar) | 823,328 | 0.45 | 308,910 | 0.45 |
| Childhood CHIP eligibility: | | | | |
| avg. in ages 0-5 | 823,328 | 0.32 | 308,910 | 0.50 |
| avg. in ages 6+ | 823,328 | 0.32 | 308,910 | 0.49 |

Notes. Sample contains children conceived between 1996 and 2001 to mothers between age 21 and 40 at the time of delivery. These cohorts are taken from ACS 2010-2014. All averages adjusted by ACS sampling weights.

Table 4.2 summarizes the long-run sample, averaging over school-age children conceived between 1996 and 2001 who are successfully matched to mothers in ACS 2010-2014. Child outcomes are worse for those born to less than college educated mothers: they suffer more from cognitive difficulty, despite significantly greater CHIP eligibility in both early (average of 0.50 in age 0 to 5) and later childhood (0.49 in age 6 and above). School attendance is nearly universal within this age range. Race and ethnicity of the child are almost identical to the mother’s. Age at fertility is comparable across the birth and the long-run sample, but mothers in the long-run sample have higher educational attainment, suggesting that selective coresidence, and potentially mortality, of the child by parental SES may bias the long-run estimates if not controlled for. 104 In-utero conditions are averaged over 4 quarters covering the birth quarter and the previous three quarters. The in-utero unemployment rates and tobacco tax rates replicate those in the birth sample. Average CHIP eligibility, as perceived in-utero, is 0.22 (4.18 years) in the full sample and 0.36 (6.84 years) in the high impact sample. Because in pre-reform years (1996-1997) Medicaid coverage is not at all generous to older children, these averages are lower than realized CHIP eligibility summarized in the last two rows of Table 4.2. Understanding how in-utero eligibility interacts with actual eligibility sheds light on how in-utero investment complements later life efforts and the nature of their joint effect on long-run outcomes of the child.

4.4 In-utero Investment Response

This section presents estimated effects of CHIP on parental investment in the fetus, looking separately at health service utilization and health behavior. I first show event study estimates comparing cohorts differentially impacted by CHIP in utero in Section 4.4.1. I also evaluate the extent to which sample selection may be biasing the research design.
To address selection, and to study the dose response to program generosity at different stages of childhood, I simulate program eligibility following the life cycle income distribution of women in the national sample, and present instrumental variable estimates in Section 4.4.3. In Section 4.4.4 I estimate the birth-weight production function to understand the joint effect of health care utilization and health behavior on birth outcomes.

104 Another change is a higher fraction of “other” race being reported in the long-run sample. Therefore the income distribution by race in the 2010s may be different from that in the 1990s, and the non-stationarity tends to reduce the power of the instrument. However, I find very similar results if I instead simulate over the 2005-2014 ACS, a period closer to the long-run sample.

4.4.1 Event Study

In the event study I compare otherwise similar cohorts conceived just before and after the program onset in each state. The underlying assumption is that cohort trends would have been parallel if none were impacted by CHIP in utero. The event study provides direct evidence as to the plausibility of this assumption. In addition, it is informative as to the timing of the effect over the gestation, comparing cohorts differentially impacted by CHIP in utero. I match birth records to state CHIP roll-out dates listed in Table C.1.1 using mother’s state of residence at the time of delivery. I relate the conception year-month of each birth, derived by deducting gestation months from birth month, to the enrollment onset dates, limiting the sample to those conceived 12 months before up to 4 months after the onset. The balanced sample facilitates the study of in-utero response to CHIP in the following ways. Assuming an average gestation of 9 months, cohorts conceived 10 to 12 months before the program onset are impacted shortly after birth, but not impacted in-utero.
Cohorts conceived 7 to 9 months before the program are impacted in the third trimester; those conceived 4 to 6 (1 to 3) months before the onset are impacted in the second (first) trimester. Lastly, those conceived after the program onset receive full impact in-utero. Therefore, I group the 17 conception cohorts in my sample into 5 sub-cohorts. One should not see much differential trending in the first sub-cohort, and the relative size of effects in the subsequent 4 sub-cohorts suggests the timing of the investment response during pregnancy. Henceforth I refer to the first sub-cohort as the pre-treatment cohort, the next 3 sub-cohorts as the third, second, and first trimester cohorts, and the last sub-cohort as the full-impact cohort. One caveat is that the pre-treatment cohort will not be eligible for CHIP in their first 1 to 3 months of life, whereas later conception cohorts can be immediately enrolled in CHIP upon birth. This need not introduce any bias if the outcome is observed at the time of birth, but for long-run outcomes, the estimates then reflect both in-utero and very early life impacts of the program. This cohort difference is difficult to address due to the non-episodic nature of public programs. I drop those states that simultaneously expanded maternal and child eligibility at the onset of the program, because the event study design cannot separately identify the investment response to the child’s insurance coverage as opposed to the mother’s own. For example, it is difficult to interpret increases in health service utilization as investment in the child, when maternal coverage also encourages health care access. With simulated eligibility I will be using all states while controlling for both types of coverage. What is reassuring is that, given the greatest variation over this period is in cumulative childhood eligibility rather than maternal eligibility, even using the full set of states in the event study still gives very similar results.
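The grouping of the 17 conception cohorts into five sub-cohorts described above can be written as a simple lookup. The sketch below is illustrative: the function name and label strings are hypothetical, and, following Figure 4.2, cohorts conceived in the onset month itself (month 0) are assumed to belong to the full-impact group.

```python
def exposure_cohort(months_from_onset):
    """Map conception month relative to CHIP onset to an exposure sub-cohort.

    Negative values mean conception before onset. An average gestation of
    9 months is assumed, as in the text.
    """
    if not -12 <= months_from_onset <= 4:
        raise ValueError("outside the event window of -12 to +4 months")
    if months_from_onset <= -10:
        return "pre-treatment"      # born before onset, never impacted in utero
    if months_from_onset <= -7:
        return "third trimester"
    if months_from_onset <= -4:
        return "second trimester"
    if months_from_onset <= -1:
        return "first trimester"
    return "full impact"            # conceived at or after onset


labels = [exposure_cohort(j) for j in range(-12, 5)]
```

The first four groups each contain three conception months and the full-impact group contains five, which together account for the 17 cohorts in the balanced window.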
I estimate the following regression to study program impact on in-utero investment and birth outcomes:

\[
y_{iymc} = \beta_0 + \sum_{\substack{j=-12,\dots,4 \\ j \neq -10}} \alpha_j \, I\{\text{conception}_{ym} - \text{onset}_{s(c)} = j\} + \beta_1 \, UE_{icym} + \beta_2 \, \text{tobaccotax}_{is(c)ym} + X_i \gamma + \delta_y + \delta_m + \delta_c + \lambda_c \, t_{ym} + \epsilon_{iymc}. \tag{21}
\]

That is, I compare outcomes for children conceived in different year $y$, month $m$, and county $c$ who are first exposed to CHIP in their state at different stages of the gestation ($j$). The key coefficients are those on the set of timing indicators, $\alpha_j$, where I normalize the effect on the last of the pre-treatment cohorts (those conceived ten months prior to onset) to zero. Although the roll-out is at the state level, I control for more granular geographical variation with fixed effects for all identified counties and equivalent areas in the natality sample. To further address any secular trending correlated with both program onset and outcomes, I include linear time trends for these counties. I therefore utilize variation by sub-state local areas in the estimation of the program impact, and address pre-existing trending in outcomes by these areas. The main results remain similar, however, if only state level variation is used. Specific to each child $i$, I also control for her in-utero exposure to local economic conditions, proxied by the county unemployment rate and the state tobacco tax rate. Both have been shown in previous literature to have significant influence on birth outcome and later life success, 105 and hence I control for them directly in the model. Because the timing of fertility and birth may respond to both of these environment variables, the actual exposure during gestation is likely endogenous. To overcome this limitation, instead of averaging over the actual gestation period, I average over a fixed 11 month period with the sixth month coinciding with the mid-month of the gestation.
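The fixed-window averaging of the unemployment and tobacco tax controls can be sketched as below. This is a minimal illustration, not the chapter’s code: the function name, the running month index, and the rule used to locate the mid-month of gestation (conception month plus half the gestation length, rounded down) are assumptions.

```python
def fixed_window_average(series, conception_index, gestation_months, window=11):
    """Average a monthly series over a fixed window centred on mid-gestation.

    series           : dict mapping a running month index -> value
                       (e.g. county unemployment rate by month)
    conception_index : running month index of conception
    gestation_months : gestation length in whole months
    """
    mid = conception_index + gestation_months // 2   # assumed mid-month rule
    half = window // 2                               # 5 months on either side
    months = range(mid - half, mid + half + 1)       # sixth month is `mid`
    return sum(series[m] for m in months) / window


# Toy series whose value equals its month index, so the mean equals the mid-month.
toy_series = {m: float(m) for m in range(60)}
nine_month = fixed_window_average(toy_series, conception_index=20, gestation_months=9)
seven_month = fixed_window_average(toy_series, conception_index=20, gestation_months=7)
```

The window always spans 11 months, so a shorter or longer pregnancy shifts the window’s center but never changes how many months of exposure enter the average.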
Children conceived in the same year and month can have different gestation lengths, and are therefore exposed to different unemployment and tobacco tax rates in-utero. These controls are approximately moving averages of year-month specific economic conditions, and due to concerns of multicollinearity, I control for conception year and month fixed effects separately in the regression. 106 The individual controls included in $X_i$ include indicators for interactive cells of mother age band (20-24, 25-29, 30-34, 35-40), race (White, Black, other), Hispanic origin, education (high school drop-out, high school, some college), and marital status at the time of delivery, indicators of birth order (1, 2, 3, 4, 5+), plurality (1, 2, 3, 4, 5+), and a host of maternal comorbidity controls in the natality data including, for example, whether the mother has anemia, cardiovascular disease, lung disease, diabetes, and whether the previous birth is pre-term. Results are not sensitive to interactive or level demographic controls, or additional linear trends by demographic cells, suggesting selective program exposure by demographics is unlikely to drive the result. Results are also not sensitive to additional controls of county characteristics varying by year, but to preserve sample size I do not include them in the main analysis.

105 For example, Dehejia and Lleras-Muney (2004) find infant health is counter-cyclical. Fertility is believed to be pro-cyclical (Schaller, 2016), with a higher unemployment rate reducing both short-run and lifetime conception (Currie and Schwandt, 2014). Higher tobacco taxes in-utero increase birth weight (Evans and Ringel, 1999), and further improve child health in the medium run (Simon, 2016).

106 Results are similar, however, if I only control for conception month unemployment rate and tax rate but include interactive year-month fixed effects in the model.
Figure 4.2 plots event study estimates on birth outcomes and 95% confidence intervals with standard errors clustered by state. I stratify the sample by maternal SES: on the left panel of each outcome I show full sample estimates and those specific to less than college educated single mothers. Among the low SES sample, I further stratify by location, and focus on mothers residing in above-median welfare transfer counties in 1996. Because low-income households in these areas are more likely to access the safety net, behavior changes in this sub-sample are more likely driven by the CHIP eligibility of their children, allowing better identification of the program impact on its most likely beneficiaries. As results from the “low SES + high welfare” sample agree fairly well with instrumental variable evidence in Section 3.2, I consider these my preferred estimates.

Compared to pre-treatment cohorts, those exposed to CHIP in utero have significantly higher birth weight, lower incidence of low or very low birth weight, and longer gestation. The effect is larger down the SES gradient. In the least advantaged “low SES + high welfare” sample, gestation increases by about half a week, and the rate of low birth weight is reduced by 0.04, or 40% from the baseline of 0.10. The effect is not only large but also immediate: cohorts narrowly exposed in the final trimester still exhibit significant improvements in most birth outcomes, and the magnitude is generally larger for cohorts impacted earlier in the gestation. There is no strong evidence of in-utero CHIP exposure affecting the Apgar score, although the effect is larger and increasing in exposure if I instead use raw integer scores between 0 and 10. None of the indicators for the pre-treatment cohorts are significant, suggesting that parallel trends in the absence of the reform are plausible.

Figure 4.3 looks at health behavior during pregnancy.
At the time of delivery, mothers are asked if they ever smoked or drank during gestation, and if they did, their average daily consumption. Health behavior is recalled over the entire gestation, rather than for each trimester. Hence, mechanically, the effect will be larger for first-trimester cohorts and smaller for third-trimester cohorts, whose behavior in the first two trimesters tends to bias against finding any significant changes for this group. On the other hand, the measure of weight gain does not suffer from this bias, and there is a significant increase in weight gain for mothers impacted in the third trimester, with the effect rising with earlier exposure.

Focusing on first-trimester cohorts, smoking incidence is 0.03 lower among the lowest SES sample, representing a 12% reduction from a baseline of 24.5%. Given that I already control for the tobacco tax faced by pregnant mothers, and tobacco tax changes over this period are relatively sparse, the estimated reduction in smoking rates reflects the maternal investment response to later-life insurance provision to the child. In terms of magnitude, the coefficient on $\mathit{tobaccotax}_{is(c)ym}$ is -0.127, which implies that a dollar increase in the in-utero average tobacco tax reduces maternal smoking by 0.127. The child investment incentive for smoking cessation is therefore roughly comparable to an additional 25-cent increase in the tobacco tax (0.03/0.127 is about 0.24).

[Figure 4.2: Event Study Estimates on Birth Outcomes. Panels: Birth Weight (gram); Low Birth Weight (<2500 grams); Very Low Birth Weight (<1500 grams); Apgar Score >=8; Gestation Length (week); Pre-term Birth. Each outcome is shown for the full and low SES (non-college, single mothers) samples and for the low SES + high welfare sample. Horizontal axis: conception relative to CHIP onset, months (-12 to 4), with cohorts labeled never impacted, impact in 3rd trimester, impact in 2nd trimester, impact in 1st trimester, and full impact.]

[Figure 4.3: Event Study Estimates on Pregnancy Health Behavior. Panels: Smoked in Pregnancy; 5+ Cigarettes Daily; Drink Alcohol in Pregnancy; 2+ Drinks Daily; Weight Gain (lbs.). Samples, horizontal axis, and cohort labels as in Figure 4.2.]

Turning to health care utilization, Figure 4.4 presents event study estimates on the number of doctor visits, the first month of pre-natal care, and a synthetic measure of whether pre-natal care is adequate, where care adequacy increases with earlier onset of care and more doctor visits. Overall, there is minimal change in the number of doctor visits during the roll-out of CHIP. This is not surprising given that states in the event study sample did not expand coverage to pregnant mothers. However, there is a significant increase in the probability of having 8 or more visits among cohorts impacted in the third trimester, implying more intensive care-seeking by near-term mothers.
The magnitude is relatively small: compared to the baseline probability of 0.787, the increment of 0.015 is about 2%. In terms of care onset, although the low SES sample suggests a small and marginally insignificant increase in care being started in the first trimester, the low SES + high welfare sample points to a significant decrease in early onset of pre-natal care, and the effect is more concentrated among cohorts impacted in the first trimester, as expected. Evidence is less conclusive in this case, however, since the parallel trend assumption is less likely to hold with marginally significant pre-trends. A similar pattern emerges for care adequacy, which is simply a synthetic measure of doctor visits and care onset.

In specifications not shown, I have constructed indicators of whether care started in the first two, four, and six months of gestation. In-utero exposure to CHIP significantly delays care onset in the first 2 to 4 months, with no significant effect on later onset of care. In other words, although the total number of doctor visits does not change, initiation of care tends to be delayed beyond the first trimester, followed by more intensive care utilization in the third trimester. I interpret this as suggestive evidence of “procrastination” by impacted mothers in early stages of care.
[Figure 4.4: Event Study Estimates on Health Care Utilization. Panels: # of Doctor Visits; 8+ Doctor Visits; Care Started in 1st Trimester; Care Is Adequate. Each outcome is shown for the full and low SES (non-college, single mothers) samples and for the low SES + high welfare sample. Horizontal axis: conception relative to CHIP onset, months (-12 to 4), with cohorts labeled never impacted, impact in 3rd trimester, impact in 2nd trimester, impact in 1st trimester, and full impact.]

Put together, the in-utero response to CHIP is much larger in terms of health behavior than health care utilization. Part of the reason is that CHIP induces later utilization when the child is born, with no direct incentive for contemporaneous utilization in pregnancy. Another reason could be that dynamic complementarity is stronger between health behavior and utilization than between utilization in different periods. Behavioral features such as time-inconsistent preferences may also be important for the delay in care onset among impacted mothers.
Combining health behavior and utilization, overall investment has resulted in significant improvement in birth outcomes, which is in line with parental investment seeking to maximize the long-run benefit to the child, taking into account the dynamic complementarity in human capital development over the life cycle.

4.4.2 Robustness and Alternative Mechanisms

The event study as it stands is subject to a few concerns. Primarily, given that the sample contains only live births, if the program also shifts the position of the marginal live birth in the full distribution of all potential births, then estimates based on the selected sample do not recover the causal effect on birth outcomes. Additionally, there might be selective exposure to the program through deliberate timing of either fertility or birth. Although a “reduced-form” interpretation of event study estimates over a fixed 9-month gestation alleviates selection in the timing of birth, there might still be residual selection in the timing of fertility, and the selection response may differ by demographics. Here, to better understand how selected program exposure and live births affect event study estimates, I place fertility and fetal death rates directly on the left-hand side, and estimate the following regression:

$$
r_{dymc} = \alpha_0 + \sum_{\substack{j=-12,\ldots,4 \\ j \neq -10}} \beta_j \, I\{\mathit{conception}_{ym} - \mathit{onset}_{s(c)} = j\} + \gamma_1 \mathit{UE}_{cym} + \gamma_2 \, \mathit{tobaccotax}_{s(c)ym} + \theta_d + \theta_y + \theta_m + \theta_c + \lambda_c t_{ym} + \varepsilon_{dymc}.
$$

The outcome $r_{dymc}$ is the fertility or fetal death rate by demographic cell $d$, conception year $y$, month $m$, and county of residence $c$. Details on how I constructed these rates are in Section 2. Due to limited cells in the SEER population count data, demographic groups are stratified by age band, race, and Hispanic origin only. I include fixed effects for each of the groups in $\theta_d$. The in-utero environmental variables $\mathit{UE}_{cym}$ and $\mathit{tobaccotax}_{s(c)ym}$ are averaged over the 4 quarters including the birth quarter and the 3 quarters before. Figure 4.5 shows how fertility and fetal death rates changed with CHIP onset.
The fertility response to CHIP in the full sample is negligible, and does not appear to bias estimates from the high welfare sample until 2 quarters before the onset, where the fertility rate increases by a tiny amount of 0.0008, a 10% uptick from the baseline. This might affect the event study estimates on the second and first trimester cohorts to a small extent. The reduction in the fetal death rate is large and significant in the full sample, but much smaller and insignificant in high welfare counties. Overall, selection through conception timing and fetal death is therefore unlikely to be substantial. Hence, assuming quasi-random roll-out of the program, selective fertility and live birth do not pose a considerable challenge to a causal interpretation of the timing effects of the program.

[Figure 4.5: Fertility and Fetal Survival Response to CHIP. Panels: Fertility Rate; Fetal Death Rate. Each outcome is shown for the full sample and the high welfare sample. Horizontal axis: conception relative to CHIP onset, months (-12 to 4), with cohorts labeled never impacted, impact in 3rd trimester, impact in 2nd trimester, impact in 1st trimester, and full impact.]

A remaining issue is whether the observed timing effects indeed reflect the parental investment incentive in the child over the long run, rather than other pathways of influence extraneous to child development. Imagine the worst-case scenario where I am simply picking up secular trends in smoking rates over the roll-out period of CHIP, and the trend is either unrelated to CHIP, or related to CHIP in a way that lowers the smoking rate for the entire population, regardless of child-bearing status. The latter case would cast doubt on human capital investment in the child as a potential mechanism.
I follow two steps to address this concern. First, I extend the event study to include older cohorts who are not exposed to CHIP onset in utero. Specifically, I expand the “control” cohorts to those conceived 16 months to 10 months prior to CHIP onset. I re-examine the pre-trend for key outcomes such as birth weight (Appendix Figure C.2.1), smoking (Appendix Figure C.2.2), and drinking (Appendix Figure C.2.3) behavior during pregnancy, and do not find significant effects on control cohorts for these outcomes. Consistent with the main analysis, effects are concentrated in the low-SES sample in high welfare transfer counties. The results indicate that secular trends in health behavior and birth outcomes are unlikely to explain the differential effects of CHIP exposure on cohorts.

In a second step, I replicate the key findings on health behavior using samples from the Behavioral Risk Factor Surveillance System (BRFSS). One advantage of the BRFSS sample is that it includes both pregnant and non-pregnant women, and the latter group provides a natural placebo test. I use the monthly variation in BRFSS to construct a similar event study design around the enrollment start month by state. Unlike previous specifications where I follow conception cohorts, this design is completely cross-sectional: I do not observe the mother’s conception month or her current trimester in BRFSS. I therefore examine whether the smoking rate decreased more among pregnant women on average than among non-pregnant women, before and after the reform in different states. I interpret the “triple difference” as suggesting the relative importance of child investment incentives, and the effect on the sample of pregnant mothers as providing one additional robustness check on the corresponding results in Figure 4.3.

Table 4.3 summarizes the BRFSS sample, averaged over the 1996-2001 survey years. Included women are aged between 21 and 40.
I stratify the sample depending on whether the woman is pregnant at the time of interview. Demographics are remarkably similar across samples except for marital status: pregnant women are more likely to be married. Pregnant women are also significantly less likely to smoke or drink, and have better overall health, especially mental health, than non-pregnant women. Their insurance coverage rate is also higher.

Figure 4.6 shows event study estimates from the following regression:

$$
h_{iyms} = \alpha_0 + \sum_{\substack{j=-6,\ldots,6 \\ j \neq 0}} \beta_j \, I\{\mathit{interview}_{ym} - \mathit{onset}_{s} = j\} + \gamma_1 \mathit{UE}_{sym} + \gamma_2 \, \mathit{tobaccotax}_{sym} + X_i \delta + \theta_y + \theta_m + \theta_s + \lambda_s t_{ym} + \varepsilon_{iyms},
$$

where I include individuals interviewed 6 months before and after the onset. I control for state monthly unemployment and tobacco tax rates, which may have independent effects on smoking and other health behavior. I include demographic cell fixed effects and cell-specific trends in $X_i$, as well as state fixed effects and state-specific trends.

There do not appear to be systematic differences in smoking incidence across the two samples over this period. Smoking intensity, however, is much lower with CHIP among pregnant women: the reduction in heavy smoking (5+ cigarettes daily) is about 0.04, or 20% from the baseline of 0.20. The non-pregnant sample, on the other hand, exhibits the reverse trend of increasing smoking intensity over this period. This lends further credence to the finding from the natality sample.
The specific effect on pregnant women suggests mothers factor in the future well-being of their children when making health choices in utero, and this consideration is made more salient with the introduction of CHIP.

Table 4.3: Descriptive Statistics, BRFSS

                                      Pregnant Women           Non-pregnant Women
                                      Observations   Mean      Observations   Mean
age                                   9,385          28.91     181,957        30.96
race
  =White                              7,388          0.82      142,016        0.79
  =Black                              7,388          0.10      142,016        0.12
  =other                              7,388          0.083     142,016        0.087
Hispanic origin                       9,360          0.16      181,443        0.15
education
  =less than high school              9,381          0.11      181,818        0.094
  =high school                        9,381          0.27      181,818        0.28
  =some college                       9,381          0.62      181,818        0.62
married                               9,372          0.78      181,609        0.57
have previous child                   9,370          0.68      181,629        0.68
behavior in past 30 days
  did not smoke                       3,125          0.65      74,348         0.36
  5+ cigarettes daily                 3,125          0.20      74,348         0.39
  10+ cigarettes daily                3,125          0.17      74,348         0.34
  did not drink                       5,759          0.89      111,820        0.44
2+ weeks of good physical health      3,071          0.87      64,668         0.89
2+ weeks of good mental health        3,240          0.87      82,578         0.83
any insurance coverage now            9,382          0.89      181,738        0.81
in good health now                    9,385          0.94      181,957        0.92
unemployment rate (%)                 9,385          4.18      181,957        4.19
tobacco tax (dollar)                  9,385          0.43      181,957        0.43

Notes. Sample contains women between age 21 and 40 sampled between 1996 and 2001. All averages adjusted by BRFSS final sampling weights.

[Figure 4.6: Event Study Estimates on Health Behavior, BRFSS. Panels: Did not Smoke in Past 30 Days; 5+ cigarettes daily; 10+ cigarettes daily; Did not Drink in Past 30 Days. Each panel shows pregnant women and non-pregnant women. Horizontal axis: month before/after CHIP onset (-6 to 6).]
The change in drinking behavior is less conclusive: there is a general downward trend in alcohol consumption among non-pregnant women, but no discernible trend for pregnant women.

Using BRFSS, one can also test for other plausible mechanisms that could explain the improved health behavior in pregnancy. For example, it is possible that campaigns surrounding the roll-out of CHIP serve to remind eligible mothers to take up Medicaid for their own coverage during pregnancy, which in turn may induce more health care utilization and potentially changes in behavior. Previous literature has documented such “welcome mat” effects for other social insurance programs. In this context, however, insurance take-up can also be motivated by the investment incentive, since health care in pregnancy most likely benefits the fetus. Therefore, I view this mechanism as complementary. A priori, however, because the increase in care utilization is very small compared to changes in health behavior, own coverage gain is unlikely to play a major role.

The upper left panel in Figure 4.7 compares differential changes in coverage rate by pregnancy status. In both samples the general trend suggests an increase in the coverage rate over this period, with larger but insignificant increases in some months in the pregnancy sample. Observe that the overall trend in insurance coverage for pregnant women mirrors that for the broader population. If greater insurance access were the mechanism at play, one would expect a similarly parallel trend in the smoking rate across these samples as well. However, Figure 4.6 shows evidence of a diverging trend for measures of smoking intensity. Hence, the insurance coverage gain in this period is unlikely to be the causal mechanism behind the improved health behavior of pregnant mothers.

In addition to improving child well-being, insurance provision to children may also have direct health benefits for the pregnant mother.
For instance, it may lower the stress level experienced in pregnancy, and contribute to better birth outcomes (Persson and Rossin-Slater, 2016; Black, Devereux, and Salvanes, 2016). There are a few reasons why the mother’s mental health may be improved with CHIP. Financially, the opportunity to enroll the child in heavily subsidized public insurance programs lowers the long-run cost of raising the child, reducing the economic stress surrounding fertility choices and during gestation. Higher income is also positively correlated with overall health, so the effect need not be limited to mental health. In terms of human capital development, greater access to health care after birth makes health at birth less “immutable” and less important for the child’s future stock of health. This gives the mother more opportunities to invest over a longer period of time, which tends to lower the stress associated with gestational investments.

[Figure 4.7: Alternative Mechanisms, BRFSS. Panels: Have Any Insurance Coverage; Good Health; Good Physical Health; Good Mental Health. Each panel shows pregnant women and non-pregnant women. Horizontal axis: month before/after CHIP onset (-6 to 6).]

The remaining panels in Figure 4.7 investigate the effect on the mental, physical, and overall health of the mother. Surprisingly, there is in fact a downward trend in mental health among pregnant women after CHIP, as opposed to a generally upward trend among non-pregnant women. There is also no significant change in either overall health or physical health.
Therefore, there is no strong evidence suggesting that CHIP improved birth outcomes by lowering maternal stress, or induced healthier behavior by insuring more mothers. Both might still be contributing factors, but they cannot answer the more relevant question of why in-utero investment would respond to future investment returns the way it did, or speak to the implications of these dynamic linkages.

4.4.3 Simulated Eligibility

The event study in the previous section focuses on the timing of the effect on cohorts impacted at different trimesters of gestation. Alternatively, one might be interested in the relative effect of coverage expansion in different phases of childhood on in-utero investments and birth outcomes. Unlike the event study, which looks at the binary onset of the program, for this analysis I need a continuous measure of program generosity. Following a large literature led by Currie and Gruber (1996a), Currie and Gruber (1996b), and Cutler and Gruber (1996), I parametrize program eligibility rules for children of different ages using simulated eligibility from a reference national sample, and study the dose response of investment behavior to program expansions.

I collect Medicaid and CHIP eligibility rules for children and pregnant mothers over the years from 1997 to 2000. This period covers the roll-out of CHIP in all 50 states included in this study. I track both the program onset and subsequent expansions, and codify Medicaid eligibility rules when CHIP is not yet established. For each state and year-month, I record the eligibility for small children between age 0-5, older children starting at age 6, and pregnant mothers. I then average the eligibility over the gestation period for each child in the birth sample, and construct measures of in-utero exposure to insurance eligibility in early childhood ($\mathit{elig05}$), in later childhood ($\mathit{elig6{+}}$), and for mothers during pregnancy ($\mathit{eligpreg}$).
Specifically, I transform raw program parameters, expressed as percentages of the federal poverty level (FPL), to the percent eligible within demographic cells in a reference sample. I include in this sample all women who are between age 21 and 40 in the 2000 decennial census. The national census in a fixed survey year averages out any local and secular trends that may be correlated with the roll-out of CHIP. More importantly for this study, I use the comprehensive sample of all women, rather than the more selected sample of mothers or children, to address the sample selection issue of live birth and a potential fertility response to the program correlated with unobserved mother characteristics. To facilitate within-state identification by demographics, I follow Mahoney (2015), and simulate over age-race-ethnicity-education-marriage cells of the woman. Because eligibility in this case is forward-looking, I map the child’s progression through childhood onto the life cycle of the mother, and construct projected childhood eligibility in the following way:

$$
\mathit{elig05}(a, \bar d, s, t) = \frac{\#\{\mathit{famincFPL}_i \le \mathit{eFPL05}_{s,t},\; a \le \mathit{age}_i \le a+5,\; \bar d_i = \bar d\}}{\#\{a \le \mathit{age}_i \le a+5,\; \bar d_i = \bar d\}}
$$

$$
\mathit{elig6{+}}(a, \bar d, s, t) = \frac{\#\{\mathit{famincFPL}_i \le \mathit{eFPL6{+}}_{s,t},\; a+6 \le \mathit{age}_i \le a+18,\; \bar d_i = \bar d\}}{\#\{a+6 \le \mathit{age}_i \le a+18,\; \bar d_i = \bar d\}}
$$

$$
\mathit{eligpreg}(a, \bar d, s, t) = \frac{\#\{\mathit{famincFPL}_i \le \mathit{eFPLpreg}_{s,t},\; \mathit{age}_i = a-1,\; \bar d_i = \bar d\}}{\#\{\mathit{age}_i = a-1,\; \bar d_i = \bar d\}},
$$

where $\bar d_i$ are demographic controls of the woman except age, and $\mathit{famincFPL}_i$ is the woman’s family income transformed to percentage FPL according to the 2000 poverty guideline. For a hypothetical mother giving birth at age $a = 30$, for example, her projected insurance eligibility for that child in the first six years of life, $\mathit{elig05}$, is based on the income distribution of women between age 30 and 35, and $\mathit{elig6{+}}$ is based on the income distribution at ages 36 to 48.[107]
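The simulated eligibility shares defined above can be sketched in a few lines. This is an illustration under stated assumptions rather than the procedure actually run on the census extract: the record layout (`age`, `faminc_fpl`, `cell`), the function names, the toy reference sample, and the income thresholds are all hypothetical.

```python
def simulated_eligibility(sample, threshold, age_lo, age_hi, cell):
    """Share of reference-sample women in `cell`, aged age_lo..age_hi, whose
    family income (as a percent of FPL) is at or below the eligibility threshold."""
    pool = [w for w in sample
            if w["cell"] == cell and age_lo <= w["age"] <= age_hi]
    if not pool:
        return None  # no reference women in this age range and cell
    eligible = sum(1 for w in pool if w["faminc_fpl"] <= threshold)
    return eligible / len(pool)


def projected_eligibility(sample, a, cell, efpl05, efpl6plus, efplpreg):
    """Projected eligibility for a mother giving birth at age `a` in one state-month."""
    return {
        "elig05": simulated_eligibility(sample, efpl05, a, a + 5, cell),
        "elig6+": simulated_eligibility(sample, efpl6plus, a + 6, a + 18, cell),
        "eligpreg": simulated_eligibility(sample, efplpreg, a - 1, a - 1, cell),
    }


# Toy reference sample (hypothetical records).
reference = [
    {"age": 29, "faminc_fpl": 120, "cell": "A"},
    {"age": 29, "faminc_fpl": 220, "cell": "A"},
    {"age": 30, "faminc_fpl": 90, "cell": "A"},
    {"age": 32, "faminc_fpl": 180, "cell": "A"},
    {"age": 33, "faminc_fpl": 150, "cell": "A"},
    {"age": 35, "faminc_fpl": 250, "cell": "A"},
    {"age": 36, "faminc_fpl": 140, "cell": "A"},
    {"age": 40, "faminc_fpl": 300, "cell": "A"},
    {"age": 31, "faminc_fpl": 50, "cell": "B"},
]

# Hypothetical thresholds (percent of FPL) for one state and month.
result = projected_eligibility(reference, a=30, cell="A",
                               efpl05=200, efpl6plus=150, efplpreg=185)
print(result)
```

Because the shares are computed from a fixed reference sample, they move only when the thresholds change, which is the sense in which simulated eligibility isolates program rules from local income composition.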
I assign the simulated eligibility to the corresponding observations in the birth sample; for every mother $i$ in state $s$ and cell $d$, I have a sequence of program eligibility varying by month. I average over gestation months to derive mother $i$’s in-utero exposure to childhood eligibility:

$$
\mathit{elig05}_i = \frac{\sum_t \mathit{elig05}(a_i, \bar d_i, s_i, t)\, I\{\mathit{conception}_i \le t \le \mathit{birth}_i\}}{\mathit{birth}_i - \mathit{conception}_i + 1}
$$

$$
\mathit{elig6{+}}_i = \frac{\sum_t \mathit{elig6{+}}(a_i, \bar d_i, s_i, t)\, I\{\mathit{conception}_i \le t \le \mathit{birth}_i\}}{\mathit{birth}_i - \mathit{conception}_i + 1}
$$

$$
\mathit{eligpreg}_i = \frac{\sum_t \mathit{eligpreg}(a_i, \bar d_i, s_i, t)\, I\{\mathit{conception}_i \le t \le \mathit{birth}_i\}}{\mathit{birth}_i - \mathit{conception}_i + 1}.
$$

[107] One might wonder if mothers actually forecast based on cross-sectional income dynamics. There might also be cohort trends that make the forecast less reliable over longer horizons. Fortunately, results are very similar if I instead simulate over the 2001-2014 period using annual ACS samples, or if I combine both the 2000 census and the 2001-2014 ACS.

With these measures I estimate the following regression:

$$
h_{iymc} = \alpha_0 + \beta_1 \mathit{elig05}_i + \beta_2 \mathit{elig6{+}}_i + \beta_3 \mathit{eligpreg}_i + \gamma_1 \mathit{UE}_{icym} + \gamma_2 \, \mathit{tobaccotax}_{is(c)ym} + X_i \delta + \theta_y + \theta_m + \theta_c + \lambda_c t_{ym} + \varepsilon_{iymc}. \tag{22}
$$

Similar to Equation (21), I control for confounding in-utero influences such as the unemployment rate and the state tobacco tax. I also control for county fixed effects and county-specific trends. Because eligibility is simulated by cells, I include cell fixed effects and cell-specific trends in $X_i$, in addition to birth order, plurality, and maternal comorbidity controls. I include separate fixed effects for conception year and conception month, although year-month fixed effects barely change the results for this specification because multicollinearity is less of a concern.

The key coefficients are $\beta_i$, $i = 1, 2, 3$. For interpretation, it is useful to think of the coefficients as unrestricted weights mothers attach to program generosity in different phases of childhood, with larger weights suggesting larger behavioral responses.
Algebraically, the regression is equivalent to the following:

$$
h_{iymc} = \alpha_0 + \beta \, \mathit{CHIP}_i + \gamma_1 \mathit{UE}_{icym} + \gamma_2 \, \mathit{tobaccotax}_{is(c)ym} + X_i \delta + \theta_y + \theta_m + \theta_c + \lambda_c t_{ym} + \varepsilon_{iymc},
$$

where $\mathit{CHIP}_i = w_1 \mathit{elig05}_i + w_2 \mathit{elig6{+}}_i + w_3 \mathit{eligpreg}_i$ is the overall CHIP generosity, a weighted average of sub-program generosity. It is easy to see that $\beta = \beta_1 + \beta_2 + \beta_3$, and the weights can be reconstructed as $w_i = \beta_i / \sum_j \beta_j$, $i = 1, 2, 3$. This suggests that the relative magnitude of the $\beta_i$ is informative for understanding the type of expansion most relevant for the behavioral changes of pregnant mothers.

To estimate the $\beta_i$ consistently, one needs to ensure that simulated eligibility is a pure parametrization of program rules unconfounded by behavioral selection related to the program. Without this condition, the behavioral response enters both the covariates and the weights, complicating the interpretation of the $\beta_i$. The way I simulate the eligibility should adequately address endogenous program roll-out and sample selection of live births. When I average over gestation months, however, I introduce potential biases from endogenous timing of fertility and birth, if mothers plan gestation around program onset and expansion dates. To purge the selective timing and duration of gestation, I adopt a strategy similar to that in Currie and Rossin-Slater (2013), and average over a fixed 11-month period with the 6th month coinciding with the actual middle month of the gestation. This way I enclose every pregnancy in the birth sample and use leeway on both ends to alleviate the selection in either fertility or birth. When I use these exogenous eligibility measures as instruments, the reduced form is the following:[108]

$$
h_{iymc} = \alpha_0 + \beta_1 \mathit{elig05iv}_i + \beta_2 \mathit{elig6{+}iv}_i + \beta_3 \mathit{eligpregiv}_i + \gamma_1 \mathit{UE}_{icym} + \gamma_2 \, \mathit{tobaccotax}_{is(c)ym} + X_i \delta + \theta_y + \theta_m + \theta_c + \lambda_c t_{ym} + \varepsilon_{iymc}. \tag{23}
$$

Table 4.4 shows the result for birth outcomes. I limit the sample to low SES mothers who are less than college educated and single at the time of delivery.
I first show the naive OLS estimates of Equation (22), where in-utero exposure to program generosity may be biased by the mother’s timing of gestation. In the lower part I show reduced form estimates corresponding to Equation (23). These estimates are very similar to two-stage least squares (2SLS) estimates where endogenous exposure is instrumented by exposure over a fixed gestation length, because first stage coefficients are very close to unity. For the sake of brevity I omit 2SLS estimates from the table.

First note that OLS estimates are generally smaller in magnitude than reduced form estimates, especially for eligibility in later childhood. Including demographic cell specific trends also seems to enlarge the effect of later childhood eligibility, where the expansion is largest relative to Medicaid eligibility prior to CHIP. Given that average childhood eligibility increased from 0.16 to 0.32 during the roll-out, the corresponding increase in birth weight is approximately 32 grams (200*0.16), which is similar in magnitude to what I find in the event study. Birth weight, low birth weight, gestation, and infant health are all most responsive to coverage expansions in later childhood.

[108] It is possible that mothers are able to time fertility over longer horizons than a few months. If, for example, the program starts in July 1998, and a child who would have been conceived in 1997 is pushed over to be born in August 1998, then the instrument will not be able to address this type of planning. However, given that I include both location and demographic specific trends, these controls should at least partially address the selection into the program in the longer term.
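As a rough illustration of the weight formula $w_i = \beta_i / \sum_j \beta_j$, the sketch below plugs in the reduced-form birth weight coefficients reported in Table 4.4 for the specification with demographic cell trends. The function name `implied_weights` and the dictionary layout are assumptions made for the example, and the resulting weights are point calculations that ignore sampling error; they are only easy to interpret when the coefficients share a sign, as they do here.

```python
def implied_weights(betas):
    """Return the overall coefficient (sum of the betas) and the implied weights."""
    total = sum(betas.values())
    if total == 0:
        raise ValueError("weights are undefined when the coefficients sum to zero")
    return total, {name: b / total for name, b in betas.items()}


# Reduced-form coefficients on birth weight (grams), Table 4.4, Panel B,
# column with demographic cell trends.
betas = {"elig05": 221.66, "elig6+": 262.72, "eligpreg": 205.41}

beta, weights = implied_weights(betas)
for name, w in weights.items():
    print(f"{name}: {w:.3f}")
print(f"overall coefficient: {beta:.2f}")
```

In this calculation later childhood eligibility receives the largest weight, which matches the reading of the table in the text.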
Table 4.4: Effect on Birth Outcomes, Simulated Eligibility

| | (I) birth weight (gram) | (I) | (II) low birth weight | (II) | (III) gestation (week) | (III) | (IV) pre-term birth | (IV) | (V) Apgar 8 | (V) |
|---|---|---|---|---|---|---|---|---|---|---|
| Panel A: OLS | | | | | | | | | | |
| eligpreg | 155.33*** | 165.66*** | -0.033* | -0.038** | 0.55** | 0.66** | -0.067* | -0.076** | 0.027 | 0.041 |
| | (45.68) | (45.18) | (0.018) | (0.017) | (0.26) | (0.25) | (0.035) | (0.034) | (0.055) | (0.053) |
| elig05 | 267.64*** | 213.50* | -0.10*** | -0.077* | 1.18 | 0.45 | -0.13* | -0.061 | 0.093 | 0.0063 |
| | (97.68) | (108.45) | (0.036) | (0.040) | (0.72) | (0.73) | (0.076) | (0.078) | (0.060) | (0.074) |
| elig6+ | 69.21 | 129.49** | -0.045** | -0.074*** | 1.54*** | 2.27*** | -0.12*** | -0.19*** | 0.21*** | 0.30*** |
| | (54.04) | (63.93) | (0.022) | (0.026) | (0.38) | (0.42) | (0.042) | (0.045) | (0.042) | (0.057) |
| R² | 0.14 | 0.14 | 0.13 | 0.13 | 0.081 | 0.082 | 0.070 | 0.070 | 0.036 | 0.036 |
| Panel B: reduced form | | | | | | | | | | |
| eligpregiv | 187.58*** | 205.41*** | -0.050*** | -0.059*** | 0.72** | 0.88*** | -0.11** | -0.13*** | 0.065 | 0.084* |
| | (40.48) | (41.51) | (0.016) | (0.017) | (0.27) | (0.26) | (0.042) | (0.041) | (0.046) | (0.045) |
| elig05iv | 319.90** | 221.66 | -0.13** | -0.078 | 1.34 | 0.29 | -0.19 | -0.056 | 0.083 | -0.039 |
| | (128.47) | (136.68) | (0.051) | (0.053) | (0.91) | (0.92) | (0.12) | (0.11) | (0.066) | (0.088) |
| elig6+iv | 153.48** | 262.72*** | -0.089*** | -0.14*** | 2.15*** | 3.23*** | -0.24*** | -0.37*** | 0.29*** | 0.42*** |
| | (64.97) | (75.34) | (0.028) | (0.033) | (0.46) | (0.51) | (0.055) | (0.058) | (0.051) | (0.074) |
| R² | 0.14 | 0.14 | 0.13 | 0.13 | 0.082 | 0.083 | 0.071 | 0.071 | 0.036 | 0.036 |
| N | 1,924,520 | | 1,924,520 | | 1,925,743 | | 1,925,743 | | 1,606,957 | |
| Y-mean | 3221.58 | | 0.099 | | 38.63 | | 0.23 | | 0.89 | |
| trends: county | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| trends: demo. cell | | Y | | Y | | Y | | Y | | Y |

Notes. Please see main text for details. *** 0.01 ** 0.05 * 0.10

Table 4.5: Effect on Health Behavior, Simulated Eligibility

| | (I) smoke | (I) | (II) 5+ cigarettes daily | (II) | (III) drink | (III) | (IV) weight gain (lbs.) | (IV) |
|---|---|---|---|---|---|---|---|---|
| Panel A: OLS | | | | | | | | |
| eligpreg | 0.057 | 0.049 | 0.057** | 0.044 | 0.020* | 0.015 | -1.96* | -1.74 |
| | (0.040) | (0.040) | (0.027) | (0.027) | (0.010) | (0.010) | (1.17) | (1.15) |
| elig05 | -0.24*** | -0.20** | -0.22*** | -0.16** | -0.013 | -0.026* | 0.76 | 0.026 |
| | (0.064) | (0.084) | (0.050) | (0.066) | (0.012) | (0.013) | (2.07) | (2.55) |
| elig6+ | 0.10** | 0.053 | 0.10*** | 0.035 | -0.0066 | -0.0050 | 2.16 | 2.98 |
| | (0.048) | (0.067) | (0.036) | (0.050) | (0.0076) | (0.0087) | (1.37) | (1.83) |
| R² | 0.19 | 0.19 | 0.19 | 0.19 | 0.030 | 0.032 | 0.050 | 0.050 |
| Panel B: reduced form | | | | | | | | |
| eligpregiv | 0.057 | 0.048 | 0.059** | 0.044 | 0.020* | 0.016 | -1.63 | -1.34 |
| | (0.042) | (0.042) | (0.028) | (0.028) | (0.010) | (0.010) | (1.067) | (1.05) |
| elig05iv | -0.26*** | -0.22** | -0.24*** | -0.18** | -0.016 | -0.030** | 0.87 | -0.24 |
| | (0.071) | (0.0093) | (0.056) | (0.073) | (0.013) | (0.014) | (2.17) | (2.64) |
| elig6+iv | 0.10* | 0.051 | 0.11*** | 0.031 | -0.0079 | -0.0074 | 3.05** | 4.30** |
| | (0.058) | (0.074) | (0.039) | (0.055) | (0.0080) | (0.0092) | (1.38) | (1.83) |
| R² | 0.19 | 0.19 | 0.19 | 0.19 | 0.030 | 0.032 | 0.050 | 0.050 |
| N | 1,571,746 | | 1,525,624 | | 1,600,339 | | 1,480,744 | |
| Y-mean | 0.25 | | 2.58 | | 0.055 | | 29.78 | |
| trends: county | X | X | X | X | X | X | X | X |
| trends: demo. cell | | X | | X | | X | | X |

Notes. Please see main text for details. *** 0.01 ** 0.05 * 0.10

Table 4.6: Effect on Health Care Utilization, Simulated Eligibility

| | (I) # doctor visit | (I) | (II) month care started | (II) | (III) care started in 1st trimester | (III) | (IV) care is adequate | (IV) |
|---|---|---|---|---|---|---|---|---|
| Panel A: OLS | | | | | | | | |
| eligpreg | 0.99 | 0.96 | -0.54 | -0.53 | 0.13 | 0.13 | 0.13 | 0.12 |
| | (1.39) | (1.43) | (0.67) | (0.68) | (0.14) | (0.15) | (0.15) | (0.15) |
| elig05 | -0.13 | 0.21 | -0.081 | -0.22 | 0.057 | 0.093 | 0.032 | 0.087 |
| | (0.57) | (0.73) | (0.32) | (0.38) | (0.067) | (0.082) | (0.060) | (0.076) |
| elig6+ | 0.43 | 0.16 | 0.31 | 0.44* | -0.095** | -0.13** | -0.072* | -0.12** |
| | (0.37) | (0.48) | (0.19) | (0.23) | (0.040) | (0.051) | (0.038) | (0.047) |
| R² | 0.097 | 0.098 | 0.075 | 0.076 | 0.059 | 0.060 | 0.077 | 0.078 |
| Panel B: reduced form | | | | | | | | |
| eligpregiv | 1.11 | 1.08 | -0.56 | -0.54 | 0.14 | 0.13 | 0.13 | 0.13 |
| | (1.45) | (1.49) | (0.69) | (0.71) | (0.15) | (0.15) | (0.15) | (0.15) |
| elig05iv | -0.054 | 0.24 | -0.087 | -0.24 | 0.061 | 0.10 | 0.036 | 0.095 |
| | (0.61) | (0.80) | (0.35) | (0.42) | (0.074) | (0.091) | (0.065) | (0.083) |
| elig6+iv | 0.59 | 0.37 | 0.34 | 0.48* | -0.10** | -0.14** | -0.076* | -0.13** |
| | (0.38) | (0.53) | (0.20) | (0.25) | (0.043) | (0.056) | (0.040) | (0.051) |
| R² | 0.097 | 0.098 | 0.075 | 0.076 | 0.075 | 0.060 | 0.077 | 0.078 |
| N | 1,851,607 | | 1,866,374 | | 1,866,374 | | 1,830,592 | |
| Y-mean | 10.54 | | 3.07 | | 0.71 | | 0.62 | |
| trends: county | X | X | X | X | X | X | X | X |
| trends: demo. cell | | X | | X | | X | | X |

Notes. Please see main text for details. *** 0.01 ** 0.05 * 0.10

On the other hand, pregnancy smoking behavior is more responsive to early childhood eligibility, with no significant effect of maternal and later childhood coverage on either smoking incidence or intensity (Table 4.5). Interestingly, mothers seem to drink more often with more generous coverage during pregnancy, by roughly as much as they are more likely to abstain from drinking with higher eligibility in early childhood. The negative effect that maternal coverage has on pregnancy health behavior is not novel. For example, Dave et al. (2015) find an increased smoking rate and lower weight gain associated with a previous Medicaid expansion in the 1980s.
In this case, although a similar effect on weight gain is evident, it is not quite significant and is counteracted by a much larger, positive effect brought by later childhood coverage expansions.

In terms of health care utilization (Table 4.6), frequency of doctor visits is most responsive to maternal pregnancy coverage, although the effect is not precisely estimated. Later childhood eligibility tends to delay onset of care beyond the first trimester, and the same pattern shows up in the event study. By almost exactly the same magnitude it also reduces care adequacy: with little impact on the number of visits, the change in care adequacy is driven completely by delayed care onset. For this set of outcomes, reduced form and OLS estimates are very similar across specifications with or without demographic trend controls. This is another indication that care utilization is probably more responsive to own coverage options, which remained stable over the study period.

4.4.4 Birth Weight Production Functions

Having shown that insurance eligibility changes in-utero health investments and birth outcomes, I then seek to understand the “structural” pathway from these investments to birth outcomes, using exogenous program generosity measures as instruments. Conceptually, estimates in the previous section relate to the following reduced form equations:

$$\begin{aligned}
\text{birth weight} &= f(\mathit{eligpregiv}, \mathit{elig05iv}, \mathit{elig6{+}iv}) \\
\text{smoking} &= g(\mathit{eligpregiv}, \mathit{elig05iv}, \mathit{elig6{+}iv}) \\
\text{doctor visits} &= h(\mathit{eligpregiv}, \mathit{elig05iv}, \mathit{elig6{+}iv}),
\end{aligned}$$

whereas the birth weight production function itself takes investments such as smoking and doctor visits as arguments:

$$\text{birth weight} = \phi(\text{smoking}, \text{doctor visits}). \tag{24}$$

Simulated eligibility measures are valid instruments for inputs in Equation (4) if investments respond significantly to program generosity (first stage correlation), and if program rules do not affect birth outcomes except through their effect on actual investments (exclusion restriction).
The exclusion restriction is plausible in this case because CHIP is unlikely to directly benefit the fetus and improve birth outcomes except through parental investment responses during pregnancy. Both the first stage and reduced form results are strong and significant in Section 3.3, suggesting the data contain sufficient variation to identify the relationship in Equation (4).

Historically, a large literature has estimated variants of Equation (4) to understand how birth weight varies with in-utero investments. Earlier studies led by Rosenzweig and Schultz (1982), followed by Corman et al. (1987), Grossman and Joyce (1990), and Warner (1998), use local market characteristics that affect the mother’s input demand to identify the role of health investments in birth weight. These studies generally find a significant, positive role of health care utilization. More recently, researchers have increasingly used welfare reforms that alter care affordability for the low-income population to identify the birth weight equation. Currie and Grogger (2002) show that Medicaid expansion increases prenatal care utilization by pregnant mothers, while the effect on infant health is relatively small. Similarly, Evans and Lien (2005) find a small and imprecisely estimated effect of pre-natal care on birth weight. Using Medicaid reimbursement reform, on the other hand, Sonchak (2015) shows that pre-natal care visits lead to a significant increase in birth weight among children born to White mothers.

Adding to this literature, I consider jointly the role of health behavior and pre-natal care in the birth weight of the newborn. I use expansion over different phases of childhood for (over-)identification. I focus on smoking and smoking intensity for health behavior, given the large reduction driven by the program, and on the number of doctor visits as the main measure of pre-natal care utilization.
I assume the birth weight production function takes the exponential functional form

$$\text{birth weight} = (\text{doctor visits})^{\alpha_1} \exp(\alpha_0 + \alpha_2\,\text{smoke}), \tag{25}$$

and among smokers, I assume a Cobb-Douglas production function in the total number of doctor visits and the number of cigarettes daily:

$$\text{birth weight} = \exp(\theta_0)\,(\text{doctor visits})^{\theta_1}\,(\text{cigarettes})^{\theta_2}. \tag{26}$$

An expected negative coefficient of $\alpha_2$ and $\theta_2$ implies a decreasing marginal return of doctor visits at higher intensity of smoking, or complementarity between health behavior and pre-natal care utilization.

To estimate the production function I take logarithms of birth weight, number of doctor visits and cigarettes daily. I then estimate the model by two-stage least squares. In addition to investment inputs, I also include the same set of right hand side variables as in Equation (3): in particular, I include both county and demographic cell level linear trends, and measures of in-utero environments such as unemployment rate and tobacco tax. At the minimum, I need two instruments: one for smoking behavior and the other for pre-natal care. Based on the magnitude of the estimates, I choose to instrument smoking with early childhood eligibility, and pre-natal care with the mother’s own eligibility in Table 4.7, Column (2). To test the functional form of the production function, as well as to shed light on the validity of the instruments, I include additional instruments for over-identification. I first experiment with the interaction of early childhood and maternal eligibility in Column (3). I then consider later childhood eligibility for over-identification in Column (4). Relative to OLS estimates in Column (1), instrumented structural estimates are much larger in magnitude. 2SLS estimates have strong first stage explanatory power, are fairly robust to different sets of instruments, and pass the over-identification test, except for the full sample estimate when using teenage eligibility as the over-identifying instrument.
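The step from the two functional forms to a linear estimating equation, and the link between a negative smoking coefficient and a lower marginal return to doctor visits, can be written out explicitly. This is a sketch only: $\alpha$ and $\theta$ are placeholder symbols for the parameters of Equations (25) and (26), and the additional controls and error terms are omitted.

```latex
% Taking logs of the exponential form (25) and the Cobb-Douglas form (26):
\begin{align*}
\log(\text{birth weight}) &= \alpha_0 + \alpha_1 \log(\text{doctor visits}) + \alpha_2\,\text{smoke}, \\
\log(\text{birth weight}) &= \theta_0 + \theta_1 \log(\text{doctor visits}) + \theta_2 \log(\text{cigarettes}).
\end{align*}
% Marginal return to doctor visits under each form:
\begin{align*}
\frac{\partial\,\text{birth weight}}{\partial\,\text{doctor visits}}
  &= \alpha_1 (\text{doctor visits})^{\alpha_1 - 1} \exp(\alpha_0 + \alpha_2\,\text{smoke}), \\
\frac{\partial\,\text{birth weight}}{\partial\,\text{doctor visits}}
  &= \theta_1 \exp(\theta_0)\,(\text{doctor visits})^{\theta_1 - 1}\,(\text{cigarettes})^{\theta_2}.
\end{align*}
% With \alpha_1 > 0 and \alpha_2 < 0 the first expression is smaller for smokers;
% with \theta_1 > 0 and \theta_2 < 0 the second falls as cigarettes rise.
```

Both log equations are linear in the parameters, which is what allows estimation by two-stage least squares once the inputs are instrumented.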
All IV specifications pass the under-identification test. The most appreciable difference with the instruments is that the coefficients on the smoking variables become much larger in magnitude relative to the corresponding OLS estimates. For instance, smoking incidence significantly reduces birth weight, by a far wider margin than the effect of one more prenatal care visit during pregnancy. That prenatal visits do not have a particularly big effect on birth outcomes is consistent with prior research. This pattern is clearly not very well captured by naive OLS estimates, which would suggest a much larger effect of prenatal care on birth weight relative to smoking.

The effect of doctor visits on birth weight is higher in the smoker sample, where greater physician care and maternal smoking intensity have similar but opposite impacts on birth weight. To simplify, birth weight production by smoking mothers is approximately proportional to $\sqrt{\#\text{doctor visits} / \#\text{cigarettes}}$, with greater tobacco consumption lowering the return of doctor visits. This implies complementarity between care utilization and health behavior, with better health behavior increasing the return to prenatal care. Furthermore, the magnitude also suggests that the return to reduced smoking intensity in terms of birth weight is fairly modest: the same increase can alternatively be achieved by one more doctor visit during pregnancy. This is not to say, however, that pregnancy smoking has no detrimental effect on the child in the long run. Despite the mediating effect of pregnancy care on smoking for the outcome of birth weight, the full detrimental effect of smoking on the child may not materialize until much later in life.

Lastly, applying the estimated production function to mothers impacted by the program suggests that, since variation in pre-natal care utilization during the CHIP roll-out is small, improvements in birth weight are largely driven by smoking cessation during pregnancy.

Table 4.7: Birth Weight Production Function

| | (I) | (II) | (III) | (IV) |
|---|---|---|---|---|
| Full sample | | | | |
| log doctor visits | 0.082*** | 0.23* | 0.29** | 0.30** |
| | (0.0034) | (0.13) | (0.12) | (0.12) |
| | | [25.18] | [19.05] | [17.18] |
| smoke | -0.057*** | -1.22*** | -1.14*** | -1.04*** |
| | (0.0012) | (0.19) | (0.26) | (0.19) |
| | | [31.87] | [22.01] | [21.54] |
| R² | 0.17 | -3.91 | -3.48 | -2.98 |
| N=1,509,387 | | | | |
| Overid. p-value | | | 0.62 | 0.012 |
| Underid. p-value | | <0.0001 | 0.0001 | <0.0001 |
| Smoker sample | | | | |
| log doctor visits | 0.082*** | 0.54*** | 0.54*** | 0.59*** |
| | (0.0033) | (0.12) | (0.097) | (0.11) |
| | | [21.14] | [14.56] | [14.58] |
| log cigarettes | -0.020*** | -0.51*** | -0.51*** | -0.51*** |
| | (0.00090) | (0.11) | (0.12) | (0.12) |
| | | [11.82] | [8.27] | [7.87] |
| R² | 0.18 | -2.87 | -2.87 | -3.13 |
| N=357,807 | | | | |
| Overid. p-value | | | 0.995 | 0.12 |
| Underid. p-value: 0.0012 0.0013 | | | | |

Notes. Dependent variable is log birth weight. Column (1) shows OLS estimates. Column (2) shows 2SLS estimates where eligpregiv instruments for log doctor visits and elig05iv instruments for smoke or log cigarettes. Column (3) shows over-identified 2SLS estimates where the instruments in addition include eligpregiv × elig05iv. In Column (4) the over-identifying instrument is elig6+iv. First stage F-statistics are in square brackets. Sample includes low SES mothers who are less than college educated and single. Please see text for more details. *** 0.01 ** 0.05 * 0.10

4.5 Long-run Impact

A growing literature on the fetal origin hypothesis has shown that in-utero and early life shocks have long-lasting impacts on later life outcomes. There is also mounting evidence from the medical literature that maternal smoking during pregnancy impairs the brain development of the fetus, increasing the risk of the child developing psychiatric disorders such as ADHD, depression and autism.
These conditions tend to limit the cognitive skill of the child, and adversely affect their labor market performance. In this section I unravel the long-run implications of reduced pregnancy smoking for the cognitive skill of the impacted cohorts in later childhood. To do so I first compare cohorts with differential in-utero exposure to policy incentives that reduce maternal smoking in an event study set-up. I then sort out the dynamics of investments across multiple periods using simulated eligibility, and show that the long-run impact of in-utero exposure is due to an increased return to future investments, rather than an independent, time-invariant effect fixed by the time of birth.

4.5.1 Event Study

I focus on the cognitive skill of children between age 8 and 18 when sampled in the ACS between 2010 and 2014. The CHIP cohorts, those impacted by CHIP in utero or around birth, fall within this age range when they are sampled in the most recent census data. In this range the majority of the children are still attending school, hence I do not observe their labor market outcomes and focus on their cognitive performance at school. If the child reports having difficulty remembering in response to one of the functional limitation questions, then I code the child as having low cognitive skill.

An earlier study by Levine and Schanzenbach (2009) also looks at the long-run impact of CHIP on children’s cognitive skills, using test scores as a continuous measure. The main difference here is that I specifically study in-utero exposure to CHIP, rather than staggered exposure after birth. In terms of outcome, while test scores are better at revealing changes in the average skill by cohort, my measure of cognitive limitation focuses on the more extreme cases of skill deficiency. Understanding how CHIP improves the conditions of the most vulnerable kids complements the depiction of the average population. Moreover, using the rich demographic and family background variables in the census data but absent in test
score data, I can quantify the effect on particular sub-groups most likely to benefit from the program. The “triple difference” facilitates better identification and understanding of the program impact and its potential pathways.

I match children to their co-residing mothers using the mother locator developed by Ruggles et al. (2018), available through the Integrated Public Use Microdata Series (IPUMS). I then append maternal demographics such as race, ethnicity, and education to the child observation. I calculate the age of the mother at the birth of the child by subtracting the child’s age from the mother’s current age. To reduce potential matching error, and to maintain consistency with the birth sample, I only include cases where the derived age at delivery of the child is between 21 and 40. I then cross-check the overlap between the child’s demographics and those of the mother: in more than 95% of the cases, mother and child share identical race and ethnicity (for a summary, see Table 4.2). In what follows I therefore control for maternal race and ethnicity where relevant, rather than those of the child, although the result is not at all sensitive to this choice. Knowing the education of the mother, I then stratify the sample and expect larger effects on children born to low SES mothers with less than college education.

I recode program onset dates into quarters, and merge with the birth quarter and state of birth variables in the ACS to construct timing indicators of program onset relative to the birth of the child. The event study specification for the long-run outcome is:

$$k_{ibsyqw} = \beta_0 + \sum_{l=-4,\dots,7;\; l \neq -4} \beta_l\, I\{\mathit{birth}_{yq} - \mathit{onset}_b = l\} + \gamma_1\,\mathit{UE}_{isyq} + \gamma_2\,\mathit{tobaccotax}_{isyq} + \delta X_i + \mu_y + \mu_q + \mu_w + \mu_{yw} + \mu_s + \mu_{sw} + \mu_b + \lambda_b t_{yq} + \varepsilon_{ibsyqw}, \tag{27}$$

where the unit of observation is individual $i$ born in state $b$ in year $y$, quarter $q$, surveyed in wave $w$ (years 2010 through 2014) in state $s$.
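The construction of the timing indicators can be sketched as follows. This is an illustrative sketch rather than the code behind the estimates: the function names and the quarter-index convention are hypothetical, and the choice of the earliest included quarter as the omitted reference category is an assumption made here for the example.

```python
# Sketch of event-time indicators in quarters; names are illustrative assumptions.

def quarter_index(year, quarter):
    """Map (year, quarter) to a running integer index."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3 or 4")
    return year * 4 + (quarter - 1)

def event_time(birth_year, birth_quarter, onset_year, onset_quarter):
    """Quarters between birth and program onset (negative: born before onset)."""
    return (quarter_index(birth_year, birth_quarter)
            - quarter_index(onset_year, onset_quarter))

def event_dummies(l, lo=-4, hi=7, omitted=-4):
    """Indicators for each included event time except the omitted category.

    Returns None for cohorts outside the [lo, hi] window, which are dropped.
    The omitted reference category is an assumption of this sketch.
    """
    if l < lo or l > hi:
        return None
    return {k: int(k == l) for k in range(lo, hi + 1) if k != omitted}

# A child born in 1998 Q3 in a state whose program started in 1998 Q1.
l_example = event_time(1998, 3, 1998, 1)
dummies = event_dummies(l_example)
```

Cohorts in the reference quarter receive zeros on every indicator, so each estimated coefficient is read relative to that cohort.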
Included cohorts are born between 4 quarters before and 7 quarters after CHIP onset. Assuming a three-quarter gestation, they are conceived between 7 quarters before and 4 quarters after CHIP onset. I control for both birth state and current state fixed effects, and birth state linear trends to account for endogenous timing of the roll-out. Interactive fixed effects by current state and survey wave are included to absorb any state differential trends at the time of the survey. Demographic variables in $X_i$ include cell fixed effects by child gender, maternal race, ethnicity and education, as well as interactive fixed effects of these cells and birth year dummies, to address any potential trending by demographics that is correlated with the program. I flexibly control for birth cohort and child age at the time of the survey using two-way fixed effects by birth year and survey year. In-utero environment variables such as unemployment rate and tobacco tax are averaged over 4 quarters including the birth quarter and the 3 quarters before. I weight observations by ACS sampling weights, and cluster standard errors by state of birth.

Figure 4.8 plots the event study estimates. Here I transform timing indicators to reflect differences from conception rather than birth quarter, to better showcase the effect of in-utero exposure and to be consistent with prior plots from the birth sample. As with the birth sample, in the event study I exclude states that expanded maternal eligibility simultaneously with the onset of CHIP: these states would introduce confounding factors other than investment incentives in the child. Furthermore, in the long-run specification, when comparing pre-trend with in-utero periods, one should also bear in mind that the pre-treatment cohorts are not immediately eligible for CHIP once born, whereas the in-utero and subsequent cohorts are.
The nuance is unlikely to severely bias the result given that infant eligibility from Medicaid is traditionally high and CHIP expansion is greatest for children in later childhood. Still, results are probably more reliable and more informative if compared within the in-utero cohorts.

[Figure 4.8: Long-run Impact on Cognitive Skill. Series: full sample; low SES mothers. Horizontal axis: conception relative to CHIP onset, quarters (−7 to 4), with cohorts marked as never impacted, partial impact, and full impact. Vertical axis: low cognitive skill.]

The comparison with the full sample suggests the importance of demographic variables that better identify the at-risk population where one might reasonably expect effects to be larger. While in-utero exposure to CHIP has a small and not at all significant impact on average cognitive skill, it has a much larger and significant impact on children born to low SES mothers. The effect starts to kick in for cohorts exposed in the first and second trimester, and is fairly persistent for later cohorts receiving full impact in utero. The timing is consistent with the first two trimesters being vital for the brain development of the fetus. Although potentially confounded by differential eligibility in the first year of life, pre-treatment cohorts exhibit remarkably parallel trends in both samples.

4.5.2 Simulated Eligibility

To investigate the mechanism by which in-utero CHIP exposure has a long-lasting impact on later childhood cognitive ability, and in particular how in-utero investment interacts with later life investments once the child is born and becomes eligible for CHIP, I construct two sets of eligibility measures of the program. The first set exclusively uses program information available to pregnant mothers during gestation, and gives the projected average childhood eligibility by the time of birth. The second set traces actual eligibility rules as they unfolded over the course of childhood.
I calculate average realized eligibility in early childhood (ages 0-5) and later childhood (ages 6 to the age observed in ACS). These measures are reduced form proxies of parental investments in the respective phases of child development. Using these measures I explore the level and interactive relationships between investments across different periods in the cognitive outcome of the child.

Specifically, I use a weighted average of elig05 and elig6+ to derive average childhood eligibility as believed in utero: $\mathit{uteroCHIP} = \frac{6}{19}\,\mathit{elig05} + \frac{13}{19}\,\mathit{elig6{+}}$. Once the child is born, I update eligibility rules for the child at each age by state of birth, and construct eligibility per age for a given maternal demographic cell. Implicitly I assume the child does not move for this simulation to be correct. I hence limit the sample to non-movers whose state at the time of the survey is the same as the state of birth. I then average realized coverage eligibility between age 0 and 5, denoted by CHIP05, and over age 6 until the age at which the child is sampled, denoted by CHIP6+. I estimate the following regression on low SES children who have not moved from the birth state and are between age 8 and 18 when sampled in ACS 2010-2014:

$$\begin{aligned}
k_{ibyqw} ={}& \beta_0 + \beta_1\,\mathit{uteroCHIP} + \beta_2\,\mathit{CHIP05} + \beta_3\,\mathit{uteroCHIP} \times \mathit{CHIP05} + \beta_4\,\mathit{CHIP6{+}} \\
&+ \beta_5\,\mathit{uteroCHIP} \times \mathit{CHIP6{+}} + \beta_6\,\mathit{CHIP05} \times \mathit{CHIP6{+}} + \beta_7\,\mathit{eligpreg} + \gamma_1\,\mathit{UE}_{iyqb} \\
&+ \gamma_2\,\mathit{tobaccotax}_{iyqb} + \delta X_i + \mu_y + \mu_q + \mu_w + \mu_{yw} + \mu_b + \mu_{bw} + \lambda_b t_{yq} + \varepsilon_{ibyqw},
\end{aligned} \tag{28}$$

where I include two-way interactions between in-utero and realized childhood CHIP eligibility, and between early and later childhood eligibility. I control for in-utero exposure to maternal coverage, unemployment rate and tobacco tax over a 4-quarter period. I also allow for state-specific trends during the roll-out of the program, and state-year fixed effects at the time of the survey. Individual characteristics in $X_i$ include gender, maternal age band at delivery, race, ethnicity and education cells [109], as well as cell specific trends that may be correlated with roll-out.
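The two sets of eligibility measures can be illustrated with a short sketch. The function names and the eligibility values below are hypothetical; the 6/19 and 13/19 weights come from the text, and reading them as the share of years spent at ages 0-5 and ages 6-18 is an interpretation assumed here for the example.

```python
# Sketch of the projected (in-utero) and realized eligibility measures.
# Function names and example values are illustrative assumptions.

EARLY_AGES = range(0, 6)    # ages 0-5: 6 years
LATER_AGES = range(6, 19)   # ages 6-18: 13 years (assumed reading of 13/19)

def utero_chip(elig05, elig6plus):
    """Projected average childhood eligibility as of gestation."""
    n_early, n_later = len(EARLY_AGES), len(LATER_AGES)
    return (n_early * elig05 + n_later * elig6plus) / (n_early + n_later)

def realized_averages(elig_by_age, age_observed):
    """Average realized eligibility over ages 0-5 and over 6..age_observed."""
    if age_observed < 6:
        raise ValueError("child must be observed at age 6 or older")
    early = [elig_by_age[a] for a in EARLY_AGES]
    later = [elig_by_age[a] for a in range(6, age_observed + 1)]
    return sum(early) / len(early), sum(later) / len(later)

# Hypothetical cell: 20% eligible in early childhood rules, 40% in later ones.
projected = utero_chip(0.2, 0.4)

# Hypothetical realized path: rules become more generous from age 4 onward.
path = {age: (0.2 if age < 4 else 0.6) for age in range(0, 19)}
chip05, chip6plus = realized_averages(path, age_observed=10)
```

The projected measure depends only on rules in force during gestation, while the realized averages change with each later rule change, which is what allows the two to enter the regression separately and in interaction.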
I include all states except Tennessee in the sample, although the result is not sensitive to this exclusion.

Table 4.8: Long-run Impact on Cognitive Skill, Investment Dynamics

| | (I) | (II) | (III) |
|---|---|---|---|
| uteroCHIP | 0.0012 | 0.012 | 0.023 |
| | (0.011) | (0.039) | (0.037) |
| CHIP05 | -0.063* | -0.057* | -0.070* |
| | (0.036) | (0.033) | (0.039) |
| uteroCHIP × CHIP05 | | -0.11* | -0.23** |
| | | (0.059) | (0.10) |
| CHIP6+ | 0.042 | 0.0011 | 0.023 |
| | (0.026) | (0.033) | (0.041) |
| uteroCHIP × CHIP6+ | | 0.095 | 0.20 |
| | | (0.075) | (0.13) |
| CHIP05 × CHIP6+ | | 0.056 | 0.021 |
| | | (0.067) | (0.11) |
| eligpreg | 0.031 | 0.029 | 0.0056 |
| | (0.022) | (0.020) | (0.026) |
| eligpreg × CHIP05 | | | 0.13 |
| | | | (0.10) |
| eligpreg × CHIP6+ | | | -0.085 |
| | | | (0.080) |
| In-utero environment | | | |
| unemployment rate | 0.0026 | 0.0027* | 0.0030* |
| | (0.0016) | (0.0016) | (0.0017) |
| tobacco tax | -0.014 | -0.014 | -0.013 |
| | (0.0093) | (0.0090) | (0.0084) |
| R² | 0.015 | 0.015 | 0.015 |
| N | 317,846 | 317,846 | 317,846 |

Notes. Please see text for more details. *** 0.01 ** 0.05 * 0.10

Table 4.8 shows the dynamic relationship between investments across different periods. In Column (1) I show the simplest specification where only level effects of eligibility are included. In Column (2) I estimate the specification in Equation (8), with full interaction between child eligibility over the in-utero period, early childhood and later childhood. In Column (3) I additionally control for any interaction between maternal coverage in pregnancy and later eligibility of the child. Across all these specifications there is a marginally significant effect of early childhood eligibility on cognitive development, but no statistically significant effect of in-utero and later childhood eligibility.

[109] Eligibility is simulated over maternal age band at delivery, race, ethnicity, and education. Hence identification comes solely from within cell variation determined by program generosity.
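The interaction coefficient in Table 4.8 can be converted into an implied effect size with a few lines of arithmetic. The sketch below uses the Column (3) coefficients from the table; the early childhood eligibility level of 0.50 and the baseline incidence of 0.047 are inputs assumed for the calculation rather than estimates reported in the table, and the variable names are illustrative.

```python
# Back-of-the-envelope conversion of Table 4.8, Column (3) coefficients.
# Variable names are illustrative; the two "assumed input" values are not
# estimates from the table.

beta_interaction = -0.23    # uteroCHIP x CHIP05
beta_unemployment = 0.0030  # in-utero unemployment rate
beta_tobacco_tax = -0.013   # in-utero tobacco tax

delta_utero = 0.10          # ten percentage point rise in in-utero eligibility
early_eligibility = 0.50    # assumed input: early childhood eligibility level
baseline = 0.047            # assumed input: baseline incidence of low skill

# Change in incidence of low cognitive skill through the interaction term.
effect = beta_interaction * delta_utero * early_eligibility

# The same change expressed relative to baseline and to the other covariates.
relative_change = effect / baseline
equiv_unemployment_pp = effect / beta_unemployment   # change in unemployment rate
equiv_tobacco_tax = effect / beta_tobacco_tax        # change in tobacco tax
```

A negative `equiv_unemployment_pp` corresponds to a fall in the unemployment rate, and a positive `equiv_tobacco_tax` to a rise in the tax, since the two covariates enter with opposite signs.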
On top of this level effect, in-utero CHIP exposure significantly increases the return to early childhood investments: for every ten percentage point increase in coverage eligibility in utero, the incidence of low cognitive skill decreases by an additional 1.15 percentage points through early childhood investment (0.10*0.23*0.50=0.0115), or a 24.5% reduction from a 0.047 baseline. The magnitude is comparable to a 3.83 percentage point reduction in the in-utero unemployment rate, or an 88 cent increase in tobacco tax.

The significant interactive effect suggests that dynamic complementarity in early critical periods of life is an important mechanism behind the long-run impact of in-utero investment: the return to subsequent investment is higher with greater in-utero investments and health at birth. Removing this channel, the impact of in-utero investment partially enters the return to later period investments, making the own effect not statistically different from zero. Another way to interpret the result is that the long-run relevance of in-utero investments materializes completely through added investment returns in future periods, rather than through an immutable, time-invariant effect set at birth. Similar complementarity is not detected across more distantly spaced periods (in-utero and later childhood) or between later periods (early and later childhood). This implies the dynamic linkage weakens as one moves away from the early critical period of life.

4.6 Conclusion

This paper shows that future insurance provision to children has a large influence on the investment behavior of parents in the fetus: smoking incidence and intensity decline and weight gain increases for mothers whose pregnancy overlaps with the roll-out of CHIP, and care utilization is more frequent among mothers impacted in the third trimester.
Jointly these in-utero inputs bring about a better health stock of the newborn: birth weight is higher, gestation is longer, and the Apgar score measuring overall bodily function of the infant is higher. Therefore, because of parental investment responses, the benefit of children’s insurance programs begins in the womb, even before the targeted children are born and administratively eligible for the program.

Consistent with the growing literature on the fetal origin hypothesis, in-utero health investments induced by the policy have a long-run impact on the cognitive development of the impacted cohorts as they age. Children of low educated mothers exposed to CHIP in their first and second trimester have a significantly lower probability of cognitive difficulties in late childhood. In terms of mechanism, it does not appear that in-utero investments have an independent, time-invariant effect on later life outcomes: the level effect is small and imprecisely estimated. Rather, almost all of the influence of in-utero investment operates through an increased investment return in early childhood, a period that significantly affects later childhood outcomes and whose effect is augmented by greater investments in utero. This finding lends support to the hypothesis that early childhood is the critical period for child development, and to the hypothesis that dynamic complementarity is much stronger within the early critical period than across later or more distant periods.

In relation to social insurance policies, dynamic complementarity in life cycle skill acquisition implies that social investment in the child and parental private investment can be complements: social insurance provision crowds in, rather than crowds out, the family’s investment in the next generation, and there is an efficiency gain in the long run that may in turn lower the cost of providing the insurance programs.
This complementary interaction between public and private investments is not always emphasized in the evaluation of many social safety programs. Better understanding this interaction, however, is important for fully recognizing the long-term benefit of social safety nets to vulnerable populations in still impressionable stages of development.

5 Conclusion

The design of social insurance programs is a complicated problem. Reforms aimed at improving the delivery of medical services may backfire if the policy results in market outcomes that disadvantage its intended beneficiaries. Economically, the issue is related to the contract design problem in a principal-agent relationship. Contract design that properly governs agents’ behavior is key to the realization of desirable market outcomes. In the case of the private Medicare market, a performance-based payment scheme is implemented to promote cost saving and quality. Chapter 2 shows that a flawed design in the contract, namely failing to sufficiently adjust for enrollee health conditions in the measurement of quality, resulted in favorable selection of low-risk enrollees but reduced access to quality for high-risk individuals. The distributional effect points to a potential tension between the private return to insurers given the payment incentives, and the socially beneficial allocation of quality across risk types.

Another difficulty in the design problem is a lack of focus on alternative policy proposals when one policy is being evaluated. For policies featuring small changes in the incentive design, the status quo economy is often not fully understood. As a result, it is not always clear what the motivation for the proposed policy is and how effective these policies are in addressing market failures. Chapter 3 attempts to formulate the benefits of insurance expansion policies in Massachusetts, and to evaluate the effectiveness of policies relative to costs.
I highlight the fact that the uninsured have access to medical services through the safety net program that provides charity care to the uninsured. Formal health insurance, however, has two major advantages over charity care. First, it alleviates the adverse selection effect on insurance price. Second, a safety net mandate financed by a tax-based subsidy on premiums is superior to one financed by surcharges on patients, as is the case in the status quo. Based on my calculation, benefits from lowering the charity care burden on payers exceed the costs of the subsidy to the government, generating a positive return on subsidy dollars. Ignoring the implicit cost and coverage of the safety net program would instead motivate the subsidy policy as mostly a means of redistribution, and would require comparisons with other means of redistribution in terms of effectiveness and efficiency.

Finally, the design of social insurance may suffer from biases arising from a short-term focus on costs and benefits. This is particularly relevant for early-life insurance programs that generate persistent health benefits throughout the life cycle. As an illustration, Chapter 4 looks at the Children’s Health Insurance Program that enrolls children in low-income families. A novel mechanism of impact is that pregnant mothers started responding to the program by reducing smoking and drinking during pregnancy, and birth outcomes improved for their children. The change in health behavior precedes actual enrollment in the program, but increases the return to later-life investments in the program. Exploiting the in-utero investment response, I find long-run benefits for the cognition of impacted cohorts in their teenage years. The preliminary finding suggests potential longer-term benefits in the labor market, including higher earnings and tax revenues. Fully accounting for the life-cycle benefits is likely to significantly raise the measured effectiveness of early-life insurance programs for children.

Bibliography

Aizawa, N. (2017). Labor market sorting and health insurance system design. mimeo.

Aizer, A., Eli, S., Ferrie, J. and Lleras-Muney, A. (2016). The long-run impact of cash transfers to poor families. American Economic Review, 106 (4), 935–71.

Akerlof, G. (1970). The market for lemons. Quarterly Journal of Economics, 84 (3), 488–500.

Akosa Antwi, Y., Moriya, A. S. and Simon, K. (2013). Effects of federal policy to insure young adults: evidence from the 2010 Affordable Care Act’s dependent-coverage mandate. American Economic Journal: Economic Policy, 5 (4), 1–28.

Allen, H., Swanson, A., Wang, J. and Gross, T. (2017). Early Medicaid expansion associated with reduced payday borrowing in California. Health Affairs, 36 (10), 1769–1776.

Andrews, I. and Miller, C. (2013). Optimal social insurance with heterogeneity.

Argys, L. M., Friedson, A. I., Pitts, M. M. and Tello-Trillo, D. S. (2017). Losing public health insurance: TennCare disenrollment and personal financial distress.

Athey, S. and Imbens, G. W. (2006). Identification and inference in nonlinear difference-in-differences models. Econometrica, 74 (2), 431–497.

Barseghyan, L., Molinari, F., O’Donoghue, T. and Teitelbaum, J. C. (2013). The nature of risk preferences: Evidence from insurance choices. American Economic Review, 103 (6), 2499–2529.

Barsky, R. B., Juster, F. T., Kimball, M. S. and Shapiro, M. D. (1997). Preference parameters and behavioral heterogeneity: An experimental approach in the Health and Retirement Study. The Quarterly Journal of Economics, 112 (2), 537–579.

Bauhoff, S. (2012). Do health plans risk-select? An audit study on Germany’s social health insurance. Journal of Public Economics, 96 (9), 750–759.

Bhargava, S., Loewenstein, G. and Sydnor, J. (2017). Choose to lose: Health plan choices from a menu with dominated option. The Quarterly Journal of Economics, 132 (3), 1319–1372.

Biasi, B. (2018). The labor market for teachers under different pay schemes. Mimeo, Yale University.

Blavin, F. (2016). Association between the 2014 Medicaid expansion and US hospital finances.
Jama, 316 (14), 1475–1483. Blomqvist,A. andHorn,H. (1984). Public health insurance and optimal income taxation. Journal of Public Economics, 24 (3), 353–371. 135 Boudreaux,M.H.,Golberstein,E. andMcAlpine,D.D. (2016). The long-term impacts of medicaid exposure in early childhood: Evidence from the program’s origin. Journal of Health Economics, 45, 161–175. Brevoort,K.,Grodzicki,D. andHackmann,M.B. (2017). Medicaid and financial health. Brown,D.W.,Kowalski,A.E. andLurie,I.Z. (2015). Medicaid as an Investment in Children: What is the Long-Term Impact on Tax Receipts? Tech. rep., National Bureau of Economic Research. Brown,J.,Duggan,M.,Kuziemko,I. andWoolston,W. (2014). How does risk selection respond to risk adjustment? New evidence from the Medicare Advantage Program. The American Economic Review, 104 (10), 3335–3364. Cabral,M.,Geruso,M. andMahoney,N. (2018). Do larger health insurance subsidies benefit patients or producers? Evidence from Medicare Advantage. American Economic Review, 108 (8), 2048—-2087. Card, D. and Shore-Sheppard, L. D. (2004). Using discontinuous eligibility rules to identify the effects of the federal medicaid expansions on low-income children. Review of Economics and Statistics, 86 (3), 752–766. Carey,C. (2017). Technological change and risk adjustment: Benefit design incentives in Medicare Part D. American Economic Journal: Economic Policy, 9 (1), 38–73. — (2018). Sharing the burden of subsidization: Evidence on pass-through from a payment revision in medicare part d, Mimeo, Cornell University. Chandra, A., Gruber, J. and McKnight, R. (2011). The importance of the individual mandate—evidence from massachusetts. New England Journal of Medicine, 364 (4), 293– 295. —, — and — (2014). The impact of patient cost-sharing on low-income populations: evidence from massachusetts. Journal of health economics, 33, 57–66. Chetty,R. (2006). A general formula for the optimal level of social insurance. Journal of Public Economics, 90 (10-11), 1879–1901. 
— andFinkelstein,A. (2013). Social insurance: Connecting theory to data. 5, 111–193. —,Friedman,J.N. andRockoff,J.E. (2014). Measuring the impacts of teachers i: Evalu- ating bias in teacher value-added estimates. American Economic Review, 104 (9), 2593– 2632. Clemens,J. andGottlieb,J.D. (2014). Do physicians’ financial incentives affect medical treatment and patient health? American Economic Review, 104 (4), 1320–49. CMS (2008). Medicare and you 2008. 10050, DIANE Publishing. 136 CMS (2016). Nhe fact sheet. https://www.cms.gov/ research-statistics-data-and-systems/statistics-trends-and-reports/ nationalhealthexpenddata/nhe-fact-sheet.html, accessed: 2018-06-08. Cohen, A. and Einav, L. (2007). Estimating risk preferences from deductible choice. American economic review, 97 (3), 745–788. Cohodes,S.R.,Grossman,D.S.,Kleiner,S.A. andLovenheim,M.F. (2016). The effect of child health insurance access on schooling: Evidence from public insurance expansions. Journal of Human Resources, 51 (3), 727–759. CongressionalBudgetOffice (2018). Reduce Quality Bonus Payments to Medicare Advan- tage Plans. Tech. rep., Congressional Budget Office. Corman,H.,Joyce,T.J. andGrossman,M. (1987). Birth outcome production function in the united states. Journal of Human Resources, pp. 339–360. Cunha,F. andHeckman,J. (2007). The technology of skill formation. American Economic Review, 97 (2), 31–47. Currie,J.,Decker,S. andLin,W. (2008). Has public health insurance for older children reduced disparities in access to care and health outcomes? Journal of health Economics, 27 (6), 1567–1581. — andGrogger,J. (2002). Medicaid expansions and welfare contractions: offsetting effects on prenatal care and infant health? Journal of Health Economics, 21 (2), 313–335. — and Gruber, J. (1996a). Health insurance eligibility and child health: lessons from recent expansions of the medicaid program. Quarterly Journal of Economics, 431, 466. — and— (1996b). 
Saving babies: the efficacy and cost of recent changes in the medicaid eligibility of pregnant women. Journal of political Economy, 104 (6), 1263–1296. — and— (2001). Public health insurance and medical treatment: the equalizing impact of the medicaid expansions. Journal of Public Economics, 82 (1), 63–89. — andRossin-Slater,M. (2013). Weathering the storm: Hurricanes and birth outcomes. Journal of health economics, 32 (3), 487–503. — andSchwandt,H. (2014). Short-and long-term effects of unemployment on fertility. Proceedings of the National Academy of Sciences, 111 (41), 14734–14739. Curto,V.,Einav,L.,Levin,J. andBhattacharya,J. (2014). Can health insurance competi- tion work? Evidence from medicare advantage, National Bureau of Economic Research, No. w20818. Cutler,D.M. andGruber,J. (1996). Does public insurance crowd out private insurance? The Quarterly Journal of Economics, 111 (2), 391–430. 137 Dafny,L.S. (2005). How do hospitals respond to price changes? American Economic Review, 95 (5), 1525–1547. Darden,M. andMcCarthy,I.M. (2015). The star treatment: Estimating the impact of star ratings on medicare advantage enrollments. Journal of Human Resources, 50 (4), 980–1008. Dave,D.M.,Kaestner,R. andWehby,G.L. (2015). Does medicaid coverage for pregnant women affect prenatal health behaviors? Tech. rep., National Bureau of Economic Research. Decarolis,F. andGuglielmo,A. (2017). Insurers response to selection risk: Evidence from medicare enrollment reforms. Journal of Health Economics, forthcoming. —, — and Luscombe, C. (2017). Open enrollment periods and plan choices, national Bureau of Economic Research. Dehejia,R. andLleras-Muney,A. (2004). Booms, busts, and babies’ health. The Quarterly Journal of Economics, 119 (3), 1091–1130. Diamond, P.A. (1967). The role of a stock market in a general equilibrium model with technological uncertainty. The American Economic Review, pp. 759–776. Dranove, D., Garthwaite, C. and Ody, C. (2016). 
Uncompensated care decreased at hospitals in medicaid expansion states but not at hospitals in nonexpansion states. Health Affairs, 35 (8), 1471–1479. Duggan,M.,Starc,A. andVabson,B. (2016). Who benefits when the government pays more? pass-through in the medicare advantage program. Journal of Public Economics, 141, 50–67. Einav, L., Finkelstein, A. and Cullen, M. R. (2010). Estimating welfare in insurance markets using variation in prices. The quarterly journal of economics, 125 (3), 877–921. —,—,Kluender,R. andSchrimpf,P. (2016). Beyond statistics: the economic content of risk scores. American Economic Journal: Applied Economics, 8 (2), 195–224. Evans,W.N. andLien,D.S. (2005). The benefits of prenatal care: evidence from the pat bus strike. Journal of Econometrics, 125 (1-2), 207–239. — andRingel,J.S. (1999). Can higher cigarette taxes improve birth outcomes? Journal of public Economics, 72 (1), 135–154. Finkelstein,A.,Hendren,N. andLuttmer,E.F. (2015). The value of medicaid: Interpret- ing results from the oregon health insurance experiment. —, — and Shepard, M. (2017). Subsidizing health insurance for low-income adults: Evidence from massachusetts. — andMcGarry,K. (2006). Multiple dimensions of private information: evidence from the long-term care insurance market. American Economic Review, 96 (4), 938–958. 138 — andPoterba,J. (2004). Adverse selection in insurance markets: Policyholder evidence from the uk annuity market. Journal of Political Economy, 112 (1), 183–208. —,Taubman,S.,Wright,B.,Bernstein,M.,Gruber,J.,Newhouse,J.P.,Allen,H.,Baicker, K. andGroup,O.H.S. (2012). The oregon health insurance experiment: evidence from the first year. The Quarterly journal of economics, 127 (3), 1057–1106. Frean, M., Gruber, J. and Sommers, B. D. (2017). Premium subsidies, the mandate, and medicaid expansion: Coverage effects of the affordable care act. Journal of Health Economics, 53, 72–86. Gallagher,E.,Gopalan,R. andGrinstein-Weiss,M. (2018). 
The effect of health insurance on home payment delinquency: Evidence from aca marketplace subsidies. Garthwaite,C.,Gross,T. andNotowidigdo,M.J. (2014). Public health insurance, labor supply, and employment lock. The Quarterly Journal of Economics, 129 (2), 653–696. Geruso,M.,Layton,T.J. andPrinz,D. (2016). Screening in contract design: Evidence from the ACA Health Insurance Exchanges. Mimeo, National Bureau of Economic Research. Girotti, M. E., Ko, C. Y. and Dimick, J. B. (2013). Hospital morbidity rankings and complication severity in vascular surgery. Journal of vascular surgery, 57 (1), 158–164. Glazer, J. andMcGuire, T.G. (2006). Optimal quality reporting in markets for health plans. Journal of Health Economics, 25 (2), 295–310. Graves,J.A. andGruber,J. (2012). How did health care reform in massachusetts impact insurance premiums? American Economic Review, 102 (3), 508–13. Gross,T. andNotowidigdo,M.J. (2011). Health insurance and the consumer bankruptcy decision: Evidence from expansions of medicaid. Journal of Public Economics, 95 (7-8), 767–778. Grossman,M. andJoyce,T.J. (1990). Unobservables, pregnancy resolutions, and birth weight production functions in new york city. Journal of Political Economy, 98 (5, Part 1), 983–1007. Gruber,J. (2010). The tax exclusion for employer-sponsored health insurance. — (2017). Delivering public health insurance through private plan choice in the united states. Journal of Economic Perspectives, 31 (4), 3–22. — and Simon, K. (2008). Crowd-out 10 years later: Have recent public insurance ex- pansions crowded out private health insurance? Journal of health economics, 27 (2), 201–217. Gupta,A. (2017). Impacts of performance pay for hospitals: The readmissions reduction program, Mimeo, Wharton. 139 Hackmann, M. B., Kolstad, J. T. and Kowalski, A. E. (2015). Adverse selection and an individual mandate: When theory meets practice. American Economic Review, 105 (3), 1030–66. Handel, B. R. and Kolstad, J. T. (2015). 
Health insurance for” humans”: Information frictions, plan choice, and consumer welfare. American Economic Review, 105 (8), 2449– 2500. Harris Interactive (2011). Medicare star quality rating system study: Key findings. xnet.kp.org/newscenter/pressreleases/nat/2011/downloads/ 101011medicarerankingsHarrisSurveyInfo.pdf, accessed: 2018-03-22. Hart,O.,Shleifer,A. andVishny,R.W. (1997). The proper scope of government: theory and an application to prisons. The Quarterly Journal of Economics, 112 (4), 1127–1161. Heim,B.T.,Hunter,G.,Isen,A.,Lurie,I.Z. andRamnath,S.P. (2017). Income responses to the affordable care act: Evidence from the premium tax credit notch. Hendren,N. (2017). Knowledge of future job loss and implications for unemployment insurance. American Economic Review, 107 (7), 1778–1823. Holmstrom, B. and Milgrom, P. (1991). Multitask principal-agent analyses: Incentive contracts, asset ownership, and job design. Journal of Law, Economics, & Organization, 7, 24–52. Hoynes,H.,Schanzenbach,D.W. andAlmond,D. (2016). Long-run impacts of childhood access to the safety net. American Economic Review, 106 (4), 903–34. Hu, L., Kaestner, R., Mazumder, B., Miller, S. and Wong, A. (2016). The effect of the patient protection and affordable care act medicaid expansions on financial wellbeing. Tech. rep., National Bureau of economic research. Isen,A.,Rossin-Slater,M. andWalker,W.R. (2017). Every breath you take—every dollar you’ll make: The long-term consequences of the clean air act of 1970. Journal of Political Economy, 125 (3), 848–902. Jaffe, S. and Shepard, M. (2018). Price-linked subsidies and imperfect competition in health insurance. mimeo. Joyce,T. andRacine,A. (2003). Chip shots: Association between the state children’s health insurance programs and immunization coverage and delivery. Tech. rep., National Bureau of Economic Research. Keehan,S.P.,Sisko,A.M.,Truffer,C.J.,Poisal,J.A.,Cuckler,G.A.,Madison, A.J., Lizonitz,J.M. andSmith,S.D. (2011). 
National health spending projections through 2020: economic recovery and reform drive faster spending growth. Health Affairs, 30 (8), 1594–1605. 140 Kling,J.R.,Mullainathan,S.,Shafir,E.,Vermeulen,L.C. andWrobel,M.V. (2012). Comparison friction: Experimental evidence from medicare drug plans. The Quarterly Journal of Economics, 127 (1), 199–235. Kucko,K.,Rinz,K. andSolow,B. (2018). Labor market effects of the affordable care act: Evidence from a tax notch. Landais, C., Nekoei, A., Nilsson, P., Seim, D. and Spinnewijn, J. (2017). Risk-based selection in unemployment insurance: evidence and implications. Lavetti, K. andSimon, K. (2017). Strategic formulary design in Medicare Part D plans. American Economic Journal: Economic Policy, forthcoming. Layton,T.J. andRyan,A.M. (2015). Higher incentive payments in Medicare Advantage’s pay-for-performance program did not improve quality but did increase plan offerings. Health services research, 50 (6), 1810–1828. Levine,P.B. andSchanzenbach,D. (2009). The impact of children’s public health insur- ance expansions on educational outcomes. In Forum for Health Economics & Policy, De Gruyter, vol. 12. Lyons,S. (2017). Are employer mandates to offer health insurance effective in reducing subsidized coverage crowd-out of employer-sponsored insurance? American Journal of Health Economics, 3 (3), 370–391. Mahoney,N. (2015). Bankruptcy as implicit health insurance. American Economic Review, 105 (2), 710–46. McCarthy,I.M. andDarden,M. (2017). Supply-side responses to public quality ratings: Evidence from Medicare Advantage. American Journal of Health Economics. McClellan,M. (2011). Reforming payments to healthcare providers: The key to slowing healthcare cost growth while improving quality? Journal of Economic Perspectives, 25 (2), 69–92. MedPAC (2015). Medicare payment advisory commission, report to the congress: Medicare payment policy. Miller, K. (2016). Do private Medicare firms face lower costs?, Mimeo. University of Oregon. 
Newhouse,J.P. andMcGuire,T.G. (2014). How successful is Medicare Advantage. Milbank Quarterly, 92 (2), 351–394. —,Price,M.,Huang,J.,McWilliams,J.M. andHsu,J. (2012). Steps to reduce favorable risk selection in Medicare Advantage largely succeeded, boding well for health insurance exchanges. Health Affairs, 31 (12), 2618–2628. Nilsson,J.P. (2017). Alcohol availability, prenatal conditions, and long-term economic outcomes. Journal of Political Economy, 125 (4), 1149–1207. 141 NQF (2014). Risk adjustment for socioeconomic status or other sociodemographic factors. National Quality Forum. Reynolds, L. (2013). The future of the nhs–irreversible privatisation? interview by jill mountford. BMJ (Clinical research ed), 346, f1848. Rosenzweig,M.R. andSchultz,T.P. (1982). Market opportunities, genetic endowments, and intrafamily resource distribution: Child survival in rural india. The American Economic Review, 72 (4), 803–815. Rothschild,M. andStiglitz,J. (1976). Equilibrium in competitive insurance markets: An essay on the economics of imperfect information. The Quarterly Journal of Economics, pp. 629–649. Ruggles,S.,Flood,S.,Goeken,R.,Grover,J.,Meyer,E.,Pacas,J. andSobek,M. (2018). Ipums usa: Version 8.0 [dataset]. Minneapolis, MN: IPUMS, https://doi.org/10. 18128/D010.V8.0. Saez, E. and Stantcheva, S. (2016). Generalized social marginal welfare weights for optimal tax theory. American Economic Review, 106 (1), 24–45. Saltzman, E. (2018). Demand for health insurance: Evidence from the california and washington aca exchanges. Sasso, A. T. L. and Buchmueller, T. C. (2004). The effect of the state children’s health insurance program on health insurance coverage. Journal of health economics, 23 (5), 1059–1082. Schaller, J. (2016). Booms, busts, and fertility testing the becker model using gender- specific labor demand. Journal of Human Resources, 51 (1), 1–29. Simon,D. (2016). Does early life exposure to cigarette smoke permanently harm childhood welfare? 
evidence from cigarette tax hikes. American Economic Journal: Applied Economics, 8 (4), 128–59. Sommers,B.D.,Shepard,M. andHempstead,K. (2018). Why did employer coverage fall in massachusetts after the aca? potential consequences of a changing employer mandate. Health Affairs, 37 (7), 1144–1152. Sonchak,L. (2015). Medicaid reimbursement, prenatal care and infant health. Journal of health economics, 44, 10–24. Song,Z.,Landrum,M.B. andChernew,M.E. (2012). Competitive bidding in medicare: who benefits from competition? The American Journal of Managed Care, 18 (9), 546. —,— and— (2013). Competitive bidding in Medicare Advantage: Effect of benchmark changes on plan bids. Journal of Health Economics, 32 (6), 1301–1312. 142 Soria-Saucedo, R., Xu, P., Newsom, J., Cabral, H. and Kazis, L. E. (2016). The role of geography in the assessment of quality: Evidence from the medicare advantage program. PloS one, 11 (1), e0145656. Tebaldi,P. (2017). Estimating equilibrium in health insurance exchanges: Price competi- tion and subsidy design under the aca. mimeo. Veiga, A. and Weyl, E. G. (2016). Product design in selection markets. The Quarterly Journal of Economics, 131 (2), 1007–1056. Vickers, J. and Yarrow, G. (1991). Economic perspectives on privatization. Journal of Economic Perspectives, 5 (2), 111–132. Warner,G. (1998). Birthweight productivity of prenatal care. Southern Economic Journal, pp. 42–63. Wherry, L. R. and Meyer, B. D. (2016). Saving teens: using a policy discontinuity to estimate the effects of medicaid eligibility. Journal of Human Resources, 51 (3), 556–588. —,Miller,S.,Kaestner,R. andMeyer,B.D. (2015). Childhood Medicaid coverage and later life health care utilization. Tech. rep., National Bureau of Economic Research. Wilson,R. (1968). The theory of syndicates. Econometrica: journal of the Econometric Society, pp. 119–132. 
Appendices

A Appendix to Chapter 2

A.1 Appendix Tables

Table A.1.1: Effect of QBP on Part C premium, within-contract cross-county variation

                        (I)       (II)      (III)     (IV)      (V)       (VI)
Risk × High × Post                          18.08*                        18.63
                                            (10.45)                       (13.50)
Risk × Post             -10.43    7.44      -11.33*   -8.74     6.88      -10.34
                        (6.42)    (10.25)   (6.29)    (6.75)    (12.80)   (6.48)
High × Post                                 -3.68                         -4.66
                                            (3.79)                        (3.99)
Counties                all       all       all       15% tails 15% tails 15% tails
Sample                  low       high      full      low       high      full
y mean                  25.17     49.94     31.96     24.97     75.47     31.20
R-squared               0.77      0.85      0.82      0.77      0.85      0.82
N                       14,861    5,611     20,472    4,393     1,641     6,034

*** p < 0.01, ** p < 0.05, * p < 0.10

Notes: Table shows the within-contract Part C premium variation across risk regions. Columns 1-2 show the difference-in-difference estimates on premium across risk regions for baseline low-quality (column 1) and high-quality (column 2) contracts. Column 3 shows the triple-difference estimate, which gives the differential pricing response by high-quality contracts in higher-risk counties. Columns 4-6 repeat the analysis, but only include counties in the lower and upper 15% of the risk distribution within a contract’s market set. All regressions include contract-county fixed effects. The raw trend is displayed in Figure A.2.9. Standard errors, clustered two-way at the contract and county level, are in parentheses.

Table A.1.2: Effect of QBP on Part D premium, within-contract cross-county variation

                        (I)       (II)      (III)     (IV)      (V)       (VI)
Risk × High × Post                          12.43*                        16.98**
                                            (7.49)                        (7.98)
Risk × Post             -4.56     13.56**   -3.42     -4.75     14.76**   -4.32
                        (4.83)    (6.31)    (4.71)    (5.17)    (6.99)    (4.97)
High × Post                                 2.44                          3.50*
                                            (2.31)                        (1.90)
Counties                all       all       all       15% tails 15% tails 15% tails
Sample                  low       high      full      low       high      full
y mean                  17.77     28.60     20.74     17.46     27.60     20.22
R-squared               0.76      0.67      0.75      0.75      0.70      0.75
N                       14,861    5,611     20,472    4,393     1,641     6,034

*** p < 0.01, ** p < 0.05, * p < 0.10

Notes: Table shows the within-contract Part D premium variation across risk regions.
Columns 1-2 show the difference-in-difference estimates on premium across risk regions for baseline low-quality (column 1) and high-quality (column 2) contracts. Column 3 shows the triple-difference estimate, which gives the differential pricing response by high-quality contracts in higher-risk counties. Columns 4-6 repeat the analysis, but only include counties in the lower and upper 15% of the risk distribution within a contract’s market set. All regressions include contract-county fixed effects. The raw trend is displayed in Figure A.2.10. Standard errors, clustered two-way at the contract and county level, are in parentheses.

Table A.1.3: Effect of QBP on Part C premium, within-contract variation, unweighted by enrollment

                        (I)       (II)      (III)     (IV)      (V)       (VI)
Risk × High × Post                          15.97                         19.19
                                            (12.04)                       (13.71)
Risk × Post             -11.32*   7.72      -10.75    -10.10    8.86      -10.26
                        (6.65)    (11.55)   (6.69)    (7.00)    (12.28)   (6.86)
High × Post                                 -3.10                         -3.96
                                            (3.44)                        (3.53)
Counties                all       all       all       15% tails 15% tails 15% tails
Sample                  low       high      full      low       high      full
y mean                  26.92     49.55     33.12     26.73     48.21     32.57
R-squared               0.76      0.80      0.79      0.76      0.82      0.79
N                       14,861    5,611     20,472    4,393     1,641     6,034

*** p < 0.01, ** p < 0.05, * p < 0.10

Notes: Table shows the within-contract Part C premium variation across risk regions. Premium is measured as the simple arithmetic mean over all plan premiums offered in a county, unweighted by enrollment. Columns 1-2 show the difference-in-difference estimates on premium across risk regions for baseline low-quality (column 1) and high-quality (column 2) contracts. Column 3 shows the triple-difference estimate, which gives the differential pricing response by high-quality contracts in higher-risk counties. Columns 4-6 repeat the analysis, but only include counties in the lower and upper 15% of the risk distribution within a contract’s market set. All regressions include contract-county fixed effects. Standard errors, clustered two-way at the contract and county level, are in parentheses.
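As a reading aid for the appendix tables, the triple-difference specification described in the table notes can be sketched as below. This is an illustrative reconstruction from the row labels and notes, not the exact estimating equation of Chapter 2. The outcome symbol $y_{jct}$ (for contract $j$, county $c$, year $t$), the fixed-effect symbol $\alpha_{jc}$, and in particular the year effects $\tau_t$ are assumptions introduced here.

```latex
% Sketch only: symbols and the year effects \tau_t are assumed, not taken from the text.
\begin{equation*}
y_{jct} = \beta_1 \,(\mathit{Risk}_{jc} \times \mathit{High}_{j} \times \mathit{Post}_{t})
        + \beta_2 \,(\mathit{Risk}_{jc} \times \mathit{Post}_{t})
        + \beta_3 \,(\mathit{High}_{j} \times \mathit{Post}_{t})
        + \alpha_{jc} + \tau_t + \varepsilon_{jct}
\end{equation*}
```

Under this reading, the contract-county fixed effects $\alpha_{jc}$ absorb the time-invariant terms $\mathit{Risk}_{jc}$, $\mathit{High}_{j}$, and their product, so only the interactions with $\mathit{Post}_{t}$ remain. The split-sample columns (low and high) would then correspond to estimating the $\mathit{Risk} \times \mathit{Post}$ term alone within each quality group, and the full-sample columns would report $\beta_1$ as the triple-difference estimate.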
Table A.1.4: Effect of QBP on Part D premium, within-contract variation, unweighted by enrollment

                        (I)       (II)      (III)     (IV)      (V)       (VI)
Risk × High × Post                          16.45***                      18.53**
                                            (7.54)                        (8.34)
Risk × Post             -5.07     15.58**   -4.15     -5.05     14.92*    -4.65
                        (4.86)    (6.69)    (4.82)    (4.77)    (7.82)    (4.67)
High × Post                                 2.25                          2.98
                                            (2.14)                        (1.80)
Counties                all       all       all       15% tails 15% tails 15% tails
Sample                  low       high      full      low       high      full
y mean                  18.78     29.37     21.68     18.50     28.59     21.24
R-squared               0.74      0.66      0.74      0.73      0.69      0.74
N                       14,861    5,611     20,472    4,393     1,641     6,034

*** p < 0.01, ** p < 0.05, * p < 0.10

Notes: Table shows the within-contract Part D premium variation across risk regions. Premium is measured as the simple arithmetic mean over all plan premiums offered in a county, unweighted by enrollment. Columns 1-2 show the difference-in-difference estimates on premium across risk regions for baseline low-quality (column 1) and high-quality (column 2) contracts. Column 3 shows the triple-difference estimate, which gives the differential pricing response by high-quality contracts in higher-risk counties. Columns 4-6 repeat the analysis, but only include counties in the lower and upper 15% of the risk distribution within a contract’s market set. All regressions include contract-county fixed effects. Standard errors, clustered two-way at the contract and county level, are in parentheses.

Table A.1.5: Effect of QBP on drug deductible, within-contract variation, unweighted by enrollment

                        (I)       (II)      (III)     (IV)      (V)       (VI)
Risk × High × Post                          -25.18                        -22.04
                                            (43.50)                       (51.92)
Risk × Post             64.36**   58.54     65.77***  66.97**   63.30     67.24***
                        (24.89)   (41.96)   (24.97)   (25.68)   (46.92)   (25.58)
High × Post                                 -10.53                        -12.88
                                            (11.29)                       (10.66)
Counties                all       all       all       15% tails 15% tails 15% tails
Sample                  low       high      full      low       high      full
y mean                  31.14     27.83     29.08     28.09     26.29     28.32
R-squared               0.66      0.52      0.63      0.65      0.55      0.62
N                       14,861    5,611     20,472    4,393     1,641     6,034

*** p < 0.01, ** p < 0.05, * p < 0.10

Notes: Table shows the within-contract variation in drug deductible across risk regions.
The deductible is measured as the simple arithmetic mean over all plans offered in a county, unweighted by enrollment. Columns 1-2 show the difference-in-difference estimates on the deductible across risk regions for baseline low-quality (column 1) and high-quality (column 2) contracts. Column 3 shows the triple-difference estimate, which gives the differential pricing response by high-quality contracts in higher-risk counties. Columns 4-6 repeat the analysis, but only include counties in the lower and upper 15% of the risk distribution within a contract’s market set. All regressions include contract-county fixed effects. Standard errors, clustered two-way at the contract and county level, are in parentheses.

Table A.1.6: Effect of QBP on Part C premium, within-contract variation, distance to mean

                        (I)       (II)      (III)     (IV)      (V)       (VI)
Risk × High × Post                          11.75                         15.26
                                            (14.18)                       (14.84)
Risk × Post             -10.56    1.10      -11.34*   -8.83     3.39      -10.39
                        (6.86)    (13.44)   (6.67)    (6.94)    (14.25)   (6.66)
High × Post                                 -3.61                         -4.60
                                            (3.76)                        (3.98)
Counties                all       all       all       15% tails 15% tails 15% tails
Sample                  low       high      full      low       high      full
y mean                  25.17     49.94     31.96     24.97     47.87     31.20
R-squared               0.77      0.85      0.82      0.77      0.85      0.82
N                       14,861    5,611     20,472    4,393     1,641     6,034

*** p < 0.01, ** p < 0.05, * p < 0.10

Notes: Table shows the within-contract Part C premium variation across risk regions. Specifically, the variable Risk measures the distance to the mean county risk in a contract’s service area, rather than the distance to the median as in the main analysis. Columns 1-2 show the difference-in-difference estimates on premium across risk regions for baseline low-quality (column 1) and high-quality (column 2) contracts. Column 3 shows the triple-difference estimate, which gives the differential pricing response by high-quality contracts in higher-risk counties. Columns 4-6 repeat the analysis, but only include counties in the lower and upper 15% of the risk distribution within a contract’s market set. All regressions include contract-county fixed effects.
Standard errors, clustered two-way at the contract and county level, are in parentheses.

Table A.1.7: Effect of QBP on Part D premium, within-contract variation, distance to mean

                        (I)       (II)      (III)     (IV)      (V)       (VI)
Risk × High × Post                          14.64*                        17.74**
                                            (7.66)                        (8.65)
Risk × Post             -2.87     17.16**   -1.80     -4.09     16.19**   -3.65
                        (4.70)    (7.16)    (4.57)    (5.16)    (7.73)    (4.96)
High × Post                                 2.43                          3.51
                                            (2.29)                        (1.89)
Counties                all       all       all       15% tails 15% tails 15% tails
Sample                  low       high      full      low       high      full
y mean                  17.77     28.60     20.74     17.46     27.60     20.22
R-squared               0.76      0.67      0.75      0.75      0.70      0.75
N                       14,861    5,611     20,472    4,393     1,641     6,034

*** p < 0.01, ** p < 0.05, * p < 0.10

Notes: Table shows the within-contract Part D premium variation across risk regions. Specifically, the variable Risk measures the distance to the mean county risk in a contract’s service area, rather than the distance to the median as in the main analysis. Columns 1-2 show the difference-in-difference estimates on premium across risk regions for baseline low-quality (column 1) and high-quality (column 2) contracts. Column 3 shows the triple-difference estimate, which gives the differential pricing response by high-quality contracts in higher-risk counties. Columns 4-6 repeat the analysis, but only include counties in the lower and upper 15% of the risk distribution within a contract’s market set. All regressions include contract-county fixed effects. Standard errors, clustered two-way at the contract and county level, are in parentheses.
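For readers checking the significance stars in these tables, the short sketch below maps a coefficient and its standard error to the star thresholds in the table legend. It assumes a standard normal approximation for the test statistic, which may differ from the reference distribution actually used with the two-way clustered standard errors. The function names `two_sided_p` and `stars` are hypothetical helpers, and the inputs are transcribed from Table A.1.7.

```python
import math


def two_sided_p(coef: float, se: float) -> float:
    """Two-sided p-value of coef/se under a standard normal approximation (assumed)."""
    z = abs(coef / se)
    # For a standard normal Z, P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


def stars(coef: float, se: float) -> str:
    """Map the p-value to the legend: *** p<0.01, ** p<0.05, * p<0.10."""
    p = two_sided_p(coef, se)
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.10:
        return "*"
    return ""


if __name__ == "__main__":
    # (label, coefficient, standard error), transcribed from Table A.1.7.
    rows = [
        ("Risk x High x Post, col (III)", 14.64, 7.66),
        ("Risk x High x Post, col (VI)", 17.74, 8.65),
        ("Risk x Post, col (I)", -2.87, 4.70),
    ]
    for label, coef, se in rows:
        print(f"{label}: t = {coef / se:.2f}, p = {two_sided_p(coef, se):.3f} {stars(coef, se)}")
```

The triple-difference estimate in column (III), for example, has a ratio of about 1.91, which falls between the 10% and 5% thresholds under the normal approximation and so carries a single star.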
Table A.1.8: Effect of QBP on drug deductible, within-contract variation, distance to mean

                        (I)       (II)      (III)     (IV)      (V)       (VI)
Risk × High × Post                          -38.25                        -33.03
                                            (46.29)                       (50.93)
Risk × Post             22.08     42.13**   44.76**   36.31**   21.34     38.37**
                        (44.70)   (19.14)   (19.47)   (17.23)   (49.49)   (17.50)
High × Post                                 -12.03                        -13.44
                                            (9.98)                        (9.50)
Counties                all       all       all       15% tails 15% tails 15% tails
Sample                  low       high      full      low       high      full
y mean                  30.05     24.92     28.65     28.62     24.91     27.61
R-squared               0.61      0.70      0.67      0.68      0.67      0.68
N                       5,611     14,861    20,472    4,393     1,641     6,034

*** p < 0.01, ** p < 0.05, * p < 0.10

Notes: Table shows the within-contract drug deductible variation across risk regions. Specifically, the variable Risk measures the distance to the mean county risk in a contract’s service area, rather than the distance to the median as in the main analysis. Columns 1-2 show the difference-in-difference estimates on the deductible across risk regions for baseline low-quality (column 1) and high-quality (column 2) contracts. Column 3 shows the triple-difference estimate, which gives the differential pricing response by high-quality contracts in higher-risk counties. Columns 4-6 repeat the analysis, but only include counties in the lower and upper 15% of the risk distribution within a contract’s market set. All regressions include contract-county fixed effects. Standard errors, clustered two-way at the contract and county level, are in parentheses.

Table A.1.9: Effect on premium and deductible, within-contract variation, standard deviation from mean county risk

                        (I)       (II)      (III)     (IV)      (V)       (VI)      (VII)     (VIII)    (IX)
                        Part C Premium                Part D Premium                Drug Deductible
Risk × High × Post                          16.81                         16.16**                       -28.31
                                            (13.37)                       (7.94)                        (50.39)
Risk × Post             -6.98     9.28      -7.41     -5.32     12.46     -4.92     60.55**   44.94     60.72**
                        (8.25)    (11.54)   (8.14)    (5.13)    (7.63)    (5.02)    (23.54)   (46.46)   (24.05)
High × Post                                 -2.52                         3.26*                         -16.53
                                            (3.27)                        (1.93)                        (10.90)
Counties                distance to mean > s.d.       distance to mean > s.d.       distance to mean > s.d.
Sample                  low       high      full      low       high      full      low       high      full
y mean                  26.42     47.72     32.12     18.37     29.39     21.32     27.99     26.58     27.61
R-squared               0.77      0.81      0.80      0.73      0.68      0.74      0.65      0.50      0.61
N                       4,317     1,578     5,895     4,317     1,578     5,895     4,317     1,578     5,895

*** p < 0.01, ** p < 0.05, * p < 0.10

Notes: Table shows the within-contract premium and drug deductible variation across counties. Specifically, the variable Risk measures the distance to the mean county risk in a contract’s service area. Sample is restricted to counties whose risk is more than one standard deviation above or below the mean county risk. All price variables are simple averages over plans, unweighted by enrollment. Columns 1-2 show the difference-in-difference estimates on Part C premium across risk regions for baseline low-quality (column 1) and high-quality (column 2) contracts. Column 3 shows the triple-difference estimate, or the differential effect on high-quality contracts. Columns 4-6 repeat the analysis for the Part D premium. Columns 7-9 look at the drug deductible. All regressions include contract-county fixed effects. Standard errors, clustered two-way at the contract and county level, are in parentheses.

Table A.1.10: Plans with zero premium or deductible, within-contract variation

                        (I)       (II)      (III)     (IV)      (V)       (VI)      (VII)     (VIII)    (IX)
                        Zero Part C Premium           Zero Part D Premium           Zero Drug Deductible
Risk × High × Post                          -0.18                         -0.44**                       0.093
                                            (0.12)                        (0.19)                        (0.22)
Risk × Post             0.11      -0.083    0.10      0.20      -0.30*    0.19      -0.31***  -0.27     -0.29**
                        (0.10)    (0.084)   (0.10)    (0.13)    (0.15)    (0.13)    (0.11)    (0.21)    (0.11)
High × Post                                 0.034                         0.022                         0.061
                                            (0.031)                       (0.032)                       (0.053)
Counties                15% tails                     15% tails                     15% tails
Sample                  low       high      full      low       high      full      low       high      full
y mean                  0.44      0.23      0.38      0.43      0.19      0.36      0.84      0.87      0.85
R-squared               0.75      0.71      0.75      0.74      0.74      0.75      0.63      0.59      0.62
N                       4,393     1,641     6,034     4,393     1,641     6,034     4,393     1,641     6,034

*** p < 0.01, ** p < 0.05, * p < 0.10

Notes: Table shows the within-contract variation in the offering of zero-premium and zero-deductible plans across counties.
Specifically, the outcome is the percent of plans with zero premium (or deductible) given contract and location, unweighted by enrollment. Variable Risk measures the distance to the median county risk in the contract’s service area. Sample is restricted to counties in the upper or lower 15 percentile of county risk in the service area. Columns 1-2 show the difference-in-difference estimates for the zero Part C premium outcome across risk regions for baseline low quality (column 1) and high quality (column 2) contracts. Column 3 shows the triple-difference estimate, or the differential effect on high quality. Columns 4-6 repeat the analysis for Part D premium. Columns 7-9 repeat it for the drug deductible. All regressions include contract-county fixed effects. Standard errors, clustered two-way at the contract and county level, are in parentheses.

Table A.1.11: Enrollment in plans with zero premium or deductible, within-contract variation

                      Zero Part C Premium           Zero Part D Premium           Zero Drug Deductible
                      (I)       (II)      (III)     (IV)      (V)       (VI)      (VII)     (VIII)    (IX)
Risk × High × Post                        -0.20                         -0.52**                       0.14
                                          (0.15)                        (0.21)                        (0.23)
Risk × Post           0.12      -0.13     0.098     0.23      -0.39**   0.21      -0.19**   -0.098    -0.18*
                      (0.12)    (0.10)    (0.12)    (0.14)    (0.17)    (0.14)    (0.094)   (0.23)    (0.096)
High × Post                               0.024                         0.011                         0.058
                                          (0.032)                       (0.032)                       (0.053)
Counties              15% tails                     15% tails                     15% tails
Sample                low       high      full      low       high      full      low       high      full
y mean                0.46      0.23      0.40      0.45      0.19      0.38      0.84      0.87      0.85
R²                    0.77      0.71      0.77      0.76      0.78      0.78      0.65      0.65      0.65
N                     4,393     1,641     6,034     4,393     1,641     6,034     4,393     1,641     6,034

*** p < 0.01, ** p < 0.05, * p < 0.10

Notes: Table shows the within-contract variation in the enrollment in zero-premium and zero-deductible plans across counties. Specifically, the outcome is the share of individuals enrolled in zero-premium or zero-deductible plans given contract and location. Variable Risk measures the distance to the median county risk in the contract’s service area.
Sample is restricted to counties in the upper or lower 15 percentile of county risk in the service area. Columns 1-2 show the difference-in-difference estimates for the zero Part C premium outcome across risk regions for baseline low quality (column 1) and high quality (column 2) contracts. Column 3 shows the triple-difference estimate, or the differential effect on high quality. Columns 4-6 repeat the analysis for Part D premium. Columns 7-9 repeat it for the drug deductible. All regressions include contract-county fixed effects. Standard errors, clustered two-way at the contract and county level, are in parentheses.

Table A.1.12: Part C measures in the quality rating, 2013

ID | Name | Category | Weight | Source | Time frame

Domain 1: Staying Healthy: Screenings, Tests and Vaccines
C01 | Breast Cancer Screening | Process Measure | 1 | HEDIS | 01/01/2011 - 12/31/2011
C02 | Colorectal Cancer Screening | Process Measure | 1 | HEDIS | 01/01/2011 - 12/31/2011
C03 | Cardiovascular Care – Cholesterol Screening | Process Measure | 1 | HEDIS | 01/01/2011 - 12/31/2011
C04 | Diabetes Care – Cholesterol Screening | Process Measure | 1 | HEDIS | 01/01/2011 - 12/31/2011
C05 | Glaucoma Testing | Process Measure | 1 | HEDIS | 01/01/2011 - 12/31/2011
C06 | Annual Flu Vaccine | Process Measure | 1 | CAHPS | 02/15/2012 - 05/31/2012
C07 | Improving or Maintaining Physical Health | Outcome Measure | 3 | HOS | 04/18/2011 - 07/31/2011
C08 | Improving or Maintaining Mental Health | Outcome Measure | 3 | HOS | 04/18/2011 - 07/31/2011
C09 | Monitoring Physical Activity | Process Measure | 1 | HOS/HEDIS | 04/18/2011 - 07/31/2011
C10 | Adult BMI Assessment | Process Measure | 1 | HEDIS | 01/01/2011 - 12/31/2011

Domain 2: Managing Chronic (Long Term) Conditions
C11 | Care for Older Adults – Medication Review | Process Measure | 1 | HEDIS | 01/01/2011 - 12/31/2011
C12 | Care for Older Adults – Functional Status Assessment | Process Measure | 1 | HEDIS | 01/01/2011 - 12/31/2011
C13 | Care for Older Adults – Pain Screening | Process Measure | 1 | HEDIS | 01/01/2011 - 12/31/2011
C14 | Osteoporosis Management in Women who had a Fracture | Process Measure | 1 | HEDIS | 01/01/2011 - 12/31/2011
C15 | Diabetes Care – Eye Exam | Process Measure | 1 | HEDIS | 01/01/2011 - 12/31/2011
C16 | Diabetes Care – Kidney Disease Monitoring | Process Measure | 1 | HEDIS | 01/01/2011 - 12/31/2011
C17 | Diabetes Care – Blood Sugar Controlled | Intermediate Outcome Measures | 3 | HEDIS | 01/01/2011 - 12/31/2011
C18 | Diabetes Care – Cholesterol Controlled | Intermediate Outcome Measures | 3 | HEDIS | 01/01/2011 - 12/31/2011
C19 | Controlling Blood Pressure | Intermediate Outcome Measures | 3 | HEDIS | 01/01/2011 - 12/31/2011
C20 | Rheumatoid Arthritis Management | Process Measure | 1 | HEDIS | 01/01/2011 - 12/31/2011
C21 | Improving Bladder Control | Process Measure | 1 | HOS/HEDIS | 04/18/2011 - 07/31/2011
C22 | Reducing the Risk of Falling | Process Measure | 1 | HOS/HEDIS | 04/18/2011 - 07/31/2011
C23 | Plan All-Cause Readmissions | Outcome Measure | 3 | HEDIS | 01/01/2011 - 12/31/2011

Domain 3: Member Experience with Health Plan
C24 | Getting Needed Care | Patients’ Experience and Complaints Measure | 1.5 | CAHPS | 02/15/2012 - 05/31/2012
C25 | Getting Appointments and Care Quickly | Patients’ Experience and Complaints Measure | 1.5 | CAHPS | 02/15/2012 - 05/31/2012
C26 | Customer Service | Patients’ Experience and Complaints Measure | 1.5 | CAHPS | 02/15/2012 - 05/31/2012
C27 | Overall Rating of Health Care Quality | Patients’ Experience and Complaints Measure | 1.5 | CAHPS | 02/15/2012 - 05/31/2012
C28 | Overall Rating of Plan | Patients’ Experience and Complaints Measure | 1.5 | CAHPS | 02/15/2012 - 05/31/2012
C29 | Care Coordination | Patients’ Experience and Complaints Measure | 1 | CAHPS | 02/15/2012 - 05/31/2012

Domain 4: Member Complaints, Problems Getting Services, and Improvement in the Health Plan’s Performance
C30 | Complaints about the Health Plan | Patients’ Experience and Complaints Measure | 1.5 | CTM | 01/01/2012 - 06/30/2012
C31 | Beneficiary Access and Performance Problems | Measures Capturing Access | 1.5 | CMS | 01/01/2011 - 12/31/2011
C32 | Members Choosing to Leave the Plan | Patients’ Experience and Complaints Measure | 1.5 | MBDSS | 01/01/2011 - 12/31/2011
C33 | Health Plan Quality Improvement | Outcome Measure | 1 | CMS | 2012 rating

Domain 5: Health Plan Customer Service
C34 | Plan Makes Timely Decisions about Appeals | Measures Capturing Access | 1.5 | IRE | 01/01/2011 - 12/31/2011
C35 | Reviewing Appeals Decisions | Measures Capturing Access | 1.5 | IRE | 01/01/2011 - 12/31/2011
C36 | Call Center – Foreign Language Interpreter and TTY/TDD Availability | Measures Capturing Access | 1.5 | Call Center | 01/30/2012 - 05/18/2012
C37 | Enrollment Timeliness | Process Measure | 1 | MARx | 01/01/2012 - 06/30/2012

Notes: Table lists the Part C measures in the 2013 quality rating, with the data source of each measure and the relevant measurement period in the source. The weight attached to each measure in the final rating is also listed.

Table A.1.13: Part D measures in the quality rating, 2013

ID | Name | Category | Weight | Source | Time frame

Domain 1: Drug Plan Customer Service
D01 | Call Center – Pharmacy Hold Time | Measures Capturing Access | 1.5 | Call Center | 02/06/2012 - 05/18/2012
D02 | Call Center – Foreign Language Interpreter and TTY/TDD Availability | Measures Capturing Access | 1.5 | Call Center | 01/30/2012 - 05/18/2012
D03 | Appeals Auto–Forward | Measures Capturing Access | 1.5 | IRE | 01/01/2011 - 12/31/2011
D04 | Appeals Upheld | Measures Capturing Access | 1.5 | IRE | 01/01/2012 - 06/30/2012
D05 | Enrollment Timeliness | Process Measure | 1 | MARx | 01/01/2012 - 06/30/2012

Domain 2: Member Complaints, Problems Getting Services, and Improvement in the Drug Plan’s Performance (identical to Part C domain 4; redundant and not used in the final rating)
D06 | Complaints about the Drug Plan | Patients’ Experience and Complaints Measure | 1.5 | CTM | 01/01/2012 - 06/30/2012
D07 | Beneficiary Access and Performance Problems | Measures Capturing Access | 1.5 | CMS | 01/01/2011 - 12/31/2011
D08 | Members Choosing to Leave the Plan | Patients’ Experience and Complaints Measure | 1.5 | MBDSS | 01/01/2011 - 12/31/2011
D09 | Drug Plan Quality Improvement | Outcome Measure | 1 | CMS | 2012 rating

Domain 3: Member Experience with the Drug Plan
D10 | Getting Information From Drug Plan | Patients’ Experience and Complaints Measure | 1.5 | CAHPS | 02/15/2012 - 05/31/2012
D11 | Rating of Drug Plan | Patients’ Experience and Complaints Measure | 1.5 | CAHPS | 02/15/2012 - 05/31/2012
D12 | Getting Needed Prescription Drugs | Patients’ Experience and Complaints Measure | 1.5 | CAHPS | 02/15/2012 - 05/31/2012

Domain 4: Patient Safety and Accuracy of Drug Pricing
D13 | MPF Price Accuracy | Process Measure | 1 | PDE | 01/01/2011 - 09/30/2011
D14 | High Risk Medication | Intermediate Outcome Measures | 3 | PDE | 01/01/2011 - 12/31/2011
D15 | Diabetes Treatment | Intermediate Outcome Measures | 3 | PDE | 01/01/2011 - 12/31/2011
D16 | Part D Medication Adherence for Oral Diabetes Medications | Intermediate Outcome Measures | 3 | PDE | 01/01/2011 - 12/31/2011
D17 | Part D Medication Adherence for Hypertension (RAS antagonists) | Intermediate Outcome Measures | 3 | PDE | 01/01/2011 - 12/31/2011
D18 | Part D Medication Adherence for Cholesterol (Statins) | Intermediate Outcome Measures | 3 | PDE | 01/01/2011 - 12/31/2011

Notes: Table lists the Part D measures in the 2013 quality rating, with the data source of each measure and the relevant measurement period in the source. The weight attached to each measure in the final rating is also listed. Measures D06-D09 in the Part D domain “Member Complaints, Problems Getting Services, and Improvement in the Drug Plan’s Performance” are identical to measures C30-C33 in the corresponding Part C domain; only C30-C33 are kept for computing the final rating.
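The weights listed in the two tables determine how much each measure moves the final rating. The sketch below illustrates this under a simplifying assumption: the summary rating is taken to be a plain weighted mean of measure-level stars. The function name and the star values are hypothetical, and any further adjustments in the official rating methodology are not modelled.

```python
def weighted_star_rating(measures):
    """Weighted mean of measure-level stars.

    `measures` is a list of (stars, weight) pairs. This is an illustrative
    simplification, not the official rating formula.
    """
    total_weight = sum(weight for _, weight in measures)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(stars * weight for stars, weight in measures) / total_weight


# Hypothetical contract: star values are made up, weights follow the tables.
example = [
    (4, 1.0),  # a process measure (weight 1), e.g. C01
    (3, 3.0),  # an outcome measure (weight 3), e.g. C07
    (5, 1.5),  # a patient experience measure (weight 1.5), e.g. C24
]
rating = weighted_star_rating(example)

# Raising the outcome measure by one star moves the rating three times
# as much as raising the process measure by one star.
outcome_up = weighted_star_rating([(4, 1.0), (4, 3.0), (5, 1.5)]) - rating
process_up = weighted_star_rating([(5, 1.0), (3, 3.0), (5, 1.5)]) - rating
```

With a weight of 3, a one-star change in an outcome measure shifts the weighted mean three times as far as the same change in a process measure, which is why the composition of outcome measures matters for the overall rating.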
A.2 Appendix Figures

Figure A.2.1: Contract risk score distribution, by quality and year
[Figure: kernel densities of contract risk score, one curve per year, 2009-2014. Panels: (a) Low-Quality Contracts; (b) High-Quality Contracts. x-axis: risk score.]
Notes. The figure plots the kernel density of contract risk scores for high and low quality contracts in each year from 2009 to 2014. Contract risk scores are aggregated from plan risk scores using enrollment weights.

Figure A.2.2: Plan risk score distribution, by quality and year
[Figure: kernel densities of plan risk score, one curve per year, 2009-2014. Panels: (a) Low-Quality Plans; (b) High-Quality Plans. x-axis: risk score.]
Notes. The figure plots the kernel density of plan risk scores for high and low quality contracts in each year from 2009 to 2014. Although quality is assigned at the contract level, the figure shows the risk score distribution at the plan level given quality.

Figure A.2.3: Effect on risk score, by service area risk, event study
[Figure: Panels: (a) Raw Trend; (b) Event Study; (c) Raw Trend, 15% Tails; (d) Event Study, 15% Tails. x-axis: year, 2009-2014, with ACA and QBP marked. Raw-trend series: high-risk low-quality; low-risk high-quality (in the tails: high-risk above the 85th percentile, low-risk below the 15th percentile).]
Notes. Panel (a) shows the raw trend of contract-level risk score, for high quality contracts with below median service area risk and low quality contracts with above median service area risk. Panel (b) shows the event study estimates with 95% confidence intervals based on robust standard errors clustered at the contract level. Corresponding raw trend and event study estimates for the 15% tails are in Panels (c) and (d), respectively.

Figure A.2.4: Effect on risk score, by market competition, event study
[Figure: Panels: (a) Raw Trend; (b) Event Study; (c) Raw Trend, 15% Tails; (d) Event Study, 15% Tails. x-axis: year, 2009-2014, with ACA and QBP marked. Raw-trend series: low-hhi low-quality; high-hhi high-quality (in the tails: low-hhi below the 15th percentile, high-hhi above the 85th percentile).]
Notes. Panel (a) shows the raw trend of contract-level risk score, for high quality contracts with below median HHI and low quality contracts with above median HHI. Panel (b) shows the event study estimates with 95% confidence intervals based on robust standard errors clustered at the contract level. Corresponding raw trend and event study estimates for the 15% tails are in Panels (c) and (d), respectively.

Figure A.2.5: Effect on premium and drug deductible, event study
[Figure: Panels: (a) Raw Trend, Premium; (b) Event Study, Premium; (c) Raw Trend, Drug Deductible; (d) Event Study, Drug Deductible. x-axis: year, 2009-2014, with ACA and QBP marked. Raw-trend series: low quality; high quality.]
Notes. Panel (a) shows the raw trend of average premium at the contract level.
Panel (b) shows the event study estimates for premium with 95% confidence intervals based on robust standard errors clustered at the contract level. Panels (c) and (d) show the raw trend and event study estimates for average drug deductible at the contract level.

Figure A.2.6: Effect on premium, within-contract cross county variation, event study
[Figure: Panels: (a) Raw Trend; (b) Event Study; (c) Raw Trend, 15% Tails; (d) Event Study, 15% Tails. x-axis: year, 2009-2014, with ACA and QBP marked. Raw-trend series: low-risk low-quality; high-risk low-quality; low-risk high-quality; high-risk high-quality. Event-study series: low quality; high vs. low; high quality.]
Notes. Panel (a) shows the raw trend of premium for high and low quality contracts, and their respective high and low risk regions. For a given contract, a high risk region has baseline FFS risk score above the median in the contract’s market set. Panel (b) shows the event study estimates for low quality contracts (left line), high quality contracts (right line), and the differential effect on high quality contracts (middle line). Plotted 95% confidence intervals are based on robust standard errors clustered two-way at the level of county and contract. Panels (c) and (d) show the corresponding raw trend and event study estimates, limiting counties to those in the lower or upper 15% of the risk distribution given contract.
Figure A.2.7: Effect on drug deductible, within-contract cross county variation, event study
[Figure: Panels: (a) Raw Trend; (b) Event Study; (c) Raw Trend, 15% Tails; (d) Event Study, 15% Tails. x-axis: year, 2009-2014, with ACA and QBP marked. Raw-trend series: low-risk low-quality; high-risk low-quality; low-risk high-quality; high-risk high-quality. Event-study series: low quality; triple difference; high quality.]
Notes. Panel (a) shows the raw trend of drug deductible for high and low quality contracts, and their respective high and low risk regions. For a given contract, a high risk region has baseline FFS risk score above the median in the contract’s market set. Panel (b) shows the event study estimates for low quality contracts (left line), high quality contracts (right line), and the differential effect on high quality contracts (middle line). Plotted 95% confidence intervals are based on robust standard errors clustered two-way at the level of county and contract. Panels (c) and (d) show the corresponding raw trend and event study estimates, limiting counties to those in the lower or upper 15% of the risk distribution given contract.

Figure A.2.8: Effect on Part C premium, within-contract variation, event study, unweighted by enrollment
[Figure: Panels: (a) Raw Trend; (b) Event Study; (c) Raw Trend, 15% Tails; (d) Event Study, 15% Tails. x-axis: year, 2009-2014, with ACA and QBP marked. Raw-trend series: low-risk low-quality; high-risk low-quality; low-risk high-quality; high-risk high-quality. Event-study series: low quality; high vs. low; high quality.]
Notes. Panel (a) shows the raw trend of Part C premium set by high and low quality contracts in their respective high and low risk regions. The premium is at the contract-county level, taking the simple average over plans offered in a county, unweighted by enrollment in these plans. A high risk region for a contract has baseline FFS risk score above the median in the contract’s market set. Panel (b) shows the event study estimates for low quality contracts (left line), high quality contracts (right line), and the differential effect on high quality contracts (middle line). Plotted 95% confidence intervals are based on robust standard errors clustered two-way at the level of county and contract. Panels (c) and (d) show the corresponding raw trend and event study estimates, limiting counties to those in the lower or upper 15% of the risk distribution given contract.

Figure A.2.9: Effect on Part C premium, within-contract cross county variation, event study
[Figure: Panels: (a) Raw Trend; (b) Event Study; (c) Raw Trend, 15% Tails; (d) Event Study, 15% Tails. x-axis: year, 2009-2014, with ACA and QBP marked. Raw-trend series: low-risk low-quality; high-risk low-quality; low-risk high-quality; high-risk high-quality. Event-study series: low quality; high vs. low; high quality.]
Notes. Panel (a) shows the raw trend of Part C premium set by high and low quality contracts in their respective high and low risk regions.
For a given contract, a high risk region has baseline FFS risk score above the median in the contract’s market set. Panel (b) shows the event study estimates for low quality contracts (left line), high quality contracts (right line), and the differential effect on high quality contracts (middle line). Plotted 95% confidence intervals are based on robust standard errors clustered two-way at the level of county and contract. Panels (c) and (d) show the corresponding raw trend and event study estimates, limiting counties to those in the lower or upper 15% of the risk distribution given contract.

Figure A.2.10: Effect on Part D premium, within-contract cross county variation, event study
[Figure: Panels: (a) Raw Trend; (b) Event Study; (c) Raw Trend, 15% Tails; (d) Event Study, 15% Tails. x-axis: year, 2009-2014, with ACA and QBP marked. Raw-trend series: low-risk low-quality; high-risk low-quality; low-risk high-quality; high-risk high-quality. Event-study series: low quality; high vs. low; high quality.]
Notes. Panel (a) shows the raw trend of Part D premium set by high and low quality contracts in their respective high and low risk regions. For a given contract, a high risk region has baseline FFS risk score above the median in the contract’s market set. Panel (b) shows the event study estimates for low quality contracts (left line), high quality contracts (right line), and the differential effect on high quality contracts (middle line). Plotted 95% confidence intervals are based on robust standard errors clustered two-way at the level of county and contract. Panels (c) and (d) show the corresponding raw trend and event study estimates, limiting counties to those in the lower or upper 15% of the risk distribution given contract.
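The three lines in each event-study panel correspond to two difference-in-difference comparisons and their difference. The sketch below shows that arithmetic on made-up cell data, assuming a binary high-risk/low-risk split and a single pre and post period. The cell labels, function names, and numbers are hypothetical, and the sketch omits the contract-county fixed effects and clustered standard errors used in the actual estimates.

```python
from statistics import mean


def did(cells, quality):
    """Post-minus-pre change in high-risk counties minus the same change
    in low-risk counties, for one quality group."""
    change_high_risk = (mean(cells[(quality, "high_risk", "post")])
                        - mean(cells[(quality, "high_risk", "pre")]))
    change_low_risk = (mean(cells[(quality, "low_risk", "post")])
                       - mean(cells[(quality, "low_risk", "pre")]))
    return change_high_risk - change_low_risk


def triple_difference(cells):
    """Differential response of high quality contracts across risk regions."""
    return did(cells, "high_quality") - did(cells, "low_quality")


# Made-up premiums by (quality, risk region, period); purely illustrative.
cells = {
    ("low_quality", "low_risk", "pre"): [40.0, 42.0],
    ("low_quality", "low_risk", "post"): [44.0, 46.0],
    ("low_quality", "high_risk", "pre"): [50.0, 52.0],
    ("low_quality", "high_risk", "post"): [53.0, 55.0],
    ("high_quality", "low_risk", "pre"): [60.0, 62.0],
    ("high_quality", "low_risk", "post"): [58.0, 60.0],
    ("high_quality", "high_risk", "pre"): [70.0, 72.0],
    ("high_quality", "high_risk", "post"): [78.0, 80.0],
}

did_low_quality = did(cells, "low_quality")
did_high_quality = did(cells, "high_quality")
ddd = triple_difference(cells)
```

In this toy data the high quality group raises premiums in high-risk counties and lowers them in low-risk counties, so its difference-in-difference is positive while the low quality group's is close to zero; the triple difference is the gap between the two.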
Figure A.2.11: Effect on Part D premium, within-contract variation, event study, unweighted by enrollment
[Figure: Panels: (a) Raw Trend; (b) Event Study; (c) Raw Trend, 15% Tails; (d) Event Study, 15% Tails. x-axis: year, 2009-2014, with ACA and QBP marked. Raw-trend series: low-risk low-quality; high-risk low-quality; low-risk high-quality; high-risk high-quality. Event-study series: low quality; high vs. low; high quality.]
Notes. Panel (a) shows the raw trend of Part D premium set by high and low quality contracts in their respective high and low risk regions. The premium is at the contract-county level, taking the simple average over plans offered in a county, unweighted by enrollment in these plans. A high risk region for a contract has baseline FFS risk score above the median in the contract’s market set. Panel (b) shows the event study estimates for low quality contracts (left line), high quality contracts (right line), and the differential effect on high quality contracts (middle line). Plotted 95% confidence intervals are based on robust standard errors clustered two-way at the level of county and contract. Panels (c) and (d) show the corresponding raw trend and event study estimates, limiting counties to those in the lower or upper 15% of the risk distribution given contract.

Figure A.2.12: Effect on drug deductible, within-contract variation, event study, unweighted by enrollment
[Figure: Panels: (a) Raw Trend; (b) Event Study; (c) Raw Trend, 15% Tails; (d) Event Study, 15% Tails. x-axis: year, 2009-2014, with ACA and QBP marked. Raw-trend series: low-risk low-quality; high-risk low-quality; low-risk high-quality; high-risk high-quality. Event-study series: low quality; high vs. low; high quality.]
Notes. Panel (a) shows the raw trend of drug deductible set by high and low quality contracts in their respective high and low risk regions. The deductible is at the contract-county level, taking the simple average over plans offered in a county, unweighted by enrollment in these plans. A high risk region for a contract has baseline FFS risk score above the median in the contract’s market set. Panel (b) shows the event study estimates for low quality contracts (left line), high quality contracts (right line), and the differential effect on high quality contracts (middle line). Plotted 95% confidence intervals are based on robust standard errors clustered two-way at the level of county and contract. Panels (c) and (d) show the corresponding raw trend and event study estimates, limiting counties to those in the lower or upper 15% of the risk distribution given contract.

Figure A.2.13: Effect on premium and deductible, within-contract variation, event study, distance to mean
[Figure: Panels: (a) Part C Premium; (b) Part C Premium, 15% Tails; (c) Part D Premium; (d) Part D Premium, 15% Tails; (e) Drug Deductible; (f) Drug Deductible, 15% Tails. x-axis: year, 2009-2014, with ACA and QBP marked. Series: low quality; high vs. low; high quality.]
Notes. The figure shows the event study estimates for premium and drug deductible, using distance to mean county risk as the driving variable for within-contract risk distribution. In the main analysis, the variable is defined as distance to the median county risk. Panels on the left show the event study estimates, using all counties in the service area. Panels on the right show corresponding estimates when restricting the sample to counties with risk in the lower or upper 15% tail of the service area. Plotted 95% confidence intervals are based on robust standard errors clustered two-way at the level of county and contract.

Figure A.2.14: Effect on zero-premium and zero-deductible plans and enrollment, 15% tail counties, raw trend
[Figure: Panels: (a) Zero Part C Premium; (b) Zero Part C Premium, Enrollment; (c) Zero Part D Premium; (d) Zero Part D Premium, Enrollment; (e) Zero Drug Deductible; (f) Zero Drug Deductible, Enrollment. x-axis: year, 2009-2014, with ACA and QBP marked. Series: low-risk low-quality; high-risk low-quality; low-risk high-quality; high-risk high-quality.]
Notes. The figure summarizes the raw trend in the offering of zero-premium and zero-deductible plans (left panels), and enrollment in these plans (right panels).
Enrollment is measured as the share of enrollees in zero-price plans offered by a contract in a given county. Sample is restricted to counties in the upper or lower 15 percentile of county risk in a contract’s service area. Counties in the upper (lower) 15 percentile are high-risk (low-risk) counties for a contract. Plotted 95% confidence intervals are based on robust standard errors clustered two-way at the level of county and contract.

Figure A.2.15: Weight increase of outcome measures in quality rating
[Figure: Panels: (a) Star Rating; (b) 4 Star. x-axis: year, 2009-2014.]
Notes. The figures show the event study trends of outcome measure weights in the quality star rating. The left panel shows that outcome ratings receive higher weights after 2012. The right panel shows a similar weight increase in outcome ratings for achieving at least a 4.0 star final rating. 95% confidence intervals are plotted based on robust standard errors clustered at the contract level.

Figure A.2.16: Risk score and outcome rating, event study
[Figure: Panels: (a) Average Outcome Rating, Raw Trend; (b) Average Outcome Rating, Event Study; (c) Health Improved, Raw Trend; (d) Health Improved, Event Study; (e) Diabetes & Blood Pressure, Raw Trend; (f) Diabetes & Blood Pressure, Event Study. x-axis: year, 2009-2014, with ACA and QBP marked. Raw-trend series: low risk; high risk.]
Notes. The figure shows the variation of outcome rating by baseline enrollee risk. In the raw trends (left panels), a high risk contract has baseline (2009-2010) enrollee risk score above the sample median (0.97).
Right panels show the event study estimates from difference-in-difference specifications using continuous variation in baseline enrollee risk score. Panels (a) and (b) show the trends in average outcome measure ratings by risk. Panels (c) and (d) show the trends in health improvement measures reported in HOS. Panels (e) and (f) show the trends in diabetes and blood pressure control from HEDIS clinical measures. Event study graphs show 95% confidence intervals based on standard errors clustered at the contract level.

Figure A.2.17: Quality, risk and outcome rating, event study
[Figure: Panels: (a) Outcome Rating, Raw Trend; (b) Outcome Rating, Event Study; (c) Health Improved, Raw Trend; (d) Health Improved, Event Study; (e) Diabetes & Blood Pressure, Raw Trend; (f) Diabetes & Blood Pressure, Event Study. x-axis: year, 2009-2014, with ACA and QBP marked. Raw-trend series: low quality; high quality; low quality + low risk; high quality + high risk. Event-study series: high vs. low quality; high vs. low quality and risk.]
Notes. The figure shows the variation of outcome rating by baseline quality and enrollee risk. A high risk contract has baseline (2009-2010) enrollee risk score above the sample median (0.97). Left panels show the raw trend of outcome ratings and component ratings for baseline high vs. low quality contracts in dotted lines, and for baseline high-risk high-quality vs. low-risk low-quality contracts in solid lines.
Right panels show the event study estimates from the corresponding difference-in-difference specifications. Panels (a) and (b) show the trends in average outcome measure ratings by quality and risk. Panels (c) and (d) show the trends in health improvement measures reported in HOS. Panels (e) and (f) show the trends in diabetes and blood pressure control from HEDIS clinical measures. Event study graphs show 95% confidence intervals based on standard errors clustered at the contract level.

B Appendix to Chapter 3

B.1 Appendix Proofs

B.1.1 Risk-based pricing

Consider first the case of risk-based pricing. Insurers observe health type, and charge the competitive premium p(;) = (1)M. The government in addition observes productivity, and varies transfers t(;) and t(;) across types. Absent a universal mandate, the individual take-up decision involves comparing the utility of (1) acquiring formal insurance at cost (1)M in both health states, and (2) remaining uninsured and receiving uncompensated care support t(;) in the sick state.

Absent uncompensated care, willingness to pay (WTP) for insurance is above expected cost (the risk-based premium), and take-up is complete. With uncompensated care, insurance take-up is generally incomplete. This is because the ex-post transfer t lowers ex-ante WTP, and because the insurance company has less information than the government, so the risk-based premium cannot adjust for the implicit coverage from uncompensated care.

Given the incomplete take-up, the government can improve welfare with a universal mandate: all individuals purchase competitively priced insurance, and the uncompensated care cost to the government is zero. To see this, note that for arbitrary employment e(;) and transfers t(;) and t(;), the uninsured has consumption (w) + t(;) > 0 in the healthy state, and (w) + t(;) + t(;)M > 0 in the unhealthy state. With non-negative consumption (Inada condition), the expected transfer t(;) + (1)t(;) plus any labor earnings is greater than the expected cost (1)M; that is, the competitive premium is already affordable given the transfer. To improve welfare, the government instead transfers t̄(;) = t(;) + (1)t(;) in both health states, and has the uninsured purchase insurance at expected cost (1)M. The resulting consumption, (w) + t(;) + (1)t(;) (1)M, equals the expected consumption in the uninsured case, but gives higher expected utility because the individual is risk-averse.

A similar argument applies to all uninsured individuals, and a universal mandate improves welfare given any employment and transfer policy. With the mandate, the government redistributes income to smooth the marginal utility of consumption across employment and risk: low-health individuals with a greater cost of insurance receive a more generous transfer. Abstracting from the efficiency cost of tax transfers, consumption with risk-based pricing equals c^RB(e) = ew − p̄ in the economy, where e = E[e(;)] is the employment size, and p̄ = M(1E[]) is the social cost of insurance. The effectual pooling across risk is socially desirable with risk-based pricing, and is achieved through a (risk-adjusted) premium subsidy rather than ex-post uncompensated care. I compare against this benchmark result when more limited information is allowed in premium and transfers.

B.1.2 Basic trade-off

Consider moving the boundary of social insurance in the risk space to n = 1, the ultra-health margin.
Social welfare W = Z n 0 Z 1 () dF(;)u(c e ) + Z 1 n Z 1 () dF(;)u(c e +p) + Z 1 n Z 1 () (1)dF(;)u(c e +pM) + Z n 0 Z () 0 dF(;)u A (1 p )p + Z 1 n Z () 0 dF(;)u(A) + Z 1 n Z () 0 (1)dF(;)u(AM) Z 1 0 Z 1 () g 1 dF(;) where employed enrollees, measured in massE[e(;)hi(;)] = R n 0 R 1 () dF(;), have consumption c e =w 1e e A 2 6 6 6 6 4 p e Z n 0 Z () 0 dF(;) + 1 3 7 7 7 7 5 p where R n 0 R () 0 dF(;) is the mass of subsidized (non-employed) enrollees, ande =E[e(;)] = R 1 0 R 1 () dF(;) is the employment size following from arbitrary employment margin(). Employed non-enrollees in R 1 n R 1 () dF(;) save premiump if healthy, but pay medical costM if not. Average riskr = 1 R n 0 R 1 0 dF(;) R n 0 R 1 0 dF(;) varies with marginn according to dr dn = r (1n) i Z 1 0 f (;n)d wherei =E[hi(;)] = R n 0 R 1 0 dF(;) is the insurance rate. 176 Expanding insurance to the ultra-health margin, welfare increases by dW dn n=1 = Z 1 (1) f (;1)d u(c e )u(c e +p) | {z } marginal utility loss +eu 0 (c e ) 2 6 6 6 6 4 p e p Z (1) 0 f (;1)d 3 7 7 7 7 5 | {z } tax payer cost: new enrollee subsidy +eu 0 (c e ) " p e (1e) + 1 # p Z 1 0 f (;1)d | {z } premium saving + Z (1) 0 f (;1)d u A (1 p )p u(A) | {z } marginal utility loss, subsidy enrollees + (1e)u 0 A (1 p )p (1 p )p Z 1 0 f (;1)d | {z } subsidized premium saving Given full subsidy p = 1, the trade-off only affects workers who are premium payers. The first three terms simply to dW dn n=1; p =1 = Z 1 (1) f (;1)d u(c e )u(c e +p) +pu 0 (c e ) / u(c e )u(c e +p) +pu 0 (c e ) Concave utility implies risk pooling is socially desirable: infra-margin cost savingpu 0 (c e ) is valued more than the utility cost on the ultra-margin. Once a mandate implements the pooling, the government redistributes across income with cash transferA and subsidy p . Given transferA, abstracting from behavioral responses to taxes, standard equity argument implies full subsidy is desirable. 
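The concavity condition behind the basic trade-off can be checked numerically. The sketch below is illustrative only: it assumes CRRA utility, and the consumption level c_e = 40 and premium p = 4 (in thousands of dollars) are hypothetical values, not figures from the text. It evaluates u(c_e) - u(c_e + p) + p u'(c_e), which is zero under risk neutrality and positive under strict concavity.

```python
import math


def crra_utility(c, gamma):
    """CRRA utility; gamma is the coefficient of relative risk aversion."""
    if gamma == 1:
        return math.log(c)
    return c ** (1 - gamma) / (1 - gamma)


def crra_marginal_utility(c, gamma):
    """First derivative of CRRA utility."""
    return c ** (-gamma)


def pooling_gain(c_e, p, gamma):
    """u(c_e) - u(c_e + p) + p * u'(c_e): cost saving valued at marginal
    utility, net of the utility cost of the premium on the ultra margin."""
    return (
        crra_utility(c_e, gamma)
        - crra_utility(c_e + p, gamma)
        + p * crra_marginal_utility(c_e, gamma)
    )


# Hypothetical values for illustration (thousands of dollars).
gains = {gamma: pooling_gain(c_e=40.0, p=4.0, gamma=gamma) for gamma in (0, 1, 2, 3)}
for gamma, gain in gains.items():
    print(f"gamma = {gamma}: net gain = {gain:.6f}")
```

With linear utility (gamma = 0) the two terms cancel exactly, so in this sketch the case for pooling rests entirely on the curvature of utility.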
B.1.3 Risk preference heterogeneity When risk attitude differs across individuals (Cohen and Einav 2007; Barsky et al. 1997), WTP for insurance is not monotone in risk, but in addition depends on the correlation with risk preference (Andrews and Miller, 2013). If low-risk individuals are also less risk-tolerant, adverse selection is lessened (Finkelstein and McGarry, 2006). In this case, the risk-tolerant efficiently bear more risk (Diamond 1967; Wilson 1968), lowering the desirability of complete redistribution across risk. Regardless of risk preferences, the highest health margin = 1 has zero demand for insurance and remains the last group to 177 enroll. The basic trade-off carries over, but adjusts for the differences in risk preferences between the infra and the ultra margin. Assume preference exhibits constant relative risk aversion (CRRA), with risk coefficient taking two values 1 < 2 in the economy: individuals with 2 are more risk-averse. Let () denote the share of high risk-aversion types on the ultra (infra) margin. Adjusted by risk preference, the trade-off becomes 110 dW dn n=1 / u 2 +u 0 2 p + (1)u 1 + (1)u 0 1 p whereu k =u(c; k )u(c +p; k ) is the utility cost of premiump for risk type k ,k = 1;2, andu 0 k =u 0 (c; k ) the social cost saving. Argument for a coverage mandate is strengthened, if ultra-margin individuals are on average more risk-averse, or< 111 . When true, marginal utility loss is smaller than in the homogeneous case, and social benefit is greater. Conversely, if risk tolerance increases with health ( >), and social benefit is less than the marginal cost, then high-health individuals are efficiently uninsured. B.1.4 Labor productivity and non-labor income Assume now that labor productivity affects earning: type produces outputZ at oppor- tunity costg( 1 ). Workers take home (1)W after a linear tax. The tax revenue is used to fund the cash benefit and premium subsidy. Individuals receive non-labor incomey(;) independent of employment. 
The distri- 110 Let the distribution of risk preference be Prf (;) = 2 g =(;). The welfare effect of enrolling the perfect health types with full subsidy to the non-employed is dW dn e n e =1; p =1 = Z 1 (1) (;1)dF(;1)u 2 + Z 1 (1) 1(;1) dF(;1)u 1 + Z 1 0 Z 1 () (;)dF(;)u 0 2 (c e ) p e Z 1 (1) f (;1)d + Z 1 0 Z 1 () 1(;) dF(;)u 0 1 (c e ) p e Z 1 (1) f (;1)d The trade-off is then simplified using = R 1 0 R 1 () (;)dF(;) e and = R 1 (1) (;1)dF(;1) R 1 (1) f (;1)d . 111 Note that the social benefit can be written as u 0 2 p + (1)u 0 1 p =pu 0 1 (1) | {z } > (1)u 1 +pu 0 2 | {z } >u 2 +pu 0 () Therefore social benefit offsets marginal utility loss, if>, or risk aversion is higher on the ultra-margin. 178 bution ofy(;) is assumed continuous but otherwise unrestricted in the economy. 112113 The trade-off now adjusts for the tax incidence of subsidy across the productivity space: E [u 0 je = 1]p + tax incidence of subsidy z }| { Cov [u 0 ;je = 1] E[je = 1] 1 e e =1 ! p | {z } social benefit E [uje = 1; = 1] | {z } marginal utility loss wheree outside the conditional probability is employment sizee =E [e(;)], ande =1 = E[e(;)j = 1] is employment on the ultra margin. Cov [u 0 ;je = 1] evaluates the tax burden at marginal utility of consumption. If there are more subsidy eligibles on the ultra-margin than on average (e =1 <e), expanding insurance raises the per capita cost of subsidy. If the extra cost is directed more to high productivity types, then the social benefit increases. B.1.5 Tax penalty Suppose the government provides uncompensated care to the uninsured, and retro-actively enroll patients in formal insurance at the site of care. The low-income (non-employed) receives full premium subsidy. High-income uninsured pays tax penaltykp. Subsidy is financed by a linear tax on payroll and the tax penalty. Uncompensated care is financed by surcharges on paying customers. Lett denote the patient cost share. 
Surcharge on patients is uc 0 = t 1;1;0 1;0;0 (Mp) + 0;0;0 M | {z } UC where i;j;k = Prfe =i;hi =j;g =kg, andk = 0 for patients.UC gives the total uncompen- sated cost (net of high-income premium). Surcharge on non-patients isuc 1 = 1t 1;1;1 UC. Consumption of premium payers in the healthy state depends on after-tax income T (;), premiump, tax share of subsidy (net of penalty), anduc 1 c 1;1;1 =T (;)p 0;1 p eE[je = 1] + 1;0;1 kp E[je = 1] uc 1 Excess burdenuc 0 >uc 1 lowers patient consumptionc 1;1;0 =c 1;1;1 (uc 0 uc 1 ). 112 Primitives of the distribution include endowment, wealth, and non-labor activities that generate income. y(;) provides a summary proxy for these differences. 113 Adjusting for the tax burden of UI and subsidy, consumption equals c e =y(;) +Z 1e eE[je = 1] A R n 0 R () 0 dF(;) eE[je = 1] pp for workers, where E[je=1] is the tax burden on type in the linear tax schedule. 179 High-income uninsured pay tax penalty if healthy, but are required to purchase insur- ance in the case of a health event, c 1;0;1 =T (;)kp 0;1 p eE[je = 1] + 1;0;1 kp E[je = 1] c 1;0;0 =T (;)p 0;1 p eE[je = 1] + 1;0;1 kp E[je = 1] Non-employed are fully insured (formally or implicitly). Consumptionc 0 =A +y(;) does not vary by enrollment or health. Increase in social welfare W = Z n 0 Z 1 () u(c 1;1;1 )dF(;) + Z n 0 Z 1 () (1)u(c 1;1;0 )dF(;) + Z 1 n Z 1 () u(c 1;0;1 )dF(;) + Z 1 n Z 1 () (1)u(c 1;0;0 )dF(;) + Z 1 0 Z () 0 u(c 0 )dF(;) Z 1 0 Z 1 () g( 1 )dF(;) when extending formal insurance extends to the ultra-marginn = 1 equals dW dn n=1 = Z 1 (1) u(c 1;1;1 )u(c 1;0;1 ) f (;1)d + X k=0;1 E[u 0 je = 1;hi = 1;g =k]E dc 1;1;k dn e = 1;hi = 1;g =k 1;1;k + X k=0;1 Cov[u 0 ;je = 1;hi = 1;g =k]g k e E[je = 1] p " 1k e e =1 # Z 1 (1) f (;1)d whereg k e = Prfg =kje = 1;hi = 1g. 
Note that X k=0;1 E dc 1;1;k dn e = 1;hi = 1;g =k 1;1;k =p (1k) Z 1 (1) f (;1)d and E dc 1;1;0 dn e = 1;hi = 1;g = 0 1;1;0 =g 0 e p Z 1 (1) f (;1)d " e e =1 + E[je = 1;g = 0] E[je = 1] 1k e e =1 !# 180 Then X k=0;1 E[u 0 je = 1;hi = 1;g =k]E dc 1;;k dn e = 1;hi = 1;g =k 1;1;k =E[u 0 je = 1;hi = 1;g = 1]p (1k) Z 1 (1) f (;1)d +u 0 e=1 (1g e )p Z 1 (1) f (;1)d " e e =1 + E[je = 1;g = 0] E[je = 1] 1k e e =1 !# whereu 0 e=1 =E[u 0 je = 1;hi = 1;g = 0]E[u 0 je = 1;hi = 1;g = 1]. Adjusting for R 1 (1) f (;1)d, welfare impact is proportional to E[u 0 je = 1;hi = 1;g = 1]c +u 0 e=1 (1g e )p " e e =1 + E[je = 1;g = 0] E[je = 1] 1k e e =1 !# + X k=0;1 Cov[u 0 ;je = 1;hi = 1;g = 1]g k e E[je = 1] p " 1k e e =1 # E[uje = 1;hi = 1;g = 1] whereu =u(c 1;1;1 )u(c 1;0;1 ) evaluated atn = 1, andc = (1k)p is net premium. Negative correlation between risk and productivity implies E[je=1;g=0] E[je=1] < 1, and un- compensated care strengthens the social benefit of formal insurance. WhenCov(;) 0, then P k=0;1 Cov[u 0 ;je=1;hi=1;g=1]g k e E[je=1] Cov[u 0 ;je=1] E[je=1] < 0. Trade-off evaluated at Cov[u 0 ;je=1] E[je=1] understates the social benefit of tax-financed subsidy. As penalty increases, net effect on welfare depends onCov[u 0 ;je = 1] relative tou 0 e=1 (1g e )E[je = 1;g = 0]. When consumption variation implies greater value of redistribution across productivity, and the scope of redistribution is smaller between patients, the tax penalty strengths the welfare argument for a mandate. B.1.6 Uncompensated care Suppose the government enrolls risk type 1e who otherwise receives uncompensated care. 114 Welfare effect is given by dW d 1e n e =1 = X k=0;1 E[u 0 je = 1;hi = 1;g =k]E dc 1;1;k d 1e e = 1;hi = 1;g =k 1;1;k X k=0;1 Cov[u 0 ;je = 1;hi = 1;g =k]g k e E[je = 1] pe +MCi 1e i Z (1) 0 f (;1)d 114 For simplicity I assume workers are already enrolled in formal insurance with the tax penalty. 
I hence focus on the implication of replacing uncompensated care with premium subsidy in the low-income. 181 From resource neutrality, X k=0;1 E dc 1;1;k d 1e e = 1;hi = 1;g =k 1;1;k = 0 Note that E dc 1;1;0 d 1e e = 1;hi = 1;g = 0 1;1;0 = (1g e ) " ep +MCi 1e i 1 E[je = 1;g = 0] E[je = 1] ! +MC t 1g e 1 !# Z (1) 0 f (;1)d Then the welfare effect is proportional to u 0 e=1 " ep +MCi 1e i 1 E[je = 1;g = 0] E[je = 1] ! +MC t 1g e 1 !# (1g e ) X k=0;1 Cov[u 0 ;je = 1;hi = 1;g =k]g k e E[je = 1] ep +MCi 1e i Tax-financed subsidy lowers patient surcharge burden, when subsidy cost is directed to the high productivity negatively correlated with risk. In this case, both the tax finance and uncompensated saving improve welfare. WhenCov(;) 0, P k=0;1 Cov[u 0 ;je=1;hi=1;g=k]g k e E[je=1] Cov[u 0 ;je=1] E[je=1] < 0. IfCov[u 0 ;je = 1]>u 0 e=1 (1g e )E[je = 1;g = 0], or when tax transfer has greater scope of redistribution and social value, subsidized formal insurance dominates uncompensated care. Subsidized universal insurance maximizes welfare. B.1.7 Welfare formula For Equation 4, first derive the pricing externality on state utilityu t (! t ). The key external- ities include the cost composition effect on premiump, and the burden of uncompensated careuc p . In addition, if individuals do not internalize the penalty on credit rating, policy that induces take-up has external benefit on interest charges. Given state distribution, the utility change from the pricing externalities on consumption equals dV dK Pr i;j d pb dK eu 0 (c 1 ) d pr dK e 1 u 0 (c 11 ) d(1 p )p dK 2 u 0 (c 2 ) dkp dK 0 u 0 (c 0 ) duc p dK 0 >0 u 0 (c >0;0 ) + d dK ! (B.2) wherec ijg is average consumption for individuals with employmenti, insurancej, and health stateg. The first-order approximation occurs when moving expectation inside the state utility using Taylor expansion, ignoring second-order effects onu 00 (c). 
is the 182 interest rate, or cost of borrowing charged to individuals, negatively affected by medical debts from uninsured health events. Saving in interest payments when policy increases take-up is given by d dK . Effect on tax burden d pb dK comes from total differentiation of BC:1, on private transfer d pr dK fromBC:2, on survice surcharge duc p dK fromBC:3, and on premium dp dK fromBC:4. Then, to focus on changes in state distribution, I Taylor-expand utility at the mean consumption of employmenti and insurancej U Z 1 0 X i=0;1 X j=0;1;2 ij t u(c ij t ) + 1 2 u 00 (c ij t )( ij t ) 2 dt (B.3) Ignoring second-order terms and variance, and averaging over life cycle, consumer welfare V = P i P j ij u(c ij ) responds to policy through distribution ij and the utility change of movers. Absent friction, movers choose outcomes optimally, and the behavior has no direct impact on welfare. With friction, observe outcomes are not always rationalized with the decision model, and the deviation may respond to policy. In the case of insurance, take-up pattern not predicted by the decision model is attributed to the taste shock t . When a policy change induces more take-up than the model prediction, t responds positively to policy in generating over-demand for insurance and sub-optimal utility loss. The unintended effects on utility enter the welfare calculation in dV dK u 0 . Continuing the insurance example, for a total take-up effect d 0 dK observed in the data, if fraction 0 (K) are over-insured resulting in utility lossu 0 (K) , the friction component dV dK u 0 = 0 (K) d 0 dK u 0 (K) . 
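The size of the variance term in the second-order expansion (B.3) can be gauged with a small numerical check. This is a sketch under stated assumptions: log utility and a hypothetical two-point consumption distribution (36 or 44 with equal probability); neither is taken from the text.

```python
import math


def expected_utility(outcomes, u):
    """Exact expected utility over (consumption, probability) pairs."""
    return sum(prob * u(c) for c, prob in outcomes)


def second_order_approx(outcomes, u, u_second):
    """u(mean) + 0.5 * u''(mean) * variance, the expansion used in (B.3)."""
    mean = sum(prob * c for c, prob in outcomes)
    variance = sum(prob * (c - mean) ** 2 for c, prob in outcomes)
    return u(mean) + 0.5 * u_second(mean) * variance


# Hypothetical consumption distribution (thousands of dollars).
outcomes = [(36.0, 0.5), (44.0, 0.5)]

exact = expected_utility(outcomes, math.log)
approx = second_order_approx(outcomes, math.log, lambda c: -1.0 / c ** 2)
first_order = math.log(40.0)  # utility at mean consumption, variance term dropped

print(f"exact = {exact:.6f}, second-order = {approx:.6f}, first-order = {first_order:.6f}")
```

In this example the variance term accounts for nearly all of the gap between utility at mean consumption and exact expected utility, which indicates what is set aside when the second-order terms are ignored.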
B.2 Appendix Tables

Table B.2.1: Incentive effect of subsidy, by age

            (I)             (II)        (III)       (IV)             (V)
            any insurance   employed    ESI         ESI + employed   ESI + not employed
27-29       0.18**          -0.15       -0.87***    -0.57***         -0.31***
            (0.081)         (0.13)      (0.11)      (0.13)           (0.045)
            [0.91]          [0.20]      [0.67]      [0.61]           [0.059]
30-34       0.076           -0.062      -0.65***    -0.29***         -0.36***
            (0.053)         (0.081)     (0.097)     (0.11)           (0.037)
            [0.92]          [0.19]      [0.71]      [0.64]           [0.070]
35-39       0.056           0.13        -0.57***    -0.17*           -0.40***
            (0.048)         (0.085)     (0.089)     (0.098)          (0.044)
            [0.94]          [0.19]      [0.74]      [0.66]           [0.087]
40-44       0.13***         0.11        -0.48***    -0.072           -0.41***
            (0.047)         (0.075)     (0.063)     (0.066)          (0.040)
            [0.95]          [0.20]      [0.75]      [0.66]           [0.090]
45-49       0.099***        0.0023      -0.53***    -0.27***         -0.26***
            (0.029)         (0.073)     (0.068)     (0.076)          (0.032)
            [0.95]          [0.20]      [0.76]      [0.68]           [0.086]
50-54       0.11***         -0.12*      -0.55***    -0.33***         -0.22***
            (0.029)         (0.062)     (0.061)     (0.061)          (0.040)
            [0.96]          [0.22]      [0.76]      [0.66]           [0.092]
55-64       0.087***        -0.24***    -0.66***    -0.26***         -0.40***
            (0.027)         (0.069)     (0.059)     (0.065)          (0.035)
            [0.97]          [0.33]      [0.74]      [0.58]           [0.16]
R²          0.071           0.091       0.19        0.13             0.054
y mean      0.95            0.77        0.83        0.64             0.10

*** p < 0.01, ** p < 0.05, * p < 0.10

Notes: Table shows reduced-form estimates of the subsidy incentive subiv interacted with a full set of age band indicators, based on the main specification with full interaction terms. Robust standard errors clustered at the PUMA level in parentheses. Group means of the dependent variable in square brackets.
Table B.2.2: Subsidy effect on ESI and alternative employment measures

            (I)            (II)          (III)          (IV)             (V)
            worked in      worked in     ESI +          ESI +            ESI +
            12 months      5 years       not in LBF     12 month UE      5 year UE

Panel A: OLS
subs        -0.35***       -0.21***      0.034***       0.048***         0.024***
            (0.0080)       (0.0067)      (0.0034)       (0.0034)         (0.0025)
R²          0.21           0.15          0.063          0.055            0.037

Panel B: reduced form
subiv       -0.077*        -0.061*       -0.30***       -0.28***         -0.19***
            (0.045)        (0.031)       (0.021)        (0.020)          (0.018)
R²          0.099          0.090         0.064          0.053            0.037
2008        -0.043         -0.020        -0.33***       -0.28***         -0.18***
            (0.076)        (0.061)       (0.042)        (0.033)          (0.031)
2009        -0.048         -0.087**      -0.32***       -0.33***         -0.19***
            (0.066)        (0.042)       (0.032)        (0.036)          (0.026)
2010        -0.12**        -0.11**       -0.27***       -0.25***         -0.17***
            (0.059)        (0.048)       (0.030)        (0.033)          (0.028)
2011        -0.079         -0.020        -0.29***       -0.27***         -0.21***
            (0.070)        (0.052)       (0.033)        (0.031)          (0.029)

Panel C: over-identified 2SLS
subs        -0.075         -0.012        -0.26***       -0.22***         -0.17***
            (0.049)        (0.034)       (0.028)        (0.028)          (0.023)
F-stat      227.99         227.99        227.99         227.99           227.99
p-value     0.82           0.48          0.36           0.15             0.32

y mean      0.83           0.90          0.077          0.071            0.039

*** p < 0.01, ** p < 0.05, * p < 0.10

Notes: Table estimates the effect of subsidy on alternative measures of employment (ever worked in 12 months and in 5 years), and on interactive outcomes with ESI: ESI coverage while not in the labor force (LBF) last week in column 3, ESI coverage while unemployed in 12 months in column 4, and ESI coverage while unemployed in 5 years in column 5. Panel A shows OLS estimates using the endogenous subsidy rate subs. Panel B shows the reduced-form effect of simulated generosity subiv, and year-specific effects interacting subiv with year dummies from a separate regression. Panels A and B are based on the main specification with full interaction terms. Panel C shows two-stage least squares estimates instrumenting subs with subiv and sublean, based on a specification with main effects of PUMA, age, year, income, and demographic variables interacted with the unemployment rate. The p-value from the Hansen over-identification test is reported.
Robust standard errors clustered at the PUMA level in parentheses.

Table B.2.3: Incentive effect of subsidy, basic controls

            (I)             (II)        (III)            (IV)             (V)
            any insurance   employed    in labor force   ESI + employed   ESI + not employed

Panel A: OLS, endogenous subsidy exposure
subs        -0.071***       -0.41***    -0.31***         -0.55***         0.048***
            (0.0033)        (0.0068)    (0.0064)         (0.0068)         (0.0046)
R²          0.068           0.20        0.18             0.29             0.053

Panel B: 2SLS estimates, instrument varying by location, year, age, and demographics
subs        0.16***         -0.067      -0.053           -0.33***         -0.26***
            (0.036)         (0.057)     (0.047)          (0.047)          (0.029)
F-stat      456.33          456.33      456.33           456.33           456.33

Panel C: 2SLS estimates, instrument varying by location, year, and age
subs        0.18            -0.62       -0.26            -1.18**          -0.28
            (0.29)          (0.42)      (0.37)           (0.57)           (0.36)
F-stat      6.06            6.06        6.06             6.06             6.06

Panel D: over-identified 2SLS estimates
subs        0.16***         -0.068      -0.054           -0.33***         -0.26***
            (0.036)         (0.057)     (0.047)          (0.047)          (0.029)
F-stat      228.36          228.36      228.36           228.36           228.36
p-value     0.93            0.17        0.57             0.048            0.95

y mean      0.83            0.90        0.077            0.071            0.039

*** p < 0.01, ** p < 0.05, * p < 0.10

Notes: Table estimates the incentive effect of subsidy on main outcomes, from a basic specification controlling for main effects of location, year, age, and demographic variables, and region-year fixed effects. Panel A shows OLS estimates on the endogenous subsidy subs. Panel B shows 2SLS estimates instrumented by subiv, which varies by location, year, age, and demographics. Panel C shows 2SLS estimates instrumented by sublean, which varies by location, year, and age. Panel D uses both instruments and reports the p-value from the Hansen over-identification test. Robust standard errors clustered at the PUMA level in parentheses.
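The mechanics of an over-identified 2SLS estimate with two instruments can be sketched on synthetic data. Everything below is an assumption made for illustration: the data-generating process, the coefficient values, and the mapping of x to subs and of the two instruments to subiv and sublean are hypothetical, and the sketch omits the controls, fixed effects, clustered standard errors, and Hansen test used in the tables.

```python
import random


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [mr - factor * mc for mr, mc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def two_stage_least_squares(y, x, instruments):
    """First stage: x on a constant and all instruments. Second stage: y on
    a constant and the fitted x. Returns the coefficient on x."""
    first_rows = [[1.0] + list(z) for z in instruments]
    first_coef = ols(first_rows, x)
    x_hat = [sum(g * v for g, v in zip(first_coef, row)) for row in first_rows]
    return ols([[1.0, xh] for xh in x_hat], y)[1]


# Synthetic data (hypothetical): u is an unobserved confounder that raises x
# and lowers y, so OLS is biased downward; z1 and z2 are valid instruments.
rng = random.Random(0)
TRUE_EFFECT = -0.3
n = 20000
z = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
u = [rng.gauss(0, 1) for _ in range(n)]
x = [0.8 * z1 + 0.5 * z2 + ui + rng.gauss(0, 1) for (z1, z2), ui in zip(z, u)]
y = [TRUE_EFFECT * xi - ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]

beta_ols = ols([[1.0, xi] for xi in x], y)[1]
beta_2sls = two_stage_least_squares(y, x, z)
print(f"OLS: {beta_ols:.3f}, 2SLS: {beta_2sls:.3f}, true effect: {TRUE_EFFECT}")
```

The sketch uses plain normal equations to stay dependency-free; in applied work a dedicated estimation package would normally supply the estimator together with clustered inference.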
Table B.2.4: Incentive effect of subsidy, second instrument simulated from Massachusetts

            (1)             (2)         (3)              (4)              (5)
            any insurance   employed    in labor force   ESI + employed   ESI + not employed

Panel A: instruments sublean and sublean_ma
subs        0.20            -0.42       -0.35            -0.52            -0.19
            (0.29)          (0.34)      (0.33)           (0.37)           (0.29)
F-stat      3.03            3.03        3.03             3.03             3.03
p-value     0.88            0.39        0.77             0.061            0.60

Panel B: instruments subiv and sublean_ma
subs        0.10***         -0.055      -0.057           -0.084           -0.43***
            (0.025)         (0.052)     (0.043)          (0.078)          (0.043)
F-stat      261.84          261.84      261.84           261.84           261.84
p-value     0.17            0.38        0.42             0.79             0.35

y mean      0.95            0.77        0.83             0.64             0.10

*** p < 0.01, ** p < 0.05, * p < 0.10

Notes: Table estimates the incentive effect of subsidy on main outcomes, using different simulated generosity measures as instruments. Panel A estimates the basic specification that controls for location, year, age, and demographics, and region-year fixed effects. I instrument the endogenous subsidy subs with sublean from the national sample and sublean_ma from the Massachusetts sample. Panel B estimates the main specification in Equation 18. The endogenous subsidy subs is instrumented by subiv from the national sample and sublean_ma from the Massachusetts sample. First-stage F statistics and p-values from the Hansen over-identification test are reported in each panel. Robust standard errors clustered at the PUMA level in parentheses.
Table B.2.5: Robustness analysis: border PUMAs

            (I)             (II)        (III)            (IV)             (V)
            any insurance   employed    in labor force   ESI + employed   ESI + not employed

Panel A: main results
subiv       0.10***         -0.055      -0.056           -0.26***         -0.33***
            (0.025)         (0.053)     (0.044)          (0.058)          (0.023)
R²          0.071           0.090       0.10             0.13             0.054

Panel B: average subsidy (instead of price) for border PUMA
subiv       0.10***         -0.055      -0.056           -0.26***         -0.33***
            (0.025)         (0.053)     (0.044)          (0.058)          (0.023)
R²          0.071           0.090       0.10             0.13             0.054

Panel C: assign border PUMA to larger region
subiv       0.093***        -0.059      -0.061           -0.28***         -0.33***
            (0.024)         (0.053)     (0.044)          (0.059)          (0.024)
R²          0.061           0.088       0.10             0.13             0.053

Panel D: cluster by region-age band
subiv       0.093**         -0.059      -0.061           -0.28***         -0.33***
            (0.033)         (0.062)     (0.060)          (0.070)          (0.036)
R²          0.061           0.088       0.10             0.13             0.053

*** p < 0.01, ** p < 0.05, * p < 0.10

Notes: Table shows reduced-form estimates of the subsidy incentive subiv from the main specification, treating border PUMAs (those intersecting two rating regions) differently. In Panel A, the main result averages the premium across regions and then calculates the subsidy. Panel B calculates the subsidy under each region before averaging. Panel C assigns border PUMAs to the region with the greatest population share. Panel D uses the region division in Panel C, but clusters standard errors at the level of region and age band. In Panels A-C, standard errors are clustered at the level of 52 PUMAs.

Table B.2.6: Consumption, in thousands of dollars

                (I)       (II)      (III)     (IV)      (V)         (VI)
                c_1       c_11      c_2       c_0       c_{>0,0}    c

Panel A: non-medical consumption
mean            44.00     46.41     27.84     24.95     33.53       38.82
s.e.            (3.10)    (3.45)    (6.16)    (6.50)    (6.92)      (2.42)
ratio to c_1    1         1.05      0.63      0.57      0.76        0.88

Panel B: food consumption
mean            1.92      1.94      1.78      1.76      1.59        1.87
s.e.            (0.09)    (0.10)    (0.14)    (0.34)    (0.22)      (0.07)
ratio to c_1    1         1.01      0.93      0.92      0.83        0.97

N               242       203       50        22        18          345
externality     pb        pr        p         k         uc_p

Notes: Table shows average quarterly non-medical consumption expenditures (in thousands of dollars) in Panel A, and food expenditures in Panel B, for beneficiary groups.
Sample includes 27-64 Massachusetts adults in 2011 panel of Consumer Expenditure Survey (CEX). Standard error of mean estimates in the parenthesis. 189 Table B.2.7: Welfare effect of subsidy, alternative incidence of uncompensated cost (I) (II) (III) (IV) (V) (VI) dW B d p p dW UC d p p dW d p p dW C d p p (1)+(2)+(3)+(4) dW F d p p u 0 = 0:32 = 1 premium tax = 0:32 = 1 premium tax Panel A: non-medical consumption = 0 0.232 0.074 0.074 0.077 0.011 -0.338 -0.021 -0.021 -0.018 -0.018 = 1 0.368 0.082 0.097 0.086 0.013 -0.342 0.121 0.136 0.125 -0.028 = 2 0.585 0.092 0.128 0.095 0.014 -0.346 0.345 0.381 0.348 -0.045 = 3 0.928 0.105 0.169 0.106 0.016 -0.350 0.699 0.763 0.700 -0.071 Panel B: food consumption = 1 0.249 0.079 0.089 0.079 0.011 -0.339 0 0.010 0 -0.018 = 2 0.268 0.085 0.107 0.080 0.012 -0.339 0.026 0.048 0.021 -0.019 = 3 0.288 0.092 0.129 0.082 0.012 -0.340 0.052 0.089 0.042 -0.021 Notes: Table calculates the welfare effect of premium subsidy under different scenarios of uncompensated care incidence. = 0:32 corresponds to the main analysis where patient surcharge finances 32% of uncompensated cost. Assuming the remaining cost (assessment) is completely passed-through to commercially insured patients, = 1. Alternative pass-through to insurers amounts to a premium tax on enrollees, and subsidy lowers the tax by: dW UC d p p =g u 0 (c >0 ) u 0 (c 1 ) ri p 1 1 0 +" ri; 0 d 0 d p , where u 0 (c >0 ) u 0 (c 1 ) = 0:90 for non-medical consumption, and 0:98 for food. 190 Table B.2.8: Consumption, unsubsidized population > 300% FPL, in thousands of dollars (I) (II) (III) (IV) (V) (VI) c 1 c 11 c 2 c 0 c >0;0 c Panel A: non-medical consumption mean 49.21 49.44 70.84 17.98 37.71 48.54 s.e. (3.60) (3.74) (16.42) (7.00) (7.48) (3.15) c 1 1 1 1.44 0.37 0.77 0.99 Panel B: food consumption mean 2.07 2.03 2.64 2.15 1.82 2.10 s.e. 
(0.10) (0.10) (0.33) (1.21) (0.25) (0.09) c 1 1 1 1.28 1.04 0.88 1.01 N 189 175 14 6 14 231 externality pb pr p = 0 k = 1 2 uc p Notes: Table shows average quarterly non-medical consumption expenditures (in thousands of dollars) in Panel A, and food expenditures in Panel B, for beneficiary groups with income above 300% FPL. Sample includes 27-64 Massachusetts adults in 2011 panel of Consumer Expenditure Survey (CEX). Standard error of mean estimates in the parenthesis. 191 Table B.2.9: Welfare effect of penalty, alternative incidence of uncompensated cost (I) (II) (III) (IV) (V) dW B d p p dW UC d p p dW C d p p (1)+(2)+(3) dW F d p p u 0 = 0:32 = 1 premium tax = 0:32 = 1 premium tax Panel A: non-medical consumption = 0 -0.023 0.011 0.012 0.012 0.003 -0.009 -0.008 -0.008 -0.006 = 1 -0.062 0.013 0.015 0.012 0.003 -0.046 -0.044 -0.047 -0.016 = 2 -0.168 0.014 0.020 0.012 0.003 -0.151 -0.145 -0.153 -0.044 = 3 -0.454 0.016 0.025 0.012 0.003 -0.435 -0.426 -0.439 -0.118 Panel B: food consumption = 1 -0.022 0.012 0.013 0.012 0.003 -0.007 -0.006 -0.007 -0.006 = 2 -0.021 0.012 0.015 0.011 0.003 -0.006 -0.003 -0.007 -0.006 = 3 -0.020 0.013 0.017 0.011 0.003 -0.004 0 -0.006 -0.005 Notes: Table calculates the welfare effect of mandate penalty under different scenarios of uncompensated care incidence. I restrict the analysis to the unsubsidized population with income above 300% FPL. = 0:32 corresponds to the main analysis where patient surcharge finances 32% of uncompensated cost. Assuming the remaining cost (assessment) is completely passed-through to commercially insured patients, = 1. Alternative pass-through to insurers amounts to a premium tax on enrollees, and penalty lowers the tax by dW UC dkp =g u 0 (c >0 ) u 0 (c 1 ) ri p 1 1 0 +" ri; 0 d 0 dk , where u 0 (c >0 ) u 0 (c 1 ) = 1 for non-medical consumption, and 1:01 for food. 
192 Table B.2.10: Welfare effect of subsidy, alternative calculation of fiscal cost (I) (II) (III) (IV) (V) (VI) dW B d p p dW UC d p p dW d p p dW C d p p dW B d p p + dW C d p p + dW d p p + dW C d p p dW F d p p u 0 de d p = 0 =0:08 =0:16 d e 1 d p = 0 d 1e 1 d p = 0 d 1 d p = 0 de d p = 0 =0:08 =0:16 d e 1 d p = 0 d 1e 1 d p = 0 d 1 d p = 0 Panel A: non-medical consumption = 0 0.242 0.071 0.010 -0.358 -0.423 -0.487 -0.311 -0.462 -0.381 -0.035 -0.100 -0.164 0.012 -0.139 -0.058 -0.017 = 1 0.390 0.076 0.011 -0.364 -0.429 -0.493 -0.318 -0.460 -0.381 0.113 0.049 -0.016 0.159 0.017 0.096 -0.027 = 2 0.630 0.083 0.013 -0.369 -0.434 -0.498 -0.324 -0.459 -0.381 0.357 0.292 0.228 0.402 0.267 0.345 -0.044 = 3 1.015 0.091 0.015 -0.374 -0.439 -0.503 -0.330 -0.458 -0.381 0.747 0.682 0.618 0.791 0.663 0.740 -0.071 Panel B: food consumption = 1 0.260 0.076 0.010 -0.360 -0.425 -0.489 -0.312 -0.423 -0.381 -0.014 -0.079 -0.143 0.034 -0.077 -0.035 -0.018 = 2 0.280 0.081 0.011 -0.360 -0.425 -0.489 -0.313 -0.423 -0.381 0.012 -0.053 -0.117 0.059 -0.051 -0.009 -0.019 = 3 0.300 0.088 0.011 -0.361 -0.426 -0.490 -0.314 -0.422 -0.381 0.038 -0.027 -0.091 0.085 -0.023 0.018 -0.021 Notes: Table calculates the welfare effect of premium subsidy under different calibration of fiscal costs. Sample includes Massachusetts adults aged 19-64 in Massachusetts. de d p = 0 corresponds to the baseline calibration for this sample, where d 2 1 d p =0:25, d 1e 1 d p =0:27, and d 1 d p =0:52. I vary the labor response to subsidy to equal either half or the full size of take-up (-0.08 and -0.16), and shut down ESI selection from workers, non-workers, or both, when employment effect is zero. Incidence of uncompensated cost follows the main analysis, with = 32% born by patient surcharge finances 32% of uncompensated cost. Welfare weights (based on consumption) differ only slightly from the 27-64 sample in the main analysis. 
193 Table B.2.11: Welfare effect of subsidy, alternative moral hazard on formal insurance spending (I) (II) (III) (IV) (V) (VI) dW B d p p dW UC d p p dW d p p dW C d p p (1)+(2)+(3)+(4) dW F d p p u 0 g = 1 g = 0:8 g = 0:7 g = 1 g = 0:8 g = 0:7 Panel A: non-medical consumption = 0 0.232 0.089 0.074 0.062 0.011 -0.338 -0.006 -0.021 -0.033 -0.018 = 1 0.368 0.095 0.082 0.067 0.013 -0.342 0.134 0.121 0.106 -0.028 = 2 0.585 0.104 0.092 0.073 0.014 -0.346 0.357 0.345 0.326 -0.045 = 3 0.928 0.114 0.105 0.080 0.016 -0.350 0.708 0.699 0.674 -0.071 Panel B: food consumption = 1 0.249 0.095 0.079 0.066 0.011 -0.339 0.016 0 -0.013 -0.018 = 2 0.268 0.102 0.085 0.071 0.012 -0.339 0.043 0.026 0.012 -0.019 = 3 0.288 0.110 0.092 0.077 0.012 -0.340 0.070 0.052 0.037 -0.021 Notes: Table calculates the welfare effect of premium subsidy for the 19-64 population under different moral hazard on formal insurance spending, modeled as relative generosity of informal insuranceg. I re-calculated uncompensated cost saving assuming formal insurance does not increase spending (g = 1), increases spending by 25% (g = 0:8), or increases spending by 43% (g = 0:7). I assume = 0:32 of uncompensated cost is financed by patient surcharge, as in the main analysis. 
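The pairs reported in the notes above (g = 0.8 with a 25% increase, g = 0.7 with a 43% increase) are consistent with formal insurance scaling spending by 1/g relative to informal insurance. The short sketch below encodes that reading; the formula is an inference from the reported pairs, not one stated in the text, and the function name is hypothetical.

```python
def spending_increase(g):
    """Proportional increase in spending under formal insurance when informal
    insurance has relative generosity g, assuming spending scales by 1 / g."""
    return 1.0 / g - 1.0


for g in (1.0, 0.8, 0.7):
    print(f"g = {g}: spending increase = {spending_increase(g):.0%}")
```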
194 Table B.2.12: Welfare effect of penalty, alternative moral hazard on formal insurance spending (I) (II) (III) (IV) (V) dW B dkp dW UC dkp dW C dkp (1)+(2)+(3) dW F dkp u 0 g = 1 g = 0:8 g = 0:7 g = 1 g = 0:8 g = 0:7 Panel A: non-medical consumption = 0 -0.023 0.014 0.011 0.010 0.003 -0.006 -0.009 -0.010 -0.006 = 1 -0.062 0.015 0.013 0.011 0.003 -0.044 -0.046 -0.048 -0.016 = 2 -0.168 0.017 0.014 0.013 0.003 -0.148 -0.151 -0.152 -0.044 = 3 -0.454 0.019 0.016 0.014 0.003 -0.432 -0.435 -0.437 -0.118 Panel B: food consumption = 1 -0.022 0.014 0.012 0.011 0.003 -0.005 -0.007 -0.008 -0.006 = 2 -0.021 0.015 0.012 0.011 0.003 -0.003 -0.006 -0.007 -0.006 = 3 -0.020 0.016 0.013 0.012 0.003 -0.001 -0.004 -0.005 -0.005 Notes: Table calculates the welfare effect of mandate penalty for the 19-64 population above 300% FPL under different moral hazard on formal insurance spending, modeled as relative generosity of informal insuranceg. I re-calculated uncompensated cost saving assuming formal insurance does not increase spending (g = 1), increases spending by 25% (g = 0:8), or increases spending by 43% (g = 0:7). I assume = 0:32 of uncompensated cost is financed by patient surcharge, as in the main analysis. 195 B.3 Appendix Figures 196 Figure B.3.1: Permutation test: random rating communities in non-MA states (a) any insurance 0 .2 .4 .6 .8 1 −.1 −.05 0 .05 .1 (b) employed 0 .2 .4 .6 .8 1 −.4 −.2 0 .2 (c) in labor force 0 .2 .4 .6 .8 1 −.1 −.05 0 .05 .1 .15 (d) ESI + employed 0 .2 .4 .6 .8 1 −.8 −.6 −.4 −.2 0 .2 (e) ESI + not employed 0 .2 .4 .6 .8 1 −.6 −.4 −.2 0 .2 Notes. Graphs plot the empirical cumulative distribution of pseudo estimates from non-MA states and Massachusetts (marked with a plus). 
I permute the actual Massachusetts community rating across location, year, and age band cells in control states, and estimate the pseudo policy effect using simulated generosity sublean, based on the reduced-form specification with main effects of age, PUMA, year, and demographic variables interacted with the unemployment rate. Each hollow circle represents the estimate from a non-MA state, with the size of the circle corresponding to the number of clustering units (PUMAs) in the state.

Figure B.3.2: Permutation test: random rating communities in non-MA states
[Figure panels: (a) any insurance, (b) employed, (c) in labor force, (d) ESI + employed, (e) ESI + not employed.]

Notes. Graphs plot the empirical cumulative distribution of pseudo estimates from non-MA states and Massachusetts (marked with a plus). I permute the actual Massachusetts community rating across location, year, and age band cells in control states, and Massachusetts affordability across income (demographic) groups. For each state I estimate the pseudo policy effect using simulated generosity subiv, based on the reduced-form specification with full interaction terms. Each hollow circle represents the estimate from a non-MA state, with the size of the circle corresponding to the number of clustering units (PUMAs) in the state.
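The comparison these figures draw, locating the Massachusetts estimate within the distribution of placebo estimates, can be sketched as follows. The placebo values and the "actual" estimate are hypothetical numbers chosen for illustration, and the (count + 1) / (n + 1) p-value is a common permutation-test convention rather than a formula taken from the text.

```python
def empirical_cdf_position(actual, placebo):
    """Share of placebo estimates at or below the actual estimate."""
    return sum(1 for b in placebo if b <= actual) / len(placebo)


def two_sided_permutation_p(actual, placebo):
    """Share of estimates at least as large in magnitude as the actual one,
    counting the actual estimate itself among the draws."""
    extreme = sum(1 for b in placebo if abs(b) >= abs(actual))
    return (extreme + 1) / (len(placebo) + 1)


# Hypothetical placebo estimates, one per control state.
placebo_estimates = [-0.05, -0.02, 0.0, 0.01, 0.03, 0.04, -0.01, 0.02, -0.03]
actual_estimate = 0.10  # hypothetical treated-state estimate

position = empirical_cdf_position(actual_estimate, placebo_estimates)
p_value = two_sided_permutation_p(actual_estimate, placebo_estimates)
print(f"CDF position: {position:.2f}, permutation p-value: {p_value:.2f}")
```

With only nine placebo draws the smallest attainable p-value under this convention is 0.10, so the resolution of the test depends on the number of control states.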
C Appendix to Chapter 4

C.1 Appendix Tables

Table C.1.1: Coverage Eligibility Changes at CHIP Onset

                                            Eligibility Change (% FPL)
State                  Enrollment Begins    infant      age 1-5    age 6-18
Alabama                Oct-98               133-200*    133-200    100-200
Alaska                 Mar-99               133-200*    133-200    100-200
Arizona                Nov-98               140-200     133-200    100-200
Arkansas               Sep-97               133-200     133-200    100-200
California             Jul-98               200-250     133-200    100-200
Colorado               May-98               133-185     133-185    100-185
Connecticut            Jun-98               185-300     185-300    185-300
Delaware               Feb-99               185-200     133-200    100-200
District of Columbia   Oct-98               185-200*    133-200    100-200
Florida                Apr-98               185-200     133-200    100-200
Georgia                Jan-99               185-200*    133-200    100-200
Hawaii                 Jul-00               185-200     133-200    100-200
Idaho                  Oct-97               133-160     133-160    100-160
Illinois               Jan-98               133-200*    133-133    100-133
Indiana                Oct-97               150-150     133-150    100-150
Iowa                   Jul-98               185-185     133-133    100-133
Kansas                 Jan-99               150-200     133-200    100-200
Kentucky               Jul-98               185-185     133-150    100-150
Louisiana              Nov-98               133-133     133-133    100-133
Maine                  Aug-98               185-185     133-185    125-185
Maryland               Jul-98               185-200*    185-200    185-200
Massachusetts          Oct-97               185-200*    133-133    100-133
Michigan               May-98               185-200*    150-200    150-200
Minnesota              Oct-98               275-280     275-275    275-275
Mississippi            Jan-99               185-185     133-133    100-133
Missouri               Jul-98               185-300     133-300    100-300
Montana                Jan-99               133-150     133-150    100-150
Nebraska               Sep-98               150-185*    133-185    100-185
Nevada                 Oct-98               133-200     133-200    100-200
New Hampshire          Sep-98               185-300     185-300    185-300
New Jersey             Mar-98               185-200     133-200    100-200
New Mexico             Mar-99               185-235     185-235    185-235
New York               Jan-99               185-192     133-192    100-192
North Carolina         Oct-98               185-200     133-200    100-200
North Dakota           Oct-99               133-140     133-140    100-140
Ohio                   Jan-98               133-150*    133-150    100-150
Oklahoma               Dec-97               150-185*    133-185    100-185
Oregon                 Jul-98               133-170*    133-170    100-170
Pennsylvania           Jun-98               185-235     133-235    100-235
Rhode Island           May-97               250-250     250-250    100-250
South Carolina         Oct-97               185-185     133-150    100-150
South Dakota           Jul-98               133-133     133-133    100-133
Tennessee              –                    –           –          –
Texas                  May-00               185-200     133-200    100-200
Utah                   Aug-98               133-200     133-200    100-200
Vermont                Oct-98               225-300     225-300    225-300
Virginia               Nov-98               133-185     133-185    100-185
Washington             Jan-00               200-250     200-250    200-250
West Virginia          Nov-00               150-200     133-200    100-200
Wisconsin              Jul-99               185-185     185-185    100-185
Wyoming                Nov-99               133-133     133-133    100-133

Notes. States marked with * in the infant column simultaneously increased coverage eligibility of pregnant women. See main text for more details.

C.2 Appendix Figures

Figure C.2.1: SCHIP effect on low birth weight, longer pre-trend
[Figure panels: full sample (left), low SES + high welfare (right). Horizontal axis: conception relative to SCHIP onset, months (-16 to 4).]

Notes. Figure shows the event study of SCHIP exposure on low birth weight, extending the sample to include cohorts conceived 16 months (instead of 12 months in the main analysis) prior to program onset. Cohorts conceived 12-10 months prior to onset are impacted in the third trimester, 9-7 months prior are impacted in the second trimester, and those conceived 3-1 months prior are impacted in the first trimester. The right panel plots the event study for the low SES sample (single mothers without college education) in high welfare transfer counties. 95% confidence intervals based on robust standard errors clustered at the state level are plotted.

Figure C.2.2: SCHIP effect on smoking, longer pre-trend
[Figure panels: full sample (left), low SES + high welfare (right). Horizontal axis: conception relative to SCHIP onset, months (-16 to 4).]

Notes. Figure shows the event study of SCHIP exposure on smoking, extending the sample to include cohorts conceived 16 months (instead of 12 months in the main analysis) prior to program onset. Cohorts conceived 12-10 months prior to onset are impacted in the third trimester, 9-7 months prior are impacted in the second trimester, and those conceived 3-1 months prior are impacted in the first trimester.
The right panel plots the event study for the low SES sample (single mothers without college education) in high welfare transfer counties. 95% confidence intervals based on robust standard errors clustered at the state level are plotted. 202 Figure C.2.3: SCHIP effect on drinking, longer pre-trend −16 −15 −14 −13 −12 −11 −10 −9 −8 −7 −6 −5 −4 −3 −2 −1 0 1 2 3 4 −.2 −.1 0 .1 −.2 −.1 0 .1 full sample low SES + high welfare conception relative to SCHIP onset, months Notes. Figure shows the event study of SCHIP exposure on drinking, extending the sample to include cohorts conceived 16 months (instead of 12 months in the main analysis) prior to program onset. Cohorts conceived 12-10 months prior to onset are impacted in the third trimester, 9-7 months prior are impacted in the second trimester, and those conceived 3-1 months prior are impacted in the first trimester. The right panel plots the event study for the low SES sample (single mothers without college education) in high welfare transfer counties. 95% confidence intervals based on robust standard errors clustered at the state level are plotted. 203
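As a reading aid for Table C.1.1, the sketch below shows one way the eligibility entries could be turned into numbers. It assumes that an entry such as 133-200 gives the income threshold (% FPL) before and after CHIP onset, and that a trailing * marks a simultaneous expansion for pregnant women, as the table notes state. The names `EligibilityChange`, `parse_entry`, and `change_in_points` are hypothetical and do not come from the dissertation.

```python
import re
from typing import NamedTuple, Optional


class EligibilityChange(NamedTuple):
    before: int  # assumed: threshold before CHIP onset, in % FPL
    after: int   # assumed: threshold after CHIP onset, in % FPL
    pregnant_women_expanded: bool  # True when the entry carries a trailing *


# Matches entries such as "133-200" or "133-200*".
_ENTRY = re.compile(r"^(\d+)-(\d+)(\*?)$")


def parse_entry(entry: str) -> Optional[EligibilityChange]:
    """Parse one table cell; return None for placeholders (e.g. the Tennessee row)."""
    match = _ENTRY.match(entry.strip())
    if match is None:
        return None
    return EligibilityChange(
        before=int(match.group(1)),
        after=int(match.group(2)),
        pregnant_women_expanded=match.group(3) == "*",
    )


def change_in_points(entry: str) -> Optional[int]:
    """Size of the eligibility expansion in percentage points of FPL."""
    parsed = parse_entry(entry)
    return None if parsed is None else parsed.after - parsed.before


# Three rows copied from Table C.1.1 (infant, age 1-5, age 6-18).
rows = {
    "Alabama": ("133-200*", "133-200", "100-200"),
    "Indiana": ("150-150", "133-150", "100-150"),
    "Tennessee": ("–", "–", "–"),
}

for state, cells in rows.items():
    print(state, [change_in_points(cell) for cell in cells])
```

Under this reading, an entry with equal endpoints (for example 150-150) corresponds to no change in the threshold for that age group.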
Abstract
Health is a key aspect of human capital. In modern societies, health insurance programs play important roles in promoting health and health investments, particularly for vulnerable populations. In the US, expenditures on government-sponsored health insurance programs accounted for 8.1% of GDP in 2017, a 3.5% increase from 2016. However, the extent to which the spending benefits enrollees rather than providers, and the effectiveness of the spending in improving health and human capital, are more nuanced but important questions for policy design.

This dissertation analyzes the benefits and beneficiaries of policy interventions in three health insurance programs. Chapter 2 looks at the value-based payment reform in the private Medicare market that took effect in 2012. The reform cut the payment to private Medicare insurers, but increased quality-adjusted payments to insurers with higher quality ratings. I find that the bonus payment did not reduce out-of-pocket costs for enrollees: the pass-through of the bonus payment to enrollee premiums and deductibles is indistinguishable from zero. Nor did enrollees in high-quality contracts report greater improvement in health. Rather, I find that high-quality contracts decreased (increased) premiums in low (high) risk counties, differentially enrolling low-risk enrollees. Examining the rubric of the quality rating, I note that health outcome measures are not adjusted by diagnosis codes prior to enrollment, but receive the greatest weight in the computation of quality. The risk selection therefore has the unintended consequence of restricting access to quality among the sicker population, who face higher premiums to purchase high-quality contracts.

Chapter 3 seeks to understand the rationale for expanding formal health insurance with policies such as premium subsidies and mandate penalties. The classic justification for the insurance mandate is adverse selection. In the US context, because the uninsured are not turned away at the emergency room, the social cost of uncompensated care provides additional motivation for expanding formal insurance. To understand the joint and relative relevance of the two arguments, I turn to the 2006-2007 health insurance reform in Massachusetts.

I focus on two policy instruments that expanded formal insurance in the state: a subsidy to private insurance premiums and a penalty on the high-income uninsured. I derive and calculate the welfare effects by exploiting behavioral responses to policy incentives and the resulting externality on premiums, uncompensated cost, and tax-subsidy transfers. I find that the rationale for the mandate penalty rests entirely on the selection effect on premiums: once the premium response is shut down, uncompensated care alone does not offset the cost of the penalty. The premium subsidy, by contrast, is mostly motivated by the high cost of uncompensated care among the low-income uninsured. Including a modest effect on premiums (realistic even in the presence of mark-up adjustment to the price-linked subsidy) generates a net positive return on subsidy dollars purely from an efficiency standpoint.

Chapter 4 looks at the long-run impact of the in-utero investment response to the Children’s Health Insurance Program (CHIP). I find that pregnant mothers impacted by CHIP onset during pregnancy reduced smoking and drinking, and that their children have higher birth weight and a lower chance of cognitive difficulty in their teenage years. The forward-looking behavior implies that public investment in children can “crowd-in” private investment that precedes program participation. Accounting for the short-term benefits at birth and the long-run benefits on later-life outcomes increases the cost-effectiveness of public insurance programs for children.
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
Three essays on the evaluation of long-term care insurance policies
Essays in health economics and provider behavior
Three essays on estimating the effects of government programs and policies on health care among disadvantaged population
Three essays in health economics
Essays on health economics
Essays on health and aging with focus on the spillover of human capital
Essays on development and health economics: social media and education policy
Essays on development economics
Three essays on health & aging
Essays on development and health economics
Three essays on economics of early life health in developing countries
Behavioral approaches to industrial organization
Essays on education and institutions in developing countries
Thesis on Medicare Part D
Value in health in the era of vertical integration
Essays on work, retirement, and fostering longer working lives
Essays on family planning policies
Three essays on health economics
Three essays on behavioral economics approaches to understanding the implications of mental health stigma
Essays on the politics of the American welfare state
Asset Metadata
Creator
Wang, Hongming (author)
Core Title
Essays on health insurance programs and policies
School
College of Letters, Arts and Sciences
Degree
Doctor of Philosophy
Degree Program
Economics
Publication Date
06/17/2019
Defense Date
04/22/2019
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
costs and benefits, externalities, health insurance, OAI-PMH Harvest, social insurance, Welfare
Format
application/pdf (imt)
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Strauss, John (committee chair), Chen, Alice (committee member), Nugent, Jeffrey (committee member)
Creator Email
hongminw@usc.edu
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c89-175637
Unique identifier
UC11662468
Identifier
etd-WangHongmi-7494.pdf (filename), usctheses-c89-175637 (legacy record id)
Legacy Identifier
etd-WangHongmi-7494.pdf
Dmrecord
175637
Document Type
Dissertation
Rights
Wang, Hongming
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Tags
costs and benefits
externalities
health insurance
social insurance