Essays on innovation, human capital, and COVID-19 related policies
Essays on Innovation, Human Capital, and COVID-19 Related Policies

by Shaoshuang Yang

A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (ECONOMICS)

May 2023

Copyright 2023 Shaoshuang Yang

Contents

List of Tables
List of Figures
Chapter 1 The Distribution of Innovations across Firms
  1.1 Introduction
  1.2 A Theoretical Model
    1.2.1 Environment
    1.2.2 Equilibrium
  1.3 Calibration
    1.3.1 Data
    1.3.2 Moment Estimation
    1.3.3 Calibration
  1.4 Results
    1.4.1 Optimal Contracts
    1.4.2 The Distribution of Inventors across Firms
    1.4.3 Untargeted Moments
    1.4.4 Growth Decomposition
    1.4.5 Counterfactual
    1.4.6 Endogenous Contracting and Firm-Investor Matching
  1.5 Conclusion
Chapter 2 Estimate the Belief Bias in Learning from Coworkers
  2.1 Introduction
  2.2 Model
    2.2.1 Firms
    2.2.2 Workers
    2.2.3 Labor market and Firm Entry
    2.2.4 Knowledge Distribution
    2.2.5 Equilibrium
  2.3 Structural Estimation
    2.3.1 Data
    2.3.2 Identifying Perceived And Actual Learning Parameters
    2.3.3 Estimation Results
    2.3.4 Inequality
  2.4 Conclusion
Chapter 3
  3.1 Introduction
  3.2 Data
  3.3 Empirical Methodology
  3.4 Results
    3.4.1 Containment measures
  3.5 Discussion
Chapter A The Distribution of Innovations across Firms
  A.1 Algorithm
  A.2 Growth Rate
  A.3 Untargeted Moments: In-house versus outsourcing choice
  A.4 Counterfactuals
    A.4.1 Counterfactual: Tax
    A.4.2 Counterfactual Decomposition: Signal
  A.5 Extensions
    A.5.1 A More Recent Sample Period
    A.5.2 Financial Friction
    A.5.3 Innovations as Substitutions
    A.5.4 Use Stock Options Instead of Equity
    A.5.5 Patent Originality Distribution
Chapter B Estimate the Belief Bias in Learning from Coworkers
  B.1 Multi-dimension Knowledge
    B.1.1 Case 1: the non-transmissible is linear in knowledge
    B.1.2 Case 2: the non-transmissible is independent with knowledge
Chapter C Bibliography

List of Tables

1.1 Model Fit for Key Moments of the Statistical Model
1.2 List of Parameters Used in the Model
1.3 External Calibrated Parameters
1.4 Indirect Inference Calibrated Parameters
1.5 Model Fit for Key Moments: Targeted Moments
1.6 Model Fit for Untargeted Moments
1.7 Model Fit for Untargeted Moments Using Spanish Data
1.8 Counterfactuals
1.9 The Secondary Market's Impact Depends on the Probability of Selling an Innovation
1.10 Endogenous Contracts and Matching
2.1 Estimation Results
2.2 Inequality Change When Perception Is Correct
3.1 Fiscal measures
3.2 Containment measures, sub-index level.
A.1 Firm R&D Investment Regressions
C1 Counterfactual on Secondary Market Tax: 1% of the Transaction Price
C2 Endogenous Contracts and Matching: Signals ($R^2 = 0.4\%$)
C3 Directly Calibrated Parameters Given Indirect Inference Results (A More Recent Sample Period)
C4 Indirect Inference Calibrated Parameters (A More Recent Sample Period)
C5 Model Fit for Key Targeted Moments (A More Recent Sample Period)
C6 Counterfactuals (A More Recent Sample Period)

List of Figures

1.1 Public Innovative Firm Size Distribution
1.2 Probability of Being Public Conditional on Firm Size
1.3 Estimated Firm Size Distribution of Innovative Firms
1.4 Estimated Innovation Distributions across Innovative Firms (with more than 500 employees)
1.5 The Impact of $k_1$ on $h(q)$
1.6 The Impact of $k_2$ on $h(q)$
1.7 The Impact of $\eta$ on Innovation Distribution
1.8 The Innovation-Related Risk across Firm Sizes
1.9 The Equilibrium Effort and the Optimal Contract
1.10 The Relationship between Inventors and Firms
1.11 Cumulative Share of Innovations
1.12 Probability Density of Innovation
1.13 The Mapping between Inventors and Firms
1.14 The Innovation-Related Risk Share Increases with the Signal
1.15 The Optimal Contract Changes with the Signal
2.1 Team Size Distribution
2.2 Age Distribution
2.3 Wage Distribution
2.4 Wage Gap Distribution
3.1 The distribution of fiscal and containment measures across countries.
D1 The Average Originality By Firm Size in 1997

Abstract

This thesis includes three parts, aiming to understand the economics behind innovation and human capital accumulation, as well as how governments react to the COVID-19 pandemic.

The first chapter builds a macroeconomic framework with heterogeneous firms and heterogeneous inventors to quantitatively understand the allocation of innovations across firms and the implications of innovation tradability for that allocation. The model characterizes key aspects of the innovation process: firms' decisions to innovate in-house versus hiring inventors, as well as trading innovations with each other. The model is calibrated to moments in the United States, including the probability of selling innovations and the distribution of innovations across firms. It implies that projects whose outcomes are more effort-sensitive are developed by smaller firms. In a counterfactual scenario where firms cannot sell innovations, inventors move to larger firms. The share of innovations in firms with more than 100,000 employees increases by more than 10 percentage points, and growth drops by 0.17 percentage points.
The second chapter studies belief bias in the workplace. I build a structural model where workers can learn from coworkers. They choose where to work based on both wages and perceived learning opportunities. I estimate the model using German administrative data, which contain information on workforce composition and workers' characteristics. I propose a methodology to jointly estimate the perceived and the correct learning functions, building on the observation that learning is priced by a competitive market based on beliefs. The estimation results show that workers overestimate how much they can learn from coworkers by a factor of seven. This implies that more knowledgeable workers are overpaid while the rest are underpaid, which increases within-team inequality.

The third chapter looks into how elections shape policymaking during a crisis. The Covid crisis offers a unique opportunity to examine this issue in the context of a homogeneous and contemporaneous shock. We find that closer elections predict more generous fiscal packages and less restrictive containment measures. Exploring the heterogeneity in containment measures, we find that elections reduce restrictions that impact economic activity the most.

Chapter 1 The Distribution of Innovations across Firms

1.1 Introduction

Innovations are the fuel of growth, and they are developed in many different types of firms. For example, AirPods were invented by Jason Giles at Apple, and both Facebook and WhatsApp were developed by inventors in start-ups. Firms also resell innovations to other firms that complement them. For example, WhatsApp was sold to Facebook, which already owned a social network platform. Why are some innovations created in big firms and others in small firms? What are the forces pushing innovation inside or outside a firm's boundaries? Why are some innovations kept and some sold? This paper studies how the option of selling innovations affects growth in an endogenous contracting setting.
I model the entire process of producing an innovation, including both the primary and the secondary markets. In the primary market, an inventor with an innovative idea decides which firm to work for. In the secondary market, a firm decides whether to resell an innovation to another firm. I calibrate the model using US data moments, including the probability of selling innovations and the distribution of innovations across firms.

I run counterfactual experiments on innovation tradability. Trade frictions in secondary markets affect not only the allocation of innovations and firm growth but also the aggregate growth rate. In the extreme case where firms are not allowed to trade innovations, inventors move to larger firms, which contribute more to growth than before. The share of innovations created in start-ups and firms with fewer than 500 employees decreases from 7.9% to 1.4%. By contrast, for firms with more than 100,000 employees, this share increases by more than 10 percentage points. The aggregate growth rate of the economy decreases by 0.17 percentage points, from 2.00% to 1.83%.

I show that endogenous contracting and firm-inventor matching play an important role quantitatively. They mitigate the growth effect of shutting down secondary markets. If they were ignored, growth would drop by 0.21 percentage points instead, and the innovation distribution across firms would remain mostly unchanged. The paper shows that the option to sell innovations is important for growth, and that endogenous contracting and matching matter for the quantitative magnitude.

I model a population of firms of different and endogenously evolving sizes. They face a population of risk-averse, short-lived inventors with innovation projects that differ in effort sensitivity.[1] The primary market is where firms offer compensation to attract inventors, and an inventor accepts the contract of the firm that gives the highest utility. Then the inventor works on the idea to develop an innovation.
A firm chooses its compensation schedule to solve a principal-agent problem. The firm is risk neutral and improves its product quality using innovations. It offers a combination of equity and wages to provide incentives, share risk, and split surplus with the inventor; when the firm offers more equity, the inventor is better incentivized but also more exposed to the variance in equity. In particular, the use of equity exposes inventors both to the risks of their own innovations and to all other risks the firm faces.[2] A given inventor's innovation-related risks contribute a smaller portion of the equity variance in a larger firm. Specifically, in start-ups, the value of equity depends solely on the inventor's innovation. Thus, start-ups can offer compensation that maximizes innovation outcomes, restricted only by the inventor's risk aversion. The inventor can be well incentivized to exert effort. In larger firms, by contrast, much of the equity value depends on stochastic factors unrelated to the inventor's effort (e.g., the success of other product lines). If large firms provided the same amount of incentive as start-ups, which means providing the same share of equity as start-ups, the inventors would be exposed to many unrelated risks and face a highly levered compensation package. The equilibrium outcome is that both the equity share and the incentive decrease with firm size, and an inventor works harder in a smaller firm.

[1] It measures the elasticity of the innovation rate with respect to the inventor's effort.
[2] The problem cannot be solved by a cash bonus contingent on the delivery of an innovation, because of the assumption that inventors can easily come up with a fake innovation, and whether an innovation is fake cannot be proved to a judge. See more in Section 1.2.1.

In larger firms, an innovation is likely to be more valuable since it has a higher probability of complementing the firm (Figueroa and Serrano, 2019).[3] Complementarity can mean that a firm can apply the technology to a larger production scope, that the innovation is in line with the manager's expertise, or that the firm has enough liquidity to meet any potential financial requirement in commercialization.[4] If there is complementarity, the firm is the most efficient user; otherwise, there are other firms that are more efficient in using the technology.

Given contracts, an inventor chooses in which firm to work by trading off the chance of a successful innovation against the value of the innovation. On the one hand, the smaller the firm is, the harder the inventor works, hence a higher probability of innovating. On the other hand, conditional on the delivery of an innovation, a larger firm on average extracts more value from the innovation. Therefore, the key trade-off is that working in a bigger firm means a higher surplus from using an innovation, while a smaller firm leads to better incentives and a higher probability of obtaining an innovation. The latter is more important for an inventor with an effort-sensitive idea. As a result, inventors whose ideas are more effort-sensitive work for small firms or start-ups, whereas inventors with less effort-sensitive ideas work for large firms.

[3] This paper focuses on quality-improvement innovations. In the appendix, I consider the case in which innovations are substitutes for existing technologies.
[4] In the appendix, I show that under certain functional form assumptions, financial frictions show up in the model in the same way as complementarity.

Once an innovation is created, the firm decides whether to resell it on a secondary market. Before making the decision, the firm observes the innovation step size,[5] which is random, and whether it has complementarity with the innovation. When there is no complementarity, it tries to resell the innovation on the secondary market. The buyers are the firms that have complementarity.
The secondary market serves as a reallocation mechanism to make efficient use of innovations. A firm resells its innovation on a secondary market with asymmetric information. This is because the innovation step size is private information: a firm cannot prove it to others without telling them the technology details (Silveira and Wright, 2010; Chiu et al., 2017). The unobservable innovation step size leads to a lemons market where only low-quality innovations are sold. Therefore, firms cannot fully capture an innovation's first-best-use value; that is, an innovation's expected value to the firm that creates it is lower in smaller firms, which explains why not every inventor joins a start-up. This adverse selection feature of innovations is also present in Chatterjee and Rossi-Hansberg (2012), which analyzes spin-off decisions under asymmetric information.

[5] The step size is the change in firm quality resulting from an innovation.

One special case is when the inventor works for a start-up. If the start-up keeps the innovation, either because of complementarity or because it fails to sell it, the start-up enters the market and begins to produce. On the other hand, if the start-up sells the innovation to another firm, this is the model counterpart to a start-up buyout, namely, an incumbent acquiring the start-up for its technology.

I use patents as a real-world proxy for innovations. I calibrate the model to moments from US patenting firms between 1982 and 1997, matching the aggregate growth rate, the probability of a firm selling its patents, the start-up buyout rate, the distribution of patents by firm size, and the average value of an innovation. The patent distribution is calculated using data from the US Patent and Trademark Office (USPTO), the Center for Research in Security Prices (CRSP), the linked CRSP/USPTO data provided by Kogan et al. (2017), and the CRSP/Compustat Merged Database. A public firm sample is first constructed using the datasets.
Then, I build a statistical model to estimate the weight for each firm, as if inclusion in the public firm sample were stratified by firm size. I use the statistical model to estimate the distribution of innovations across all patenting firms. Four data points in the distribution are targeted in the calibration; the rest are all untargeted. The model is able to match most of the distribution of innovations across firms. Another untargeted moment is the growth rate gap between the 90th percentile and 50th percentile innovative firms, which my model can also match. Moreover, I use in-house R&D investment by firm size as the third set of untargeted moments. It confirms that the model can capture some of the trade-offs between in-house innovation and outsourcing.

Finally, I use the model to run counterfactuals. In addition to the experiment that shuts down the secondary market, I simulate cases in which there are public signals for innovation step sizes and subsidies on innovation trade, both of which alleviate inefficiency in the secondary market. The growth rate increases with efficiency. In the benchmark model, there is no signal about innovation quality. Due to the lemons market, only low-step-size innovations are sold. When potential buyers can observe a signal about the innovation step size, more innovations are sold. For example, if the $R^2$ of the signal is 0.6, which means that the signal can explain about 60% of the variance in an innovation step size, the probability of selling increases from 5.3% to 5.9%. Since firms can sell unwanted innovations more easily, smaller firms are more attractive. More innovations are created in start-ups and in medium-small to medium-large firms: the share of innovations created in start-ups increases from 0.31% to 5.21%. The growth rate increases by 0.08 percentage points, from 2.00% to 2.08%. When buyers receive a subsidy when purchasing an innovation, it has a similar impact.
Quantitatively, when the subsidy is 5% of the transaction value, the aggregate growth rate increases by 0.014 percentage points, and the share of innovations in start-ups increases from 0.31% to 1.1%.

In the baseline model, both the contract terms and the inventor-firm matching are endogenous. In each counterfactual, I also examine the role of endogenous contracting and matching. First, I assume that the contract terms are fixed while firm-inventor matching is flexible. In the counterfactual case where there is no secondary market, the growth rate drops slightly less and the reallocation is less significant. This is because firms over-incentivize inventors by using a pre-fixed contract. Second, when the contracts are flexible but inventor-firm matching is fixed, the growth rate drops more, by more than 0.21 percentage points. Meanwhile, the innovation distribution remains almost unchanged. Third, when neither the contract nor the matching can adjust, the growth rate drops more than in the baseline counterfactual. Overall, the endogenous contract terms and firm-inventor matching mitigate the drop in the growth rate and amplify the distribution effect.

Related Literature

My paper relates to a literature exploring the implications of trading knowledge for firm innovation. Examples include Cassiman and Veugelers (2006), Higgins and Rodriguez (2006), Phillips and Zhdanov (2013), Bena and Li (2014), and Liu and Ma (2021). It also relates to work on the impact of the idea market on economic growth (Eaton and Kortum, 1996; Silveira and Wright, 2010; Chatterjee and Rossi-Hansberg, 2012; Chiu et al., 2017; Cabral, 2018; Cunningham et al., 2021; Perla et al., 2021; Fons-Rosen et al., 2021). Recent papers studying the impact of the secondary market in a general equilibrium model include Akcigit et al. (2016) and Ma (2022). These papers and mine share a common framework, inherited from the endogenous growth literature.[6] The firm production framework is closest to Acemoglu et al. (2018), assuming that intermediate goods producers use innovations to improve quality and that final goods producers assemble intermediate goods. This paper contributes to the literature by considering endogenous contracting and matching. I show that endogenous firm boundaries are important when thinking about the reallocation of inventors across firms in counterfactual scenarios: they mitigate the impact on the aggregate growth rate and amplify the effect on the innovation distribution.

[6] For example, Romer (1986), Aghion and Howitt (1992), Aghion et al. (2001), Klette and Kortum (2004), Acemoglu et al. (2018), and Akcigit and Kerr (2018).

The model's treatment of where inventors work relates to the literature on the boundaries of the firm, going back to Coase (1937); important examples include Grossman and Hart (1986), Hart and Moore (1990), and Hart and Moore (2008). Closest are Aghion and Tirole (1994) and Schmitz (2005), who analyze the implications of the ownership of innovations. This paper casts these ideas in a quantitative general equilibrium setting with heterogeneous inventors and heterogeneous firms. My findings show that endogenous firm boundaries matter quantitatively for both firm-level results and the aggregate growth rate.

This paper is also connected to the discussion on intellectual property right protection. Examples include Boldrin and Levine (2013), Galasso and Schankerman (2015), and Budish et al. (2015). I focus on the general equilibrium implications of tradability. My results are consistent with the evidence derived from two natural experiments by Ma (2022) and Acikalin et al. (2022); namely, decreasing tradability undermines small firms disproportionately, and more innovations are created in large firms.

The rest of the paper is organized as follows: Section 1.2 describes the model and characterizes the equilibrium. Section 1.3 describes the calibration.
Section 1.4 reports the quantitative results and counterfactuals, and Section 1.5 concludes.

1.2 A Theoretical Model

I build a theoretical model where firms compete to attract inventors to innovate in-house. The goal is to study the allocation of innovations across firms and how the innovation market affects firms' growth. There are two problems: in the primary market, an inventor chooses the size of the firm she wants to work for; in the secondary market, firms buy and sell innovations with each other. I first introduce the modeling environment and then describe the equilibrium.

1.2.1 Environment

Time is continuous. There are four types of agents: households, inventors, intermediate goods producers, and final goods producers. A representative household provides labor and consumes final goods. Inventors use effort to produce innovations in intermediate firms and consume final goods. There is a continuum of intermediate firms. Each produces one unique type of product and uses innovations to improve its product quality. The final good producers assemble intermediate goods to produce the final good.

Preferences

The household is long-lived. It supplies one unit of labor inelastically. The utility function is

$$U_H = \int_0^\infty e^{-\rho t} \log(C_H(t))\,dt, \tag{1.1}$$

where $\rho > 0$ is the discount rate and $C_H(t)$ is the consumption of the household.

An inventor is short-lived and lives for a time interval of length $dt$. There is a continuum of inventors of measure 1 in every period. An inventor provides effort $e_I$ to produce inventions. She is risk averse and has a mean-variance utility:

$$U_I(c_I, e_I) = E(c_I) - \frac{\mathrm{var}(c_I)}{2\bar{q}} - R(e_I)\bar{q}, \tag{1.2}$$

where $c_I$ is the consumption, $e_I$ is the effort level, and $R(e_I)\bar{q}$ is the associated cost. $\bar{q}$ (defined below) is the average quality in the economy. The cost is scaled by $\bar{q}$ to keep the problem stable over time. Denote the inventors' aggregate consumption by $C_I$.

Technology

Firms are owned by the household.
The final good producers produce final goods using a continuum of intermediate goods $j \in [0, N_F]$ with a production technology similar to that of Akcigit and Kerr (2018):[7]

$$Y(t) = \frac{1}{1-\beta} \int_0^{N_F} q_j^{\beta}(t)\, y_j^{1-\beta}(t)\,dj. \tag{1.3}$$

In this function, $q_j(t)$ is the quality of the intermediate good $j$, and $y_j(t)$ is its quantity. I normalize the price of the final good to one in every period. The final good producers are perfectly competitive, taking input prices as given. Henceforth, I will drop the time index $t$ when it does not cause confusion.

[7] The difference is that in my specification, the final good producers only use intermediate goods, not labor, to assemble the final good. My model yields similar results if the final good producers use labor as well.

The final goods are consumed by the household and inventors. The resource constraint of the economy is:

$$Y = C_H + C_I. \tag{1.4}$$

There is a continuum of measure $N_F$ of risk-neutral firms producing intermediate goods. Each firm produces one kind of good, using a linear technology with labor as the only input:

$$y_j = \bar{q}\, l_j, \tag{1.5}$$

where $l_j$ is the labor input and $\bar{q} = \frac{1}{N_F}\int_0^{N_F} q_j\,dj$ is the average quality. This means that innovations have a positive externality, similar to Romer (1986). The cost is linear in the wage $w$, which firms take as given. The labor market satisfies the constraint:

$$\int_0^{N_F} l_j\,dj \le 1. \tag{1.6}$$

The production technologies, together with the market setting on innovation, ensure that a firm's value $V(q_j)$ is linear in quality $q_j$:

$$V(q_j) = \nu q_j, \tag{1.7}$$

where $\nu$ is endogenous. The proof is in Section 1.2.2.

This paper focuses on the balanced growth path. I normalize the variables using the average quality $\bar{q}$. I denote the normalized variables using a tilde:

$$\tilde{q}_j \equiv \frac{q_j}{\bar{q}}, \qquad \tilde{Q} \equiv \frac{Q}{\bar{q}}, \qquad \tilde{V}(\tilde{q}) \equiv \frac{V(q_j)}{\bar{q}} = \nu \tilde{q}_j, \tag{1.8}$$

where $Q \equiv \int_0^{N_F} q_j\,dj$ is the total technology stock in the economy.

A firm's quality $q$ is affected by both business shocks and innovations.
A business shock arrives at a Poisson rate normalized to 1, and its size $\delta$ is randomly drawn from the distribution $\Xi(\delta)$. When hit by a shock, the firm's quality becomes $q + \delta q$.

In intermediate firms, innovations are produced by inventors using effort. Each inventor is born with one innovative idea of type $\theta$, which measures the idea's effort sensitivity ($\theta \in (0,1)$, $\theta \sim \Psi(\theta)$). She joins a firm, either an incumbent or a start-up, and exerts unobservable effort $e_I$ to transform the idea into an innovation. Given $e_I$, an innovation arrives with the instantaneous Poisson flow rate:

$$\lambda_\theta(\theta, e_I) = \mu e_I^{\theta}. \tag{1.9}$$

The innovation production function is based on the growth theory literature (Romer, 1990; Klette and Kortum, 2004; Akcigit and Kerr, 2018). The literature usually treats $\theta$ as a parameter, whereas here, $\theta$ is heterogeneous across innovations. This allows me to consider the mapping between innovations and firms. $\theta$ is the effort elasticity in the innovation production function, and I refer to inventors with high $\theta$ as effort-sensitive inventors. Additionally, in the literature, a unified firm produces and implements innovations; in my model, however, it is the inventor who creates innovations, and the firm only enjoys the outcome.

It is costly to work, and the flow cost of choosing effort $e_I$ is $R(e_I)\bar{q}$. Thus, without incentives, the inventor chooses $e_I = 0$. Meanwhile, she can always effortlessly make up a useless innovation that does not improve quality. The usefulness is not verifiable.

If the inventor creates a useful innovation, a patent is granted. An innovation's step size $z$ is drawn from a distribution $\Phi(z)$. By implementing an innovation, a firm's quality increment is $\Delta q$, where

$$\Delta q = \begin{cases} \gamma_L z Q & \text{with probability } 1 - h(\tilde{q}), \\ \gamma_H z Q & \text{with probability } h(\tilde{q}), \end{cases} \qquad \gamma_H > \gamma_L. \tag{1.10}$$

$h(\tilde{q})$ is the probability that a firm has complementarity with the innovation. I assume that $h'(\tilde{q}) > 0$; larger firms are more likely to have complementarity with an innovation.
Complementarity can mean that a firm can apply the technology to a larger production scope, that the innovation suits its business operation, or that the firm has enough liquidity to meet potential requirements in commercialization. $\gamma_L$ and $\gamma_H$ capture the different efficiency in applying the technology. When there is complementarity, the firm is the efficient user of the innovation. In this case, the quality increment is $\Delta q = \gamma_H z Q$. Otherwise, the quality improvement is lower. In this case, there are always some firms that have complementarity.[8]

[8] Akcigit et al. (2016) and Ma (2022) also consider technology mismatch, but in a different setting. In their papers, if there is a mismatch, the firm cannot use the innovation, while in my model, the firm can still apply it.

Because the firm value is linear in its quality, given an innovation step size, its value to any given firm only depends on the complementarity. If the firm is an efficient user, then there is no gain from trade. Otherwise, there is a positive gain from trade. All firms try to sell innovations with which they do not have complementarity, and buyers are the firms that complement the innovation. For each innovation, assume that there are at least two firms that have complementarity.

Information Structure

After an inventor is born, the type $\theta$ is publicly known. Then the inventor chooses which firm to work for and the effort $e_I$. The effort $e_I$ is unobservable and unverifiable. Hence, contracts cannot be contingent on the effort level.

If the inventor creates an innovation, the step size is not verifiable, which means the existence of a useful innovation cannot be verified. Therefore, contracts cannot be contingent on it either. The firm that owns the innovation knows its step size, but cannot prove it to others. Hence, firms sell and buy innovations under asymmetric information. The complementarity is public information: all firms know which firms have complementarity with the innovation.
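As a rough numerical illustration of the innovation technology in equations (1.9) and (1.10), the sketch below computes the Poisson arrival rate and the expected quality increment for a firm whose probability of complementarity is h. The function names and every numerical value used with them are hypothetical placeholders chosen for illustration; they are not the calibrated parameters of the model.

```python
import math


def arrival_rate(theta: float, effort: float, mu: float = 1.0) -> float:
    """Poisson flow rate of an innovation, eq. (1.9): mu * effort**theta."""
    return mu * effort ** theta


def effort_elasticity(theta: float, effort: float, mu: float = 1.0,
                      step: float = 1e-6) -> float:
    """Numerical elasticity d log(rate) / d log(effort) by central differences."""
    up = math.log(arrival_rate(theta, effort * (1 + step), mu))
    down = math.log(arrival_rate(theta, effort * (1 - step), mu))
    return (up - down) / (math.log(1 + step) - math.log(1 - step))


def expected_increment(z: float, Q: float, h: float,
                       gamma_low: float, gamma_high: float) -> float:
    """Expected quality increment implied by eq. (1.10).

    With probability h the firm has complementarity (gamma_high applies),
    otherwise gamma_low applies; gamma_high > gamma_low is assumed.
    """
    return z * Q * ((1 - h) * gamma_low + h * gamma_high)
```

Under these assumptions the numerical elasticity returned by `effort_elasticity` coincides with theta, which is why theta can be read as the effort sensitivity of an idea, and `expected_increment` rises with h, so a firm that is more likely to have complementarity expects a larger gain from keeping the same innovation.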
Employment Contracting Problem

In the primary market, a firm hires an inventor by offering an employment contract. I assume that the employment contract has two dimensions: a constant wage $T$ and a share of equity $a\in[0,1]$.

One may think that a better tool for providing incentives would be a bonus contingent on the delivery of an innovation, since it would avoid exposing inventors to unrelated shocks. However, inventors can earn bonuses effortlessly by providing useless innovations. Therefore, a bonus cannot solve the moral hazard problem. A firm could try to write a detailed contract that directly links the inventor's innovation to the firm's profit increase and use this link as the basis for a bonus. I follow the incomplete contract literature (Grossman and Hart, 1986), which assumes that it is very hard to specify all innovation outcomes in contracts and that, in practice, such outcomes are usually not contractible (Acemoglu, 1996; Frésard et al., 2020). I therefore assume that firms can only use equity and wages in employment contracts, as is commonly seen in the real world (Brickley and Hevert, 1991).

The timing is as follows. All firms simultaneously offer firm-inventor-specific contracts to inventors. After viewing all contracts, the inventor chooses her favorite one and joins the firm. If the best contract is offered by a firm with $q=0$, we call it a start-up. Even in this case, the inventor neither bears all the risk nor holds 100% of the equity: I assume that firms with $q=0$ (e.g., venture capital funds) still offer contracts with endogenous $a$.

Secondary Market Setup

Firms can trade unwanted innovations on secondary markets. Sellers are the firms that do not have complementarity with their innovations. Each innovation is unique and not substitutable by others, so there is one market for each innovation. Buyers are the firms that have complementarity with the innovation. For each innovation for sale, a firm becomes a potential buyer with a probability proportional to its normalized quality $\tilde q$.
For an innovation, there are at least two potential buyers.

Firms sell and buy under incomplete information because the step size is private information. Firms compete to buy innovations in Bertrand competition. Each buyer simultaneously offers a price $p_z$, and the seller chooses whether to accept one. It is a one-shot game: neither sellers nor buyers keep track of past trades. This setting leads to a standard lemons market (Akerlof, 1978).

Entry and Exit

There are two types of entry. One is innovation-related: it happens when an inventor chooses to work in a firm with $\tilde q=0$, successfully creates an innovation, and decides to keep it. Denote the number of innovation-related entrants by $\lambda_I$. The other is exogenous entry, where firms enter for reasons other than innovations. Denote the number of exogenous entrants by $\lambda_0$. In both cases, upon entry, the firm draws a quality $\tilde q$ from a distribution $\tilde F_{q0}(\tilde q)$ and incurs a cost equal to the firm value. This represents the spillover from incumbents to entrants. Assume that $\tilde f_{q0}(\tilde q/\kappa)=\tilde f_q(\tilde q)/\kappa$. When $\kappa<1$, entrants are on average smaller. When the new entrant owns an innovation, its quality $q$ is boosted by $\Delta q$ because of the innovation.

Firms face an exogenous exit rate $\tau$. I focus on a balanced growth path on which entry equals exit:

$$\tau N_F=\lambda_I+\lambda_0. \tag{1.11}$$

1.2.2 Equilibrium

I now characterize the equilibria of the economy in which the aggregate variables $(Y,C,R,w,\bar q)$ grow at the constant rate $g$.

Production

The final good producer chooses $\{y_j\}_j$ to maximize its profit using the technology described in Section 1.2.1:

$$\max_{\{y_j\}}\ \frac{1}{1-\beta}\int_0^{N_F}q_j^{\beta}y_j^{1-\beta}\,dj-\int_0^{N_F}y_jp_j\,dj. \tag{1.12}$$

The first-order condition yields the demand function faced by intermediate firms, $p_j=q_j^{\beta}y_j^{-\beta}$. Each intermediate good is produced by the corresponding firm $j\in[0,N_F]$ using only labor, $y_j=\bar q\,l_j$, where $\bar q=\frac{1}{N_F}\int_0^{N_F}q_j\,dj$ is the average quality and $l_j$ is the labor input.
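The production block just described has a closed-form solution, which the derivation that follows states as Equations (1.14) to (1.16). As a hedged numerical check, the sketch below evaluates those closed forms for a small set of representative firms and verifies that the labor market clears and that output and profit are linear in quality. The function name, the three quality levels, and `N_F=2.0` are illustrative assumptions; `beta=0.109` is the value the calibration section uses.

```python
def production_equilibrium(q, N_F, beta):
    """Evaluate the intermediate-goods block for firms of total mass N_F.

    q is a list of qualities for equally weighted representative firms,
    so each entry stands for a mass N_F / len(q) of firms.
    """
    n = len(q)
    q_bar = sum(q) / n                                   # average quality
    w = N_F ** beta * (1.0 - beta) * q_bar               # wage, Equation (1.15)
    scale = (q_bar * (1.0 - beta) / w) ** (1.0 / beta)   # common factor in (1.14)
    y = [qj * scale for qj in q]                         # output per firm
    l = [yj / q_bar for yj in y]                         # labor demand per firm
    p = w / (q_bar * (1.0 - beta))                       # price, identical across firms
    profit = [yj * p - w * lj for yj, lj in zip(y, l)]
    labor_demand = sum(l) * N_F / n                      # integral of l_j over [0, N_F]
    return {"w": w, "y": y, "p": p, "profit": profit, "labor": labor_demand}


eq = production_equilibrium(q=[0.5, 1.0, 2.5], N_F=2.0, beta=0.109)
```

In this sketch `eq["labor"]` should equal one up to rounding, and each firm's profit should be proportional to its quality, which is what makes the value function linear in $q$.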
Intermediate good producers are in monopolistic competition and choose $l_j,p_j,y_j$ to maximize their profit, given the wage level $w$:

$$\max_{l_j,p_j,y_j}\ y_jp_j-wl_j\quad\text{s.t.}\quad y_j=\bar q\,l_j,\qquad p_j=q_j^{\beta}y_j^{-\beta}. \tag{1.13}$$

The first-order condition yields

$$y_j=q_j\left[\frac{\bar q(1-\beta)}{w}\right]^{\frac{1}{\beta}},\qquad l_j=\frac{y_j}{\bar q},\qquad p_j=\frac{w}{\bar q(1-\beta)}. \tag{1.14}$$

In each period, labor market clearing requires $\int_0^{N_F}l_j\,dj=1$, which gives $\frac{1}{\bar q}\int_0^{N_F}q_j\left[\frac{\bar q(1-\beta)}{w}\right]^{\frac{1}{\beta}}dj=1$. Solving for the wage $w$,

$$w=N_F^{\beta}(1-\beta)\bar q. \tag{1.15}$$

Plugging the wage back into the intermediate firm's problem, both production $y_j$ and profit $\pi_j$ are linear in quality:

$$y_j=\frac{q_j}{N_F},\qquad \pi_j=\frac{\beta q_j}{N_F^{1-\beta}}. \tag{1.16}$$

I drop the subscript $j$ from firm-level variables when it does not cause confusion. In this model, firm size is linear in quality, and I use $q$ to denote firm size when it does not cause confusion.

Intermediate firms are the ones that hire inventors to innovate. Because of competition, any value from innovations is captured by inventors, and any value from acquisitions is captured by the seller. The discounted value of being a firm of quality $q$ is therefore the same as the net present value under the assumption that the firm never innovates. Thus, the value function of intermediate firm $q$ at time $t$ can be written as

$$V(q,t)=\int_t^{\infty}e^{-(r+\tau)(s-t)}\frac{\beta q}{N_F^{1-\beta}}\,ds=\nu q, \tag{1.17}$$

where $\nu=\frac{\beta}{(r+\tau)N_F^{1-\beta}}$. The value function is linear in $q$ and does not depend on time. This result implies that a given quality improvement $\Delta q$ has the same value for every firm.

Aggregate production is linear in average quality $\bar q$. The resource constraint of the economy is $Y=C_H+C_I$, where $C_H$ is consumption and $C_I$ is the total R&D spending in each period (denoted $R$ above). The Euler equation is

$$g=\frac{\dot Y}{Y}=\frac{\dot C_H}{C_H}=\frac{\dot{\bar q}}{\bar q}=r-\rho. \tag{1.18}$$

Secondary market

The secondary innovation market is where firms buy and sell innovations. For each innovation, there are at least two buyers. Buyers engage in Bertrand competition and offer prices to the seller.
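The seller's cutoff rule and the buyers' zero-profit pricing, formalized in Equations (1.19) and (1.20), combine into a single condition on the threshold, $\gamma_H E[z\mid z\le\hat z]=\gamma_L\hat z$, because the common factor $\nu\tilde Q$ cancels. A minimal sketch of solving it by bisection follows. It assumes a standard Pareto law with CDF $1-(m/z)^{\alpha}$ and $\alpha>1$, which may differ from the exact parameterization in the calibration section, and it assumes $\gamma_H/\gamma_L$ is comfortably above one. The function names and the parameter values in the example are hypothetical.

```python
def truncated_mean(z_hat, m, alpha):
    """E[z | z <= z_hat] for a Pareto law with CDF 1 - (m / z)**alpha, alpha > 1."""
    cdf = 1.0 - (m / z_hat) ** alpha
    partial = alpha * m ** alpha * (z_hat ** (1.0 - alpha) - m ** (1.0 - alpha)) / (1.0 - alpha)
    return partial / cdf


def lemons_threshold(gamma_L, gamma_H, m, alpha, tol=1e-12):
    """Bisection on gamma_H * E[z | z <= z_hat] - gamma_L * z_hat = 0."""
    def excess(z_hat):
        # Buyers' expected surplus per unit of nu * Q at the candidate cutoff.
        return gamma_H * truncated_mean(z_hat, m, alpha) - gamma_L * z_hat

    lo = m * (1.0 + 1e-6)    # just above the minimum step size, surplus is positive
    hi = 2.0 * lo
    while excess(hi) > 0.0:  # widen the bracket until buyers would lose money
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


z_hat = lemons_threshold(gamma_L=1.0, gamma_H=1.2, m=1.0, alpha=3.0)
```

Innovations with step size above `z_hat` are withheld by sellers, which is the adverse-selection wedge that keeps some innovations away from their efficient users.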
A seller accepts an offer only when it is profitable, which means $\gamma_L z\nu\tilde Q\le\tilde p_z$. The left-hand side is the value of the innovation if the seller keeps it, whereas the right-hand side is the value if the seller sells it. The threshold $\hat z$ satisfies

$$\gamma_L\hat z\nu\tilde Q=\tilde p_z. \tag{1.19}$$

Buyers set the price $\tilde p_z$ according to the zero-profit condition

$$\int_0^{\hat z}\left(\gamma_H\nu z\tilde Q-\tilde p_z\right)\phi(z)\,dz=0, \tag{1.20}$$

where $\gamma_H\nu z\tilde Q$ is the innovation value to any buyer with complementarity, $\tilde p_z$ is the price of the innovation, and $\hat z$ is the step size threshold. If $z\le\hat z$, an innovation is sold. If a firm is not the efficient user of an innovation, it chooses to sell the innovation if its step size is lower than the threshold $\hat z$. Otherwise, the firm uses the innovation to improve its own technology.

For each innovation, the probability of being sold is $(1-h(\tilde q))\Phi(\hat z)$, where $1-h(\tilde q)$ is the probability that the innovation does not have complementarity with the firm $\tilde q$ where the inventor works, and $\Phi(\hat z)$ is the cumulative distribution function of $z$ at $\hat z$. The total amount of innovations sold per unit of time is

$$\Phi(\hat z)\int_0^1\lambda^*_\theta\,\big(1-h(\tilde q^*(\theta))\big)\,\psi(\theta)\,d\theta, \tag{1.21}$$

where $\lambda^*_\theta$ is the equilibrium innovation arrival rate of type $\theta$, and $\tilde q^*(\theta)$ denotes the firm for which inventor $\theta$ works in equilibrium.

Recall that the probability of buying an innovation is $\lambda_b\tilde q$, because the probability of buying an innovation is assumed to be proportional to $\tilde q$, and the constant $\lambda_b$ is what I solve for. The secondary market clearing gives

$$N_F\int\lambda_b\tilde q\,f_q(\tilde q(\theta))\,d\tilde q=\Phi(\hat z)\int_0^1\lambda^*_\theta\,\big(1-h(\tilde q^*(\theta))\big)\,\psi(\theta)\,d\theta, \tag{1.22}$$

which implies that

$$N_F\lambda_b=\Phi(\hat z)\int_0^1\lambda^*_\theta\,\big(1-h(\tilde q^*(\theta))\big)\,\psi(\theta)\,d\theta. \tag{1.23}$$

It pins down the endogenous variable $\lambda_b$ and the probability of buying an innovation. The innovation value for the original firm $\tilde q$ satisfies

$$x(z,\tilde q)=\begin{cases}\gamma_L\nu\hat z\tilde Q & \text{no complementarity and } z\le\hat z,\\ \gamma_L\nu z\tilde Q & \text{no complementarity and } z>\hat z,\\ \gamma_H\nu z\tilde Q & \text{with complementarity.}\end{cases}$$
(1.24)

When there is no complementarity and the step size $z$ is lower than the threshold $\hat z$, the firm sells the innovation on the innovation market at the price $\gamma_L\nu\hat zQ$. If the step size $z$ is higher than the threshold, the firm keeps it, and it is worth $\gamma_L\nu zQ$. When there is complementarity, the firm is the efficient user of the innovation; it then implements the innovation in-house, and its value is $\gamma_H\nu zQ$. In a first-best allocation, all innovations would be sold to the efficient user. The asymmetric information on the secondary market, however, prevents this from happening.

The expected innovation value $E(x(\tilde q))$ can be written as

$$E(x(\tilde q))=\gamma_L\big[\Phi(\hat z)\hat z+(1-\Phi(\hat z))E(z\mid z>\hat z)\big](1-h(\tilde q))\nu Q+\gamma_HE(z)h(\tilde q)\nu Q. \tag{1.25}$$

It increases with the firm size $\tilde q$ because $h(\tilde q)$ is increasing in $\tilde q$. Therefore, conditional on an innovation being created, its value is higher if it is in a large firm.

Primary market

The primary market is where firms compete to hire inventors using a combination of equity $a$ and wage $\tilde T$. The setup yields a principal-agent framework. Firms are risk neutral, and inventors are risk averse. Firms enjoy the innovations produced by inventors, but it is costly for inventors to work, and firms cannot monitor the effort. Thus, firms want to split the surplus with the inventor by offering a constant wage; meanwhile, firms need to incentivize the inventor to exert effort by offering equity. Because of competition, the firm's problem is the same as maximizing the utility it offers to an inventor, subject to the zero-profit condition:

$$\begin{aligned}
\max_{a,T,e_I,c_I}\ & U_I\big(a,\tilde T;\tilde q,\theta,e_I\big)=E(\tilde c_I)-\tfrac{1}{2}\operatorname{var}(\tilde c_I)-R(e_I)\,dt\\
\text{s.t.}\ & \lambda_\theta(\theta,e_I)E(x(\tilde q))\,dt-E(\tilde c_I)\ge 0\\
& e_I=\arg\max\big\{U_I\big(a,\tilde T;\tilde q,\theta,e_I\big)\big\}\\
& E(\tilde c_I)=a\big[E\big(\tilde V_0(\tilde q)\big)+\lambda_\theta(\theta,e_I)E(x(\tilde q))\,dt\big]+\tilde T\\
& \operatorname{var}(\tilde c_I)=a^2\big[\lambda_\theta(\theta,e_I)\sigma_x^2(\tilde q)+\sigma_0^2(\tilde q)\big]\,dt
\end{aligned}\tag{1.26}$$

The optimal contract is $\{a^*(\theta,\tilde q),T^*(\theta,\tilde q)\}$.
The first line is the objective function $U_I(a,\tilde T;\tilde q,\theta,e_I)$, representing inventor $\theta$'s utility when working for firm $\tilde q$ with effort level $e_I$; $a$ and $\tilde T$ characterize the contract offered by the firm, $\tilde c_I$ is the inventor's consumption level, and $R(e_I)\,dt$ is the cost of effort. The second line is the firm's individual rationality constraint, where $x(\tilde q)$ is the value of an innovation to firm $\tilde q$. Its expectation is multiplied by the probability of creating an innovation, $\lambda_\theta(\theta,e_I)\,dt$, which gives the expected payoff of hiring an inventor. The cost of hiring is $E(\tilde c_I)$. The firm participates only if the expected gain is nonnegative. The third line is the inventor's incentive compatibility constraint. The inventor chooses an effort level $e_I$ to maximize the expected utility $U_I(a,\tilde T;\tilde q,\theta,e_I)$ given firm quality $\tilde q$ and the compensation scheme $\{a,\tilde T\}$. The fourth line describes the inventor's expected consumption. The first part, $a\big[\tilde V_0(\tilde q)+\lambda_\theta(\theta,e_I)E(x(\tilde q))\,dt\big]$, is the expected value of the equity share $a$, which is the sum of the firm value without innovations and the expected value of an innovation. The other component, $\tilde T$, is the constant transfer from the firm to the inventor. The fifth line shows the variance of the inventor's consumption. The exposure is the equity share $a$. The uncertainty comes from two sources: the innovation-related part $\lambda_\theta(\theta,e_I)\sigma_x^2(\tilde q)$ and the innovation-independent part $\sigma_0^2(\tilde q)$. In start-ups, the second term is zero. I explain both in detail later.

Given contracts, the inventor with idea $\theta$ chooses firm $\tilde q^*(\theta)\in[0,\infty)$ and exerts effort $e_I$ to maximize her utility by solving the following problem:

$$\max_{\tilde q}\ U_I\big(a(\tilde q,\theta),\tilde T(\tilde q,\theta),e_I(\tilde q,\theta),\tilde q,\theta\big). \tag{1.27}$$

Here $a(\tilde q,\theta)$, $\tilde T(\tilde q,\theta)$, and $e_I(\tilde q,\theta)$ are the solutions to the firm's problem in Equation (1.26). If $\tilde q^*(\theta)>0$, the inventor works in an incumbent firm; $\tilde q^*(\theta)=0$ means the inventor joins a new firm.
The equilibrium effort is $e^*_I(\theta)$, and the corresponding arrival rate is $\lambda^*_\theta=\lambda_\theta(\theta,e^*_I(\theta))$.

The firm value is affected by four factors: in-house innovation, purchased innovation, business shocks, and exogenous death. The firm value without in-house innovations, $\tilde V_0(\tilde q)$, is

$$\tilde V_0(\tilde q)=\nu\tilde q+\begin{cases}
-\nu\tilde q & \text{with probability } \tau\,dt,\\
\delta\nu\tilde q & \text{with probability } dt,\\
\big(\gamma_H\nu z\tilde Q-\tilde p_z\big)\big|_{z\le\hat z} & \text{with probability } \lambda_b\tilde q\,\Phi(\hat z)\,dt,\\
0 & \text{with probability } 1-\big(\tau+1+\lambda_b\tilde q\,\Phi(\hat z)\big)\,dt.
\end{cases}\tag{1.28}$$

The first line is when the firm dies exogenously. The second line is when the firm is hit by a business shock $\delta$, which arrives at rate 1. The third line is when the firm purchases an innovation. Though the expected gain from purchasing an innovation is zero, the realized gain usually is not: it is the gap between the firm value improvement and the price paid. The quality improves by $\gamma_Hz\tilde Q$, where $z$ is a random draw from the $\Phi(z)$ distribution truncated at $\hat z$. The cost of buying an innovation is $\nu\gamma_L\hat z\tilde Q$. The last line is the firm value otherwise.

The uncertainty in the firm equity value leads to the variance in the inventor's consumption $c_I$. Because in-house innovation is independent of other activities, the firm-level risk can be decomposed into two parts: in-house innovation-related risk and the rest. The former is $\lambda_\theta(\theta,e_I)\sigma_x^2(\tilde q)\,dt$, where $\sigma_x^2(\tilde q)$ is the second moment of the innovation value. The latter is $\sigma_0^2(\tilde q)\,dt$, which contains all shocks unrelated to in-house innovations, including the risk from death, business shocks, and the unknown step size in innovation trade. It can be written as

$$\sigma_0^2(\tilde q)=\nu^2\tilde q^2\Big[\tau+\sigma_\delta^2+\lambda_b\tilde q\,\Phi(\hat z)\,E\big((\gamma_Hz-\gamma_L\hat z)^2\mid z\le\hat z\big)\Big]. \tag{1.29}$$

This risk increases with the firm size $\tilde q$. Therefore, in a larger firm, the innovation-related risk takes a smaller share of the total equity variance.

The firm determines how much equity to offer to the inventor, which depends on the firm size. In start-ups, the equity value depends solely on the inventor's innovation.
Thus, start-ups can offer compensation in the form of equity and wages restricted only by the inventor's risk aversion, so the inventor can be well incentivized to put effort into innovating. In larger firms, by contrast, the inventor's innovation makes up only a small fraction of the equity value; much of the equity value depends on stochastic factors unrelated to the inventor's effort (e.g., the success of other product lines). If larger firms were to offer the same incentive as start-ups for the same expected compensation, inventors would be exposed to many unrelated risks and face a highly levered compensation package, which is not attractive to risk-averse inventors. Hence, both the optimal equity $a$ and the resulting incentive decrease with firm size.

After seeing all contracts, the inventor picks the contract offered by firm $\tilde q^*(\theta)$, which gives her the highest utility. Equation (1.27) can be written as

$$\begin{aligned}
\max_{\tilde q,e_I}\ & U_I(\tilde q;\theta,e_I)=\lambda_\theta(\theta,e_I)E(x(\tilde q))\,dt-\tfrac{1}{2}a^2(\tilde q)\big[\lambda_\theta(\theta,e_I)\sigma_x^2(\tilde q)+\sigma_0^2(\tilde q)\big]\,dt-R(e_I)\,dt\\
\text{s.t.}\ & e_I=\arg\max\{U_I(\tilde q;\theta,e_I)\}.
\end{aligned}\tag{1.30}$$

The main trade-off an inventor faces is between a higher chance of creating an innovation and a higher value once an innovation is created, which is represented by the first term of Equation (1.30), $\lambda_\theta(\theta,e_I)E(x(\tilde q))$. The two elements work in opposite directions. A firm with higher quality $\tilde q$ can on average implement an innovation more efficiently, which means that the value added of an innovation, $E(x(\tilde q))$, is higher; meanwhile, a smaller firm can offer a better incentive and hence a higher arrival rate $\lambda_\theta(\theta,e_I)$. The inventor's decision then depends on the relative strength of the two forces, which, in turn, is determined by the inventor type. For an effort-sensitive inventor with high $\theta$, the incentive channel is more important. As a result, she chooses to work for a small firm. Similarly, when an inventor's idea has low effort elasticity, the value channel is more crucial, and she works for a large firm.
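The mechanism behind the declining equity share can be illustrated numerically. The sketch below holds the firm fixed, pins down effort from the inventor's first-order condition under the quadratic cost $R(e_I)=e_I^2/2$ (the functional form adopted in the calibration section, with the $\bar q$ scaling and the $dt$ factors dropped), and then searches a grid of equity shares for the one that maximizes the objective in Equation (1.30). The first-order condition $e_I^{2-\theta}=\mu\theta\,(aE(x)-\tfrac12a^2\sigma_x^2)$ is my own derivation from Equation (1.26), and all function names and parameter values are hypothetical.

```python
def effort(a, theta, mu, Ex, var_x):
    """Inventor's effort from her first-order condition when R(e) = e**2 / 2."""
    marginal_payoff = mu * theta * (a * Ex - 0.5 * a ** 2 * var_x)
    if marginal_payoff <= 0.0:
        return 0.0
    return marginal_payoff ** (1.0 / (2.0 - theta))


def inventor_utility(a, theta, mu, Ex, var_x, var_0):
    """Flow utility in Equation (1.30) at the effort the contract induces."""
    e = effort(a, theta, mu, Ex, var_x)
    lam = mu * e ** theta                       # arrival rate, Equation (1.9)
    return lam * Ex - 0.5 * a ** 2 * (lam * var_x + var_0) - 0.5 * e ** 2


def optimal_equity(theta, mu, Ex, var_x, var_0, grid_size=1001):
    """Grid search over equity shares a in [0, 1]."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    return max(grid, key=lambda a: inventor_utility(a, theta, mu, Ex, var_x, var_0))


# var_0 = 0 mimics a start-up; a larger var_0 mimics a larger firm.
a_startup = optimal_equity(theta=0.5, mu=1.0, Ex=1.0, var_x=0.5, var_0=0.0)
a_large = optimal_equity(theta=0.5, mu=1.0, Ex=1.0, var_x=0.5, var_0=2.0)
```

In this toy parameterization the equity share chosen when `var_0` is large is lower than the start-up share, in line with the argument above; the sketch deliberately ignores the fact that $E(x(\tilde q))$ also rises with firm size.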
Next, I change the measure of the innovation arrival rate from "per inventor" to "per firm". For each inventor, the equilibrium arrival rate of the innovations she generates in firm $\tilde q$ is $\lambda^*_\theta$. In firm $\tilde q$, the innovation arrival rate is affected by both the arrival rate per inventor and the number of inventors per firm. Denote the arrival rate in firm $\tilde q$ by $\lambda_q(\tilde q)$ and the firm probability distribution function by $f_q(\tilde q)$. The total amount of innovations in a firm should equal the total innovations made by the inventors who work for the firm. The inventor market clearing gives

$$N_F\lambda_q(\tilde q(\theta))f_q(\tilde q(\theta))\,d\tilde q=\lambda^*_\theta\psi(\theta)\,d\theta, \tag{1.31}$$

which pins down the innovation arrival rate per firm.

Entry and Exit

Recall that there are two types of entry in this model. One is innovation-related. It happens when an inventor chooses to work in a firm with $q=0$, successfully creates an innovation, and decides to keep it. The number of firms that enter this way can be written as

$$\lambda_I=\int_{\tilde q^*(\theta)=0}\lambda^*_\theta\ \overbrace{\Big[\underbrace{(1-\Phi(\hat z))(1-h(0))}_{\text{Non-complementary \& big step size}}+\underbrace{h(0)}_{\text{Complementary}}\Big]}^{\text{Innovations kept}}\ \psi(\theta)\,d\theta. \tag{1.32}$$

The other type is exogenous entry, in which a firm enters the market without an innovation. The number of such entrants is given by the parameter $\lambda_0$. On the exit side, the number of firms that exit equals the number that enter:

$$\tau N_F=\lambda_I+\lambda_0. \tag{1.33}$$

I use (1.33) to solve for the measure of firms $N_F$.

Growth rate

The growth rate of the aggregate quality $\bar q$ can be written as

$$g=\frac{E_t(\bar q(t+dt))-\bar q(t)}{\bar q(t)\,dt}. \tag{1.34}$$

The growth rate is

$$\begin{aligned}
g={}&-\tau(1-\kappa)+\gamma_HE(z)\int h(q^*(\theta))\lambda^*_\theta\psi(\theta)\,d\theta\\
&+\big[\gamma_LE(z)+(\gamma_H-\gamma_L)E(z\mid z\le\hat z)F(\hat z)\big]\int\big[1-h(q^*(\theta))\big]\lambda^*_\theta\psi(\theta)\,d\theta.
\end{aligned}\tag{1.35}$$

It includes three parts. $-\tau(1-\kappa)$ is the quality destruction that would occur if there were no innovations. In each unit of time, a fraction $\tau$ of firms with an average quality $\bar q$ leave the economy, while the same fraction of firms enter the economy with an average quality $\kappa$.
The rest reports the quality improvement due to innovation. The remainder of the first line is the quality improvement from innovations that complement the firms in which they are created; the average quality improvement is $\gamma_HE(z)$. The second line measures the improvement that comes from innovations that do not complement the initial firm. If there were no secondary market, the average quality improvement would be $\gamma_LE(z)$. With the secondary market, some low-step-size innovations are sold to more efficient users, which increases the quality improvement by $(\gamma_H-\gamma_L)E(z\mid z\le\hat z)F(\hat z)$. Hence, when innovations are worth more or the arrival rate increases, the growth rate goes up.

Equilibrium

I end this section by summarizing the equilibrium. The in-house R&D expenditure $C_I$ of the economy can be written as

$$C_I=\int\lambda_\theta(\theta)E(x(q^*(\theta)))\psi(\theta)\,d\theta. \tag{1.36}$$

It captures all transfers made to inventors. The total firm-level R&D expenditure is the sum of in-house R&D spending $C_I$ and the spending on purchasing patents on the secondary market:

$$X=C_I+\lambda_bp_z. \tag{1.37}$$

Based on Equation (1.12), the equilibrium output level $Y$ is linear in $\bar q$,

$$Y=\frac{1}{1-\beta}\,\frac{\bar q}{N_F^{1-\beta}}, \tag{1.38}$$

and the consumption level is

$$C_H=Y-C_I. \tag{1.39}$$

Definition 1.
A balanced growth path of this economy, for any combination of $t$ and $q$, is the mapping between $q$ and $\theta$, the allocation $\{y^*_j\}_j, Y^*, X^*, C^*_I, C^*_H$, the prices $w^*, \{p^*_j\}_j, \hat z$, the growth rate $g^*$, the entry rate $\lambda^*_{eI}$, and the measure of firms $N^*_F$, such that
(1) for any $j\in[0,1]$, $y^*_j$ and $p^*_j$ satisfy Equation (1.14);
(2) wage $w^*$ satisfies Equation (1.15);
(3) measure of the intermediate producers $N^*_F$ satisfies Equation (1.33);
(4) the mapping is the solution of Equation (1.27);
(5) aggregate creative destruction $\tau_e$ satisfies Equation (1.33);
(6) the price offer $\hat z$ for inventions satisfies Equation (1.19);
(7) the entry rates $\lambda^*_{eI}$ satisfy Equation (1.32);
(8) in-house R&D spending $C^*_I$ satisfies Equation (1.36);
(9) total R&D spending $X$ satisfies Equation (1.37);
(10) aggregate output $Y^*$ satisfies Equation (1.38);
(11) aggregate consumption $C^*_H$ satisfies Equation (1.39); and
(12) steady-state growth rate $g^*$ satisfies Equation (1.35).

1.3 Calibration

I calibrate the model to the United States. I construct the moments using firm-level information and use patents as a proxy for innovations. Section 1.3.1 describes the data. Section 1.3.3 explains how I calibrate the model.

1.3.1 Data

This paper combines four datasets to calculate moments. The datasets are the United States Patent and Trademark Office (USPTO), the Center for Research in Security Prices (CRSP), the Merged CRSP-Compustat Database, and the linked CRSP-USPTO data provided by Kogan et al. (2017). In addition, I use the Survey on Business Strategies (ESEE) by Fundacion SEPI from Spain, which provides firm-level in-house R&D investment and outsourced R&D expenditure.⁹ The main sample period is from 1982 to 1997. I also calibrate the model to 1998-2010, and the results are reported in the appendix.

US Data Development

I combine the four datasets to obtain firm-level in-house patenting features. I start with the linked CRSP/USPTO dataset.
It covers all patents granted to public firms from 1926 to 2010. The dataset provides links between stocks and patents, and an estimated value for each patent. The value is estimated based on the stock market reaction in a three-day window after the announcement of a patent (Kogan et al., 2017). I deflate the value of patents using the Consumer Price Index (CPI) for all urban consumers from Federal Reserve Economic Data.

⁹ Spain has an employee invention policy similar to that of the US: firms can ask employees to give up ownership of all on-the-job innovations. Thus, firm innovation decisions in Spain are comparable with those in the US.

Next, I match this dataset with patent information. I download the supplemental information, including application date and grant date, from USPTO.¹⁰ I clean the data using the following steps. First, I keep only patents with nonmissing citation observations. Second, I omit any observations where the patent grant date is before the patent application date, or the patent is granted more than ten years after the initial application.¹¹ This gives me a dataset with all in-house patents owned by public firms.

Then, I link it with firm-level data. I download firm-level employment data from Compustat and stock data from CRSP. I deflate market capitalization using the CPI. I filter firms using three criteria: (1) positive employment, (2) location in the United States, and (3) at least one patent obtained during the sample period. My final sample is the universe of patenting US public firms. There are 3,175 unique firms in the sample. For comparison, Akcigit and Kerr (2018) use census data and report 23,927 firms in total. The discrepancy arises because my sample includes only public firms, while their sample contains private firms as well.

ESEE

I use the Survey on Business Strategies (ESEE) to evaluate the model's implications for untargeted moments (in-house versus purchased innovations).
ESEE is a panel survey of manufacturing firms in Spain conducted by Fundacion SEPI. The firms in the survey are selected through stratified, proportional, and systematic sampling with a random seed. It reports firms' R&D activities, including annual expenses on in-house innovations and external innovation-related activities. Although Spain is admittedly different from the United States, the employee invention laws in the two countries are similar regarding who owns the rights to employee inventions. Therefore, the innovation activities in the two countries are comparable. I use this dataset as a qualitative reference for the comparison of in-house versus external expenses.

¹⁰ For patents granted after 1976, high-quality details are available at PatentsView, a patent data visualization and analysis platform supported by USPTO.
¹¹ On average, it takes 2.4 years (29 months) for a patent to be granted.

Figure 1.1: Public Innovative Firm Size Distribution (cumulative distribution function of public firms)

Figure 1.2: Probability of Being Public Conditional on Firm Size (Pr(Public|Size))

1.3.2 Moment Estimation

A Statistical Model

The dataset described in the previous section only includes information for public firms. The problem with using public firm data is that the dataset includes large firms disproportionately, which means that the distribution is not representative; in particular, it lacks input from small firms. I assume that being a public firm is a random sampling weighted by firm size. Because firm size is linear in quality $q$ in the model, I use $q$ to denote firm size. Public firm size follows the distribution $g(q)$ (Figure 1.1), while the population distribution of all innovative firms is $f_q(q)$. I obtain $g(q)$ from the public firm data and need to estimate $f_q(q)$.
Assume that for a firm of size $q$, the probability of being public is $p(q)$, which depends only on firm size $q$. I estimate $p(q)$ parametrically using the generalized method of moments (GMM). $g(q)$ and $f_q(q)$ satisfy

$$f_q(q)p(q)=g(q)E(p(q)), \tag{1.40}$$

where $E(p(q))$ is the share of public firms among all patenting firms:

$$E(p(q))=\int f_q(q)p(q)\,dq. \tag{1.41}$$

There are 23,927 patenting firms in the economy, among which 3,175 are public. Therefore, $E(p(q))=0.13$. I assume that $p(q)$ takes the form of the cumulative distribution function of a shifted gamma distribution: $p(q)=\Gamma(q+q_0,\Gamma_a,\Gamma_b)$. I then use GMM to estimate the parameters $(q_0,\Gamma_a,\Gamma_b)$.

There are five moments in the data. First, because $f_q(q)$ is a probability distribution function, by definition it satisfies $\int_0^{\infty}f_q(q)\,dq=1$, which gives the first condition that $p(q)$ must meet:

$$\int_0^{\infty}\frac{g(q)}{p(q)}\,dq=\frac{1}{E(p(q))}. \tag{1.42}$$

Second, the average employment of all patenting firms is 1,805 (Akcigit and Kerr, 2018). Setting the unit of $q$ to 1,000 employees gives $E(q)=1.805$, which yields the moment

$$\int_0^{\infty}q\,\frac{g(q)}{p(q)}\,E(p(q))\,dq=1.805. \tag{1.43}$$

The remaining moments are the three quartiles. The $n$th quartile of the population, $q_n$, satisfies

$$\int_0^{q_n}f_q(q)\,dq=\frac{n}{4},\qquad n=1,2,3. \tag{1.44}$$

The quartiles of the size of all patenting firms are approximately 17, 70, and 370 employees (Akcigit and Kerr, 2018). This gives three moment conditions:

$$\int_0^{q_n}\frac{g(q)}{p(q)}\,E(p(q))\,dq=\frac{n}{4},\qquad n=1,2,3. \tag{1.45}$$

The five moments described in Equations (1.42), (1.43), and (1.45) overidentify the parameters. I set up the estimation so that the parameters must satisfy the first moment and assign uniform weights to the other four.

Table 1.1: Model Fit for Key Moments of the Statistical Model

Moments                                         Data     Model
Share of firms with fewer than 17 employees     0.25     0.25
Share of firms with fewer than 70 employees     0.50     0.45
Share of firms with fewer than 370 employees    0.75     0.76
Average firm size                               1,805    1,450
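The reweighting logic in Equations (1.40) to (1.45) can be sketched on a discrete sample: each public firm is weighted by the inverse of its probability of being public, which recovers population moments from the public-firm sample. The function below is a minimal illustration; its name, the three-firm sample, and the linear `toy_p` are hypothetical stand-ins and do not reproduce the shifted gamma form or the GMM step used in the text.

```python
def population_moments(sizes, p_public):
    """Reweight a public-firm sample to the population of patenting firms.

    sizes: firm sizes q observed in the public sample (draws from g).
    p_public: function q -> probability that a firm of size q is public.
    """
    inverse_p = [1.0 / p_public(q) for q in sizes]
    total = sum(inverse_p)
    share_public = len(sizes) / total           # sample analogue of Equation (1.42)
    weights = [w / total for w in inverse_p]    # f_q proportional to g / p, Equation (1.40)
    mean_size = sum(w * q for w, q in zip(weights, sizes))  # analogue of (1.43)

    def share_below(threshold):
        # Population share of firms smaller than the threshold, analogue of (1.45).
        return sum(w for w, q in zip(weights, sizes) if q < threshold)

    return share_public, mean_size, share_below


def toy_p(q):
    # Hypothetical selection rule: larger firms are more likely to be public.
    return q / 10.0


share_public, mean_size, share_below = population_moments([1.0, 2.0, 4.0], toy_p)
```

In an estimation along the lines of the text, one would search over the parameters of `p_public` until `share_public`, `mean_size`, and the three quartile shares line up with their targets.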
The estimated parameters are $q_0=6.90\times10^{-5}$, $\Gamma_a=0.70$, and $\Gamma_b=6.0645$. The estimated probability of being public is shown in Figure 1.2.

Combining the size distribution of public firms in Figure 1.1 and the probability of being public in Figure 1.2, Figure 1.3 reports the estimated distribution of firm sizes. The estimated moments of the firm size distribution are shown in Table 1.1.

Figure 1.3: Estimated Firm Size Distribution of Innovative Firms (cumulative distribution function against average firm size)
Notes: This is estimated using the public firm data and the statistical model.

Figure 1.4: Estimated Innovation Distributions across Innovative Firms (with more than 500 employees)
Notes: The cumulative share of innovations in firms with fewer than 500 employees is set to 10% according to Figueroa and Serrano (2019).

Innovation Distribution

I estimate the patent distribution across firms as an empirical counterpart of the innovation distribution, which is used as moments in the quantitative analysis. Because the public firm data contain only a limited number of observations of firms with few employees, one concern is that the sample of small public firms suffers from selection bias. That is, the reason why a small firm
That is, the reason why a small firm 28 Table 1.2: List of Parameters Used in the Model Parameter Description Identification β Quality share in final goods External calibration ρ Discount rate External calibration τ Exit rate External calibration σ 2 δ Business shock arrival rate External calibration λ 0 Exogenous entry Indirect inference κ Avg quality of exogenous entrants Indirect inference µ R&D arrival rate multiplier Indirect inference η Curvature of Pr(complementarity) Indirect inference k 1 Pr(complementarity|emp>500) Indirect inference k 2 Pr(complementarity|emp>500) Indirect inference -Pr(complementarity|start-up) m Scale para of step size dist Indirect inference α Shape para of step size dist Indirect inference β a ,β b Shape para of inventor dist Indirect inference γ L Scale of R&D without complementarity Normalized to 1 γ H Scale of R&D value with complementarity Indirect inference goes public is correlated with its innovations. Therefore, I only use the statistical model to estimate the share of patents held by firms with more than 500 employees and assume that the share of patents held by firms with fewer than 500 employees is 10%, as suggested by the literature (Figueroa and Serrano, 2019). The estimated cumulative share of patents is reported in Figure 1.4, for example, the 60th percentile is about 50,000 employees per firm. 1.3.3 Calibration In this section I calibrate the model. To solve the model, I use the following functional form assumptions. The cost function is quadratic: R(e I )= e 2 I 2 . The probability of having complementarity with an innovation is h(˜ q)=k 1 − k 2 exp(− η ˜ q),k 1 ≥ k 2 >0,η > 0. (1.46) The step size distribution of an innovation follows Pareto distribution: Φ( z)=1− m α z α +1 . (1.47) 29 The type distribution of inventors follows beta distribution: Ψ( θ )=B(θ ;β a ,β b ). 
(1.48)

The business shock $\delta$ is a random draw from a truncated normal distribution:

$$\Xi(\delta):\quad \delta\in[-1,1],\quad E(\delta)=0,\quad \operatorname{var}(\delta)=\sigma_\delta^2. \tag{1.49}$$

The model has 16 parameters, which are listed in Table 1.2. The parameters $(\gamma_L,\gamma_H,m)$ cannot be identified separately, so I normalize $\gamma_L$ to 1.

I calibrate $(\rho,\beta,\tau,\sigma_\delta^2)$ externally. The discount rate $\rho$ is set to 2%. I use $\beta$, the quality share in the production function, to target firm profitability, defined as $\frac{\pi_j}{y_jp_j}$, which is 10.9% for the sample period from 1982 to 1997 (Akcigit and Kerr, 2018). In the model, profitability equals $\beta$. Thus, I set $\beta=10.9\%$. The firm exit rate $\tau$ is 6.5%, the average annual entry rate of innovative firms. The entry rate is estimated as follows. I obtain the annual entry rate for the whole economy from Decker et al. (2016), which is about 11.6%. To estimate the entry rate for innovative firms, I use two other measures from the same paper: the share of employment at young firms for the whole economy (14.3%) and for the information industry (8.0%). I calculate the ratio between the two and use it as a proxy for the ratio between the innovative-firm entry rate and the economy-wide entry rate, because the information industry constitutes a large portion of innovative firms. Combining this ratio with the economy-wide entry rate, I estimate the entry rate for innovative firms as $11.6\%\times 8.0/14.3\approx 6.5\%$. Meanwhile, $\sigma_\delta^2$, the variance of the business shock $\delta$, is related to firm growth rate volatility. The median standard deviation of firm growth rates within 10 years is about 16% to 20% for all firms (Comin and Philippon, 2005). Using public patenting firms, the results are similar. I set $\sigma_\delta$ to 14%. The externally calibrated parameter values are listed in Table 1.3.

There are 11 remaining parameters to be estimated.
I identify these parameters using a simulated method of moments approach in the spirit of Lentz and Mortensen (2008), which is widely used in the growth literature to match firm-level evidence (Lentz and Mortensen, 2016; Acemoglu et al., 2018; Akcigit and Kerr, 2018). I calculate the model-implied moments and compare them to the data-generated moments to minimize

$$\min\ \sum_{i=1}^{11}\left(\frac{\text{model}(i)-\text{data}(i)}{\text{data}(i)}\right)^2. \tag{1.50}$$

Table 1.3: Externally Calibrated Parameters

ρ       β        τ        σ²_δ
0.02    0.109    0.065    0.14

Though every parameter affects every moment, I describe below which parameter is especially important for matching each moment.

The probability of selling an innovation. The secondary market is an essential feature of the model. To characterize the secondary market, I match three moments related to the probability of selling an innovation: the probability of a big firm¹² selling its innovation, the start-up buyout rate, and the average probability of a firm selling its innovation. In the model, the probability of selling is directly affected by the probability of having complementarity. The first moment is 5.1% (Figueroa and Serrano, 2019) and is governed by parameter $k_1$. As shown in Figure 1.5, when $k_1$ increases (decreases), it shifts up (down) the probability of having complementarity $h(\tilde q)$. The start-up buyout rate (9.4%, from Gao et al. (2013)) is governed by parameter $k_2$. Figure 1.6 confirms that $k_2$ mainly affects $h(q)$ for low $q$. The average probability of a firm selling its innovation is matched using $\alpha$, the shape parameter of the innovation step size distribution. When $\alpha$ is large, the uncertainty in the inventor's income stream is dominated by operational factors that are unrelated to her own innovating effort. As $\alpha$ approaches 2, the innovation-related uncertainty goes up, which affects small and medium-small firms disproportionately. Thus, I adjust $\alpha$ to match the average probability of a firm selling a patent.
In the data, the probability is about 5.4% (Figueroa and Serrano, 2019).

¹² Following USPTO, a big firm is defined as a firm with more than 500 employees.

[Figure 1.5: The Impact of $k_1$ on $h(q)$]

[Figure 1.6: The Impact of $k_2$ on $h(q)$]

Innovation distribution across firms: I obtain the innovation distribution across firms using the public firm data, patent data from USPTO, and the statistical estimation in Section 1.3.2. I target four data points from the distribution to pin down four parameters: $\eta$, $\mu$, $\beta_a$, and $\beta_b$.

First, $\eta$ governs the curvature of the complementarity probability. For firms that are large enough, because they all offer sufficiently low equity shares, $\eta$ governs the firm-inventor matching. When $\eta$ increases, it acts as if those firms are larger by the same factor with regard to the complementarity probability. The impact is shown in Figure 1.7. $\eta$ mainly affects the slope of the cumulative share of innovations by firm size. For example, if $\eta$ increases from 0.13 to 0.26, then the 60th percentile of the size of all patenting firms shrinks by half. So I match it to the 60th percentile of size (employment size = 51,250, which is $28.47\bar q$).

The second parameter, $\mu$, the R&D arrival rate multiplier, disciplines the share of innovations held by firms with fewer than 2,000 employees. $\mu$ affects both the value of innovating and the innovation-related uncertainty linearly. It affects the innovation value in all firms uniformly; it acts as a normalization. However, it has a disproportionate impact on the equity risk across firms, since it only affects the innovation-related risk. As a result, in the relevant parameter space, $\mu$ mainly governs the share of innovations held by small- to medium-sized firms. I match it to the share of innovations held by firms with fewer than 2,000 employees (22%).
The last two parameters $\beta_a$ and $\beta_b$ determine the inventor type distribution $\Psi(\theta)$. $\beta_a$ affects the amount of low effort-sensitive inventors whereas $\beta_b$ affects the amount of high effort-sensitive inventors. Therefore, I use the cumulative share of innovations held by firms with fewer than 5,000 employees (30%) and 500 employees (10%) to match $\beta_a$ and $\beta_b$, respectively.

[Figure 1.7: The Impact of $\eta$ on Innovation Distribution. Plot labels: q=27, q=52.]

Average value of an innovation in firms with more than 500 employees: In the data, I calculate the average value of a patent in firms with more than 500 employees using the patent value estimated from the stock market reaction (Kogan et al., 2017). I first keep all firms with more than 500 employees. Then I calculate the average patent value per firm and weight it using the statistical model developed in Section 1.3.2. The average value per patent is about 3.6% of the average market capitalization. In the model, given the measure of firms $N_F$, the size factor $m$ of the step size distribution determines the average value per innovation. This moment pins down the value of $m$.

The aggregate growth rate and the average growth rate: The aggregate growth rate measures the growth rate of the average quality whereas the average growth rate is the unweighted average of firm-level growth rates. When the aggregate growth rate is lower than the average growth rate, firms with low $q$ grow faster. The amount of exogenous firm entry $\lambda_0$, from Equation (1.33), determines the measure of firms $N_F$. Conditional on the innovation value parameter $m$, $\lambda_0$ impacts the aggregate growth rate through $N_F$, for innovation value is scaled by the total technology stock ($Q = \bar q N_F$). Hence, I set $\lambda_0$ so that the aggregate growth rate $g$ is 2%.
Table 1.4: Indirect Inference Calibrated Parameters

| $\lambda_0$ | $k_2$ | $m$ | $\kappa$ | $\mu$ | $\eta$ |
|---|---|---|---|---|---|
| 0.035 | 0.070 | 0.0147 | 0.014 | 1.87 | 0.13 |

| $k_1$ | $\alpha$ | $\beta_a$ | $\beta_b$ | $\gamma_H$ |
|---|---|---|---|---|
| 0.97 | 2.0013 | 0.081 | 1.19 | 1.07 |

In the model, the average growth rate is mainly affected by $\kappa$, the average quality of exogenous entrants. When $\kappa$ goes up, holding the aggregate growth rate constant, the measure of firms $N_F$ needs to adjust. Therefore, $\kappa$ affects the innovations per firm and consequently the firm-level growth rate. The average productivity growth rate, based on Equation (1.14), can be written as

$$E\left(\frac{\dot q_j}{q_j}\right) = E\left(\frac{\dot l_j}{l_j}\right) + g, \tag{1.51}$$

where $E(\dot l_j / l_j)$ is the average employment growth holding the working population constant and $g$ is the aggregate growth rate. The average employment growth is 7.4% (Akcigit and Kerr, 2018) and the total employment growth is 2.1% (US Bureau of Labor Statistics). Therefore, the average employment growth holding the working population constant is 5.3%, which, together with $g = 2\%$, gives an average productivity growth rate of 7.3%.

Firm growth versus firm size regression: $\gamma_H$ measures the value of having complementarity with an invention. When $\gamma_H$ goes up, having complementarity is more valuable, which affects the inventor's trade-off; that is, a higher $\gamma_H$ means big firms are more attractive and hence grow faster. It is related to the growth rate versus firm size regression (Akcigit and Kerr, 2018): $g_{ft} = \eta + \beta_g \ln(q_{ft}) + \epsilon_{ft}$. The empirical coefficient from the literature is $\beta_g = -0.035$.
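The moment-matching objective in Equation (1.50) can be illustrated with a short sketch. It evaluates the sum of squared relative deviations at the rounded data and model values that Table 1.5 reports for the 11 internally calibrated moments. This only evaluates the objective once; the actual estimation would re-solve the model for each candidate parameter vector, which is not shown. The dictionary keys and function names are illustrative assumptions.

```python
# Targeted moments for the 11 internally calibrated parameters,
# as reported (rounded) in Table 1.5: name -> (data, model).
MOMENTS = {
    "Aggregate growth rate": (0.02, 0.020),
    "Average patent value": (0.036, 0.036),
    "Startup buyout rate": (0.094, 0.094),
    "Pr(Sell | big)": (0.051, 0.050),
    "Pr(Sell)": (0.054, 0.053),
    "Growth-size relation": (-0.031, -0.037),
    "% Inno, emp<500": (0.10, 0.079),
    "% Inno, emp<2,000": (0.22, 0.23),
    "% Inno, emp<5,000": (0.30, 0.33),
    "60th pctl of size": (28.47, 28.92),
    "Average growth rate": (0.073, 0.072),
}


def squared_relative_deviations(moments):
    """Return each moment's term ((model - data) / data) ** 2."""
    return {
        name: ((model - data) / data) ** 2
        for name, (data, model) in moments.items()
    }


def smm_loss(moments):
    """Objective in Equation (1.50): sum of squared relative deviations."""
    return sum(squared_relative_deviations(moments).values())


terms = squared_relative_deviations(MOMENTS)
loss = smm_loss(MOMENTS)
largest = max(terms, key=terms.get)

print(f"loss at reported values: {loss:.4f}")
print(f"largest contribution: {largest}")
```

Because every deviation is scaled by its data value, moments measured in very different units (a percentile of firm size versus a growth rate) enter the objective on a comparable footing.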
The results are shown in Table 1.4.

Table 1.5: Model Fit for Key Moments: Targeted Moments

| Moment | Data | Model | Parameter it informs |
|---|---|---|---|
| Profitability | 0.109 | 0.109 | $\beta$ |
| Discount rate | 0.020 | 0.020 | $\rho$ |
| Entry rate | 0.065 | 0.065 | $\tau$ |
| Firm growth volatility | 0.17 | 0.18 | $\sigma_\delta^2$ |
| Aggregate growth rate | 0.02 | 0.020 | $\lambda_0$ |
| Average patent value | 0.036 | 0.036 | $m$ |
| Startup buyout rate | 0.094 | 0.094 | $k_2$ |
| Pr(Sell \| big)¹ | 0.051 | 0.050 | $k_1$ |
| Pr(Sell) | 0.054 | 0.053 | $\alpha$ |
| Growth-size relation $\beta_g$² | -0.031 | -0.037 | $\gamma_H$ |
| % Inno, emp<500³ | 0.10 | 0.079 | $\beta_b$ |
| % Inno, emp<2,000 | 0.22 | 0.23 | $\mu$ |
| % Inno, emp<5,000 | 0.30 | 0.33 | $\beta_a$ |
| 60th pctl $\tilde q$ weighted by R&D | 28.47 | 28.92 | $\eta$ |
| Average growth rate | 0.073 | 0.072 | $\kappa$ |

¹ A firm with more than 500 employees is defined as a "big firm", according to USPTO.
² $\beta_g$ is the coefficient of the growth-size regression.
³ The % Inno is the cumulative share of innovations created in firms with fewer than a given number of employees. For example, "% Inno, emp<500" means the share of innovations that are invented in firms with fewer than 500 employees.

1.4 Results

The calibration moments are reported in Table 1.5. Overall, my model closely matches the targeted moments. Next, I discuss model features in more detail.

1.4.1 Optimal Contracts

Figure 1.8 shows what share of total risk is innovation-related for firms of different sizes. The share is lower in larger firms. As a result, it is more difficult for large firms to incentivize inventors without exposing them to unrelated risks. There is one optimal contract for the inventor each firm actually hires in equilibrium. I plot the contracts across firms in Figure 1.9. The top panel plots the resulting effort level $e$ for the inventor who works for a firm of size $\tilde q$. The bottom panel shows that, in equilibrium, the equity $a(\tilde q)$ decreases with the firm size.
[Figure 1.8: The Innovation-Related Risk across Firm Sizes. x-axis: Average Firm Size.]

[Figure 1.9: The Equilibrium Effort and the Optimal Contract. Two panels; x-axis: Average Firm Size. Notes: The figures report the actual effort level and the optimal contract for each equilibrium firm-inventor matching.]

[Figure 1.10: The Relationship between Inventors and Firms. Notes: Inventors with $\theta > 0.5$ who work for start-ups with $q = 0$ are not shown in this figure, because the x-axis is in log.]

[Figure 1.11: Cumulative Share of Innovations. x-axis: Average Firm Size; series: Data, Model. Notes: The solid line plots the cumulative share for firms with more than 500 employees, the same as Figure 1.4. The stars represent the data points I used to calibrate the model.]

[Figure 1.12: Probability Density of Innovation. x-axis: Average Firm Size; series: Data, Model. Notes: This figure shows the probability distribution function of innovations for firms with more than 500 employees both in the data and in the model.]

1.4.2 The Distribution of Inventors across Firms

Figure 1.10 shows that an inventor with a more effort-sensitive idea chooses a smaller firm, receives more equity, and chooses a higher effort level. Namely, inventors with low $\theta$ are less effort-sensitive and work for big firms; instead of working for an incumbent firm, inventors with $\theta$ higher than a threshold $\bar\theta$ join a start-up.

When choosing where to work, an inventor faces a trade-off between chances and value. The chance to successfully innovate decreases with firm size, because, as shown in Figure 1.9, smaller firms are better at incentivizing inventors to exert more effort.
The value of an innovation increases with firm size, for larger firms are more likely to implement the innovation efficiently. When an inventor's idea is more effort-sensitive, the incentive channel weighs more. Therefore, a more effort-sensitive inventor chooses to work for a smaller firm.

1.4.3 Untargeted Moments

I next compare the model with untargeted features in the data.

Cumulative shares of inventions by firm size: I use four moments from the innovation distribution to calibrate the model; the rest are used as untargeted moments. The comparison is shown in Figure 1.11, and some moments are reported in Table 1.6. In general, the model is able to match the overall pattern of the cumulative shares. It overestimates the share when the firm size is medium-small to medium and when it is large, and underestimates it when the size is around medium to medium-large. This is mainly because of the assumption that $\theta$ follows a beta distribution. Figure 1.12 shows the probability density function corresponding to Figure 1.11. The model follows a similar pattern to the data.

Growth rate gap between 90th percentile and 50th percentile: Decker et al. (2016) show that in high-tech industries (defined by Heckler (2005)) the growth rate difference between the 90th percentile and the 50th percentile is about 31%. In the model, I simulate 5 million firms for one year to find the annual growth rate for each firm. Then I only keep firms that have innovations within this year. The growth rate of firms that die within this time interval is defined as $-1$, following the literature. I rank the firms according to their growth rate and find the growth rate gap between the 90th percentile and the 50th percentile.

The comparison of untargeted moments is listed in Table 1.6. Overall I am able to match the data; specifically, I match the 90th-50th growth rate gap and the cumulative share held by medium- to medium-large-sized firms closely.
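The percentile-gap calculation described above can be sketched as follows. This is an illustration under stated assumptions, not the simulation code behind Table 1.6: the firm records are hypothetical hand-made values rather than model output, the percentile uses a nearest-rank rule (the source does not say which rule is used), and the sketch assumes that an exiting firm that innovated during the year stays in the sample with a growth rate of $-1$.

```python
def nearest_rank_percentile(sorted_values, p):
    """Nearest-rank percentile of an ascending list, for integer p in 1..100."""
    n = len(sorted_values)
    rank = (p * n + 99) // 100  # integer ceiling of p * n / 100
    return sorted_values[max(rank, 1) - 1]


def growth_gap_90_50(firms):
    """Gap between the 90th and 50th percentile of firm growth rates.

    firms: iterable of dicts with keys 'growth', 'innovated', 'exited'
    (hypothetical record layout chosen for this sketch).
    """
    rates = []
    for firm in firms:
        if not firm["innovated"]:
            continue  # keep only firms with innovations within the year
        # Firms that die within the year get a growth rate of -1.
        rates.append(-1.0 if firm["exited"] else firm["growth"])
    rates.sort()
    return nearest_rank_percentile(rates, 90) - nearest_rank_percentile(rates, 50)


# Hypothetical records, for illustration only.
example_firms = [
    {"growth": 0.00, "innovated": True, "exited": True},
    {"growth": -0.10, "innovated": True, "exited": False},
    {"growth": 0.00, "innovated": True, "exited": False},
    {"growth": 0.02, "innovated": True, "exited": False},
    {"growth": 0.05, "innovated": True, "exited": False},
    {"growth": 0.08, "innovated": True, "exited": False},
    {"growth": 0.10, "innovated": True, "exited": False},
    {"growth": 0.20, "innovated": True, "exited": False},
    {"growth": 0.40, "innovated": True, "exited": False},
    {"growth": 0.90, "innovated": True, "exited": False},
    {"growth": 2.00, "innovated": False, "exited": False},  # dropped: no innovation
]

gap = growth_gap_90_50(example_firms)
print(f"90th-50th growth gap in the toy sample: {gap:.2f}")
```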
In-house versus outsourcing choice: In addition, I report the comparison of the in-house versus outsourcing choice in the data and in the model in Table 1.7. In the model, a firm can both innovate in-house and purchase innovations from others. The data moments are obtained from ESEE in 1990 and 1994. The first wave of ESEE is in 1990. I run four specifications for 1990 and 1994 according to

$$y_{ft} = \text{constant}_t + \text{coeff}_t \times \ln(\text{Sales}_{ft}) + \epsilon_{ft},\quad t = 1990, 1994, \tag{1.52}$$

where $y_{ft}$ is the dependent variable of firm $f$ at time $t$. The four dependent variables are (1) total R&D expenditure to sales ratio, (2) internal R&D expenditure to sales ratio, (3) total R&D expenditure normalized by average sales, and (4) internal R&D expenditure normalized by average sales. The first two measure the innovation intensity, and the other two measure the absolute level of innovation expenditure. The full regression results are in the appendix. Both total and internal R&D expenditure intensity decrease with firm size, and the absolute levels increase with firm size. Though the data are from Spain and the model is calibrated to the US, the model can match at least the direction of the coefficients, which is reassuring because it captures some of the trade-offs between in-house innovation and outsourcing.

Table 1.6: Model Fit for Untargeted Moments

| Moment | Data | Model |
|---|---|---|
| Growth gap: 90th pctl - 50th pctl¹ | 0.31 | 0.38 |
| % Inno, q<20 | 0.44 | 0.41 |
| % Inno, q<50 | 0.59 | 0.59 |
| % Inno, q<100 | 0.76 | 0.90 |

¹ The difference between the 90th percentile of the firm-level growth rate and the median growth rate.

Table 1.7: Model Fit for Untargeted Moments Using Spanish Data¹

| DV | R&D / Sales | Internal / Sales | R&D / E(Sales) | Internal / E(Sales) |
|---|---|---|---|---|
| 1990 | -0.009 | -0.006 | 0.016 | 0.012 |
| 1994 | -0.001 | -0.001 | 0.023 | 0.013 |
| Model | -0.025 | -0.025 | 0.0030 | 0.0020 |

¹ The data are from ESEE of the SEPI Foundation, Spain. I use the Spanish dataset because Spain has an employee invention law similar to that of the US, and the dataset provides detailed information on R&D investment.
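Each specification in Equation (1.52) is a univariate regression of an outcome on log sales, run separately by year. A minimal sketch of the slope and intercept computation is given below, using the closed-form OLS formulas. The sales figures and the outcome series are synthetic placeholders constructed for illustration; they are not ESEE data, and the function name is an assumption.

```python
import math


def ols_on_log_sales(sales, outcome):
    """Closed-form OLS of outcome on ln(sales); returns (constant, coeff)."""
    x = [math.log(s) for s in sales]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(outcome) / n
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, outcome))
    var_x = sum((xi - x_bar) ** 2 for xi in x)
    coeff = cov_xy / var_x
    constant = y_bar - coeff * x_bar
    return constant, coeff


# Synthetic, noise-free example: an R&D-to-sales ratio that falls with size.
toy_sales = [10.0, 100.0, 1_000.0, 10_000.0]
toy_intensity = [0.05 - 0.009 * math.log(s) for s in toy_sales]

constant, coeff = ols_on_log_sales(toy_sales, toy_intensity)
print(f"constant = {constant:.3f}, coeff = {coeff:.3f}")
```

A negative coefficient in such a regression corresponds to intensity falling with firm size, which is the sign pattern reported in the first two columns of Table 1.7.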
1.4.4 Growth Decomposition

I now use the model to document the sources of growth. The growth rate in Equation (1.35) can be written as

$$g = \overbrace{-\tau(1-\kappa)}^{\text{Destruction}} + \overbrace{\gamma_H \Phi(\hat z)\,E(z \mid z \le \hat z)\int [1-h(q^*(\theta))]\,\lambda^*_\theta\,\psi(\theta)\,d\theta}^{\text{Secondary market}} + \overbrace{E(z)\int \{\gamma_H h(q^*(\theta)) + \gamma_L\}\,\lambda^*_\theta\,\psi(\theta)\,d\theta}^{\text{Primary market}}. \tag{1.53}$$

It depends on three forces: (1) destruction and replacement with an entrant, net of the innovation effect, (2) firms purchasing innovations on the secondary market, and (3) firms hiring inventors from the primary market to innovate in-house. My model estimates that the average quality decreases by 6.41 percentage points due to the destruction and replacement process. For exogenous entries, this is because new entrants start at a quality that is on average lower than that of the firms that exit due to exogenous destruction.

Table 1.8: Counterfactuals¹

| Moment | (1) Benchmark | (2) No Market | (3) Signal, $\alpha_\varepsilon = 1.5$, $R^2 = 0.4$ | (4) Signal, $\alpha_\varepsilon = 2.5$, $R^2 = 0.6$ | (5) Subsidy 5.0% | (6) Subsidy 7.0% |
|---|---|---|---|---|---|---|
| Growth rate | 2.00 | -0.168 | +0.025 | +0.079 | +0.014 | +0.032 |
| % Inno in startups | 0.31 | -0.311 | +0.814 | +4.899 | +0.836 | +1.485 |
| % Inno, emp<500 | 7.64 | -6.224 | -1.008 | -4.778 | -1.079 | -2.066 |
| % Inno, emp∈[0.5k,20k) | 33.20 | -8.897 | +3.666 | +8.026 | +3.715 | +5.312 |
| % Inno, emp∈[20k,0.1m) | 49.29 | +4.410 | +6.096 | +1.420 | +6.095 | +4.836 |
| % Inno, emp≥0.1m | 9.57 | +11.022 | -9.567 | -9.567 | -9.567 | -9.567 |
| Pr(Sell) | 5.31 | -5.309 | +0.383 | +0.629 | +0.172 | +0.213 |

¹ The table reports the percentage point difference with respect to the benchmark. For example, the growth rate change is calculated according to $(g_{\text{counter}} - g)\times 100$. The first column reports the benchmark case in percentage points. The second column shows a case where firms cannot trade innovations at all. The third and fourth columns analyze cases in which there is a noisy signal when firms buy and sell innovations. The noise $\varepsilon$ follows a Pareto distribution with scale parameter 1 and shape parameter $\alpha_\varepsilon$.
The fifth and sixth columns report the results when there is a one-time innovation transaction subsidy. The subsidy rates are 5% and 7% of the transaction price.

Average quality increases by 8.23 percentage points because of in-house innovation. On average about 5% of all innovations are sold (Figueroa and Serrano, 2019), and the secondary market adds 0.18 percentage points to the aggregate growth rate. In Section 1.4.5 I analyze the effects of changes in the secondary market.

1.4.5 Counterfactual

In this section, I consider counterfactuals about innovation tradability on the secondary market. The analysis shows that tradability affects both firm-level innovation allocations and the aggregate growth rate. I start with an extreme case in which the secondary market no longer exists. I then analyze the case in which there is a signal about the invention step size, and finally the case in which innovation transactions are subsidized. The online appendix shows the results for an invention transaction tax.

Shut Down the Secondary Market

Consider a case where there is no secondary market and firms cannot buy or sell innovations. Shutting down the market hurts small firms disproportionately because they rely more on the secondary market. The results are reported in the second column of Table 1.8. The benchmark results are shown in the first column for comparison. Inventors shift to bigger firms. The share of innovations created in start-ups drops by 0.31 percentage points, which is 100%: there are no innovations in start-ups anymore, because start-ups cannot attract any inventors. The share of innovations in small firms also decreases dramatically, by 6.22 percentage points.

This exercise speaks to one of the effects of intellectual property rights protection: without protection, firms cannot trade patents and the secondary market does not exist. The model implications are consistent with the empirical observations in Acikalin et al.
(2022), which shows that when facing sudden patent invalidation, small firms lose disproportionately. In contrast to the well-known "small firms innovate" idea, without the secondary market, innovations are created in big firms. For example, quantitatively, the share of innovations created in large firms (with more than 100,000 employees) increases by about 11.02 percentage points, from 9.57% to 20.59%. The impact of shutting down the secondary market goes beyond the firm-level distribution shift; it also has an aggregate implication. The aggregate growth rate decreases by about 0.17 percentage points, from 2.00% to 1.83%.

Next, I explore which moments in the data are responsible for the quantitative finding; specifically, how much the growth rate drop would change if the observed data were different. To do this, I consider changes in two groups of data moments: the probability of selling an innovation and the share of innovations in small firms. After changing the moments, I recalibrate the model and re-run the counterfactual. The results are shown in Table 1.9.

Table 1.9: The Secondary Market's Impact Depends on the Probability of Selling an Innovation¹

| | (1) Baseline | (2) Change: Sell Only | (3) Change: Sell+Distribution |
|---|---|---|---|
| Parameters | | | |
| $k_1$ | 0.97 | 0.99 | 0.85 |
| $\eta$ | 0.13 | 0.27 | 0.29 |
| Moments | | | |
| Pr(sell) | 5.3% | 3.6% | 15.1% |
| % Inno, emp≥50k | 58.6% | 55.2% | 90.4% |
| Counterfactual | | | |
| $\Delta g$ (No Market) | 0.17% | 0.10% | 0.51% |

¹ The table reports the counterfactual aggregate growth rate drop when shutting down the secondary market. The first column reports the baseline counterfactual as in Table 1.8. The second column shows the case in which the probability of selling decreases, holding other moments constant. In the third column, there are fewer innovations in big firms and more trade between firms.

If firms are, in fact, more likely to sell an innovation than in the data, the growth rate drop will be larger; if they are less likely to sell, it will be smaller. For example, consider the case where the parameter $k_1 = 0.99$, which means that instead of 5.3%, on average 3.6% of innovations are sold if there is a secondary market. Shutting down the secondary market will result in a drop that is 6 basis points lower than before: the growth rate decreases by 0.10 percentage points, compared with the 0.16 percentage points in the baseline calibration. If, in contrast, more innovations are created in small firms than observed, then the effect will be bigger. For example, consider a case in which on average 15% of innovations are sold, and no innovations are created in firms with more than 100,000 employees. Shutting down the secondary market then leads to a 0.5 percentage point drop in the growth rate, so the growth rate would be 1.5% if there were no secondary market. This is because the impact of tradability is mainly about innovation reallocation between firms. Both a high probability of selling and a small share of innovations in big firms mean that the reallocation has a strong implication for efficiency improvement, in which case shutting down the secondary market has a more salient effect.

Signal

In the benchmark model, the secondary market is frictional due to information asymmetry. This section considers a publicly available signal $\chi$ associated with each innovation $z$, which alleviates the information asymmetry in the secondary market. The signal satisfies

$$\chi = \varepsilon z, \tag{1.54}$$

where $\varepsilon$ is noise. Assume that $\varepsilon$ follows a Pareto distribution, where $f_\varepsilon(\varepsilon) = \dfrac{\alpha_\varepsilon}{\varepsilon^{\alpha_\varepsilon+1}}$, $\alpha_\varepsilon > 0$. Buyers use this information to update the innovation step size distribution, which yields

$$f_{z|\chi}(z \mid \chi) = \begin{cases} \dfrac{-\tilde\alpha\, z^{-\tilde\alpha-1}}{\chi^{-\tilde\alpha} - m^{-\tilde\alpha}} & \text{if } z \le \chi, \\ 0 & \text{otherwise,} \end{cases} \tag{1.55}$$

where $\tilde\alpha = \alpha - \alpha_\varepsilon$. Buyers now use this information to determine their bid price on the secondary market. As before, the buyer chooses a price $\tilde p_z(\chi)$ according to the zero-profit condition

$$\int_0^{\max(\hat z,\chi)} \left[\gamma_H \nu z \tilde Q - \tilde p_z(\chi)\right] f_{z|\chi}(z \mid \chi)\,dz = 0. \tag{1.56}$$

The innovations are sold if they are worth less than the price to the seller.
Equivalently, all innovations with step size lower than the threshold $\hat z$ are sold, where $\gamma_L \hat z \nu \tilde Q = \tilde p_z(\chi)$. The solution $\hat z$ satisfies

$$\frac{\gamma_H}{\tilde\alpha - 1}\left(m^{-\tilde\alpha+1} - \hat z^{-\tilde\alpha+1}\right) = \frac{\hat z}{\tilde\alpha}\left(m^{-\tilde\alpha} - \hat z^{-\tilde\alpha}\right), \tag{1.57}$$

if $\hat z \le \chi$. In this case, the bid price is independent of the signal. Otherwise, the signal shows that the step size is low enough, and the solution $\hat z$ satisfies

$$\hat z = \frac{\tilde\alpha\,\gamma_H\left(m\chi^{\tilde\alpha} - \chi m^{\tilde\alpha}\right)}{(\tilde\alpha - 1)\left[\chi^{\tilde\alpha} - m^{\tilde\alpha}\right]}. \tag{1.58}$$

Using the updated expression for $\hat z$, the firm's problem and the inventor's problem are both the same as in the benchmark model.

To measure the informativeness of a signal, I regress $\ln z$ on the signal $\ln \chi$ and use $R^2$ as the measure of the information's accuracy. In the regression, the $R^2$ is

$$R^2 = \frac{\mathrm{Var}(\ln z)}{\mathrm{Var}(\ln \chi)} = \frac{\alpha_\varepsilon^2}{\alpha^2 + \alpha_\varepsilon^2}, \tag{1.59}$$

which increases with the shape parameter $\alpha_\varepsilon$ of the signal distribution. It implies that when the signal is more accurate, $R^2$ is higher, and consequently, we can better infer $z$ from the signal.

I consider two different signal levels: $\alpha_\varepsilon = 1.5, 2.5$, and the corresponding $R^2$ are 0.4 and 0.6, which means the signal informativeness level is low and high, respectively. The results are reported in the third and fourth columns of Table 1.8.

The first row reports the change in the aggregate growth rate. When $R^2 = 0.4$, the growth rate increases by 0.03 percentage points, while it increases by 0.08 percentage points when $R^2 = 0.6$. Namely, when the $R^2$ of the signal is 0.6, the growth rate increases from 2.00% to 2.08%. This is because more information means less severe adverse selection, and thus it is easier for firms to sell their innovations. Therefore, the economy is more efficient and the growth rate is higher.

The second to the sixth rows of Table 1.8 report the changes in the innovation distribution across firms. The signal has heterogeneous effects on inventors. The signal has a significant impact on the share in start-ups.
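The informativeness measure in Equation (1.59) can be checked numerically. The sketch below evaluates the closed form at the calibrated $\alpha = 2.0013$ from Table 1.4 and at the two signal levels, and adds a small Monte Carlo cross-check that draws $z$ and $\varepsilon$ from Pareto distributions and computes the squared correlation between $\ln z$ and $\ln \chi$. The scale of $z$ is set to 1 for convenience, which is an assumption of this sketch; the scale does not affect $R^2$. Function names, the sample size, and the seed are also illustrative.

```python
import math
import random


def signal_r2(alpha, alpha_eps):
    """Closed form in Equation (1.59)."""
    return alpha_eps ** 2 / (alpha ** 2 + alpha_eps ** 2)


def simulated_r2(alpha, alpha_eps, n=20_000, seed=0):
    """Squared correlation of ln z and ln chi, with chi = eps * z.

    z and eps are independent Pareto draws with scale 1 (assumed here).
    In a univariate regression, R^2 equals the squared correlation.
    """
    rng = random.Random(seed)
    ln_z = [math.log(rng.paretovariate(alpha)) for _ in range(n)]
    ln_chi = [lz + math.log(rng.paretovariate(alpha_eps)) for lz in ln_z]

    mean_z = sum(ln_z) / n
    mean_chi = sum(ln_chi) / n
    cov = sum((a - mean_z) * (b - mean_chi) for a, b in zip(ln_z, ln_chi))
    var_z = sum((a - mean_z) ** 2 for a in ln_z)
    var_chi = sum((b - mean_chi) ** 2 for b in ln_chi)
    return cov ** 2 / (var_z * var_chi)


ALPHA = 2.0013  # calibrated shape of the step size distribution (Table 1.4)

for alpha_eps in (1.5, 2.5):
    exact = signal_r2(ALPHA, alpha_eps)
    approx = simulated_r2(ALPHA, alpha_eps)
    print(f"alpha_eps = {alpha_eps}: closed form {exact:.3f}, simulated {approx:.3f}")
```

The closed form follows because $\ln z$ and $\ln \varepsilon$ are independent with variances $1/\alpha^2$ and $1/\alpha_\varepsilon^2$, so $\mathrm{Var}(\ln\chi)$ is their sum.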
Quantitatively, the share of innovations created in start-ups more than triples, from 0.31% to 1.13%, with a signal as noisy as $R^2 = 0.4$. When the signal is more informative, with $R^2 = 0.6$, the share of innovations in start-ups increases by 4.90 percentage points. Meanwhile, the share of innovations created in medium-small firms (with 500 to 20,000 employees) and medium-large firms (with 20,000 to 100,000 employees) also increases with the signal strength; the share of innovations in both small firms (with fewer than 500 employees) and large firms (with more than 100,000 employees) decreases with the signal strength.

[Figure 1.13: The Mapping between Inventors and Firms]

The heterogeneity arises because the signal has a two-fold effect on the inventor's choice. Figure 1.13 reports the change in the mapping between inventors and firms. For inventors with effort sensitivity $\theta > 0.15$ or $\theta < 0.015$, more precise information leads them to work for smaller firms. On the contrary, for inventors with effort sensitivity $0.015 < \theta < 0.15$, the more informative the signal is, the bigger the firm they choose.

This is because the signal precision affects the inventor's decision in two ways. First, because firms can sell unmatched innovations more easily, the expected value of an innovation decreases more slowly with firm size. Namely, for each innovation, the value gap from being commercialized in firms of different sizes is smaller. Second, innovations contribute more to the uncertainty in a firm's equity, because of the added variance in innovation selling prices. The innovative risk takes a larger share in a firm's total uncertainty profile, as shown in Figure 1.14. The increasing innovative uncertainty has two impacts: the firm equity variance is higher, and more of the uncertainty in the firm equity is related to the inventor's own choice.
The former channel restricts the ability of start-ups and small firms to offer equity, since there is more demand for risk-sharing. The latter channel, on the contrary, enables medium- to large-sized firms to offer more equity by lowering the relative background noise level. Figure 1.15 confirms the mechanism. The share of equity offered by start-ups drops from 4.3% to 3.9% when there is a strong signal ($R^2 = 0.60$). The slope also becomes flatter as the signal becomes more informative, especially at the left end. Therefore, the capacity to incentivize inventors drops more slowly with firm size, especially for small firms.

[Figure 1.14: The Innovation-Related Risk Share Increases with the Signal. x-axis: Average Firm Size.]

[Figure 1.15: The Optimal Contract Changes with the Signal. x-axis: Average Firm Size.]

The relative strength of the two forces determines the influence of the signal. For inventors with $\theta > 0.15$ or $\theta < 0.015$, the increasing innovation value is more important. Hence, these inventors work for smaller firms. For the rest, the incentive improvement plays a more critical role, and inventors shift to bigger firms. As a result, more innovations are created in start-ups and medium-sized firms.

Subsidy

This section considers the case in which transactions on the secondary market are subsidized, which, similar to the public signal, alleviates the friction in the secondary market. The subsidy is a fixed transfer $\Gamma$ per transaction, paid from lump-sum taxes. The buyer now pays $(p_z - \Gamma)$ to purchase an innovation. The rest of the settings are the same as in the benchmark model. The buyer's zero-profit condition is

$$\int_0^{\hat z} \left(\gamma_H \nu z Q - p_z + \Gamma\right)\phi(z)\,dz = 0, \tag{1.60}$$

where $\hat z = \dfrac{p_z}{\gamma_L \nu Q}$ is the step size threshold; a firm agrees to sell an innovation when it does not have complementarity and the step size is lower than $\hat z$. Define $\psi \equiv \dfrac{\Gamma}{\gamma_L \nu Q}$, the subsidy expressed in the same units as $\hat z$. The zero-profit
The zero-profit 46 condition can be re-written as Z ˆ z 0 (γ H z− γ L ˆ z+γ L ψ )ϕ (z)dz =0. (1.61) With the subsidy, it is less costly to buy innovations. Therefore, ˆ z is higher than in the benchmark case, and a firm is more likely to sell an innovation. 13 I evaluate two cases where the subsidy levels are 5% and 7% of the transaction price, respectively. The fourth and fifth column of Table 1.8 reports the results. The result is similar to the effect of adding a signal. It improves innovation tradability, which has two impacts on firm-level outcomes. It also has heterogeneous effect on inventors. There are more innovations in both start-ups and medium-small- to medium-large-sized firms and fewer innovations in small and big firms. The second to the sixth rows of Table 1.8 report the changes in the distribution. When there is a 5% subsidy, quantitatively, the share of innovations in start-ups increases by 0.84 percentage points. When the subsidy is 5%, the share in start-ups increases by 1.49 percentage points. Meanwhile, the share of innovations created in medium-small (with 500–20,000 employees) and medium-large firms (with 20,000– 100,000 employees) also increases with the subsidy; the share of innovations in both small firms (with fewer than 500 employees) and large firms (with more than 100,000 employees) decreases with it. The seventh row confirms that it is more likely for a firm to sell its innovation. 1.4.6 Endogenous Contracting and Firm-Investor Matching This section explores how endogenous contracting and matching affect the quantitative re- sults. For each counterfactual exercise, in addition to the baseline counterfactual in Section 1.4.5, I consider three cases: first, when firms cannot change contract terms but inventors can move freely; second, when firms can modify contracts but inventors cannot move to 13 Adding a transaction tax has the opposite effect of a subsidy. 
The counterfactual results of a transaction tax are reported in Appendix 47 Table 1.10: Endogenous Contracts and Matching 1 (1) (2) (3) (4) (5) Moment Benchmark Baseline Model Same Contract Same Mapping All Same No Sell Growth rate 2.00 − 0.168 − 0.164 − 0.213 − 0.215 % Inno in startups 0.31 − 0.311 − 0.311 − 0.011 − 0.011 % Inno, emp<500 7.64 − 6.224 − 6.236 − 0.068 − 0.060 % Inno, emp∈[500,20k) 33.20 − 8.897 − 9.017 +0.000 +0.006 % Inno, emp∈[20k,100k) 49.29 +4.410 +4.552 +0.066 +0.055 % Inno, emp≥ 100k 9.57 +11.022 +11.012 +0.013 +0.011 Pr(Sell) 5.31 − 5.309 − 5.309 − 5.309 − 5.309 Subsidy: 7% Growth rate 0.02 +0.032 +0.023 +0.008 +0.007 % Inno in startups 0.00 +1.485 +1.503 +0.002 +0.002 % Inno, emp<500 0.08 − 2.066 − 2.496 +0.012 +0.010 % Inno, emp∈[500,20k) 0.33 +5.312 +6.708 +0.000 − 0.001 % Inno, emp∈[20k,100k) 0.49 +4.836 +3.852 − 0.012 − 0.010 % Inno, emp≥ 100k 0.10 − 9.567 − 9.567 − 0.002 − 0.002 Pr(Sell) 0.05 +0.213 +0.267 +0.097 +0.097 1 The table reports the percentage points difference with respect to the benchmark. The first column reports the benchmark case in percentage points. The second column analyzes the baseline case, where both the contracts and the firm-inventor mapping can adjust. The third and fourth columns study the cases in which the contracts and the firm- inventor mapping cannot adjust, respectively. The fifth column reports the results where neither the contracts nor the firm-inventor mapping can adjust. 48 another firm; and third, when neither the contract terms nor the firm-inventor matching can adjust. The top panel of Table 1.10 reports the decomposition for the first counterfactual exer- cise: when there is no secondary market. If firms cannot change the contract or the mapping, then the growth rate drops by 0.22 percentage points instead of 0.17 percentage points. For the firm-level outcome, the share of innovations almost stays unchanged. 
The share of innovations in firms with more than 100,000 employees only increases by 0.01 percentage points, whereas it increases by more than 10 percentage points in the endogenous contracting setting. In this counterfactual exercise, allowing firm-inventor matching to change can capture most of the effects. This is because, in the extreme case, large firms benefit a lot, which makes it very difficult for firms (especially small firms) to adjust contracts to attract inventors.

The effect of the endogenous contract itself is more significant in the less extreme counterfactual exercises. For example, when there is a 7% subsidy and contracts cannot adjust, as shown in the bottom panel, the growth rate increases by 0.023 percentage points, lower than the 0.032 percentage points in the baseline counterfactual. The firm-level effect is more significant. The share of innovations in small firms drops by 2.50 percentage points, compared with 2.07 percentage points in the baseline counterfactual. The firm-level changes are due to two reasons. First, since firms cannot adjust contracts, the contracts are not optimal: start-ups offer too much equity while large firms offer too little equity. Thus, inventors choose medium-sized firms, which leads to a rise in the shares of innovations in medium-small-sized and medium-large-sized firms. Second, inventors who still choose start-ups are over-incentivized. Therefore, they work harder than in the baseline counterfactual. As a result, there are fewer inventors in the start-ups, but each of them exerts more effort. In equilibrium, the second channel dominates, and the share of innovations in start-ups is higher than in the baseline counterfactual. The decomposition of other counterfactuals is in the appendix. Overall, the endogenous contracting setting mitigates the impact on the aggregate growth rate and amplifies the effect on the innovation distribution.

1.5 Conclusion

This paper explores why some inventions are invented inside a large firm while others are in a small firm or start-up.
The model characterizes the contractual relationship between two sets of heterogeneous agents: inventors and firms. Heterogeneous firms hire inventors to innovate in-house, and then trade non-complementary inventions on a secondary market. How much equity a firm offers depends on its size. Large firms find it difficult to give inventors high-powered incentives without exposing them to unrelated risks. Therefore, both the incentive to exert effort and the optimal equity share decrease with firm size. A key trade-off that an inventor faces when choosing firms is between value and opportunities. On average, innovation value is higher in bigger firms, since they are more likely to have synergy with the innovation; meanwhile, the chance to innovate is higher in small firms, for they are better at incentivizing inventors using equity. The tradability of innovations on the secondary market is important to this trade-off. The model suggests that inventors work for small firms if their ideas are more sensitive to effort, whereas inventors with ideas less sensitive to effort work for big firms.

This model offers a framework to think about the boundaries of firms quantitatively. The counterfactual exercise shows that, if we shut down the innovation market, the growth rate would decrease by 0.17 percentage points, and the share of innovations created in start-ups and firms with fewer than 500 employees would drop from 7.9% to 1.4%.

It would be useful to extend and generalize the analysis in several directions. One possible direction is to consider how individuals decide whether to become an inventor or a worker. Another direction is to incorporate financial frictions in the model and to think about when an inventor becomes an entrepreneur.

Chapter 2 Estimate the Belief Bias in Learning from Coworkers

2.1 Introduction

Do we learn as much as we expected? This paper structurally estimates the belief deviation from the actual peer effect in a learning-from-coworkers setting.
In reality, “you can learn from experienced colleagues” is often seen in a job advertisement, and people do value it. One may agree to take a low wage just for the learning opportunities, for example, in an apprenticeship. The question is, is it worth it? Will the future wage be high enough to compensate for the sacrifices today?

This question is challenging because we can observe neither people’s beliefs nor the sorting in the hiring procedure. I develop a novel way to disentangle the belief from reality. I generalize the model proposed by Jarosch et al. (2021) by allowing workers to hold biased beliefs about how much they can learn from coworkers. In the model, workers receive two types of compensation: learning and wage. The model allows for general forms of production functions and incorporates varieties of sorting functions. I use the linked employee-employer data from Germany and find that workers overestimate how much they can learn from coworkers. Quantitatively, the actual learning parameters are only 12% of the perceived parameters. For example, if a worker believes her next period wage would increase by on average 10 Euros due to coworkers, in reality, it will only increase by about 1.2 Euros. Namely, the learning opportunity is largely overpriced. Junior workers overpay for the learning while senior workers are overpaid for the overestimated positive effects.

The model features two learning functions: one for reality and the other for belief. The correct learning function governs how knowledge evolves by learning from coworkers. Meanwhile, the perceived learning function, which may be biased from the correct one, determines the wage a worker is willing to take. I assume a competitive labor market. The compensation schedule includes a constant wage and an opportunity to learn from coworkers. Workers choose where to work by maximizing their lifetime earnings based on their belief.

The structural model only relies on team information, wages, and ages.
It is based on assumptions including a competitive labor market, a complete financial market, and stationarity. On the other hand, it is flexible on the firm side. The method works with general forms of complementarities across workers and production functions. Instead of working with a specific functional form, the estimation method uses the realized team composition.

I use the German matched employer-employee data to estimate the model. The data basis is the Longitudinal Model 1975-2017 of the Linked Employer-Employee Data from the Institute for Employment Research (IAB). The data were accessed on-site at the Research Data Centre (FDZ) of the Federal Employment Agency (BA) at the Institute for Employment Research (IAB) and via remote data execution at the FDZ. The dataset contains the complete workforce at a sample of establishments from 2008 to 2017, including characteristics like wage, age, and occupation. I use three dimensions of the data to calculate knowledge levels for each worker and identify the perceived and actual learning parameters. First, I use the wage gap in each group to back out the knowledge. Second, I use the time series of one’s wage and the team composition to estimate how the actual knowledge evolves. Third, I estimate the perceived learning functions by comparing the wages of workers who have the same knowledge but work in different teams.

Operationally, identifying both the perceived and correct learning parameters depends on the same assumption, but different samples. Methodologically, it is in the same spirit as the cross-validation method. I separate the sample based on workers’ knowledge. For each worker, if there is someone who has the same knowledge and age, but is in a different team, then I classify both into one of the paired groups. Otherwise, the worker belongs to the unpaired group.
I use the former groups to estimate to what extent the present value of future wages can make up for the current wage gap, and use the latter group to analyze the evolution of knowledge over time.

I find that workers overestimate how much they can learn from others. The actual impact of coworkers is only 12% of one’s belief. For example, if the average wage of a worker’s more knowledgeable coworkers increased by 100 Euros, according to the belief, her next period wage would increase by on average 8 Euros times the share of the more knowledgeable workers in the team. But in fact, her wage only increases by the share of the more knowledgeable workers times 0.9 Euros. Namely, the learning opportunity is largely overpriced. Junior workers overpay for the learning while senior workers are overpaid for the overestimated positive effects.

The results suggest that the bias in belief drives up wage inequality. In my sample, the unconditional variance of log wages is 0.21. If, rather than holding the current belief, workers know the correct learning function and firms hire the same group of workers, the unconditional variance drops to 0.17. I perform a simple variance decomposition. In the data, the between-team inequality is 0.16, and the within-team inequality is 0.05. In the counterfactual case, the between-team inequality drops slightly to 0.15, and the within-team inequality decreases to 0.015. It suggests that there is positive knowledge sorting across firms, and within-team inequality is largely caused by overestimation. Quantitatively, the biased belief accounts for 71% of the wage inequality within teams.

Related Literature. My paper provides empirical evidence on biased belief. It relates to a strand of the behavioral economics literature studying judgment biases; examples include Klos et al. (2005), Rabin and Vayanos (2010), Miller and Sanjurjo (2018), Gong et al. (2020) and Andrew and Adams-Prassl (2021).
Specifically, this paper finds that workers are over-optimistic, which is related to the overconfidence literature, for example Weinstein (1980), Carver et al. (2010), and Windschitl and Stuart (2015). This paper provides a novel way for researchers to identify unobserved belief by analyzing observed actions.

The paper is closely related to the empirical literature on within-team learning. The structural model of this paper is based on Jarosch et al. (2021), which estimates the peer effect empirically. The main difference is that Jarosch et al. (2021) uses within-team information, while my paper also explores the between-team characteristics. Nix (2020) also shows that there is a positive peer effect in the workplace. Broadly, this is also a part of the peer effect literature (Mas and Moretti, 2009; Cornelissen et al., 2017). I contribute to this literature by providing a method to explicitly disentangle the correct peer effect and the perceived peer effect.

This paper is also connected to the endogenous growth literature, which uses learning from others as the source of growth. Lucas Jr (2009), Lucas Jr and Moll (2014) and Perla et al. (2021) study the case in which agents randomly meet others to learn and consequently generate growth. Other works (Jovanovic and MacDonald, 1994; Jovanovic, 2014; Perla and Tonetti, 2014; Luttmer, 2014) in growth theory also explore the growth from imitation and idea adoption. My paper contributes to this literature by providing direct evidence on sorted meetings and learning.

The remainder of this paper is organized as follows. In Section 2.2, I present a model where agents learn from their coworkers based on the perceived learning function, and their knowledge evolves according to the correct learning function. Section 2.3 reports the main results. I propose an algorithm to structurally estimate the two learning functions using the data and discuss the implications of the biased belief. Section 2.4 concludes.
2.2 Model

I build a model based on Jarosch et al. (2021). There is a unit mass of heterogeneous individuals in the economy with knowledge $z \in Z = [0, \bar{z}]$. Individuals supply labor inelastically in a competitive labor market and consume their income. I assume that all individuals have a probability $\tau$ of dying each period. Anyone who remains alive at age $N = 65$ leaves the market exogenously. Each period, a mass $\delta$ of new individuals enters the market at age $n_0 = 22$ with knowledge drawn from the distribution $B_0(z)$. Agents are employed in firms and work in teams where they can learn from their coworkers. For a worker with knowledge $z$ who has coworkers $\tilde{z}$, her next period knowledge level $z'$ is drawn from the distribution $G(z' \mid z, \tilde{z})$ in reality, while the distribution is $\tilde{G}(z' \mid z, \tilde{z})$ in belief. Here I assume workers can learn from their coworkers’ knowledge. In the appendix I extend the model to consider the case in which some part of the knowledge cannot be learnt. Financial markets are complete, so individuals maximize the expected present value of income.

Firms produce in a competitive consumption good market. To enter a market, firms pay a fixed cost $c$, and then draw a technology $a \in \mathcal{A}$, with $a \sim A(a)$. In each period, a firm hires a team of workers $\mathbf{z}$ to produce according to its production function $F(\mathbf{z}; a)$, taking wages as given. As in Jarosch et al. (2021), $F(\mathbf{z}; a)$ satisfies minimal structure requirements. Potentially, the complementarity between workers and technologies can vary across firms or periods. Therefore, teams generally are heterogeneous across firms and across time.

2.2.1 Firms

Firms hire workers in a competitive labor market and take the wage schedule as given. Use $n$ to denote the total number of workers in the team. $z_i$ is the $i$th element of $\mathbf{z}$, and $\tilde{z}_{-i}$ is the set of $i$’s coworkers. The total wage $W(\mathbf{z})$ a firm pays if it hires the set of workers $\mathbf{z}$ can be written as

\[ W(\mathbf{z}) = \sum_{i=1}^{n} w(z_i, \tilde{z}_{-i}). \tag{2.1} \]

A firm chooses the team of workers to maximize its profit

\[ \pi(a) = \max_{\mathbf{z}} \; F(\mathbf{z}; a) - W(\mathbf{z}). \tag{2.2} \]

The optimal team setting $\mathbf{z}(a)$ satisfies

\[ \mathbf{z}(a) = \arg\max_{\mathbf{z}} \; F(\mathbf{z}; a) - W(\mathbf{z}). \tag{2.3} \]

2.2.2 Workers

To maintain a stationary distribution of workers, assume that the birth rate $\delta$ satisfies

\[ \delta = \tau + \frac{1}{N - n_0 + 1} (1 - \tau)^{N - n_0 + 1}. \tag{2.4} \]

Individuals decide where to work in each period. The payoff includes two parts: the wage $w$ and the learning opportunities. Therefore, an individual is willing to accept a lower wage if there are more attractive learning opportunities. The wage depends on both one’s knowledge and the coworkers’ knowledge. Individuals discount the future using a discount factor $\beta$. Because the worker leaves the market mandatorily at age $N$, the value function depends on the current age of the worker. The value function of an individual with knowledge $z$ and age $n$ is

\[ V(z, n) = \max_{\tilde{z} \in \tilde{Z}} \; w(z, \tilde{z}, n) + \beta \int V(z', n+1) \, d\tilde{G}(z' \mid z, \tilde{z}), \tag{2.5} \]

where $\tilde{Z}$ represents the set of all possible combinations of coworkers. It means that each period an individual chooses where to work to maximize her wage and her discounted future income. Based on the assumption of a competitive labor market, in equilibrium, the wage should be such that everyone is indifferent to working in any firm. Therefore, the value function does not depend on the coworkers’ knowledge. Specifically, when $n = N$, the value function equals the current wage

\[ V(z, N) = \max_{\tilde{z} \in \tilde{Z}} \; w(z, \tilde{z}, N). \tag{2.6} \]

This is because learning from coworkers does not matter anymore. Therefore, when $n = N$, the wage only depends on the worker’s own knowledge level $z$, not the coworkers’:

\[ w(z, \tilde{z}, N) = w(z, N). \tag{2.7} \]

In equilibrium, the wage schedule adjusts so that workers are indifferent across teams. The wage satisfies

\[ w(z, \tilde{z}, n) = V(z, n) - \beta \int V(z', n+1) \, d\tilde{G}(z' \mid z, \tilde{z}), \tag{2.8} \]

for any $z, \tilde{z}$ realized in the equilibrium. One implication is that for any $\tilde{z}, \tilde{z}'$

\[ w(z, \tilde{z}, n-1) - w(z, \tilde{z}', n-1) = -\beta \left[ \int V(z', n) \, d\tilde{G}(z' \mid z, \tilde{z}) - \int V(z', n) \, d\tilde{G}(z' \mid z, \tilde{z}') \right]. \tag{2.9} \]

The wage difference in period $n-1$ is used to compensate for the difference in learning opportunities. For example, suppose two identical workers $i$ and $j$ work in different teams. If $i$ can on average learn more from working than $j$ due to the team setting, then she will receive a lower wage.

2.2.3 Labor Market and Firm Entry

Define $B(z)$ as the cumulative distribution function of $z$. For any team of workers $\mathbf{z}$, let $N(\mathbf{z}, z)$ be the number of workers whose knowledge is not larger than $z$. Labor market clearing yields

\[ B(z) = m \int N(\mathbf{z}(a), z) \, dA(a), \quad \forall z, \tag{2.10} \]

where $m$ is the mass of firms in the economy. Firm entry satisfies the free entry condition

\[ \int [\pi(a) - c] \, dA(a) = 0. \tag{2.11} \]

2.2.4 Knowledge Distribution

Define $O(\tilde{z} \mid x): \tilde{Z} \times Z \to [0, 1]$ to be the equilibrium share of workers with knowledge $x$ whose coworkers are strictly dominated by the vector $\tilde{z}$. Therefore, for workers with knowledge $x$, the probability that their knowledge will be less than $z$ is $\int_{\tilde{z} \in \tilde{Z}} G(z \mid x, \tilde{z}) \, dO(\tilde{z} \mid x)$. The stationary distribution in equilibrium satisfies

\[ B(z) = \tau B_0(z) + (1 - \tau) \int_x \int_{\tilde{z} \in \tilde{Z}} G(z \mid x, \tilde{z}) \, dO(\tilde{z} \mid x) \, dB(x). \tag{2.12} \]

2.2.5 Equilibrium

A stationary competitive equilibrium includes a wage schedule $w$, a profit function $\pi$, an employment arrangement $\mathbf{z}$, a value function $V$, a mass of firms $m$, a stationary distribution $B$ and a coworker vector set $\tilde{Z}$, such that: (1) $w$ and $V$ solve (2.2) and (2.5); (2) $\mathbf{z}$ solves (2.2); (3) the labor market clears (2.10) for each knowledge level; (4) $\pi$ satisfies the free entry condition (2.11); (5) the law of motion of the knowledge distribution (2.12) holds.

Next, I summarize the assumptions in the model. The goal is to ensure that the value function $V(z, n)$ strictly increases in $z$. When comparing two vectors $\mathbf{z}_1$ and $\mathbf{z}_2$, they have the same length $N$ and are ordered by the element size. $\mathbf{z}_1 > \mathbf{z}_2$ holds when each element in $\mathbf{z}_1$ is larger than or equal to the corresponding element in $\mathbf{z}_2$, that is, $z_{1i} \geq z_{2i}$ for $i = 1, 2, \ldots, N$, and the inequality is strict for at least one element.

Assumption 1.
$F(\mathbf{z}; a)$ strictly increases in $\mathbf{z}$.

This assumption means that if the knowledge of one or some workers increases, the firm’s production will also increase.

Assumption 2. $\tilde{G}(z' \mid z, \tilde{z})$ and $G(z' \mid z, \tilde{z})$ strictly decrease in $z$ and $\tilde{z}$.

Assumption 2 requires that both in belief and in reality, for two workers having the same set of coworkers, the one with higher knowledge will stochastically learn more than the other. Also, for two identical workers, both in belief and in reality, the one who works with higher knowledge coworkers will have stochastically more knowledge next period.

Assumption 3. Free disposal of knowledge.

The three assumptions together lead to the following results.

Lemma 1. $w(z, \tilde{z}_1, n) \leq w(z, \tilde{z}_2, n)$, if $\tilde{z}_1 > \tilde{z}_2$.

Proof. Assumption 2 says $\tilde{G}(z' \mid z, \tilde{z})$ strictly decreases. The value function $V(z, n)$ is weakly increasing in $z$ according to Assumption 3. Hence, $\int V(z', n) \, d\tilde{G}(z' \mid z, \tilde{z})$ increases with $\tilde{z}$. $w(z, \tilde{z}, n)$ weakly decreases in $\tilde{z}$ using Equation (2.9).

Lemma 2. If a firm’s worker setting is $\mathbf{z}(a) = (z_1, \tilde{z})$, then for all $z_2 > z_1$, $w(z_1, \tilde{z}, n) < w(z_2, \tilde{z}, n)$ must hold.

Proof. Proof by contradiction. Suppose there are two workers with $z_1$ and $z_2$, where $z_1 > z_2$ and $w(z_1, \tilde{z}, n) \leq w(z_2, \tilde{z}, n)$. It means that by hiring $z_1$, the firm can offer a lower wage to this worker. By Assumption 1, firms can produce more output. Additionally, it is less costly for firms to hire the rest of the workers, because of Lemma 1. Hence, firms would always want to hire $z_1$ over $z_2$.

Proposition 1. $V(z, n)$ strictly increases in $z$.

Proof. Use mathematical induction. Consider $z_1 < z_2$.

First, when $n = N$, $V(z_1, N) = w(z_1, N) < w(z_2, N) = V(z_2, N)$ according to Lemma 2.

Second, assume $V(z_1, n) < V(z_2, n)$ holds for some $n$. Then

\[
\begin{aligned}
V(z_1, n-1) &= w(z_1, \tilde{z}, n-1) + \beta \int V(z', n) \, d\tilde{G}(z' \mid z_1, \tilde{z}) \\
&< w(z_2, \tilde{z}, n-1) + \beta \int V(z', n) \, d\tilde{G}(z' \mid z_2, \tilde{z}) \\
&= V(z_2, n-1).
\end{aligned}
\tag{2.13}
\]

Hence, $V(z, n)$ strictly increases in $z$.
The proposition means that one’s lifetime income strictly increases in her knowledge $z$. If two workers are the same age, then the one with a higher $z$ has a higher expected lifetime value.

2.3 Structural Estimation

Based on the model, this section structurally estimates the perceived and actual learning functions $\tilde{G}(\cdot)$ and $G(\cdot)$. The data basis is the Longitudinal Model 1975-2017 of the Linked Employer-Employee Data from the IAB. The data were accessed on-site at the Research Data Centre (FDZ) of the Federal Employment Agency (BA) at the Institute for Employment Research (IAB) and via remote data execution at the FDZ. The dataset contains the complete workforce at a sample of establishments from 2008 to 2017.

2.3.1 Data

This section briefly describes the key dimensions of the dataset. The establishment part includes all establishments that are covered by an annual survey at least once from 2009 to 2016. The individual part includes workers who were employed in any one of the sample establishments for at least one day during 2008-2017. For each individual, it also contains the employment biographies from 1975-2017. The data include worker-level information on the establishment, occupation, wage, and individual characteristics like age, gender, and education.

[Figure 2.1: Team Size Distribution. Histogram of frequency against team size. Note: The figure plots the team size distribution in the year 2009 for teams with 2-99 workers.]

I define working groups following Jarosch et al. (2021): a group is defined as a set with at least two workers in the same establishment and occupation in a given year. Figure 2.1 reports the team size distribution in 2009. The figure is truncated at 100 workers per team. The sample contains 24,118 teams. The size distribution is skewed, with a median of 5 people in contrast to an average of 23. Figure 2.2 shows the age distribution of all workers in teams. The average age is 43.
The 10th, 50th, and 90th percentiles are 28, 44, and 56, respectively. Figure 2.3 reports the distribution of the average daily wages during the year 2009 in 2015 Euros. The mass-point observations are the top-coding of the wage data. [1]

[1] Following Jarosch et al. (2021), I treat the top-coded observations as the actual wages and do not correct for top-coding. In total, 10.6% of observations are top-coded.

[Figure 2.2: Age Distribution. Histogram of frequency against age.]

[Figure 2.3: Wage Distribution. Histogram of frequency against wage. Note: Distribution of mean daily wages during spell overlapping 01/31/2009 for full-time employees working subject to social security.]

This paper builds upon the wage difference within teams. Figure 2.4 shows the distribution of the gap between the log wage of a worker and her coworkers. The mean is -0.015, and the quartiles are -0.09, 0.00, and 0.07.

[Figure 2.4: Wage Gap Distribution. Histogram of frequency against wage gap.]

2.3.2 Identifying Perceived and Actual Learning Parameters

I use three dimensions of the data to identify the perceived and actual learning parameters, as well as calculate knowledge levels for each worker. First, I use the wage gap in each group to back out the knowledge. Second, I use the time series of one’s wage and the team composition to estimate how the actual knowledge evolves. Third, I estimate the perceived learning functions by comparing the wages of workers who have the same knowledge level but work in different teams.

The identification only relies on the worker’s Bellman equation

\[ V(z, n) = w(z, \tilde{z}, n) + \beta \, E\left( V(z', n+1) \mid z, \tilde{z} \right). \tag{2.14} \]

The method does not impose any restrictions on firms’ production in addition to the previous assumptions, nor on the set of active firms; it simply uses all observed firms. Equation (2.14) is based on the workers’ value function (2.5). Assume that the economy is in equilibrium.
The observed data are the optimal allocation that maximizes everyone’s lifetime value. Since the labor market is perfectly competitive, workers are indifferent among all available offers, which means that the value function only depends on the knowledge $z$ and the age $n$.

Because $z$ does not have a natural cardinality, I choose a convenient one. If $w(z, n)$ is the wage for workers who work alone in the equilibrium, I choose a cardinality of $z$ so that $w(z, n) = z$. Therefore, Equation (2.14) can be written as

\[ V(z, n) = z + \beta \int V(z', n+1) \, d\tilde{G}(z' \mid z), \tag{2.15} \]

with a boundary condition

\[ V(z, N) = z. \tag{2.16} \]

Worker $i$’s next period knowledge includes two parts,

\[ z'_i = E(z'_i \mid z_i, \tilde{z}_{-i}) + \varepsilon_i, \tag{2.17} \]

a conditional expectation and an unexpected shock $\varepsilon_i$. Assume that $\varepsilon_i$ is fully random: it is independent of a worker’s own knowledge $z_i$ and of her coworkers’ knowledge $\tilde{z}_{-i}$. Based on this assumption, I construct the identification. All moment conditions are based on $E(\varepsilon_i \mid z_j) = 0, \forall i, j$, which is implied by the assumption that the shock is independent of workers’ knowledge.

I choose the functional forms for the two learning functions based on Jarosch et al. (2021). The actual learning function takes the following parametric form:

\[ E(z'_i \mid z_i, \tilde{z}_{-i}) = \frac{1}{I-1} \sum_{j \neq i} z_i \, \Theta\!\left( \frac{z_j}{z_i} \right), \tag{2.18} \]

and the perceived learning function is

\[ \tilde{E}(z'_i \mid z_i, \tilde{z}_{-i}) = \frac{1}{I-1} \sum_{j \neq i} z_i \, \tilde{\Theta}\!\left( \frac{z_j}{z_i} \right). \tag{2.19} \]

$I$ is the team size, and both $\Theta(\cdot)$ and $\tilde{\Theta}(\cdot)$ are weakly increasing functions. It means that when the coworkers’ knowledge increases, the expected learning results would improve in both belief and reality. The two learning functions share the same functional form but have different parameters:

\[ \Theta\!\left( \frac{z_j}{z_i} \right) =
\begin{cases}
1 + \theta_0 + \theta_+ \left( \frac{z_j}{z_i} - 1 \right) & \text{if } \frac{z_j}{z_i} > 1 \\
1 + \theta_0 + \theta_- \left( \frac{z_j}{z_i} - 1 \right) & \text{if } \frac{z_j}{z_i} \leq 1
\end{cases}
\tag{2.20} \]

and

\[ \tilde{\Theta}\!\left( \frac{z_j}{z_i} \right) =
\begin{cases}
1 + \tilde{\theta}_0 + \tilde{\theta}_+ \left( \frac{z_j}{z_i} - 1 \right) & \text{if } \frac{z_j}{z_i} > 1 \\
1 + \tilde{\theta}_0 + \tilde{\theta}_- \left( \frac{z_j}{z_i} - 1 \right) & \text{if } \frac{z_j}{z_i} \leq 1
\end{cases}
\tag{2.21} \]

Operationally, this paper estimates a case where $\theta_0 = \tilde{\theta}_0$, $\theta_+ = \gamma \tilde{\theta}_+$, $\theta_- = \gamma \tilde{\theta}_-$.
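The parametric restriction can be made concrete with a minimal sketch of the learning functions in (2.18) to (2.21). The function names and the example team are hypothetical, and the parameter values are rounded, illustrative numbers rather than the estimates themselves; the only structure taken from the text is the piecewise-linear form and the scaling of the coworker slopes by $\gamma$.

```python
import math


def learning_multiplier(ratio, theta0, theta_plus, theta_minus):
    """Piecewise-linear Theta(z_j / z_i) as in (2.20) and (2.21)."""
    slope = theta_plus if ratio > 1 else theta_minus
    return 1 + theta0 + slope * (ratio - 1)


def expected_next_knowledge(z_own, z_coworkers, theta0, theta_plus, theta_minus):
    """Average of z_i * Theta(z_j / z_i) over the I - 1 coworkers, as in (2.18)."""
    terms = [
        z_own * learning_multiplier(z_j / z_own, theta0, theta_plus, theta_minus)
        for z_j in z_coworkers
    ]
    return sum(terms) / len(terms)


# Illustrative (assumed) perceived parameters and perception gap.
theta0_tilde, theta_plus_tilde, theta_minus_tilde = 0.02, 0.08, 0.05
gamma = 0.12

# Hypothetical team: own knowledge 100, one coworker above and one below.
z_own, z_coworkers = 100.0, [200.0, 50.0]

# Perceived expectation uses the tilde parameters directly.
perceived = expected_next_knowledge(
    z_own, z_coworkers, theta0_tilde, theta_plus_tilde, theta_minus_tilde
)
# Actual expectation keeps theta0 and scales both coworker slopes by gamma.
actual = expected_next_knowledge(
    z_own, z_coworkers, theta0_tilde, gamma * theta_plus_tilde, gamma * theta_minus_tilde
)

print(perceived, actual)
```

In this example the part of expected knowledge growth attributed to coworkers is scaled by exactly $\gamma$ when moving from belief to reality, while the own-knowledge term $(1 + \theta_0) z_i$ is unchanged.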
Intuitively, $\theta_0 = \tilde{\theta}_0$ means workers know how knowledge evolves without learning from coworkers. They may, however, misperceive how much they are affected by coworkers, for reasons such as bias when estimating others’ knowledge levels. The parameter $\gamma$ captures this perception gap.

Based on the chosen functional form, the value function yields

\[ V(z_i, n_i) = z_i + \beta V\left( (1 + \theta_0) z_i, \, n_i + 1 \right). \tag{2.22} \]

Using

\[ V(z_i, N) = z_i \tag{2.23} \]

and solving backwards gives

\[ V(z_i, n_i) = \sum_{t=0}^{N - n_i} \left[ \beta (1 + \theta_0) \right]^t z_i = \tilde{\beta}_i z_i, \tag{2.24} \]

where

\[ \tilde{\beta}_i = \frac{1 - \left[ \beta (1 + \tilde{\theta}_0) \right]^{N - n_i + 1}}{1 - \beta (1 + \tilde{\theta}_0)} \]

is the discounting factor. It decreases with the age $n_i$. Namely, when workers get older, the learning opportunities are worth less.

The identification includes three steps. First, I calculate every worker’s knowledge level in each period. To do so, combine Equation (2.14) and Equation (2.24). The Bellman equation can be written as

\[ \tilde{\beta}_i z_i = w_i + \tilde{\beta}'_i \, \tilde{E}(z'_i \mid z_i, \tilde{z}_{-i}). \tag{2.25} \]

It means that a worker’s compensation includes two parts: the current wage and the future payoff related to knowledge. The future knowledge, in turn, depends on one’s coworkers’ knowledge. Therefore, a worker’s wage is affected by not only her own knowledge, but also her coworkers’ knowledge: the wage and learning opportunity need to be such that workers are indifferent among existing offers. The value of learning, in this scenario, is measured using the perceived learning function, since workers evaluate learning based on their belief. Thus, as long as the team information is available, individual knowledge can be backed out using the within-team wage distribution. Specifically, there is one equation for each worker. Given perceived learning parameters $\{\tilde{\theta}_0, \tilde{\theta}_+, \tilde{\theta}_-\}$, the knowledge level $z_i$ is a function of team members’ wages and ages.

Second, I estimate the actual learning parameters.
Based on the within-team information, I calculate workers’ knowledge level $z_i$ in each period, so the evolution of knowledge is observable. Meanwhile, the dataset provides comprehensive information on the team composition: in addition to the worker’s own knowledge over time, coworkers’ knowledge in every period is also known. Therefore, I calculate how a worker’s knowledge is affected by her coworkers by estimating the following equation:

\[ z'_i = (1 + \theta_0) z_i + \frac{1}{I-1} \left[ \theta_- \sum_{z_j \leq z_i} (z_j - z_i) + \theta_+ \sum_{z_j > z_i} (z_j - z_i) \right] + \varepsilon_i. \tag{2.26} \]

The parameters $\{\tilde{\theta}_0, \tilde{\theta}_+, \tilde{\theta}_-\}$ can be estimated by running a linear regression.

Third, I estimate the perception parameter $\gamma$ by comparing the perceived learning and the actual learning. The former is evaluated using the fair price of learning, measured by wages. The latter is directly estimated. In practice, since the dataset is big enough, there are always some pairs of workers who share the same knowledge level and age while working in different teams. Based on the assumptions, it means that, in each pair, the workers are indifferent between the two jobs. Namely, the perceived team composition difference is compensated by the wage gap between the jobs. In the special case when belief and reality coincide, the knowledge difference should be exactly made up by the wage gap. If workers $i$ and $j$ have the same knowledge level and the same age, then from Equation (2.9) their wages satisfy:

\[ \tilde{\beta}_i \gamma \left( z'_i - z'_j \right) = -\left( w(z_i, \tilde{z}_{-i}, n_i) - w(z_j, \tilde{z}_{-j}, n_j) \right) - \tilde{\beta}_i \gamma \left( \varepsilon_j - \varepsilon_i \right). \tag{2.27} \]

The left-hand side of Equation (2.27) is the perceived life value difference in one period; the right-hand side has two components: the wage gap and the perceived unexpected shocks. When $\gamma$ is 1, it is the case where workers have the correct belief about learning.
When $\gamma < 1$, the wage gap $-\left( w(z_i, \tilde{z}_{-i}, n_i) - w(z_j, \tilde{z}_{-j}, n_j) \right)$ on average is larger than the next period life value gap $\tilde{\beta}_i \left( z'_i - z'_j \right)$; workers sacrifice more wage than the value added by learning, that is, workers overestimate how much they can learn from coworkers.

Operationally, the moment conditions are

\[ E\left( z_j \varepsilon_i \mid \exists k, \, z_k = z_i \right) = 0. \tag{2.28} \]

For each age $n_i$ and corresponding $\tilde{\beta}_i$, there is one moment condition. Operationally, I use the moment conditions for $n = \{20, 21, \ldots, 64\}$. Together, the 45 moments identify $\gamma$. The algorithm is described below:

1. For every $\gamma$ in a reasonable range $[\gamma_-, \gamma_+]$, guess the perceived learning parameters $\{\tilde{\theta}_0, \tilde{\theta}_+, \tilde{\theta}_-\}_{\text{Guess}}$.
2. Given the $\gamma$ and $\{\tilde{\theta}_0, \tilde{\theta}_+, \tilde{\theta}_-\}_{\text{Guess}}$, back out the knowledge $z$ for all workers using Equation (2.25).
3. Pair workers according to knowledge and age: two workers are in a group if both knowledge and ages are the same. Workers who are paired are grouped into subgroups by age, and the rest are grouped together.
4. Use the unpaired group to run regression (2.26). Update the learning parameters $\{\tilde{\theta}_0, \tilde{\theta}_+, \tilde{\theta}_-\}_{\text{Guess}}$ according to the regression.
5. Repeat steps 2 to 4 until convergence.
6. Use the paired groups to estimate the moment conditions from equation system (2.27).
7. Find the $\gamma$ that minimizes the squared difference between data and model.

Identifying both $\{\tilde{\theta}_0, \tilde{\theta}_+, \tilde{\theta}_-\}$ and $\gamma$ depends on the assumption that the shocks $\varepsilon$ are independent of knowledge $z$. The key is to use two separate and representative samples to estimate the two sets of parameters. Methodologically, it is in the same spirit as the cross-validation method. In practice, I use the 45 paired samples to estimate $\gamma$, that is, to what extent the present value of future wages can make up for the current wage gap. I use the unpaired sample to analyze the knowledge evolution over time and to estimate the learning parameters.
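Two building blocks of the procedure above can be sketched in a few lines: the age-dependent discounting factor in (2.24) and the regressors that enter the linear regression (2.26). This is an illustrative sketch only, not the estimation code used for the results. The function names are hypothetical, and the value $\beta = 0.95$ is an assumption, since the discount factor is not reported in this section.

```python
import math


def discount_factor(age, beta, theta0, retirement_age=65):
    """Closed form of the discounting factor in (2.24) for a worker of a given age."""
    growth = beta * (1 + theta0)
    periods = retirement_age - age + 1
    if math.isclose(growth, 1.0):
        return float(periods)  # limit of the geometric sum when growth equals 1
    return (1 - growth ** periods) / (1 - growth)


def discount_factor_sum(age, beta, theta0, retirement_age=65):
    """Same object written as the finite sum over t = 0, ..., N - n in (2.24)."""
    growth = beta * (1 + theta0)
    return sum(growth ** t for t in range(retirement_age - age + 1))


def regression_row(z_own, z_coworkers):
    """Regressors of (2.26): z' = (1 + theta0) * x0 + theta_minus * x_minus + theta_plus * x_plus + eps."""
    n_coworkers = len(z_coworkers)  # this is I - 1
    x_minus = sum(z_j - z_own for z_j in z_coworkers if z_j <= z_own) / n_coworkers
    x_plus = sum(z_j - z_own for z_j in z_coworkers if z_j > z_own) / n_coworkers
    return (z_own, x_minus, x_plus)


beta_assumed = 0.95      # assumed value, not taken from the text
theta0_example = 0.01688

for age in (30, 50, 65):
    print(age, round(discount_factor(age, beta_assumed, theta0_example), 3))

print(regression_row(100.0, [200.0, 50.0]))
```

Stacking one such row per worker and period for the unpaired sample, and regressing next period knowledge on the three regressors without an intercept, would deliver $(1 + \theta_0)$, $\theta_-$ and $\theta_+$ as coefficients under the stated assumptions.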
2.3.3 Estimation Results

I use the estimation method developed above to estimate both the perceived learning parameters $\{\tilde{\theta}_0, \tilde{\theta}_+, \tilde{\theta}_-\}$ and the perception parameter $\gamma$. The results are reported in Table 2.1. The perceived learning parameters are reported in Panel A. $\tilde{\theta}_0$ is positive, which indicates that workers accumulate knowledge during work. This is similar to the common notion of learning-by-doing. The perceived learning from more and less knowledgeable workers is 0.08 and 0.05, respectively. It means that workers think that they are affected asymmetrically by their coworkers: workers are more heavily impacted by more knowledgeable coworkers than by the rest.

The numbers can be interpreted in terms of expected wage changes. Consider a case in which a worker’s more knowledgeable coworkers become even better. For example, if the average wage of a worker’s more knowledgeable coworkers increases by 100 Euros, according to the belief, her next period wage will increase by on average 8 Euros times the share of the more knowledgeable coworkers in the team. On the other hand, if the more knowledgeable coworkers stay the same, while the less knowledgeable coworkers become better with a 100 Euros wage increase, then her next period wage will go up by on average 5 Euros times the share of the less knowledgeable workers in the team. This estimation is consistent with the results in Jarosch et al. (2021). They also find workers are more affected by more knowledgeable coworkers.

The key finding is that the difference between real life and perception is huge, with $\gamma = 0.12$. It means that the actual impact of coworkers is only 12% of one’s belief. Therefore, in the previous example, when the average wage of a worker’s more knowledgeable coworkers increases by 100 Euros, her next period wage will increase by on average the share of the more knowledgeable coworkers in the team times 0.9 Euros, instead of 8 Euros.
If the more knowledgeable coworkers stay the same, while the less knowledgeable coworkers become better with a 100 Euros wage increase, then her next period wage will go up by on average the share of the less knowledgeable workers in the team times 0.6 Euros, rather than 5 Euros. It means that workers overestimate how much they can learn from coworkers. Namely, the learning opportunity is largely overpriced. Junior workers overpay for the learning while senior workers are overpaid for the overestimated positive effects.

Table 2.1: Estimation Results

Panel A: GMM Estimation
              θ̃_0          θ̃_+         θ̃_-         γ         obs
Estimation    0.01688      0.07826     0.05000     0.120     4188203
              (0.000001)   (0.00003)   (0.00003)   (0.005)

Panel B: The Actual Learning Parameters Based on the Estimation
              θ_0          θ_+         θ_-
Results       0.01688      0.00939     0.00600

Note: The standard deviation is reported in parentheses. Belief is estimated using GMM. The reality is inferred based on the estimation.

The estimation can be extended to incorporate other forms of learning functions. For example, the assumption $\theta_0 = \tilde{\theta}_0$, $\theta_+ = \gamma \tilde{\theta}_+$, $\theta_- = \gamma \tilde{\theta}_-$ can be relaxed. The same algorithm will still work.

2.3.4 Inequality

This section considers what would happen if the belief were correct, that is, $\theta_0 = \tilde{\theta}_0$, $\theta_+ = \tilde{\theta}_+$, $\theta_- = \tilde{\theta}_-$. According to the structural model, the learning opportunities contribute to the wage difference and consequently to income inequality. The overestimated learning opportunity, therefore, affects the income inequality.

Assume that suddenly people realize that their belief is biased and know the correct learning function. I focus on a special case where firms are willing to adjust the wages according to the new belief without changing team compositions. As a result, the wages of workers with more knowledge go down, since they were overpaid before. Meanwhile, less knowledgeable workers receive a wage raise. The inequality levels are shown in Table 2.2.
The first column shows the total wage inequality in the data, measured by the variance of log wages. The second column reports the wage inequality if there is no bias in belief. The total wage inequality drops from 0.21 to 0.17. To explore the mechanism, I compute a simple inequality decomposition and find that the change comes mainly from within-team inequality, which decreases by more than 70%, from 0.052 to 0.015. This result implies that the overestimation contributes significantly to within-team wage differences, where more knowledgeable workers are paid even more. Moreover, between-team inequality remains almost unchanged: it is 0.158 in the data and 0.152 under the correct belief. This suggests that there is positive sorting between firms.

Table 2.2: Inequality Change When Perception Is Correct
                                 Benchmark   Correct Belief
Wage Inequality                  0.210       0.167
Between Teams Wage Inequality    0.158       0.152
Within Team Wage Inequality      0.052       0.015
Note: the table reports the change in inequality when there is no bias in belief. The inequality is measured by the unconditional variance of log wages.

2.4 Conclusion

This paper presents evidence suggesting that people overestimate the magnitude of learning from coworkers. As a result, senior workers are overpaid for their overestimated impact on others, while junior workers are overcharged for the overstated learning opportunities they gain by working with experienced coworkers. This suggests that the biased belief may lead to an increase in within-team wage inequality.

This paper points to a plausible way to reduce inequality: narrowing the discrepancy between the perceived and the correct learning function. It also provides suggestive evidence on the sources of wage inequality. Between-firm inequality is largely due to knowledge differences, while within-firm inequality is mainly driven by learning opportunities.

There are multiple paths for future research.
One is to think about optimal policies to reduce inequality conditional on belief bias. Another is to explore the heterogeneity in bias and how to reduce biases.

Chapter 3
Elections and the Response to Crises: Evidence from the COVID-19 Pandemic¹

¹ Coauthored with Dr. Jihad Dagher.

3.1 Introduction

The Covid-19 crisis offers an unfortunate but unique opportunity to compare policy making in the face of a homogeneous and contemporaneous shock. A literature has studied the impact of elections on the response to crises (Alesina and Drazen, 1991; Grilli et al., 1991; Satyanath, 2005; Lipscy, 2020; Flores and Smith, 2013). Empirical investigations usually face challenges from the heterogeneity in the nature and timing of crises across countries. We explore this question in the context of countries’ economic and social response to the first Covid-19 wave, during which countries had to make comparable choices between limited tools. We find that closer elections consistently predict significantly larger fiscal stimulus and weaker containment measures. The magnitude is significantly larger for measures that mostly reduce economic activities.

While the literature on political budget cycles finds that such cycles have moderated over decades (Drazen, 2000), we find that electoral incentives have predictive power for the policy response to a major crisis. Arguably, confronted with the unprecedented uncertainty of the pandemic, the electorate is less likely to distinguish policy needs from politically motivated policies (Stiglitz, 2020). Further, transfers, which constituted a substantial share of fiscal stimulus, are known to have a more consistent political payoff (Manacorda et al., 2009). The welfare effect of closer elections is, however, uncertain: governments facing elections might have moved policy making along the indifference curve or improved lockdown at the intensive margin.

Our paper is related to a literature on the Covid-19 pandemic.
On the fiscal front, the literature has studied the economic determinants of the response, such as income, debt and credit rating (Alberola et al., 2021; Benmelech and Tzur-Ilan, 2020). Aizenman et al. (2021) explore economic, institutional and political factors, but not elections, and find that polarization hindered the response. On lockdown measures, our paper is most related to Pulejo and Querubín (2021), who focus on presidential systems and find that incumbents implement less stringent restrictions when the election is closer only if they can run for re-election. Our paper uses a different empirical model, exploring the level of democracy instead of term limits as a conditioning effect of electoral incentives, in both presidential and parliamentary systems, consistent with the international political budget cycle literature (De Haan and Klomp, 2013). Another methodological difference is that we use panel data instead of cross-sectional data, with nearly twice as many countries and over a longer time, to control for significant time effects and infection levels. A related literature has also identified electoral incentives at subnational levels (Chen et al., 2022; Gonzalez-Eiras and Niepelt, 2022). The literature has also identified political repercussions of lockdown (e.g. Fazio et al., 2021).

[Figure 3.1: The distribution of fiscal and containment measures across countries. Panel A: density of fiscal measures (% of GDP), June and September. Panel B: density of the average Stringency Index.]

3.2 Data

We explore the containment response to the crisis using data collected by Hale et al. (2020). The data on fiscal measures passed as of September 2020 are from the IMF. We also hand-collected such data as of three months (June 15) after the official start of the pandemic. We collect data on election dates, focusing on the executive branch election. We use parliamentary elections for parliamentary and semi-presidential systems, and presidential elections for presidential systems.
Based on this, we create a measure called “Time to Election” (TTE) that indicates how many years remain until the next election.

3.3 Empirical Methodology

Let $Y_i$ denote the policy measure in country $i$. Our baseline regression is:

\[ Y_i = \alpha + \mu_j + \beta \vec{X}_i + \theta_1 \text{Polity}_i + \theta_2 \text{TTE}_i + \theta_3 \text{Polity}_i \times \text{TTE}_i + \epsilon_i \tag{3.1} \]

where $\mu_j$ is a region dummy and $\vec{X}_i$ contains policy-measure-related controls. Infections per capita (per 10,000), GDP per capita, Polity, TTE and a parliamentary system dummy are common controls.

The regression exploits the exogenous heterogeneity in TTE to explore the role of elections. As is common in the literature (e.g. Brender and Drazen, 2005), our coefficient of interest is the interaction between TTE and Polity. Theoretically, elections only matter to the extent that the country has a reasonably high level of democracy (Polity). Note that TTE is predetermined before our sample period and not correlated with other explanatory variables. For $\theta_3 \neq 0$ to be due to a confounding factor, unrelated to the above interpretation, an unobservable factor would have to be correlated with either democracy or TTE and affect policy making through election timing or democracy level, respectively. Our identifying assumption is that neither is true (Brender and Drazen, 2005).

For containment measures, we explore this relation in panel data using a hybrid model (Allison, 2009), since neither Polity nor TTE has meaningful time variation. The hybrid model consists of differencing out country averages for the only time-varying independent variable (infection rates in our case) to form within variables, and including the averages as between variables. The panel regression is then estimated with random effects. Note that even in the presence of endogeneity between fiscal and containment measures, which we do not observe, our estimates of the coefficients of interest in our reduced-form setting are consistent (Wooldridge, 2010).

3.4 Results

Table 3.1 presents the results for fiscal measures.
The first column uses the June data. The interaction coefficient is negative and significant at the 5% level. It shows that democracy is associated with an increase in fiscal stimulus when elections are close (small TTE). The second column, using the data as of mid-September from the IMF, yields similar results. In terms of magnitude, a one standard deviation increase in TTE leads to a decrease of 0.33% of GDP in fiscal stimulus, evaluated at mean polity. In the third column we run a falsification test where the dependent variable is the fiscal balance of 2019, and the earlier findings no longer hold.

Table 3.1: Fiscal measures
                  (1)          (2)           (3)
                  Stimulus     Stimulus      Fiscal Balance
                  (June)       (September)   (2019)
Infection rate    0.0217       0.0316*       -0.00713
                  (1.31)       (1.72)        (-0.37)
GDP per capita    -0.103       -0.161        0.322
                  (-0.21)      (-0.28)       (0.47)
Credit rating     0.0465**     0.0762***     0.0371
                  (2.08)       (2.96)        (1.25)
Debt to GDP       0.0218***    0.0210**      -0.00132
                  (2.64)       (2.27)        (-0.13)
Polity            0.237*       0.232*        -0.0545
                  (1.99)       (1.70)        (-0.44)
TTE               0.242        0.264         0.165
                  (0.79)       (0.78)        (0.51)
Polity × TTE      -0.0763**    -0.0825**     -0.0210
                  (-2.07)      (-2.02)       (-0.53)
Parliamentary     -0.359       0.526         -2.407**
                  (-0.50)      (0.66)        (-2.56)
Constant          -0.547       -1.202        -5.200
                  (-0.16)      (-0.30)       (-1.09)
Observations      95           92            76
R^2               0.446        0.514         0.409
Notes: This table presents results from regression (3.1) in columns 1 and 2. All regressions control for regional fixed effects. Column 3 is from a falsification test where the dependent variable is the 2019 fiscal balance. t statistics in parentheses. * p<0.10, ** p<0.05, *** p<0.01

3.4.1 Containment measures

We create a dataset with weekly observations of the dependent variable, the Stringency Index (SI), from March 25th to September 9th. Figure 3.1 shows the SI distribution, ranging from 0 to 100. Table 3.2 presents results from the hybrid model. In the interest of space we only show the coefficients on the key political economy variables.
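The within-between construction used by the hybrid model can be illustrated with a minimal sketch. The data below are made up, and the function and field names are hypothetical; the sketch only shows how a time-varying regressor such as the infection rate is split into a country mean (the between part) and deviations from that mean (the within part) before a random-effects regression is run.

```python
from collections import defaultdict


def hybrid_decompose(rows):
    """Split a time-varying regressor into within and between components.

    rows: list of (country, week, value) tuples.
    Returns one dict per row with the country mean ("between") and the
    deviation from that mean ("within").
    """
    values_by_country = defaultdict(list)
    for country, _week, value in rows:
        values_by_country[country].append(value)

    # Country averages over all observed weeks.
    means = {c: sum(v) / len(v) for c, v in values_by_country.items()}

    return [
        {
            "country": country,
            "week": week,
            "within": value - means[country],
            "between": means[country],
        }
        for country, week, value in rows
    ]


# Made-up weekly infection rates for two hypothetical countries.
rows = [("A", 1, 2.0), ("A", 2, 4.0), ("B", 1, 10.0), ("B", 2, 14.0)]
decomposed = hybrid_decompose(rows)
for record in decomposed:
    print(record)
```

Time-invariant regressors such as Polity and TTE would enter the regression unchanged, since they have no within-country variation to difference out.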
The dependent variable in the first column is the aggregate measure. The interaction variable is positive and significant at the 1% significance level. Evaluated at average polity, a one standard deviation increase in TTE leads to an increase in SI of 1.37 points: when elections are further out, the stringency index is higher in democratic countries.

We next explore sub-indices and find that the interaction variable is positive and significant for each of them. The coefficient is larger for more economically costly measures, such as workplace closing and stay-at-home requirements, than for less costly measures such as school closing. In the last column we regress the difference between the indices on workplace closing and on school closing. The results confirm that elections have a significantly higher impact on the more economically costly measure.

3.5 Discussion

One reasonable explanation for our results is that, facing uncertainty about voter preferences, politicians are more likely to opt for a time-tested approach, which is to preserve and grow the economy (Stiglitz, 2020).

Table 3.2: Containment measures, sub-index level.
Columns: (1) Stringency Index; (2) School closing; (3) Workplace closing; (4) Cancel public events; (5) Restrictions on gatherings; (6) Close public transport; (7) Stay at home; (8) Work-school difference.

               (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
Polity         -1.744***  -1.092     -3.452***  -2.533***  -1.386     -2.407**   -2.064***  -2.588***
               (-3.59)    (-1.43)    (-4.49)    (-2.98)    (-1.45)    (-2.36)    (-2.75)    (-3.07)
TTE            -2.398**   -0.524     -5.093***  -4.029**   -4.803**   -2.960     -2.599     -4.977***
               (-2.21)    (-0.31)    (-3.01)    (-2.16)    (-2.26)    (-1.31)    (-1.57)    (-2.69)
Polity × TTE   0.533***   0.444**    0.876***   0.609**    0.569**    0.698**    0.723***   0.513**
               (4.01)     (2.09)     (4.07)     (2.55)     (2.16)     (2.48)     (3.47)     (2.16)
civil          -6.536**   -6.372     -0.907     -2.558     -9.013*    -7.361     -10.93***  5.641
               (-2.50)    (-1.59)    (-0.23)    (-0.59)    (-1.77)    (-1.36)    (-2.79)    (1.31)
Constant       118.8***   181.7***   85.82***   147.7***   110.1***   116.8***   100.0***   -94.46***
               (10.92)    (10.82)    (5.14)     (8.08)     (5.17)     (5.17)     (6.10)     (-5.22)
Observations   3025       3025       3025       3025       3025       3025       3024       3025
Overall R^2    .48        .41        .32        .26        .21        .27        .37        .22
Notes: This table presents results from a hybrid panel regression. Observations are at a weekly frequency between March 25 and September 9. The dependent variables are measures of lockdown stringency. * p<0.10, ** p<0.05, *** p<0.01

Our results should not be interpreted as implying welfare costs from elections. We also cannot rule out the possibility that elections led politicians to implement a more efficient lockdown.

Appendix A
The Distribution of Innovations across Firms

A.1 Algorithm

I use value function iteration to solve the model. First, I define a set of discrete points for normalized firm sizes $\{\tilde{q}_0, \tilde{q}_1, \ldots, \tilde{q}_N\}$ and for idea types $\{\theta_0, \theta_1, \theta_2, \ldots, \theta_M\}$. The algorithm employs a computational loop with the following steps:
1. Guess the measure of firms, the innovation without complementarity, the death rate and the growth rate $(N_F, \lambda_b, \tau, g)$.
2. Based on Equation (1.7), solve for the corresponding $\nu$.
3. For each inventor-firm pair $(\theta, \tilde{q})$, use Equation (1.26) to solve for the stock share offered $a$ and the utility level.
4. For each inventor $\theta$, find the firm $\tilde{q}$ that offers the highest utility level according to Equation (1.27).
5. Update the guess on the measure of firms, the innovation without complementarity, the death rate, and the growth rate $(N_F, \lambda_b, \tau, g)$ using Equations (1.31), (1.23), (1.33), and (1.35).
6. Repeat steps 2 to 5 until convergence.

A.2 Growth Rate

The growth rate of the aggregate quality $\bar{q}$ is by definition:

\[ g = \frac{E_t(\bar{q}(t+dt)) - \bar{q}(t)}{\bar{q}(t)\,dt}. \tag{A.1} \]

The expected average quality at $t+dt$ includes two parts: the quality of incumbents and the quality of new firms. $E_t(\bar{q}(t+dt))$ can be written as

\[ E_t(\bar{q}(t+dt)) = \underbrace{\int E\,q(t+dt)\, f_q(q,t)\,dq}_{\text{Incumbents}} + \underbrace{\tau\kappa\bar{q}\,dt}_{\text{Entry}} + \underbrace{[\gamma_L \lambda_{I,L} E(z \mid z > \hat{z}) + \gamma_H \lambda_{I,H} E(z)]\,\bar{q}\,dt}_{\text{Innovative Entry}}, \tag{A.2} \]

where

\[ \lambda_{I,L} = \frac{(1-h(0))(1-\Phi(\hat{z}))}{(1-h(0))(1-\Phi(\hat{z})) + h(0)}\,\lambda_I, \qquad \lambda_{I,H} = \frac{h(0)}{(1-h(0))(1-\Phi(\hat{z})) + h(0)}\,\lambda_I. \]

Rearrange the equation to get the growth rate:

\[ g = -\tau(1-\kappa) + \lambda_b N_F \gamma_L E(z) F(\hat{z}) + \gamma_H E(z \mid z \le \hat{z}) + \gamma_H E(z) \int h(q^*(\theta))\, \lambda^*_\theta\, \psi(\theta)\, d\theta. \tag{A.3} \]

A.3 Untargeted Moments: In-house versus outsourcing choice

The data moments are obtained from ESEE in 1990 and 1994. The first wave of ESEE is in 1990.
I run four specifications for 1990 and 1994 according to:

\[ y_{ft} = \text{constant}_t + \text{coeff}_t \times \ln(\text{Sales}_{ft}) + \epsilon_{ft}, \qquad t = 1990, 1994, \tag{A.4} \]

where $y_{ft}$ is the dependent variable of firm $f$ at time $t$. The four dependent variables are (1) the ratio of total R&D expenditure to sales, (2) the ratio of internal R&D expenditure to sales, (3) total R&D expenditure normalized by average sales, and (4) internal R&D expenditure normalized by average sales. The first two measure innovation intensity, and the other two measure the absolute level of innovation expenditure. The results are reported in Table A.1. They show that both total and internal R&D expenditure intensity decrease with firm size, while the absolute levels increase with firm size.

Table A.1: Firm R&D Investment Regressions¹
                (1)          (2)              (3)             (4)
                R&D/Sales    Internal/Sales   R&D/Avg Sales   Internal/Avg Sales
Panel A: Year = 1990
ln(Sales)       -0.009***    -0.006***        0.016***        0.012***
                (-3.93)      (-4.75)          (8.24)          (8.00)
Constant        0.174***     0.118***         -0.246***       -0.182***
                (4.61)       (5.62)           (-7.63)         (-7.40)
Observations    869          869              869             869
R-Squared       0.017        0.025            0.073           0.069
Panel B: Year = 1994
ln(Sales)       -0.001**     -0.001*          0.023***        0.013***
                (-2.02)      (-1.79)          (7.40)          (6.71)
Constant        0.042***     0.030***         -0.361***       -0.203***
                (3.53)       (3.14)           (-6.96)         (-6.28)
Observations    801          801              801             801
R-Squared       0.005        0.004            0.064           0.053
¹ The dependent variable used for each column is listed at the top of the column. The data are from the Survey on Business Strategies (ESEE) of the SEPI Foundation. The t statistics are in parentheses. * p<0.10, ** p<0.05, *** p<0.01

A.4 Counterfactuals

A.4.1 Counterfactual: Tax

Now I add a sales tax to the secondary market. It is a fixed cost $\Gamma$ per transaction, which means that the buyer now pays $(p_z + \Gamma)$ to purchase an innovation. The rest of the settings are the same as in the benchmark model.
The buyer’s zero-profit condition is

\[ \int_0^{\hat{z}} (\gamma_H \nu z Q - p_z - \Gamma)\, \phi(z)\, dz = 0, \tag{A.5} \]

where $\hat{z} = \frac{p_z}{\gamma_L \nu Q}$ is the step size threshold; a firm agrees to sell an innovation when it does not have complementarity and the step size is lower than $\hat{z}$. Define $\psi \equiv \frac{\Gamma}{\gamma_L \nu Q}$. Dividing (A.5) by $\nu Q$, the zero-profit condition can be rewritten as

\[ \int_0^{\hat{z}} (\gamma_H z - \gamma_L \hat{z} - \gamma_L \psi)\, \phi(z)\, dz = 0. \tag{A.6} \]

With the tax, it is more costly to buy innovations. Therefore, $\hat{z}$ is lower than in the benchmark case, and a firm is less likely to sell an innovation.

I evaluate a case where the tax level is 1% of the transaction price. The second column of Table C1 reports the results. The tax mainly affects the innovation distribution. The tax, opposite to a signal, adds friction to the secondary market. As a result, firms sell a smaller portion of innovations. There is a larger value gap between innovations implemented by a big firm and those implemented by a small firm, which makes big firms more attractive. Similar to the signal case, the tax also affects firms’ risk composition: fewer risks come from innovations. In this case, the second force is weaker than the first for all inventors. All innovations shift to larger firms. Quantitatively, the proportion of innovations in start-ups slides by 0.079 percentage points, which is a 25% drop. The shares in small firms (with fewer than 500 employees) and medium-small firms (with 500–20,000 employees) drop by 0.074 percentage points and 0.652 percentage points, respectively. The medium-large firms take a share 0.803 percentage points higher than before. For firms with more than 100,000 employees, the share increases slightly, by about 0.001 percentage points. The tax has a mild impact on the aggregate growth rate. This is because the effect is mitigated by endogenous contracts.

Table C1: Counterfactual on Secondary Market Tax: 1% of the Transaction Price¹
Moment                     Benchmark   Tax
Growth rate                2.00        -0.002
% Inno in startups         0.31        -0.079
% Inno, emp<500            7.64        -0.074
% Inno, emp∈[500,20k)      33.20       -0.652
% Inno, emp∈[20k,100k)     49.29       +0.803
% Inno, emp≥100k           9.57        +0.001
Pr(Sell)                   5.31        -0.045
¹ The table reports the percentage points difference with respect to the benchmark. For example, the growth rate change is calculated according to $(g_{\text{counter}} - g) \times 100$. The second column reports the results when there is a one-time innovation transaction tax. The tax rate is as low as 1% of the transaction price.

A.4.2 Counterfactual Decomposition: Signal

Table C2 compares the effects of adding a signal. The quantitative implications of endogenous contracting and matching are similar to those in the subsidy case. The first column is the benchmark model, in which there is no signal or subsidy. The second column reports the baseline counterfactual, where firms can adjust the contract terms and inventors can reallocate. The third column is the same counterfactual setting when firms cannot adjust contracts, but inventors can freely move to other firms. The fourth column is when firms can adjust contracts, but inventors cannot move to other firms. The fifth column considers the extreme case where neither contracts nor the inventor-firm mapping can change. The impacts on aggregate results and the firm-level changes are much smaller. Overall, both the free movement of inventors and the contract terms affect the firm-level outcomes and the aggregate growth rate.
Table C2: Endogenous Contracts and Matching: Signals ($R^2 = 0.4\%$)¹
                           (1)         (2)              (3)             (4)            (5)
Moment                     Benchmark   Baseline Model   Same Contract   Same Mapping   All Same
Growth rate                2.00        +0.025           +0.030          +0.022         +0.021
% Inno in startups         0.31        +0.814           +0.815          +0.001         +0.001
% Inno, emp<500            7.64        -1.008           -1.202          +0.008         +0.007
% Inno, emp∈[500,20k)      33.20       +3.666           +4.169          -0.000         -0.001
% Inno, emp∈[20k,100k)     49.29       +6.096           +5.785          -0.008         -0.006
% Inno, emp≥100k           9.57        -9.567           -9.567          -0.002         -0.001
Pr(Sell)                   5.31        +0.383           +0.403          +0.279         +0.278
¹ The table reports the percentage points difference with respect to the benchmark. For example, the growth rate change is calculated according to $(g_{\text{counter}} - g) \times 100$. The first column reports the benchmark case in percentage points. The second column analyzes the baseline case, where both the contracts and the firm-inventor mapping can adjust. The third and fourth columns study the cases in which the contracts and the firm-inventor mapping cannot adjust, respectively. The fifth column reports the results where neither the contracts nor the firm-inventor mapping can adjust.

A.5 Extensions

A.5.1 A More Recent Sample Period

In this section, I calibrate the model to the sample period from 1998 to 2010 to cover more recent observations. The firm-level data are from the Center for Research in Security Prices (CRSP) and the Merged CRSP-Compustat Database. I apply the statistical model derived in Section 1.3.1 to the public firm data to estimate the economy-wide moments. I use patent data for innovations. The patent data are from the Patent Examination Research Dataset and the Patent Assignment Dataset (PAD), both provided by the US Patent and Trademark Office (USPTO).¹ I link the firm-level data and the patent data using the linked CRSP-USPTO data provided by Kogan et al. (2017). I calibrate the model in the same way as explained in Section 1.3. The externally calibrated parameters are also the same as in Table 1.3. Other parameters are reported in Table C4 and Table C3.
The targeted moments are reported in Table C5. Table C6 shows the counterfactual results. The patterns are the same as in the baseline sample period.

¹ I estimate the probability of selling an innovation using PAD for two sample periods, then scale all probability-related moments using the ratio. This is because, rather than using all patent observations, Figueroa and Serrano (2019) keep only corporate patents, which account for about 75% of the whole patent sample. But “corporate patents” are not marked in PAD. Hence, to ensure the samples are consistent, I scale all patent trade-related moments using the same ratio.

Table C3: Directly Calibrated Parameters Given Indirect Inference Results (A More Recent Sample Period)
$\lambda_0$    $k_2$      $m$
0.021      0.119    0.007

Table C4: Indirect Inference Calibrated Parameters (A More Recent Sample Period)
$\kappa$     $\mu$      $\eta$     $k_1$    $\alpha$    $\beta_a$   $\beta_b$   $\gamma_H$
0.006    1.821    0.213    0.937    2.001    0.203    1.953    0.937

A.5.2 Financial Friction

The model implicitly incorporates the idea of financial frictions in the complementarity. The financial friction is an additive part of the complementarity under certain assumptions. One example is described below.

When an inventor finishes working, the firm pays the wage and purchases the stock from the inventor. The total spending is $W(\tilde{q})$. Firms can borrow from outside at the interest rate $r$ to finance the spending under a collateral constraint

\[ W(\tilde{q}) \le \varepsilon(\tilde{q})\, V(\tilde{q}), \tag{A.7} \]

where $\varepsilon(\tilde{q})$ is a random variable ($\varepsilon > 0$, with density $f_\varepsilon(\varepsilon)$). If the constraint binds, the firm cannot fully develop the business potential of the innovation. The firm can sell the innovation on the secondary market, where the buyer can further commercialize the innovation. In this case, the effect of financial frictions shows up exactly as the complementarity in the benchmark model.

A.5.3 Innovations as Substitutions

The benchmark model assumes that each innovation can improve a firm’s quality. Cunningham et al.
(2021) suggests that some innovations may be substitutions of existing technology. The model can incorporate this by adding one assumption: firms are exposed to random negative quality shocks, which are proportional to the aggregate new innovation arrival rate.

Table C5: Model Fit for Key Targeted Moments (A More Recent Sample Period)
Moment                                    Data     Model
Profitability                             0.11     0.11
Discount rate                             0.02     0.02
Entry rate                                0.07     0.07
Firm growth volatility                    0.17     0.17
Aggregate growth rate                     0.02     0.02
Average patent value                      0.03     0.03
Start-up buyout rate                      0.16     0.16
Pr(Sell|Big)¹                             0.09     0.09
Pr(Sell)                                  0.09     0.09
Growth-size relation $\beta_g$²                 -0.04    -0.04
% Inno, emp<500³                          0.07     0.07
% Inno, emp<2,000                         0.24     0.24
% Inno, emp<5,000                         0.34     0.35
60th pctl $\tilde{q}$ weighted by R&D            19.91    16.16
Average growth rate                       0.07     0.05
¹ Pr(Sell|Big) measures the probability of selling an innovation given that it is invented by a big firm. A “big firm” is defined as a firm with more than 500 employees, according to USPTO.
² $\beta_g$ is the coefficient of the growth-size regression.
³ The % Inno is the cumulative distribution function of innovations created in firms with less than a certain employment level. For example, “% Inno, emp<500” means the share of innovations, among all innovations created in this period, that are invented in a firm with fewer than 500 employees.

Table C6: Counterfactuals (A More Recent Sample Period)¹
                        (1)         (2)         (3)              (4)                (5)       (6)
Moment                  Benchmark   No Market   Signal           Signal             Subsidy   Tax
                                                $\alpha_\epsilon = 1$       $\alpha_\epsilon = 2.5$       1%        1%
                                                $R^2 = 0.2$        $R^2 = 0.6$
Growth rate             2.003       -0.266      0.031            0.122              0.003     -0.004
% Inno in startups      0.24        -0.240      0.32             4.27               0.098     -0.075
% Inno, q<0.5           6.43        -5.680      -0.410           -4.490             -0.10     -0.076
% Inno, q∈[0.5,20)      43.646      -11.185     1.864            7.484              0.582     -0.708
% Inno, q∈[20,100)      49.685      17.101      -1.775           -7.265             -0.585    0.858
% Inno, q≥100           0           0           0                0                  0         0
Pr(Sell)                2.451       -9.090      0.39             0.7                0.069     -0.079
¹ The table reports the percentage points difference with respect to the benchmark. For example, the growth rate change is calculated according to $(g_{\text{counter}} - g) \times 100$. The first column reports the benchmark case in percentage points. The second and third columns analyze a case in which there is a noisy signal when firms trade innovations on the secondary market. The noise $\epsilon$ follows a Pareto distribution with scale parameter 1 and shape parameter $\alpha_\epsilon$. The fourth (fifth) column reports the results when there is a one-time innovation transaction subsidy (tax). The tax rate is as low as 1% of the transaction value. The sixth column shows a case where firms cannot trade innovations at all.

A.5.4 Use Stock Options Instead of Equity

Using stock options mainly affects the constant transfer of a contract; stock options capture the equity variance and are protected by a lower bound. The same logic still applies: in a bigger firm, the innovation contributes to a smaller share of the equity value. Therefore, the only difference is the functional form of the variance associated with the incentive. Essentially, the mechanism behind the incentive problem does not change. The qualitative results still hold, while the quantitative results may be different.

A.5.5 Patent Originality Distribution

One possible real-life counterpart of effort-sensitivity is patent originality, as defined by Hall et al. (2001):

\[ \text{Originality}_j = 1 - \sum_{i}^{n_j} s_{ij}^2, \]

where $s_{ij}$ denotes the percentage of citations made by patent $j$ that belong to patent class $i$. A higher $\text{Originality}_j$ means that patent $j$ relies on technologies in many different fields, and hence it is more novel. Figure D1 plots the average originality by firm size in 1997. It decreases with firm size, which is consistent with the model implications.

[Figure D1: The Average Originality By Firm Size in 1997]

Appendix B
Estimate the Belief Bias in Learning from Coworkers

B.1 Multi-dimension Knowledge

In the baseline model, I assume that workers can learn from their coworkers’ knowledge.
One concern is that the knowledge, as a multi-dimension aggregation, may include components that one cannot learn. This section studies the impact of non-transmissible knowledge. Assume that the knowledge $z = x + y$ includes two parts: transmissible $x$ and non-transmissible $y$. A worker’s next-period non-transmissible knowledge level stays the same, and the transmissible knowledge level $x'$ is drawn from the distribution $\hat{G}(x' \mid x, \tilde{x})$. Assume the actual learning function takes the same functional form as (2.18):

\[ E(x'_i \mid x_i, \tilde{x}_{-i}) = (1+\theta_0)\,x_i + \frac{1}{I-1}\Big[\theta_+ \sum_{x_j > x_i} (x_j - x_i) + \theta_- \sum_{x_j < x_i} (x_j - x_i)\Big]. \tag{B.1} \]

Thus, the knowledge $z$ follows

\[ E(z'_i \mid z_i, \tilde{x}_{-i}) = y_i + (1+\theta_0)\,x_i + \frac{1}{I-1}\Big[\theta_+ \sum_{x_j > x_i} (x_j - x_i) + \theta_- \sum_{x_j < x_i} (x_j - x_i)\Big]. \tag{B.2} \]

The same functional form applies to the perceived learning function $\hat{\tilde{G}}(x' \mid x, \tilde{x})$. The wage function is

\[ w(x_i, y_i, \tilde{x}_{-i}, n_i) = x_i + y_i - \frac{\tilde{\beta}_i}{I-1}\Big[\theta_+ \sum_{x_j > x_i} (x_j - x_i) + \theta_- \sum_{x_j < x_i} (x_j - x_i)\Big]. \tag{B.3} \]

In the benchmark model, a worker’s next-period knowledge level $z'$ is assumed to be drawn from the distribution $G(z' \mid z, \tilde{z})$. The wage function is

\[ w(z_i, \tilde{z}_{-i}, n_i) = z_i - \frac{\tilde{\beta}_i}{I-1}\Big[\theta_+ \sum_{z_j > z_i} (z_j - z_i) + \theta_- \sum_{z_j < z_i} (z_j - z_i)\Big]. \tag{B.4} \]

Potentially, the misspecification means that workers are only affected by part of their coworkers’ knowledge, but the model treats it as if workers are affected by their coworkers’ aggregate knowledge. To understand how the misspecification affects the result, I consider two specific functional forms: the non-transmissible part is linear in knowledge, or independent of knowledge. In both cases, the belief bias is not affected by the learning function misspecification.

B.1.1 Case 1: the non-transmissible part is linear in knowledge

Assume that $y = \alpha x$. The wage is

\[ w(x_i, y_i, \tilde{x}_{-i}, n_i) = (1+\alpha)\,x_i - \frac{\tilde{\beta}_i}{I-1}\Big[\theta_+ \sum_{x_j > x_i} (x_j - x_i) + \theta_- \sum_{x_j < x_i} (x_j - x_i)\Big], \tag{B.5} \]

which, since $z = (1+\alpha)x$, can be written as

\[ w(z_i, \tilde{z}_{-i}, n_i) = z_i - \frac{\tilde{\beta}_i}{I-1}\Big[\frac{\theta_+}{1+\alpha} \sum_{z_j > z_i} (z_j - z_i) + \frac{\theta_-}{1+\alpha} \sum_{z_j < z_i} (z_j - z_i)\Big]. \tag{B.6} \]

If the model is estimated using the misspecified model in Equation (B.4), it is the same as using Equation (B.6). Therefore, the estimated knowledge level is $z$, the aggregate knowledge level. The estimated learning parameters $\hat{\theta}_+$ and $\hat{\theta}_-$ satisfy $\hat{\theta}_+ = \frac{\theta_+}{1+\alpha}$ and $\hat{\theta}_- = \frac{\theta_-}{1+\alpha}$. The same bias applies to the perceived learning parameters: $\hat{\tilde{\theta}}_+ = \frac{\tilde{\theta}_+}{1+\alpha}$ and $\hat{\tilde{\theta}}_- = \frac{\tilde{\theta}_-}{1+\alpha}$. As a result, the relationship between $\theta_+, \theta_-$ and $\tilde{\theta}_+, \tilde{\theta}_-$, and hence the estimate of $\gamma$, is not affected by the misspecification.

B.1.2 Case 2: the non-transmissible part is independent of knowledge

Assume that $y_i \perp z_j, \forall i, j$. The wage is

\[ w(x_i, y_i, \tilde{x}_{-i}, n_i) = x_i + y_i - \frac{\tilde{\beta}_i}{I-1}\Big[\theta_+ \sum_{x_j > x_i} (x_j - x_i) + \theta_- \sum_{x_j < x_i} (x_j - x_i)\Big], \tag{B.7} \]

which can be written as

\[ w(z_i, \tilde{z}_{-i}, n_i) = z_i - \frac{\tilde{\beta}_i}{I-1}\Big[\theta_+ \sum_{z_j > z_i} (z_j - z_i) + \theta_- \sum_{z_j < z_i} (z_j - z_i)\Big] + u_i, \tag{B.8} \]

where $u_i = \frac{\tilde{\beta}_i}{I-1}\Big[\theta_+ \sum_{z_j > z_i} (y_j - y_i) + \theta_- \sum_{z_j < z_i} (y_j - y_i)\Big]$, and $u_i \perp x_j, \forall i, j$. Therefore, the estimated knowledge level is $\hat{z}_i = z_i + \eta_i$, with $z_i \perp \eta_i$. The regression results are not affected by the misspecification; namely, the estimate of $\gamma$ is not affected by the misspecification.

Appendix C Bibliography

Acemoglu, D. (1996): “A Microfoundation for Social Increasing Returns in Human Capital Accumulation,” The Quarterly Journal of Economics, 111, 779–804.
Acemoglu, D., U. Akcigit, H. Alp, N. Bloom, and W. Kerr (2018): “Innovation, Reallocation, and Growth,” American Economic Review, 108, 3450–3491.
Acikalin, U., T. Caskurlu, G. Hoberg, and G. M. Phillips (2022): “Intellectual Property Protection Lost and Competition: An Examination Using Machine Learning.”
Aghion, P., C. Harris, P. Howitt, and J. Vickers (2001): “Competition, Imitation and Growth with Step-by-Step Innovation,” The Review of Economic Studies, 68, 467–492.
Aghion, P. and P.
Howitt (1992): “A Model of Growth through Creative Destruction,” Econometrica, 60, 323–351.
Aghion, P. and J. Tirole (1994): “The Management of Innovation,” The Quarterly Journal of Economics, 109, 1185–1209.
Aizenman, J., Y. Jinjarak, H. Nguyen, and I. Noy (2021): “The Political Economy of the COVID-19 Fiscal Stimulus Packages of 2020,” Tech. rep., NBER.
Akcigit, U., M. A. Celik, and J. Greenwood (2016): “Buy, Keep, or Sell: Economic Growth and the Market for Ideas,” Econometrica, 84, 943–984.
Akcigit, U. and W. R. Kerr (2018): “Growth through Heterogeneous Innovations,” Journal of Political Economy, 126, 1374–1443.
Akerlof, G. A. (1978): “The Market for ‘Lemons’: Quality Uncertainty and the Market Mechanism,” in Uncertainty in Economics, Elsevier, 235–251.
Alberola, E., Y. Arslan, G. Cheng, and R. Moessner (2021): “Fiscal response to the COVID-19 crisis in advanced and emerging market economies,” Pacific Economic Review, 26, 459.
Alesina, A. and A. Drazen (1991): “Why are stabilizations delayed?” The American Economic Review, 81, 1170.
Allison, P. D. (2009): Fixed effects regression models, SAGE.
Andrew, A. and A. Adams-Prassl (2021): “Revealed Beliefs and the Marriage Market Return to Education.”
Bena, J. and K. Li (2014): “Corporate Innovations and Mergers and Acquisitions,” The Journal of Finance, 69, 1923–1960.
Benmelech, E. and N. Tzur-Ilan (2020): “The determinants of fiscal and monetary policies during the COVID-19 crisis,” Tech. rep., NBER.
Boldrin, M. and D. K. Levine (2013): “The Case Against Patents,” Journal of Economic Perspectives, 27, 3–22.
Brender, A. and A. Drazen (2005): “Political budget cycles in new versus established democracies,” Journal of Monetary Economics, 52, 1271.
Brickley, J. A. and K. T. Hevert (1991): “Direct employee stock ownership: An empirical investigation,” Financial Management, 70–84.
Budish, E., B. N. Roin, and H. Williams (2015): “Do Firms Underinvest in Long-term Research?
Evidence from Cancer Clinical Trials,” American Economic Review, 105, 2044–2085.
Cabral, L. (2018): “Standing on the Shoulders of Dwarfs: Dominant Firms and Innovation Incentives.”
Carver, C. S., M. F. Scheier, and S. C. Segerstrom (2010): “Optimism,” Clinical Psychology Review, 30, 879–889.
Cassiman, B. and R. Veugelers (2006): “In Search of Complementarity in Innovation Strategy: Internal R&D and External Knowledge Acquisition,” Management Science, 52, 68–82.
Chatterjee, S. and E. Rossi-Hansberg (2012): “Spinoffs and the Market for Ideas,” International Economic Review, 53, 53–93.
Chen, Q., Q. Huang, C. Liu, and P. Wang (2022): “Career incentives of local leaders and crisis response: A case study of COVID-19 lockdowns in China,” European Journal of Political Economy, 102180.
Chiu, J., C. Meh, and R. Wright (2017): “Innovation and Growth with Financial, and Other, Frictions,” International Economic Review, 58, 95–125.
Coase, R. H. (1937): “The Nature of the Firm,” Economica, 4, 386–405.
Comin, D. and T. Philippon (2005): “The Rise in Firm-level Volatility: Causes and Consequences,” NBER Macroeconomics Annual, 20, 167–201.
Cornelissen, T., C. Dustmann, and U. Schönberg (2017): “Peer Effects in the Workplace,” American Economic Review, 107, 425–56.
Cunningham, C., F. Ederer, and S. Ma (2021): “Killer Acquisitions,” Journal of Political Economy, 129, 649–702.
De Haan, J. and J. Klomp (2013): “Conditional political budget cycles: a review of recent evidence,” Public Choice, 157, 387.
Decker, R. A., J. Haltiwanger, R. S. Jarmin, and J. Miranda (2016): “Where Has All the Skewness Gone? The Decline in High-growth (Young) Firms in the US,” European Economic Review, 86, 4–23.
Drazen, A. (2000): “The political business cycle after 25 years,” NBER Macroeconomics Annual, 15, 75.
Eaton, J. and S. Kortum (1996): “Trade in Ideas Patenting and Productivity in the OECD,” Journal of International Economics, 40, 251–278.
Fazio, A., T. Reggiani, and F.
Sabatini (2021): “The political cost of lockdown’s enforcement,” Tech. rep., MUNI ECON Working Paper.
Figueroa, N. and C. J. Serrano (2019): “Patent Trading Flows of Small and Large Firms,” Research Policy, 48, 1601–1616.
Flores, A. Q. and A. Smith (2013): “Leader survival and natural disasters,” British Journal of Political Science, 821.
Fons-Rosen, C., P. Roldan-Blanco, and T. Schmitz (2021): “The Aggregate Effects of Acquisitions on Innovation and Economic Growth.”
Frésard, L., G. Hoberg, and G. M. Phillips (2020): “Innovation Activities and Integration through Vertical Acquisitions,” The Review of Financial Studies, 33, 2937–2976.
Galasso, A. and M. Schankerman (2015): “Patents and Cumulative Innovation: Causal Evidence from the Courts,” The Quarterly Journal of Economics, 130, 317–369.
Gao, X., J. R. Ritter, and Z. Zhu (2013): “Where Have All the IPOs Gone?” Journal of Financial and Quantitative Analysis, 48, 1663–1692.
Gong, Y., R. Stinebrickner, and T. Stinebrickner (2020): “Marriage, Children, and Labor Supply: Beliefs and Outcomes,” Journal of Econometrics.
Gonzalez-Eiras, M. and D. Niepelt (2022): “The political economy of early COVID-19 interventions in US states,” Journal of Economic Dynamics and Control, 104309.
Grilli, V., D. Masciandaro, and G. Tabellini (1991): “Political and monetary institutions and public financial policies in the industrial countries,” Economic Policy, 6, 341.
Grossman, S. J. and O. D. Hart (1986): “The Costs and Benefits of Ownership: A Theory of Vertical and Lateral Integration,” Journal of Political Economy, 94, 691–719.
Hale, T., A. Petherick, T. Phillips, and S. Webster (2020): “Variation in government responses to COVID-19,” Blavatnik School of Government Working Paper, 31, 2020.
Hall, B. H., A. B. Jaffe, and M. Trajtenberg (2001): “The NBER Patent Citation Data File: Lessons, Insights and Methodological Tools.”
Hart, O. and J.
Moore (1990): “Property Rights and the Nature of the Firm,” Journal of Political Economy, 98, 1119–1158.
Hart, O. and J. Moore (2008): “Contracts As Reference Points,” The Quarterly Journal of Economics, 123, 1–48.
Heckler, D. E. (2005): “High-technology Employment: A NAICS-based Update,” Monthly Lab. Rev., 128, 57.
Higgins, M. J. and D. Rodriguez (2006): “The Outsourcing of R&D through Acquisitions in the Pharmaceutical Industry,” Journal of Financial Economics, 80, 351–383.
Jarosch, G., E. Oberfield, and E. Rossi-Hansberg (2021): “Learning from Coworkers,” Econometrica, 89, 647–676.
Jovanovic, B. (2014): “Misallocation and Growth,” American Economic Review, 104, 1149–71.
Jovanovic, B. and G. M. MacDonald (1994): “The Life Cycle of a Competitive Industry,” Journal of Political Economy, 102, 322–347.
Klette, T. J. and S. Kortum (2004): “Innovating Firms and Aggregate Innovation,” Journal of Political Economy, 112, 986–1018.
Klos, A., E. U. Weber, and M. Weber (2005): “Investment Decisions and Time Horizon: Risk Perception and Risk Behavior in Repeated Gambles,” Management Science, 51, 1777–1790.
Kogan, L., D. Papanikolaou, A. Seru, and N. Stoffman (2017): “Technological Innovation, Resource Allocation, and Growth,” The Quarterly Journal of Economics, 132, 665–712.
Lentz, R. and D. T. Mortensen (2008): “An Empirical Model of Growth through Product Innovation,” Econometrica, 76, 1317–1373.
Lentz, R. and D. T. Mortensen (2016): “Optimal Growth through Product Innovation,” Review of Economic Dynamics, 19, 4–19.
Lipscy, P. Y. (2020): “COVID-19 and the Politics of Crisis,” International Organization, 74, E98.
Liu, E. and S. Ma (2021): “Innovation Networks and Innovation Policy.”
Lucas Jr, R. E. (2009): “Ideas and Growth,” Economica, 76, 1–19.
Lucas Jr, R. E. and B. Moll (2014): “Knowledge Growth and the Allocation of Time,” Journal of Political Economy, 122, 1–51.
Luttmer, E. G. (2014): “Knowledge Diffusion, Growth, and Inequality,” University of Minnesota, Working Paper (forthcoming).
Ma, Y.
(2022): “Specialization in a Knowledge Economy.”
Manacorda, M., E. Miguel, and A. Vigorito (2009): “Government Transfers and Political Support.”
Mas, A. and E. Moretti (2009): “Peers at Work,” American Economic Review, 99, 112–45.
Miller, J. B. and A. Sanjurjo (2018): “Surprised by the Hot Hand Fallacy? A Truth in the Law of Small Numbers,” Econometrica, 86, 2019–2047.
Nix, E. (2020): “Learning Spillovers in the Firm.”
Perla, J. and C. Tonetti (2014): “Equilibrium Imitation and Growth,” Journal of Political Economy, 122, 52–76.
Perla, J., C. Tonetti, and M. E. Waugh (2021): “Equilibrium Technology Diffusion, Trade, and Growth,” American Economic Review, 111, 73–128.
Phillips, G. M. and A. Zhdanov (2013): “R&D and the Incentives from Merger and Acquisition Activity,” The Review of Financial Studies, 26, 34–78.
Pulejo, M. and P. Querubín (2021): “Electoral concerns reduce restrictive measures during the COVID-19 pandemic,” Journal of Public Economics, 198, 104387.
Rabin, M. and D. Vayanos (2010): “The Gambler’s and Hot-hand Fallacies: Theory and Applications,” The Review of Economic Studies, 77, 730–778.
Romer, P. M. (1986): “Increasing Returns and Long-Run Growth,” Journal of Political Economy, 94, 1002–1037.
Romer, P. M. (1990): “Endogenous Technological Change,” Journal of Political Economy, 98, S71–S102.
Satyanath, S. (2005): Globalization, politics, and financial turmoil: Asia’s banking crisis, Cambridge University Press.
Schmitz, P. W. (2005): “Allocating Control in Agency Problems with Limited Liability and Sequential Hidden Actions,” RAND Journal of Economics, 318–336.
Silveira, R. and R. Wright (2010): “Search and the Market for Ideas,” Journal of Economic Theory, 145, 1550–1573.
Stiglitz, J. E. (2020): “The pandemic economic crisis, precautionary behavior, and mobility constraints: an application of the dynamic disequilibrium model with randomness.”
Weinstein, N. D.
(1980): “Unrealistic Optimism about Future Life Events,” Journal of Personality and Social Psychology, 39, 806.
Windschitl, P. D. and J. O. Stuart (2015): “Optimism Biases: Types and Causes,” The Wiley Blackwell Handbook of Judgment and Decision Making, 2, 431–455.
Wooldridge, J. M. (2010): Econometric analysis of cross section and panel data, Cambridge: MIT Press.
Abstract
This dissertation aims to understand the sources of growth and how countries use policy tools to respond to COVID-19. Chapter 1 builds a macroeconomic framework with heterogeneous firms and heterogeneous inventors to quantitatively understand the allocation of innovations across firms and the implications of innovation tradability. The model characterizes the entire innovation process, including both firms hiring inventors to innovate and the trading of innovations across firms. In a counterfactual scenario where firms cannot sell innovations, inventors move to larger firms, and growth drops by 0.166 percentage points.
Chapter 2 studies belief bias in the workplace. I build a structural model in which workers can learn from coworkers. They choose where to work based on both the wage and perceived learning opportunities. I propose a methodology to separately estimate the perceived and the correct learning functions, building on the observation that learning is priced by a competitive market based on beliefs. Using German administrative data, the estimation results show that workers overestimate how much they can learn from coworkers by a factor of eight. This implies that more knowledgeable workers are overpaid, which increases within-team inequality.
Chapter 3, co-authored with Jihad Dagher, studies how elections shape policy-making during a crisis. We use the COVID-19 crisis to examine this issue under a homogeneous and contemporaneous shock. We find that closer elections predict more generous fiscal packages and less restrictive containment measures. Exploring the heterogeneity in containment measures, we find that elections reduce most strongly the restrictions that have the greatest impact on economic activity.
Linked assets
University of Southern California Dissertations and Theses
Conceptually similar
Essays on narrative economics, Climate macrofinance and migration
Essays on Japanese macroeconomy
The changing policy environment and banks' financial decisions
Essays on firm investment, innovation and productivity
Three essays on heterogeneous responses to macroeconomic shocks
Three essays on macro and labor finance
Intergenerational transfers & human capital investments in children in the era of aging
Innovation: financial and economics considerations
Inter-temporal allocation of human capital and economic performance
Essays on estimation and inference for heterogeneous panel data models with large n and short T
Essays in environmental economics
Essays on information and financial economics
The impact of the COVID-19 pandemic on K–12 public school districts in southern California: responses of superintendents, assistant superintendents, and principals
The aftermath of the Korean War: traumas and memories in the Korean post-war generation and visual art
Essays on competition and strategy within platform industries
We care about you during trying times: analyzing U.S. Fortune 500 companies' Facebook posts on COVID-19 responses
Perception of Work Intensification and Well-Being Among Hybrid University Staff in the Post-COVID-19 Context
Essays on sovereign debt
The impact of minimum wage on labor market dynamics in Germany
In health, in sickness: romantic relationships during the COVID-19 pandemic
Asset Metadata
Creator: Yang, Shaoshuang (author)
Core Title: Essays on innovation, human capital, and COVID-19 related policies
School: College of Letters, Arts and Sciences
Degree: Doctor of Philosophy
Degree Program: Economics
Degree Conferral Date: 2023-05
Publication Date: 04/04/2023
Defense Date: 03/07/2023
Publisher: University of Southern California (original), University of Southern California. Libraries (digital)
Tag: COVID-19, growth, human capital, innovation, OAI-PMH Harvest, policy
Format: theses (aat)
Language: English
Contributor: Electronically uploaded by the author (provenance)
Advisor: Imrohoroglu, Ayse (committee chair), Kurlat, Pablo (committee chair), Hoberg, Gerard (committee member), Zeke, David (committee member)
Creator Email: shaoshuang998@gmail.com, shaoshuy@usc.edu
Permanent Link (DOI): https://doi.org/10.25549/usctheses-oUC112932357
Unique identifier: UC112932357
Identifier: etd-YangShaosh-11563.pdf (filename)
Legacy Identifier: etd-YangShaosh-11563
Document Type: Dissertation
Rights: Yang, Shaoshuang
Internet Media Type: application/pdf
Type: texts
Source: 20230405-usctheses-batch-1016 (batch), University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions: The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the author, as the original true and official version of the work, but does not grant the reader permission to use the work if the desired use is covered by copyright. It is the author, as rights holder, who must provide use permission if such use is covered by copyright. The original signature page accompanying the original submission of the work to the USC Libraries is retained by the USC Libraries and a copy of it may be obtained by authorized requesters contacting the repository e-mail address given.
Repository Name: University of Southern California Digital Library
Repository Location: USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA
Repository Email: cisadmin@lib.usc.edu