Three Essays On Behavioral Finance With Rational Foundations

Suk won Lee

Dissertation prepared for the degree of Doctor of Philosophy in Business Administration
Faculty of the USC Graduate School
University of Southern California
May 10, 2019

Contents

Part I  Disaster In My Heart: A Visceral Explanation for Asset Pricing Puzzles
1 Introduction
2 A Simplified Model
  2.1 The Notion of Dis-utility Shocks
    2.1.1 Discussions
    2.1.2 Utility Function With Dis-utility Shocks: A First Pass
  2.2 Timeline
  2.3 Assumptions
  2.4 The Utility Function Under Assumption 2
  2.5 Results
3 Full Model: the Setup
  3.1 Multiple Agents, Longer Horizon
  3.2 Consumption Process
  3.3 Dis-utility Process
  3.4 Utility Function
  3.5 The Nature of Market Incompleteness and the State Space
    3.5.1 Non-insurability of Dis-utility Shocks
    3.5.2 The Consumption Goods Market
    3.5.3 Relevant State Variables and Their Transition Probabilities
4 Asset Pricing with Dis-utility Shocks
  4.1 Model Intuition
  4.2 Sequential Markets Equilibrium (SME) and Asset Prices
    4.2.1 The Optimization Problem
    4.2.2 Equity and Bond Prices in Equilibrium
    4.2.3 The Stochastic Discount Factor
5 Numerical Method
  5.1 State Space Reduction
  5.2 Computing the Stochastic Discount Factor
  5.3 Stationary Distribution
  5.4 Convergence Issues
6 Results and Discussion
  6.1 Stock Returns and the Risk-free Rate
    6.1.1 Why Dis-utility Shocks Raise Stock Returns
    6.1.2 Why Dis-utility Shocks Lower the Risk-free Rate and Keep It Stable
  6.2 Multiple Agents
    6.2.1 Why Equity Premium Rises With N_A
    6.2.2 Why Asset Prices Will Converge With N_A → ∞
  6.3 "Risk Aversion" (γ)
  6.4 Persistence of Shock (φ)
    6.4.1 Stock Returns
    6.4.2 Risk-free Rate
  6.5 Strength of Assumption 1: q_2 − q_1
  6.6 Strength of Assumption 2: η
  6.7 Size of Shock
  6.8 Price-Dividend Ratio: Pro-cyclical and Autocorrelated
  6.9 Price-Dividend Ratio: Return Predictability
    6.9.1 The Mechanism of Return Predictability
    6.9.2 The Relationship Between B_t and r^e_{t+1}
    6.9.3 Predictability Regression and an Alternative Specification
7 Conclusion
8 References
Appendices
A Support for Assumptions 1 & 2
  A.1 Additional Supporting Data for Assumption 1
  A.2 Lemma 2: "D Increases MU_C"
B Lemma 1
  B.1 Proof
  B.2 A 'Sanity Check' Using Lemma 1-(iii)
C Sources of Dis-utility Shocks Remain Stable Over Time
D Lemma 3: "Single-Period Bond is Enough"
E Lemma 4
F Construction of the Probability Transition Matrix (P)
G Numerical Method
  G.1 'Progressive Tax' Algorithm
  G.2 State-Space Reduction
    G.2.1 |C|: Time Invariance
    G.2.2 N_A: Exploiting Symmetry of the Problem
    G.2.3 N_B: Discretization of B_t
  G.3 Faster Convergence (State Space Regulation)
    G.3.1 'Lebesgue Approximation'
    G.3.2 Monte Carlo vs. Nullspace Approach

Part II  Rolling the Skewed Die: Economic Foundations of the Demand for Skewness
1 Introduction
2 Setup
  2.1 A Motivation: Local, Bulky Status Goods
  2.2 A Formal Representation of Aspirational Utility
  2.3 The Aspirational Agent's Choice Set: Binomial Martingales
  2.4 Utility Maximization
3 Single Jump Optimization
  3.1 The 'Four Seasons of Gambling'
  3.2 Comparative Statics
    3.2.1 Size of Jump
    3.2.2 Initial Endowment
    3.2.3 Risk Aversion
4 Double Jump Optimization
  4.1 Setup
  4.2 Optimal Choice Under Two Jumps
5 Departure from Fair Gambles: Sub-martingale and Super-martingale
  5.1 Setup
  5.2 > 0: 'Winter (keep C_0)' Is Replaced by Near-Arbitrage
  5.3 > 0: Lowers Demand for Positive Skewness
  5.4 < 0: Super-martingales
6 Volatility and Skewness: A Role Analysis
  6.1 The New Consumption Scheme: Trinomial Martingales
  6.2 Principle of Maximal Volatility
  6.3 A Rule of Thumb Under Limited Choice of Skewness
7 Conclusions
8 References
Appendices

Part III  The Quest for Status in Two Flavors
1 Introduction
2 Setup
  2.1 Preference Over Status
    2.1.1 The Local Nature of Status
    2.1.2 A Representation of the Local Preference for Status
    2.1.3 Mollification: An Alternative Representation
    2.1.4 Extending the Local Preference to Adjacent Brackets
  2.2 The Goods: Consumption and Status
  2.3 The (Single Period) Budget Constraint
  2.4 The (Single Period) Utility Function
  2.5 The Full, Multi-period Problem
  2.6 Remarks
3 The Reduced Problem
  3.1 Solution to the Reduced Problem Under Step Function Representation
    3.1.1 Characterization of C*(W) and S*(W)
    3.1.2 'Resetting' the Marginal Utility of Consumption
    3.1.3 Remarks
  3.2 Solution to the Reduced Problem Under Mollified Representation
    3.2.1 Assumption On the Strength of Mollification
    3.2.2 Characterization of C*(W) and P_S*(W)
    3.2.3 Characterization of U*(W)
  3.3 A Numerical Example
4 The Full Solution
  4.1 A Numerical Example
  4.2 Status-seeking Across the Wealth Cycle
5 References
Appendices
H Mollifier
I Reduced Problem
J Corollary 1
K Proofs

Part I
Disaster In My Heart: A Visceral Explanation for Asset Pricing Puzzles

Abstract

I introduce the notion of 'dis-utility shocks': rare but large negative idiosyncratic deviations from the consumption-implied utility level. Dis-utility shocks represent an unmistakable aspect of human life: that it can sometimes be unusually painful.
I embed dis-utility shocks in a rational, consumption-based asset pricing model and develop a method to compute their impact on asset prices numerically. Despite their idiosyncratic nature, calibration results show that they are priced. Moreover, in contrast to many other asset pricing models, I add dis-utility shocks in a parsimonious manner - with just three parameters - yet show that they help address many of the standard asset pricing puzzles.

1 Introduction

A key notion in asset pricing theory is that any security can be priced by breaking it down into its constituent states. Consequently, the price of a security is determined by the amount of money (dividends) the asset pays off in each state, appropriately weighted by the 'value of money' in each state (the state price). Since the payoff in each state is stipulated in the security itself, the main quest is to determine the value of money in each state, a rather vague idea. A well-established endeavor in this regard, consumption-based asset pricing, turns this vagueness into a model by making an unassailable observation: the purpose of money, ultimately, is to consume. Given this observation, the value of money in a state is equivalent to how dearly we value consumption in that state, or essentially, the marginal utility of consumption (MU_C).

Despite its logical appeal, however, it is well known that pure versions of these models have difficulty matching salient features of perhaps the most representative of all available assets: the aggregate stock (e.g., the S&P 500). The source of the problem is that the consumption stream, and therefore also MU_C, is too smooth. If the value of money is so smooth and yet is what drives asset prices, why should its risk be compensated by such a high premium (the equity premium puzzle)? Moreover, why should the premium be so volatile (the excess volatility puzzle)? And if the same forces drive asset prices, why is the behavior of bonds so different (the risk-free rate puzzle)? The list of such puzzles has grown, and so has the complexity of the models that attempt to address them, to the point that adding a dozen parameters now seems to be a conservative norm.

The smoothness of consumption - a leading source of all the aforementioned puzzles - is in stark contrast to the average human experience, which is typically exposed to several other sources of risk in life.[1] The consumption-based asset pricing literature has largely abstracted away from these risks, simply because they are deemed inconsequential to the value of money and hence should not affect asset prices. The goal of this paper is to challenge this conventional wisdom by distilling the non-consumption risks into the notion of 'dis-utility shocks', imposing some structure on dis-utility shocks based on data and micro-foundations from related literature, and then showing their impact on the behavior of the aggregate stock price. The consumption-based model I propose houses dis-utility shocks in a parsimonious manner (with essentially three new parameters), yet delivers predictions that explain many of the puzzles outlined above - including the high and volatile equity premium, the low and stable risk-free rate, and the pro-cyclical and autocorrelated price-dividend ratio - and also derives interesting implications for return predictability.

[1] This point is captured pithily in the writing of, for example, Tolstoy (in Anna Karenina): "Happy families are all alike; every unhappy family is unhappy in its own way."
Dis-utility shocks represent the unfortunate but unmistakable fact that human life can sometimes be unusually painful: critical health problems, divorce, loss of loved ones, crime victimization, and unexpected accidents, to name a few. In spirit, the rarity and severity of these human adversities are reminiscent of the 'rare disaster' literature, but the clear contrast is that dis-utility shocks are idiosyncratic and are not direct subtractions from consumption. Dis-utility shocks are therefore modeled as level drops in the CRRA consumption-based utility, rather than drops along its argument, consumption. This begs the question: why should they affect asset prices? I adopt two assumptions based on data and micro-foundations. First, I assume that the likelihood of dis-utility shocks, although low, increases somewhat during low consumption states. This assumption encapsulates the perception that "economic problems invite further problems"[2] and is corroborated by data on divorce rates, medical problems, crime rates, etc. Second, I assume that MU_C increases during dis-utility shocks. For example, a health shock commands consumption of medical services, leaving less to be spent on ordinary consumption goods, even when total consumption is unchanged. This increase of MU_C - as well as the more general notion of 'state-dependent MU_C' - finds ample support in the literature, e.g., in health economics. Essentially, dis-utility shocks induce composition shocks to the consumption bundle, leading to MU_C shocks. Alternatively, this assumption can also be motivated purely through psychological considerations. Together, these two assumptions give dis-utility shocks traction on asset prices.

Agents in this model are completely rational expected-utility maximizers, with a clear understanding that the two assumptions dictate the world they live in. They then optimize, with access to financial assets. The heft of dis-utility shocks looming over them incentivizes the agents to save more - given the high MU_C during dis-utility states - pulling down the risk-free rate. Moreover, the risk-free rate is reasonably stable across consumption states: because a good consumption state today does not necessarily imply a low dis-utility shock probability tomorrow, the incentive to save remains strong even when the economy is doing well.[3] In asset-pricing language, the precautionary savings motive dominates the inter-temporal substitution motive, and this dominance is essentially maintained across consumption states. Hence the risk-free rate stays low and relatively stable. Also, given that dis-utility shocks elevate marginal utility during low consumption-growth states, and that the dividend on equity is (by definition) low during these states, equity is now a more dangerous security to hold. The price of equity therefore drops relative to pure consumption (CCAPM) models, generating an equity premium. Together, these mechanisms resolve the equity premium puzzle and the risk-free rate puzzle. Also, the excess market return turns out to be quantitatively more volatile than the consumption process itself (the excess volatility puzzle).

[2] This perception is also found in English literature, for example, in Shakespeare's Hamlet: "when sorrows come, they come not in single spies, but in battalions."
[3] This is in contrast to many 'habit-based' models, where the risk-free rate is extremely volatile when parameters are configured to explain the equity premium; see, for example, Abel (1990).
We can obtain more results by adding a small but realistic structure to dis-utility shocks: they last for multiple periods, decaying over time. This decaying persistence leads to pro-cyclicality of the price-dividend ratio, as well as predictability of stock returns. The intuition runs as follows. Suppose the economy is in low consumption growth. Then, by the first assumption, some of the agents are likely to be laden with dis-utility shocks, and therefore MU_C today is likely to be high. Conditional on today's recession, tomorrow looks better, because not only will today's shock decay, but tomorrow may turn out to be a high consumption-growth state in which dis-utility shocks are less likely. The agents therefore expect dis-utility and MU_C to 'mean-revert' down to a lower level tomorrow, which means that the agent values today's consumption over tomorrow's. The discount rate applied to tomorrow's payoffs therefore increases, leading to a lower asset price and, consequently, a lower price-dividend ratio. Hence, given a decaying and therefore mean-reverting dis-utility process, the price-dividend ratio is pro-cyclical, and asset returns are predictable by price-dividend ratios.[4]

The rare, disastrous nature of dis-utility shocks is reminiscent of the influential rare disaster literature, initiated by Rietz (1988) and resurrected by Barro (2006). The rare disaster literature proposes that the low but positive probability of disastrous drops in the aggregate consumption process - such as global conflicts, epidemics, or large-scale financial crises - renders an explanation for asset pricing puzzles. While this notion is intuitively appealing, it is questionable how modern such disasters are. Figure 1 describes the frequency (and magnitudes) of rare disasters, and Table 1 contains equity premia for various time horizons. The frequency of rare disasters has decreased almost five-fold since 1945. This is much in line with our intuition: humanity has grown more skilled at controlling epidemics, economists seem better at handling crises (compare the Great Depression against the Great Recession), and the rational understanding that another full-scale global conflict means total annihilation seems to have contained conflicts at regional levels. If rare disasters were indeed the key driver of asset prices, it would be reasonable to expect a significant change in asset price movements over time - in particular, a decline in the equity premium - yet this does not seem to be the case at all (Table 1). The alternative possibility I propose here is that rare disasters can also be 'in our hearts'[5], with the clear difference that these are idiosyncratic and not direct subtractions from income or consumption per se. Dis-utility shocks culminate in preference shocks (induced by the composition shocks described above), which is distinct from a process shock such as the rare disaster. This conceptual difference leads to differences in predictions. Namely, unlike the rare disaster platform, my model can make realistic predictions on the dynamic behavior of quantities such as the price-dividend ratio and the volatility of excess returns.[6]

[4] The persistent, mean-reverting nature of the dis-utility shock is reminiscent of the reverse-engineered 'habit' introduced by Campbell and Cochrane (1999), possibly suggesting that the dis-utility process could in fact represent a real-world example of such a 'habit'.

[Figure 1: Disaster Frequencies (annual) and Magnitudes over Time (Source: Barro and Ursua 2008)]
Table 1: U.S. Returns, 1889-2000 (Source: Mehra 2003)

Period       Market Index (%)   Riskless Security (%)   Risk Premium (%)
1889-2000    7.0                1.0                     6.9
1926-2000    8.7                0.7                     8.0
1947-2000    8.4                0.6                     7.8

The dis-utility shocks I introduce are not fully insurable. This, together with the idiosyncratic nature of the shocks, means that the model needs heterogeneous agents who trade in incomplete markets. Such models have been studied in an asset-pricing context, but the results shown here are in sharp contrast to previous models in similar settings, most prominently perhaps Telmer (1993) and Lucas (1994). Their results show that introducing idiosyncratic labor income shocks fails to generate equity premia commensurate with those observed in data, even with extraordinarily high levels of risk aversion. These 'impossibility results' reflect the simple fact that even if the shocks themselves are uninsurable, agents can still buffer them away by borrowing against their own future, especially if the idiosyncratic shocks are transient, as in the way Telmer (1993) and Lucas (1994) model them. This lineage of skepticism - towards considering idiosyncratic shocks as a candidate driver of the equity premium - still seems common today[7], yet the mechanism I propose meets its core requirements by imposing some realistic structure on the nature of idiosyncratic shocks.

Constantinides and Duffie (1996) give an interesting response to these 'impossibility results', in which idiosyncratic income shocks are made permanent, as opposed to transient, while retaining heterogeneity of agents in incomplete markets. In a similar vein, I model dis-utility shocks as persistent (but decaying) shocks. Typically, accommodating persistent shocks within a heterogeneous-agents framework involves the challenging task of keeping track of the trades of every agent over long horizons, making the model intractable. Constantinides and Duffie circumvent this issue by positing a no-trade condition and reverse-engineering an equilibrium. Their model therefore boils down to an 'existence theorem', namely that there exists an equilibrium within their particular model with heterogeneous agents, incomplete markets, and so forth. My model, however, addresses this issue face to face, and given the notorious difficulty of solving these models, the numerical method I develop has some computational value in its own right. The direct approach I use makes the model computationally tractable and amenable to some comparative statics.

An interesting and related study by Schmidt (2016) merges the spirit of the rare disaster literature with idiosyncratic income shocks, and explores the dynamic consequences of idiosyncratic, disastrous labor income shocks. Similar to my model, his study also highlights the importance of (the dynamic structure of) tail risk in a heterogeneous-agent setting with non-insurable shocks, and essentially extends the Constantinides and Duffie (1996) setting to incorporate Epstein-Zin preferences and a richer interplay between state variables and idiosyncratic income shocks.

[5] Unlike the marked decline in the magnitude and frequency of 'rare disasters', sources of dis-utility shocks do not seem to exhibit the same declining trend over time. (See Appendix.)
[6] More recent developments in the rare disaster literature (e.g., Wachter 2013) have shown that dynamic predictions can be made by assuming a time-varying likelihood of rare disasters. How to calibrate this time variance remains an open question.
One consequence of using the Constantinides-Duffie (1996) setting is that it remains silent about the consequences of trading among agents - this class of models essentially relies on a no-trade condition.[8] An advantage of a more direct approach, as in my model, is that it enables us to explore the quantitative consequences of both inter-temporal and cross-sectional (inter-agent) trading. Exploring resource re-allocation separately leads to both qualitative (i.e., a different mechanism) and quantitative predictions; namely, we can draw numerical conclusions on how asset returns vary as we allow more and more heterogeneous agents to trade in the economy. Finally, whereas Schmidt's model is about the supply of resources (i.e., income) during idiosyncratic shocks, my model - to the best of my knowledge - is the first to focus on the demand for resources (i.e., MU_C) during idiosyncratic shocks in an asset pricing context. Given that most analyses in economics require simultaneous descriptions of both supply and demand, my model contributes to a more complete picture of the impact of idiosyncratic shocks on asset prices.

The rest of the paper proceeds as follows. Section 2 introduces a simplified model that encapsulates the assumptions and glimpses the main results. Section 3 develops the main model. Sections 4 and 5 describe asset prices in the dis-utility setting and briefly outline the numerical strategies used to obtain asset prices. Section 6 presents the key results and provides interpretations. Section 7 concludes.

[7] For example, Cochrane (2008): "In order to generate risk premia, then, we need the distribution of idiosyncratic risk to vary over time; it must widen when high-average-return securities (stocks vs. bonds, value stocks vs. growth stocks) decline. It needs to widen unexpectedly, to generate a covariance with returns, and so as not to generate a lot of variation in interest rates. And, if we are to avoid high risk aversion, it needs to widen a lot."
[8] One potentially problematic consequence of using such a setting is that idiosyncratic income shocks will automatically be passed through to individual consumption shocks. While this is natural - considering the fact that all actions must come through consumption in a consumption-based model - it may imply a counterfactually high idiosyncratic consumption volatility, especially if the income shocks are assumed to be 'disastrous' as in Schmidt (2016). A related work by Constantinides and Ghosh (2017) obviates this issue by using data on individual consumption. While this is promising, such data is currently much less refined than data on individual income.

2 A Simplified Model

To formally introduce the idea, I first provide some results in a simplified model that captures the key features of the full model which follows.

2.1 The Notion of Dis-utility Shocks

Suppose all agents in the economy are consuming identical consumption bundles, and that we are able to observe the utility levels of the agents. A strong belief in the consumption-based utility function U(C) would mean that the true level of utility ('true' U) looks like the first panel of Figure 2; namely, U(C) is a good proxy for the true level of utility enjoyed by the agents. However, some introspection may lead to the realization that the second panel better reflects 'true' U. Namely, there may be instances in our lives when we go through 'rough times' quite independently of the consumption bundle we consume. (Recall that all agents here are consuming identical bundles.) Some examples include critical illness, being the victim of a crime, divorce, persecution, permanent separation from loved ones, and untimely death in the family.
I model these as 'dis-utility shocks': significant downward drops in utility, over and above the utility level implied by consumption.

[Figure 2: Consumption Bundles as a Proxy for the True Utility Level]

2.1.1 Discussions

(1) The Focus on Negative Shocks

Dis-utility shocks are different from everyday 'mood swings'. Rather, they are rare events that we typically do not expect to strike us, but that are devastating if they do. This is why the focus is on downward shifts in utility, not upward shifts. Everyday mood swings can certainly be modeled as symmetric deviations away from U(C), much like the first panel of Figure 2. What I am claiming, meanwhile, is that rare but large deviations are typically on the downside. By analogy, when we talk about aggregate consumption we speak of a 'rare disaster' literature, but sadly there is no corresponding 'rare bonanza' literature. Perhaps this is because we live in a competitive environment: good shocks have to be earned, whereas bad shocks can come uninvited.[9]

(2) Impact on Individual Income and Consumption

In reality, the dis-utility shocks I model can reduce individual income and, consequently, individual consumption. For example, a medical event can disrupt the income stream of a laborer if it impairs labor capacity. I abstract away from this possibility because income fluctuations at the individual level have been explored extensively in the literature, perhaps most prominently by Telmer (1993) and Lucas (1994). These studies typically report that adding idiosyncratic income variation does not alter asset prices by meaningful quantities, because the ability to borrow and lend in the risk-less funds market alone is enough to smooth out idiosyncratic shocks. I emphasize that this abstraction is a conservative one, because it only weakens the effect of dis-utility shocks on asset prices. Therefore, our focus going forward is on direct effects on the level of utility itself, not on effects through its argument, consumption.

[9] This focus on downward shocks does not necessarily preclude upward shocks: large, pleasant surprises can occur. However, in terms of modeling, the only requirement is that shocks which satisfy Assumptions 1 and 2 (to be introduced in the next section) are more frequently on the downside. To negate the actions generated by dis-utility shocks with positive utility shocks, one would have to assume very specific conditions on the nature of the upward shock which, as we will see, are highly counter-intuitive and restrictive. Thus, I abstract away from positive utility shocks.

2.1.2 Utility Function With Dis-utility Shocks: A First Pass

A utility function that incorporates the dis-utility shocks discussed so far is

    U(C, D) = u(C) − 1_{D} · B,    (1)

given a dis-utility shock of size B. To match a number with the concept, I set

    B := u(C̄) − u(α·C̄),    α = 0.8,

which specifies a unit size of the shock. C̄ is the historical average consumption level, so B represents a utility drop commensurate to a sudden 20% drop in consumption, given α = 0.8.
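As a quick numerical illustration of this 'meter stick', the sketch below computes B = u(C̄) − u(0.8·C̄) for a CRRA u. The normalization C̄ = 1 and the risk-aversion values are illustrative choices, not the paper's calibration.

```python
import numpy as np

def crra_utility(c, gamma):
    """CRRA (power) utility; log utility in the gamma -> 1 limit."""
    if np.isclose(gamma, 1.0):
        return np.log(c)
    return c ** (1.0 - gamma) / (1.0 - gamma)

def unit_disutility_shock(c_bar, gamma, alpha=0.8):
    """B := u(c_bar) - u(alpha * c_bar): the utility drop equivalent to a
    sudden (1 - alpha) = 20% fall in consumption from its historical mean."""
    return crra_utility(c_bar, gamma) - crra_utility(alpha * c_bar, gamma)

if __name__ == "__main__":
    c_bar = 1.0                      # normalize average consumption (illustrative)
    for gamma in (1.0, 2.0, 3.0):    # illustrative risk-aversion values
        B = unit_disutility_shock(c_bar, gamma)
        print(f"gamma = {gamma}: B = {B:.4f}")
```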
This choice of α allows us to mentally gauge the size of dis-utility: it is comparable to what an average individual (representative agent) would have gone through during a 'rare disaster' event in the spirit of Rietz (1988), as estimated by Barro (2006)[10], while keeping in mind the crucial difference that the dis-utility shocks I model are not shocks to the consumption process itself. Although B is only a suggestive, nominal figure, its purpose is simply to set a 'meter stick' for the dis-utility shock. We will separately vary the size of B to investigate the impact of variations in this unit size; of course, B is held fixed while other parameters of interest are varied for comparative statics. D is the state variable that represents the event of a dis-utility shock, hence 1_{D} is an indicator function that is positive only when the agent is under the influence of a dis-utility shock. u(C) can be any standard utility function; here we specify it to be the CRRA (power utility) function. The timeline below illustrates this setup.

[10] Barro (2006) estimates historical 'rare disasters' to have a magnitude of around a 20% drop in aggregate consumption, with a distribution that ranges from 15% to 60%.

2.2 Timeline

The agent starts at t = 0 with endowment C_0. The consumption level is determined at t = 1/2 and will be one of two values, high or low: C ∈ {C_H, C_L}. The probability of C_H arising is p (i.e., Pr(C_H) = p, Pr(C_L) = 1 − p). As introduced in the description of the utility function, the agent also faces a small but non-zero chance of having a dis-utility shock subtracted from the consumption-based utility u(C) he receives at t = 1/2. We assume that if C_H is realized at t = 1/2, the dis-utility shock arises with probability q_1 at t = 1, and if C_L is realized at t = 1/2, the dis-utility shock arises with probability q_2 at t = 1. The tree below illustrates the timeline and the subsequent realization of utility under specification (1). The intermediate node (t = 1/2) is only for the sake of exposition; the realization of utility - which includes the utility from consumption and the dis-utility - ultimately occurs at t = 1.

    t = 0: u(C_0)
    ├── High Growth (p): C_H
    │     ├── Dis-utility (q_1):        u(C_H) − B
    │     └── No dis-utility (1 − q_1): u(C_H)
    └── Low Growth (1 − p): C_L
          ├── Dis-utility (q_2):        u(C_L) − B
          └── No dis-utility (1 − q_2): u(C_L)

2.3 Assumptions

Dis-utility shocks have properties that naturally translate into effects on asset prices. These properties are outlined in the two assumptions below, which find support in data and in micro-foundations from related fields.

Assumption 1: Low Consumption Implies More Frequent Dis-utility Shocks (q_2 > q_1)

Assumption 1 imposes a higher likelihood of a dis-utility shock when the consumption level is low (C_L) than when it is high (C_H). Assumption 1 is borne out in data on health, divorce, crime victimization, and crime perpetration rates, all of which constitute examples of dis-utility shocks. For example, Table 2 below compiles recent studies on the adverse effects of economic hardship on human health. Most notably, medicine consumption rose by roughly 35% during the Great Recession. In a similar vein, Engelberg and Parsons (2016) show a link between low stock returns and hospital admissions. Numerous studies (see Table 3 below) document the strains on marriage and relationships during economic difficulties. Similar patterns have been documented in crime rates and crime victimization rates. (See Appendix.)
Overall, it seems very plausible to assume q_2 > q_1. In terms of modeling, I set q_1 to a low number (1%), since dis-utility shocks are rare events. I then vary q_2 − q_1 > 0 and probe its effect on asset prices in the sections that follow.

Table 2: Economic Troubles and Health: Recent Evidence

Dependent Variable      Independent Variable   Size of Effect    Data
Blood Pressure          Great Recession        +12 mmHg          US (2000-2012) [a]
Glucose                 Great Recession        +11%p             ibid
Obesity                 Great Recession        +4.1%p            UK (2001-2013) [b]
Severe Obesity          Great Recession        +2.4%p            ibid
Diabetes                Great Recession        +1.5%p            ibid
Mental                  Great Recession        +4.2%p            ibid
Medicine Consumption    Great Recession        +36%              ibid
Mortality Rate          Wealth Drop            3.6% → 6.5%       US (1994-2014) [c]
Mortality Rate          Zero Net-Asset         3.6% → 7.3%       ibid

[a] Source: T. Seeman, D. Thomas, S. Merkin, K. Moore, K. Watson, A. Karlamangla (2018)
[b] Source: M. Jofre-Bonet, V. Serra-Sastre, S. Vandoros (2018)
[c] Source: L. Pool, S. Burgard, B. Needham, M. Elliot, K. Langa, C. Mendes de Leon (2018)

Table 3: Economic Troubles and Marriage

Dependent Variable        Independent Variable   Size of Effect                      Data
Dissolution of Marriage   Income                 +0.43%p (per %p decrease) [a]       US (1967-1987)
                                                 +3-4%p (per $1,250 decrease) [b]    NL (1989-2000)
                                                 +2%p (per $100 decrease) [c]        US (1985-1994)
                                                 +5-9%p (per $10,000 decrease) [d]   US (1968-1985)
Exercise of Violence      Unstable Employment    +33% [e]                            US (1988, 1994)
                          Financial Strain       +42%                                ibid

[a] Source: S. Hoffman and G. Duncan (1995)
[b] Source: M. Kalmijn, A. Loeve, D. Manting (2007)
[c] Source: A. Lewin (2005). This study focuses on welfare recipients.
[d] Source: H. Ono (1998)
[e] Source: M. Benson, G. Fox, A. DeMaris, J. Van Wyk (2003)

Assumption 2: 'Shield Value' of Consumption[11]

The 'shield value' of consumption means that agents prefer to be rich (C_H) rather than poor (C_L) when they are struck by dis-utility shocks. The main thrust of this assumption draws on notions similar to those in the health economics literature, while the same conclusion also follows from a psychological consideration; both are outlined below. Either way, I will show that the two avenues can be neatly packaged mathematically as (with the abuse of notation that D denotes 1_{D}):

    ∂MU_C / ∂D > 0.    (2)

Namely, 'shield value' either directly implies, or is equivalent to, a higher marginal utility of consumption conditional on a dis-utility shock.

[11] In essence, this assumption postulates a 'non-separability' requirement between D and C.

(a) Motivation 1: Reparation Activities

Once a dis-utility shock arises, it typically entails reparation activities: some measure to deal with the event. For example, a medical problem commands consumption of medical services, and divorce often requires legal services. Since these services must also be consumed, they detract from resources that would otherwise have been used for ordinary consumption (i.e., consumption that has nothing to do with reparation). To develop an intuition, we can decompose consumption into two parts:

    C = C_o + R,

where C_o is ordinary consumption and R is reparation consumption, whose demand is positive during dis-utility events, leaving less to spend on C_o and thereby raising the overall marginal utility of C. The health economics literature provides ample support for the notion that MU_C increases during health shocks.[12] For example, Lillard and Weiss (1997) find that MU_C increases when ill, using data on health shocks and consumption allocation for the elderly. Kools and Knoef (2018) report a similar finding in a European context.
De Nardi, French, and Jones (2010) find that the main driver of precautionary saving (especially among the wealthy elderly) is to guard against states with high medical costs and the consequently high MU_C in those states. Although health shocks are only a subset of the dis-utility shocks I model, the analogue of medical costs would most likely extend to R in the more general case of dis-utility shocks. In the Appendix, I provide a model that micro-founds Assumption 2, using an assumption that is even weaker than what is typically used in the health economics literature. (See Appendix A, Lemma 2.)

[12] A related line of inquiry in the health economics literature explores the change in the marginal utility of ordinary consumption (C_o) during health events. These studies report much more varied results, ranging from no effect to marked decreases or increases. See, for example, Finkelstein et al. (2013). In the current paper, however, the focus is on the effect on the marginal utility of C, not of its component C_o.

(b) Motivation 2: Double-Jeopardy Aversion

Consider the following thought experiment, illustrated in Figure 3. Figure 3 depicts a hypothetical consumption stream over time. Suppose people were asked at what point they would choose to be hit by a given dis-utility shock, if they had a choice. It is unlikely that agents would choose "B" over "A", perhaps because they are afraid of multiple sources of stress striking them simultaneously (double-jeopardy aversion[13]). The following lemma (in particular, (i) ⟺ (ii)) shows that this preference is indeed equivalent to (2). Lemma 1-(iii) is, in spirit, an 'equivalent variation' result which facilitates a 'sanity check' for Assumption 2 and the size of dis-utility shocks in the model (see Appendix).

Lemma 1. The following are equivalent.
(i) "A" ≻ "B"
(ii) MU_C |_{1_{D}=1} > MU_C |_{1_{D}=0} (or '∂MU/∂D > 0')
(iii) There exists λ > 1 such that (C_H, B) ∼ (C_L, (1 − 1/λ)·B).

Proof. See Appendix.

[Figure 3: A Thought Experiment]

Motivations (a) and (b) both allude to a shield value of high consumption states. In channel (a), the shield value is real: it helps agents deal with dis-utility shocks. In channel (b), the shield value is psychological: it is better to smooth out sources of stress than to have them pile up at the same time. Whichever is the case, in terms of incorporating it into the model, both lead to ∂MU/∂D > 0.

[13] This would be especially true if the agent is concerned about 'surviving', or dealing with, the dis-utility shock; he would prefer to have the shock occur when there is some psychological buffer, or more colloquially, "one less issue to worry about." A financial analogy is an asset manager who, if concerned about survival in the market, prefers a crash to arrive when she has a buffer rather than when she does not.

2.4 The Utility Function Under Assumption 2

The utility function (1) is now modified to incorporate Assumption 2:

    U(C_i, D) = u(C_i) − 1_{D} · B · (C_L / C_i)^{1/θ},    i ∈ {L, H},    (3)

where (say) B := u(C̄) − u(C_GD), with C̄ the historical average consumption level and C_GD a historically 'low' level of consumption as before, D is the event of a dis-utility shock, and u(C) is the CRRA (power utility) function. The parameter θ controls the strength of the shield value; higher values of θ imply weaker shield values. For example, if θ ≈ 0.1, then C_H suffers about 50% of B while C_L suffers 100% of B; and if θ ≈ 0.2, then C_H suffers about 75% of B while C_L suffers 100% of B. Given this setup, it is easy to verify that '∂MU/∂D > 0' indeed holds (a special case of Lemma 1).
2.5 Results

As this is a simplified model intended to introduce and fix ideas, I assume a representative agent[14] and follow the Mehra-Prescott calibration of the moments of the consumption process (μ = 0.02, σ = 0.036, β = 0.96, P(C_H) = 0.43). The consequence of relaxing the representative-agent restriction is addressed in the following sections. I also use θ ≈ 0.15, which means the 'shield value' of high consumption is about 33%. Using the setting depicted in the timeline diagram and the utility function (1), I use the Euler equation and standard methods to calculate the risk-free rate and the equity premium (Table 4).

[14] Clearly, this assumption goes against the very nature of the dis-utility shock I am introducing, which is fundamentally idiosyncratic. The simplified model serves only as an overview; an appropriate treatment of dis-utility shocks ultimately requires the full model.

Table 4: Simplified Model Results

                        Risk-free Rate (%)   Return on Equity (%)   Equity Premium (%)
[q_1 = 1%, q_2 = 5%]
  γ = 1                 5.4                  6.0                    0.6
  γ = 3                 3.1                  5.5                    2.4
  γ = 4                 -12.5                -7.6                   4.9
[q_1 = 1%, q_2 = 3%]
  γ = 1                 5.5                  6.1                    0.6
  γ = 3                 5.0                  7.2                    2.2
  γ = 4                 -5.0                 -1.0                   4.0

We can already see three results in this simplified model, which preview the full model. (1) First, the equity premium is increasing in q_2 − q_1. In other words, if lower consumption states are more likely to bring on dis-utility shocks, the equity premium rises. This aligns with our intuition. By Assumption 2, marginal utility is high in dis-utility states. On the other hand, a higher q_2 − q_1 generates a higher correlation between low consumption states and dis-utility states. Since dis-utility shocks increase marginal utility, this correlation effectively increases the likelihood of high-marginal-utility states when consumption is low. Therefore, a higher q_2 − q_1 tightens the joint likelihood of low consumption states and high-marginal-utility states, over and above that induced by the consumption process alone. Since this implies a higher covariance between marginal utility and the return process, the equity premium is increasing in q_2 − q_1. (2) Risk aversion (γ) increases the equity premium, which is a common feature of any standard asset pricing model. (3) Once risk aversion exceeds a certain threshold (in the results tabulated here, somewhere between γ = 3 and γ = 4), the risk-free rate turns negative, yet the equity premium is still far below the approximate historical average of 6%.
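As a rough illustration of the mechanics behind Table 4, the sketch below prices a one-period bond and the claim on t = 1 consumption in the two-period tree of Section 2.2 using the Euler equation. The functional form of the shield term, the mapping from the calibration to consumption levels, and the parameter values are illustrative reconstructions rather than the paper's exact procedure, so its output is not expected to match Table 4.

```python
import numpy as np

# Two-period pricing in the tree of Section 2.2 (illustrative reconstruction).
beta, gamma, theta = 0.96, 3.0, 0.15
mu, sigma = 0.02, 0.036
p = 0.43                       # probability of the high-growth state
q1, q2 = 0.01, 0.05            # dis-utility probabilities after high / low growth

C0 = 1.0
C_H, C_L = C0 * (1 + mu + sigma), C0 * (1 + mu - sigma)

u = lambda c: c ** (1 - gamma) / (1 - gamma)
B = u(C0) - u(0.8 * C0)        # unit dis-utility shock: 20% consumption-drop equivalent

def marg_u(c, d):
    """Marginal utility under specification (3) as reconstructed here: u'(c) plus the
    derivative of the shield term -1_{D} * B * (C_L / c)**(1/theta) with respect to c
    (the extra term is positive, so MU rises with a dis-utility shock)."""
    return c ** (-gamma) + d * B * (1 / theta) * C_L ** (1 / theta) * c ** (-1 / theta - 1)

# t = 1 states: (consumption, probability, dis-utility indicator)
states = [(C_H, p * (1 - q1), 0), (C_H, p * q1, 1),
          (C_L, (1 - p) * (1 - q2), 0), (C_L, (1 - p) * q2, 1)]

sdf = lambda c, d: beta * marg_u(c, d) / marg_u(C0, 0)
P_bond = sum(pr * sdf(c, d) for c, pr, d in states)          # pays 1 in every state
P_equity = sum(pr * sdf(c, d) * c for c, pr, d in states)    # claim on t = 1 consumption
rf = 1 / P_bond - 1
re = sum(pr * c for c, pr, _ in states) / P_equity - 1
print(f"risk-free {rf:.2%}  equity {re:.2%}  premium {re - rf:.2%}")
```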
3 Full Model: the Setup

While the simplified model is illustrative, it raises many questions. If dis-utility shocks are idiosyncratic in nature, how meaningful is the single representative-agent setting of the simplified model? Moreover, if agents were able to trade with each other, would that allow them to share the shocks? If so, what is the mechanism that translates this into asset prices? Also, how restrictive is the single-period assumption? Does it matter whether shocks are transient or persistent? How would agents react to persistence in shocks? Answering these questions requires a departure from the simplified model. I now describe the setup of the full model, which enables us to fully probe the consequences of the questions raised above, among others.

3.1 Multiple Agents, Longer Horizon

The full model imports the same utility function and assumptions from the simplified model, while expanding on two main fronts:

(1) Infinite time horizon (T = ∞). Agents are able to trade dynamically, a feature not fully exploited in the two-period simplified model. Moreover, dis-utility shocks typically last for multiple periods rather than being one-time transient shocks. It is therefore essential to incorporate persistent elements in the dis-utility process, calling for an extension to a longer time horizon.

(2) Multiple agents (N_A > 1). It is unrealistic to assume, as in the simplified model, that there is a representative agent subject to a representative dis-utility shock. Moreover, agents must be able to borrow and lend amongst each other, and one may argue that such lending and borrowing activity could smooth away the impact of dis-utility shocks. Multiple agents are required to explore the consequences of trading activities.

3.2 Consumption Process

The natural analogue of the simplified model's consumption setup in the dynamic setting is an aggregate consumption process. Its growth is assumed to satisfy

    C_t / C_{t−1} ∈ {H, L},    H := 1 + μ + σ,  L := 1 + μ − σ,

where μ is the mean of consumption growth and σ encapsulates its volatility. I use the notations C_H, C_L to generically denote the respective consumption levels (C_t). To generate results that are directly comparable to the Mehra-Prescott (1985) results, I follow their calibration of the consumption process: μ = 0.02, σ = 0.036, π_{H,H} = Pr(H | H) = 0.43, with the Markov transition matrix given as

    P_C = [ π_{H,H}  π_{H,L} ]  =  [ π_{H,H}      1 − π_{H,H} ]
          [ π_{L,H}  π_{L,L} ]     [ 1 − π_{H,H}  π_{H,H}     ]    (4)

This consumption process is specified in the aggregate. I do not assume any structure on the individual endowment process (e.g., labor income). As can be shown in my setting, the equilibrium is determined irrespective of the individual endowment process[15], so it suffices to specify the aggregate consumption process alone. In the interest of parsimony and tractability, I also abstract away from a separate dividend process on stocks.

[15] This is due to perfect risk-sharing. See Lemma 3 and Lemma 4, Appendix.

3.3 Dis-utility Process

I import the same structure as in the simplified model. Agents can be hit by dis-utility shocks in any given period. Again, these shocks are rare but sharp negative level shocks to the agent's utility, arriving with probability q_i ∈ {q_1, q_2}, with q_2 > q_1 as before. The probability is assumed conditionally i.i.d. and identical across agents. D_t denotes the event of a dis-utility shock at time t, hence B·1_{D_t} denotes the 'meter stick' of utility that gets drawn away from the agent, contingent on a dis-utility shock. As before, B is held fixed while we carry out comparative statics, and will later be varied to explore its effects. Unlike in the simplified model, the dis-utility shock here has multi-period effects, so it is effectively modeled as a process B_t. Given a time period t and its realizations of the consumption process C_t and of D_t, I specify the dis-utility process as

    B_{t+1} = φ·B_t + B·1_{D_{t+1}},    (5)

where 0 ≤ φ ≤ 1. This specification makes the dis-utility shock persistent. The idea is that a large dis-utility shock typically haunts agents for multiple years: the scars of divorce or of a major health problem take years to fade away. If φ = 1, a dis-utility shock is never forgotten until death. If φ = 0, a dis-utility shock is completely forgotten the next period. The illustration below depicts sample B_t paths for different values of φ, conditional on a unit shock at t = 0. As a starting point, a reasonable value of φ is 0.7, which implies a 'half-life' of 2-3 years. Later, results will be shown for different values of φ.

[Figure 4: B_t Decays Over Time Given a Unit Shock, for Different Values of φ]
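Below is a minimal simulation sketch of the joint consumption and dis-utility dynamics in (4)-(5). The symbol choices (φ for persistence, q_1 and q_2 for the shock probabilities) follow this reconstruction, the unit shock is normalized to B = 1, and the parameter values are illustrative; the sketch is meant only to show how a decaying, occasionally re-ignited B_t path behaves, not to reproduce the paper's calibration.

```python
import numpy as np

rng = np.random.default_rng(0)

pi_HH = 0.43                 # Pr(H | H); by the symmetry in (4), Pr(L | L) = pi_HH
q1, q2 = 0.01, 0.05          # dis-utility probability after a high / low growth state
phi, B = 0.7, 1.0            # persistence and unit size of the dis-utility shock
T = 100

state = 'H'                  # current consumption-growth state
b = 0.0                      # current level of the dis-utility process B_t
path = []
for t in range(T):
    # consumption-growth transition from the two-state Markov chain (4)
    stay = rng.random() < pi_HH
    state = state if stay else ('L' if state == 'H' else 'H')
    # Assumption 1: the shock is more likely after a low-growth state (q2 > q1)
    q = q1 if state == 'H' else q2
    hit = rng.random() < q
    # dis-utility process (5): geometric decay plus a fresh unit shock if hit
    b = phi * b + B * hit
    path.append((state, round(b, 3)))

print(path[:10])
```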
3.4 Utility Function

The utility function in (1) is modified to accommodate the current dynamic setting:

    U(C_t, B_t) = u(C_t) − B_t · [ 1 − (1/η) · (C_t − C_L) / (C_H − C_L) ]    (6)

There are two modifications from (1). First, B is replaced by the process B_t to account for persistence in the dynamic setting. Second, the functional form of 'Assumption 2' as depicted in (1) is simplified to be linear. The generality of (1) is redundant here because we restrict the consumption states to two: {H, L}. The analogue of θ is η, which determines how strong the shield value is: e.g., if η = 4, then C_H suffers 75% of B_t while C_L suffers 100% of B_t; and if η = 6, then C_H suffers about 84% of B_t while C_L suffers 100% of B_t. Note that differentiating with respect to C_t yields

    ∂U/∂C_t = ∂u/∂C_t + B_t / k,    (7)

where k := η·(C_H − C_L) is essentially the (scaled[16]) shield-value parameter. The first term in (7) is the usual marginal utility of consumption. The second term is the additional marginal utility of consumption induced by Assumption 2.

[16] This scaling makes MU_C 'unit-invariant', i.e., invariant to the nominal magnitude of C.

3.5 The Nature of Market Incompleteness and the State Space

Market incompleteness is innately a vague concept because its definition is deconstructive: it is defined by what it is not. The use of an incomplete-market setting therefore deserves a precise description. In short, I assume that dis-utility shocks themselves are not insurable, but that the (consumption) goods market is complete.

3.5.1 Non-insurability of Dis-utility Shocks

In general, dis-utility shocks are likely to be non-insurable. To fully insure against a dis-utility shock, it must at the very least be observable, contractible, and verifiable. Moreover, if insuring against a dis-utility shock involves moral hazard or adverse selection problems, the market for insurance could unravel even when the shock itself satisfies the above requirements. Hence the personal nature of dis-utility shocks will most likely obstruct insurability. Realistically, some shocks are more insurable than others: medical insurance may defray at least part of the medical costs incurred, and there may be some social help available for victims of crime. Even in these cases, however, there is no insurance against the pain involved in the treatment process. In other words, it is not possible to find sharp insurance against these dis-utility shocks, even when the shocks are relatively easy to 'quantify'. The vast majority of cases are likely to be far less insurable: for example, incarceration (presumably due to the severe moral hazard involved) or the loss of a child (simply because such a loss cannot be compensated by any conceivable pecuniary sum); insurance against divorce or bankruptcy is not commonly observed either. To make the analysis transparent, we abstract away from partial insurability and assume dis-utility shocks themselves are completely non-insurable.[17]

3.5.2 The Consumption Goods Market

The non-insurability assumption begs the question: if dis-utility shocks themselves are not insurable, why would they affect asset prices? The link is in Assumption 2, encapsulated in (7). Assumption 2 suggests at least two channels through which dis-utility shocks raise the marginal utility of consumption. Although dis-utility shocks themselves are not insurable, agents can still trade on the marginal utility induced by the dis-utility process B_t.
In other words, the goods market is complete up to the marginal utility induced by B_t. This requirement may seem imposing, in the sense that it asks agents to be able to contract on all future marginal-utility states of each and every agent in the economy, which may not even be quantifiable, much less verifiable. However, it can be shown that in the current setup the same allocation (as in the complete goods-market allocation) can be attained simply by allowing trade in one-period bonds alone. (See Lemma 3 in the Appendix.) Thus, the requirement of a complete consumption goods market amounts simply to allowing a frictionless saving and lending market.

[17] Such abstraction is common in the literature; see, for example, Mankiw (1986), Lucas (1994), and Constantinides and Duffie (1996), among many others.

3.5.3 Relevant State Variables and Their Transition Probabilities

The 'state space' may at first sight seem insurmountably big, given the variety of dis-utility shocks it must accommodate. There are a few conceptual steps we can take to narrow it down. First, as this model is consumption-based, the aggregate consumption state (C_t) must be relevant. Second, given C_t, asset prices must depend heavily on how this aggregate is allocated across the agents. The structure we have imposed thus far already allows us to surmise that dis-utility shocks will affect this allocation, since we allow agents to trade resources in the consumption goods market, assumed to be complete. In particular, the exact nature or cause of the dis-utility shock is irrelevant, but its magnitude (B_t) is not. Given the heterogeneity of B_t across agents, we can further infer that the cross-sectional distribution of B_t (denoted {B^i_t}_{i=1}^{N_A}) is also crucial in the determination of asset prices, because it determines how the aggregate C_t will be traded[18] amongst the agents. Lastly, it is intuitive that the state variables are Markov, since C_t is assumed to be Markov and (5) indicates that B_t is too. More formally, let F be the filtration generated by the processes {(C_t, {B^i_t}_{i=1}^{N_A})}_{t=0}^{∞} and let F_t denote the corresponding information at time t. Our setting, given by (4)-(5), ensures that F_t reduces to (C_t, {B^i_t}_{i=1}^{N_A}); namely, (C_t, {B^i_t}_{i=1}^{N_A}) is jointly Markov. Given this discussion, we can intuit that the relevant state variables boil down to

    S_t(ω) = {C_t, B^1_t, B^2_t, …, B^{N_A}_t} ∈ ℝ_+^{N_A + 1},    (8)

where S_t(ω) denotes the state variables at time t given the sample path ω. That the current state S_t(ω) depends only on the contemporaneous variables C_t and {B^i_t}_{i=1}^{N_A} is precisely the Markovian property.

[18] This trading rule is not linear, unlike in the settings of, for instance, Krusell and Smith (1998). Therefore, the typical computational methods that deal with equilibria with heterogeneous agents cannot readily be applied here, calling for the development of a separate numerical method to compute the equilibrium, which is the focus of Section 5.

Once the relevant state variables are defined, the next task is to specify their dynamics. Given the Markovian nature of the system, we can encode the joint dynamics of the distribution of the dis-utility process and the aggregate consumption process in a single transition probability matrix. This is not a completely trivial task, as we now need to keep track of the transition of the entire distribution {B^i_t}_{i=1}^{N_A}, intertwined with the aggregate consumption states, which affect B_t differentially under Assumption 1. Loosely speaking, this can be done by first fixing the consumption state (C_t) and an agent (i) to obtain a conditional transition probability matrix (P_{C_t, i}, whose rough shape is shown in Figure 5 below), and then combining these 'building blocks' suitably to arrive at the full transition probability matrix P, which completely characterizes the dynamics of (8). A sketch of this construction follows; see the Appendix for a more detailed description of the procedure used in the numerical application.

[Figure 5: Conditional Transition Probability Matrix]
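The following sketch illustrates one way such a matrix can be assembled, under simplifying assumptions that are mine rather than the paper's: each agent's B_t is discretized onto a grid of N_B bins, the decay-plus-possible-new-shock transition in (5) is approximated bin to bin, and agents' shocks are conditionally independent given next period's consumption state, so each per-consumption-state block is a Kronecker product of identical per-agent blocks. The paper's own construction (Appendix F) may differ in detail.

```python
import numpy as np

def agent_block(grid, phi, B, q):
    """Per-agent transition over a discretized dis-utility grid, given that next
    period's shock probability is q: B' = phi*B (no shock) or phi*B + B (shock),
    each mapped to the nearest grid point."""
    n = len(grid)
    T = np.zeros((n, n))
    for i, b in enumerate(grid):
        for prob, b_next in ((1 - q, phi * b), (q, phi * b + B)):
            j = int(np.argmin(np.abs(grid - b_next)))   # nearest-bin approximation
            T[i, j] += prob
    return T

# Illustrative parameters (not the paper's calibration).
phi, B_unit = 0.7, 1.0
q1, q2 = 0.01, 0.05
pi_HH = 0.43
P_C = np.array([[pi_HH, 1 - pi_HH],
                [1 - pi_HH, pi_HH]])            # consumption-state transitions, cf. (4)
grid = np.linspace(0.0, 3.0, 7)                 # N_B = 7 bins for each agent's B_t
N_A = 2                                         # two agents, for illustration

# Conditional on next period's consumption state, agents draw shocks independently
# with the corresponding q (Assumption 1), so each joint per-state block is a
# Kronecker product of identical per-agent blocks.
joint = {}
for c_next, q in (('H', q1), ('L', q2)):
    block = agent_block(grid, phi, B_unit, q)
    M = block
    for _ in range(N_A - 1):
        M = np.kron(M, block)
    joint[c_next] = M

# Full matrix P over states (C, B^1, ..., B^{N_A}), rows/cols ordered as
# (C = H, then C = L) x (joint dis-utility bins).
nB = len(grid) ** N_A
P = np.zeros((2 * nB, 2 * nB))
for i, c_now in enumerate(('H', 'L')):
    for j, c_next in enumerate(('H', 'L')):
        P[i * nB:(i + 1) * nB, j * nB:(j + 1) * nB] = P_C[i, j] * joint[c_next]

assert np.allclose(P.sum(axis=1), 1.0)          # each row is a probability distribution
print(P.shape)                                  # (|C| * N_B**N_A) x (|C| * N_B**N_A)
```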
4 Asset Pricing with Dis-utility Shocks

4.1 Model Intuition

The model aims to describe the following situation. Agents live long, and they live each day consuming goods. But, as in real life, they also worry about dis-utility shocks - in particular, the very low but non-zero chance of B_t spiking up tomorrow and leaving a rather persistent scar that decays at a geometric rate. Agents have no sharp insurance against the dis-utility shock itself, but they know it is better to be in a high-consumption state than in a low-consumption state when the shock hits. They also know that consumption itself is contractible, and they have two main means of shifting resources (consumption) across states and time. The first is to buy a share of the aggregate economy, namely 'equity'; the second is simply to save or lend through risk-less 'bonds'. Each day, using these securities, agents sequentially plan for the next day, using their best forecasts about tomorrow's states and the likelihood of dis-utility shocks. Thus tomorrow's consumption states - conditional on their current B_t and the aggregate consumption state C_t - are priced, and in the process agents optimize consumption across all states and time.

The agents in this model have two ways of smoothing out risk. First, they can share risk among themselves, much like the perfect risk-sharing of standard models. Second, they can borrow and save inter-temporally, as long as the inter-temporal price (the risk-free rate) is such that aggregate supply matches aggregate demand. The model, however, is not a neoclassical growth model, and hence does not feature a way of smoothing out aggregate shocks through capital accumulation; it still resides in the Lucas (1978) endowment (tree) economy.

4.2 Sequential Markets Equilibrium (SME) and Asset Prices

Let N_A denote the number of agents, whom I assume to be price takers in order to obviate any strategic activity. (Alternatively, we can let N_A be the number of 'types' of agents, with infinitely many atomistic agents of each type.) For technical comfort, I assume that the agents live infinitely long, have equal initial wealth, and that B^i_0 = 0 for all i. As per our discussion of the state space, agents are distinguished by their induced marginal utilities of consumption. The setup is in sequential markets, and the equilibrium (SME) is defined as consumption streams {c^i(S_t)}_{t=0}^{∞} (for all i) and a set of (sequential) state prices that (1) satisfy the agents' optimization problems for all types, while (2) clearing the market. Given the state S_t as in (8), 'equity' refers to a claim to the dividend stream {D_e(S_t)}_{t=0}^{∞} across all states and time, which in turn must equal the aggregate endowment (consumption) process {C(S_t)}_{t=0}^{∞}. I denote by P_e(S_t) the price of 'equity' at S_t. I also introduce a risk-less, single-period 'bond' at S_t, a security that pays off one unit of consumption in all (t+1) states subsequent to S_t, whose price is similarly denoted P_b(S_t).
Let \theta^i_e(S_t) be the share of the `equity' held by the i-th agent, and let \theta^i_b(S_t) be the share of the `bond' held by the i-th agent.

4.2.1 The Optimization Problem

The i-th agent chooses her consumption stream {c^i_t(S_t)}_{t=0}^{\infty} (over all S_t) to maximize her lifetime expected utility:

U^i := E_0\left[ \sum_{t=0}^{\infty} \beta^t\, U\big(c^i_t(S_t), B^i_t(S_t)\big) \right] = \sum_{t=0}^{\infty} \sum_{S_t} \beta^t\, U\big(c^i_t(S_t), B^i_t(S_t)\big)\, \pi(S_t).

Given the current setup, every agent faces sequential budget constraints:

c^i(S_t) + \sum_{S_{t+1} \in \mathcal{S}_{t+1}(S_t)} P_a(S_{t+1} \mid S_t)\, a^i(S_{t+1} \mid S_t) = a^i(S_t \mid S_{t-1}), \quad \forall (t, S_t),   (9)

where \mathcal{S}_{t+1}(S_t) denotes the set of all (t+1) states subsequent to S_t, a^i(S_{t+1} \mid S_t) denotes the number of `sequential Arrow-Debreu contracts' (contracts signed at S_t that pay one unit of consumption at S_{t+1}) purchased (a^i(S_{t+1} \mid S_t) > 0) or sold (a^i(S_{t+1} \mid S_t) < 0), and P_a(S_{t+1} \mid S_t) denotes the S_t-price of that particular sequential Arrow-Debreu contract. The opportunity to sign these contracts represents the assumption that the goods market is sequentially complete. Note that (9) must hold point-wise, i.e., for every (t, S_t).

4.2.2 Equity and Bond Prices in Equilibrium

Because our primary purpose is to price the risk-free bond and equity, we can specialize (9) to the following sequence of constraints:^19

c^i_t(S_t) + P_e(S_t)\,\theta^i_e(S_t) + P_b(S_t)\,\theta^i_b(S_t) \leq \big[P_e(S_t) + D_e(S_t)\big]\,\theta^i_e(S_{t-1}) + \theta^i_b(S_{t-1}),   (10)

where, again, the budget constraints must hold point-wise, i.e., for every (t, S_t). For the purpose of pricing assets, (9) and (10) are essentially the same (see Lemma 4, Appendix), but specializing (9) to (10) has the advantage of leading explicitly to Euler equations via the first-order conditions of the i-th agent:

P_e(S_t) = E_t\left[\, \beta\, \frac{\tilde{c}^i(S_{t+1})^{-\gamma} + \tilde{B}^i(S_{t+1})/k}{\tilde{c}^i(S_t)^{-\gamma} + \tilde{B}^i(S_t)/k}\, \big(\tilde{P}_e(S_{t+1}) + \tilde{D}_e(S_{t+1})\big) \right],   (11)

and similarly for bonds

P_b(S_t) = E_t\left[\, \beta\, \frac{\tilde{c}^i(S_{t+1})^{-\gamma} + \tilde{B}^i(S_{t+1})/k}{\tilde{c}^i(S_t)^{-\gamma} + \tilde{B}^i(S_t)/k} \right],   (12)

which again hold point-wise. The expectations are taken conditional on S_t over the entire set \mathcal{S}_{t+1}(S_t), whose transition probabilities are given explicitly by P, the transition probability matrix. In order to pin down P_e(S_t) and P_b(S_t), we still need to specify the optimal consumption stream {\tilde{c}^i(S_t)}_{t=0}^{\infty} that goes inside the conditional expectation. This can be done by imposing the market clearing conditions^20 (13)-(14):

\sum_{i=1}^{N_A} \theta^i_e(S_t) = 1,   (13)

\sum_{i=1}^{N_A} \theta^i_b(S_t) = 0,   (14)

along with the fact that:

\frac{\tilde{c}^i(S_{t+1})^{-\gamma} + \tilde{B}^i(S_{t+1})/k}{\tilde{c}^i(S_t)^{-\gamma} + \tilde{B}^i(S_t)/k} = \frac{\tilde{c}^j(S_{t+1})^{-\gamma} + \tilde{B}^j(S_{t+1})/k}{\tilde{c}^j(S_t)^{-\gamma} + \tilde{B}^j(S_t)/k}, \quad \forall (i, j),   (15)

namely, that the stochastic discount factors (SDFs) must be equalized across all agents. (See Lemma 4, Appendix.) Once conditional prices are found, the unconditional price is simply their inner product with the stationary distribution of the Markov chain, so the final task is to find the stationary distribution, π.

^19 This is because P_e(S_t) = \sum_{S_{t+1} \in \mathcal{S}_{t+1}(S_t)} [P_e(S_{t+1}) + D_e(S_{t+1})]\, P_a(S_{t+1} \mid S_t) for equity and, similarly, P_b(S_t) = \sum_{S_{t+1} \in \mathcal{S}_{t+1}(S_t)} 1 \cdot P_a(S_{t+1} \mid S_t) for 1-period bonds, as per their definitions. These assets are weighed by \theta^i_e(S_t) and \theta^i_b(S_t) respectively, whose lagged payoffs appear on the right-hand side.

^20 (13) reflects the defining property of `equity': \sum_{i=1}^{N_A} c^i_t(S_t) = D_e(S_t) = C(S_t), and (14) follows from the defining property (zero net supply) of `bonds'.
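Before turning to the stationary distribution, it may help to see how (11)-(12) are applied once the state space is discretized. The following minimal sketch prices the bond and the equity claim state by state, taking as given a transition matrix P and an SDF matrix M(S_t, S_{t+1}) (with β already folded in); the function name and the 2-state numbers are hypothetical placeholders of mine, not output of the calibrated model.

```python
import numpy as np

def price_assets(P, M, D):
    """Price the one-period bond and 'equity' from the Euler equations
    (11)-(12), state by state, on a discretized state space.

    P : (n, n) transition probabilities over states S_t -> S_{t+1}
    M : (n, n) stochastic discount factor M(S_t, S_{t+1}), including beta
    D : (n,)   dividend (aggregate consumption) paid in each state
    """
    A = P * M                                   # A[s, s'] = Pr(s'|s) * M(s, s')
    P_b = A.sum(axis=1)                         # P_b(s) = E_t[M_{t+1}]
    # Equity solves the fixed point P_e = A (P_e + D)  =>  (I - A) P_e = A D
    P_e = np.linalg.solve(np.eye(len(D)) - A, A @ D)
    return P_e, P_b

# Hypothetical 2-state illustration (placeholder numbers only).
P = np.array([[0.8, 0.2],
              [0.3, 0.7]])
M = np.array([[0.97, 1.05],
              [0.90, 0.99]])
D = np.array([1.02, 0.98])
P_e, P_b = price_assets(P, M, D)
print(P_e, P_b)
```

The equity price here solves the linear fixed point P_e = A(P_e + D) with A = P ∘ M taken element-wise, which is simply the conditional expectation in (11) written in matrix form.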
The stationary distribution can be found either by (Monte Carlo) simulation or, more directly, by finding the eigenspace of P^T (where P is the transition probability matrix), i.e., by normalizing the solution to:

\pi = \pi P.   (16)

Key variables, such as stock returns and the risk-free rate, can be computed from asset prices once the stationary distribution is known.

4.2.3 The Stochastic Discount Factor

Let

M_{t+1} := \beta\, \frac{c_{t+1}^{-\gamma} + \frac{1}{k} B_{t+1}}{c_t^{-\gamma} + \frac{1}{k} B_t}

denote the stochastic discount factor. This means that (given c_t > 1, as is typical in the definition of the CRRA utility function)

M_{t+1} \to \beta\, \frac{B_{t+1}}{B_t} \quad \text{as } \gamma \to \infty.   (17)

This expression deserves two remarks. First, (B_{t+1}/B_t) is the dis-utility growth that plays a crucial role in this model, over and above the traditional CCAPM's consumption growth (C_{t+1}/C_t), which also remains alive inside M_{t+1}. Second, γ serves two purposes in the dis-utility model: it represents the traditional risk aversion of the CRRA utility function, as well as the relative weight on consumption growth versus dis-utility growth. As γ increases, not only do agents grow more risk averse, but relatively more weight is also given to (B_{t+1}/B_t), the dis-utility growth. In the limit (γ → ∞), only dis-utility growth matters, as shown in (17).

5 Numerical Method

Like many models in similar settings, the main numerical challenge is the large state space. Recall that the relevant state variables are S_t = \{C_t, B^1_t, B^2_t, \ldots, B^{N_A}_t\}. With an appropriate discretization, N_B denoting the number of bins for B^i_t, and |C| denoting the number of consumption states, the number of states is |C| \cdot N_B^{N_A}, which can quickly grow beyond computational capacity. There is no fundamental remedy for this dimensionality problem, so the numerical strategy is to devise and use thrift tricks wherever possible. The main steps are outlined here.

5.1 State Space Reduction

The root challenge is the large state space, so it is ideal to start by reducing its size as much as possible. Fortunately, there is some scope for reduction by exploiting features of the model. For example, we can exploit the symmetry of agents to lump states together and reduce their number. Also, as it turns out, it is numerically innocuous to reduce the consumption states (from [0, \infty)) to just two (high- and low-growth states) in our setting. (See the Appendix for details and justification.)

5.2 Computing the Stochastic Discount Factor

The next step is to calculate the stochastic discount factor (SDF) from equation (15) point-wise. Fortunately, it can be shown that the task reduces to solving numerically (see Lemma 4, Appendix):

\tilde{c}^i(S_t)^{-\gamma} + \tilde{B}^i(S_t)/k = \tilde{c}^j(S_t)^{-\gamma} + \tilde{B}^j(S_t)/k, \quad \forall (i, j),   (18)

subject to

\sum_{i=1}^{N_A} c^i(S_t) = C(S_t).

Even this reduced task can be computationally cumbersome, especially since the calculation needs to be carried out N_B^{N_A} times with N_B > 50. In the Appendix, I outline the `progressive tax' algorithm used to find the solution efficiently.

5.3 Stationary Distribution

By the Perron-Frobenius theorem (see Grimmett and Stirzaker, 2001), the Markov chain in this model has a stationary distribution, and this distribution is a prerequisite for deriving asset prices. Computing it amounts to solving (16). This can be done either outright (i.e., by finding the left null space of P - I) or through Monte Carlo simulation.
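As a concrete illustration of the eigenvector route, the following minimal sketch solves (16) for a small, dense transition matrix; the 3-state matrix shown is a hypothetical placeholder rather than the model's actual P, and the function name is mine.

```python
import numpy as np

def stationary_distribution(P):
    """Solve pi = pi P (equation (16)) for a row-stochastic matrix P:
    pi is the left eigenvector of P associated with eigenvalue 1,
    normalized so that its entries sum to one."""
    eigvals, eigvecs = np.linalg.eig(P.T)
    idx = np.argmin(np.abs(eigvals - 1.0))      # eigenvalue (numerically) closest to 1
    pi = np.real(eigvecs[:, idx])
    pi = pi / pi.sum()                          # fix the sign and normalize
    return np.clip(pi, 0.0, None)               # remove tiny negative round-off

# Hypothetical 3-state illustration (placeholder matrix, not the model's P).
P = np.array([[0.90, 0.08, 0.02],
              [0.10, 0.85, 0.05],
              [0.05, 0.15, 0.80]])
pi = stationary_distribution(P)
print(pi)
print(pi @ P)                                   # should reproduce pi up to round-off
```

For the model's actual P, which is large and sparse, the same computation would instead use a sparse eigensolver (e.g., scipy.sparse.linalg.eigs) or the null space of P^T - I, as described above and in the Appendix.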
The two avenues have pros and cons computationally, as outlined in the Appendix, and it turns out that the null-space route is more fruitful in the current setup. The computational challenge of the null-space route is the geometrically exploding size of the state space (|C| \cdot N_B^{N_A}), combined with the fact that N_B typically needs to be large (> 50) in order to approach any hint of convergence. Again, there is no fundamental remedy for this, but state-space-regulating ideas (for example, the `Lebesgue-style approximation' detailed in the Appendix) make the task manageable enough to get close to convergence. Once the stationary distribution and the optimal consumption process are determined, the Euler equations can be used to compute unconditional asset prices, which in turn yield returns and risk premia.

5.4 Convergence Issues

A combination of the techniques mentioned above allows us to explore multiple agents (N_A ≤ 8) and plot the returns on `equity' and `bond'. Figure 6 illustrates that for any given number of agents (N_A), we need a reasonably large N_B (> 50) to reach convergence. Yet the exponentially increasing state space makes the computation difficult with a large number of agents (N_A), increasingly so for larger values of N_B. The state-space-regulating techniques, along with the aid of High-Performance Computing resources, extend the computational frontier from 3 to 8 agents. With this extension, the trend of convergence becomes clearer. The economic intuition behind this convergence (as N_A increases) is discussed when I interpret the comparative statics results.

6 Results and Discussion

Table 5 provides a first glimpse of the key variables in a baseline setting. The parameters used to generate this table (q_1 = 1%, q_2 = 4%, γ = 3.36, δ = 0.6, k = 3, β = 0.99) are all subjects of the comparative static analyses in the sections that follow. Despite its parsimony, the dis-utility model overall matches key features of asset pricing moments reasonably well, including the equity premium puzzle, the risk-free rate puzzle, and the excess volatility puzzle.^21 Note also that increasing the number of agents (N_A) changes the results. This feature, along with its underlying mechanism, is explored in a subsequent section.

^21 The model does magnify the underlying consumption volatility by more than a factor of 2, but it does not match the quantitative magnitude of the excess volatility. To match moments more closely, one could use an alternative specification for the dividend process to inject an additional source of volatility into stock returns, but for the sake of parsimony I do not pursue this possibility.

Table 5: A Snapshot of Key Variables

                                             N_A = 2    N_A = 3
Equity Premium          E[r^e_m] (%)            6.5        7.9
                        σ[r^e_m] (%)            4.7        6.1
Risk-free Rate          E[r^f] (%)              2.6        1.9
                        σ[r^f] (%)              3.0        3.7
Consumption Process     E[c] (%)                2.0        2.0
("Mehra-Prescott")      σ[c] (%)                3.0        3.0
                        AC_1[c] (%)           -14.0      -14.0

The dis-utility model nests Mehra and Prescott's pure consumption-based model (CCAPM) as a special case. Namely, it can be shown numerically that when the probability of dis-utility shocks is 0 (i.e., q_1 = q_2 = 0 and N_A = 1), the model reverts to the CCAPM. Hence the departures from the Mehra-Prescott world are all contained in the dis-utility parameters. Table 6 shows sample values from the two models given some dis-utility parameters while imposing common consumption parameters, including risk aversion. The first row shows that the computed results recover the equity premium puzzle in the Mehra-Prescott setting.
The second row shows that the equity premium is exhausted under the given dis-utility parameter setting (q_1 = 1%, q_2 = 4%, N_A = 3).

Table 6: Dis-utility Model Holds CCAPM As a Special Case

                                     Risk-free     Return on     Equity
                                     Rate (%)      Equity (%)    Premium (%)
Mehra-Prescott                         9.22           9.83          0.61
(q_1 = 0%, q_2 = 0%, N_A = 1)
Dis-utility                            0.03          10.00          9.97
(q_1 = 1%, q_2 = 4%, N_A = 3)

6.1 Stock Returns and the Risk-free Rate

Adding dis-utility has two immediate consequences: it lowers the risk-free rate and increases stock returns vis-a-vis the CCAPM benchmark. Moreover, the risk-free rate is relatively stable, with a standard deviation of around 3.5%, which is close to the values observed in historical data.

6.1.1 Why Dis-utility Shocks Raise Stock Returns

Stock returns are higher in the dis-utility model because low consumption states (C_L) induce more pain (in the form of higher marginal utility of consumption) than they do in the pure CCAPM. To facilitate an intuitive understanding, we can view this model as a world with three states: the `good' (C_H) state, the `bad' (C_L) state, and the rare but `intolerable' (B_t ≫ 0) state. Critically, (1) the bad state is more likely to invite the intolerable state than the good state is (Assumption 1), and (2) the bad state makes the intolerable state even worse (Assumption 2). In other words, intolerable, stressful events tend to happen more frequently during lower levels of consumption, which in turn means that these tough events tend to hit us precisely when we are less prepared to deal with them. This means that the bad consumption states are perceived as worse than in the pure CCAPM world, due to their association with the intolerable state. Therefore, a security that correlates positively with the aggregate consumption process trades at a lower price.

At a more fundamental level, this is about the question, "why are we afraid of pecuniary distress, low consumption and recessions?" If fluctuation in consumption were all about substituting regular produce for high-quality produce, or about having to postpone consuming durable goods for some time, then stock returns based on these considerations alone would seem preposterously high: the equity premium puzzle. Yet if we acknowledge that pecuniary distress, albeit only on rare occasions, has the potential to invite events that destroy the very fabric of our lives, it is less puzzling why equity - a security that does not help during the most distressful of times - is traded at such low prices. At a more formal level, it is interesting to note that the dis-utility process B_t satisfies the two conditions laid out in Merton's ICAPM (1973) for a state variable (other than consumption) to affect asset prices. In ICAPM language, the presence of dis-utility shocks alters the CCAPM prices because high consumption (C_H) is a (weak) hedge against rare but poignant adversities.

6.1.2 Why Dis-utility Shock Lowers the Risk-free Rate and Keeps it Stable

The flip side of the equity premium puzzle is the `risk-free rate puzzle'. In consumption-based models, the risk-free rate is determined as the point that balances the precautionary savings and inter-temporal substitution motives. A strong precautionary savings motive pulls the risk-free rate down, whereas the inter-temporal substitution motive (given positive expected consumption growth) pushes the risk-free rate up.
The risk-free rate puzzle stems from the fact that, for reasonable values of risk aversion (γ < 10, say), the inter-temporal substitution effect dominates, and it takes nonsensically high risk aversion for the precautionary savings motive to lower the risk-free rate to the levels we observe in reality.

The precautionary savings motive is determined by how agents perceive the future: if the future looks too risky, agents save more, pulling the risk-free rate down, and vice versa. In the dis-utility setup, the main task is to weigh today's dis-utility against the risk of a fresh dis-utility shock arriving tomorrow. Faced with the possibility of a pungent dis-utility shock, agents perceive the future as risky enough to save up for, even knowing that consumption is only an incomplete `insurance' against it. In short, dis-utility shocks make tomorrow scary enough to bring precautionary savings to the forefront and to counteract the inter-temporal substitution motive, keeping the risk-free rate low and stable. For many of us who `save for rainy days', this aspect may speak to our hearts more than models in which the inter-temporal substitution motive dominates, dictating that we eat into the future because we expect the economy to grow. These two forces counterbalance each other to generate a relatively stable risk-free rate, with σ(r^f) ≈ 3.5% for parameter values whose equity returns and risk-free rates match historical values.

6.2 Multiple Agents

The goal of this section is to understand Figure 6, especially what happens as the number of agents (N_A) increases. The parameters used to generate this figure are (q_1 = 1%, q_2 = 4%, γ = 3.5, δ = 0.7, k = 3, β = 0.99).^22 The horizontal axis is N_B, i.e., how thinly we partition the interval [0, B_max]. The vertical axis is the equity premium, each curve representing a different value of N_A. In particular, we are interested in establishing convergence of the quantity:

\lim_{N_A \to \infty}\ \lim_{N_B \to \infty} f(N_A, N_B).   (19)

Convergence as N_B → ∞ is clear from the figure, and is a straightforward consequence of the finer discrete approximation of the B_t process. Of greater interest is the outer limit. We can make two immediate observations from the figure: (1) the equity premium increases monotonically with N_A; (2) the monotonic increments diminish over N_A and already show some sign of convergence at N_A = 8. The question is why.

^22 We recycle this baseline configuration for all the results going forward unless otherwise stated.

Figure 6: Convergence

6.2.1 Why Equity Premium Rises With N_A

The reason why the equity premium rises with the number of agents (N_A) can best be understood by comparing a Robinson Crusoe economy (N_A = 1) against a full-fledged economy equipped with a savings and lending platform (N_A → ∞). If Robinson Crusoe is hit by a dis-utility shock, it is impossible to share the burden with any other entity. With multiple agents, however, agents are able to share the burden by borrowing and lending, which we can see explicitly in Equation (18). Roughly speaking, agents trade until their induced marginal utilities are equated. This trade generates a form of `contagion' of the dis-utility shock, in the sense that the higher (induced) demand from an agent in dis-utility is met by lending from other agents, hence increasing the marginal utility of the lender as well.
It is this `contagion' that drive the price of equity downwards, because with multiple agents (N A > 1), a given C L is now associated with multiple sources of dis-utility shocks, rather than a singular source. Ironically, the presence of `contagion' is good news to all. A Robinson Crusoe with dis-utility shock would be eager to share the induced marginal utility spike, but is unable to do so, and hence the motive is not priced. In the multiple-agent economy with banks however, an agent is able to trade o the very sharp spikes of dis-utility shocks for a more frequent chance of less sharp dis-utility spikes arising from these `contagions.' This dulling eect provides every agent with a layer of insurance, whose `insurance premium' is now priced into the equity premium. Fundamentally, this form of risk-sharing is perhaps the very raison d'^ etre of the nancial market. 35 6.2.2 Why Asset Prices Will Converge With N A !1 Now that we understand why equity premia increase with N A , the question now is to understand why the increments will vanish with N A !1. Recall: S t (!) =fC t ;B 1 t ;B 2 t ; ;B N A t g; hence specifying a sample point ! in the Markov state space means specifying the consumption state and the cross-sectional distribution ofB t . Meanwhile, the distribution ofB t essentially de- termines the Arrow-Debreu price, and hence equilibrium asset prices. It is therefore important to understand how the typical distribution ofB t would evolve as we increase N A . Figure 7 compares the typical !'s for N A = 1 (left panel) and N A = 8 (right panel). The vertical axis denotes multiples of B, the `meter stick' unit of dis-utility shock. For the sake of pricing, it is the `topography' of this graph that matters, perhaps more succinctly, its mean, variance, skewness, kurtosis, etc. The topography, and hence the relevant moments of the graph will change drastically Figure 7: SampleB t Distributions for Given !'s as N A goes from 1 to 2. However the magnitude of these changes will decay fast as N A goes from 2 to 3, 3 to 4, , etc. This convergence is analogous to the convergence of binomial to normal distribution. We can then easily intuit that the Arrow-Debreu prices implied by the distributions will converge fast as well. We conrm that Figure 6 is indeed the typical shape of convergence (over N A ) of every parameter congurations we have tried. Going forward, we will report results forN A = 3, andN B = 50. This is a conservative estimate of the equity premium, and they can be 36 obtained by ordinary computational power. 6.3 \Risk aversion": Figure 8 shows stock returns, risk-free rate and equity premium for dierent values of . As in the toy model and standard asset pricing models, higher increases stock returns. Similarly, higher `scares' agents, leading to more precautionary savings and lower risk-free rate. Dis-utility shocks further amplify these channels. Intuitively, the presence of dis-utility shocks `concavies' the utility function because it induces higher marginal utility during low consumption states, leading to higher stock returns. In addition, as per the previous discussion on the stochastic discount factor (M t ), higher increases the agent's sensitivity to dis-utility growth ( B t+1 Bt ), further strengthening the covariance with the aggregate consumption process and the precautionary savings channel. Overall, results are sensitive to . 
Figure 8: Risk Aversion ( ) 6.4 Persistence of Shock () As seen in Figure 9 persistence of dis-utility shocks () aects asset returns, and the relationship is non-monotonic. 37 6.4.1 Stock Returns Stock returns increase with low values of but declines for larger values of . This phenomenon has most to do with correlation betweenB t andC t : When 0, low consumption state (C L ) is not terrible, since they invite dis-utility shocks that last only a single period. However, when 0, dis-utility shock has more bite because it lasts longer. This adds more pain to the low consumption state (C L ), and commands bigger stock returns. However, this relationship is not monotonic. When 1, the persistence is so large that it is entirely dominated by past events, not by the current consumption state. This weakens the relationship betweenB t and C t : Bad consumption states no longer translate directly to highB t and hence, stock returns drop. 6.4.2 Risk-free Rate When 0, shocks are completely transient. Transient dis-utility shock can be smoothed away inter-temporally by borrowing from the future, and hence risk-free rate must move up to clear the market. On the other hand, if dis-utility shocks are persistent ( 1), the future is as grim as today so the shock cannot be smoothed away by borrowing into the future. This induces more precautionary savings from the agent, and drives the risk-free rate down. Figure 9: Persistence of Shock () 38 6.5 Strength of Assumption 1: q 2 q 1 The quantity q 2 q 1 captures the strength of Assumption 1. It measures the degree to which bad consumption states induce dis-utility shocks, over and above the baseline likelihood during good consumption states. Because our focus is on devastating shocks rather than `mood swings', the reasonable range of q 2 q 1 is probably from 0% to less than 10%. Figure 10 xes q 1 at 1 % and varies q 2 from 0 to 10%. The result is intuitive. Stock returns go up because higher likelihood of dis-utility shocks makes C L more painful. Again, precautionary savings motives drive the risk-free rate down. Figure 10: Assumption 1: q 2 q 1 6.6 Strength of Assumption 2: The strength of Assumption 2 is encapsulated in parameter of (6). Figure 11 depicts asset returns for values of k in the range of 2 to 5. Recall that = 2 implies a `shield value' of 50% (high), and = 5 implies a `shield value' of 20% (low). If `shield value' is high, an extra unit of consumption is dear to the agent especially during bad consumption state (C L ). Namely, higher shield value leads to higher `induced' marginal utility, due either to reparation costs or double jeopardy aversion. Therefore, a security that does not help during C L states (equity) must be priced at a further 39 discount. Hence, stock return is highest when shield value is highest ( = 2). Similar argument goes for risk-free rate. Higher shield value encourages more precautionary savings because of its value during C L , and this drives the risk-free rate down. Figure 11 aligns with these intuitions. Higher shield values (lower values) correspond to higher stock returns and lower risk-free rates, and hence higher equity premia. Figure 11: Assumption 2: 6.7 Size of Shock The size of shocks also aects equity premium. As previously outlined, the `meter stick' of dis-utility shock (B) has been set to be comparable to a 20% drop in consumption level. Figure 12 varies the shock from 1 4 B to 2B, where B is the `meter stick' size as before. With 1 4 B, the asset prices are close to the Mehra-Prescott benchmark. 
As expected, higher magnitudes of shock command higher equity premia.

Figure 12: Size of Shock

6.8 Price-Dividend Ratio: Pro-cyclical and Autocorrelated

The dis-utility model also matches key features of the price-dividend ratio: pro-cyclicality and autocorrelation that decays over time. As seen in Table 7 below, the price-dividend ratio in this model is (mildly) pro-cyclical, in line with empirical observations. However, its magnitude (volatility) is much lower than in the data.

Table 7: Price-Dividend Ratio by Consumption States

                               P/D ratio
C_H (High Consumption)           27.9
C_L (Low Consumption)            26.6

One key element in the determination of an asset's price today is the comparison of the `value attached to money' today versus tomorrow. Suppose the consumption state is good today (C_H). On average, this implies a low B_t today, and therefore the marginal utility of consumption today (MU_c) is low. This in turn means that the conditional stochastic discount factor (conditional on the high consumption state today) is high, leading to a high asset price and, in particular, a high price-dividend ratio, ceteris paribus. By a symmetric argument, the price-dividend ratio is low during low consumption states (C_L); hence the pro-cyclicality.

As an illustration of how the price-dividend ratio is affected by the dis-utility shock, Figure 13 depicts the price-dividend ratios state by state for the simplified case of N_A = 2. The states on the horizontal axis are as defined in the previous section: the first half (states 1 to 625) pertains to the good consumption state (C_H) and the second half (states 626 to 1250) to the low consumption state (C_L), both increasing in the level of the dis-utility shock (B_t), lexicographically in the B_t level of the first agent, then the second. (For example, state 1 means C_H with B_t = 0 for both agents 1 and 2.) The picture clearly shows that the price-dividend ratio is decreasing in the amount of dis-utility shock (B_t) present in the system. Moreover, since the high dis-utility states (with lower price-dividend ratios) carry higher probability mass in the bad consumption state (C_L), the average price-dividend ratio is lower in the low-consumption state and higher in the high-consumption state, generating the pro-cyclicality we observe.

Figure 13: Price-Dividend Ratios: State by State

Lastly, the price-dividend ratios in the dis-utility model are autocorrelated, with the magnitude decaying over time, as is observed empirically (Table 8). This feature is a direct consequence of the decaying nature of the B_t process.

Table 8: Price-Dividend Ratio: Autocorrelation

Lag    N_A = 2    N_A = 3
 1       0.73       0.70
 2       0.51       0.46
 3       0.34       0.30
 4       0.21       0.18
 5       0.12       0.11
 6       0.07       0.06
 7       0.04       0.04
 8       0.02       0.02
 9       0.01       0.01

6.9 Price-Dividend Ratio: Return Predictability

Stock return predictability - in this case with price-dividend ratios - is an interesting question, both empirically and theoretically. Earlier results on this issue, such as Fama and French (1989), find some evidence of predictability, whereas more recent results, such as Goyal and Welch (2008), are cautiously more inconclusive. In this section, I show that the dis-utility model has a natural platform to deliver return predictability, and I suggest modifications to the baseline model that would bring its return predictability results closer to the observed empirical values.
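The decaying autocorrelation above, and the predictability mechanism discussed next, both trace back to the slow, geometric decay of the B_t process. The small simulation below illustrates this; it uses the baseline shock probabilities q_1, q_2 and a decay rate in the spirit of the calibration, but the two-state consumption chain's switching probability is a placeholder of my own choosing, so the numbers are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters: q1, q2 and delta follow the baseline text;
# p_stay (probability of remaining in the same consumption state) is assumed.
q1, q2 = 0.01, 0.04        # shock probability in the H and L consumption states
delta = 0.7                # geometric decay (persistence) of the dis-utility scar
dB = 1.0                   # 'meter stick' shock size, normalized to 1
p_stay = 0.5
T = 200_000

B = np.zeros(T)
high = True                # start in the high-consumption state
for t in range(1, T):
    if rng.random() > p_stay:
        high = not high
    q = q1 if high else q2
    B[t] = delta * B[t - 1] + (dB if rng.random() < q else 0.0)

def autocorr(x, lag):
    x = x - x.mean()
    return (x[:-lag] * x[lag:]).sum() / (x * x).sum()

for lag in range(1, 10):
    print(lag, round(autocorr(B, lag), 2))
```

The sample autocorrelations fall off roughly geometrically with the lag, which is the pattern the price-dividend ratio inherits in Table 8.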
6.9.1 The Mechanism of Return Predictability

The consumption-based asset pricing literature has uncovered at least two distinct strands of mechanisms (and many more `hybrids') known to deliver return predictability. One strand imposes structure predominantly on the consumption process itself (e.g., Bansal and Yaron, 2004), while another imposes structure on the discount factor (e.g., Campbell and Cochrane, 1999). The latter typically involves a slow-moving, mean-reverting process (e.g., the `habit' S_t) that affects the price of risk (or risk aversion), which in turn determines asset returns and prices.

As discussed in a previous section, the model I propose imposes structure on the `value of money' in each state and therefore resembles the second strand. In fact, this model features a built-in mechanism that naturally induces predictability of future returns (r^e_{t+1}) by current price-dividend ratios (P_t/D_t). Recall that the dis-utility process B_t is a slow-moving, mean-reverting, and hence predictable process due to its geometric decay. We have also seen (in Figure 13) that B_t has a strong (negative) relationship with the price-dividend ratio. Therefore, price-dividend ratios inherit the predictability of B_t. In addition, if there is a clear relationship between B_t and future returns (r^e_{t+1}) - which there is - we can use this relationship to project the predictability of B_t onto future returns as well. That is, the predictive relationship between P_t/D_t and r^e_{t+1} is generated via B_t. It is worth noting the parallel between B_t and the `habit' in Campbell and Cochrane (1999), which is the key ingredient generating predictability in their model. The B_t process provides a natural interpretation of the `habit', which in their model is an abstractly (reverse-)engineered process designed to generate the movements needed to justify observed asset prices.

[Schematic: B_t drives both P_t/D_t and r^e_{t+1}, generating the predictive relationship between the two.]

6.9.2 The Relationship Between B_t and r^e_{t+1}

To understand the relationship between B_t and r^e_{t+1}, first recall the Hansen-Jagannathan bound:^23

\text{Sharpe Ratio} \leq \frac{\sigma_t(M_{t+1})}{E_t[M_{t+1}]},   (20)

and that the stochastic discount factor is:

M_{t+1} := \beta\, \frac{c_{t+1}^{-\gamma} + \frac{1}{k} B_{t+1}}{c_t^{-\gamma} + \frac{1}{k} B_t}.

Well-known mechanisms predict that a high B_t would lead to a higher r^e_{t+1}. That is, a high B_t would, on average, result in a lower M_{t+1} given our geometric decay structure, hence a lower E_t[M_{t+1}], a higher bound on the Sharpe ratio, and a higher r^e_{t+1}, ceteris paribus.^24 More intuitively, a high B_t today increases the price of risk, and hence commands higher future expected returns. This is certainly a valid channel, and it is operative in this model as well. Note that this relationship ensures that a high price-dividend ratio today predicts low future expected returns, as in Fama and French (1989).

In this model, however, there is a distinct force that counteracts and dominates the conventional channel, and it predicts that a high P/D ratio in fact forecasts higher returns in the short run, with the predictability dying out in the long run. Equations (11)-(12) show that the stochastic discount factor is in fact very closely related to the growth of the dis-utility process, B_{t+1}/B_t.

^23 Alternatively, we can use the expression for the excess return, E_t[r^e_{t+1}] = -\,\mathrm{cov}_t(M_{t+1}, r_{t+1}) / E_t[M_{t+1}], to arrive at identical conclusions.

^24 Alternatively, given the negative relationship between B_t and P_t, this implies a positive relationship between B_t and r^e_{t+1} by the familiar Campbell-Shiller decomposition.
But this ratio is higher during C_H states than in C_L states, which also translates into a higher variance of M_{t+1}, thereby increasing the numerator in Equation (20). Intuitively, given an identical size of dis-utility shock, dis-utility growth is higher when an individual is enjoying a rather placid life than when (s)he is already going through adversities, and it is this dis-utility growth that matters in the model. This short-term pro-cyclicality of excess returns is reminiscent of the findings of Poterba and Summers (1988) and Fama and French (1989). The prediction is distinct from the majority of the literature, which predicts a negative relation between the P/D ratio and future returns, despite the fact that this model also shares the feature of a slow-moving process governing the price of risk. In short, the underlying mechanism of this model reveals a distinct channel that `loads' similar slow-moving shocks in a different manner than conventional models do. The distinct predictions are interesting, especially in light of the conflicting empirical findings on asset return predictability.

6.9.3 Predictability Regression and an Alternative Specification

Table 9 reports the coefficients in the regression:

r_{t+1} + r_{t+2} + \cdots + r_{t+j+1} = \alpha_j + \beta_j\, \frac{D_t}{P_t} + \epsilon,

where r_t denotes the period-t stock return, generated from the baseline model (with (6) as the utility function).

Table 9: Return Predictability

  j        β_j      β_j − β_{j-1}
  1       -1.71          -
  2       -3.69        -1.98
  3       -5.75        -2.06
  4       -7.61        -1.87
  5       -9.10        -1.49
  6      -10.13        -1.03
  7      -10.78        -0.65
  8      -11.18        -0.39
  9      -11.40        -0.22
 10      -11.53        -0.12

Since the regressor is the reciprocal of the price-dividend ratio, the negative sign represents the pro-cyclicality of stock returns, which is pronounced until j = 3 and then dies out gradually. This shows the dominance of the dis-utility growth channel over the conventional channels, given our baseline specification.

To recover the return predictability signs documented in Fama and French (1989), it would suffice to specify a setting that induces counter-cyclicality of σ_t(M_{t+1}). An alternative specification of the utility function that would achieve this is, for example:

U(C_t, B_t) = u(C_t) - e^{\eta B_t / \bar{B}} \left[ 1 - \frac{1}{k}\, \frac{C_t - C_D}{C_H - C_L} \right], \qquad (\eta > 0,\ \bar{B} > 0),   (21)

where \bar{B} is a `normalizing' constant. Such a specification puts the dis-utility shocks on the shoulder of an exponential, thereby increasing the weight on high-B_t states. Such a setting implies increasing marginal dis-utility: agents dislike the piling-up of dis-utilities, a feature used in other areas of economics as well. This specification would ensure that σ_t(M_{t+1}) is higher (lower) during higher (lower) B_t states, generating a positive relationship between the current price-dividend ratio and expected returns.

7 Conclusion

As Barro (2006) summarizes, the asset pricing literature has made "continued attempts to find more and more complicated ways to resolve the equity-premium" (and other asset pricing) puzzles. But for an asset that is so representative and so widely owned by the general public, it is questionable whether the ever-growing degree of complexity represents an advance or a retreat. After all, the explanations offered by economists are becoming more and more challenging to understand - in some unfortunate cases, even for those within the same profession. If so, is it reasonable to expect that these convoluted mechanisms are the driving forces behind the actions of the public at large? Offering a more "visceral" explanation, as I do in this paper, is therefore a worthwhile attempt.
46 To this end, I introduce a consumption-based model with an extra state variable: dis-utility process. I saddle the model with some realistic, micro-founded, yet parsimonious assumptions on how the dis-utility shocks \speak to our hearts", especially in relation to the consumption level. The simple setting addresses many of the puzzles regarding aggregate stock price movements. Compared to the pure versions of consumption-based model (CCAPM), this model generates higher and more volatile excess stock returns, low and stable risk-free rate. Further assuming that dis-utility shocks have decaying structure over time yields pro-cyclical and auto-correlated price-dividend ratios, and also has interesting implications on return predictability. The model I suggest clearly has limitations as well. For example, while it does contribute to addressing many of the puzzles qualitatively, it does not completely exhaust the entire magnitude for some of them. 25 Nonetheless, the aim is here not necessarily to generate a model that \matches as many empirical moments as possible", but rather to suggest a plausible, parsimonious mechanism that reinstates a sense of instinctiveness to the explanation for the behavior of a very representative asset that is traded widely and publicly. The general message is that rare disasters can be visceral, idiosyncratic, and yet still aect asset prices under realistic assumptions. Dis-utility shocks may be \dark" matter in the sense that they are personal and do not surface in the public domain. Yet insofar as this dark matter aects utility, and under plausible assumptions that imply its correlation with the aggregate economy, the \dark matter" can help explain the dynamic features of stock prices. 25 For example, the volatility of P D ratio, or the magnitude of excess volatility. 47 8 References [1] Abel, A. (1990): \Asset Prices under Habit Formation and Catching up with the Joneses." American Economic Review, 80, 38-42. [2] Bansal, R., Yaron, A. (2004): \Risks for the Long Run: A Potential Resolution of Asset Pricing Puzzles." Journal of Finance, 59, 1481-1509. [3] Barberis, N., Huang, M., and Santos, T. (2001): \Prospect Theory and Asset Pricing." Quar- terly Journal of Economics, 116, 1-53. [4] Barro, R. (2006): \Rare Disasters and Asset Markets in the Twentieth Century". Quarterly Journal of Economics, 121, 823-866. [5] Barro, R., Ursua, J. (2008): \Macroeconomic Crises Since 1870". Working Paper. [6] Benson, M., Fox, G., DeMaris, A. and Van Wyk, J. (2003): \Neighborhood Disadvantage, Individual Economic Distress and Violence Against Women in Intimate Relationships", Journal of Quantitative Criminology, 19-3. [7] Campbell, J., Cochrane, J. (1999): \By Force of Habit: A Consumption-based Explanation of Aggregate Stock Market Behavior." Journal of Political Economy, 107, 205-251. [8] Campbell, J., Shiller, R. (1988): \The dividend-price ratio and expectations of future dividends and discount factors." Review of Financial Studies, 1, 1-34. [9] Cochrane, J. (2008): \Financial Markets and the Real Economy." Handbook of the Equity Risk Premium, 237-325. [10] Constantinides, G. (1982): \Intertemporal Asset Pricing with Heterogeneous Consumers and without Demand Aggregation." Journal of Business, 55, 253-267. [11] Constantinides, G., and Due, D. (1996): \Asset Pricing with Heterogeneous Consumers." Journal of Political Economy, 98, 519-543. [12] Constantinides, G., and Ghosh, A. (2017): \Asset Pricing with Countercyclical Household Consumption Risk." 
Journal of Finance, 72-1, 415-460. [13] De Nardi, M., French, E., and Jones, J. (2010): \Why Do the Elderly Save? The Role of Medical Expenses." Journal of Political Economy, 118-1, 39-75. 48 [14] Engelberg, J., and Parsons, C. (2016): \Worrying about the Stock Market: Evidence from Hospital Admissions". Journal of Finance, 71-3, 1227-1250. [15] Fama, E., and French, K. (1989): \Business Conditions and Expected Stock Returns". Journal of Finance, 25, 23-49. [16] Fredrickson, B. and Kahneman, D. (1993): \Duration Neglect in Retrospective Evaluations of Aective Episodes." Journal of Personality and Social Psychology, 65-1, 45-55. [17] Finkelstein, A., Luttmer, E., and Notowidigdo, M. (2013): \What Good Is Wealth Without Wealth? The Eect of Health on the Marginal Utility of Consumption." Journal of the European Economic Association, 11, 221-258. [18] Grimmett, G., Stirzaker, D. Probability and Random Processes. Oxford University Press. 2001. [19] Goyal, A., and Welch, I. (2008): \A Comprehensive Look at the Empirical Performance of Equity Premium Prediction." Review of Financial Studies, 21, 1455-1508. [20] Homan, S. and Duncan, G. (1995) \The Eect of Incomes, Wages, and AFDC Benets on Marital Disruption." The Journal of Human Resources, 30, 19-41. [21] Jofre-Bonet, M., Serra-Sastre, V. and Vandoros, S. (2018) \The impact of the Great Recession on health-related risk factors, behaviour and outcomes in England", Social Science Medicine, 197, 213-225. [22] Kahneman, D., Fredrickson, B. and Schreiber, C. (1993): \When More Pain Is Better To Less: Adding a Better End". Psychological Science, 4-6, 401-405. [23] Kahneman, D., and Tversky, A. (1979): \Prospect Theory: An Analysis of Decision under Risk". Econometrica, 47, 263-291. [24] Kalmijn, M., Loeve, A. and Manting, D. (2007): \Income Dynamics in Couples and the Dissolution of Marriage and Co-habitation." Demography, 44-1, 159-179 [25] Kools, L., and Knoef, M.G. (2018): \Health and the Marginal Utility of Consumption: Esti- mating Health State Dependence Using Equivalence Scales." Working Paper [26] Krusell, D., and Smith, A. (1998): \Income and Wealth Heterogeneity in the Macroeconomy". Journal of Political Economy, 106, 867-896. [27] Lewin, A. (2005): \The Eect of Economic Stability on Family Stability Among Welfare Recipients". Evaluation Review, 29-3, 223-240 49 [28] Lillard, L. and Weiss, Y. (1997): `Uncertain Health and Survival: Eects on End-of-Life Consumption." Journal of Business & Economic Statistics, 15-2, 254-268 [29] Lucas, D. (1994): \Asset Pricing with Undiversiable Income Risk and Short Sales Constraints Deepening the Equity Premium Puzzle". Journal of Monetary Economics, 34, 325-341. [30] Lucas, R. (1978): Asset Prices in an Exchange Economy. Econometrica, 46, 1429-1445. [31] Ljungqvist, L., Sargent, T. Recursive Macroeconomic Theory. The MIT Press. 2004. [32] Mankiw, G (1986): \The Equity Premium and the Concentration of Aggregate Shocks". Jour- nal of Financial Economics, 17, 211-219. [33] Mehra, R (2003): \The Equity Premium: Why Is It a Puzzle? (corrected)". Financial Analysts Journal, 59, 54-69. [34] Mehra, R., and Prescott, E. (1985): \The Equity Premium a Puzzle". Journal of Monetary Economics, 15, 145-161. [35] Merton, R. (1973): \An Intertemporal Capital Asset Pricing Model". Econometrica, 41, 867- 887. [36] Ono, H. (1998): \Husbands' and Wives' Resources and Marital Dissolution", Journal of Mar- riage and Family, 60-3, 674-689 [37] Pool, L., et al. 
(2018): \Association of a Negative Wealth Shock With All-Cause Mortality in Middle-aged and Older Adults in the United States", JAMA, 319-13, 1341-1350. [38] Rietz, T. (1988): \The Equity Risk Premium: A Solution". Journal of Monetary Economics, 22, 117-131. [39] Roussanov, N. (2010): \Diversication and Its Discontents: Idiosyncratic and Entrepreneurial Risk in the Quest for Social Status". Journal of Finance, 65, 1755-1788. [40] Schmidt, L. (2016): \Climbing and Falling O the Ladder: Asset Pricing Implications of Labor Market Event Risk". Working Paper [41] Seeman, T., Thomas, D., Merkin, S., Moore, K., Watson, K. and Karlamangla, A. (2018): \The Great Recession worsened blood pressure and blood glucose levels in American adults", PNAS, 115-13, 3296-3301. 50 [42] Summers, L. and Poterba, J. (1988): \Mean Reversion in Stock Prices". Journal of Financial Economics, 22, 27-59. [43] Telmer, C. (1993): \Asset-Pricing Puzzles and Incomplete Markets". Journal of Finance, 48, 1803-1832. 51 Appendices A Support for Assumptions 1&2 A.1 Additional Supporting Data for Assumption 1 Table 10: Violent Crime Victimization, by Income Household Income 2015 a 2016 b (per 1,000) (per 1,000) <$10,000 17.5 15.1 $10,000-$14,999 12.0 10.0 $15,000-$24,999 8.2 13.5 $20,000-$34,999 5.5 6.0 $30,000-$49,999 7.1 6.6 $50,000-$74,999 5.9 5.0 >$75,000 4.5 3.9 a Source: Bureau of Justice Statistics, National Crime Victimization Survey (2015) b Source: Bureau of Justice Statistics, National Crime Victimization Survey (2016) Figure 14: Risky and Criminal Behavior (Youths), by Income Source: Brookings Policy Memo (2014) and Kent (2009) 52 A.2 Lemma 2: \D Increases MU C " Let C denote consumption, and R denote reparation cost. Since reparation also needs to be consumed (e.g., hiring a lawyer), it constitutes part of consumption (C): C = ~ C +R; where 0<R<R<C. The upper bound R represents the fact that there is limit to what can be done about dis-utility shocks: it is not fully reparable. ~ C is the `ordinary' consumption from which agents derive CRRA utility. Consider the utility function of the form u(C) =u ~ C ( ~ C) +u R (R) 1 fDg : In this specication, u(C) is precisely the CRRA utility function, except during times of dis- utility shock, in which case, the second term is `turned on' to incorporate the agent's reparation consumption (R). I assume that given 1 D = 1: @u(C) @R R > @u(C) @ ~ C ~ C=CR : (22) This assumption is simply to say that within the available range of R; (0 < R < R), agents prioritize R over ordinary consumption ( ~ C) during dis-utility shocks, as per the higher marginal utility postulated by the assumption. 26 Lemma 2. Assume (22) and u 00 R (R)< 0. Then @u(C) @C 1 fDg =1 > @u(C) @C 1 fDg =0 : Proof. @u(C) @C 1 fDg =1 = @ max 0<R<R u ~ C (CR) +u R (R) @C = @ u ~ C (CR) +u R (R) @C = @u ~ C (CR) @ ~ C @ ~ C @C = @u ~ C (CR) @ ~ C > @u ~ C (C) @ ~ C = @u(C) @C 1 fDg =0 ; 26 This assumption is a sucient condition that makes the proof transparent, which can certainly be relaxed. 53 where the second equality follows from: @ u ~ C (CR) +u R (R) @R = @u ~ C ( ~ C) @ ~ C + @u R (R) @R > @u ~ C (CR) @ ~ C + @u R (R) @R >0; given the priority on R during dis-utility shocks, (22). B Lemma 1 B.1 Proof Proof. Note that preference of A over B is equivalent toj @U @D j C H <j @U @D j C L . Since @U @D < 0, this means @ @U @D @C > 0. Hence, AB () @ @U @D @C > 0 () @MU @D > 0; establishing the equivalence of (i) and (ii). 
The equivalence of (iii) follows from extending the continuity axiom on C (which is implicitly assumed in the CRRA representation of consumption- based utility) to B. B.2 A `Sanity Check' Using Lemma 1-(iii) The parameter `' in Lemma 1-(iii) eectively asks the following question. \During a dis-utility shock, How much should B be decreased (at least) to accept the decrease in consumption: C H ! C L ?" It is, in spirit, an equivalent variation, where the equivalence is between the utility variations stemming from consumption and dis-utility shock. As a sanity check, we can ask whether these two sources of variations make sense (i.e., whether they are approximately equal in magnitude) when converted to consumption-based utility terms. Since we are using the Mehra-Prescott moments for the consumption process, 3%. Therefore, C H ! C L represents a 6% drop in C. Meanwhile, recall that B represents a utility drop that is equivalent to a 20% drop in consumption. Also, 4 yields an equity premium commensurate with historical average. Thus, roughly speaking 27 , the shield value of consumption amounts to a 27 Certainly, this is abstracting away from convexity of the CRRA utility function, which is a conservative abstrac- 54 5% (= 1 20%) reduction in consumption, which is on par with the 6% reduction of the consumption process. This is just an approximation, but it serves the purpose which is to provide a `sanity check' that the shield value assumption quantitatively aligns with its consumption-based counterpart, based on the notion of equivalent (compensating) variation in microeconomics. C Sources of Dis-utility Shocks Remain Stable Over Time The graphs below provide some suggestive evidence that unlike rare disasters, sources of dis-utility shocks have not declined over time. The left panel (a) shows trends in divorce rate since 1880. While socio-economic factors may certainly be driving the trend, it has by no means declined. Similarly, panel (b) shows suicide and homicide rates in the US since 1990. The trend is, overall, stationary. (a) Source: CDC (National Center for Health Statis- tics), Randal Olsen (b) Source: \Homicide and Suicide in America, 1900- 1998", David C. Stolinsky, Figure 15: Divorce, suicide and homicide rates since 1900, (US) D Lemma 3: \Single-Period Bond is Enough" The purpose of this lemma is to justify the requirement of a complete goods market in this model by showing that its allocation can be attained simply by introducing single-period bonds. Lemma 3. Assume no-Ponzi, that agents own equal initial wealth in the form of equity 28 (i.e., tion. 28 The only role of equity in this Lemma is to map aggregate dividends to individual income. It plays no risk- sharing role whatsoever. Any distribution of aggregate dividend is ultimately redistributed by trading 1-period bonds. Given a separate individual income process, (e.g., labor income process) the presence of equity would be completely redundant. 55 i 0 = 1 N A ), and thatB 0 = 0 (8i). Letf~ c i (S t )g t=1 t=0 , (i 2 I) be the equilibrium allocation in a SME with complete goods market. A SME with 1-period bonds alone achieves the same outcome as f~ c i (S t )g t=1 t=0 , (i2I). Proof. We start by gathering some immediate facts. First recall the well known result (e.g., Ljungqvist and Sargent 2004) that under the given assumptions, the allocationsf~ c i (S t )g t=1 t=0 ; (i2I) is also an Arrow-Debreu equilibrium, which satises perfect risk-sharing. 
Given the current as- sumptions, (in particular, symmetric initial wealth andB 0 distribution) this means that (18) holds point-wise: ~ c i (S t ) +B i (S t ) = ~ c j (S t ) +B j (S t );8S t ;8(i;j): Second, it can be shown (using e.g., Perron-Frobenius theorem; see Grimmett and Stirzaker, 2001) that there exists a stationary distribution for the (Markov) dis-utility model. Moreover, this sta- tionary distribution coincides with limiting behavior ast!1. Given the symmetric structure and no-Ponzi condition, the limiting asset holdings are also symmetric, and consumption levels are also symmetric across the agents in the limit. Using these facts, the proof is by contradiction. Letf^ c i (S t )g t=1 t=0 , (i2I) be the equilibrium allocation of the SME with 1-period bonds alone, and suppose ^ c i (S t )6= ~ c i (S t ) for someS t . Given strictly diminishingMU C of CRRA utility, this implies that (18) does not hold on that particular S t , namely, ^ c i (S t ) +B i (S t )6= ^ c j (S t ) +B j (S t ); (23) for some (i;j) at S t . Since we are still assuming that f^ c i (S t )g t=1 t=0 ; (i 2 I) is an equilibrium allocation (with only 1-period bonds available in the goods market), the perturbation argument holds. Namely, let U i be the objective function for agent i and let ^ P t (S t ) denote the prevailing 1-period bond price at state S t . Then, @U i @^ c t (S t ) = 1 ^ P t (S t ) @U i @^ c t+1 ; (24) where @U i @^ c t+1 denotes the derivative ofU i with respect to equal increments of ^ c t+1 in all (t+1) states subsequent toS t : (This essentially means that in equilibrium,U i remains unchanged by substituting ^ c i (S t ) for ^ c i (S t ) ^ P (S t ) and ^ c i (S t+1 ) + on allt + 1 states subsequent toS t :) Critically, note that ^ P t (S t ) does not depend oni, as it is the prevailing competitive price. Hence, we can keep iterating this recursively, and obtain: ^ c i (S t ) +B i (S t ) = @U i @^ c i (S t ) =C(k) @U i @^ c i;t+k ; 56 and ^ c j (S t ) +B j (S t ) = @U j @^ c j (S t ) =C(k) @U j @^ c j;t+k ; whereC(k) is a constant that depends only onk. Then combining with (23), we can conclude that @U i @^ c i;t+k 6= @U j @^ c j;t+k (8k), and therefore also in the limit as k!1, which is clearly a contradiction to symmetric allocations in the limit. Although this only shows that the inter-agent distribution rule in an SME and 1-period bond is identical given a state S t , this in fact completes the proof, since the inter-temporal allocation is automatically done by market clearing, given the inter-agent distribution rule. In summary, any individual shock that is fungible with consumption (e.g., income shocks) can be diused away by 1-period bonds alone. This means that any attempt to explain the behavior of equity premium using transient idiosyncratic income shocks will be futile unless we assume strong market constraints. E Lemma 4 The purpose of Lemma 4 is to show that specializing (9) to (10) is an innocuous step that does not hamper with the assumption that the goods market is complete. That is, reducing the budget constraint to (10) does not impose restrictions on the choices of the agents in their optimization. In addition, it justies a simplifying step used in the numerical procedure to compute asset prices, which also leads to Lemma 5. 
Consider the Sequential Market Equilibrium (SME) where each agent seeks to maximize their life-time utility: U i :=E 0 " 1 X t=0 t U c i t (S t );B i t (S t ) # = 1 X t=0 X S t t U c i t (S t );B i t (S t ) (S t ); (8i2I); subject to the pointwise sequential constraints: c i (S t ) + X S t+1 2S t+1 (S t ) P a (S t+1 jS t )a i (S t+1 jS t ) =a i (S t jS t1 );8(t;S t ); and the pointwise market clearing conditions: X i n a i (S t jS t1 ) X S t+1 2S t+1 (S t ) P a (S t+1 jS t )a i (S t+1 jS t ) o = X i c i (S t ) =C(S t ): (25) Denote this maximization problemM 1 . 57 Consider the same objective functionU i , also to be maximized subject to another set of pointwise sequential constraints: c i t (S t ) +P e (S t ) i e (S t ) +P b (S t ) i b (S t ) P e (S t ) +D e (S t ) i e (S t1 ) + i b (S t1 ); subject to the pointwise market clearing conditions: X i i e (S t ) = 1 and X i i b (S t ) = 0; and denote this maximization problemM 2 . Note that imposing these market clearing conditions guarantees that (25) is satised, thusM 2 indeed is a special case ofM 1 . Lemma 4. M 1 and M 2 attain identical equilibrium consumption stream and sequential state prices. On each S t , the quantity c i (S t ) +B i (S t ) is equal8i2I: Proof. By Lemma 3, bonds alone essentially complete the market. Therefore, the generality of M 1 is redundant and the equilibrium allocations ofM 2 coincide with the equilibrium allocations ofM 1 . Moreover, given the current setup, the equilibrium allocations ofM 1 coincide with ADE allocations, where ~ c i (S t+1 ) + ~ B i (S t+1 ) k ~ c i (S t ) + ~ B i (S t ) k = ~ c j (S t+1 ) + ~ B j (S t+1 ) k ~ c j (S t ) + ~ B j (S t ) k ;8(i;j). But given symmetrical initial distribution of wealth and B 0 , this implies that on each S t , c i (S t ) +B i (S t ) is equal 8i2I: F Construction of the Probability Transition Matrix (P) The construction of the probability transition matrix of the relevant state variables (S t ) is done in two main steps: (1) Fix the the aggregate consumption state C t and the agent i. For reasons to be justied later, we simplify the aggregate consumption state to be binary in its growth rate: Ct Ct 2fH;Lg. The rst step is to nd the conditional transition probability matrix, P fC;ig which encodes the law of motion ofB t of agent i in a given aggregate consumption state. Given the consumption state, we know the conditional probability of dis-utility, q i 2fq 1 ;q 2 g. If (for example) the consumption state isH andq i =q 1 , the conditional probability transition matrix would be: 58 where the (i;j) th entry denotes the probability Pr(B t+1 = jjB t = i). That is, as is customary in Markov Chains, each row contains transition probability to the next state, emanating from the same state indexed by the row, and similarly, each column represents a target next period's state. The magnitude ofB t is increasing in both the row and the column. Namely, the rst (i = 1) row represents the lowest current dis-utility level and the bottom row (i = N B ) represents B max , and ibid for the columns. The conditional transition matrix is already sparse matrix where the only non-zero entries are denoted by the blue line, each with slope less than -1. This (steeper than 45 ) slope represents < 1. For example, if the consumption state is H, the probability of a dis-utility shock is q 1 , and the probability of no dis-utility shock is 1q 1 . SupposeB t = k. 
Then B_{t+1} is either δk with probability (1 - q_1) or δk + ΔB with probability q_1, which is exactly what the transition probability matrix represents.

(2) Since the conditional transition probability matrix (P_{\{C,i\}}) fixes an agent and a consumption state, we need to suitably `blow up' this conditional matrix to arrive at P, which tracks the distribution of B^i_t together with the consumption states. Given that dis-utility shocks are independently distributed across agents, this can be achieved using Kronecker products. The Kronecker product is taken repeatedly, once per additional agent (N_A), and a further adjustment for the consumption states yields P, the full transition probability matrix.

Let N_A be the number of agents in the economy. The transition probability matrix for all N_A agents, now conditional on the aggregate consumption state only, is:

P^{N_A}_{\{C\}} = \mathrm{kron}^{N_A - 1}\big(P_{\{C, i\}}\big),   (26)

where the `\mathrm{kron}^n' notation denotes taking the Kronecker product repeatedly n times. The full transition probability matrix, together with the consumption process, is then given as:

P = (P_C \otimes I_N)\, \begin{bmatrix} P^{N_A}_{\{H\}} & 0 \\ 0 & P^{N_A}_{\{L\}} \end{bmatrix}   (27)

\phantom{P} = \begin{bmatrix} \lambda_{H,H}\, P^{N_A}_{\{H\}} & \lambda_{H,L}\, P^{N_A}_{\{L\}} \\ \lambda_{L,H}\, P^{N_A}_{\{H\}} & \lambda_{L,L}\, P^{N_A}_{\{L\}} \end{bmatrix},   (28)

where N = N_B^{N_A}, \otimes denotes the Kronecker product, \lambda_{\cdot,\cdot} are the entries of P_C, and P_C is the transition probability matrix of aggregate consumption from (4).

This procedure also fixes a mapping between the number attached to a state (i.e., the i-th row or column of P) and the numerical values of the state variables it implies. In particular, let Q(a, b) denote the quotient from the division of integer a by integer b, and let k be the number attached to a particular state. Then Q(k-1, N) determines the aggregate consumption growth state (C_t / C_{t-1}): the state is H if Q(k-1, N) = 0 and L if Q(k-1, N) = 1, where N = N_B^{N_A} as before. Similarly, the dis-utility level of the i-th agent can be inferred from B^i_t = Q(k-1, N_B^{i-1}). Thus, the states are numbered in lexicographic order: first in consumption, then in the B_t of agent N_A, then agent (N_A - 1), and so on down to agent 1.

One final remark is that P is a sparse matrix. For computational purposes, it is wise to use the `sparse' facilities of computational packages, for example when computing the null space of P - I. This allows faster computation and saves memory, each of which is crucial to attain convergence.

G Numerical Method

G.1 `Progressive Tax' Algorithm

The specific goal of this procedure is to find, as efficiently as possible, the allocation (c^i, \forall i) that satisfies:

\tilde{c}^i(S_t)^{-\gamma} + \tilde{B}^i(S_t)/k = \tilde{c}^j(S_t)^{-\gamma} + \tilde{B}^j(S_t)/k, \quad \forall (i, j).

Using well-known numerical methods such as Newton-Raphson is clumsy in this setting, mainly because of their sensitivity to initial points. This is problematic, considering the computational load, which increases geometrically with N_A. Borrowing an idea from economics (the progressive tax) helps address this issue, because it allows for an algorithm that works with straightforward (uniformly distributed) initial points. Namely, consider the following procedure:

(1) Rank the agents in ascending order of marginal utility (MU).
(2) `Tax' the agent with the lowest MU and transfer the proceeds to the agent with the highest MU. The amount to be taxed is determined by the (normalized) second derivative of the utility function at the given allocation; the intuition of this step is similar to Newton-Raphson.
(3) Repeat the procedure until the largest MU difference falls below a tolerance level.
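As an illustration, here is a minimal sketch of this procedure, using the marginal-utility form from (18); the function name, starting allocation, step cap, and example numbers are my own, chosen only to show the mechanics rather than to reproduce the dissertation's implementation.

```python
import numpy as np

def equalize_mu(C, B, gamma, k, tol=1e-10, max_iter=1000):
    """'Progressive tax' iteration: find an allocation c (summing to C) that
    equalizes the induced marginal utilities  MU_i = c_i**(-gamma) + B_i / k
    across agents, as required by (18). Starts from an equal split."""
    n = len(B)
    c = np.full(n, C / n)                       # uniform starting allocation
    for _ in range(max_iter):
        mu = c ** (-gamma) + B / k              # induced marginal utilities
        lo, hi = np.argmin(mu), np.argmax(mu)   # least / most 'needy' agents
        gap = mu[hi] - mu[lo]
        if gap < tol:
            break
        # Newton-style transfer size from the curvature (second derivative of utility)
        curv = gamma * c[lo] ** (-gamma - 1) + gamma * c[hi] ** (-gamma - 1)
        tau = min(gap / curv, 0.5 * c[lo])      # never tax away more than half of c_lo
        c[lo] -= tau                            # 'tax' the agent with the lowest MU
        c[hi] += tau                            # transfer to the agent with the highest MU
    return c                                    # resources are conserved: sum(c) == C

# Hypothetical example: 3 agents, aggregate consumption 3, one agent hit by a shock.
c = equalize_mu(C=3.0, B=np.array([0.0, 0.0, 1.0]), gamma=3.5, k=3.0)
print(c, c.sum())
```

In this sketch the transfer size is the MU gap divided by the sum of the two agents' utility curvatures, which is the Newton-style step described in step (2); the cap simply keeps the taxed agent's consumption positive.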
The intuitive reason why using this `progressive tax' idea helps sidestep the initial point issue is because it `automatically' selects the initial points (and the subsequent points) that gravitate towards the equilibrium allocation. The underlying mechanism is Lemma 5 below: Lemma 5.B i t >B j t () c i t >c j t : Proof. This is a direct corollary of Lemma 4. This shows that agents `pool' resources to eectively help out the agent type who is hit by high level ofB t , and therefore the equilibrium allocation can be found eciently by `progressively taxing' the fortunate agents with lowB t values. It turns out that this method is highly eective in this setting. It achieves convergence in few (often < 10) iterations. This is important, especially as we increase N A and N B to levels required for convergence of asset prices. Given that the notion of allocating resources to match a given price (price-taking) is prevalent in economics, this methodology may have a wider range of application than this particular setting alone. G.2 State-Space Reduction Given discretization, the number of state space is:jCjN B N A . The goal of state-space reduction is to reduce this number as much as possible, by exploiting the structure behind each term: jCj, N A , N B : I look at each term separately. G.2.1 jCj: Time Invariance Time invariance allows us to focus on growth rates Ct Ct 2fH;Lg, rather than the nominal levelC t , thereby achievingjCj= 2. This is possible because (15) is in fact time invariant: `economic growth' does not aect this quantity. This simplicity derives from our setting where the unit `meter-stick' of dis-utility shock is dened as CRRA utility drops that match percentage drops in consumption level, so that its time invariance approximately carries over (given the decaying nature of theB t 61 process) toB t process as well. 29 Using invariance, I useC 0 = 5 for numerical implementation going forward. Robustness checks conrm that the reported results are indeed approximately invariant to C 0 values. G.2.2 N A : Exploiting Symmetry of the Problem In principle, the number ofN A required for convergence is dictated by the model itself, so no further reduction is possible. Nonetheless, given linearity of the pricing operator, it is possible to lump states by observing that some states have identical state prices, and hence lead to identical asset prices. As is clear from the section on the construction of P , two states are considered dierent if it implies dierent distribution ofB i t . But since the identity of agent is irrelevant, it is only the ranking ofB i t in the distribution that matters for state prices. Therefore, we can lump two states into one if it induces the same ranked distribution. Intuitively: P = X s SDF (s)x(s)(s) (where P is the price of a security, SDF is stochastic discount factor, x is the payo and is the probability assigned to that state), and if there are two statess ands 0 such thatSDF (s) =SDF (s 0 ) because s and s 0 have the same ranked distribution, we can lump s and s 0 together and assign a new measure =(s) +(s 0 ), which clearly works because of linearity. Numerically, this lumping procedure takes some time, but that clear advantage is that we can reduce the size of P (from P to P reduced ), extending the feasible number of N A that allows for computation. The magnitude of gain by doing so can be likened to the dierence between permutations and combinations. G.2.3 N B : Discretization ofB t For numerical implementation, the state space needs to be discretized. 
Namely, in principle,B t is a continuous process in [0;1) but to facilitate computation, the interval needs to be truncated to [0;B max ], and sliced intoN B bins, each of length Bmax N B . Increasing the numeratorN B is (geometri- cally) expensive. Meanwhile, it is clearly ideal to makeB max large to encompass the entire interval. However, the trade-o is that this leads to large values of Bmax N B detract from numerical accuracy, since we need small values of Bmax N B (i.e., ne discretization) for numerical accuracy. Given that increasing N B is geometrically costly, B max needs to be set in moderation. Some robustness checks suggest that truncating theB t process at B max = 2B (or 3B), where B 29 An alternative approach would be to `normalize' dis-utility shocks, as in Barberis Huang Santos (2001) or Rous- sanov (2010). 62 is the `meter stick' of dis-utility as before, seems to be a reasonable choice given this trade o. This is becauseB t > 2B is such a rare phenomenon that it does not aect the equilibrium in a material way, whereas the heavy concentration of equilibrium states aroundB t 0 means that the discretization needs to be suciently ne, hence requiring high number of bins (N B ) to ensure numerical accuracy. Choosing the B max in moderation helps keep N B as low as possible while retaining numerical accuracy. G.3 Faster Convergence (State Space Regulation) G.3.1 `Lebesgue Approximation' TheB t process is a critical element in the model. Computationally, this quantity can only be approximated. The problem is that convergence requires accurate approximation ofB t , and this requires larger number of states at a geometric rate when N A > 1. `Lebesgue Approximation' is useful because it helps us obtain convergence at a much coarser level of partitioning. The idea is to `respect' the coarseness of the partitioning of the co-domain (B, in the spirit of Lebesgue integration) and to sweep all inaccuracies into the Markov Chain. Let B n :=fB (1) ;B (2) ;B (3) ;B (n) g be the n-partitioning endpoints of the interval [0;B max ]. To understand the methodology, take for instance the typical movement of theB t , which is to decay by . The `Riemann' way of approximating this decay over time (i.e., the functionB t = B t ), would be to take a partition on the domain (e.g., t =f1; 2; 3; 4;g) and nd its mapping on the co-domain (e.g.,fB 1 ;B 2 ;B 3 ;g). Then the typical computational procedure is to nd the elements in B n that are closest to the elements in the co-domain (fB 1 ;B 2 ;B 3 ;g) and work with these to compute asset prices. The immediate challenge here is that you need B n to be very ne in order to get accurate approximation and hence, convergence. The `Lebesgue' approach, as the name suggests, starts from B n (instead oft). Rather than simply increasing n to obtain convergence, the idea is to use the Markov transition probability matrix to interpolate the value ofB. For example, suppose t = 1, and hence we want to setB =B 1 :=B, but 30 B = 2 B n . LetB2 [B ;B + ], whereB ;B + 2 B n . Then clearly, there exists an (0 1) such thatB = B + + (1)B . Since onlyB andB + are members of the discretized state space - andB is not - a natural possibility would be to `weigh' (B + ;B ) to generateB. Moreover 30 B is the `true' value of dis-utility we want to approximate, and B n is a zero-measure set, so this will happen `almost surely'. IfB2 B n , then approximation is trivial (= 0 or 1). 
63 since 0 1, satises a necessary condition of being a probability measure 31 , hence the prob- ability transition matrix arises as a plausible candidate location to place these weights: (; (1)). Recall the asset pricing equation: P t = E[M t+1 (B)X t+1 )jS t ]: M(B) is the pricing kernel, and I have suppressed all other arguments except forB because this is what we need to approximate. The proposed method suggests that we use B =B + + (1)B to compute P t . Note thatB is the `domain' of the pricing kernel, hence we are eectively approx- imating the domain. This raises an immediate concern: how can we be sure that approximating the actions on the domain (i.e.,B) leads to correct approximations on the co-domain (P t ), namely, whether this procedure is a legitimate approximation. What I show below is that if M(B) is rst order dierentiable inB 32 , this procedure is legitimate. Namely, under rst order dierentiability, the computational result will converge to the true P t as we increase the granularity (n) of B n . We rst need a Lemma. Lemma 6. Assume that M(B) is rst order dierentiable inB and thatB = B + + (1)B . Then M(B + ) + (1)M(B ) n!1 !M(B): Proof. By dierentiability,9M 0 (B ), such thatM(B + ) =M(B )+M 0 (B )(B + B )+O(n). Then M(B) =M(B + + (1)B ) =M(B +(B + B )) =M(B ) +M 0 (B )[(B + B )] +O(n). But since M 0 (B ) = M(B + )M(B ) B + B +O(n), we obtain the desired result by substituting back into the expression above. The conclusion of this Lemma is somewhat intuitive given the notion of a derivative, which is essentially piece-wise linear approximation. Now, using Lemma 6, P t = E[M(B)XjS t ]E[M(B + )XjS t ] + (1)E[M(B )XjS t ] =: P[B + jS t ] E[M(B + )XjS t ] + (1 P[B + jS t ]) E[M(B )XjS t ]: 31 SinceB andB + are mutually exclusive events, `satises' the (sub-)additive requirement of a probability measure as well. 32 First order dierentiability is rather clear when NA = 1 (from the MUC of the model.) When NA > 1, the issue of dierentiability may seem more complicated because of the possibility to trade, but can still be established from envelope theorem. 64 The approximation becomes exact as n!1, and derives from a combination Lemma 6 and the fact that expectation is a linear mapping so that the constant can be pulled out. The second equal- ity is really a denition, and explicitly shows how the weight should be encoded into the transition probability matrix. 33 The above argument establishes convergence. Since we can now use to approximateB, conver- gence obtains at a much faster rate than in the conventional Riemann counterpart. G.3.2 Monte Carlo vs. Nullspace Approach Once the transition probability matrix (P ) is specied, the remaining ingredient is the stationary distribution for the Markov Chain that P represents. Broadly speaking, there are two methods to this end. First, is Monte Carlo simulation, which is essentially to generate numerous sample paths and use the spirit of \law of large numbers" to claim that the empirical distribution converges to the stationary distribution. The second method is the nullspace approach, which is to nd the solution to the equation =P: This amounts to nding the nullspace of PI. The Monte Carlo procedure is outlined as follows: (a) Specify the transition matrix generated jointly by the consumption process and disutility shock process. (b) Simulate each sample path of the Markov Chain. (c) Record the maximal level that the dis-utility processB t attains. 
(B max ) (d) Bin the B process on the interval [0;B max ] (e) Calculate the stationary distribution either from empirical distribution or Chapman-Kolmogorov estimator. The empirical estimator is simply the equally weighted histogram of the sample paths. The Chapman-Kolmogorov is combines Chapman- Kolmogorov equation and Law of Large Numbers for more eciency. Namely, the estimator for (j) is: 1 n n X k=1 p(i (k) ;j) 33 It is worth noting that this encoding is possible only because the expectation is linear in its argument. Because of this linearity, weighing expectations (by ; (1)) amounts to assigning conditional probabilities on the Markov transition matrix. 65 This is because 1 n n X k=1 p(i (k) ;j)!E[P (i;j)] = n X k=1 p(i;j)(i) =(j); where the last equality is by Chapman-Kolmogorov equation. Unfortunately though, it turns out that the gained eciency from this estimator is minimal in my setting. The main bottleneck for the Monte Carlo method, presumably, is computing time, since we need to simulate many sample paths to gain convergence. The execution of the nullspace approach is straightforwardly calculating the eigenspace by com- puter, and there is not much we can do, except to exploit \sparsity" of the matrix. The bottleneck for this avenue is memory, since the matrix (P ) has to be recorded on the RAM memory of the computing machine at some point. The choice between Monte Carlo versus nullspace is eectively trading o compute time against memory, both scarce computing resources. It turns out that the nullspace method is easier to work with, and this is how the reported results were obtained. 66 Part II Rolling the Skewed Die: Economic Foundations of the Demand for Skewness Abstract Skewness is pervasive across nancial instruments, and the literature has documented that many investors seek idiosyncratic skewness in their portfolios. In response, there are some theoretical models that study implications of the preference for skewness, but using utility functions where the preference for right skewness is hard-wired. Drawing from status concerns, we derive a utility function reminiscent of Friedman and Savage (1948) that leads the investor to demand skewness {right or left skewness. We then consider a parsimonious set of securities that allow the investor to select the exact optimal level of right or left skewness. Our analysis yields a rich set of results broadly consistent with empirical observations. 67 1 Introduction Skewness is pervasive among nancial securities {options, growth stocks...{ and other types of in- vestments {private equity, VC... Furthermore, for many agents, skewness-seeking is an important element in their investment decisions, as documented in the literature. In fact, the quest for skew- ness has the potential to explain some of the most challenging empirical puzzles contemplated in the literature {for example, the value puzzle, Zhang (2013). However, standard utility functions {in particular, CRRA utility functions{ cannot explain the demand for skewness we observe in practice. While the current literature, especially during the last ten years, has explored lottery characteristics {i.e., skewness{ of many securities, it has not tackled the reasons that drive individual investors to demand skewness. In general, the analysis of the eects of skewness demand is based on utility functions that assign an ad-hoc large weight to the positive third moment of wealth {i.e. right skewness. 
This is the case of the in uential paper by Kraus and Litzenberger (1976) or the more recent by Harvey and Siddique (2000). A step further towards an axiomatic utility is the work on aspirational utility (Diecidue and van den Ven 2008, in the spirit of Friedman and Savage 1948). Their utility includes a jump that represents the discontinuity in utility derived from crossing a certain threshold o wealth. In this paper we analyze the demand for skewness that results from an utility function similar to the model of Diecidue and van den Ven (2008) but derived from microeconomic foundations. In particular, we consider an economic agent who cares not only about consumption but also about status as another source of utility dierent from the consumption good. The consumption good is divisible and contributes to total utility in the same way as in the standard CRRA case. How- ever, status is conveyed through acquisition of a non-divisible good {to simplify, we assume that the only utility provided by this good is through the status recognition; our conclusions do not depend on this assumption. Status-seeking is related to a number of preferences popular in the nancial economics literature, as relative wealth concerns {or external habit formation, Campbell and Cochrane (1999){ and habit formation (Sundaresan 1989 and Constantinides 1990). Rayo and Becker (2007) show that this type of utility provides an evolutionary edge. More recently, Rous- sanov (2010) proposes a utility model that includes status-seeking and derives some investment implications that he shows are consistent with the data. When we proxy status by a non-divisible good we have in mind examples such as a luxury car, a house, a country-club membership... all of which are often interpreted as signals of status (see, for example, Charles, Hurst and Roussanov 2009). They might also yield consumption utility. We do not explore this possibility, but it is not inconsistent with our utility specication. 68 The status driven, aspirational utility that emerges from this microeconomic foundations is remi- niscent of the framework rst established by Friedman and Savage (1948). They are motivated by the `puzzling' observation that some investors simultaneously buy insurance and lotteries which, they argue, cannot be explained by standard utility models. However, in their analysis they only consider the notion of \volatility," the second moment of the distribution, and overlook skewness, which is the focus of our analysis. In fact, focusing on the net demand for right skewness provides a straightforward answer to the rst part of the puzzle raised by Friedman and Savage (1948). In particular, buying a lottery ticket amounts amount to taking a long position in right skewness and buying insurance implies a short position in left skewness. Each of these two decisions reveals a preference for right skewness. Yet, the second problem raised by Friedman and Savage (1948), the shortcomings of standard utility models, is still relevant because, in general, standard utility models cannot explain in generality the demand for right-skewness that the previous two examples illustrate. For example, although CRRA utility in principle implies a preference for right skewness, under any reasonable parameter values the optimal policy of an investor with CRRA utility is to sell short a lottery because its negative expected return and high variance dominate the eect of positive skewness. 
Furthermore, negative skewness is always discarded by a CRRA investor, unless it is associated with a positive, large enough, expected payo, and yet, as we will later argue, demand for left-skewness is also present in some economic decisions. These observations justify our quest for individual motives whose resulting utility function explains a possible optimal demand for skewness {right or left. Interest in the demand for skewness and its eect on equilibrium prices is not new. Kraus and Litzenberger (1976) already explore its implications, assuming a utility function that puts a larger weight on the third moment. More recently, Harvey and Siddique (2000) explore implications of the relation among higher moments {in particular, co-skewness{ in the cross-section of stocks. Mitton and Vorkink (2007) show that many investors have a preference for skewness in their portfolios and strategically choose securities that are avoided by investors who prefer a diversied portfolio. Kumar (2009) studies stocks with lottery-like properties and shows they are chosen by people who also buy lotteries. Estimating ex-ante skewness in stocks is dicult. Bali, Cakici, and Whitelaw (2011) suggest an alternative way to identify lottery stocks. Boyer and Vorkink (2014) study lottery properties among stock options. Based on relative status concerns we derive a specic type of aspirational utility and we show how 69 demand for skewness can arise endogenously. Our utility function is similar in shape to a CRRA function but with a aspiration point (R), and a positive jump in utility if the agent can spend more than R. We then introduce a parsimonious set of securities (called `binomial martingales') that contain dierent levels of skewness, which allow investors to choose the exact level of skew- ness {right or left{ optimal for their position in the utility function. From this setting, through a concavication of the utility function we can derive the four seasons of the demand for skewness. In particular, we show analytically that the relative position of the reference point R (i.e., how far away the aspiration is), with respect to the agent's current consumption level (C 0 ) is a critical factor in determining the demand for skewness. If the agents' aspiration is only marginally higher than the current endowment, they choose to sell skewness. This is because the proximity of the aspiration point encourages the agents to select a security that lands on the aspiration level with relatively high chance. Such a security - high chance of small gain - is negatively skewed. On the other hand, by exactly symmetric arguments, as the aspiration level moves further away from the current wealth, agents choose to buy right-skewed securities. This is in sharp contrast to the standard mean-variance analysis where agents shun any form of gambling with zero or negative expected returns. We also explore how endogenous demand for skewness changes with respect to the parameters of the utility function. Predictably, the size of the jump is a main factor. If attaining the aspiration leads to a big jump in utility, agents choose less (right-)skewed securities. Intuitively, a big jump implies greater importance of aspiration: the agent is then forced to demand securities that can help get to the aspiration level with higher probability, albeit at the expense of a lower level of consumption if the gamble fails. Such securities contain low or even negative skewness. 
Analogous results are presented for the level of risk aversion and initial wealth of the agent. In the `binomial martingale' setting we introduce, the agent's choice of skewness mechanically xes the level of volatility. It is impossible to separate the choice of volatility from the choice of skewness. To investigate the role of volatility in the aspirational setting, we introduce tri-nomial securities (that embeds binomial securities as a special case). This allows us to separate volatility from skew- ness. This yields a somewhat surprising result, but in line with the Friedman and Savage (1948) analysis: the aspirational agents do not necessarily choose to minimize volatility as they would in the standard mean-variance setting. Instead, agents choose just the right amount of volatility to propel themselves to their aspirations. In the aspirational setting, volatility can be desirable insofar as it helps them attain their aspirations. In similar vein, extensions are also made to consider a 70 broader range of securities to investigate the consequence of variations in the rst moment. We also consider multiple (two) aspirations. In the two-aspirations case, we nd that agents can either choose to mind both aspiration points, mind only one aspiration point, or mind neither. If agents choose to mind both aspiration points, the level of skewness they demand is determined entirely by the position of the aspirations relative to their current consumption level (C 0 ). In ev- eryday parlance, the agent gets `trapped in' between a status they want to attain, and a status that they have already attained and would never want to lose. The agent's demand for skewness is determined by the relative strength of these considerations. The paper is organized as follows. In section 2 we describe a basic setup, derive our utility function and rst order conditions. In Section 3 we study the demand for skewness in the case of a single jump. In section 4 we study the two-jump case. Section 5 considers more realistic nancial markets. In section 6 we explore the interaction between volatility and skewness. We close the paper with some conclusions. 2 Setup In this section, we introduce and motivate `aspirational utility': a standard utility function aug- mented by elements of `goal' and its `attainment' (Diecidue and Van de Ven, 2007). The `goal' in this setting is represented as a position in the agent's wealth or consumption level, the satisfaction of achieving this aspiration is expressed by a discontinuity or `jump' in utility level. 2.1 A Motivation: Local, Bulky Status Goods To see how aspirational utility can arise from a natural setting, we consider the notion of `status' in the spirit of Roussanov (2010). In Roussanov's model, agents care not only about standard consumption (as represented by power utility over consumption) but also about their wealth level relative to the average wealth level of the economy, a feature which represents agent's `status con- cerns'. While we believe incorporating status is a meaningful endeavor, we make two observations that further enhances the realism of this feature. First, status goods are often bulky, indivisible purchases. (Consider luxury cars, or mansions for example.) Second, unlike the setting of Rous- sanov where the benchmark of status is uniform across agents of all wealth types (the average wealth level of the economy), it is reasonable to assume that status goods are wealth dependent: for example, jewelry for the poor, large house for the rich. 
We will consider an example that embodies status concerns in the spirit of Roussanov's model (2010), with the two enrichments described above. Let $W_i$ denote the wealth level of an agent $i$. Suppose that for agents of wealth level $W_i \in (0, \$1\text{ million})$ the relevant status good is ownership of a small house that costs $\$0.5$ million, and for agents of wealth level $W_i \in (\$1\text{ million}, \$2\text{ million})$ the relevant status good is a luxury house with a private pool that costs $\$1.5$ million to acquire. This reflects the locality and indivisibility of status goods: status goods are often bulky purchases that are endemic to the peer groups within wealth brackets. Let $S_i$ denote the status good consumed by agent $i$, and let $c_i$ denote the standard consumption good (bread and butter) consumed by agent $i$. The total consumption of agent $i$ ($C_i$) consists of $c_i$ and $S_i$, that is, $C_i = c_i + S_i \le W_i$. Given standard (e.g. power) utility over $c_i$, and the local, indivisible nature of $S_i$, an agent optimally chooses the pair $(c_i, S_i)$.

The left-hand panel of Figure 1 gives a graphical description of $U(C_i)$ when the optimal choice $(c_i, S_i)$ is made. The derivation can be made rigorous, but the graph is intuitive. The jumps are inherited from the local, indivisible nature of status goods. They occur at around $\$0.7$ million (A) and $\$1.7$ million (B), slightly above the cost of owning the status goods ($\$0.5$ and $\$1.5$ million), representing the fact that agents buy status goods only after they surpass their `subsistence level'. The marginal utility of $C$ is high following the jump, since the agents were forced to be thrifty in order to purchase the local status good. Once the marginal utility drops, agents ponder another jump in status (point B), and so on.

Figure 1: Local, Bulky Status Goods Imply Aspirational Utility

The key element of this utility is the `jump'. The right-hand panel of Figure 1 depicts a different (single) jump, associated with the corresponding (single) jump at point A of the left-hand panel. The discontinuous jump described in the right-hand panel is essentially the `aspirational utility' proposed by Diecidue and Van de Ven (2007). Although the formal shape is different, one can use a well-known concavification argument to show that they yield identical (expected) utility maximization, and hence, for the purposes of the microeconomic analysis, they are equivalent. In short, local, indivisible status concerns lead to aspirational utility.

2.2 A Formal Representation of Aspirational Utility

Consider an agent who maximizes expected utility (EU), with initial endowment $C_0$. This agent differs from the typical EU-maximizing agent in that (i) he has an aspiration $R$ ($R > C_0$), and (ii) he discounts payoffs that fall below $R$, so that his utility is $\delta u(C)$ when $C < R$, and $u(C)$ when $C \ge R$ ($0 < \delta < 1$). Otherwise, we maintain the conventional assumptions on the behavior of $u(\cdot)$: $u'(\cdot) > 0$ and $u''(\cdot) < 0$. More formally, we write the agent's utility function as

$$U(C) = \delta\, u(C)\, \mathbf{1}_{(0,R)}(C) + u(C)\, \mathbf{1}_{[R,\infty)}(C). \qquad (1)$$

Given this definition of aspirational utility, our next task, which is the main focus of this paper, is to apply it and see how it generates an environment that endogenously creates demand for skewness.
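A minimal sketch of equation (1) follows. It assumes a translated CRRA base utility; the translation (the `shift` constant) is only an illustrative normalization that keeps $u$ positive on the relevant range, so that multiplying by $\delta < 1$ is indeed a discount, as the text requires. All names and parameter values are hypothetical.

```python
# Sketch of the aspirational utility in equation (1), with a translated CRRA
# base utility u(c) = shift + c^(1-gamma)/(1-gamma) (shift is an assumed
# normalization, not a parameter of the paper).

def base_utility(c, gamma=2.0, shift=2.0):
    return shift + c ** (1.0 - gamma) / (1.0 - gamma)

def aspirational_utility(c, R, delta, u=base_utility):
    """U(C) = delta*u(C) for C < R and u(C) for C >= R, as in equation (1)."""
    return delta * u(c) if c < R else u(c)

if __name__ == "__main__":
    R, delta = 2.0, 0.8
    for c in (1.0, 1.5, 1.99, 2.0, 2.5):
        print(c, round(aspirational_utility(c, R, delta), 4))
```

Evaluating just below and just above $R$ makes the discontinuity visible: utility jumps upward as consumption crosses the aspiration level.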
2.3 The Aspirational Agent's Choice Set: Binomial Martingales Since our goal is to understand how preference for skewness can endogenously arise from aspira- tional utility, we introduce an expected utility (EU) maximization setting that allows agents (with aspirational utility) to make their optimal choice of skewness. To this end, we introduce the fol- lowing set of securities that denes the choice set of the EU-maximizing agents. The idea is rst to parsimoniously focus on the level of skewness embedded in these set of securities, and add in other features later. Let L(p) be a binomial martingale (i.e., fair game with two outcomes) with p2 (0; 1). That is, L(p) is a fair gamble which costs to purchase, and pays M with probability p. 73 0 Fail 1-p M Success p To ensure that L(p) is a martingale, we require: 0 =p(M) + (1p)(). This pins down the amount M as a function of price of lottery and p; M(;p). Similarly, let L (p) be an extended binomial martingale with p2 (1; 1). That is, it is a gamble which pays C S with \probability" p and C F with \probability" 1p , such that: 0 =p(M) + (1p)();p2 (1; 1) (2) Note that the only dierence is the domain of p. That we allow p< 0 means that at this stage, p and 1p should be interpreted as `weights', rather than probabilities. The reason for extending the domain is to keep the structure consistent and tidy throughout the analysis; for those who prefer to focus on `real-world intuition', it would not do much harm to ignore the term `extended' for now. The real-world interpretation of p< 0 will be provided in Theorem 2. Using the set of securitiesL (p), the agent can construct a consumption scheme by exercising free- dom over two dimensions: (i) which security (L (p); choice is over p) she wishes to purchase and (ii) how much of that security she wishes to purchase. Let N be the number of security she purchases. Let C S be the consumption she enjoys if each unit of her security pays M. Let C F be the consumption she gets if each unit of her security pays. Without loss of generality, we x = 1 going forward, since the nominal amount she invests in the bet can be adjusted by N. 74 C 0 C F = C 0 N Fail 1-p C S =C 0 +N(M1) Success p Note here that because L*(p) is a martingale, the structure of his consumption scheme is also a martingale (can easily be veried algebraically by eliminating N in equations (5), (6) and adding in (7) below), namely: C 0 =pC S + (1p)C F : (3) To summarize, we have created a set of securities that abstracts away from variations in the rst moment (expected returns) by imposing martingale. We also simplify the distributional structure of these fair games by assuming a binomial payo. All this is to ensure relentless focus on the third moment: skewness, a focus which will be relaxed in the sections that follow. Using binomial martingales, the agent can set up her consumption scheme by choosing (p,N). Under this setup, we now move on to the utility maximization problem of the agent. 2.4 Utility Maximization The single jump EU-maximization problem (with the extended L (p)) is: max p2(1;1) N2[0;1) (1p)U(C F ) +pU(C S ) (4) such that: C F =C 0 N (5) C S =C 0 +N(M 1): (6) 75 0 =p(M 1) + (1p)(1) (7) The following lemma allows us to reduce a dimension. Lemma 1. Consider the single jump EU-maximization problem described in equations (4) - (7), with C 0 : initial wealth and R: reference point (C 0 <R). 
Consider an associated reduced form of the EU maximization problem: max p2(1;1) (1p)U(C F ) +pU(R) (8) such that C F satises C 0 =pR + (1p)C F ; and let p be the maximand (solution) to this reduced problem. Assume that the following holds. U 0 (C F ) :=U 0 C 0 p R 1p :=u 0 C 0 p R 1p >u 0 (R) =:U 0 (R) (9) Then any optimal consumption scheme in the original problem satises C S = R. Assumption (9) of Lemma 1 is not automatically satised unless = 1 (then, since C F <R, and marginal utility decreases in C, automatic), yet is still innocuous. It only requires that the marginal utility at C F is higher than the marginal utility at R. This is intuitive; once the agent attains the reference (aspiration) point, the hankering dissipates and marginal utility goes down, almost by denition. This Lemma is useful because for each and every p, we can forget about the argument N. Any N which induces a C S higher than R is irrelevant for the purpose of the optimization, and thus, N is determined automatically. Hence, using Lemma 1, we can reduce the setup described in equations (4)-(7) to the following setup: max p2(1;1) (1p)U(C F ) +pU(R) (10) 76 such that C F satises C 0 =pR + (1p)C F : From now on, we stick to this setup, which means we will only look at the parameter space (;R;C 0 )R 3 that satises assumption (9). Before laying down a set of results that characterizes the behavior of solutions to the problem, we introduce an ancillary lemma that relates p with the level of skewness embodied in security L (p). The denition of skewness we use is `Pearson's moment coecient of skewness'. S(p) =E " X 3 # Lemma 2. Let S :p(0; 1)7!R denote `Pearson's moment coecient of skewness' of the consump- tion scheme induced by L(p). Then: (i) S(p) is monotonically decreasing in p (ii) S(p)"1 as p# 0 (iii) S( 1 2 ) = 0 (iv) S(p)#1 as p" 1 This Lemma describes the relationship betweenp and skewness implied byp. In particular, it shows that skewness is monotone in p and hence, the p chosen as the solution to the EU-maximization problem (10) uniquely determines the level of skewness that the agent's choice implies. If p = 1 2 , the agent is demanding a symmetric security. Ifp < 1 2 , the agent is demanding a positively-skewed security; a security that has `lottery-like' feature. If p > 1 2 , the agent is demanding a negatively- skewed security; e.g., a security that delivers modestly positive returns most of the time, but very negative returns in rare, but unfortunate states. Note that when p > 1 2 ; the returns during the bad state have to be `very negative' in order to honor the martingale assumption, since it has to compensate for the high (> 1 2 ) likelihood of reaching a good state. This is increasingly so as p approaches 1. 77 3 Single Jump Optimization Now, we go back to the single jump EU-maximization problem. We specialize u() to be the power utility function with > 1. For technical comfort (and realism), we assume 1C 0 (<R). Namely, rst, it is natural to assume that C 0 is lower than R. Also, requiring 1C 0 is to ensure the utility value is positive on that domain, so that that multiplying by is indeed a discount. This does not harm generality, since we can always translate the utility function to be positive. Also, to make the problem non-trivial, we assume 0< < 1. Theorems 1 and 2 characterize the solution to the EU-maximization problem, p . 3.1 The `Four Seasons of Gambling' Theorem 1. 
Let u(C) be the power utility function and let L (p) be an extended binomial mar- tingale with p: \probability" of success (p2 (1; 1)). Fix C 0 : initial wealth, and consider the reduced single jump EU-maximization problem (10) with R: reference point (C 0 < R). Let p be the \probability" that maximizes the expected utility. Then: (i) @p @( C 0 R ) > 0 (ii) as ( C 0 R )" 1 (i.e., as C 0 "R), p " 1 (iii)9( C 0 R ) 2 (0; 1) such that p 0. Theorem 1 describes how p - the optimal security demanded by the agent with aspirational pref- erence - changes with the position of his aspiration (R) relative to his current consumption (C 0 ). (i) asserts that as R moves further out, away from C 0 , the agent demands more positively skewed security. In everyday parlance, this means that as her aspiration becomes `unrealistically high', the agent starts to demand more and more `lottery-like' securities to meet the aspiration. When C 0 is far away from R, attempting such a big jump in consumption (R - C 0 ) with high chance comes at the cost of disastrously low consumption if the attempt fails. (Recall that all securities in the choice set are fair gambles.) Thus, the agent rationally avoids the abysmally low utility levels in bad states, through the purchase of positively skewed security (with lowp .) Moreover, the positive sign on the derivative asserts that this relationship is monotonic. That is, as agents' aspirations move far from (close to) C 0 , the optimal choice of skewness will increase (decrease) monotonically. (ii) describes the opposite situation. As R becomes indistinguishably close to C 0 , the agent prefers 78 a negatively-skewed security; that which takes her to the aspiration (R) with high chance at the expense of a low-chance event of a disaster. While it is true that the martingale assumption requires thatC F be low (becausep is close to 1), the proximity ofC 0 to R ensures that the associated C F is not unacceptably low. In other words, the proximity of the aspiration ensures that the agent can avoid exposing herself to a big downside risk, even while she is taking negatively skewed bets. (iii) describes what happens when the aspiration is too far, or equivalently, when the agent is `too poor'. (iii) asserts that at some point, the aspiration will be so remote that it is optimal to choose p 0. At this point, however, we are interpreting p as `optimal weights', because p has no real world analogue. The full meaning of (iii) will be revealed in Theorem 2. Taken together, (i), (ii), and (iii) show that as R gets pushed away from C 0 , the agent's demand runs through the entire gamut of securities, starting from the most negatively skewed to the most positively skewed. This happens for any xed 2 (0; 1); which determines the `size of jump.' Theorem 2. Assume (as is in the real world) that only L(p) with p2 (0; 1) is available. When p 0, the agent chooses C 0 over any L(p) with p2 (0; 1). Theorem 2 and Theorem 1 together tell us what happens as R increases relative to C 0 . At rst, as C 0 stands close to R, the agents demand negatively skewed securities for the reasons explained above. As R increases, the agents start demanding more symmetric securities (e.g., stocks), then positively skewed `lottery-like' securities. As we move the R away from C 0 further, agents even- tually reach a certain threshold where they stay with C 0 , thereby stop demanding risky securities altogether. We may call this the `four seasons of gambling' as depicted in Figure 1. 
The horizontal axis represents the 2 [0; 1], and the vertical axis represents R C 0 2 [1; 3]: Fix any , and start from R C 0 = 1. Theorems 1 and 2 says that when R C 0 0, agents start fresh by buying neg- ative skewness (Spring, light blue). Then, as R C 0 increases, they start to buy symmetric securities (Summer, yellow), then positively skewed securities (Autumn, brown), and ultimately, they stop (Winter, navy). Although the picture is truncated at R C 0 = 3 , if we were to elongate the picture, we would indeed see that winter hits agents for all . It is worth mentioning the role of volatility in the optimization problem, which provides an interest- ing contrast from the familiar mean-variance (MV) framework. From equation (20), we know that volatility in increasing in p , and inversely related to skewness. This means that when R is very 79 Figure 2: The Four Seasons of Gambling close toC 0 , agents choose a negatively skewed security, even though it embodies a very high volatil- ity. Intuitively, this can be interpreted as a situation where the aspiration point is tantalizingly close to the current income C 0 , and the agent will choose to attain it (and hence enjoy the utility that hikes over the kink) with very high probability, even at the expense of the commensurately low C F and the large volatility this entails. On the other hand, when R moves farther away from C 0 agents see it as `too remote'. They grow increasingly more interested in limiting the downside risk, while hitting the aspiration point becomes a lottery event. As the downside risk is reduced, it also increasingly evaporates their hopes of hitting the aspiration point. These forces diminish the volatility embedded in the optimal L(p ), until ultimately, agents are unwilling to accept any volatility whatsoever (p 0) and the four seasons of gambling is over. In short, the most stark deviation from the mean-variance framework (recall that in the MV framework with martingale setting, agents do not tolerate any volatility in the security) is when the aspiration point is very close to the current consumption levelC 0 , in the sense that agents are willing to buy securities with huge volatilities embodied in them in order to obtain RC 0 . Ultimately, this contrast from MV dissipates as R moves far away fromC 0 . These ndings preview a later section, which explores the relative roles of volatility and skewness in greater detail. 80 3.2 Comparative Statics We are now ready to harvest a few comparative statics results that illustrates the choice of skewness by aspirational agents. 3.2.1 Size of Jump Theorem 3. dp d < 0 Theorem 3 explains what happens as increases. Recall that is how much the agent keeps when the consumption falls short of the reference point R, hence (1) determines the `size of jump'. As increases, agents take the aspirations less seriously, and can hence aord to attain it with lower probability (like lottery). In other words, when is high making it less important to attain R, agents demand securities with higher skewness which limits the downward risk (i.e., low C F ) at the expense of lower chance of attaining R. However, as decreases, agents must take the kink more seriously, forcing them to give up more of their C F in bad state to ensure that they reach R with higher probability. The agents thus demand increasingly negative-skewed securities. Figure 1 also illustrates this. 3.2.2 Initial Endowment Theorem 4. Fix ; ; and := C 0 R . Then, @p @C 0 j ; ; > 0 Theorem 4 is also very intuitive. 
As agents become wealthier, they need to worry less and less about the disastrous states where marginal utility is extremely high. Hence, they can aord to take more and more downside risk, thereby demanding less skewed securities. This coincides with the empirical ndings (e.g., Kumar 2009) that report more active engagement in lottery-like bets among consumers with lower income. Note that Theorem 4 is saying a little more than Theorem 1-(ii), which is eectively saying that @p @C 0 j ; ;R > 0: The dierence here in Theorem 4 is that we are xing := C 0 R instead of R. Thus, here we are not decreasing the distance between C 0 and R as we increase C 0 . The point here is that p increases even when R increases along with C 0 , and that the increase of p is purely an endowment eect, rather than eect of C 0 approaching R as in Theorem 1-(ii). 81 3.2.3 Risk Aversion Theorem 5. Fix C 0 and . Suppose R is `big enough' to satisfy: R C 0 " ( R C 0 ) log R C 0 ( R C 0 ) 1 # +O 1 R <; whereO( 1 R ) is a positive term which vanishes at the rate of 1 R . Then, @p @ j ;C 0 ;R < 0 . We rst give some numerical examples to get a feel for how stringent the assumption is. For reasonable parameters values such as = 0:8, = 2,C 0 = 2, @p @ < 0 will hold wheneverR> 3:491. For = 0:8, = 2, C 0 = 1:5, @p @ < 0 will hold whenever R > 2:227. For = 0:8, = 2, C 0 = 1, @p @ < 0 will hold whenever R> 1:159. The role of this R-threshold (higher than which will allow full monotonicity) can be interpreted as follows. Suppose the agent has C 0 in his hands. Higher implies that his utility function is more concave. In the power utility setting, this extra concavity is achieved by pulling down the utility of the agent both on positive outcomes (C > 1) and negative outcomes (C < 1) with C = 1 as the anchoring case. (Refer to gure below.) In our setting, the agent with higher discounts both u(C S ) and u(C F ) more heavily than in the log-utility case. This has consequences on p . The depressed u(C S ) aects p unequivocally; it acts to lower p . Intuitively, this is because the reduced upside discourages the agent from taking much downside risk (in the form of lower C F ) in return, and the agent consequently decreases p to ensure that C F does not fall too low. (Recall that we are envisioning a martingale situation.) Hence, the agent with higher chooses lower p . Meanwhile, the eect of the depressed u(C F ) on p can go both ways. When is higher and the downside is even lower, the agent faces a predicament: on the one hand, she wishes to avert the painful depth of the downside by choosing to increase C F . However, this can only be done at the cost of lowered p , again, because of the martingale assumption. The loweredp means that she is undermining her very chance of avoiding the downside, albeit perhaps a less painful one. Since the agent faces this inevitable trade-o, the eect of depressed u(C F ) is ambivalent. Nevertheless, we still can say this with absolute certainty: as C S = R grows, so does the magnitude by which u(C S ) gets depressed. (Refer to Figure below.) This means that the unequivocal eect of the depressed upside (i.e., lower p ) will grow bigger and bigger, until at some point it dominates the (ambivalent and hence limited) eect of the depressed downside. 82 This is precisely our assumption in Theorem 5: as R exceeds a certain threshold, the eect of on p becomes monotonic. 
Modulo the assumption, the broad-strokes conclusion of Theorem 5 is intuitive: agents with higher risk-aversion will choose lowerp securities to minimize their perceived downside. 0.5 1 1.5 2 2.5 3 −3.5 −3 −2.5 −2 −1.5 −1 −0.5 0 0.5 1 Figure 3: Power utility functions with dierent 83 4 Double Jump Optimization It is realistic to envision a situation where there are multiple aspirations. For example, while the aspirational agent may be eager to accomplish a goal that she has yet to attain, she could equally be concerned that a past accomplishment may be revoked. We model this situation as a two jumps, where C 0 is in between the two aspirations. (R 1 < C 0 < R 2 ) We follow the same ow as in the single jump case: we rst oer the (doubly aspirational) agents a choice set of securities, each with specic levels of skewness embedded, then observe their behavior. 4.1 Setup We modify the U(C) in (1) to accommodate two aspirations; let U(C) = 2 u(C)1 [0;R 1 ) (C) + 1 u(C)1 [R 1 ;R 2 ) (C) +u(C)1 [R 2 ;1) (C) (11) with 0< 2 < 1 < 1. The double jump EU maximization problem is identical to (4) - (7), except for the modied U(C). Consequently, we can use Lemma 1 again to reduce dimension just as before. Therefore, for any 5-tuple set of parameters: ( 1 ;R 1 ; 2 ;R 2 ;C 0 ); (12) with 0< 2 < 1 < 1, R 1 <C 0 <R 2 and assumption (9) of Lemma 1. We can the dene the double jump EU maximization problem to be: max p (1p)U(C F ) +pU(R 2 ) (13) such thatC F satises (4). Since any ( 1 ,R 1 , 2 ,R 2 ;C 0 ) (with parameters in suitable range) denes such a problem, we will succinctly denote the EU maximization problem as: M (s) for s = ( 1 ,R 1 , 2 , R 2 ; C 0 ). Figure ?? illustrates an aspirational utility with two jumps. It turns out that the behavior of agents given this doubly-aspirational setup can easily be un- derstood by drawing on our previous results with single-aspirational agents. To establish this connection, we consider a corresponding single-aspirational setup. Consider ~ s = (R 2 , 2 ; C 0 ) and dene a single-jump problem using these three parameters. This is nothing more than a single jump 84 Figure 4: Aspirational Utility with Two Jumps 85 problem generated by erasing the intermediate jump. We denote this EU maximization problem asM (~ s) for ~ s = (R 2 , 2 ; C 0 ), and call this the single jump problem associated withM (s). We now introduce some notations and denitions. Denition 1. Let S R 5 be the set of all ( 1 , R 1 , 2 , R 2 ; C 0 ) such that 0 < 2 < 1 < 1, R 1 < C 0 < R 2 and assumption (9) holds. Consider s = ( 1 , R 1 , 2 , R 2 ; C 0 )2 S, and ~ s = (R 2 , 2 ; C 0 ). Then, s is indistinguishable from ~ s iM (s) induces the same solution (maximand p* and maximized EU) asM (~ s). Intuitively, when the parameters are arranged in such a way that one of the two aspirations (in this case, the lower one; R 1 ) does not play any role in the agent's maximization problem, we may say that such parameter setting s is indistinguishable from its `reduced form' single-jump parameter setting ~ s. In this case, it is redundant to think of it as a double-jump problem because it is possible to reduce it to a single-jump problem. This denition allows us to partition S into two parts. Denition 2. H =fs2S : s is indistinguishable from ~ sgS Naturally, S = H _ [H c and any s2 S belongs to either H or H c . When s2 H, the parameters are congured in such a way that it can be reduced to a single-kink problem and in this sense the double kink problem is `trivial'. 
When s2H c , the parameters do not allow for such a reduction. 4.2 Optimal Choice Under Two Jumps A natural question to ask given this setup would be: when is reduction possible (s2H) and when it is not (s2H c )? The following Theorem addresses this question. Theorem 6. Fix all 4 elements of s except R 1 (i.e., x the ex-R 1 4-tuple of s, and allow R 1 to oat.). Let R 1 = inffR 1 : ( 1 ;R 1 ; 2 ;R 2 ;C 0 )2Hg. Then, R 1 acts as a demarcation point; when R 1 < R 1 , ( 1 , R 1 , 2 , R 2 ; C 0 )2 H c and when R 1 R 1 , ( 1 , R 1 , 2 , R 2 ; C 0 )2 H. Moreover, when R 1 > 1, R 1 "R 2 as 1 " 1. This Theorem helps us discern whether the double-jump problem is an authentic double-jump prob- lem, or whether it is reducible to a single-jump problem. The pivot variable isR 1 , the intermediate location of the jump. The Theorem indicates that there is a threshold value (R 1 ), higher than which all problems are reducible, and vice versa. Roughly speaking, the higher the 1 and the lower the R 1 , the more likely it will be authentically double jumped, whence the intermediate reference 86 point actually matters. This is intuitive; higher 1 heightens the vertical stature of the intermediate reference point. Similarly, lower R 1 increases the horizontal stature of the intermediate reference point, and hence its importance. Overall, when these forces wrinkle the utility function suciently, the EU-optimization departs from a single jump problem. Moreover, when 1 increases (and ap- proaches 1 monotonically), the intermediate jump becomes signicant for monotonically increasing sets of R 1 until it is so high that it is signicant for any R 1 less than R 2 . This monotonicity, however, holds only on R 1 > 1, and this requirement ensures that we are not looking at a rather degenerate case where R 2 and 2 are simultaneously so low that the problem is dominated by the near-zero marginal utility induced by 2 0. The natural order of business now is to describe the behavior of p when s2 H c , i.e, when the double jumped problem truly involves two aspirations. This is done in the following theorem. Theorem 7. On any s2 H c , M (s) admits a solution p which is characterized by = C 0 R 1 R 2 R 1 , (which represents the position of initial wealth (C 0 ) relative to the two reference points), namely: (i) p is monotonically increasing in ; @p @ > 0. (ii) lim #0 +p 0 (iii) lim "1 p = 1 Theorem 6 and Theorem 7 together describe how agents react to a double jump problem. Imagine a situation where the agent is sitting on an initial wealth. On the upside, she sees a dream to which he aspires. On the downside, there is a point below which she does not want to drop. (e.g. she does not want to fall below a level where she will be unable to pay the rent, and will be kicked out of the neighborhood.) In some cases, the downside drop may be ignored (s2 H). Theorem 6 tells us that this is the case if the drop itself is not too big (e.g. when 1 is low and 1 2 is relatively small) or if the drop is too close to the aspiration point (e.g., when R 1 R 2 , so the agent already lumps them together when she makes decisions.) When these are not the case, the agent has to take both jumps seriously (s2 H c ). Theorem 7 tells us what happens when this is the case. When this happens, the agent gets `trapped' in between the aspirations, in the sense that the agent's demand for security is determined by, the position of her initial wealth relative to the two aspirations. 
When her initial wealth is close to R 1 , the agent, in fear of dropping below R 1 , demands more lottery-like securities that minimizes the downside, until at some point, she stops demanding risky securities altogether and just consumes C 0 (p 0). On the other hand, as C 0 is 87 safely far from R 1 and close to R 2 , the agent begins to demand negatively skewed securities that gets her to R 2 with great certainty, albeit at the risk of (much less likely) slips below C 0 . 5 Departure from Fair Gambles: Sub-martingale and Super-martingale The analysis thus far has limited its scope to martingales (fair-games with zero returns). This allowed undivided focus on the choice of skewness, however it begs the question: what happens when we vary the rst (mean returns) and second moments (volatility). We now add variations in the rst moment to see how this aects the choice of skewness. 5.1 Setup First, we consider sub-martingales. Sub-martingales are stochastic processes with positive drift. Recall thatC S =C 0 +N(M 1) andC F =C 0 N. To specify the deviation from martingale, let: :=p(M 1) (1p) 0: (14) Namely, is the expected gain from buying one unit of L(p), which costs 1 to purchase. Thus, the positive sign on is what makes this gamble a sub-martingale. When the agent purchases N units of L(p), the expected value of consumption is: E[C] =pC S + (1p)C F =C 0 +N: The second term (N 0) represents the `better than fair' component in the consumption scheme. Recall that in the martingale case,E[C] =C 0 and N was automatically pinned down by Lemma 1 and = 0. We rst state formally the EU-optimization problem in the sub-martingale case.: max p2(1;1) N2[0;1) (1p)U(C F ) +pU(C S ) (15) such that: C F =C 0 N C S =C 0 +N(M 1): 88 :=p(M 1) (1p) 0: Note that the analogue of (3) in the sub-martingale case (obtained by eliminating N from (5), (6) and substituting (14) in) is: p 1 + C S + (1p) + 1 + C F =C 0 ; (16) This is the `budget constraint' on the consumption scheme, set by the sub-martingale condition (14). Note that for every C 0 and C S , this sub-martingale budget constraint implies a C F higher than that implied by the martingale budget constraint (3).This re ects the fact that L(p) here represents a `better than fair' gamble; namely that 0. In the sub-martingale case, we cannot automatically invoke Lemma 1 anymore because Assumption (9) takes a dierent form, as is outlined in Lemma 3 below. In principle, we therefore have to solve the full problem, which amounts to maximizing (15) under the budget constraint (16). Doing so yields an interesting departure from the martingale case. The following two subsections illustrate the two consequences. 5.2 > 0: `Winter (keep C 0 )' Is Replaced by Near-Arbitrage Theorem 8. For any > 0,9p 2 (0; 1);N 2 (0;1) such that EU(p ;N )>U(C 0 ). The signicance of this theorem is that in the sub-martingale case, the default option for the EU- maximizing agent is no longer to `do nothing' (i.e., just sit on C 0 ) as in the Mean-Variance case. By doing nothing, the agent gets U(C 0 ), which is strictly dominated by EU(p;N) for a suitable choice of p and N which is always available as long as > 0. This begs the question: how can agents always enjoy an expected utility level higher than U(C 0 ) even when they are risk-averse? The answer is that when agents are allowed to choose the level of skewness embedded in the securities, they eectively end up selling lottery, whose payo prole increasingly resembles an arbitrage as p" 1. 
(Note that here, the term `lottery' is in the colloquial sense; purchase of a lottery entails loss in the mean, typically with high level of positive skewness, and vice versa for sale of a lottery.) To see this, note that the payo structure of L(p) in the sub-martingale case (obtained by manipulating the constraints in problem (15)) is: 89 0 1 Fail 1-p +(1p) p Success p In particular, when p 1, 0 1 Fail p 0 + Success p 1 whose payo almost resembles an arbitrage, increasingly so as p" 1. Such a near-arbitrage oppor- tunity may increase N indenitely, however it is bounded by the constraint C F (p )> 0, eectively acting as an upper-bound for N. The following is a pictorial illustration of Theorem 8. C0 −5 −4 −3 −2 −1 0 Figure 5: Almost Achieving Arbitrage by Selling Lottery: p SM 1 90 5.3 > 0: Lowers Demand for Positive Skewness Positive> 0 (strict sub-martingale) introduces an additional eect, encapsulated in the following result. The assumption required for Theorem 9 borrows from Lemma 3 which follows. Theorem 9. Assume (18) holds. Then> 0 impliesp SM > (1+)p , wherep is the martingale solution. (i.e., solution when = 0) In particular, (i) p SM >p , and (ii) p SM has a lower-bound that is strictly increasing in . The general message of this Theorem is thatp SM has a well-dened lower-bound, (1 +)p which is proportional to . In particular, Theorem 9-(i) means that in the sub-martingale ( > 0) case where the expected return is no longer 0, agents tend to go `more aggressive', and choose a security that achieves R with a higher probability than if = 0. This means that the optimal demand for skewness drops in the sub-martingale case. Theorem 9-(ii) states that this `aggressiveness' will most likely grow with; the stronger the positive drift, the less skewed are the securities demanded. Thus, in spirit, (ii) essentially means \ @p SM @ > 0". The result can be understood intuitively in the following way. Higher p SM is desirable because it increases the chance of attaining the good out- come U(C S (p SM )). The trade-o is, of course, that it pushes C F (p SM ) down lower. However, this downside can be mitigated if is positive, because allows agents to reduce their exposure to the gamble (N SM ) required to reach their aspired goals (R). Thus, the EU-maximizing agent can now aord to enjoy a higher p SM , hence > 0 implies p SM >p . This mitigating eect is stronger if the positive drift, , is stronger. Lastly, we close the loop by introducing the following Lemma which shows how Assumption (9) has to be modied in the sub-martingale case. Lemma 3. Consider the single-kink EU-maximization problem described in (4), with constraints (5), (6) and (14). Consider an associated reduced form EU maximization problem: max p2(1;1) (1p)U(C F ) +pU(R) such that C F satises p 1 + R + (1p) + 1 + C F =C 0 (17) 91 and letp SM be the maximand (solution) to this reduced problem. LetC F (p SM ) denote theC F which satises (17) at p SM . Assume that the following holds. U 0 (C F (p SM ))>U 0 (R) 1 + (1p SM ) (18) Then any optimal consumption scheme in the original problem satises C S = R. It may look as though Assumption (18) is harder to satisfy than Assumption (9) because (1p SM ) ! 1 asp SM ! 1. However, it turns out that Assumption (18) is almost always satised for reasonable values of , due to Inada Conditions. Intuitively, when p SM is close to 1 and C S R, this means C F (p SM ) is already very close to 0, if not already 0. By Inada Conditions, this implies very high U 0 (C F (p SM )), satisfying Assumption (18). 
Therefore, fortunately, we do not lose much by always assuming C_S = R, as it turns out to be a good approximation to the solution of the full problem. Conclusion: the sub-martingale problem does not allow for Lemma 1, but for practical purposes we can approach it as a reduced-form optimization problem, just as we did in the martingale case.

5.4 δ < 0: Super-martingales

By switching the sign of (14) we can extend the results of the previous subsection to super-martingales. Namely, when δ ≤ 0, decreasing δ reduces p and increases the preference for positive skewness (i.e., in spirit, "∂p_SM/∂δ > 0" holds, as before). However, somewhat differently from the δ > 0 case, the replacement of 'no gambling' by lottery sales is no longer available in the super-martingale case, simply because a negative δ depletes the opportunity for near-arbitrage. One small caveat in the super-martingale case is that the choice of L(p) must be limited to those with p < 1 + δ (δ < 0), because L(p) would otherwise offer an arbitrage opportunity. (Since the per-unit gain on success is M − 1 = (δ + 1 − p)/p and the per-unit loss on failure is −1, we need δ + 1 − p > 0; otherwise an arbitrage can be made by shorting L(p).)

An interesting exercise can be done using super-martingales, starting from the following observation. Taylor-approximating a CRRA utility function reveals a mechanical sign on the moments: the mean enters positively, volatility negatively, and skewness positively. This can, in part, justify the sale and purchase of lotteries. Namely, if we characterize a standard lottery as a security with very high positive skewness yet a minutely negative mean return, we can justify trading lotteries by ascribing it to a demand for skewness at the expense of a loss in the mean. Meanwhile, in the aspirational setting we provide, we show that a preference for negative skewness can also be justified (p ≈ 1). In the CRRA setting, buying negative skewness would have to be compensated by a positive mean, since negative skewness reduces utility. In the setting we provide, however, agents can choose to sell skewness and yet accept negative returns, providing another sharp distinction from the CRRA-based mean-variance framework. This is described in Figure 6 below. The colors represent the amount of return that can be taken away from a martingale bet before the agent stops buying L(p). The red zone represents the situation in which the magnitude of the mean return that can be taken away (the 'foregoable return') is highest, and the blue zone the situation in which it is lowest. The horizontal axis represents the jump size, which decreases from left to right. The vertical axis represents the distance between R and C_0, which increases from top to bottom. The picture clearly shows that there is some mean return that can be traded away against the benefits of trading skewed securities, even when the agent is pursuing a negatively skewed payoff. In fact, the foregoable return is highest when the agent is trading negatively skewed securities and lowest when trading positively skewed securities. Overall, the picture is intuitive. The foregoable return is 0 when p = 0 because, under that configuration, the agent can gain nothing by taking on a risky bet. When the agent is close to the aspiration point, and when the size of the jump is larger, the foregoable return is higher because there is something to gain from taking on risk to attain the aspiration point. When the agent is too far from the aspiration point, or when the size of the jump is too small, the agent becomes increasingly unwilling to give up much in return.

[Figure 6: Maximum Tolerable Negative Expected Return]
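The super-martingale comparative static can be illustrated by re-running p_star from the sketch above with negative drift, assuming (as the text suggests) that the reduced form remains a good approximation for δ < 0; the no-arbitrage restriction p < 1 + δ is already enforced by the feasibility filter there. Lemma 2's binomial skewness formula, exact only in the martingale case, is used as a rough gauge.

# continuing the previous sketch, with the same hypothetical parameters
def skew(p):
    # skewness of the binomial consumption scheme as a function of p (Lemma 2)
    return (1 - 2 * p) / np.sqrt(p * (1 - p))

sols = {d: p_star(d) for d in (-0.05, 0.0, 0.05)}
print(sols[-0.05] < sols[0.0] < sols[0.05])                     # p_SM increases with the drift...
print(skew(sols[-0.05]) > skew(sols[0.0]) > skew(sols[0.05]))   # ...so the demand for positive skewness falls with it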
6 Volatility and Skewness: A Role Analysis

Another meaningful departure from the binomial martingale is to separate volatility from skewness. In the stringent binomial martingale setting, the choice of p simultaneously fixes the volatility and the skewness of L(p), as illustrated by Lemma 2. Because of this entanglement, it was impossible to see clearly what the agent's preference for volatility looks like in the aspirational setting. The aim of this section is to explore the relative roles of volatility and skewness in the EU-maximization process. To isolate the two effects, our first goal is to design a set of securities that controls one while varying the other.

6.1 The New Consumption Scheme: Tri-nomial Martingales

We generate a variant of the binomial martingale introduced before. For the sake of brevity, we suppress the underlying L(p) and start directly with the consumption scheme, keeping in mind that there always exists a corresponding L(p). Starting from C_0, consumption rises to C_S = C_0 + N(M − 1) ('Success') with probability p_1, stays at C_M = C_0 ('Stay') with probability p_2, or falls to C_F = C_0 − N ('Fail') with probability p_3, where

p_1 + p_2 + p_3 = 1, and E[C] = p_1 C_S + p_2 C_M + p_3 C_F = C_0.   (19)

As before, (19) dictates that the trinomial consumption scheme is also a martingale. The key difference is that we assign positive probability mass (p_2) to an intermediate branch, C_M = C_0. Because consumption stays at C_0 on this node, the expected effect is to (1) reduce volatility while (2) keeping skewness relatively stable. Indeed, Figure 7 illustrates the movements of volatility and skewness as the probability mass on C_M (i.e., p_2) varies. As expected, volatility is muted as p_2 increases, whereas skewness is kept relatively stable. This allows us to analyze separately how volatility and skewness affect the agent's demand for L(p).

[Figure 7: Top: Skewness, Bottom: Volatility, as functions of p_2]

We begin by asking: if single-kinked EU-maximizing agents were offered the opportunity to choose securities with the same (similar) skewness but lower volatility (by increasing p_2), which security would the agent choose? For this, define:

Definition 3. Let T(P, C) denote the trinomial consumption scheme described above, with P = (p_1, p_2, p_3) and C = (C_S, C_0, C_F) such that p_3 C_F + p_2 C_0 + p_1 C_S = C_0. Consider B := C(P', C') with P' = (p_1, 0, p_3) and C' = (C_S, C_F). B is the binomial consumption scheme associated to T.

In this definition, B is simply the binomial analogue of the trinomial consumption scheme when p_2 = 0, while preserving the martingale property and the positions of C_S and C_F. It can be interpreted as the most volatile security in the family of trinomial securities with identical consumption positions and similar (by construction) levels of skewness.

6.2 Principle of Maximal Volatility

Theorems 10 and 11 characterize the aspirational agent's choice over tri-nomials. They reveal a rather surprising deviation from the familiar mean-variance results.

Theorem 10. Consider M(λ, R, C_0). Any optimal tri-nomial consumption scheme T is dominated by its associated binomial consumption scheme B.

Theorem 10 tells us that any consumption scheme which is less volatile than the most volatile consumption scheme (B) is dominated and not chosen. In other words, somewhat against our intuition, the agent chooses the most volatile security in the family of securities with the same (similar) skewness profile.
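The decomposition behind Figure 7 is easy to reproduce: fix the consumption positions, shift probability mass onto the middle node, and track the two moments. The positions below are hypothetical; the probabilities are pinned down by the martingale condition (19).

import numpy as np

C0, CS, CF = 2.8, 5.0, 2.0                  # hypothetical consumption positions with CF < C0 < CS
q1 = (C0 - CF) / (CS - CF)                  # success probability of the associated binomial scheme (p2 = 0)

for p2 in (0.0, 0.2, 0.4, 0.6):
    p1, p3 = (1 - p2) * q1, (1 - p2) * (1 - q1)          # keeps p1*CS + p2*C0 + p3*CF = C0, as required by (19)
    probs, vals = np.array([p1, p2, p3]), np.array([CS, C0, CF])
    vol = np.sqrt(probs @ (vals - C0) ** 2)
    skw = probs @ (vals - C0) ** 3 / vol ** 3
    print(f"p2={p2:.1f}  volatility={vol:.2f}  skewness={skw:.2f}")
# As p2 grows, volatility is muted (it scales with sqrt(1 - p2)) while the scheme keeps the same sign and
# broad magnitude of skewness, so p2 lets us vary volatility while (approximately) holding skewness fixed,
# which is the pattern Figure 7 depicts.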
The reason for doing so becomes more apparent with the next theorem.

Theorem 11. Any solution to a trinomial optimization problem is an associated B with C_S = R.

Theorem 11 tells us that, in spirit, Lemma 1 holds in the trinomial martingale setting as well. Namely, the agent takes just enough risk to hit R, and does not engage in a consumption scheme that offers C_S > R. Meanwhile, Theorem 10 tells us that the agent does not aim to reduce volatility either, as long as C_S = R can be attained. Reducing volatility only reduces the expected gains from attaining R. As long as the agent knows that she has the right skewness and N to efficiently hit the aspiration point R, she buys the entire portfolio rather than trying to reduce volatility by assigning weight to C_0. After all, volatility is the thrust that propels the agent from C_0 to R; reducing it would detract from the spoils of attaining R. Thus, insofar as the agent is able to choose precisely the level of skewness needed to attain R, she operates under the principle of maximal volatility in order to maximize her gain from the gamble. In the examples that follow, we illustrate the relative roles of skewness and volatility, the principle of maximal volatility, and what happens when the agent loses the ability to precisely target the level of skewness needed to attain R.

Example 1: Loss of Skewness is Prohibitively Costly. Consider the EU maximization problem M(λ, R, C_0) with λ = 0.8, R = 5, C_0 = 2.8. When the agent is allowed to choose from all available L(p), she chooses p = 11.22%, with skewness of 2.46, to attain 'o'. Now we consider depriving the agent of the opportunity to control skewness. If the agent is offered only the choice of skewness 0 (i.e., p = 1/2), she is offered 'x', which she does not choose since it is dominated by U(C_0). Namely, EU(o) > U(C_0) > EU(x), so the gambling stops. This illustrates the case where the loss of skewness is prohibitively costly and the agent stays with U(C_0) (no gambling).

[Figure 8: Loss of Skewness is Prohibitively Costly]

Example 2: Loss of Skewness is Not Prohibitive. Now consider the EU maximization problem M(λ, R, C_0) with λ = 0.8, R = 5, C_0 = 4.5. The only difference from Example 1 is C_0. When the agent is allowed to choose from all available L(p), she chooses p = 81%, with skewness of −1.54, to attain 'o'. In this case, when the agent loses control over skewness, she still chooses 'x' over U(C_0). Namely, EU(o) > EU(x) > U(C_0), so the agent still makes a risky choice.

[Figure 9: Loss of Skewness is Acceptable]

Examples 1 and 2 illustrate that the 'principle of maximal volatility' is not confined to optimal choices. Namely, in spirit, Theorems 10 and 11 hold even when we restrict the agent's choice to limited levels of skewness (in the Examples, restricted to p = 1/2). In Example 2, when control over skewness is lost, the agent does not choose to reduce volatility. It is easy to show that if the agent were offered a trinomial martingale with skewness 0, she would decline and still choose the associated binomial martingale, very much as in Theorem 10. However, as in Theorem 11, this is only to attain R: once C_S = R, the agent does not increase the size of the bet (N) to increase C_S.

6.3 A Rule of Thumb Under Limited Choice of Skewness

The Examples also illustrate a 'rule of thumb' that agents use when they are offered tri-nomials with limits on the level of skewness they can choose from.
When the agent is offered a tri-nomial security with skewness 0 (or, more generally, with restrictions on the level of skewness she can choose), she processes it in the following steps.

Step 1: She asks whether the loss of skewness is prohibitive and chooses C_0 if it is.
Step 2: If the loss of skewness is not prohibitive, she chooses the security with the highest volatility: the associated binomial (Principle of Maximal Volatility).

In summary, skewness is welfare-improving because it allows the agent to tailor her aspiration more precisely. Restrictions on this freedom are welfare-reducing (Example 2) and, in some instances, prohibitively so (Example 1). Volatility, on the other hand, can work both ways. Volatility is helpful insofar as it takes the agent to R; this is why agents choose the associated binomial (the 'principle of maximal volatility'). However, this holds only up to R, and excess volatility over and above R is avoided (Theorem 11).

7 Conclusions

In this paper we study the demand for skewness, both right and left skewness, using a utility function with microeconomic and evolutionary foundations. We assume that economic agents care both about consumption (a divisible good) and status (achieved through the purchase of non-divisible goods). Our resulting utility is in the spirit of Friedman and Savage (1948); their analysis, however, focuses on the second moment of the distribution describing uncertainty. We consider a parsimonious set of securities that allows the agent to select the exact optimal level of right or left skewness. Our analysis yields a rich set of results broadly consistent with empirical observations.

8 References

[1] Bali, T. G., N. Cakici, and R. F. Whitelaw (2011): "Maxing Out: Stocks as Lotteries and the Cross-Section of Expected Returns". Journal of Financial Economics, 99, 427-446.
[2] Boyer, B., and K. Vorkink (2014): "Stock Options as Lotteries". Journal of Finance, 69, 1485-1527.
[3] Campbell, J., and J. Cochrane (1999): "By Force of Habit: A Consumption-Based Explanation of Aggregate Stock Market Behavior". Journal of Political Economy, 107, 205-251.
[4] Charles, K., E. Hurst, and N. Roussanov (2009): "Conspicuous Consumption and Race". Quarterly Journal of Economics, 124, 425-467.
[5] Constantinides, G. (1990): "Habit Formation: A Resolution of the Equity Premium Puzzle". Journal of Political Economy, 98, 519-543.
[6] Diecidue, E., and J. van de Ven (2008): "Aspiration Level, Probability of Success and Failure, and Expected Utility". International Economic Review, 49, 683-700.
[7] Friedman, M., and L. J. Savage (1948): "The Utility Analysis of Choices Involving Risk". Journal of Political Economy, 56, 279-304.
[8] Harvey, C., and A. Siddique (2000): "Conditional Skewness in Asset Pricing Tests". Journal of Finance, 55, 1263-1295.
[9] Kraus, A., and R. Litzenberger (1976): "Skewness Preference and the Valuation of Risk Assets". Journal of Finance, 31, 1085-1100.
[10] Kumar, A. (2009): "Who Gambles in the Stock Market?". Journal of Finance, 64, 1889-1933.
[11] Mitton, T., and K. Vorkink (2007): "Equilibrium Underdiversification and the Preference for Skewness". Review of Financial Studies, 20, 1255-1288.
[12] Rayo, L., and G. Becker (2007): "Evolutionary Efficiency and Happiness". Journal of Political Economy, 115, 302-337.
[13] Roussanov, N. (2010): "Diversification and Its Discontents: Idiosyncratic and Entrepreneurial Risk in the Quest for Social Status". Journal of Finance, 65, 1755-1788.
[14] Sundaresan, S. M. (1989):
\Intertemporally dependent preferences and the volatility of con- sumption and wealth". Review of Financial Studies, 2, 73-88. 101 [15] Zhang, X. (2013): \Book-to-Market Ratio and Skewness of Stock Returns". Accounting Review, 88, 2213-2240. 102 Appendices Proof of Lemma 1: Step 1) Show that any consumption scheme whose C S exceeds R is dominated by a consumption scheme whose C S equals R. LetN 0 be the N such thatC S = R on equation (6). (It can easily be shown that) EU is continuously concave in N onNN 0 . It then suces to show that @EU @N N 0 < 0 in the original problem. Namely, if @EU @N N 0 < 0, then by continuous concavity of EU, @EU @N N < 0 for all N N 0 , and (again, by concave continuity of EU) any consumption scheme with N >N 0 is dominated by a consumption scheme with N 0 To prove this sucient condition ( @EU @N N 0 < 0), rst, write down the EU NN 0 under equations (5) - (7): EU NN 0 =(1p)U(C F ) +pU(C S ) =(1p)U(C 0 N) +pU(C 0 +N(M 1)) =(1p)U(C 0 N) +pU(C 0 +N 1p p ): Then dierentiate EU NN 0 with respect to N at N 0 , p : @EU @N N 0 ;p =(1p )U 0 (C 0 N) +p 1p p U 0 (C 0 +N 1p p ) N 0 = (1p )(U 0 (C S )U 0 (C F )) N 0 = (1p )(U 0 (R)U 0 (C F )) < 0; where the last inequality follows from the domain of p, and assumption (9). Step 2) Show that any consumption scheme whose C S falls below R is dominated by a consump- tion scheme whose C S equals R. When C S <R, U(c) = u(c). Then by our standard assumptions on u(c) and Jensen's Inequality, 103 we know that p = 0 (no trade) dominates all other consumption scheme, which is by denition, (weakly) dominated by p because p=0 is in the choice set. Proof of Lemma 2: Since p2 (0; 1), C F = C 0 pR 1p , and C F C 0 = p(C 0 R) (1p) . Also by Lemma 1, C S = R. Some calculations yield: 2 = p (1p) (RC 0 ) 2 ; (20) and similarly, E[(CC 0 ) 3 ] = p(1 2p) (1p) 2 (RC 0 ) 3 : (21) )S(p) = 12p p p(1p) and S 0 (p) = 1 2p(1p) 3=2 < 0. Proof of Theorem 1: The optimization problem characterized by (10) is specialized to power utility: max p EU(p) (22) where EU =(1p) C 1 F 1 1 +p R 1 1 1 (23) and C 0 =pR + (1p)C F : (24) Let := C 0 R . The associated First Order Condition @EU @p gives: F (p;) : = 1 ( p 1p ) R 1 ( 1 +p 1p ) + R 1 + 1 1 (25) = 0 (26) Checking the Second Order Condition, @F (p;) @p = 1 R 1 ( p 1p ) 1 2 (1) 2 (1p) 3 < 0 (27) 104 Hence, the EU-optimization problem amounts to nding the p which satises F (p ;) = 0 (28) Proof of part (i): By Implicit Function Theorem, @p @ = @F @ @F @p (29) Partial dierentiation yields: @F @ = (1 )(1p) R 1 ( p 1p ) 1 ( 1)(1) 1p and @F @p = (1 )(1p) R 1 ( p 1p ) 1 (1 )(1) 2 (1p) 2 Hence, @p @ = 1p 1 > 0 (30) Proof of part (ii) Suppose not. This means that on equation (25), p does not converge to 1 as " 1. This in turn allows us to conclude that: F (p ;)! 1 R 1 + R 1 + 1 1 ; (31) as " 1 105 Since F (p ;) = 0, so is the limit. (Equals zero.) This implies: 1 R 1 + R 1 + 1 1 = 0: (32) Rearranging, this becomes: R 1 = 1, a contradiction under our assumptions R> 1 and > 1. Proof of part (iii) Let g() : =F (0;) = 1 R 1 ( 1 ) + R 1 + 1 1 : To show that9( C 0 R ) 2 (0; 1) such that p = 0, we need to show that g() has a root in (0,1). First, note that g(1) = 1 1 (R 1 1)> 0 Then, by Intermediate Value Theorem (IVT), it suces to show that: 9C ;R 2 (0; 1), such that g(C)< 0. (*) Proof of (*) Rearranging g(), we get: g() = R 1 (1 1 ) + R 1 + 1 1 but, (1 1 )> !1 as # 0 + 106 Plugging this back into (22), this implies g()!1 as # 0 + Hence, for any N2R,9C2 (0; 1) such that g(C)<N, which proves (*). 
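As a sanity check on Theorem 1, its comparative statics in α = C_0/R can be reproduced by brute force. The sketch below grid-maximizes EU over p for several values of C_0 (with R fixed), using a stylized single-kink utility, CRRA plus a utility jump at the aspiration level R; the curvature, R, the jump size, and the C_0 values are hypothetical, and the exact kink specification used in the closed-form FOC (25) may differ.

import numpy as np

gamma, R, jump = 2.0, 5.0, 1.0                        # hypothetical curvature, aspiration level, and kink size

def U(c):
    return c ** (1 - gamma) / (1 - gamma) + (jump if c >= R else 0.0)

def p_opt(C0):
    # choose p in [0, 1); C_F is pinned down by the martingale budget C0 = p*R + (1 - p)*C_F
    def eu(p):
        cf = (C0 - p * R) / (1 - p)
        return (1 - p) * U(cf) + p * U(R) if cf > 0 else -np.inf
    return max(np.linspace(0.0, 0.995, 4000), key=eu)

print([round(p_opt(c), 2) for c in (1.5, 2.8, 4.0, 4.9)])
# p* = 0 when C0/R is small (Theorem 1-iii), increases with C0/R (Theorem 1-i),
# and approaches 1 as C0 tends to R (Theorem 1-ii).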
Proof of Theorem 2: Recall that (i) @F (p;) @p < 0 (from SOC), and (ii) C F C 0 = p 1p (C 0 R). Clearly, from (i), the Expected Utility is maximized at p (0). Note that from (ii), choosing p =0 is equivalent to choosingC 0 , namely not choosing any gamble. This is certainly an available option for the agent. Hence, we want to show that the agent chooses p=0 over p2 (0; 1). To do this, we need to show EU(p )EU(0)>EU(p) (33) where p 0<p (34) By Mean Value Theorem (MVT),9c2 (0;p) such that EU(p)EU(0) =EU 0 (c)(p 0)< 0 (35) This follows from the fact that EU 0 (c) < 0, which can easily be shown by applying MVT again. Proof of Theorem 3: Using Implicit Function Theorem, dp d = @F @ @F @p : Note that 107 F (p ;) = 0 () @EU @p = 0 () @(1p) C 1 F 1 1 +p R 1 1 1 @p = 0 () @(1p) C 1 F 1 1 @p + R 1 1 1 = 0 Hence, @F @ = @(1p) C 1 F 1 1 @p = R 1 1 1 1 < 0: Also, from before, @F @p = (1 )(1p) R 1 ( p 1p ) 1 (1 )(1) 2 (1p) 2 < 0 It thus follows that dp d < 0. Proof of Theorem 4: Using Implicit Function Theorem, @p @C 0 j ; ; = @F @C 0 @F @p : Note that @F @C 0 = R (1) (1p) 2 ( p 1p ) 1 > 0: Also, from before, @F @p = (1 )(1p) R 1 ( p 1p ) 1 (1 )(1) 2 (1p) 2 < 0 It thus follows that dp dC 0 j ; ; > 0. Proof of Theorem 5: A direct algebraic proof is not amenable. We rst suggest a sucient condition (actually, an equivalence condition) and then use this to prove the theorem. Claim 1. For a given (and of course, under the xed C 0 and as assumed in the statement of the theorem), let R 0 ( ) denote the R which leads to p = 0. (Recall from Theorem 1, we know !9R 0 ( ).) It suces to show that @R 0 ( ) @ < 0 108 Proof. Let i < j , and denote the optimal solutions pertaining to i and j (as functions of ) as p i () and p j (). Note that p () is well-dened as a function of because we have xed C 0 . In fact, in this setting we can treat p () as a continuously dierentiable function, as a direct consequence of the Implicit Function Theorem. Note also that by (30), p i () andp j () can never intersect. To intersect at, say, point 0 , there must exist a neighborhood of 0 upon whichj @p i () @ j always exceedsj @p j () @ j. However, (30) prohibits this (i.e. plug in 0 into (30), and for any > 0 , j @p i () @ j <j @p j () @ j whenever p i () > p j () and vice versa for < 0 ), asserting our claim that p i () and p j () can never intersect. Next, suppose that @R 0 ( ) @ < 0 holds. Since i < j , this implies R 0 ( i )>R 0 ( j ). Using what we know about p () from Theorem 1, we can deduce that p i C 0 R 0 ( j ) > 0 =p j C 0 R 0 ( j ) ; where the inequality follows from combining Theorem 1-(i) (p is monotonically increasing in and approaches 1 from the left) and the fact that 0 = p i C 0 R 0 ( i ) , by denition. Similarly, the equality follows from denition ofR 0 ( ). But since we established thatp i () andp j () can never intersect, this inequality at = C 0 R 0 ( j ) must in fact hold uniformly in all , namely, p i ()>p j () whenever i < j . Therefore, dp d < 0, as desired. Claim 2. @R 0 ( ) @ < 0. Proof. We rst specialize (25) by insisting p = 0, as per the denition of R 0 : G() =F (0;) =C 0 R 1 C 1 0 + R 1 + 1 1 = 0 By Implicit Function Theorem, @R @ = @G @ @G @R = ( R C 0 ) 1 ( 1 C 0 ) | {z } A<0 ( R C 0 ) [ ( R C 0 ) log R C 0 ( R C 0 )1 ] +O( 1 R ) ( R C 0 ) | {z } B>0 (36) whereO( 1 R ) := (1)C 1 0 logC 0 R C 0 1 > 0, a positive quantity that converges to 0 at the rate of 1 R . 109 Under the current assumptions, R 1>C 0 1> 0, and 1 < 0, thus A = ( R C 0 )1 1 C 0 < 0. 
To sign B, note that ( R C 0 ) log R C 0 ( R C 0 )1 > 1 for all R C 0 > 1, andO( 1 R )> 0, so if ( R C 0 ) [ ( R C 0 ) log R C 0 ( R C 0 )1 ] +O( 1 R )<, this implies ( R C 0 ) <, and B > 0. Therefore, under the given assumption, @R @ < 0. Combining Claim 1 and Claim 2, we arrive at the desired conclusion. Proof of Theorem 6: Note that by denition,R 1 <R 1 automatically implies ( 1 ,R 1 , 2 ,R 2 ;C 0 ) 2H c . It remains to prove that when R 1 R 1 , ( 1 ,R 1 , 2 ,R 2 ; C 0 )2H. The proof is constructed as follows. First, (Lemmas 4 - 5) we state some miscellaneous facts which we will use along the way. Second, we introduce a `discriminant' that will help us tell H andH c apart, and derive some of its properties (Lemmas 6-10). Third, we will use these to draw conclusions on how parameters should behave to be in either H or H c . Lemma 4. Let C F (p ) denote the C F when the agent's optimal solution is implemented; i.e., that which satises (3) at p . Then, @C F (p ) @ > 0 Proof. C F (p ) = C 0 p R 1p . By chain rule, @C F @ = @C F @p @p @ = C 0 R (1p ) 2 @p @ > 0, as product of two negatives. Lemma 5. When C F >C F (p*), @EU @C F < 0 Proof. First, note that @C F @p = C 0 R (1p) 2 < 0, hence C F is a bijection. It follows that C F (p) > C F (p ) () p<p , and from proof of theorem 1, we know that @EU @p = @EU @C F @C F @p > 0 on p<p . ) @EU @C F < 0 as desired. Now considerM (s), and we dene the following objects on the EU maximization problem. g(c) will be the discriminant. . l 2 (c) :=u(R 2 ) + u(R 2 ) 2 u(C F ) R 2 C F (cR 2 ); (37) g(c) = 1 u(x)l 2 (c)2C 2 (38) 110 0.5 1 1.5 2 2.5 3 3.5 −2 −1.5 −1 −0.5 0 0.5 1 Figure 10: l 2 (c) is tangent to U(c) at C F The following lemma tells us what l 2 (c) is. Lemma 6. Consider s2 S and its associated ~ s. l 2 (c) is the tangent line to 2 u(c) onM (~ s) which passes through the point (R 2 ;u(R 2 )). Moreover, the point of tangency is unique. Proof. That it passes through (R 2 ;u(R 2 )) is clear. To prove tangency, consider the single kinked EU maximization problem and the FOC condition. F (p;) = @EU @p =u(R 2 ) 2 u(C F ) + 1 1p 2 u 0 (C F ) (C 0 R 2 ) = 0 (39) () 2 u 0 (C F ) = (1p )(u(R 2 ) 2 u(C F )) R 2 C 0 = (u(R 2 ) 2 u(C F )) R 2 C F : (40) (41) Since RHS = (u(R 2 ) 2 u(C F )) R 2 C F is the slope of l 2 (c) and u is concave, l 2 (c) is tangent to 2 u(c) and the point of tangency isc =C F . Finally, uniqueness of tangency follows from concavity of u(c). Lemma 7. Let u(c) be the power utility function. Then,9^ c2R + where g 0 (^ c) = 0. Moreover, g(^ c) is a local and global (hence unique) maximum () g 0 (^ c) = 0. 111 Proof. Clearly,u 0 (c)> 0 for all c, hence by lemma 5, (u(R 2 ) 2 u(C F )) R 2 C F > 0. If u(c) is the power utility function, thenu 0 (c) =c which tends to +1 at 0 + and converges to 0 asc!1. Hence by IVT, 9^ c2R + where g 0 (^ c) = 0. The second part (local and global maximality) is a general property of strictly concave functions. Lemma 8. ConsiderM ( 1 ;R;C 0 ) andM ( 2 ;R;C 0 ) with 1 > 2 and dene l i (c), i = 1,2 con- formably as before. Then l 1 (c)>l 2 (c);8c2 (0;R). Proof. By Lemma 3, @C F () @ > 0, henceC F ( 1 )>C F ( 2 ). Since u'(c) is decreasing in c,u 0 (C F ( 1 ))< u 0 (C F ( 2 )). By Lemma 5, i u 0 (C F ( i )) = u(R) i u(C F ( i )) RC F ( i ) , i = 1; 2, and therefore u(R) 1 u(C F ( 1 )) RC F ( 1 ) < u(R) 2 u(C F ( 2 )) RC F ( 2 ) . Note also that l 1 (R) =l 2 (R) )l 1 (c)>l 2 (c);8c2 (0;R). Lemma 9. Let u(c) be the power utility function. Then the following are equivalent. 
g 1 and g 2 are two distinct roots of g(c) in (0;1) () g(^ c) > 0, where g 0 (^ c) = 0 and ^ c in (g 1 ;g 2 ) () max c2(g 1 ;g 2 ) g(c)> 0 Proof. We prove the rst equivalence. The second equivalence is a direct corollary from lemma 6. ())9 2 roots in (0;1) implies (by MVT) that9^ c in (g 1 ;g 2 ) such thatg 0 (^ c) = 0, whence by lemma 6 g(^ c) is a global maximum in (0;1). If the global maximum, g(^ c); 0, then9 0 or 1 root in (0;1), a contradiction. (() Suppose g(^ c) > 0, where g 0 (^ c) = 0 and ^ c2 (0;1), then by concavity of g(c), we can pick 0 < c < ^ c < c such that g 0 (c) < 0 < g 0 (c). Taylor expanding around c, c, (and using the fact that u(c) is a power utility function which tends to1 as c# 0 + ), we can show that lim c!0 + g(c) = lim c!+1 g(c) =1. ) by IVT,9 at least two distinct roots in (0;1). Lemma 10. g(c) has at most two distinct roots in (0;1). Proof. Suppose not. Then by lemma 8, there are at least two distinct local maxima> 0, which contradicts lemma 6, in particular, the uniqueness of local maximum. With these, we now construct the main body of the proof. By Lemma 9, we know g(c) has 0, 1, or 2 distinct roots. Let S 0 denote the subset of S such that g(c) has 0 root. Let S 1 denote the subset of S such that g(c) has 1 root. Let S 2 denote the subset of S such that g(c) has 2 roots. Clearly, S 0 ;S 1 ;S 2 partition S; S 0 _ [S 1 _ [S 2 =S. Claim 3. S 0 H 112 0.5 1 1.5 2 2.5 3 3.5 −2 −1.5 −1 −0.5 0 0.5 1 Figure 11: g(c) has 2 roots. 0.5 1 1.5 2 2.5 −2 −1.5 −1 −0.5 0 0.5 1 Figure 12: g(c) has no root. (Many of these are ruled out by Lemma 1) Proof. Given anys2S and its associated ~ s, letp and ~ p denote the optimal solution toM (s) and M (~ s). LetC F (p ) andC F (~ p ) denote theC F 's whenp and ~ p are implemented. LetEU (M (s)) and EU (M (~ s)) denote the maximized EU when p and ~ p are implemented. For a given s2 S, C F (p ) can either be higher or lower than R 1 . We look at the two cases. Step 1) We rst look atfs :C F (p )<R 1 g and showfs :C F (p )<R 1 g\S 0 H. Here, EU (M (s)) =p U(R 2 ) + (1p )U(C F (p )) =p u(R 2 ) + (1p ) 2 u(C F (p )); because C F (p )<R 1 . On the other hand, EU (M (~ s)) = ~ p U(R 2 ) + (1 ~ p )U(C F (~ p )) = ~ p u(R 2 ) + (1 ~ p ) 2 u(C F (~ p )) Since ~ p is the unique solution to max p2(0;1) EU(M (~ s)), we must have p = ~ p , and consequently, C F (p ) = C F (~ p ) and EU (M (s) = EU (M (~ s)), i.e.,fs :C F (p )<R 1 g\S 0 H. Step 2)fs :C F (p )R 1 g\S 0 is empty; i.e., C F (p )R 1 never happens in S 0 . 113 Suppose C F (p ) R 1 . Since on S 0 , g(c) has 0 root, g(c) < 0 on the entire domain. Therefore, l(c)> 1 u(c) and in particular, l(C F (p ))> 1 u(C F (p )): (42) Also, l(R 2 ) =u(R 2 ) (43) by denition, so taking linear combinations of the two sides, and recalling that we are assuming to be infs :C F (p )R 1 g so that 1 u(C F (p )) =U(C F (p )), (1p )l(C F (p ) +p l(R 2 )> (1p ) 1 u(C F (p )) +p u(R 2 ) =:EU (M (s)): (44) Note that l() is a linear operator in its argument, and plug in C 0 = ~ p R 2 + (1 ~ p )C F (~ p ) to get LHS =l(C 0 ) = ~ p u(R 2 ) + (1 ~ p ) 2 u(C F (~ p )) =:EU (M (~ s)): (45) Thus, EU (M (s)<EU (M (~ s)), so p is never chosen, a contradiction to global optimality of p . Hence C F (p )R 1 never happens in S 0 . Claim 4. S 1 H Proof. Pick any s2 S 1 . First, note that l 1 (c) = l 2 (c) because g(c) is concave and unique root denes the tangency, hence the tangent line for on C F ( 1 ) and C F ( 2 ) is a common line by con- struction ofl(c). 
Then, by same logic as in Claim 1, we can show thatEU (M (s)) =EU (M (~ s)). Modulo the (innocuous) assumption that the agent chooses ~ p = p when indierent, s and ~ s are indistinguishable, hence S 1 H as desired. Claim 5. S 2 =R + _ [R where R + H and R H c Proof. Let R<R be the two roots of g(c). Let R + :=fs2S 2 :R 1 Rg and let R :=fs2S 2 : R 1 <Rg. Clearly, S 2 =R + _ [R . Step 1) R + H 114 It suces to show that U(c) l 2 (c) on all c2 (0;R 2 ). Then, we can use the same argument as in Claim 1 to show that EU (M (s) = EU (M (~ s)). (Namely, we partition C F (p ) into (0;R 1 ) vs [R 1 ;R 2 ) and argue that on (0;R 1 ), s2H and the sub-case for [R 1 ;R 2 ) is empty.) Whenc2 (0;R 1 ),U(c)l 2 (c) certainly holds sinceU(c) = 2 u(c) 1 (0;R 1 ] andl 2 (c) is line tangent to 2 u(c) 1 (0;R 1 ] from above. When c2 [R 1 ;R 2 ), by lemma 6,9^ c2 (R;R) such that g 0 (^ c) = 0 and g 0 (c) < 0 everywhere on c2 (^ c;1). ) g(c) = 1 u(c)l 2 (c) < 0 for all c2 (R;1), in par- ticular for allc>R 1 R, since we are inR + . Hence, 1 u(c)<l 2 (c) for allc2 (R 1 ;R 2 ) as desired. Step 2) R H c This assertion will be proved in Theorem 7 (Lemma 10). Summing up, we know thatS 0 _ [S 1 _ [R + =H andR =H c . Recall thatR 1 = inffR 1 : ( 1 ;R 1 ; 2 ;R 2 ;C 0 )2 Hg. For anys2S 0 _ [S 1 _ [R + ,R 1 = 1, because for allR 1 ;s2H andR 1 is dened to satisfyR 1 > 1. On the other hand by construction of the proof of Claim 3, for any s2R =H c , R 1 =R. More- over, for any R 1 R 1 , the corresponding s is in H. Therefore, R 1 = R is the demarcation point between H and H c . Finally, suppose we consider the ex-R 1 4-tuple of any s2H C . Then R 1 > 1. As 1 " 1, the R of the associated g(c) monotonically increases to R 2 , thus proving the claim that R 1 "R 2 as 1 " 1. . Proof of Theorem 7: Lets2R and letC F (p ) be the theC F associated toM (s). LetC F ( 1 ) be the C F associated to the single-kink utility maximization problem ( 1 ;R 2 ;C 0 ). Similarly, let C F ( 2 ) be the C F associated to associated single-kink EU problemM (~ s). Note that C F (p ) is in either one of the two intervals: (0;R 1 ) or [R 1 ;R 2 ). We assume C F (p )2 [R1;R2) and later verify this. Assuming C F (p )2 [R1;R2), consider two sub-cases. Case 1) C F ( 1 )R 1 115 Because we assumed C F (p )2 [R1;R2), EU (M (s)) =p U(R 2 ) + (1p )U(C F (p )) =p u(R 2 ) + (1p ) 1 u(C F (p )): Because we assumed C F ( 1 )R 1 , C F ( 1 )R 1 C F (p ) (46) Recall by Lemma 4, @EU @C F < 0 on all C F ( 1 )<C F , hence the 0 best 0 C F (p ) is the smallest C F that respects C F ( 1 )<R 1 C F (p ). )C F (p ) =R 1 : (47) Case 2) C F ( 1 )>R 1 In this case, we argue exactly as in the proof of Theorem 6, Claim 1, Step 1 to arrive at: C F (p ) =C F ( 1 ): (48) Pulling these two cases together, C F (p ) =max(R 1 ;C F ( 1 )) (49) We now justify our assumption C F (p )2 [R 1 ;R 2 ). Claim 6. C F (p )2 [R 1 ;R 2 ) Proof. Suppose C F (p )2 (0;R 1 ). Then, by same logic as Theorem 6, Claim 1, Step 1 C F (p ) =C F ( 2 ) (50) Pulling these together, C F (p ) = 8 > < > : C F ( 2 ); ifC F (p )2 (0;R 1 ) max(R 1 ;C F ( 1 )); ifC F (p )2 [R 1 ;R 2 ) Therefore, it suces to show that agents always choosemax(R 1 ;C F ( 1 )) overC F ( 2 ). We proceed (as before) in two cases. 116 Case 1) C F ( 1 )R 1 Here, max(R 1 ;C F ( 1 )) =R 1 , so the task is to compare C F ( 2 ) and R 1 . Consider g(c) and its two roots R and R. By Lemma 6,9~ c2 (R;R) such that g 0 (~ c) = 0. 
1 u 0 (~ c) =l 0 2 (c) (51) On the other hand, by Lemma 5, 1 u 0 (C F ( 1 )) =l 0 1 (c): (52) Also, (as in the proof of Lemma 7) it is easy to see that l 0 1 (c)<l 0 2 (c) (53) Putting these together, 1 u 0 (C F ( 1 )) =l 0 1 (c)<l 0 2 (c) = 1 u 0 (~ c): (54) which means C F ( 1 )> ~ c. Combine this with the assumptions C F ( 1 )R 1 and R 1 <R (because we are in R ), we get: ~ c<C F ( 1 )R 1 <R (55) Since g(c) is concave in c, and g 0 (~ c) = 0, g 0 (c)< 0 on (~ c;R 2 ). This, (49), and Lemma 8 tell us g(c)> 0 on (~ c;R), in particular, g(R 1 )> 0. Namely, 1 u(R 1 )>l 2 (R 1 ) (56) Finally, since C F (p ) =R 1 , p must satisfy (1p )C F (p ) +p R 2 = (1p )R 1 +p R 2 =C 0 (57) We compare 117 EU(R 1 ) = (1p )U(R 1 ) +p U(R 2 ) = (1p ) 1 u(R 1 ) +p u(R 2 ) and EU(C F ( 2 )) =l 2 (C 0 ) =l 2 ((1p )R 1 +p R 2 ) = (1p )l 2 (R 1 ) +p u(R 2 ): Using (50), EU(C F (p ))>EU(C F ( 2 )) as desired. Case 2) C F ( 1 )>R 1 Task here is to compare EU underC F ( 1 ) vsC F ( 2 ). By Lemma 7,l 1 (c)>l 2 (c), and in particular for c =C 0 . Therefore, EU(C F ( 1 ))>EU(C F ( 2 )) as desired. Therefore, in each of the cases, indeed, C F (p )2 [R1;R2) as we assumed. We can now use (46) without any qualications. This immediately ties up a loose end we left in Theorem 6 (R H c ), which we state as a Lemma. Lemma 11. R H c . Proof. It suces to show that C F (p )6= C F ( 2 ). Recall that 2 < 1 , and hence by Lemma 3, C F ( 2 )<C F ( 1 ). )C F ( 2 )<C F ( 1 )max(R 1 ;C F ( 1 )) =C F (p ), as desired. We now use (46) to wrap up the proof. Consider the two cases: Case 1) C F ( 1 )R 1 By (43), C F (p ) =R 1 . This implies p = C 0 R 1 R 2 R 1 =, whence (i), (ii), (iii) follow directly. Case 2) C F ( 1 )>R 1 118 By (43),C F (p ) =C F ( 1 ). By same logic as in Theorem 6, Claim 1, Step 1, we can think ofM (s) asM ( 1 ;R 2 ;C 0 ), which is a setting where Theorem 1 - Theorem 5 apply. (i) follows from Theorem 5 and the chain rule. (ii) and (iii) follow from Theorem 1 and the chain rule. . Proof of Theorem 8: Without loss of generality, suppose that C <R always holds. (Assuming otherwise only strictly increases EU whence the same analysis can be used for the proof.) Consider a hypothetical sub-martingale L(p) with p = 1 and > 0. Namely, L(1) is a bet that gives with probability 1, and gives -1 with probability 0. Then, for any N > 0;EU(1;N ) =U(C S ) = U(C 0 +N)>U(C 0 ), with strict inequality. Since EU is continuous in p,9p 2 (0; 1) such that EU(p ;N )>U(C 0 ). Proof of Theorem 9: For the given problem, let p be the optimal solution when = 0; ceteris paribus. We want to show that any candidate p SM with p SM <p (1 +) is strictly dominated by p (1 +). Under the assumption, we know that C S = R, so we will assume this along the way. We rst establish some notations and gather related facts. Let L(C;C F ;R;U()) be the line (as function of C) that passes through the points (C F ;U(C F )) and (R;U(R)), whereU() is the single-jump aspirational utility function we have dened and used thus far. For the given,C 0 , R, and for any choice of p SM , letC 0 F (p SM ;;C 0 ;R) be theC F which satises the budget constraint-cum-; equation (17). Explicitly it is given as: C 0 F (p SM ;;C 0 ;R) := C 0 (1 +) 1 +p p 1p + R: Let C POE (p SM ;;C 0 ;R) be the `point of evaluation', dened as: C POE (p SM ;;C 0 ;R) :=(1p SM )C F (p SM; ) +p SM R =C 0 (1 +) p SM 1 p SM (1 +) + 1 p SM 1 p SM (1 +) p SM R C 0 (1 +) : The reason for this nomenclature will become evident. 119 Given these denitions and notations, some immediate facts can now be observed. 
Fact 1 First, under the given problem, the expected utility (EU) given the choice p SM is given by L(C POE (p SM )), with the parameter notations suitably suppressed. This is an immediate conse- quence of the denition ofL(C) andC POE and the denition of expected utility under our binomial setting. This observation is where the name `point of evaluation' comes from; the expected utility can be regarded as L(C) evaluated at the `point of evaluation', C POE . Needless to say, since U() in monotone in its argument and C F <R by denition,L(C) has positive slope. Fact 2 Second, we already know that C F (p ) is the tangent point to U(), so insofar as p SM 6=p , L(C;C F (p SM ;> 0);R;U()) will be strictly lower than L(C;C F (p ; = 0);R;U()) Fact 3 Third, some straightforward calculations inform us that L(C;C F (p (1 +);> 0);R;U()) =L(C;C F (p ; = 0);R;U()): That is, if changes from 0 to some strictly positive value, inducing a change in the associated L(C), we can revert back to the original ( = 0) line simply by increasing the choice of p SM by a factor of (1 +): Fact 4 Fourth, from the denition of C POE above and by dierentiation, it is not hard to sign the quantity: @C POE @p SM > 0; again, with suitable abuse of notation. We now have enough mustered up to prove the theorem. The task is to show that given the setup with> 0, any choice ofp SM such thatp SM <p (1 +) is strictly dominated by an alternative, feasible choice of choosing p (1 +), which is, by denition, greater than p . From the second and third fact, we know that: L(C;C F (p SM ;> 0);R;U())<L(C;C F (p ; = 0);R;U()) =L(C;C F (p (1 +);> 0);R;U()) From the rst fact, we know that EU can be evaluated from the lines L(C) at suitable C POE 's. Recall thatL(c) have positive slopes. Hence, the only case where EU(p SM )>EU(p (1 +)) can 120 ever happen with p SM <p (1 +) is when: C POE (p (1 +);;C 0 ;R)<C POE (p SM ;;C 0 ;R); for some p SM <p (1 +). But this contradicts the fourth fact, proving our claim. Proof of Lemma 3: Step 1) Show that any consumption scheme whose C S exceeds R is dominated by a consumption scheme whose C S equals R. Let N 0 be the N such that C S = R on equation (6). Then as in the proof of Lemma 1, EU NN 0 =(1p)U(C F ) +pU(C S ) =(1p)U(C 0 N) +pU(C 0 +N(M 1)) Then dierentiate EU NN 0 with respect to N at N 0 , p : @EU @N N 0 ;p =(1p )U 0 (C 0 N) + (M 1)p U 0 (C 0 +N(M 1)) N 0 = (1p ) (1 + (1p ) )U 0 (C S )U 0 (C F ) N 0 = (1p ) (1 + (1p ) )U 0 (R)U 0 (C F ) < 0; where the last inequality follows from the domain of p, and assumption (18). Step 2) Showing that C S <R never happens is same as in Lemma 1. Theorem 10: We proceed in two steps. In Step 1, we show that the optimal consumption po- sitions are identical to the those of the optimized binomial scheme. In Step 2, we show that the probability mass on C 0 must be 0 for optimality. 121 Step 1: Consider the standard solution to the EU-maximization problem using binomial con- sumption scheme. Let C F and C S be optimal C F and C S in this binomial solution. LetT with P = (p 1 ;p 2 ;p 3 ) andC = (C S ;C 0 ;C F ) be the optimal trinomial consumption scheme toM (;R;C 0 ). ThenC = (C S ;C 0 ;C F ): Proof. (Given any p 2 , 0 p 2 < 1.) The proof goes by comparing the objective functions of the binomial and trinomial optimization. 
The trinomial optimization problem is: max p 1 ;p 2 ;p 3 p 3 U(C F ) +p 2 U(C 0 ) +p 1 U(C S ); subject to p 3 C F +p 2 C 0 +p 1 C S =C 0 and p 1 +p 2 +p 3 = 1 or equivalently, max p 1 ;p 3 U(C 0 ) +p 1 U(C S )U(C 0 ) +p 3 U(C F )U(C 0 ) ; (58) subject to p 1 (C S C 0 ) +p 3 (C F C 0 ) = 0 and p 1 +p 3 = 1p 2 (59) Recall that the objective function for the binomial scheme was: max p U(C 0 ) +p U(C S )U(C 0 ) + (1p) U(C F )U(C 0 ) ; (60) subject to p(C S C 0 ) + (1p)(C F C 0 ) = 0 (61) Scaling (58) - (59) by 1 1p 2 yields a monotone ane transformation (ane in the choice variables) of (60) and identical constraint, hence identical solution, up to the C positions. (P cannot be pinned down yet, because it is determined only up to scaling.) Step 2: p 2 = 0. Proof. By an argument similar to the proof of Lemma 5-Lemma 7, we know that: (1)U(C F ) +U(R)>U(C 0 ); for 0<< 1: 122 On the other hand, some manipulation on (58) yields EU = (1p 2 ) (1p 2 p 1 ) (1p 2 ) U(C F ) + p 1 (1p 2 ) U(R) +p 2 U(C 0 ): Hence, to maximize EU, we require p 2 = 0. Proof of Theorem 11: This is really a Corollary of Theorem 10. By Theorem 10, we can reduce the solution spaceT to the space ofB. Then by Lemma 1, theC S of the associatedB must equal R. 123 Part III The Quest for Status in Two Flavors Abstract The status literature has suggested mainly two reasons why social status may be desirable. First, status can mean entitlement to more resources in the future. Second, higher status gives a sense of advancement vis-a-vis peers. We model status-seeking behavior explicitly as an activity that maximizes satisfaction drawn from these two drivers of status. The model provides a framework to (i) classify status goods and (ii) endogenize the choice of optimal status goods we classify, depending on the wealth level. The model also generates an interim utility function that reminisces that suggested by Friedman and Savage (1948). 124 1 Introduction Economists have recognized the human desire to seek social status ever since the days of Adam Smith, who notes that at higher levels of income, people value the \social esteem" brought on by their wealth more than the consumption of goods and services (see Smith (1759), p.70). Similar observations have been made by Duesenberry (1949), Friedman and Savage (1948). Some biologists argue that status-seeking is neurologically engraved into the behavior of animal species at large, observing that for certain organisms, (such as lobsters) individuals that enjoy high status within their pack are known to secrete hormones (serotonin) that are thought to be associated with the reduction of stress and heightened sense of satisfaction. If status is desired, why is it desired? One possible reason, as is identied in the literature 1 , is that people desire status for the resources it can buy. Namely, status typically carries an expectation of entitlement to certain exclusive forms of resources not readily available to those with lower sta- tus. Another reason discussed in the status literature - similar to the notion of \keeping up with the Joneses" - is that people enjoy high status because it gives them the sense of advancement in their positions vis-a-vis their peers. That is, people simply enjoy the sense of feeling superior to others, in the form of higher status. Given that wealth is closely related to status, this could lead to relative wealth concerns, and the possibility of drawing satisfaction from the opportunity to signal their wealth level. 
With these notions in mind, we propose a model with the aim of directly modeling status-seeking behavior itself and to explore its implications. We also show how some of these considerations give rise a form of utility function reminiscent of Friedman and Savage (1948), which in turn leads to endogenous levels, and composition of status demanded by an agent. Status is not always a directly traded good 2 . However, actions can be taken to attain status. The actions agents can take in this model explicitly re ect the two sources of desirability of status dis- cussed in the literature. One action agents can take is to `invest' in status, in the hope of gaining better access to resources. That is, an agent expends resources on activities that could potentially lead to an advancement of her status, which in turn would allow her to command access to more resources. This notion is in line with ndings such as by Ball et al. (2001) where those endowed with high status are shown - in an experiment setting - to command a premium in the price of goods they trade. Thus, it is reasonable for a rational agent to seek status as a future `investment' 1 For example, the survey paper by Heetz and Frank (2011) 2 Some assume that status is explicitly traded in a hedonic market for status. See for example, Becker et al (2005) while many others posit that it is traded indirectly, for example, through a signalling good. 125 with a clear understanding that this could give her to access to more resources. A realistic example would be membership to an exclusive country club, which serves as a venue for networking and exchange of valuable information, some of which could potentially lead to further opportunities. For professional musicians (especially classical music performers), the purchase of an expensive musical instrument (e.g., a Stradivarius violin) could be an investment in status in the sense that owning such an instrument often gives the performer some `credibility', and hence prospects for further success. An alternative way to seek status is to purchase conspicuous, expensive goods to `signal' a level of wealth, much like peacocks do with their tails. While this is not a direct purchase of status, it achieves its goal through signaling a certain positionality in the wealth distribution, which constitutes a source of satisfaction drawn from status. Clearly though, these two forms of `ac- tions' are not mutually exclusive, even within a single given status good observed in the real world. For example, the purchase of an expensive violin could primarily be an `investment' for the per- former, but could also function as a way to `signal' the successful status and wealth of the musician. The idea of adding preference for status to the utility function is not altogether new. And the literature has also explored the impact of status considerations on risk-taking (Ray and Robson, 2012), savings decisions (Hopkins and Kornienko, 2006), income inequality (Becker et al., 2005), investment decisions (Roussanov, 2010), just to name a few. Yet most of these studies focus on fragmentary elements of status, quite often reducing it down to what is indistinguishable from simple relative wealth concerns. We believe that the human desire for status is likely to be more avorful than the common treatment of it and propose to rene it by adding at least two elements. 
Firstly, we observe that our concern for status is often local, in the sense that comparison is made predominantly among peer groups rather than against the global population. In fact, this is in the spirit of the empirical ndings by Charles, Hurst and Roussanov (2009) where they document that the low-income race tend to consume more conspicuous goods in order to separate themselves from those that are marginally poorer than themselves. These suggest that it is more realistic to model status concerns as an impetus to win over one's immediate rivals, rather than ones that are \out of one's league", yet the current focus of the literature seems to be on the later. Secondly, we directly model status as an `investment', i.e., as an instrument to empower oneself to even more resources. While this is an idea recognized ever since Adam Smith 3 , it is to our knowledge, neglected in models that incorporate status concerns to analyze economic choices. Yet we show 3 In the Wealth of Nations, he writes: \A linen shirt, for example, is, strictly speaking, not a necessary of life. The Greeks and Romans lived, I suppose, very comfortably, though they had no linen. But in the present times, through the greater part of Europe, a creditable day-labourer would be ashamed to appear in public without a linen shirt, the want of which would be supposed to denote that disgraceful degree of poverty, which, it is presumed, nobody can well fall into without extreme bad conduct." 126 in this paper that incorporating these elements yields many interesting outcomes, for example, it entertains the notion that we \choose whether to be the big sh in a small pond, or the small sh in a big pond", and when those choices are made. Furthermore, it illustrates how we use status to facilitate a jump to the next pond. Our model is distinguished from the literature in that we model status-seeking behavior itself and explore the consequences of such activity, whereas the literature takes status-seeking activity as a given, often devolving it down to the notion of simple external habits. Also, in the process of the analyses, we recover a utility function whose shape reminisces one proposed by Friedman and Savage (1948) and more recently, an aspirational utility function (Diecidue and van de Ven, 2008). Obviously, the shape of this utility function has implications on investment behavior as well. In a separate paper (Lee, Zapatero, Giga, 2019) we explore a particular case of this implication on investment decisions: preference for skewness. Lastly, the model provides a rationale for why sometimes we make consumption decisions in indivisible bulks (especially when the good is conspicuous), much in line with our experience of saving up every penny to buy a house or luxury cars, etc., as opposed to the perhaps more comfortable alternative to just settling for a modest lease or rent. 2 Setup In this section, we set up an optimization problem that explicitly incorporates preference for status. We start by introducing a set of assumptions that govern the agent's preference over status. We then introduce status goods (S2R 2 + ) on top of the usual consumption good (C2R + ). The reason status goods are modelled as two-dimensional objects (S2R 2 + ) is to accommodate the two drivers of status identied by the literature, each component inR 2 + representing a driver. The consumption good is standard, over which we assume CRRA preference. 
We then introduce a utility function that encodes agent's joint preference over C andS, and specify the optimization problem that the agent faces. 2.1 Preference Over Status The model we propose aims to incorporate two realistic features of status-seeking. First is that the focus of status seeking is often `local,' the exact meaning of which we specify shortly, then propose its representation of this local preference. We also discuss a natural (continuous) extension of this local preference. Second, we acknowledge that status-seeking is probably deeper than simple relative wealth concerns (keeping up with the Jones) and model what we call the `investment component' of status. The idea is that status-seeking may perhaps serve practically useful purposes - almost 127 like investments - rather than simply being a vain and senseless \ego trip". We begin with locality. 2.1.1 The Local Nature of Status Consider an agent i who owns wealth of value W i . One clear value of this level of wealth (W i ) is that it allows her to consume an amount C(W i ) of her choice. On top of this pure consumption value, the rough whereabouts of W i - relative to others - may also be of interest in its own right in the sense that it determines her `status.' This representation of status concerns, which takes the form of the relative position of wealth vis-a-vis her peers, is very well-recognized in the status literature, a excellent review of which is by Heetz and Frank (2011). Who qualies as peers is also a topic much discussed in the literature. Co-workers at workplace is often mentioned as a plausible candidate, as are residential neighbors. We observe that one com- mon feature in these proposed denitions of peers is that they are likely to possess similar levels of wealth. This suggests that status concerns - in the form of relative wealth concerns - is predom- inantly local. For example, an agent with median wealth is likely to compare his standing against other agents possessing wealth levels close to the median level, rather than an extremely wealthy individual (e.g., Bill Gates or Warren Buet) who is completely `out of his league'. In other words, the evaluation of status is much more likely to be made locally against those that are considered `peers'. This notion of locality nds support from empirical studies - for example, Charles, Hurst and Roussanov (2009) where the main thrust of the nding is the urge to avoid being stigmatized as belonging to an income group just lower than their own - yet the vast majority of the literature treat status as a global phenomenon 4 . Suppose that the agenti nds that her peers are agents whose wealth level fall in the closed interval W i l ;W i u . This interval, W i l ;W i u , is the target status bracket of wealth for agent i, following the apparently well-accepted focus on wealth in the literature in determining peer groups. LetL denote the length of this wealth bracket, (i.e., L =W i u W i l ) which represents the `variance' of the wealth levels of an agent's peers. Insofar as the interval W i l ;W i u represents the wealth level of her peers, it is likely that: W i 2 W i l ;W i u ; or at least that W i falls somewhere in the vicinity of the bracket W i l ;W i u , although we do not necessarily make this a requirement. All we require is that the agent i has a target interval of wealth, of length L, which she feels spans the wealth level of those she considers her peers. 
For 4 For example, in many studies, status is represented simply as deviation from the average income level, etc. 128 simplicity, we assume that L is identical for all i. 2.1.2 A Representation of the Local Preference for Status Suppose that the agent can signal convincingly that her level of wealth is S. Recall that W i l ;W i u is the target status bracket for agent i. The positional nature of status, combined with the local- ity we assume implies that the relative position of S within the bracket W i l ;W i u is an important source of (dis)satisfaction for agenti. How an agent interprets her signaled position within the local wealth bracket is an open question. One extreme case is a preference represented by (S) [W i l ;W i u ] , a step function restricted to the target bracket W i l ;W i u , depicted as the blue line in Figure 1 below. The stepwise increment of (S) [W i l ;W i u ] occurs at a threshold point (say, the mid-point W mid ) after which (S W mid ) the agent feels endowed, and satised with her status, at least locally. The choice of the midpoint (W mid ) as the threshold is motivated by the observation that we are often dualistic in our interpretation of the state of the world, and consequently, we would at the very least like to be in the `better half' of the local status bracket, rather then the inferior half. Nevertheless, this choice (W mid ) is a simplication that is not crucial, and can be varied without disturbing the essence of the results we derive: for example, we could have let the threshold be at the 70 th percentile. Even more generally, (W mid ) may even represent the wealth level of her `rival'. Figure 1: Step and mollied versions of (S) [W i l ;W i u ] For now, the magnitude of the step is normalized to be of unit size. But it is perhaps unrealistic 129 to assume that every individual receives an identical amount of satisfaction from exceeding her W fmidg . Hence, later we will weigh(S) by an individual-specic parameter (m i ) that controls this idiosyncrasy. 2.1.3 Mollication: An Alternative Representation The sharpness of the step function is perhaps a simplication that requires some relaxation. A more realistic representation of the local status preference could be obtained by mollifying the step function, in order to entertain the possibility that the gain in status is perceived more gradually than at a sharp, singular point, W mid . Mollication is a technique that is widely used in applied mathematics - for example in the study of partial dierential equations - whose purpose is to nd (a sequence of) smooth functions that approximate a (possibly) sharp function. This approximation is achieved by computing the convolution of the sharp function with a mollier, which is itself also a function. The degree to which the original sharpness is mollied can easily be controlled by a parameter innate to the mollier. For our purpose, the step function is mollied to deliver a more smooth payo when the agent's signal S exceeds W mid , as is depicted in Figure 1. See Appendix for details on the mollication procedure. 2.1.4 Extending the Local Preference to Adjacent Brackets While we assume that status preferences are innately local, we need to impose some structure on how this local preference relates to the agent's interpretation on his `global' status, i.e., within the entire wealth distribution [0;1). 
The purpose of this, within the context of this model, is simply to make a stand on what happens when the agent's signal (S) migrates away from the initial wealth bracket W i l ;W i u to an adjacent wealth bracket such as W i l L;W i u L or W i l +L;W i u +L : We emphasize nevertheless that the main focus of the agent, as well as ours, is still on the local wealth bracket. We assume that the agent is consistent when she extends her local preference globally, that is, we assume that the agents `smooth-paste' their preference of local status to evaluate her global status, as depicted in Figure 2 of () below. This is an innocuous extension, in the sense that it ensures continuity of (), and hence that there are no other source of gain in utility than what is implied by the local preference for status, even in the global sense. This is in line with our intended focus on the local nature of status concerns. 130 Figure 2: Continuous Extension from Local to Global Status 2.2 The Goods: Consumption and Status There are two goods in this model, consumption good (C2 R) and status good (S2 R 2 ). C is the usual consumption good, and we assume that it enters the utility function as an argument of the CRRA utility function. A status goodS is comprised of two components: `investment' (I) and `signal' (S). Each component corresponds to the two desirable aspects of status discussed above. S is the `signaling' component of status goodS. S denotes the amount of wealth they signal, that is,S entails the recognition that the signaling agent's wealth is (at least) S. The function, whose argument is S quanties the satisfaction an agent draws from the good S. As in I, agents can choose from a continuum of S goods, earmarked by its price, P S 2 (0;W max ]. We assume that it takes a P S -good to signal exactly that much wealth level, that is: P S =S: (1) This means that people are conservative about how they interpret the status signal; they need to commit to expending resources to signal exactly how much they have spent. 5 5 Clearly, PS (S) = S (i.e., an identity mapping) is not an equilibrium notion. An equilibrium would require a `correct belief' condition, namely that PS (S) mapping satises W = S(Ps(S);W ) so that the price scheme would support a truthful choice of an individual with wealth level W . Yet we chose to stay with PS = S because we are more focused on an individual's choice rather than a general equilibrium, hence opted for a simplifying conservatism. This conservatism also abstracts away from game-theoretic strategic action, for example, the possibility of deviating from a truth-telling equilibrium to signal a wealth level that the agents cannot aord. 131 I is the `investment' component of the status goodS. Its sole value is in that if it is purchased at time t, it oers agents a positive chance (with probability Pr W ) of propelling her to the upper adjacent wealth bracket W i l +L;W i u +L at time t + 1. The idea is that if the `investment' on status is successful, the agent will leverage this investment component of status and use it to propel herself to the upper adjacent level of wealth. Examples would include membership to a social club or perhaps even education (e.g., MBA degree, etc.) The agents can choose from a continuum of I goods, earmarked by its price, P I 2 [0;L]. That is, I =P I ; P I 2 [0;L] (2) The real-world interpretation is that there would be a continuum of status goods to choose from, each with varying degree of emphasis on the level of \investment" component they represent. 
For example, some goods are more conspicuously suited for unabashedly 'showing off', while others are geared toward providing more networking opportunities. Quite naturally, we assume that a higher P_I implies a higher Pr_W; that is, choosing a more expensive 'status investment good' ensures a higher probability of propelling the agent to the next level. A simple example is a Pr_W that increases linearly in P_I, namely:

Pr_W = P_I / L,  P_I ∈ [0, L].    (3)

Note that, in line with our assumption that status concern is predominantly local, we assume P_I ∈ [0, L] to ensure that the leap is, at best, to the adjacent wealth bracket. This also happens to be a realistic feature, in the sense that advancements are typically earned step by step, even for those who are most successful.

2.3 The (Single Period) Budget Constraint

At any given point in time, an individual can expend her wealth, W^i, on the following items: C^i ∈ ℝ and 𝒮^i = (S^i, I^i) ∈ ℝ². Also, as per the previous argument, I = P_I and S = P_S. Hence, at any given time,

C^i + P^i_I + P^i_S ≤ W^i,    (4)

where C^i, P^i_S ∈ [0, W^i] and P^i_I ∈ [0, L]. The structure of the problem dictates that expenditure of I = P_I in a given period results in a future budget constraint that is probabilistically determined, depending on the value of I. That is, given I = P_I today, the next-period budget constraint is:

C^i + P^i_I + P^i_S ≤ W^i      with probability (1 - P_I/L), and
C^i + P^i_I + P^i_S ≤ W^i + L  with probability P_I/L.

Finally, note that given our specification (3), I is, in spirit, a 'martingale'. An investment of amount P_I today leads to an expected gain of size

0 · (1 - P_I/L) + L · (P_I/L) = P_I,

which exactly cancels out the initial investment P_I.

2.4 The (Single Period) Utility Function

The utility function for the i-th agent takes the form:

U^i(C^i, 𝒮^i) = u(C^i) + m^i φ(S^i).    (5)

The first term, u(C^i), is the power utility function over consumption, assumed identical across all agents. As discussed previously, φ(S^i) in the second term represents how agent i interprets her signal S^i, and can take the form of either a step function (φ_s) or its mollified version (φ_ε) for better realism. The term φ(S^i) is scaled by m^i, representing the weight the individual attaches to signaling status, namely the size of the jump. If m^i is high, we may think of the agent as an 'insecure' type: one who seeks a lot of recognition from others, and hence derives a large utility from signaling her wealth level to others through S. Conversely, if m^i is low, the agent is a 'secure' type whose perception of self-worth is largely self-driven. Finally, note that I does not enter U(·) directly. This is because the role of I is to probabilistically propel the wealth of agent i (W^i) to the next level (W^i + L), and hence I appears only in the budget constraint.

2.5 The Full, Multi-period Problem

Because investment is a dynamic activity, the model needs to incorporate a multi-period setting. We consider the simplest case: a two-period model. For the sake of tidiness, we suppress the notation indicating the agent (i) going forward, and consider a representative agent with a fixed m^i = m and Pr^i_W = P_I/L. Before writing the two periods out explicitly, the short simulation below illustrates the investment lottery implied by (3) and (4).
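To fix ideas, here is a minimal simulation of the status-investment lottery defined by (3): with probability P_I/L the agent moves up one bracket (wealth rises by L), and otherwise her wealth is unchanged. The numerical values of L, W and P_I below are illustrative assumptions rather than values taken from the text; the point is simply to verify the 'martingale' property that the expected wealth gain equals the amount invested.

```python
import numpy as np

rng = np.random.default_rng(0)
L = 2.0             # bracket width (illustrative value, not from the text)
W, P_I = 5.0, 0.7   # current wealth and chosen investment good, with P_I in [0, L]

# Status-investment lottery of (3): success with probability P_I / L lifts the agent
# to the adjacent upper bracket (wealth W + L); failure leaves wealth at W.
n = 1_000_000
success = rng.random(n) < P_I / L
W_next = np.where(success, W + L, W)

# 'Martingale' property: the expected wealth gain equals the investment P_I.
print(W_next.mean() - W)   # approximately 0.7 = P_I
```

The same transition rule is what generates the probabilistic next-period budget constraint carried into the two-period problem that follows.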
At time t, if the agent with wealth W expends I = P_I, the budget constraint is:

C + P_S ≤ W - P_I.    (6)

As per our previous argument, this yields a probabilistic budget constraint at time t′ (the next period):

C′ + P′_I + P′_S ≤ W    (7)

with probability (1 - P_I/L), and

C′ + P′_I + P′_S ≤ W + L  with probability P_I/L.

Regarding utility, we make the usual assumption that utility is additive over time, so the multi-period utility function takes the form:

U(C, 𝒮, C′, 𝒮′) = U(C, 𝒮) + β E[U(C′, 𝒮′) | I],    (8)

where β is the time discount factor and E[U(C′, 𝒮′) | I] denotes the expectation conditional on I at time t. Suppose the agent of interest is endowed with wealth level W̄ and has a target wealth bracket [W̄_l, W̄_u]. Given the target wealth bracket and current wealth, we can define the multi-period utility U(C, 𝒮, C′, 𝒮′). The full optimization problem is:

max over {C, S, I, C′, S′, I′} of  U(C, 𝒮, C′, 𝒮′) = U(C, 𝒮) + β E[U(C′, 𝒮′) | I]    (9)

subject to (2), (3), (1), (6), and (7).

2.6 Remarks

First, we assume the utility function to be separable in C and 𝒮, but in reality this does not have to be so. Suppose that the optimal choice arising as the solution to (9) picks C and 𝒮 := (S, I). Ultimately, 𝒮 has to be realized through some act of purchase in the market, and in the end an additional unit of 𝒮 could impact the marginal utility of C in either direction. If an additional unit of 𝒮 increases MU_C, it would suggest that 𝒮 and C are complements, and a decrease would likewise imply that they are substitutes. The literature has unfortunately argued for a case in each direction. On the one hand, insofar as 𝒮 takes the form of a luxury good, it can displace the utility of the ordinary consumption good; for instance, the purchase of a luxury car can render commuter rides less useful. On the other hand, higher status can necessitate complementary consumption; for instance, a luxury house must be complemented by a decent outfit or gardening services.[6] As there is no consensus in either direction, we chose separability, which also helps simplify the analysis.

[6] For a good review, see Becker et al. (2005).

Second, modelling status goods (𝒮) as vectors in ℝ² may seem slightly unconventional and perhaps warrants an elaboration. Suppose an agent chooses 𝒮 := (S, I). In reality, the optimized choice (S, I) may be realized by purchasing a basket of status goods in the market with different (S, I) profiles: a luxury car, a house in a decent neighborhood, and membership in a country club. In particular, we assume linearity in the composition of the basket. For example, if we restrict the basket to two goods, any two status goods embodying (S_1, I_1) and (S_2, I_2) such that (S_1, I_1) + (S_2, I_2) = (S, I) work, allowing us to add status goods just as we would add vectors in ℝ². We recognize that the composition structure may be non-linear in the full complexity of reality, but we chose this additive structure because our aim is to model the agent's choice rather than the composition of that choice.

3 The Reduced Problem

Solving (9) may seem onerous given the multitude of choice variables (C, S, I, C′, S′, I′) to be determined. So we first deal with a special case: a single-period problem where the market for the investment component of status (I) is shut down, automatically fixing I = 0. We call this the reduced problem, which is to find:

U*(W) = max over C of U(C, W - C),  for all W,    (10)

where U(·) is the utility function defined in (5).
Roughly speaking, the reason why (10) helps is that both the investment decision (i.e., how much to invest) and its outcome (i.e., whether or not it successfully increases the next-period budget) are sufficiently captured as variations in W. Hence, once U*(W) is known contingent on a value of W, the only remaining task is to determine I based on U*(W), effectively reducing (9) to:

max over I of  U*(W̄ - I) + β E[U*(W′) | I].    (11)

Thus, the full problem (9) is amenable to a two-step reduction: first (10), then (11). See the Appendix for a more detailed description of this breakdown. Intuitively, we can understand U*(W) as the utility function of a locally status-aware individual when the market for I is incomplete. Thus U*(W) has significance in its own right - over and above its value as an intermediary bridge to the full solution - insofar as the market for I is more sparse than the market for S, as is likely the case in reality. We first establish the solution for the case where φ is a step function (φ_s), then generalize to the case of mollified step functions (φ_ε).

3.1 Solution to the Reduced Problem Under the Step Function Representation (φ_s)

Underneath any U*(W) value are the optimized choices of C and S that support it. For a given W, let {C*(W), S*(W)} be the optimal choices of C and S in the reduced problem, so that U*(W) = U(C*(W), S*(W)). When we use the step function representation (φ_s), there is a simple way to characterize these choices.

3.1.1 Characterization of C*(W) and S*(W)

The slope of the CRRA utility function decays quickly in its argument (C), whereas this is not the case for S, as graphed in Figure 2 under the 'smooth-pasting' assumption. Intuitively, it is thus clear that C*(W) will be bounded from above. Proposition 1 specifies this bound (depicted as C̄ in Figure 3 below) and in fact goes on to characterize the solution completely.

Proposition 1. Consider f(C; k) = u(C) - (m/L)(C - k). Let k_L be the (unique) value of k such that f(C; k_L) has two roots C̲ and C̄ with C̄ - C̲ = L. Then C* ∈ [C̲, C̄).

Figure 3: u(C) and (m/L)(C - k_L), where L = m = 3

Proposition 1 confines the optimal C*(W) to an interval of length L: [C̲, C̄). This confinement is already powerful enough to characterize {C*, S*} completely for any given W and L. To see why, first let G := {W_mid1, W_mid2, W_mid3, ...} be the points of stepwise increments of φ_s, each consecutive element being a distance L apart. Note that S* can only take values in G. This is because any deviation from these choices, say Ŝ, is strictly dominated by the alternative feasible choice of the largest element of G below Ŝ, with the difference (Ŝ - W_mid) re-allocated to consumption (C). Also, by construction of k_L, the interval [C̲, C̄) is of length L, so determining the point S* ∈ G fixes a point in [C̲, C̄) unequivocally. Pulling these together, we conclude that S* is the largest element of G that the agent can feasibly choose (W_mid) given W and the restriction that C* ∈ [C̲, C̄), and the rest is allocated to C*.

3.1.2 'Resetting' the Marginal Utility of Consumption

The confinement of C* is intuitive in the context of the diminishing marginal utility of consumption, as opposed to status concerns, which persist as long as there are 'peers' to compare against. It is then natural to see agents allocating resources to desires that persist rather than to those that fade away; we return to this point right after the short numerical illustration of Proposition 1 below.
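As a rough numerical companion to Proposition 1, the sketch below computes k_L and the roots C̲ and C̄ for one particular parameterization. Since both roots lie on the same line of slope m/L and are a distance L apart, the lower root must satisfy u(C̲ + L) - u(C̲) = m, which reduces the construction to a one-dimensional root-finding problem. The CRRA curvature γ is an assumption of ours (the text fixes only L = m = 3 in Figure 3), so the numbers are purely illustrative.

```python
import numpy as np
from scipy.optimize import brentq

gamma, m, L = 0.5, 3.0, 3.0      # L = m = 3 as in Figure 3; gamma is an assumed value

def u(c):                        # CRRA utility over consumption
    return c**(1.0 - gamma) / (1.0 - gamma)

# Both roots of f(C; k_L) = u(C) - (m/L)(C - k_L) sit on the same line of slope m/L and
# are a distance L apart, so u(C_lower + L) - u(C_lower) = m pins down the lower root.
g = lambda c: u(c + L) - u(c) - m
C_lower = brentq(g, 1e-9, 50.0)          # g is strictly decreasing by concavity of u
C_upper = C_lower + L
k_L = C_lower - (L / m) * u(C_lower)     # the line passes through (C_lower, u(C_lower))

print(C_lower, C_upper, k_L)             # Proposition 1: C*(W) stays in [C_lower, C_upper)
```

For these values the band works out to roughly [0.06, 3.06); any wealth beyond what is needed to keep C inside this band is channelled into the signal S.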
This oers a possible explanation as to why individuals who seem to have accumulated enough wealth - where the marginal utility is vanishingly low according to standard (e.g., CRRA) utility - still seem engrossed in ceaseless pursuit for more wealth: even the richest never run out of `peers' to compete their status against. At a more granular level, Proposition 1 works through the process of `resetting the marginal utility of consumption'. Figure 3 shows that (C;C) - the roots of f(C;k L ) - describe the point where consumption and status are interchangeable. This means that the agents are indierent between the following (C;S) bundles, both available with wealth level ~ W : (C; ~ WC) (C; ~ WC) That is, the agent is indierent between substituting a high level of consumption (C) to a lower level of consumption (C) as long as she can get compensated by a reciprocating ascent in status going from S = ~ WC to S = ~ WC: The structure of f(C;k L ) ensures that these are feasible, and exactly compensating variations. Given this possibility, the agent engages in this substitution every time her wealth level allows her to do so (W ~ W ). This is because by doing so, she can reset her marginal utility of consumption to a higher level (u 0 (C)>u 0 (C)) given any W ~ W , so that her utility will dominate the utility when the substitution has not been made. Therefore, the optimal consumption C is held within the bounds of [C;C). Figure 4 succinctly illustrates the process of substitution and the consequent `resetting' of MU c . Whenever the wealth level W allows the agent to do so, she switches the consumption of amount L for an equal amount of S, and the marginal utility of consumption jumps, leading to a new `carapace'. Graphically, the optimizedU (W ) emerges as we keep inscribing triangles (of height m 137 and length L) into the CRRA utility function. Figure 4: Resetting Marginal Utility of Consumption by Substituting for Signal (S) 3.1.3 Remarks ConsiderW i 2 W i l ;W i u , namely, the case where agent's wealth level falls within the target bracket. In some cases this may mean thatS i <W i l . That is, some agents would be forced to signal a wealth bracket below their own. This feature arises because signaling detracts from consumption, which can be very costly in utility terms if their wealth level is not suciently large. These agents could optimally choose to signal down a bracket, if their wealth level is not sucient to signal their own bracket. 7 As a real world example, consider the purchase of a house in a decent neighborhood. Housing provides some consumption value, but the location and size also often signal the dweller's wealth, functioning as an S good in our model. While it may be the case that signaling an agent's own target wealth bracket requires the full purchase of a house, she may still optimally choose to scale down to a less opulent neighborhood and buy a luxury car to signal a bracket below the target instead, if the agent nds that the purchase of a house will force her consumption level to be 7 Of course, there could be other cases whereW i is high enough, so that the agent could optimally choose to signal her own target bracket while also enjoying an acceptable amount of consumption. This could happen if the agent's wealth, for example, is close to the upper end of the bracket, W i u . 138 intolerably low. The agent will optimally choose which bracket to signal by weighing the marginal utility of consumption (cost) against the satisfaction drawn from signaling (benet). 
One may wonder whether the propensity to signal below the target wealth bracket inherits from the sharpness of s , in the sense that the discreteness of G :=fW mid1 ;W mid2 ;W mid3 ;g may coerce the agents to make an involuntary choice to signal downwards if the wealth level falls in the discrete interval before the next jump can be made. An obvious way to explore this possibility is to dull away the sharpness by mollifying s . Also, we have already seen (in Proposition 1) thatS2G in the current step-function representation. That is, the choice of S is made in discrete bulks, occurring precisely at the points of increments of s . This seems obvious since s increases sharply in steps. A natural question is whether this feature holds even when s is mollied to the more realistic . Mollication is therefore a worthwhile pursuit, which is the focus of the next section. It turns out that it also reveals several interesting choices made by status-conscious individuals. 3.2 Solution to the Reduced Problem Under Mollied Representation ( ) As per our previous discussion, we mollify s to get and explore the solution to the reduced prob- lem U (W ) under this representation. This will generalize the results in the previous subsection. Going forward, we assume that the mollication is not `too strong' in the following sense. 3.2.1 Assumption On the Strength of Mollication Assumption 1. Considerf(C;k) =u(C) m L (Ck) as in Proposition 1. LetkL 2 be the (unique) value of k such that f(C;kL 2 ) has two roots C1 2 and C1 2 such that C1 2 C1 2 = L 2 . Assume that `' is suciently small, such that the following holds: jm 00 (W mid )j>ju 00 (C1 2 )j: The lefthand side of Assumption 1 determines the curvature of , whereas the righthand side determines the curvature of the CRRA utility over consumption. Note that on the lefthand side, the parameter controls how strong the step function ( s ) is mollied. On the one extreme, (!1), the step function is so mollied that it becomes a straight line with slope m L . On the other extreme (! 0), the step function does not get mollied at all. The role of this assumption is to put an upper limit on , so as to ensure that the curvature of the mollied step function does not become so dull that its impact on the utility function is overshadowed by the impact on CRRA utility over consumption. That is, we want the uctuations from status concerns to be at least as `interesting' as those arising from consumption. This is not a strong requirement: for example, 139 the level of mollication exemplied in Figure 2 very safely meets Assumption 1. Even when this assumption does not hold, we still observe the eects we document, but their magnitudes tend to vanish as!1. Given Assumption 1, the eects become more conspicuous, and proofs are easier. 3.2.2 Characterization of C (W ) and P S (W ) GivenW , letC (W ) andP S (W ) denote the optimal choices such thatC +P S W 8 . Propositions 2 - 5 are mollied analogues of Proposition 1, and jointly help characterize C P S . But even taken separately, they each carry economic meaning on their own. Proposition 2 (Equal Marginal Principle). C and P S satisfy u 0 (C ) =m 0 (P S ). Proposition 2 is the familiar `equal marginal principle' from microeconomics. It tells us that the agent will allocate C and P S so that the marginal benet from consumption (MB C ) and status (MB S ) are equalized, much in the same way we model the optimal choice between apples and bananas in microeconomics. In particular, decisions are based on marginal considerations. 
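Assumption 1 above limits how aggressively the step function may be smoothed. The exact mollifier is deferred to the Appendix, so the sketch below uses Gaussian smoothing of the staircase purely as an illustrative stand-in: as the smoothing width ε grows, the curvature of m·φ_ε around the midpoints (roughly the left-hand side of Assumption 1) shrinks, and the function flattens toward the straight line of slope m/L mentioned in the text. All numerical values and the placement of the midpoints are assumptions of ours.

```python
import numpy as np
from scipy.special import erf

m, L = 1.0, 2.0                       # illustrative parameters
mids = L / 2 + L * np.arange(20)      # assumed midpoints W_mid1, W_mid2, ..., spaced L apart
s = np.linspace(0.0, 10.0, 4001)

def phi_eps(s, eps):
    """Gaussian-smoothed staircase: a stand-in for the mollified step function phi_eps."""
    return sum(0.5 * (1.0 + erf((s - wm) / (np.sqrt(2.0) * eps))) for wm in mids)

for eps in (0.05, 0.3, 2.0):
    f = m * phi_eps(s, eps)
    curvature = np.gradient(np.gradient(f, s), s)   # numerical second derivative
    slope = np.polyfit(s, f, 1)[0]                  # overall slope of the smoothed staircase
    print(f"eps={eps}: max |(m*phi_eps)''| = {np.abs(curvature).max():6.2f}, "
          f"fitted slope = {slope:.2f} (m/L = {m/L:.2f})")
```

The three printed lines trace the two extremes discussed in the text: a small ε leaves the curvature large (close to the unmollified step), while a large ε drives the curvature toward zero and the fitted slope toward m/L, which is the over-smoothed regime that Assumption 1 is designed to rule out.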
Now, suppose further that Assumption 1 holds. Proposition 3 (Big Fish Preference). P S 2 (W mid ;W U ). Recall thatW mid denotes the point of status gain in a generic local target wealth bracket, and W U denotes the upper end of the bracket, as depicted in Figure 1. Hence the blue curves in Figure 5 depict the candidate locations ofP S outlined by Proposition 3. Therefore, Proposition 3 eectively states that given an equally feasible choice of (1) signalling higher than theW mid of a lower bracket versus (2) signalling lower than the W mid of a higher bracket, the agent would choose the former, even if choosing (2) signals higher status. In other words, the agent would always choose to be the \big sh in a small pond (lower bracket)", rather than a \small sh in a big pond (higher bracket)". One may wonder why this would be the case, since after all, a \bigger pond" dominates status in the \small pond" even if the agent is a \small sh" in it. While it may be mysterious at rst, this `big sh preference' simply stems from the fact the extra pleasure the agent derives from signalling that they are \(the small sh) in a big pond" is not big enough to outweigh the satisfaction she would have drawn from settling for a signal that they are \(the big sh) in a small pond" and diverting the resources to consumption instead. In other words, the pleasure of signalling that they are inferior toW mid - even if it is in the higher bracket - is not quite enough to overcome the utility loss in consumption required to do so. The `big sh preference' is ultimately a re ection of the fact 8 Recall that we are assuming (1): P S = S . We switch to denote the task as characterizing the choice of PS (instead of S) to emphasize the fact that choices can be made over the entire set of signalling goods earmarked by price. This choice was rather trivial in the stepwise representation. 140 that agents in this model have peers to rival and win against, and that they wish to signal when they win. Finally, note that we are still in the reduced problem. When we complete the I-good market in the full problem, an interesting twist is added to the `big sh preference'. Proposition 4 (Saving Every Penny to Make Status Jumps). P S is everywhere strictly increasing. Combining Proposition 4 with our earlier observation - thatP S is located on disjoint (closed) inter- vals as in Figure 5 - meansP S must jump at certain points asW increases. That is, if the candidate solution space is disjoint, and increasing throughout, it must be the case that in some cases, the increment must eventually be discrete. This is depicted in Figure 5 as the upward jumps over the blue curves. This means that somewhat surprisingly, even when the is mollied and agents are obeying the `marginal' principle as specied in Proposition 2, the choice of status good are made in discrete bulks 9 . At a fundamental level, this counter-intuitive result shares the same root as the big sh preference (Proposition 3) - the human desire to be a `winner' in the status game, rather than a `loser'. This desire to win compels agents bypass the `losing' region and jump straight to the next `winning' region, when the wealth level W hits a trigger point 10 . A natural question to ask would be \what is funding this jump in status?" Note that Proposition 3 restricts optimal choices (P S ) to the concave parts of . Then, by Proposition 2, the corresponding C must share the same slope on u(C), thereby restricting C to the red portion of u(C) on the lower panel. 
11 Furthermore, given that the budget constraint is binding (hence, C =WP S ), the discontinuity ofP S (W ) would imply thatC (W ) is also discontinuous. In other words, the jump in status would have to be funded by reciprocating downward jumps in consumption, every time the status jump is made. The recurrent downward jumps on the red curve in Figure 5 illustrate this conjecture, which is in fact shown to be true in Lemma 2 that follows. This renders an explanation as to why we sometimes economize ambitiously before making bulky consumption decisions, such as the purchase of a house, even when there is clearly a more comfortable alternative of settling for a more humble choice that relieves us of the pains of economizing. In the context of the model, this is an agent making a jump for the next level of status (blue curve) while at the same time 9 If Assumption 1 fails, this may not be true; there may be cases where the discrete jumps do not occur. It is interesting to note, however, that even when Assumption 1 does not hold, agent will still `sail through' the (W mid ;WU ) interval slower than they would on the (WL;W mid ) interval. This means that in spirit, agents still `jump' over the losing region even when the strength of mollication is beyond the tolerance level of Assumption 1. 10 Mathematically, the discreteness is simply due to the periodically alternating sign on the second derivative of , which means that in order to stay on the regions where the second order condition holds, the optimal choices will have to `jump'. 11 To be drawn more precisely, the red portion should be wider to honor the equal slope restriction, but since the jumps must be identical in size (in opposite directions) the candidate solutions cannot be too wide. This will become more evident in the upcoming sections. 141 economizing to facilitate that very jump (red curve). The standard utility functions (e.g., CRRA) would not be able to reconciling these choices as it would simply dictate that the agent would invest in a equity or perhaps a REITs fund if it provides a high enough return on average. Figure 5: Solution Space Reduced by Propositions 2 - 4 Proposition 5 (Status Creeps). Let S s denote the size of S-good jumps in the unmollied problem and let S denote the size of S-good jumps in the mollied problem. Then S < S s L. Proposition 5 simply says that the size of jumps are smaller in the mollied case. Consequently, this implies that there must be continuous increments (`status creeps') to compensate for the smaller size of jumps in the mollied case (Corollary 1, Appendix). A numerical example veries this 12 . Figure 6 below depictsC (W ) in blue andP S (W ) in red, where the left panel pertains to the step- wise s and the right panel pertains to the mollied . Indeed, we can see that status (P S (W )) `creeps' for a while (Proposition 5) before it `jumps' (Proposition 4) in the mollied case (red line, right panel), whereas the increment is completely discrete in the stepwise case (red line, left panel). 12 Figure 6 also veries other Propositions. In particular, it shows that: (i) P S (W ) is monotone increasing every- where, (ii) continuously increasing almost everywhere, yet (iii) makes discrete jumps that are `funded' by drops in C (W ). 142 Figure 6: C (W ) and P S (W ) with m = 1 and L = 2 Propositions 4 and 5 are realistic. Recall that the jumps of P S (W ) represent a jump to the `bigger pond' motivated byG =:fW mid1 ;W mid2 ;W mid3 ;g. 
The status literature has argued that wealth brackets typically compete in their own signature signalling goods 13 . Suppose that the signature signaling goods at these points are, respectively: S class =fjewelry; luxury car; house;g: Proposition 4 describes leaps to the next class of goods and the economizing required to do so. On the other hand, Proposition 5 represent dierentiation within each class of signaling goods. That is, Proposition 5 implies that agents will buy dierent signaling goods even within each class - for example, a BMW 3-series vs 5-series - depending on their level of wealth. Intuitively, this realism is a consequence of the fact that in the mollied case, the utility from S is no longer concentrated singularly at W mid as in the stepwise representation, but rather `fanned out' around W mid . This means that the utility from S is perceived incrementally, which translates to incremental choice of S-goods, even within the same class of status goods. 13 For example, Charles, Hurst and Roussanov (2009). 143 3.2.3 Characterization of U (W ) Given the characterization of C (W ) and P S (W ), the next task is to nd the mollied analogue of Figure 4 using Lemmas 1-2 below. The general message is that the utility function (U (W )) retains the rough shape of Figure 4, even in the mollied case. Lemma 1. Given W, let C (W ) and P S (W ) denote the optimal choices such that C (W ) + P S (W )W , and let U (W ) :=U(C (W );P S (W )). Then (i) U (W ) is continuous in W (ii) @U (W ) @W is not continuous everywhere. In particular, there exists W: a partition on the pos- itive real line (R + ) with equal lengths L, such that each element of W holds a unique point of discontinuity of @U (W ) @W . Lemma 1 essentially tells us the U (W ) is continuous but not smooth, since its slope is not ev- erywhere continuous. Moreover these points of slope discontinuities are spaced approximately L apart 14 . And the following Lemma helps us pin down what happens in these slope discontinuities. Lemma 2. Let W D :=fW 1 ;W 2 ;g be the points of slope-discontinuity in Lemma 1, and let W C :=W C D , i.e., the slope-continuous domain of U (W ). Then, (i) On W C , C (W ) and P S (W ) are both increasing in W. (ii) On W D , P S (W ) is increasing in W , and C (W ) is decreasing in W . Intuitively,W C is the `smooth' domain over which - according to Lemma 2-(i) - agents increase both C andS in response to marginal increments ofW . Furthermore, it shown in the Appendix (Lemma 4) that the slope of U (W ) in this smooth region (W C ) essentially inherits that of MU C ; 15 hence U (W ) much resembles the familiar CRRA utility function on W C . On the slope-discontinuous points, (W D ) Lemma 2-(ii) tells us that P S increases. By denition of W D , the increment is a jump with a corresponding downward jump in C to honor the budget constraint. Just as in the unmollied case,MU C `resets' itself asC funds the ascent to the next level of status and this process reiterates. Hence, we recover the broad shape of Figure 4, even in this case with the mollied : 3.3 A Numerical Example We provide a numerical example with m = 1 and L = 2, depicted in Figure 7. The blue solid line is the U (W ) with the step function s , denoted as U s (W ). The red dashed line is the U (W ) with the mollied step function , denoted as U (W ). 
The numerical example veries many of 14 Because the points of discontinuity are spaced apart, they only constitute a measure-zero set, and hence @U (W) @W is in fact continuous almost everywhere, but not everywhere. 15 Also, MUC equals MUS by Proposition 2. Hence, MUC =MUS = @U (W) @W . 144 our analytic results and in particular, the two graphs look much alike. Figure 7: U s (W ) (blue, solid) and U (W ) (red, dotted) with m = 1 and L = 2 We focus on the red line, U (W ). When the wealth level is very low, the agent allocates all her wealth to consumption. This consumption is essentially the subsistence level. Before long - in fact, beforeW increases by an amountL according to Lemma 1 - agents get `bored' of the low marginal utility of consumption. That is, agents derive incrementally lower pleasure from consuming high- end goods with little or no signaling value such as organic food, high quality cosmetics, electronic gadgets, etc. Lemma 2-(ii) tells us that once W reaches a threshold level (W mid1 2W D , which in Figure 7 occurs at aroundW mid1 = 1:5), agents escape the boredom by leaping to the next level of P S by sacricing some ofC (`status jump', Proposition 4). This is a discrete jump to a higher level of status funded by lower level of consumption. Eectively the agent is saving up every penny to buy a house, or perhaps more modestly, a designer bag. Given the more immediate need for status (P S ), agents endure the high marginal utility of consumption (MU C ) as they make the jump (`equal marginal principle', Proposition 2). AsW increases andC increases along with it, the pain of high MU C begins to fade away. The agent also allocates more resources to a more signicant S good (for example, the move from Mercedes-Benz C-class to E-class), while simultaneously consuming more (`status creep', Proposition 5). This simultaneous increment ofC andS (onW C ) continues until the marginal utility becomes so low that agents ponders another `jump', perhaps now to a luxury home at around W = 3:5 in Figure 7. Again, this is the process of `reseting the marginal 145 utility' of consumption, and as MU C resets, the slope of U (W ) ratchets up as well. Thus we pictorially recover the shape predicted by the Lemmas in the previous section. And ultimately in our model, the decision to jump is made by comparing the gain from jumping to the next `winning' region against the pain of reducing consumption in order to achieve it. AsC hits a threshold point, joining the next `winning region' is perceived more desirable compared to the `boredom' from C, and the jump is made. This aligns with our everyday perception of life: when we achieve a goal, the attainment may at rst provide contentment, but with ow of time (and wealth) we tend seek new jumps to aspire to. As predicted by the Lemmas, U s (W ) and U (W ) graphs share a common overall shape but are nevertheless, not identical. One dierence is that the magnitude of slope change at W D is smaller under the mollied case. This represents the fact that size of the jumps of P S (W ) is smaller in the mollied case (Proposition 5), and hence so is the change in C (W ) required to fund the jump. Another dierence is the starting point of the jump. The jump locations ofU (W ) is located to the right of U S (W ). This is intuitive since mollication, by denition, dulls the impact of exceeding the midpointW mid , thus it takes a higher wealth level to initiate a status jump under the mollied case. 
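In the spirit of the numerical example above, the reduced problem can be traced out with a brute-force grid search: for each wealth level W, split the budget between consumption and the signal, and keep the split that maximizes u(W - P_S) + m·φ(P_S). The text fixes m = 1 and L = 2; the CRRA curvature γ, the smoothing width ε, and the logistic form of the smoothed staircase are assumptions of ours, so the output should be read only as a qualitative reproduction of Figures 6 and 7.

```python
import numpy as np
from scipy.special import expit          # numerically stable logistic function

gamma, m, L, eps = 0.5, 1.0, 2.0, 0.2    # m = 1, L = 2 as in the text; gamma, eps assumed

def u(c):                                # CRRA utility over consumption
    return c**(1.0 - gamma) / (1.0 - gamma)

mids = L / 2 + L * np.arange(30)         # assumed midpoints W_mid1, W_mid2, ..., spaced L apart
def phi(s):                              # logistic-smoothed staircase, standing in for phi_eps
    return sum(expit((s - wm) / eps) for wm in mids)

W_grid = np.linspace(0.2, 8.0, 400)
C_star, PS_star, U_star = [], [], []
for W in W_grid:
    PS = np.linspace(0.0, W - 1e-6, 2000)        # the budget binds, so C = W - P_S
    val = u(W - PS) + m * phi(PS)
    j = int(np.argmax(val))
    PS_star.append(PS[j]); C_star.append(W - PS[j]); U_star.append(val[j])

# For suitable parameters, PS_star creeps upward in W and occasionally jumps, with C_star
# dropping at the same wealth levels: the 'status creep' and 'status jump' pattern above.
```

Plotting C_star and PS_star against W_grid should give curves in the same spirit as Figure 6, and U_star in the spirit of Figure 7, although the exact jump locations depend on the assumed γ and ε.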
4 The Full Solution As previously discussed, we can reduce the full problem (9) considerably using the solution to the auxiliary problem. The two-period problem is now reduced to solving (11), which amounts to nd- ing the optimal investment I at time t. The other control variables of the full problem, P S (W ) andC (W ), are automatically determined as solutions toU (W ). This allows us to focus solely on the task of determining I . More explicitly, we rewrite (11) under (2) and (3) using the auxiliary problem as: max P I 2[0;L] U (WP I ) + P I L U (W +L) + (LP I ) L U (W ) (12) The rst term represents the (maximized) utility at time t, given an investment level. The second term represents the maximized expected utility at timet 0 given the investment level, discounted by . While choosing a higher level of P I decreases utility at t in the rst term, it compensates this loss by increasing the expected utility att 0 , through an increased likelihood of success on the status investment ( P I L ). Once P I is chosen, the auxiliary problem species the remaining choices, and 146 hence completes the entire choice prole of the agent;S := I S = P I P S 2 R 2 and the optimal consumption C . 4.1 A Numerical Example The lower panel of Figure 8 pictorially describes the choice of P I (W ) for (12) on each point of W using the values m = 1 and L = 2. For simplicity, the graph is drawn using the stepwise representation ( s ), however the general shape is essentially the same even with the mollied . The upper panel is the solution to the reduced problem, U s (W ). Figure 8: U (W ) and P I (W ) with m = 1 and L = 2 The shape ofP I (W ) shows that there are two `regimes' of I : the linearly increasing regime (from aboutW = 2 toW = 3:8) and the stagnantly low regime following a drop (from about W = 3:8 to W = 4). To understand what the two regimes represent, it is helpful to rst recall the underlying actions on the reduced problem. When investment is not an option (i.e., on the reduced problem), the agents had to fund their jump in status by reducing consumption. While this is the optimal choice for the givenW , it also entails the pain of high marginal utility of consumption as they make the jump. When given the choice to invest in status, agents respond by mitigating this high-MU C . That is, in Figure 8, the segments ofW (domain) whereP I is highest coincide with segments ofW where @U (W ) @W is high. By Lemma 4 (in the Appendix, or essentially by Envelope Theorem) these are the points where MU C is high. This reveals that agents (when endowed with possibility of I) 147 substitute high S for high I: on the domain where @U (W ) @W is high, agents respond by demanding higher I, sacricing S in order to fund the high I. Eectively, I allows agents to invest in future status, albeit probabilistically. When given this possibility, the agent (whose wealth level falls in this domain) chooses probabilistic status instead of buying it outright. In everyday parlance, this is the case where an agents ponders purchasing a status good that would help exude her high wealth level, but realizes that doing so would come at a cost of painfully reducing consumption. In this case, the agent -instead of buying the good outright - postpones the purchase and instead, invests in activities that are less costly (so as to avoid high MU C ) but could potentially help her attain high status tomorrow, eectively making the quest for status probabilistic. 
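The investment choice in (12) can be sketched in the same brute-force way: first tabulate U*(W) from the reduced problem, then, for each W, search over P_I ∈ [0, L] for the value that best trades current utility against the β-discounted lottery over next-period brackets. Again, γ, β, ε and the logistic smoothing are illustrative assumptions; the sketch is meant only to show how the two regimes of P_I*(W) described above could be located numerically.

```python
import numpy as np
from scipy.special import expit

gamma, m, L, beta, eps = 0.5, 1.0, 2.0, 0.95, 0.2   # m = 1, L = 2 from the text; rest assumed

u = lambda c: c**(1.0 - gamma) / (1.0 - gamma)      # CRRA utility over consumption
mids = L / 2 + L * np.arange(30)                    # assumed midpoints, spaced L apart
phi = lambda s: sum(expit((s - wm) / eps) for wm in mids)

# Step 1: reduced-problem value function U*(W); the grid extends at least L past the
# range where (12) is evaluated, so that U*(W + L) stays on the grid.
W_grid = np.linspace(0.2, 12.0, 600)
U_star = np.empty_like(W_grid)
for i, W in enumerate(W_grid):
    PS = np.linspace(0.0, W - 1e-6, 1500)
    U_star[i] = np.max(u(W - PS) + m * phi(PS))
U_of = lambda w: np.interp(w, W_grid, U_star)       # linear interpolation of U*

# Step 2: for each W, solve (12) over P_I in [0, L] (and no more than current wealth).
W_eval = W_grid[W_grid <= W_grid.max() - L]
PI_grid = np.linspace(0.0, L, 201)
PI_opt = np.empty_like(W_eval)
for i, W in enumerate(W_eval):
    PI = np.minimum(PI_grid, W - 1e-6)
    obj = U_of(W - PI) + beta * ((PI / L) * U_of(W + L) + ((L - PI) / L) * U_of(W))
    PI_opt[i] = PI[np.argmax(obj)]

# Plotting PI_opt against W_eval is one way to look for the rising-investment regime and
# the subsequent drop to (near) zero once the outright purchase of S takes over.
```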
In the context of \choosing the right pond", the availability to choose an I-good eectively means that the agent now chooses to be the \small sh in the big pond". Indeed, whereas Proposition 3 dictates that the preference is unilaterally to be a \big sh", when endowed with I-goods, agents aim for the next bracket probabilistically, which means that in expectations, the agent is in the next bracket, or is already the \small sh in a bigger pond". Yet once the gamble is realized, the outcome is binomial, and the agent is always the \big sh", either in the lower or the higher bracket. 4.2 Status-seeking Across the Wealth Cycle Given this, we can track an agent through a wealth cycle. (Or equivalently, understand the cross- sectional choice of status goodsS2 R 2 depending on the wealth level of agents.) Suppose the agent has wealth level of W where @U (W ) @W (= MU C ) would have been high without the option to invest. Note that at this point, the agent would have been choosing a high level of P S , since the high MU C implies that the status jump has been made. Now, given the choice to invest (I), the agent chooses high investment P I instead of high P S . P I keeps increasing until at some point the agent feels condent enough to buy the signaling status good (S) outright. At this point, status- seeking is no longer probabilistic, and P S completely replaces P I , causing the drop in P I . Once the outright purchase is made, the agent feels content, and P I is stagnant (zero). Perhaps this is the `heyday' of her life: the agent has enough wealth level to consume a comfortable amount, and at the same time feel securely endowed with status. As income further rises, the MU C decreases and this `boredom' encourages the agent to look for the next step. However, the next step in the stairway of status is too remotely costly for the agent to attain. Hence the agent optimally responds by `investing' in status. At rst, the investment is modest. If we think of the agent's position as a `barbell strategy' between ordinary consumption (C) and the investment component of status 148 (I), the agent is at rst only modestly investing into the next level of status, and heavier weight goes to ordinary consumption. But as wealth level increases, the investment component of status (linearly) increases and becomes a more serious component, and the barbell begins to tilt more to the investment side. If wealth increases suciently, the investment component of the barbell is completely replaced by an outright purchase of status (S). 149 5 References [1] Ball, S., Eckel, C., Grossman, P., and W. Zame (2001): \Status in Markets". Quarterly Journal of Economics, 116, 161-188. [2] Becker, G., Murphy, K.M, and Werning, I (2005): \The Equilibrium Distribution of Income and the Market for Status". Journal of Political Economy, 113, 282-310. [3] Charles, K., Hurst, E, and Roussanov, N. (2009): \Conspicuous Consumption and Race". Quarterly Journal of Economics, 124, 425-467. [4] Diecidue, E., and J. van de Ven (2008): \Aspiration Level, Probability of Success and Failure, and Expected Utility". International Economic Review, 49, 683-700. [5] Duesenberry, J. (1949): Income, Saving and the Theory of Consumer Behavior. Harvard Uni- versity Press [6] Friedman, M., and L. J. Savage (1948): \The Utility Analysis of Choices Involving Risk". Journal of Political Economy, 56, 279-304. [7] Heetz, Ori, and Robert H. Frank (2011): \Preferences for Status: Evidence and Economic Implications." In Jess Benhabib, Matthew O. 
Jackson and Alberto Bisin editors: Handbook of Social Economics, Vol. 1A, 69-91 [8] Hopkins, E., and Kornienko T. (2011): \Inequality and Growth in the Presence of Competition for Status". Economic Letters, 93, 291-296 [9] Lee, S., Zapatero, F. and Giga, A (2019): \Rolling the Skewed Die: Economic Foundations of Preference for Skewness". Working Paper. [10] Ray, D. and Robson, A. (2012): \Status, Intertemporal Choicem and Risk-Taking". Econo- metrica, 80, 1505-1531. [11] Roussanov, N. (2010): \Diversication and Its Discontents: Idiosyncratic and Entrepreneurial Risk in the Quest for Social Status". Journal of Finance, 65, 1755-1788. 150 Appendices H Mollier TBA I Reduced Problem The reason the reduced problem helps is because of the structure of the full problem (9). At time t 0 , optimal investment (I 0 ) is 0, because it is the nal node. The only remaining strategy variable at time t 0 is therefore C 0 , since S 0 will then be automatically decided (S 0 = W 0 C 0 ). Hence, contingent on a given W 0 (which is either W , or W +L at t 0 ), U (W 0 ) gives us the maximized utility at time t 0 , as per its denition. Now, rolling back one period, the strategy variable that aects the realization ofW 0 (att 0 ) isI at timet. But for the purpose of nding the optimal solution at time t, any given choice of I is observationally equivalent to reducing the endowment ( W ) by exactly that amount, W new = WI, and setting I = 0. Hence, given a specied I , the rst period's maximized utility is U ( W new ) =U ( WI ). That is, for any given I and W , max fC;Sg; given (I,W) U(C;S) =U (WI): Therefore, once we know U (W ) for any given W , the full problem (9) then becomes: max I U ( WI) + E[U (W 0 )jI]; (13) hence it is reduced to simply ndingI , the optimal investment att. The full problem is eectively reduced to a two-stage problem. J Corollary 1 Corollary 1. TBA K Proofs Proof of Proposition 1: We specialize tom =L for tidiness, since the extension is just straight- forward scaling. First, it is easy to show that k R exists and is unique for any given R, a result which is in fact also self evident from Figure 9. 151 Figure 9: u(C) and m L (Ck R ), where L =m = 3 Next, given the k R and the corresponding C and C, suppose that C > C. We will show that this induces a contradiction by suggesting an alternative pairfC ;S g which does better. Let C =C R, whence S =WC +R. It is useful to collect a few facts rst: (i) By construction, u(C) - u(C) = R (ii) By the fact that C C =C C and by concavity, u(C )u(C)<u(C )u(C). Then we compareu , the optimized utility underfC ;S g againstu , the utility underfC ;S g. 152 That is, u =u(C ) +WC =u(C) + u(C )u(C +WC <u(C) + u(C )u(C) +WC =u(C) + u(C )u(C) +WC +R =u(C ) +WC +R =u(C ) +S =u ; a contradiction to the optimality of u . The inequality follows from (ii) above, and the equality that follows re ects (i). Hence we establish that C <C. A very similar line of argument can be used to show that CC . Proof of Proposition 2: Rewriting (10), max fP S g u(WP S ) +m(P S ); (14) and dierentiation by P S yields the rst order condition: u 0 (WP S ) =u 0 (C ) =m 0 (P S ): (15) Proof of Proposition 3: Since U (W ) must at the very least be a local optimum, the second order condition has to hold locally. Namely, u 00 (WP S ) +m 00 (P S )< 0: (16) Given Assumption 1, the rst term of (16) is (in absolute value,) smaller than the second term of (16) Hence, m 00 (P S )< 0 must hold true, therefore, P S 2 (W mid ;W u ): Proof of Proposition 4: For this, we need a Lemma. Lemma 3. 
P S (W ) is injective everywhere. Proof. Suppose not, and letW 1 andW 2 be two distinct wealth levels such thatP S (W 1 ) =P S (W 2 ). But by the budget constraint (which is clearly binding), this implies C (W 1 )6=C (W 2 ), which in turn implies that u 0 (C (W 1 ))6=u 0 (C (W 2 )): By Proposition 2, m 0 (P S (W 1 ))6=m 0 (P S (W 2 )), 153 a contradiction to P S (W 1 ) =P S (W 2 ). Now, consider two cases. Case 1: W 0 2fW :P S (W ) is continuous in Wg. OnW 0 , supposeP S is (continuously) decreasing. This means that given the binding budget constraint, C must be increasing inW atW 0 which in turn means that u 0 (C ) must be decreasing atW 0 . By (i), this implies that m 0 (P S ), is decreas- ing in W . By (ii), P S 2 (W mid ;W u ) where the second derivative 00 () is negative, whence P S is increasing in W , a contradiction. (Note that since () is mollication, it is an innitely contin- uously dierentiable function, hence 00 () is continuous at P (W 0 ), as composition of continuous functions. This means that 00 does not make `jumps' given that we are on W 0 . This is why we could invoke (i) and (ii) to make statements about monotonicity of arguments by looking at the monotonistic behavior of the function itself. Without continuity, we cannot transport the behavior of the function onto the behavior of the argument variables.) Case 2: W 0 2fW : P S (W ) is continuous in Wg c . (Sketch) In this case, P S is discontinuous and hence `jumps' at W 0 . Suppose that P S jumps downwards. By Proposition ??-(i) this implies a reciprocating jump of C upwards, whence the marginal utility of consumption (u 0 (C )) jumps downwards. By Proposition 2, this implies a commensurate downward shift in m 0 (). The only case this is possible is when P S moves away from the current carapace to a lower one. This auto- matically indicates that the size of the jump is bigger than L 2 , and the induced change in `m()' is larger than m 2 . LetJ(> L 2 ) denote the size of jump inP S andC , and letU (> m 2 ) andU C (> m 2 ) denote the associated drops in utility. It can be shown (somewhat tediously) that given Assumption 1, the decreasing jump is never optimal because it can be replaced by not jumping to achieve higher U (W ). (This can be done by comparing the sizes of (U m 2 and U C m 2 under Assumption 1.) Finally, by Lemma 3, we can conclude that the increase must be strict. Proof of Lemma 1: (i) SupposeU (W ) is discontinuous atW 0 . Then9> 0 (not to be confused with the mollifying parameter) such that for any > 0;9W d satisfyingjW d W 0 j<, so that: jU (W d )U (W 0 )j>: (17) Going forward, we choose and x any such , and consider (and x) a sequencef n g such that n > 0 and n ! 0. Also, construct (and x) from Equation (17), a sequencefW dn g such that for 154 each n,jW dn W 0 j< n andjU (W dn )U (W 0 )j>. From these, we construct: G n (W 0 ) := jU (W dn )U (W 0 )j jW dn W 0 j (18) By construction, we know that lim n!0 G n (W 0 ) =1; hence for any positive numberG, there exists anN G 2 N, such thatG n (W 0 )>G whenevernN G . Now, given discontinuity at W 0 , consider the solutions to the auxiliary solution C (W 0 );P S (W 0 ), and its maximized value U (W 0 ). Let: G :=u 0 (C (W 0 )); (19) and choose any n N G , thereby choosing n and W dn (from the previously xed sequencef n g andfW dn g) such thatjW dn W 0 j< n . Now, consider: V (W ) :=u(C (W 0 ) +W 0 W ) +m(P S (W 0 )): (20) Clearly, V (W 0 ) =U (W 0 ). 
Also, by denition of the rst derivative and Equation (19): jV (W dn )U (W 0 ))jGjW dn W 0 j; (21) where the approximation can be made arbitrarily accurate. Since we have chosen nG n we know that G<G n (W 0 ) hence: jV (W dn )U (W 0 )jGjW dn W 0 j <G n (W 0 )jW dn W 0 j =jU (W dn )U (W 0 )j; where the last equality follows from the denition of G n (W 0 ). But this contradicts optimality of U (), because it implies that either U () is not optimal at W 0 or U () is not optimal at W dn . Therefore, U (W ) is continuous everywhere. (ii) We start with a Lemma. Lemma 4. @U (W ) @W =u 0 (C (W )) =m 0 (P S (W )) Proof. The rst equality is by standard application of the envelope theorem, and the second equality 155 is just a restatement of Proposition 2. LetW m L :=fW : @U (W ) @W = m L and W >Lg. ThisW m L will serve asW. We rst note that every ele- ment ofW m L are equi-distance (L) apart. This is from the binding budget constraint (W =C +P S ), Lemma 4, the structure of() and the invertibility of the functionu 0 (). Pick any two neighboring elements of W m L : W i and W i+1 (=W i +L): We proceed in two steps. Step 1) We rst show that there exists at most one point of discontinuity in the interval (W i ;W i+1 ). For this, we borrow the conclusion of Lemma 2-(ii), (which can be proven independent of Lemma 1) thatC (W ) decreases at any point of discontinuity, hence u 0 () increases at any point of discon- tinuity. This implies that for each discontinuous point, there must be at least one corresponding point of in ection on () such that 00 () > 0. But the nature of mollication stipulates that there exists at most one point of in ection per L interval, hence there can be at most one point of discontinuity. Step 2) To show that there exists at least one, we rst note that U (W i+1 ) =U (W i ) +m; (22) and let: U h (W ) :=U (W i+1 ) Z W i+1 W @u(WP S (W )) @W dW; (23) and U l (W ) :=U (W i+1 ) + Z W W i @u(WP S (W )) @W dW: (24) U h (W ) and U l (W ) together almost construct U (W ) completely. This can be seen from the en- velope theorem (Lemma 4) and that given these denitions, U (W i+1 ) = U h (W i+1 ), U (W i ) = U l (W i ), and U (W i+1 ) = U (W i ) +m. (That is, Equations (23) and (24) are antiderivative re- constructions of U from their partial derivatives, with suitable conditions on the endpoints, by construction.) The construction can be completed by observing the following claims: Claim (i):9W 0 2 (W i ;W i+1 ) such that U l (W 0 ) =U h (W 0 ), and Claim (ii): @U l (W ) @W < m L < @U h (W ) @W , in particular, @U l (W 0 ) @W < @U h (W 0 ) @W . 156 Claim (ii) can be obtained by rst noting that u 0 () is everywhere continuous. Then, borrowing from Lemma 2-(i) (which can be proven independent of Lemma 1) the conclusion that C (W ) is increasing in W on continuously dierentiable domain, we can deduce that m L < @U h (W ) @W . Similar argument leads to m L > @U l (W ) @W . Given Claim (ii), Claim (i) can easily be obtained by the Inter- mediate Value Theorem, and Equation (22). Claim (i) informs us that Equation (23) and (24) coincide at W 0 . We know that U (W ) is contin- uous in W (by Lemma 1-(i),) and that U h (W ) decreases continuously from the right and U l (W ) increases continuously from the left, meeting atW 0 . Also, bothU h (w) andU l (W ) representU (W ). Therefore, it must be that: U (W ) =maxfU h (W );U l (W )g: (25) Moreover, by (ii), W 0 is a point of discontinuity for @U (W ) @W . 
Hence, by construction, this veries our claim that there exists a point of discontinuity of @U (W ) @W in the interval (W i ;W i+1 ). Note that becauseW i andW i+1 areL apart, and there can be at most 1 point of discontinuity in anL interval (by Step 1). Therefore, this point of discontinuity (W 0 ) is unique. Proof of Lemma 2: (i) We know thatP (W ) is increasing is from Proposition 4, so it suces to prove that C (W ) is increasing. We rst gather some facts: (a) If W 0 2W C , then C (W ) is also continuous at W 0 . This clear, since C (W ) =WP S (W ). (b) Both u() and () are (at the very least) inC 2 (twice continuously dierentiable functions). (c) On the interval S2 (W mid ;W u ), 00 (S)< 0. (d) u 00 (C)< 0 for any C. From (a) and (b), we can deduce that u 0 (C (W )) and m 0 (P S (W )) are continuous at W 0 , as composition of continuous functions. SupposeP S (W ) is increasing andC (W ) is decreasing at any arbitrary point W 0 2W C (continuously). By (c) and (d), this means that an incremental increase in W leads to incremental decrease in 0 (P S (W )) and incremental increase of u 0 (C (W )). This contradicts Proposition 2. (ii) This is a direct consequence of the fact that the budget constraint must bind: C (W )+P S (W ) = W . Proof of Propositions 5: TBA 157
Abstract
In this dissertation, I explore the possibility of adding rational foundations to what is commonly perceived as behavioral finance or behavioral economics. I begin with minimal yet realistic assumptions on the underlying states or preferences, and build models in which agents rationally respond to these elements. This approach is promising, as it renders plausible explanations for many asset pricing puzzles, generates the preference for skewness that is prevalent across multiple asset classes, and accounts for a rich set of status-seeking activities, all in parsimonious settings.