Essays on behavioral decision making and perceptions: the role of information, aspirations and reference groups on persuasion, risk seeking and life satisfaction
(USC Thesis Other)
Essays on Behavioral Decision Making and Perceptions
The Role of Information, Aspirations and Reference Groups on Persuasion, Risk Seeking and Life Satisfaction

by Andreas Aristidou

A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (ECONOMICS)

December 2020

Copyright 2020 Andreas Aristidou

Acknowledgments

I am grateful to my main advisors Giorgio Coricelli and Arie Kapteyn, who took a chance on me and invested their time, money and resources in someone they didn’t know all that well. I will forever be in their debt and I hope that my work for each has justified their respective decisions. I am also very indebted to Odilon Camara, who took a special interest in nurturing me academically by investing many hours of his time to assist me with my paper and develop my other research ideas, while constantly keeping an eye on my progress and subtly nudging me to do better. I additionally thank Cary Frydman for his incredible comments on two of my papers and Simon Wilkie for his part in my Qualifying Exam. Finally, I give massive appreciation to Yilmaz Kocer, who selflessly invested endless hours discussing my ideas and helping me develop the best paper of my PhD program.

Academically, I have also benefited greatly from conversations with faculty from the Department of Economics, Jonathan Libgober, Matthew Kahn, Jeff Nugent, Juan Carillo, Isabelle Brocas, Yu-Wei Hsieh, as well as countless lunches, dinners and one-on-one meetings with seminar speakers.
Outside of the department, I have had great conversations, classes and research collaborations with students and faculty from the USC Viterbi Department of Computer Science (Milind Tambe, Sarah Cooney, Nathan Bartley, Giorgos Constantinou, Chrisovalantis Anastasiou), Marshall School of Business (Fernando Zapatero, Kristian Yordanov, Ventsi Stamenov, among others), Dornsife Department of Psychology (Morteza Dehghani, Mohammad Atari, Matt Baucum, among others) and Center for Economic and Social Research (Dan Benjamin, Juan Saavedra, Titus Galama, Chelsea Watson, among others).

Completing a PhD program is so much more than academics. I am very grateful for the support I received from staff at the Department of Economics, especially Young Miller, Morgan Ponder and Irma Alfaro, staff at the Office of International Services, the Graduate School, medical professionals at the Engemann Student Health Center and staff at the Lyon Recreational Center. Lastly and most importantly, I will forever be indebted to the best PhD program advisor, Alex Karnazes, whose knowledge, eagerness, helpfulness and calm demeanor were instrumental in assisting and empowering me to navigate the logistics of this program.

I have been lucky to share the past five and a half years with wonderful friends. This program has blessed me with the life-changing friendships of Ali Abboud, Rachel Lee, Bryson Yee, Adamos Andreou, Grigori Frangouridi, Jason Choi, Brian Finley, Rashad Ahmed, Fabrizio Piasini, Jorge Tarraso, Ida Johnson, Mahrad Sharifvaghefi, Monira - The Queen - Al Rakhis, Juan Espinosa, Jake Schneider, Youngmin Ju, Ray Yiwei Qian, Sheryl Weiran Deng, Eunjee Kwon, Jisu Cao, Jeongwhan Yun, Mike Yinqi Zhang, Andrew Yimeng Xie, Qin Jiang, Aleksandar Giga, Georgios Effraimides, Chelsea Watson, Chris Jeong Woo, Sam Boysel, Rajat Kochhar, Clement Boulle, Fatou Thioune, Karim Fajouri, Dario Laudati, and several others. I will forever be bonded with my incoming cohort of the PhD program.
The connecting experience that is created by the shared struggles of the first year in the program and the core exams is unparalleled.

Sometimes, academia has a way of encapsulating you in a bubble. Thus, friends outside the PhD program played a crucial role in helping me balance life. I am grateful to Judy Zhang, Josie Xiao, Ksenia Fiorin, Becky Harris, Nejwa, Jenane, Lena Smith Abboud, Dena Taha, the Lee family, and the members of the five-a-side soccer teams I’ve been a part of.

While this was not the first time living away from home, the Greek-Cypriot community in Los Angeles blessed me with friends who look, speak, think and, most importantly, eat like me: Adamos Andreou, Maria Allayioti, Chrysostomos Marassinou, Constantina Stylianou, Georgios Effraimides, George Constantinou, Chris Anastasiou, Paraskevi Hadjicosta, Panayiotis Petousis, Panayiota Loizou, Chrystalla Havadja, Demetris Yiorkajis, Vassilia Meliou, Savvas Christou and Papa Christos, among many others. Their presence made life in Los Angeles feel so much closer to home.

Taking on such a big challenge so far away from home, you oftentimes forget who you are at the core and why you started this in the first place. In that regard I am grateful to several of my best friends from childhood back in Cyprus who, on several occasions, lifted me up by reminding me of who I am and where I come from, and helped instill some perspective in my life when I needed it the most. Chrysanthos Papachrysanthou, Andreas Tofaris, Christos Hadjiraftis, Dafne Morroni, Maria Gregoriou, Anna Chrysanthou, thank you for not giving up on me, even after all these years being so far apart.

I am grateful to my loving family. The unwavering support of my parents to pursue my dreams, even if they don’t fully understand them nor want me to be so far away from them, has been a great source of motivation as well as inspiration. Stalo and George Aristidou, thanks for everything.
My grandparents have waited patiently to be able to call one of their grandchildren “Doctor”. Now they can, even if not all of them are alive to say it. I’m sure they are watching proudly and I’m humbled to be the one to help them achieve this dream.

Most importantly, I’m grateful for the most amazing younger brother, man and friend, Alexandros Aristidou. The quest to be a role model for him has been the biggest drive and reason for most of the quests I’ve pursued, including this program. I cannot wait to be standing by his side while he is conquering the world.

Through this program I met Amy Mahler, and this alone makes all the struggles that brought me here seem insignificant. I’m blessed to have her be part of my life. I thank her for her love, patience and wisdom. Her presence in the latest chapter of this program brought joy, confidence and calmness in a turbulent and transitional phase. I’m looking forward to the future with you.

Finally, I thank the USC Department of Economics and Graduate School for having me as a PhD student, the Center for Economic and Social Research for having me as a research assistant, the Leventis Foundation and the McKinsey Global Institute for support, the Fulbright Cyprus Committee for taking a chance on me, the Department of Economics’ third year paper committee for declaring me the third year paper award winner, the University of Trento for having me as a visiting scholar, several conferences for having me as a presenter, the City of Los Angeles and the Los Angeles Data Science Federation for having me as an affiliate researcher, and Zillow Group and Cornerstone Research for having me as an intern.

On to the next. Life... bring it on!

Table of Contents

Acknowledgments
List of Tables
List of Figures
Abstract
1 Relativity and Habit Formation in Life Satisfaction
1.1 Introduction
1.2 Literature Review
1.3 Theory and Model
1.3.1 The welfare function of income
1.3.2 Definitions
1.3.3 Assumptions
1.3.4 Log-moments
1.3.5 Examples
1.4 Empirical Analysis
1.4.1 Data description
1.4.2 Model adaptation
1.4.3 Specification
1.5 Results
1.5.1 Summary statistics
1.5.2 Reduced-form evidence
1.5.3 Structural estimation
1.5.4 Simulations
1.6 Conclusion
2 Incentives or Persuasion? An Experimental Investigation
2.1 Introduction
2.2 Theoretical Framework
2.2.1 The Baseline Model
2.2.2 Information Design Extension
2.2.3 Mechanism Design Extension
2.3 Experiment
2.3.1 Section ID (Information Design)
2.3.2 Section MD (Mechanism Design)
2.3.3 Feedback
2.3.4 Summary of the Experimental Procedure
2.3.5 Section 3
2.3.6 Design Implementation
2.4 Results
2.4.1 Bayesian Persuasion in the Lab
2.4.2 Incentives or Persuasion?
2.4.3 Bargaining Power
2.4.4 The Determinants of Principals’ Success
2.4.5 Heterogeneity in Agents’ Behavior
2.5 Discussion
2.5.1 Risk and Payoff Uncertainty
2.5.2 Monetary Contracts versus Informational Contracts
2.5.3 Computational Complexity and the Monetary Equivalent of Information
2.5.4 Theoretical and Policy Implications
2.6 Conclusion
3 Rolling The Skewed Die: The Economic Foundations of the Demand for Skewness
3.1 Introduction
3.2 Setup
3.2.1 A Motivation: Local, Bulky Status Goods
3.2.2 A Formal Representation of Aspirational Utility
3.2.3 The Aspirational Agent’s Choice Set: Binomial Martingales
3.2.4 Utility Maximization
3.3 Single Jump Optimization
3.3.1 The ‘Four Seasons of Gambling’
3.3.2 Comparative Statics
3.4 Double Jump Optimization
3.4.1 Setup
3.4.2 Optimal Choice Under Two Jumps
3.5 Departure from Fair Gambles: Sub-martingale and Super-martingale
3.5.1 Setup
3.5.2 > 0: ‘Winter (keep $C_0$)’ Is Replaced by Near-Arbitrage
3.5.3 > 0: Lowers Demand for Positive Skewness
3.5.4 < 0: Super-martingales
3.6 Volatility and Skewness: A Role Analysis
3.6.1 The New Consumption Scheme: Tri-nomial Martingales
3.6.2 Principle of Maximal Volatility
3.6.3 A Rule of Thumb Under Limited Choice of Skewness
3.7 Experimental evidence
3.7.1 Study 1: Within-subject test of preference reversals
3.7.2 Study 2: Between-subject test of skewness preferences with aspiration shifts
3.7.3 Discussion of experimental results
3.8 Conclusion
Bibliography
A Appendix to Chapter 1
A.1 Mathematical derivations
A.1.1 Log moments of the perceived income distributions
A.1.2 Examples
A.1.3 Model adaptation
A.2 Empirical analysis and robustness checks
A.2.1 Robustness of empirical specifications
A.2.2 Other robustness checks
A.3 Data appendix
A.3.1 Coding of vignette dummy variables
A.3.2 Vignette descriptions
A.3.3 Countries, sample sizes, interview years and non-representative samples
B Appendix to Chapter 2
B.1 Theoretical Derivations
B.1.1 Equilibrium in the Information Design Extension
B.1.2 Equilibrium in the Mechanism Design Extension
B.1.3 ID and MD in Normal Form: Nash Equilibria and Bargaining
B.1.4 Calculation of the Nash Bargaining Weights in ID and MD
B.2 Variables Used in the Regressions
B.3 Additional Regressions
B.4 Additional Tables
B.5 Additional Figures
B.6 The Role of Risk Preferences
B.7 ID: The Lasting Impact of Feedback State C
B.8 Instructions and Screenshots
B.8.1 Instructions for the ID Section
B.8.2 Screenshots for the ID section
B.8.3 Instructions for the MD Section
B.8.4 Screenshots for the MD section
B.8.5 Instructions for Section 3
C Appendix to Chapter 3
C.1 Mathematical Appendix
C.2 Appendix to the experiment
C.2.1 Lotteries used in study 1
C.2.2 Lotteries used in study 2
C.2.3 Charity organization options
C.2.4 Additional tables for study 1
C.2.5 Additional tables for study 2

List of Tables

1.1 Summary of vignette descriptions
1.2 Summary statistics for reported life satisfaction of vignettes
1.3 Average responses for several variables of interest
1.4 Reduced form ordered Probit regression
1.5 Structural estimation
1.6 Simulation results
2.1 Payoff matrix in the baseline model
2.2 Average choices in the ID and MD games
2.3 Average Matched Expected Payoffs (MEP) in ID and MD
2.4 Principals’ feedback states in the ID and MD games
2.5 Principals’ earnings by group and game
2.6 Average choices of selfish and generous agents
3.1 Distribution of choices across pairs of rounds
3.2 Distribution of choices across treatments
A.1 Model estimation results
A.2 Pairwise correlations and multiple regression of on other vignette characteristics
A.3 Coding of vignette dummies used in Table 1.5
A.4 Coding of vignette dummies used in Table A.1
A.5 Countries, sample size, interview year, non-rep samples
B.1 Variables used in the regressions
B.2 Random effects linear regressions of average choices on the Dictator and Cutoff choices in Section 3
B.3 Random effects linear regressions of change in choice (X or Y) between periods $t$ and $t-1$
B.4 Summary of payoffs of Principals and Agents by game (ID/MD), sample (All/Matched) and payoffs (Actual/Expected)
B.5 Average choices in the Cutoff MD tasks
C.1 Description of lotteries used in study 1
C.2 Description of lotteries used in study 2
C.3 Distribution of choices by distinct skewness option combinations
C.4 Distribution of choices in pairs with positively skewed options
C.5 Distribution of choices in pairs with negatively skewed options
C.6 Distribution of choices across rounds of study 2

List of Figures

1.1 Simulation graphs
2.1 Average choices of principals and agents per period
2.2 Percentage of matched pairs (XY) per period in each game
2.3 Average choices of matched pairs for each period of the ID and the MD games
2.4 Best response correspondences and Nash Equilibria in the ID and MD games
2.5 Expected payoffs, Nash bargaining and disagreement outcomes
2.6 Principals’ reactions to feedback states
2.7 Agents’ reactions to feedback states
2.8 Average choices, theoretical predictions, and participants’ estimates of rational (risk neutral) cutoffs
3.1 Local, Bulky Status Goods Imply Aspirational Utility
3.2 The Four Seasons of Gambling
3.3 Power utility functions with different
3.4 Aspirational Utility with Two Jumps
3.5 Almost Achieving Arbitrage by Selling Lottery: p SM 1
3.6 Maximum Tolerable Negative Expected Return
3.7 Top: Skewness, Bottom: Volatility
3.8 Loss of Skewness is Prohibitively Costly
3.9 Loss of Skewness is Acceptable
3.10 Example: Round 3 of Section 1
3.11 Histograms of preference reversals
3.12 Example: Round 1 of Section 2 in one of the treatments
B.1 Best responses and Nash equilibria for the ID and MD games
B.2 Expected payoffs in the ID and MD games
B.3 Histograms of agents’ choices in ID and MD
B.4 Scatterplot of the principals’ average Expected Payoffs
B.5 Average choices of agents after experiencing state C in ID
B.6 Choice screens in the ID section
B.7 Feedback screens in the ID section
B.8 Choice screens in the MD section
B.9 Feedback screens in the MD section
C.1 Example where $l_2(c)$ is tangent to $U(c)$ at $C_F$
C.2 Examples where g(c) has 2 roots and g(c) has no root

Abstract

This thesis brings together three research papers which investigate, both theoretically and empirically, real-world situations in which behavior and perceptions depart systematically from the rational economic model.

The first paper develops a structural model of an individual’s subjective well-being through the lens of their current and historical income position in a perceived income distribution that itself depends on a reference group and memory of the past. It then uses global survey data to estimate model parameters of reference weights and habit formation.

The second paper brings together two prominent economic theories of incentives, Mechanism Design and Information Design (or Bayesian Persuasion).
First, it identifies and develops a theoretical equivalence between the two theories, which it then tests empirically using a laboratory experiment. Second, it conducts an analysis of the observed behavioral differences between monetary and informational incentive structures.

The third paper develops a theoretical model of financial decision making that can explain preferences for skewness seeking (both positive and negative) in an individual’s portfolio choice through the presence of aspirations causing discontinuous utility jumps. It then collects evidence for such behavior through a laboratory experiment.

Chapter 1
Relativity and Habit Formation in Life Satisfaction

Joint work with Arie Kapteyn.

1.1 Introduction

Life satisfaction measures have been used for decades to inform researchers and policymakers on individuals’ subjective evaluation of their own well-being. While plenty of measures of life satisfaction have been proposed, the most common one in the literature is the “Cantril Ladder”, named after Hadley Cantril (Cantril (1965)). The question is, what reference points do people use to evaluate their own or others’ life satisfaction? We postulate that rather than one reference point, individuals use a “frame of reference” to evaluate their own or others’ situation.

In this paper, we concentrate on the role of income in contributing to life satisfaction, while controlling for other dimensions that have been found in the literature to be important. The simplest hypothesis regarding the frame of reference used to judge satisfaction with income is to assume that individuals assess incomes by their rank in a “subjective income distribution”. The issue, however, is which income distribution. Following a theory developed in Kapteyn (1977), we hypothesize that the relevant subjective income distribution is a convex combination of income distributions an individual has perceived over her lifetime.
Furthermore, the perceived income distribution at any given moment includes an individual’s own income, so that this formulation incorporates both habit formation and notions of relativity in evaluations.

In this paper we use data from 109 countries collected by the Gallup World Poll (GWP), including a special module with anchoring vignettes. The vignettes are brief descriptions of hypothetical individuals with respect to their health, family situation, job, and income. Every respondent sees six vignettes (out of a set of twelve) and is asked to rate the life satisfaction of the person described in each vignette. A key element of the vignette descriptions is that the incomes are either equal to half the median income in a country, or the median, or twice the median. This allows us to relate the vignette ratings by respondents to the income distribution in their country of residence, as well as to the respondent’s own income. The GWP is a cross section, so we don’t observe the individual income history of respondents. We make the strong assumption that all incomes in a country have grown at the same rate. As an estimate of the growth rate of incomes we then take observed GDP growth in the various countries in the dataset.

The remainder of the paper is structured as follows: The next section (1.2) contains a brief discussion of the literature on life satisfaction. Section 1.3 lays out the formal model. Section 1.4 presents the data and the operationalization of a number of key concepts. Section 1.5 presents the results of the empirical analysis. Moreover, to provide intuition for the estimation results, we also provide some simulations, where we vary growth rates and income dispersion and show their impact on observed life satisfaction. Section 1.6 concludes.

1.2 Literature Review

There is a vast literature on the relative nature of satisfaction with consumption or income (Duesenberry (1949); Veblen (1973)).
Frank (2012) discusses evolutionary factors that would suggest that relative ranking should matter. For example, during a famine the strongest individuals would survive; the strongest individuals would be most likely to find a mate, etc. Studies of primates have found that an animal’s serotonin levels increase when it moves up in the social hierarchy. Serotonin is related to feelings of happiness and well-being (Van Vugt and Tybur (2015)).

The most direct evidence that subjective well-being of humans is influenced by how well others are doing in an individual’s reference group (roughly defined as: others the individual compares herself with) is provided in laboratory settings. For instance, Fliessbach et al. (2007) use functional magnetic resonance imaging (fMRI) to measure brain activity of subjects who have to perform an estimation task and are rewarded according to their performance. Subjects are not only informed about their own payments but also about the payments of other subjects. It is shown that neurophysiological activity responds strongly to relative payments.

Outside the laboratory, establishing the effect of relative comparisons requires the definition of reference groups, i.e. groups of people one compares oneself to (Dahlin et al. (2014)). Reference groups have traditionally been defined a priori, e.g. by using individuals’ characteristics to define groups, for instance based on education, gender, or age (Van de Stadt et al. (1985), McBride (2001), Ferrer-i Carbonell (2005), Andrew and Senik (2014)). Similarly, coworkers or individuals in the same profession have been used as reference groups to explain the impact of individuals’ rank in the wage distribution on their satisfaction with the pay they receive (Brown et al. (2008)).
The most commonly made assumption is that individuals mainly compare themselves to others in the same geographical area (Blanchflower and Andrew (2004), Luttmer (2005), Ferrer-i Carbonell (2005), Barrington-Leigh and Helliwell (2008), Andrew et al. (2008)). The definition of reference groups based solely on geography leads to complications, since for instance neighbors’ incomes are likely to be related to the quality of public and private goods in an area (access to parks, better stores, etc.).

Using the Gallup Healthways Well-being Index survey, Deaton and Stone (2013) regress SWB on log income (and average SWB on average income) at increasingly high levels of aggregation in the U.S. (individual, zip code, metro area, state). They note that if relative income is all that is important, then the coefficient on income should be rapidly declining as we move to higher levels of aggregation. Instead, they find that a regression of average SWB on average log-income at the zip-code level yields a higher coefficient than when using individual level data. When moving to larger geographic units the coefficient declines, but by only about one fifth compared to the individual level coefficient estimate. Also using the Gallup Healthways Well-being Index survey, Ifcher et al. (2019) find that neighbors’ incomes positively impact SWB in the U.S. at the local (zip code) level, but that at higher levels of aggregation (metro area), the effect is negative. This is true for several measures of well-being, including Cantril’s ladder.

Yet, studies that try to elicit directly what reference groups respondents use find little evidence that geography is important. For instance, Goerke and Pannenberg (2013) use pretest modules of the German Socio Economic Panel for the years 2008-2010, which contain questions about the importance of different groups for income comparisons. Their sample is restricted to employed respondents aged 17 to 65.
They find that only colleagues at work, other people with the same occupation, and friends matter. Dahlin et al. (2014) also find very little support for the notion that reference groups would be primarily formed on the basis of geographical proximity. It is safe to assume, however, that for the vast majority of individuals their reference group will be limited to their own country. In the empirical analysis in this paper, we take advantage of that assumption by using data from a large number of countries.

1.3 Theory and Model

Life satisfaction is inherently multi-attribute, and thus it is reasonable to expect that the evaluation of life satisfaction is a combination of individual evaluations of the various components that make up life satisfaction. A simple way to represent this is the linear equation (1.1) (e.g. Kapteyn et al. (2010)):

$$W(x) = \beta_1 W_1(x_1) + \beta_2 W_2(x_2) + \dots + \beta_K W_K(x_K) \qquad (1.1)$$

where $x$ is a $K \times 1$ vector of life domains, $W(\cdot)$ is overall life satisfaction and $k \in \{1, \dots, K\}$ indexes the $K$ separate life domains. For instance, we may interpret $W_k(x_k)$ as the satisfaction with $x_k$ in life domain $k$. Without loss of generality we set $k = 1$ to represent income. For notational simplicity, from here on we replace $x_1$ with $y$, i.e. $W_1(y)$ represents an individual’s satisfaction with income $y$. We will refer to $W_1$ as a “welfare function of income” (Van Praag (1971); Van Praag and Kapteyn (1973)). Finally, the parameters $\beta_k$ are weights representing the effect of the satisfaction with each life domain on overall life satisfaction.

1.3.1 The welfare function of income

In what follows we will focus mainly on the functional form of $W_1(\cdot)$, in which we will incorporate the effects of habit formation and relativity, first proposed in Kapteyn (1977). The basic idea is that individuals use the incomes of others and themselves, both currently and in all past periods, as a frame of reference for evaluating their satisfaction with an income $y$.
This evaluation is done on a finite scale, which we normalize to $[0, 1]$, where a higher number corresponds to a higher satisfaction with income. We model time as discrete: $\tau \in \{-\infty, \dots, 0\}$, with $\tau = 0$ representing the present. At each time $\tau$, the individual assigns non-negative, time-independent reference weights $q_i$ to the incomes $y_{i\tau}$ of individuals $i \in \{0, 1, \dots, n\}$ in her reference group.¹ The weights $\{q_0, q_1, \dots, q_n\}$ sum to one. Note that while the reference weights are assumed constant across periods, this can easily be generalized at the cost of added notational complexity. Also, the development is done in terms of individuals rather than households; this can be generalized to an analysis of households by using equivalence scales. Our interest and analysis will be individual 0 (the self). Obviously, $q_0$ then represents the weight that the individual assigns to her own income, denoted by $y_{0\tau}$ at time $\tau$. Next, we define notation and the four income distributions that play a central role in the development of our model.

1 Without loss of generality we can assume that any individual not in one’s reference group is assigned a weight of $q_i = 0$.

1.3.2 Definitions

1. Inclusive Perceived Income Distribution at Time $\tau$:

As is perhaps suggested by the name, this represents the income distribution that individual $i = 0$ perceives at time $\tau \in \{0, -1, \dots, -\infty\}$. The adjective “perceived” signifies that this distribution is not necessarily the true income distribution but rather the distribution that individual $i = 0$ perceives. The adjective “inclusive” signifies that an individual’s own income is part of the perceived income distribution (we will make the distinction between own income and income of others in definition 2). To motivate this, one can think of an extreme case where individual $i = 0$ only observes the income she receives, or chooses to ignore the incomes of others.
It might then be reasonable to assume that individual $i = 0$’s perception of the income distribution in society at time $\tau$ is degenerate and equal to her own income $y_{0\tau}$. In our model, such a case would be represented by the individual assigning weights $q_0 = 1$, $q_i = 0$, $\forall i \neq 0$. The inclusive perceived income distribution at time $\tau$ is a weighted empirical CDF, which includes the individual’s own income ($y_{0\tau}$) with weight $q_0$ and the incomes of others ($y_{i\tau}$) with weights $q_i$, $i \in \{1, 2, \dots, n\}$:

\[ H(x \mid \tau) = \sum_{i=0}^{n} q_i \mathbf{1}[y_{i\tau} \le x] = q_0 \mathbf{1}[y_{0\tau} \le x] + \sum_{i=1}^{n} q_i \mathbf{1}[y_{i\tau} \le x] = q_0 \mathbf{1}[y_{0\tau} \le x] + (1 - q_0) \sum_{i \neq 0 \,\mid\, y_{i\tau} \le x} \frac{q_i}{1 - q_0} \]

For later purposes it will be useful to define how individual $i = 0$ perceives the distribution of incomes excluding her own. We do that next.

2. Perceived Income Distribution at Time $\tau$:

When we remove the “inclusive” portion of definition 1 we have the perceived income distribution at time $\tau$. This represents the income distribution that individual $i = 0$ perceives at time $\tau \in \{0, -1, \dots, -\infty\}$ over all individuals in her reference group excluding herself, $i \in \{1, \dots, n\}$.

\[ F(x \mid \tau) = \sum_{i=1}^{n} \frac{q_i}{1 - q_0} \mathbf{1}[y_{i\tau} \le x] = \sum_{i \neq 0 \,\mid\, y_{i\tau} \le x} \frac{q_i}{1 - q_0} \]

Since this empirical CDF does not include $i = 0$’s income, we have that $\sum_{i=1}^{n} q_i \le 1$, but $\sum_{i=1}^{n} \frac{q_i}{1 - q_0} = 1$. Thus we essentially define two perceived income distributions at a particular time period $\tau$: one that includes one’s own income (i.e. the full income distribution that individual $i = 0$ perceives) and one that does not (i.e. the income distribution that individual $i = 0$ perceives for everyone else, excluding him or herself).

3. Inclusive Perceived Income Distribution:

This is a memory-weighted convex combination of all inclusive perceived income distributions that individual $i = 0$ has perceived over her lifetime.

\[ H(x) = \sum_{\tau=-\infty}^{0} a(\tau) H(x \mid \tau) \]

where $a(\tau)$ denotes the memory function, which represents the weight that individual $i = 0$ places on time period $\tau \in \{0, -1, \dots, -\infty\}$.
The memory function generalizes the notion of habit formation by including not just the effect of own past incomes but also the incomes of all others who receive a non-zero reference weight. We impose $a(\tau) \ge 0$, $\forall \tau$, and $\sum_{\tau=-\infty}^{0} a(\tau) = 1$. A simple specification of the memory function, which we will use in the empirical analysis, is geometric decay of memory: $a(\tau) = (1 - a) a^{-\tau}$, where $0 \le a \le 1$. In that case we call the parameter “a” the memory parameter. The higher its value, the more past perceived income distributions matter in the perception of the current income distribution. At the one extreme of $a$ tending to zero, the individual ignores all past perceived income distributions at $\tau \in \{-1, -2, \dots, -\infty\}$ and thus $H(x) = H(x \mid 0)$. At the other extreme of $a$ tending to one, the individual weights every period in her life equally.

4. Perceived Income Distribution:

We define the perceived income distribution (the analog of the inclusive perceived income distribution) as the memory-weighted convex combination of all perceived income distributions that individual $i = 0$ has perceived over time of all individuals excluding herself.

\[ F(x) = \sum_{\tau=-\infty}^{0} a(\tau) F(x \mid \tau) \]

Our definitions 3 and 4 are essentially the analogues of definitions 1 and 2 when applying the memory function $a(\tau)$, which combines the perceived income distributions at each time $\tau$ into individual $i = 0$’s perception of the income distributions over his or her lifetime thus far. Definition 3, the Inclusive Perceived Income Distribution, represents individual $i = 0$’s full perception of the income distribution over all individuals in her reference group (including herself). It integrates relativity and habit formation in one framework and will be our main object of interest.

1.3.3 Assumptions

We postulate the following hypothesis in equation 1.2:

\[ W_1(y) \equiv H(y) \tag{1.2} \]

This is the theory of preference formation developed in Kapteyn (1977).
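As an illustration only, the definitions above and hypothesis 1.2 can be sketched numerically in Python. The function names and the toy incomes and weights below are hypothetical and not taken from the data; the sketch also truncates memory to a finite number of periods and renormalizes the geometric weights, which is an assumption of the example rather than part of the model.

```python
def cdf_at_time(x, incomes, weights):
    """H(x | tau): weighted empirical CDF. Index 0 is the individual's own
    income (weight q_0); the remaining entries are the others' incomes."""
    return sum(q for y, q in zip(incomes, weights) if y <= x)


def geometric_memory(a, periods):
    """Memory weights a(tau) = (1 - a) * a**(-tau) for tau = 0, -1, ...,
    truncated to `periods` periods and renormalized to sum to one
    (the truncation is an assumption of this sketch)."""
    if not 0.0 <= a < 1.0:
        raise ValueError("this sketch requires 0 <= a < 1")
    raw = [(1.0 - a) * a ** k for k in range(periods)]  # k = -tau
    total = sum(raw)
    return [w / total for w in raw]


def inclusive_cdf(x, history, weights, a):
    """H(x): memory-weighted combination of H(x | tau).
    history[0] holds incomes at tau = 0, history[1] at tau = -1, etc."""
    memory = geometric_memory(a, len(history))
    return sum(w * cdf_at_time(x, row, weights)
               for w, row in zip(memory, history))


# Hypothetical toy data: own income first, then three reference-group members.
q = [0.4, 0.2, 0.2, 0.2]
history = [[50, 30, 60, 90],   # tau = 0
           [40, 30, 60, 90]]   # tau = -1
satisfaction = inclusive_cdf(50, history, q, a=0.5)  # W_1(50) = H(50)
```

With these toy numbers the individual’s current income of 50 sits above her own past income and one of the three others in both periods, so the sketch assigns it a satisfaction of about 0.6 on the [0, 1] scale.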
An individual’s satisfaction with income (or her welfare function of income) is given by the position of her income in her inclusive perceived income distribution. In other words, an individual evaluates her income satisfaction by where her income ranks in her inclusive perceived income distribution. The setup combines both habit formation (all own past incomes are represented in $H(y)$) and relative income considerations. For later purposes it will be useful to put a parametric structure on $H(y)$. In line with the descriptive literature on the shape of income distributions, we will assume that the inclusive perceived income distribution is lognormal:

\[ H(y) = \Phi\left( \frac{\ln(y) - \mu}{\sigma} \right) \tag{1.3} \]

where $\Phi(\cdot)$ is the CDF of the standard normal distribution.

1.3.4 Log-moments

The above formulation allows for an exploration of the effect on income satisfaction of both income growth and one’s position in the income distribution. To further spell out the implications of the model, it is useful to concentrate on the first two log-moments of the four distributions defined above. We define those moments as follows: $\mu_\tau$ [$\sigma_\tau^2$] as the log-mean [log-variance] of the inclusive perceived income distribution at time $\tau$, $H(x \mid \tau)$; $m_\tau$ [$s_\tau^2$] as the log-mean [log-variance] of the perceived income distribution at time $\tau$, $F(x \mid \tau)$; $\mu$ [$\sigma^2$] as the log-mean [log-variance] of the inclusive perceived income distribution, $H(x)$; and $m$ [$s^2$] as the log-mean [log-variance] of the perceived income distribution, $F(x)$.
We first examine the log-means of the perceived income distribution at $\tau$ ($m_\tau$) and the inclusive perceived income distribution at $\tau$ ($\mu_\tau$):

\[ m_\tau = \int_{-\infty}^{\infty} \ln(x) \, dF(x \mid \tau) = \sum_{i=1}^{n} \frac{q_i}{1 - q_0} \ln(y_{i\tau}) \]

\[ \mu_\tau = \int_{-\infty}^{\infty} \ln(x) \, dH(x \mid \tau) = q_0 \ln(y_{0\tau}) + (1 - q_0) m_\tau = \sum_{i=0}^{n} q_i \ln(y_{i\tau}) \]

The log-means of the perceived income distribution ($m$) and the inclusive perceived income distribution ($\mu$) are:

\[ m = \int_{-\infty}^{\infty} \ln(x) \, dF(x) = \sum_{\tau=-\infty}^{0} a(\tau) m_\tau \]

\[ \mu = \int_{-\infty}^{\infty} \ln(x) \, dH(x) = \sum_{\tau=-\infty}^{0} a(\tau) \mu_\tau \]

Thus the first log-moment of the inclusive perceived income distribution is:

\[ \mu = \sum_{\tau=-\infty}^{0} a(\tau) \left[ q_0 \ln(y_{0\tau}) + (1 - q_0) m_\tau \right] = q_0 \sum_{\tau=-\infty}^{0} a(\tau) \ln(y_{0\tau}) + (1 - q_0) \sum_{\tau=-\infty}^{0} a(\tau) m_\tau = q_0 \sum_{\tau=-\infty}^{0} a(\tau) \ln(y_{0\tau}) + (1 - q_0) m \]

Next we investigate the log-variances of the perceived income distribution at $\tau$ ($s_\tau^2$) and the inclusive perceived income distribution at $\tau$ ($\sigma_\tau^2$):

\[ s_\tau^2 = \int_{-\infty}^{\infty} \left( \ln(x) - m_\tau \right)^2 dF(x \mid \tau) = \sum_{i=1}^{n} \frac{q_i}{1 - q_0} \left( \ln(y_{i\tau}) - m_\tau \right)^2 \]

\[ \sigma_\tau^2 = \int_{-\infty}^{\infty} \left( \ln(x) - \mu_\tau \right)^2 dH(x \mid \tau) = (1 - q_0) \left\{ q_0 \left( \ln(y_{0\tau}) - m_\tau \right)^2 + s_\tau^2 \right\} \]

and the log-variances of the perceived income distribution ($s^2$) and the inclusive perceived income distribution ($\sigma^2$):

\[ s^2 = \int_{-\infty}^{\infty} \left( \ln(x) - m \right)^2 dF(x) = \sum_{\tau=-\infty}^{0} a(\tau) \int_{-\infty}^{\infty} \left( \ln(x) - m \right)^2 dF(x \mid \tau) \]

\[ \sigma^2 = \int_{-\infty}^{\infty} \left( \ln(x) - \mu \right)^2 dH(x) = \sum_{\tau=-\infty}^{0} a(\tau) \left\{ \sigma_\tau^2 + (\mu_\tau - \mu)^2 \right\} \]

Obviously, $\mu$ will be higher if past incomes were higher or if people in an individual’s reference group have (or had) higher incomes. $\sigma^2$ will be higher if incomes in the individual’s reference group have been more dispersed, if her income is further removed from the median, and if incomes have varied a lot over time (so that the term $(\mu_\tau - \mu)^2$ tends to be large).

1.3.5 Examples

With the formulation of our model in place, we can examine various example situations in the hope of shedding further light on some important debates in the literature. One such debate is whether economic growth increases life satisfaction. Easterlin and coauthors in various papers (Easterlin (1974); Easterlin et al. (2010); Easterlin (2015)) have argued there is no long-term effect of economic growth on life satisfaction, although there are short-term business cycle effects.
However, others have claimed that growth actually has a positive impact on life satisfaction through various channels (Stevenson and Wolfers (2008a), Stevenson and Wolfers (2008b), Stevenson and Wolfers (2013); Sacks et al. (2012)). Our model is very much in the spirit of Easterlin’s work, and we will illustrate how the model generates stylized facts consistent with the work by Easterlin and his collaborators. Additionally, we can examine more micro-level questions, for example the short-run and long-run effects on life satisfaction of one-off income boosts like winning the lottery, or the effects of regular income growth over the life cycle (e.g. as a result of seniority wages).

One-off across the board income increase

To simplify the example, assume that all incomes in society have been constant for a sufficiently long time so that we can assume:

\[ \mu_\tau = \mu; \quad m_\tau = m; \quad \sigma_\tau^2 = \sigma^2; \quad s_\tau^2 = s^2 \]

i.e. the log-moments of the inclusive and non-inclusive perceived income distributions are equal to the log-moments of the respective perceived income distributions at any time $\tau$. Further, from now on we will impose a geometric decay of memory $a(\tau) = (1 - a) a^{-\tau}$, $0 \le a \le 1$. We will also assume that an individual’s reference group consists of all members of the society, with equal reference weights except for the weight on self, i.e. $q_i = q_{i'}$, $\forall i, i' \neq 0$. Consider an identical one-time increase in income for all individuals in society by a factor $\exp(\delta)$, i.e. $y_{i0} = \exp(\delta) \, y_{i,-1}$. It is straightforward to derive satisfaction with income in the next period (see Appendix A.1.2). The numerator in (1.3) becomes $\ln(y) - \mu + a\delta$. This is intuitively plausible: the memory parameter $a$ determines the speed of adjustment in the first log-moment of the perceived income distribution. The smaller $a$ is, the quicker the adjustment. For the denominator in (1.3) we obtain the square root of $\sigma^2 + (1 - a) a \delta^2$.
This also makes intuitive sense: the larger the one-time income increase, the higher the variance of the inclusive perceived income distribution becomes, due to the increased distance between the new income and the past incomes. An increased variance scales down the positive effect of the increase in the relative level of the new income given by the numerator. The new level of satisfaction with income becomes:

\[ W_1(y \exp(\delta)) = \Phi\left( \frac{\ln(y) - \mu + a\delta}{\sqrt{\sigma^2 + (1 - a) a \delta^2}} \right) \tag{1.4} \]

In comparison with the case with no income increase, we see that the right-hand side of equation 1.4 will be larger than that of equation 1.3 for positive $\delta$ and smaller for negative $\delta$.² This illustrates that economic growth, defined as an equal proportional increase of all incomes, will increase well-being in the short run. However, the gain will dissipate over time as both numerator and denominator converge to their previous values (due to the effects of the memory parameter $a$).

2 This statement holds for values of $\delta$ that are not “too large”. For extremely large values of $\delta$, the increase in the value of the denominator may dominate the effect of the increase in the numerator.

Sustained income growth across the board

The previous example of a one-time increase in income can be contrasted with the case where all incomes grow permanently by a factor. This showcases the real-world example of the effects of sustained economic growth, which most developed economies have experienced over extended periods of time. Once again, to keep the discussion manageable it is useful to consider a special case. Let the log-mean and log-variance of the societal income distribution at time $\tau$ be $m_\tau$ and $s_\tau^2$ (which are also the moments of the perceived income distribution at time $\tau$). Since we assume that all incomes grow at the same rate $\delta$, we have that $\ln(y_{i\tau}) = \ln(y_{i,\tau-1}) + \delta$, $\forall i, \tau$, and that $m_\tau = m_{\tau-1} + \delta$, $\mu_\tau = \mu_{\tau-1} + \delta$. The fact that all incomes increase by the same factor additionally implies that $s_\tau^2 = s_{\tau-1}^2$ and $\sigma_\tau^2 = \sigma_{\tau-1}^2$, i.e.
the variance of the perceived income distributions at any time $\tau$ remains constant. We can express various terms in terms of their current values: $s_\tau^2 = s_0^2$, $\sigma_\tau^2 = \sigma_0^2$, $\forall \tau$, and

\[ \ln(y_{i\tau}) = \ln(y_{i0}) + \tau\delta; \quad m_\tau = m_0 + \tau\delta, \quad \mu_\tau = \mu_0 + \tau\delta, \quad \forall \tau \]

Using these relations we obtain the following satisfaction with income (see Appendix A.1.2 for detailed derivations):

\[ W_1(y_{00}) = \Phi\left( \frac{(1 - q_0)\left[ \ln(y_{00}) - m_0 \right] + \delta \frac{a}{1 - a}}{\sqrt{\sigma_0^2 + \delta^2 \frac{a}{(1 - a)^2}}} \right) \tag{1.5} \]

As before, the numerator shows that income satisfaction will be larger for higher growth rates, but the effect is tempered by the fact that higher growth rates also increase the denominator through a higher variance of the overall inclusive perceived income distribution (the individual observes a higher inequality of income distributions over time). What is most important in this case, though, is that, in contrast to equation 1.4, where the gain in income satisfaction dissipated over time, the income satisfaction in equation 1.5 is constant. In other words, a positive income growth rate does imply a sustained increase in the level of life satisfaction. However, it does not imply a positive growth rate in life satisfaction. To get constant growth in the level of life satisfaction, we would need a constant increase in the growth rate of income, a scenario that is, at most, rarely observed, and even when it happens, it does not last long.

One-off, self-only income increase

We now turn our attention to changes in the relative position of one’s income in comparison to the other incomes in her reference group. Our third example examines the case where an individual experiences a one-off income increase by a factor $\exp(\delta)$, while all other incomes remain unchanged. Examples of such events include winning the lottery, receiving an unexpected inheritance or being successful in the stock market (Gardner and Andrew (2007); Lindqvist et al. (2018)).
We will approach this case similarly to the example of the one-off income increase for all members of the society in section 1.3.5. We once again assume that all incomes of individuals in the society have been constant for a sufficiently long time, which allows us to exploit the following simplifications:

\[ \mu_\tau = \mu; \quad m_\tau = m; \quad \sigma_\tau^2 = \sigma^2; \quad s_\tau^2 = s^2 \]

i.e. the log-moments of both perceived income distributions at time $\tau$ are constant and equal to the log-moments of the respective overall perceived income distributions. In this example, we are assuming a one-off increase in individual $i = 0$’s income by a factor $\exp(\delta)$, while all other incomes remain unchanged. In this case we obtain the following expression for income satisfaction (see Appendix A.1.2 for derivations):

\[ W_1(y_{00} \exp(\delta)) = \Phi\left( \frac{\ln(y_{00}) - \mu + \delta\left[ 1 - (1 - a) q_0 \right]}{\sqrt{\sigma^2 + \delta (1 - a) q_0 \left\{ q_0 a \delta + (1 - q_0) A \right\}}} \right) \tag{1.6} \]

where $A = \left[ \delta + 2\left( \ln(y_{00}) - m_0 \right) \right]$.

Similar to the examples in section 1.3.5, we see that individual $i = 0$’s income satisfaction is amended (in comparison to before the income increase) by changes in the individual’s income and changes in the first two log-moments of her inclusive perceived income distribution, as follows. The interpretation of the numerator in equation 1.6 is relatively straightforward. The extra term $\delta[1 - (1 - a) q_0]$ is non-negative for any positive income increase. While the increase in her own income is $\exp(\delta)$, some of the effect dissipates due to the effect of own income on the log-mean of the inclusive perceived income distribution. This is a pure habit formation effect. The size of the habit formation effect depends on the memory parameter $a$ and the weight $q_0$ of own income in the inclusive perceived income distribution. The smaller $a$ is (signifying quick adaptation) and the higher $q_0$, the more habit formation will dilute the positive effect of the income increase. The expression in the denominator is unwieldy, but it is instructive to consider some special cases. First consider the case $a = 1$, i.e. there is no memory decay, or equivalently, preferences are constant.
We see immediately that the numerator becomes $\ln(y_{00}) - \mu + \delta$, while the denominator is equal to $\sigma$. Thus the full income increase will be reflected in an increase in income satisfaction. The polar opposite case is where $a = 0$, i.e. preferences adapt immediately to the income change. In that case the numerator becomes $\ln(y_{00}) - \mu + \delta(1 - q_0)$, so part of the effect of the income increase dissipates due to habit formation. The denominator in this case is equal to $\sqrt{\sigma^2 + \delta q_0 (1 - q_0) A}$. If $\ln(y_{00}) - m_0$ is negative (i.e. own income before the increase was below the median), then it is possible that the added term is negative and the denominator would actually shrink, which would magnify the effect of the income increase on income satisfaction. If, on the other hand, own income before the increase was already above the median, then the extra term in the denominator is positive and the increase of the denominator blunts the effect of the income increase. In other words, the effects of one-off income increases on income satisfaction will tend to be larger for incomes below the median than for incomes above the median.

Sustained, self-only income growth

Our final example examines an individual’s income satisfaction with a constant income growth rate. This may be the case of a young person entering the workforce and embarking on a successful career filled with promotions and other salary increases, while average incomes in society remain stable. Thus we assume: $y_{0\tau} = y_{0,\tau-1} \exp(\delta) \Rightarrow \ln(y_{0\tau}) = \ln(y_{0,\tau-1}) + \delta$, or $\ln(y_{0\tau}) = \ln(y_{00}) + \tau\delta$ (note that $\tau$ is negative). For all other individuals, $y_{i\tau} = y_{i,\tau-1}$, $\forall i \neq 0$. The first obvious implication of the fact that the incomes of all $i \neq 0$ are constant is that $m_\tau = m_0 = m$, $s_\tau^2 = s_0^2 = s^2$.

Calculating the log-moments of individual $i = 0$’s inclusive perceived income distribution, $\mu$ and $\sigma^2$, we find her income satisfaction to be equal to (see Appendix A.1.2 for detailed derivations):

\[ W_1(y_{00}) = \Phi\left( \frac{(1 - q_0)\left[ \ln(y_{00}) - m_0 \right] + q_0 \delta \frac{a}{1 - a}}{\sqrt{\sigma_0^2 + C\left[ q_0 B + (1 - q_0) A \right]}} \right) \tag{1.7} \]

where $A = \left[ \delta \frac{1 + a}{1 - a} + 2\left( \ln(y_{00}) - m_0 \right) \right]$, $B = \frac{\delta}{1 - a}$, $C = q_0 \delta \frac{a}{1 - a}$.

The income satisfaction depicted in equation 1.7 shares characteristics with equation 1.5 (sustained income growth for everyone) and equation 1.6 (one-off, self-only income increase). Once again the numerator is relatively easy to interpret: it is the same as in equation 1.5, except that the term $\delta \frac{a}{1 - a}$ is now scaled by the own-income reference weight $q_0$. This makes sense since, in contrast to the sustained across-the-board example, in this case only $i = 0$’s income is increasing. The denominator resembles the one in equation 1.6, with $C$ similar to the term $\delta q_0 (1 - a)$. The difference is that the memory parameter $a$ now increases $C$. This happens because, in contrast to the one-off, self-only example, the income increase is sustained, so the longer the memory of past periods, the higher the income variation will be (due to the increase of own income only). A similar argument can be made for both the terms $A$ and $B$ in equations 1.6 and 1.7. We can safely ignore the possibility of a negative impact of the term $(\ln(y_{00}) - m_0)$ in the variance since, after some point, $\ln(y_{00})$ will clearly overtake $m_0$. Every period, individual $i = 0$ will be getting an increase in her income satisfaction because of the increase in her income. This gain is scaled down (through $q_0$) because the increase also raises the first log-moment of her inclusive perceived income distribution, and it is scaled down further because the increase also raises the log-variance of the distribution. Notice that equation 1.7 converges to $\Phi\left( \sqrt{(1 - q_0)/q_0} \right)$ as $\ln(y_{00}) - m_0 \to \infty$ (which will be the case as individual $i = 0$’s income keeps increasing at any positive rate $\delta > 0$, while the incomes of everyone else remain constant) (see Appendix A.1.2 for derivation).
Thus, in the limit, individual $i = 0$’s income satisfaction will depend on the weight that $i = 0$ attaches to her own income versus the incomes of everyone else in her reference group. Let’s take a look at the two limiting cases. (1) As $q_0 \to 0$ ($i = 0$’s perception of the income distribution resembles more and more the true income distribution), $\Phi\left( \sqrt{(1 - q_0)/q_0} \right) \to \Phi(\infty) = 1$. In this case, the individual’s income satisfaction is at its maximum since her income is so far to the right of her perceived (true) income distribution, i.e. the more the individual perceives the world around her, the more she internalizes the benefit of her increased income and, consequently, the happier she is with that income level. (2) As $q_0 \to 1$ ($i = 0$ only perceives her own income; no other incomes are relevant in her perception of the income distribution), $\Phi\left( \sqrt{(1 - q_0)/q_0} \right) \to \Phi(0) = 0.5$. In this case, since the individual does not perceive any other income as relevant, the constant increase in her own income is entirely offset by the identical increase in her perceived income distribution (which consists of only her own income). Thus, the less concerned an individual is with the incomes of others, the less gain in income satisfaction she receives from increases in her own income.

1.4 Empirical Analysis

1.4.1 Data description

We use data collected by the Gallup World Poll (GWP). Since 2005, the Gallup World Poll has continually surveyed residents in over 150 countries, interviewing about 1,000 randomly sampled individuals in each country. World Poll questions measure opinions about national institutions, corruption, youth development, community basics, diversity, optimism, violence, religiosity, and other topics. The World Poll questionnaire is translated into the major languages of each country. The translation process starts with an English, French, or Spanish version, depending on the region.
A translator proficient in both the original and target languages translates the survey into the target language. A second translator reviews the translated version against the original version and recommends refinements. With some exceptions, all samples are probability-based and nationally representative of the resident population aged 15 and older. The coverage area is the entire country, including rural areas, and the sampling frame represents the entire civilian, non-institutionalized population aged 15 and older in each country. Exceptions include areas where the safety of interviewing staff is threatened, scarcely populated islands, and areas interviewers can reach only by foot, animal, or small boat. Specifically, sampling in the Central African Republic, Democratic Republic of the Congo, Lebanon, Pakistan, India, Syria, Azerbaijan, Georgia, Morocco, Myanmar (Burma), Chad, Madagascar, Moldova, and Sudan was affected by security; some of these, as well as Canada, China, Laos, and small parts of Japan, had non-representative sampling of some geographic regions. In Arab countries (Bahrain, Kuwait, Saudi Arabia), sampling was of citizens (including Arab expatriates) and those who could complete the survey in Arabic or English; in the United Arab Emirates, all non-Arabs were excluded, i.e. more than half of the population. In the Philippines, urban areas were over-sampled. Israel excludes East Jerusalem (Gallup reports Palestinian Territories separately). We conduct various robustness tests that exclude countries with non-representative samples; our results are largely robust to these exclusions. Telephone surveys are used in countries where telephone coverage represents at least 80% of the population or is the customary survey methodology. In Central and Eastern Europe and most of the developing world, an area frame design is used for face-to-face interviewing. In some countries, over-samples are collected in major cities or areas of special interest.
In some large countries, such as China and Russia, samples of at least 2,000 are collected. Gallup has created a worldwide data set with standardized income and education data. Annual household income in international dollars is calculated using the Individual Consumption Expenditure corrected for the Household PPP ratio from the World Bank. These PPP-corrected values correlate strongly (r = 0.94) with the World Bank estimate of per-capita GDP (PPP-corrected). The result is a household income measure that is comparable across all respondents, countries, and local and global regions.

Response rates are calculated according to AAPOR Standard Definitions (Callegaro and Disogra, 2008), and reported figures include completed and partial interviews, refusals, non-contacts, and unknown households. Gallup World Poll response rates vary by mode of survey and region. Response rates in Sub-Saharan Africa are higher than in other world regions, ranging from a high of 96% in Sierra Leone to a low of 54% in Nigeria, with an average response rate of 80%. Average response rates for the Middle East, Asia, South America and former Soviet Union countries are 63%, 56%, 43%, and 50%, respectively.

As part of a National Institute on Aging supported project, Gallup added a module with anchoring vignettes to surveys conducted in 109 countries during 2011-2014. These countries provide the sample for the current paper. Eighteen countries were interviewed in 2011, 39 in 2012, 26 in 2013, and 26 in 2014. Most countries have approximately 1,000 observations, with the exceptions of Russia (1,500), India (5,000), China (4,500), Germany (3,000), United Kingdom (3,000) and Haiti (500), for a total of about 120,000 observations.

The GWP contains several questions about life satisfaction, including experiential well-being questions. However, in this paper we focus on evaluative well-being by using variations of the following question:³

“Please imagine a ladder with steps numbered from zero at the bottom to 10 at the top. The top of the ladder represents the best possible life for you and the bottom of the ladder represents the worst possible life for you. On which step of the ladder would you say you personally feel you stand at this time?”

3 The question is often referred to as the “Cantril ladder” (Cantril (1965)) and has been used extensively in the literature to elicit life satisfaction.

Anchoring vignettes

In fact, the variant of the Ladder question we will use takes advantage of the inclusion of anchoring vignettes in the GWP surveys. Respondents were asked to rate the life satisfaction of six hypothetical people (vignettes) who varied in gender, income, age and life situation. The six vignettes were from one of two sets (set A and set B), where the set administered to a respondent was randomly assigned within each country. Table 1.1 summarizes the 12 vignettes, characterizing the description of each vignette along six dimensions: Income, Sex, Age, Health, Family, Job.⁴ Within each country, roughly half of the respondents rated the set A vignettes and about half rated the set B vignettes. For each of the six vignettes, each respondent was asked to answer the following question:

“Imagine again a ladder with steps numbered from zero at the bottom to 10 at the top, where zero is the worst possible life and 10 is the best possible life. I will now read you some descriptions of several people’s lives. Please tell me on which step of the ladder you think each person stands.”

4 For a complete description of the vignettes see Appendix A.3.2.

Table 1.1: Summary of vignette descriptions.

ID | Income       | Sex    | Age | Health                      | Family                                    | Job
A1 | Median       | Female | 40  | Back pain                   | Good                                      | -
A2 | Twice median | Male   | 50  | Good                        | Divorced; Good relationship with daughter | Secure
A3 | Half median  | Male   | 25  | Good; Some stress           | Single; Many friends                      | Worries about losing job
A4 | Median       | Female | 35  | -                           | Married; No children                      | Dull; Secure
A5 | Half median  | Female | 70  | Back pain                   | Widow; Many friends                       | -
A6 | Twice median | Male   | 60  | Very active                 | Single; Many friends                      | No work but happy with it
B1 | Twice median | Male   | 40  | Severe back pain            | Happy; Good                               | Likes job
B2 | Half median  | Female | 65  | Heart prob.                 | Widow; Misses husband; Good family        | No job because of health
B3 | Half median  | Male   | 35  | -                           | Married; No children                      | Dull; Secure
B4 | Twice median | Female | 60  | Sleep trouble               | Divorced; Little contact with children    | Interesting
B5 | Median       | Female | 70  | Overweight; Trouble walking | Married; lives away from family           | -
B6 | Median       | Male   | 50  | Rare exercise; Knee trouble | Married; Spend little time together       | Secure

The fact that the vignette rating question is almost identical to the formulation of the evaluative well-being question cited above indicates their intended use as anchoring vignettes to correct for response scale differences across countries.⁵ Montgomery (2016) has used the anchoring vignettes in our dataset to explain how cross-country differences in rating scales between men and women may explain some counter-intuitive findings about the subjective well-being of men and women in some countries. However, we take a different approach. Rather than using vignettes to correct for response scale differences, we will utilize responses to vignette questions to estimate a model of the determinants of life satisfaction.

5 For an excellent introduction to the field and a review of existing literature visit https://gking.harvard.edu/vign.

1.4.2 Model adaptation

The richness of our dataset allows us to explore life satisfaction questions in a more structured way than was previously possible.
Nevertheless, we need to make some adaptations and impose some assumptions that allow us to fully identify the parameters of primary interest, namely the reference weight on one’s own income $q_0$, the memory weighting parameter $a$ and the weight of income satisfaction in the equation explaining overall life satisfaction, $\beta_1$.

Relativity. An important part of our model is relativity, i.e. the weight that individual $i = 0$ places on the incomes of others in her perception of the income distribution around her. Since we do not have information on the potential relationships across people in our sample, we will assume that an individual’s reference group consists of all the people in her country, with an equal weight on each individual except herself. This is estimated from the sample in each country as $\frac{q_i}{1 - q_0} = \frac{1}{n - 1}$, $\forall i \neq 0$, where $n$ is the size of the sample in each country (usually $n = 1000$, see table A.5 in Appendix A.3.3 for details).⁶ The assumption implies that the first two log-moments of the (non-inclusive) perceived income distribution, $m_\tau$ and $s_\tau^2$, are equal to the log-moments of the actual observed income distribution in each country. We estimate the log-moments from the corresponding country samples in the GWP (which, for most countries, are representative samples). Given the assumption of equal reference weights assigned to others in society, the remaining parameter of interest is $q_0$, the weight that an individual places on her own income history in the inclusive perceived income distribution.

6 We tweak this assumption in a number of ways to test for robustness by redefining reference groups to be individuals within one’s own country, education level (elementary, tertiary, beyond), and age group ($\pm 10$ years), and various linear combinations of the above. Our estimation results are largely unaffected, and so we stick with the most general specification in order to maximize the power of reference group samples (see Appendix A.2.1 for results).
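To make the relativity assumption concrete, the following Python sketch (an illustration, not the estimation code used in the paper) computes the equal-weight log-moments of the others’ incomes and then the inclusive log-moments at a point in time, using the expressions for the log-mean and log-variance from section 1.3.4. The function names and the toy incomes are hypothetical.

```python
import math


def perceived_log_moments(other_incomes):
    """Log-mean m and log-variance s^2 of the others' incomes, giving each
    of them the same normalized weight q_i / (1 - q_0) = 1 / (number of others)."""
    logs = [math.log(y) for y in other_incomes]
    m = sum(logs) / len(logs)
    s2 = sum((v - m) ** 2 for v in logs) / len(logs)
    return m, s2


def inclusive_log_moments(own_income, other_incomes, q0):
    """Log-mean and log-variance of the inclusive perceived distribution at a
    single point in time: mu = q0*ln(y0) + (1-q0)*m and
    sigma^2 = (1-q0) * {q0*(ln(y0) - m)^2 + s^2}."""
    m, s2 = perceived_log_moments(other_incomes)
    gap = math.log(own_income) - m
    mu = q0 * math.log(own_income) + (1.0 - q0) * m
    sigma2 = (1.0 - q0) * (q0 * gap ** 2 + s2)
    return mu, sigma2


# Hypothetical toy sample: three other respondents and one own income.
others = [math.exp(1.0), math.exp(2.0), math.exp(3.0)]
m, s2 = perceived_log_moments(others)
mu, sigma2 = inclusive_log_moments(math.exp(4.0), others, q0=0.5)
```

In this toy case the others have log-mean 2 and log-variance 2/3; adding an own log-income of 4 with weight 0.5 moves the inclusive log-mean to 3 and raises the inclusive log-variance to 4/3, because own income lies far from the others’ mean.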
Habit Formation. Another important part of our model is habit formation, i.e. the memory-discounting of past periods in one’s perception of income distributions over time. Unfortunately, our data is cross-sectional, so we have no information on the evolution of individual income histories. Absent this information we assume that all individuals’ incomes have been growing at the rate of their respective countries’ GDP. Information about past GDP levels varies by country. To maximize the number of countries that can be included in estimation, we only consider GDP over the past $n$ years, where $n = 10$. We assume that GDP has been constant for all years prior to $\tau = -n$.[7] This assumption implies a convenient simplification: the relative position of all incomes remains constant over time. For example, an individual currently in the top 20th percentile of the income distribution in her country has always been in the top 20th percentile of the income distribution in her country. This assumption, together with GDP growth data from the World Bank, allows us to then use the variation of GDP growth rates across countries to identify the structural memory decay parameter $a$.

The income of the vignettes in our sample was presented in local currency and normalized to each country’s median income in order to facilitate comparability across countries. As is evident from Table 1.1, the 12 vignettes can be partitioned in three groups according to differences in income: {Half median, Median, Twice median}. Denoting the median income multiplier by $\kappa \in \{0.5, 1, 2\}$, we have that the incomes shown in the vignettes are equal to $\kappa \exp(m)$ and, equivalently, that vignette log-income is equal to $\ln(\kappa \exp(m)) = \ln(\kappa) + m$.
According to equations 1.2 and 1.3, an individual will evaluate a particular income level $y = \kappa \exp(m)$ by $y$’s relative position in her inclusive perceived income distribution $H(x)$, with parameters $\mu$ and $\sigma$, as follows:

$$W_1(\kappa \exp(m)) = \Phi\!\left(\frac{\ln(\kappa) + m - \mu}{\sigma}\right) \qquad (1.8)$$

where $\mu$ and $\sigma$ were defined in section 1.3.4. The adaptations we made in this section imply that, within an individual’s reference group, incomes rise at the same rate, i.e. the growth factor $\gamma_\tau$ is common to all individuals $i$. We can use this implication to simplify some of our expressions as follows:[8][9]

$$
\begin{aligned}
y_{i\tau} &= \gamma_\tau \, y_{i,\tau-1}, \quad \forall i \\
m_\tau &= \ln(\gamma_\tau) + m_{\tau-1} \\
\ln(y_{0\tau}) - m_\tau &= \ln(y_{0,\tau-1}) - m_{\tau-1} \quad \text{(constant)} \\
\mu_\tau &= q_0 \, [\ln(y_{0\tau}) - m_\tau] + m_\tau \\
s_\tau^2 &= s_{\tau-1}^2 \quad \text{(constant)} \\
\sigma_\tau^2 &= \sigma_{\tau-1}^2 \quad \text{(constant)}
\end{aligned}
$$

These expressions allow us to rewrite $\mu$ and $\sigma^2$ as follows:

$$\mu = \sum_{\tau=-\infty}^{0} a(\tau)\, \mu_\tau = \sum_{\tau=-\infty}^{0} a(\tau)\, m_\tau + q_0 \, [\overline{\ln(y_0) - m}]$$

$$\sigma^2 = \sum_{\tau=-\infty}^{0} a(\tau) \left\{ \sigma_\tau^2 + (\mu_\tau - \mu)^2 \right\} = \bar{\sigma}^2 + \sum_{\tau=-\infty}^{0} a(\tau) \left( \sum_{s=-\infty}^{0} a(s)\, m_s - m_\tau \right)^2$$

where:

$$\sum_{\tau=-\infty}^{0} a(\tau)\, m_\tau = (1 - a) \sum_{\tau=-n}^{0} a^{-\tau} m_\tau + a^{n+1} m_{-n} \equiv M$$

Thus we have that:

$$\mu = M + q_0 \, [\overline{\ln(y_0) - m}] \qquad (1.9)$$

$$\sigma^2 = \bar{\sigma}^2 + (1 - a) \sum_{\tau=-n}^{0} a^{-\tau} (M - m_\tau)^2 + a^{n+1} (M - m_{-n})^2 \qquad (1.10)$$

The availability of data on GDP growth rates varies by country. For most countries we have estimates of GDP growth going back in time by ten years. Thus we set $n = 10$. We assume that prior to $\tau = -10$, income growth is zero.[10]

[7] In the spirit of the robustness checks used for the relativity assumptions, we tweak this assumption in the following way: 1. We estimate the cross-sectional income growth with age for each country and education group. 2. We add this growth to the GDP growth for each period. We do this in order to best account for the impact of age on one’s income growth. As before we find no significant differences, so we stick with the more general assumption to maximize sample size per reference group.
[8] We use a bar over an expression to denote that this part is constant across $\tau$.
[9] See Appendix A.1.3 for derivations.

1.4.3 Specification

With the above, we have a complete specification of the welfare function of income $W_1$, which we can use in estimation. However, income satisfaction is just one component of overall life satisfaction (equation 1.1).
To complete a specification that can be estimated from the data, we need to define $K - 1$ more domain-specific satisfaction functions. For the purposes of this paper, we will take an agnostic view on the $W_k$ for $k \in \{2, 3, \dots, K\}$, and represent satisfaction with these domains through a collection of dummy variables. For example, let the second domain be gender; then the satisfaction for the domain of gender will be represented as $W_2(x_2 = \text{female}) = 1$, $W_2(x_2 = \text{male}) = 0$. The life domains work, family, and health are represented similarly (see Appendix A.3.2), while age is represented in linear form. The specifications of the non-income domains can be seen as reduced form expressions and mainly serve to reduce omitted variable bias. In sum, we will be estimating equation 1.11:

$$W(x_{ivc}) = \beta_0 + \beta_1 W_{1,ic}(\kappa_v \exp(m_c)) + \beta_2 W_2(x_{2,iv}) + \dots + \beta_K W_K(x_{K,iv}) + \beta_{K+1} Z_1 + \dots + \beta_{K+J} Z_J + \varepsilon_i \qquad (1.11)$$

where $i$ identifies the individual respondent, $v \in \{A1, \dots, A6, B1, \dots, B6\}$ is the vignette identifier and $c$ is the country identifier. The $x$’s represent other vignette characteristics defined above, while the $Z$’s represent both respondent characteristics (age, gender, education, etc.) measured in the GWP and country-level characteristics collected from the World Bank (life expectancy, education expenditure, GDP per capita). The $W_1$ function varies across individuals according to equation 1.8 and the assumptions made on the individual parameters $\mu$ and $\sigma$. For a given value of $\kappa$, $W_1(x_1)$ is fully specified by the set of equations {(1.8), (1.9), (1.10)}.[11] We estimate the structural parameters of the model $\{a, q_0\}$ as well as the reduced-form parameters using nonlinear least squares, clustering standard errors at the country level.

[10] If $a$ is small enough, this assumption should have little effect on the estimation results.
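To make the income component of the specification concrete, here is a minimal Python sketch of equations 1.8 to 1.10. It assumes geometric memory weights $a(\tau) = (1 - a)a^{-\tau}$, a log-normal perceived income distribution (so that the relative position is a standard normal CDF), and treats the constant within-period variance as an input called `sigma2_bar`. All function names, parameter names and toy numbers are hypothetical and are not the dissertation's estimation code.

```python
import math

def memory_weighted_mean(m_hist, a):
    """M: memory-weighted average of country log-mean incomes.

    m_hist lists m_tau from the oldest observed period (tau = -n) to the
    current one (tau = 0). Periods before -n are assumed equal to m_{-n},
    which therefore collects the remaining weight a**(n + 1).
    """
    n = len(m_hist) - 1
    recent = (1 - a) * sum(a ** k * m_hist[n - k] for k in range(n + 1))
    return recent + a ** (n + 1) * m_hist[0]

def inclusive_parameters(m_hist, a, q0, log_y0, sigma2_bar):
    """Return (mu, sigma2) of the inclusive perceived income distribution."""
    n = len(m_hist) - 1
    M = memory_weighted_mean(m_hist, a)
    # ln(y0) - m is constant over time under the equal-growth assumption,
    # so the current-period gap is used.
    mu = M + q0 * (log_y0 - m_hist[-1])
    drift = (1 - a) * sum(a ** k * (M - m_hist[n - k]) ** 2 for k in range(n + 1))
    drift += a ** (n + 1) * (M - m_hist[0]) ** 2
    return mu, sigma2_bar + drift

def income_satisfaction(kappa, m, mu, sigma2):
    """Relative position of income kappa * exp(m), via the standard normal CDF."""
    z = (math.log(kappa) + m - mu) / math.sqrt(sigma2)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Toy comparison: a stagnant economy versus one whose log-mean income grew.
flat = [2.0] * 11
growing = [2.0 + 0.05 * t for t in range(11)]
mu_flat, s2_flat = inclusive_parameters(flat, a=0.96, q0=0.2,
                                        log_y0=flat[-1], sigma2_bar=0.5)
mu_grow, s2_grow = inclusive_parameters(growing, a=0.96, q0=0.2,
                                        log_y0=growing[-1], sigma2_bar=0.5)
w_flat = income_satisfaction(1.0, flat[-1], mu_flat, s2_flat)
w_grow = income_satisfaction(1.0, growing[-1], mu_grow, s2_grow)
```

In this toy comparison a median-income respondent in the stagnant economy rates the median-income vignette at one half, while the respondent in the growing economy rates it higher, because remembered past incomes pull the frame of reference below current incomes.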
1.5 Results

In what follows, and prior to the structural estimation of our model, we present some summary statistics and reduced-form evidence that further motivate our theory and estimation results.

1.5.1 Summary statistics

Anchoring vignettes

Table 1.2 summarizes average responses for the twelve vignettes as well as the life satisfaction question.[12] (Respondents use an 11-point discrete scale [0-10].) For ease of exposition and interpretation, we provide the gender and income of each vignette. The columns depict average ratings of each vignette. Each vignette identifier is superscripted with the corresponding (Gender, Income) characteristics. For example, “A1 (F, m)” represents vignette A1, which describes a female earning median income (for full vignette description see Appendix A.3.2). “SD” is the standard deviation across the whole sample; “SD-Cntry” is the standard deviation of the average responses across countries. “SD-Age_grp” is the standard deviation across eight age groups, each with a 10-year range: [15, 25], (25, 35], (35, 45], (45, 55], (55, 65], (65, 75], (75, 85], (85, 100].

[11] To test for robustness we also utilize a different set of vignette dummies and individual-level variables. The results are largely unchanged. See Appendix A.2.1.
[12] This is the question where respondents were asked to rate their own life satisfaction.

Table 1.2: Summary statistics for reported life satisfaction of vignettes.

A-set | A1 (F, m) | A2 (M, Tm) | A3 (M, Hm) | A4 (F, m) | A5 (F, Hm) | A6 (M, Tm)
--- | --- | --- | --- | --- | --- | ---
Mean | 4.46 | 6.39 | 4.03 | 5.27 | 3.95 | 6.6
SD | 2.04 | 2.39 | 2.05 | 2.16 | 2.18 | 2.48
SD-Cntry | 0.71 | 1.15 | 0.69 | 1.01 | 0.94 | 1.26
SD-Age_grp | 0.12 | 0.30 | 0.05 | 0.23 | 0.24 | 0.35

B-set | B1 (M, Tm) | B2 (F, Hm) | B3 (M, Hm) | B4 (F, Tm) | B5 (F, m) | B6 (M, m)
--- | --- | --- | --- | --- | --- | ---
Mean | 5.14 | 3.95 | 4.27 | 5.05 | 3.97 | 4.11
SD | 2.18 | 2.17 | 2.17 | 2.16 | 1.95 | 2.00
SD-Cntry | 0.75 | 0.88 | 0.88 | 0.64 | 0.54 | 0.59
SD-Age_grp | 0.14 | 0.17 | 0.17 | 0.15 | 0.06 | 0.07
Vignettes in set B receive significantly lower ratings from respondents than the vignettes in set A. Since incomes, gender and ages are similar across the two sets, this difference is likely caused by the difference in vignette characteristics. Indeed, one can easily see (from Table 1.1 or Appendix A.3.1) that vignettes in set B, on average, appear to exhibit worse health, family, and job characteristics than the ones in set A. Responses to set A vignettes seem to have higher standard deviations across countries and age groups compared to vignettes in set B. Participants’ reported own life satisfaction is similar across the two vignette sets. This is expected due to random assignment and the fact that own life satisfaction was asked prior to the vignette questions. Interestingly, the mean reported own life satisfaction is much higher than the average reported life satisfaction of vignettes. This is especially true in the B set. Just two vignettes in the A set have received a higher average score than the average own life satisfaction. In fact, about one-third of respondents rate their life satisfaction at least as high as their highest rated vignette. If we control for the respondents’ income, we find that about half of the respondents whose family incomes are less than the vignette’s family income report at least as high life satisfaction for themselves as for the vignette in the case of the vignettes with half-median or median incomes. In the case of vignettes with twice-median incomes, this becomes one-third. These results may suggest that respondents evaluate their own life satisfaction differently than the life satisfaction of vignettes, and for that reason we will not use own life satisfaction responses in our model.
Other variables of interest

In addition to responses to vignette satisfaction questions that constitute our dependent variable, for our model estimations we include several respondent characteristics, such as log of household income, age, gender, etc. Table 1.3 summarizes the independent variables in our sample. The variable “ln(HH income)” represents the natural logarithm of the household income in international dollars (see Section 1.4.1 for a description of how household income is calculated). We use the acronym “HH” for household. The variable “Education” takes the value of 1 for elementary education or less, 2 for secondary and up to three years of tertiary education, and 3 for four years of tertiary education or beyond. The variable “Life satisfaction in 5 years” is the same Cantril ladder question as in Table 1.2 except that respondents were asked to imagine their life in 5 years’ time. The reason for the missing observations is that respondents had the option of declining to answer a particular question. The average age in our sample is 40.7 years (25.7 + 15), 54% of the sample is female, and also 54% is married. Less than 4% is divorced and about 7% is widowed. About 7% is unemployed, while the self-employed make up about 15% of the sample. For 76% of the sample religion is important and just 39% have home internet access; 41% live in urban areas. Interestingly, the average response for the estimated life satisfaction five years after the interview date is much higher than the average response for current life satisfaction (6.65 as compared to 5.35, see Table 1.2).

Table 1.3: Average responses for several variables of interest.

Variable | Observations | Mean | Std. Dev. | Min | Max
--- | --- | --- | --- | --- | ---
ln(HH income) | 101,867 | $-2.96 \times 10^{-8}$ | (0.87) | -8.86 | 5.19
Age-15 | 120,521 | 25.7 | (17.4) | 0 | 84
Female | 120,823 | 0.54 | (0.5) | 0 | 1
Married | 120,318 | 0.54 | (0.5) | 0 | 1
Divorced | 120,318 | 0.036 | (0.19) | 0 | 1
Widowed | 120,318 | 0.072 | (0.26) | 0 | 1
Domestic partner | 120,318 | 0.05 | (0.213) | 0 | 1
Unemployed | 117,819 | 0.069 | (0.25) | 0 | 1
Self employed | 117,819 | 0.15 | (0.35) | 0 | 1
Urban | 118,138 | 0.41 | (0.49) | 0 | 1
HH adults | 120,743 | 3.08 | (1.82) | 1 | 96
HH Children | 120,272 | 1.27 | (1.8) | 0 | 97
HH size | 120,194 | 4.35 | (2.89) | 1 | 104
Education | 120,236 | 1.81 | (0.67) | 1 | 3
Religion important | 99,615 | 0.76 | (0.43) | 0 | 1
Home internet access | 120,293 | 0.39 | (0.49) | 0 | 1
Life Satisfaction in 5 years | 111,201 | 6.65 | (2.4) | 0 | 10

1.5.2 Reduced-form evidence

As a starting point in identifying the determinants of life satisfaction, we ran a reduced form ordered probit regression explaining the vignette evaluations by three groups of independent variables: vignette characteristics, respondent characteristics and country-level characteristics. The results of this regression are shown in table 1.4. We examine each set of independent variables in turn. The results for vignette characteristics are mostly significant as expected: vignettes with higher income (median multiplier $\kappa$) get rated higher, as do female vignettes, while age does not seem to be significant. Our coding of vignette life situations (health, family, job) seems to capture those features well, with all “good” variables positively affecting the rating while all the “bad” variables affect the rating negatively, generally with a stronger effect than the “good” ones. Having a bad job seems to have the strongest negative impact on the rating.

Turning to respondent characteristics, we find that respondents with higher incomes tend to rate vignettes lower. This is consistent with the motivation behind the relativity model, since an individual with a higher income would be more likely to perceive a right-shifted income distribution than an individual with a lower income.
As a consequence the former individual will be less satisfied with a given level of income than the latter. While female vignettes tend to be rated higher, female respondents do not exhibit significantly different rating behaviors than male respondents. Individuals who report higher life satisfaction both now and expected in 5 years tend to also rate vignettes higher, while education, household size, being married or living in an urban area negatively affect ratings. Being separated, divorced, living with a domestic partner, being unemployed or self employed have a positive effect on ratings. Finally, at the country level, we find that countries with higher GDP per capita tend to give higher ratings to vignettes. However, this is reversed for GDP growth.

1.5.3 Structural estimation

To estimate our model (equation 1.11) we use nonlinear least squares, clustering at the country level. As starting values for the estimation algorithm we use $\{\beta_0, \beta_1, q_0, a\} = \{0, 1, 0.5, 0.8\}$. These values represent findings in past literature (see Kapteyn and Federica (2003), Van de Stadt et al. (1985)). To ascertain robustness, Appendix A.2.1 shows results based on different starting values. The estimates of the effects of the vignette characteristics are broadly plausible: vignettes are rated higher when health is better, and when the job is better. Bad family relations lead to lower evaluations, although our operationalization of a good family life seems to lead to a lower rating than the left out category. The estimated memory parameter $a$ is close to one, which makes the approximation of setting GDP growth to zero for periods more than 10 years before the current date poorer than we would have liked. The estimate of $q_0$ is small compared to estimates from earlier papers (Van de Stadt et al. (1985)).
The country-level macro variables life expectancy and education expenditure do not seem to significantly impact the vignette evaluation, while GDP per capita is only significant at the 10% level and shows a positive impact on the vignette evaluation, i.e. individuals from countries with a higher GDP per capita would tend to rate a vignette higher.

Table 1.4: Reduced form ordered Probit regression.

Variable | Vignette Evaluation
--- | ---
Vignette characteristics: |
$\kappa$ | 0.6*** (0.008)
Female | 0.21*** (0.007)
Age | -0.013*** (0.0003)
Health-good | 0.2*** (0.009)
Health-bad | -0.49*** (0.008)
Family-good | 0.21*** (0.007)
Family-bad | -0.05*** (0.009)
Job-good | 0.32*** (0.009)
Job-bad | -0.66*** (0.013)
Respondent characteristics: |
ln(HH Rel. income) | -0.053*** (0.004)
Female | 0.02*** (0.006)
Age | -0.002*** (0.0007)
Age Squared | 0.00004** (0.00001)
Education | 0.015*** (0.005)
HH size | -0.009*** (0.001)
Married | -0.06*** (0.009)
Separated | 0.06*** (0.02)
Divorced | 0.026 (0.019)
Widowed | -0.037** (0.016)
Domestic partner | 0.03* (0.015)
Urban | -0.03*** (0.007)
Unemployed | 0.022* (0.013)
Self employed | 0.017** (0.009)
Country-level characteristics: |
GDP growth | -0.66*** (0.15)
Standard of living | -0.15*** (0.008)
Population | $-1.77 \times 10^{-11}$** ($8.52 \times 10^{-12}$)
Life expectancy | 0.008*** (0.0007)
ln(Health expenditure) | 0.14*** (0.005)
N | 445,981

Standard errors are clustered at the country level. (*p < 0.1; **p < 0.05; ***p < 0.01.)

Table 1.5: Structural estimation.

Parameter | Vignette Evaluation
--- | ---
Deep parameters: |
$\beta_0$ | -6.93 (8.95)
$\beta_1$ | 0.209*** (0.02)
$q_0$ | 0.2*** (0.02)
$a$ | 0.96*** (0.042)
Reduced-form parameters: |
Age | -0.002*** (0.0002)
Female | 0.05*** (0.006)
HealthGood | 0.04*** (0.004)
HealthBad | -0.08*** (0.006)
FamilyGood | 0.03*** (0.004)
FamilyBad | -0.03*** (0.007)
JobGood | 0.08*** (0.01)
JobBad | -0.09*** (0.007)
N | 437,246
$R^2$ | 0.213
MSE | 0.2
Res. Dev. | -148583.4

Estimated using nonlinear least squares with clustered standard errors at the individual level. In the regression we additionally control for all the variables reported in Table 1.4. (*p < 0.1; **p < 0.05; ***p < 0.01.)
In Appendix A.2.1, we provide additional model estimation results where we code vignette characteristics in a different way and we add respondent characteristics. The results in table 1.5 are largely robust to these alternative specifications. Where we find significant differences, the mean squared error is higher, suggesting that our estimates correspond to a global minimum of the sum of squares.

1.5.4 Simulations

To gain some insight into the meaning of the estimated parameters, we provide a number of simulated outcomes, based on the parameter estimates in Table 1.5. We present results for three countries that vary in growth rates over the previous decade: China (high growth), Italy (stagnation), Greece (negative growth). To make outcomes comparable, all macro variables and vignette dummies are set equal to the overall sample mean. We then simulate three scenarios per country: (1) own income history and own income inequality;[13] (2) own income history and low income inequality ($s^2 = 0.21329$, Belarus in our sample); (3) own income history and high income inequality ($s^2 = 2.212288$, Namibia in our sample). For each scenario, we consider evaluations by three individuals/households, with their income being at either the 20th, 50th, or 80th percentile of their respective sample country income distributions. Thus we simulate 3 countries * 3 scenarios * 3 incomes/households * 3 vignette income levels ($\kappa \in \{0.5, 1, 2\} \times$ median). The simulated numbers (representing the expected evaluation of life satisfaction) are presented in Table 1.6.

The table shows very clearly the effect of recent economic growth: according to the model, Chinese respondents evaluate vignettes higher than Italians or Greeks, because their frame of reference is strongly influenced by their past incomes, which were substantially lower than their current incomes. As one would expect, the evaluations increase with $\kappa$.
Note that respondents with higher incomes evaluate a given vignette lower than respondents with lower incomes (within columns, compare vignettes with the same value of $\kappa$, but different respondent incomes). The differences are not large, due to the small estimate of $q_0$. Income inequality within a country influences the sensitivity of evaluations to the value of $\kappa$. To provide one example, consider an Italian respondent with median income. In the scenario with low income dispersion, the ratings of the highest and lowest income vignettes differ by .240 (.624-.384); in the scenario with the high income dispersion, the difference is only .102 (.555-.453).

[13] For own income history we use the country’s history of GDP growth. For own income inequality we use the sample variance of reported incomes in that country.

Figure 1.1 contains the graphs showing how the three hypothetical households would evaluate different incomes. The vertical lines in the graphs show the three different income levels mentioned in the graphs (half median, median, twice median). Of particular interest are the intersections of the evaluation functions with the vertical lines, as these represent the simulated evaluations of the vignette incomes, summarized in Table 1.6. In the graphs, blue lines represent the households at the 20th percentile of the respective country’s income distribution. Red lines represent the households at the 50th percentile of the respective country’s income distribution. Green lines represent the households at the 80th percentile of the respective country’s income distribution.

As expected, blue lines are higher than red, and red lines higher than green, indicating that higher-income households would rate the same income vignette lower, since their inclusive perceived income distribution is more to the right than that of poorer households. Nevertheless the differences are not large, reflecting the small effect of own income on the perceived inclusive income distribution.
Further, we can see that the lower the inequality in a country the more dispersed are the evaluations by different income-level households.

Table 1.6: Simulation results

% | $\kappa$ | China Own | China Low | China High | Italy Own | Italy Low | Italy High | Greece Own | Greece Low | Greece High
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
20 | 0.5 | 0.538 | 0.555 | 0.535 | 0.412 | 0.384 | 0.460 | 0.401 | 0.380 | 0.454
20 | 1 | 0.602 | 0.635 | 0.583 | 0.504 | 0.498 | 0.514 | 0.491 | 0.480 | 0.508
20 | 2 | 0.633 | 0.644 | 0.616 | 0.600 | 0.625 | 0.567 | 0.596 | 0.618 | 0.562
50 | 0.5 | 0.532 | 0.553 | 0.524 | 0.410 | 0.384 | 0.453 | 0.399 | 0.380 | 0.449
50 | 1 | 0.597 | 0.635 | 0.572 | 0.500 | 0.495 | 0.504 | 0.487 | 0.478 | 0.498
50 | 2 | 0.631 | 0.644 | 0.608 | 0.596 | 0.624 | 0.555 | 0.592 | 0.616 | 0.550
80 | 0.5 | 0.525 | 0.550 | 0.514 | 0.408 | 0.383 | 0.448 | 0.398 | 0.380 | 0.443
80 | 1 | 0.591 | 0.634 | 0.561 | 0.495 | 0.492 | 0.495 | 0.483 | 0.475 | 0.489
80 | 2 | 0.628 | 0.644 | 0.599 | 0.592 | 0.544 | 0.588 | 0.588 | 0.615 | 0.539

The results represent the simulated evaluation of life satisfaction by a particular household in a particular country evaluating a particular vignette in three scenarios. For example, a Chinese household at the 20th percentile of the Chinese income distribution, simulated at the scenario with the Chinese GDP growth and Chinese income inequality, would rate the life satisfaction of a vignette with income $0.5 \times$ median as 0.538. The column “%” shows the percentile of the country’s income distribution at which the representative household sits; “$\kappa$” shows the vignette median income multiplier; “Own” shows the respective country with own income history and income inequality; “Low” shows the respective country with own income history and low income inequality ($s^2 = 0.21329$, corresponding to Belarus in our sample); “High” shows the respective country with own income history and high income inequality ($s^2 = 2.212288$, corresponding to Namibia in our sample). One can get a sense of the effects of economic growth alone (holding inequality fixed) by comparing the “Low” and “High” columns across the three countries.
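The sensitivity comparison for the Italian median-income household can be checked directly against the Table 1.6 entries. The short Python sketch below recomputes the gap between the ratings of the twice-median and half-median vignettes in the low and high inequality scenarios; the dictionary layout and the helper name are illustrative choices only.

```python
# Simulated evaluations for an Italian household at the 50th percentile,
# copied from Table 1.6 (keys are the vignette income multipliers).
italy_median = {
    "Low": {0.5: 0.384, 1: 0.495, 2: 0.624},
    "High": {0.5: 0.453, 1: 0.504, 2: 0.555},
}

def rating_spread(by_multiplier):
    """Gap between the twice-median and half-median vignette ratings."""
    return round(by_multiplier[2] - by_multiplier[0.5], 3)

spreads = {scenario: rating_spread(vals) for scenario, vals in italy_median.items()}
```

The spread is 0.240 under low inequality and 0.102 under high inequality, so ratings respond more strongly to vignette income when incomes are less dispersed.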
Figure 1.1: Simulation graphs, panels (a) to (i). These graphs plot the simulated income satisfactions of the three hypothetical households in three countries (China, Italy, Greece). Blue/Red/Green lines represent the households at the 20th/50th/80th percentile of the respective country’s income distribution.

1.6 Conclusion

We constructed and estimated a structural model of life satisfaction where an individual’s life satisfaction is a weighted sum of their satisfaction with K life domains. We assume that an individual’s satisfaction with income is equal to her ranking in a perceived income distribution. We call this the individual’s perceived inclusive income distribution. It is characterized by two main features: 1. It is a weighted distribution of all incomes in one’s reference group, with weights representing the weight that the individual places on the incomes of others and herself. 2. It is a weighted average of all such perceived distributions over the individual’s lifetime, with a memory function representing the weight that the individual places on present versus past periods.

The unique feature of our approach is that we model how respondents rate the life satisfaction of others. These others are described by a number of anchoring vignettes that describe individuals who vary in a number of life domains as well as their income. In utilizing our data to estimate our model, we were forced to make a number of simplifying assumptions, such as equal income growth for everyone in a country and reference groups comprising the whole population of a country. These are obviously oversimplifications. On the other hand, by using data on a large number of countries, we can plausibly assume that the reference groups of individuals are for the vast majority contained within their country of residence.

Chapter 2
Incentives or Persuasion?
An Experimental Investigation[1]

[1] Joint work with Giorgio Coricelli and Alexander Vostroknutov.

2.1 Introduction

There are two types of strategies that a principal can employ to influence an agent’s behavior. The first strategy is to use monetary incentives to alter the size of the potential payoffs that the agent can expect from each available action, and the second is to use (Bayesian) persuasion to alter the likelihood of the potential payoffs resulting from each action. For example, consider an online retailer (the principal) who attempts to convince a consumer (the agent) to buy some product. The consumer faces uncertainty over the quality of the product and prefers to buy if the quality is high and not to buy if the quality is low. The consumer also holds a prior belief about the likelihood of the quality being high and acts to maximize his expected utility. What can such online retailers do to increase the chances that their potential consumers find it optimal to purchase? First, they can provide monetary incentives (like discounts, promotions, or bundling) directed at reducing consumers’ costs. Second, they can utilize (Bayesian) persuasion, for example, design personalized recommendations, expert reviews, or product placement, in order to shift the consumers’ prior beliefs about the quality of the product. While traditionally the design of mechanisms with monetary transfers was used in the majority of studies on principal-agent problems (Mechanism Design), more recently, the literature has shifted focus to the design of strategic information disclosure (à la Kamenica and Gentzkow, 2011) as a robust way of influencing economic agents (Information Design). The timing of the recent developments in strategic information disclosure is no coincidence. The beginning of the informational era of big data and machine learning has given way to massive information gatekeepers (e.g., Yelp, Netflix, Amazon, etc.)
who collect and utilize information to guide consumers’ behavior. This has been gradually shifting the focus of interest from monetary incentives to Bayesian persuasion.[2]

The goal of our study is to conduct a comparative analysis, both theoretical and behavioral, of the two strategies that principals can use to influence agents’ choices. We draw our inspiration from a recent literature that attempts to study Bayesian persuasion in a general game-theoretic framework, making parallels with mechanism design and naming it Information Design.[3] The studies by Bergemann and Morris (2019) and Taneva (2019) examine the problem of a designer who seeks to impose an agenda on a group of players. As is conventional in mechanism design, the designer is assumed to have the ability to commit to a transfer mapping to the players. However, in the parallel world of information design, the information designer is instead assumed to have an informational advantage over the players by being able to commit to a signal structure (probabilistic state-message mapping), essentially recommending actions to players.[4]

[2] Throughout the paper we use the terms monetary incentives / Mechanism Design; and informational incentives / (Bayesian) persuasion / Information Design interchangeably.
[3] The literature on information design is expanding fast with many recent contributions (Alonso and Câmara, 2016a,b,c; Babichenko and Barman, 2016; Bergemann and Morris, 2016; Bizzotto et al., 2016; Boleslavsky and Kim, 2018; DellaVigna and Gentzkow, 2010; Dughmi and Xu, 2016; Dughmi et al., 2016; Dughmi and Xu, 2017; Gentzkow and Kamenica, 2014, 2016, 2017; Gratton et al., 2017; Hernández and Neeman, 2018; Kolotilin et al., 2017; Li and Norman, 2018; Wang, 2013). The early seminal papers include Crawford and Sobel (1982) and Okuno-Fujiwara et al. (1990).
[4] For another interesting approach see Mathevet et al. (2020).
This theoretical parallelism is both intriguing and exciting for game theorists and economists in general. The mechanism design problems explored in many original studies of the past decades can now be investigated through an alternative route, namely information design. From an applied perspective, this raises several natural questions. How does this theoretical parallelism play out in practice? Can information designers utilize persuasion with the same effect (i.e., to maximize their payoffs) as mechanism designers use monetary incentives? How do agents react to being persuaded rather than incentivized? These questions define the scope of our paper. In a simple bilateral setting, where principals can act as both information and mechanism designers, we test whether they are more successful in using incentives or persuasion to influence the agents’ choices and to increase their payoffs. In doing so, we investigate whether behavior in the information design environment follows the predictions of Kamenica and Gentzkow (2011)’s Bayesian persuasion and explore several behavioral correlates pertaining to the practical differences between the two alternative incentive structures.

In the theoretical part of the paper we employ the original two-state Bayesian persuasion model (Kamenica and Gentzkow, 2011). We start with a baseline setup in which a principal is not able to act as either an information or a mechanism designer and trivially show that, in this case, she is guaranteed to get the lowest possible payoff. Then we extend the baseline game in two directions: 1) the principal can act as an information designer in an attempt to persuade the agent to take the principal-preferred action by communicating informative recommendations and 2) the principal can act as a mechanism designer in an attempt to incentivize the agent to take the principal-preferred action by providing monetary transfers.
We show that the two games are equivalent in terms of best response correspondences and, most importantly, that the expected payoffs of both players are identical in what Kamenica and Gentzkow (2011) call the “Principal-Preferred” Subgame Perfect Equilibrium [(PP)SPE].[5]

[5] Actually, Kamenica and Gentzkow (2011) name it “Sender-Preferred” Subgame Perfect Equilibrium to emphasize that in all of the cases where the receiver is indifferent between actions she always chooses the one that the sender prefers. We do the same except that, for reasons pertaining to comparison with the mechanism design literature, we call the Sender Principal and the Receiver Agent.

With this theoretical equivalence in place, we have a foundation on which we can test our research questions. Experimentally, however, we face two problems: 1) Bayesian persuasion in its standard form is too computationally intensive for lab participants and 2) information design (ID) and mechanism design (MD) are very different games, thus in order to detect behavioral differences that arise exclusively from their inherent features (information vs. money), the experimental setup should maximize the similarity of the two choice environments and eliminate any other confounds. To solve these problems we propose an innovative experimental design, which not only minimizes the differences between the two games, but also renders Bayesian persuasion a relatively “user-friendly” task. In our design, ID and MD tasks are simplified so that subjects in the roles of both principals and agents choose a single number, unlike in the few similar studies (e.g., Fréchette et al., 2019; Nguyen, 2017; Au and Li, 2018) where subjects choose many features related to the incentive structure and the reaction to received signals, which complicates the decision process considerably. Moreover, we use the strategy method that allows us to perform a deeper analysis of agents’ behavior.

Overall, average behavior in ID is strongly in line with the theory, whereas the behavior in MD is not. We find that principals, who attempt to influence the agents to choose their preferred action, extract higher rents when using informative recommendations in ID than when using monetary incentives in MD, despite the equivalence predicted theoretically. Specifically, principals are able to persuade agents more often than they are able to incentivize them, and successful persuasion attempts are, on average, more profitable than successful incentivization attempts. This result seems to be partially driven by the fact that agents appear to be more demanding when they are being incentivized than when they are being persuaded, which itself hinges critically on the agents’ perception of the relative value of information and money. Despite these differences between ID and MD, principals’ average payoffs in both games still fall short of the theoretical predictions of the Principal-Preferred Subgame Perfect Equilibrium which, according to our analysis, ignores a very important aspect of the principal-agent interaction: the distribution of bargaining power. While the two-stage nature of the non-cooperative game attributes zero bargaining power to the agents, we find that they are able to seize some part of the surplus, as if they have 40% of the bargaining power, a result that is remarkably similar in both ID and MD environments.

Our ID condition can act as a standalone experimental test of Kamenica and Gentzkow (2011)’s Bayesian persuasion. The theory’s point-wise predictions are strongly supported in terms of our participants’ choices, since average play is close to the predicted equilibrium in all periods. The predicted equivalence between the two environments breaks down in the MD condition where average play is higher than equilibrium.[6] Behind average play, however, we find highly heterogeneous and erratic individual behavior which not only has a detrimental effect on some principals’ payoffs, but oftentimes also causes loss of social surplus, indirectly damaging agents’ payoffs too.[7]

[6] As we show in Section 2.4.2, this happens due to agents’ monetary demands that force principals’ choices upwards.
[7] This shows that players’ strategies can vary substantially over time and due to feedback. Thus the practice of computing individual cutoff strategies ex post from individual choices used in some studies might be misleading.
[8] In MD, transfers between players mitigate the extreme-payoff nature of the game where participants can receive only 100 or 0 points.
[9] For example, when agents are successfully persuaded but end up with a bad outcome they exhibit an extreme reaction, reminiscent of betrayal aversion. In mechanism design such reaction does not arise since agents are directly compensated for exactly this contingency.

In terms of the behavioral differences between the ID and MD environments, the smoother payoff nature of MD seems to make players more stable in their choices.[8] Conversely, in ID, the possibility of extreme payoffs causes strong reactions that lead to inefficiencies. In addition, the “contractual agreements” implied by ID and MD, while equivalent in theory, seem to be perceived very differently. MD represents a commonplace interaction with a guaranteed payoff and a clear uncertainty resolution by Nature. The persuasion nature of the ID environment, however, coupled with a somewhat more intricate uncertainty resolution (i.e., a bad recommendation from the principal), renders behavior more volatile and perhaps emotion-driven.[9] Finally, in line with past literature (e.g., Kahneman and Tversky, 1973; Charness et al., 2007), our data suggest that participants have more difficulties dealing with probabilistic reasoning and Bayesian updating (ID) than with expected utility optimization (MD).
Agents’ choices are far from what they themselves believe is optimal to do in ID, but are very close to their beliefs in MD. As such, differences in the computational complexity of the two environments play an important role in determining pivotal aspects of the interaction like the agents’ monetary versus informational demands.

Overall, our paper shows that, while equivalent in theory, the two types of incentives can have dramatically divergent effects on behavior. Our experiment identifies several dimensions where behavior is differentially affected by the use of monetary versus informational incentives, along with the accompanying behavioral correlates responsible for those wedges. Real-world incentive designers (companies like Yelp, Amazon, Google, or governmental agencies) can benefit from our findings in choosing the more impactful incentive structure for each environment. Finally, we believe that the insights from our analysis can motivate the development of more behaviorally-driven theories that could help to bridge the applied side of the parallelism between the two incentive structures.

2.2 Theoretical Framework

The results derived from the theoretical framework presented in this section serve as the motivation behind our research questions and the experimental design. Our framework is a version of the original two-state Bayesian persuasion setup of Kamenica and Gentzkow (2011). We begin with an adaptation of the model where the principal (sender) is stripped of her ability to send messages to the agent (receiver), thus making her a mere observer. This results in the expected-utility-maximizing agent (receiver) choosing the action that gives the principal the smallest payoff. We then extend this baseline setup in two independent directions. In the first extension, the principal can commit to probabilistic state-contingent messages to the agent (persuasion) in the same way as in Kamenica and Gentzkow (2011).
We call this the “information design extension.” In the second extension, the principal can instead commit to action-contingent transfers to the agent (monetary incentives). We call this the “mechanism design extension.” Finally, the agent observes the principal’s message or incentives and takes an action. Both games admit a unique Subgame Perfect Equilibrium in which the expected payoffs of principals/agents are identical in the two games. More precisely, while the principal benefits from having the ability to commit to messages (persuasion) or transfers (incentives), she is indifferent between the two. The agent neither gains nor loses from the principal’s ability to persuade or incentivize.

2.2.1 The Baseline Model

Suppose that there are two players, Principal (P) and Agent (A), and two states of the world $S = \{R, B\}$ (red or blue ball) happening with probabilities $\Pr(B) > \Pr(R) \equiv p$, which is common knowledge. The agent’s action set is $C_A = \{r, b\}$, and the principal’s action set is empty, $C_P = \{\emptyset\}$. The state-action contingent payoffs for each player are denoted by $\pi^i_{s,c} \in \mathbb{R}$, where $i \in \{A, P\}$ refers to the player, $s \in S$ refers to the realized state and $c \in C_A$ refers to the agent’s action. The agent receives a positive payoff if she chooses the action which matches the state (i.e., $c = r$ when $s = R$ or $c = b$ when $s = B$). The principal’s payoffs are state-independent. She is solely interested in the agent’s action and receives a positive payoff only when the agent takes action $r$. We assume that the payoffs adhere to the following restrictions: $\pi^A_{R,r} = \pi^A_{B,b} \equiv \pi^A > 0$, $\pi^A_{R,b} = \pi^A_{B,r} = 0$, $\pi^P_{R,r} = \pi^P_{B,r} \equiv \pi^P > 0$, $\pi^P_{R,b} = \pi^P_{B,b} = 0$. To achieve the equality of the expected payoffs in equilibrium in the information design and mechanism design extensions, we need to impose $\pi^A = \pi^P \equiv \pi$. The payoffs are summarized in Table 2.1.

The agent maximizes her expected payoff and thus will always choose $b$, which maximizes the (ex-ante) probability of matching the state.
Given the agent’s optimal action $c^* = b$, the expected utilities of the players are

$$E_s\left[\pi^P_{s,c^*}\right] = 0 \qquad (\text{Principal}_{\text{Baseline}})$$

$$E_s\left[\pi^A_{s,c^*}\right] = \pi(1-p) \qquad (\text{Agent}_{\text{Baseline}})$$

Table 2.1: Payoff matrix in the baseline model

                      State realization
Agent’s choice        R                 B
r                     $\pi$ ; $\pi$     $\pi$ ; 0
b                     0 ; 0             0 ; $\pi$

In each cell, the leftmost number represents the principal’s payoff.

2.2.2 Information Design Extension

Consider the following extension of the baseline model where the principal can act as an information designer (Stage 1) prior to the agent’s choice (Stage 2). The principal is now endowed with the ability to construct a state-message mapping (henceforth, a “signal structure”) from which a message $m$, correlated with the realized state of the world, is communicated to the agent. This mapping determines the level of correlation between the states of the world and the messages (or the informativeness of each message). The principal’s action set becomes $C_P = \{(P_R, P_B) \mid P_R, P_B \in [0, 1]\}$, where $P_R = \Pr(m = \rho \mid s = R)$ and $P_B = \Pr(m = \beta \mid s = B)$ are the probabilities of a message $m \in M = \{\rho, \beta\}$ that is to be communicated to the agent. [10] Knowing $(P_R, P_B)$ and $m$, the agent Bayes-updates her beliefs about the likelihood of each state based on the message received and the signal structure from which the message was generated. Given her updated beliefs, the agent maximizes her expected payoff by choosing action $c^*$, which matches the state that is more likely to have been realized.

[10] A message $\rho$ or $\beta$ is always sent. When the ball is red ($s = R$), the message $\beta$ is sent with probability $1 - P_R$ and when the ball is blue ($s = B$) the message $\rho$ is sent with probability $1 - P_B$.

Implicit in this are two critical assumptions: 1) the principal is able to condition the messages on the realized state of the world without having observed it and 2) the principal can credibly commit to the signal structure (i.e., the agent can observe the mapping which generated the message). Without loss of generality, we assume $|M| = |S|$ (see Kamenica and Gentzkow, 2011).
In this way, messages can be thought of as action recommendations. The unique Principal-Preferred Subgame Perfect Equilibrium [(PP)SPE] of this game admits the following expected payoffs: [11]

$$E_s\left[\pi^P_{s,c^*}\right] = 2p\pi \qquad (\text{Principal}_{\text{ID}})$$

$$E_s\left[\pi^A_{s,c^*}\right] = \pi(1-p) \qquad (\text{Agent}_{\text{ID}})$$

Note the following: 1) the principal uses Bayesian persuasion (information design) to increase her expected payoff from zero (baseline model) to $2p\pi$, by providing state-contingent messages (action recommendations) that the agent finds optimal to follow rather than ignore; 2) the agent neither benefits nor loses from the principal’s persuasion. The latter happens because, while the principal tries to extract as much surplus as possible, she is constrained by the expected payoff that the agent can guarantee herself by simply choosing $b$, which we will refer to as the agent’s outside option. Thus, all social surplus generated by information design is captured by the principal.

[11] See Appendix B.1.1 for derivations.

2.2.3 Mechanism Design Extension

Consider another extension of the baseline model, where the principal can act as a mechanism designer (Stage 1) prior to the agent’s decision (Stage 2). The principal is now able to construct an action-contingent mapping which determines how payoffs are to be transferred from the principal to the agent conditional on the agent’s action. As a mechanism designer, the principal’s action set becomes $C_P = \{(t_r, t_b) \mid t_r, t_b \in [0, \pi]\}$, where $t_r$ and $t_b$ are the transfers to the agent if she chooses action $r$ or $b$ respectively. The agent takes into account the additional conditional payoffs and chooses the action $c^*$ that maximizes her overall expected payoff. [12] Implicit in this are two assumptions: 1) the principal is able to condition the transfers on the agent’s actions; 2) the principal can credibly commit to the action-contingent transfers (i.e., the agent is guaranteed to receive the transfer that is contingent on her chosen action).
The unique Principal-Preferred Subgame Perfect Equilibrium of this game admits the following expected payoffs: [13]

$$E_s\left[\pi^P_{s,c^*}\right] = 2p\pi \qquad (\text{Principal}_{\text{MD}})$$

$$E_s\left[\pi^A_{s,c^*}\right] = \pi(1-p) \qquad (\text{Agent}_{\text{MD}})$$

Note that 1) the principal uses monetary incentives (mechanism design) to increase her expected payoff from 0 (baseline model) to $2p\pi$, by providing action-contingent transfers which induce the agent to choose action $r$ instead of her baseline-optimal action $b$; 2) the agent neither benefits nor loses from the principal’s incentivization. The agent’s expected payoff remains the same as in the baseline game, and thus all social surplus from mechanism design is captured by the principal.

We summarize the equilibrium expected payoffs from the baseline model and the two extensions:

$$\text{Principal}_{\text{Baseline}} < \text{Principal}_{\text{ID}} = \text{Principal}_{\text{MD}}$$

$$\text{Agent}_{\text{Baseline}} = \text{Agent}_{\text{ID}} = \text{Agent}_{\text{MD}}$$

The equality of the principal’s equilibrium expected payoffs in the information design and the mechanism design extensions of the baseline model is our theoretical object of interest that the experiment described below is designed to test. [14]

[12] Here “additional” refers to the conditional payoffs that the agent receives in addition to the payoffs described in Table 2.1.

[13] See Appendix B.1.2 for derivations.

[14] Thus far, we have assumed risk-neutrality on the agent’s side. If we relax this assumption, the principal’s equilibrium payoffs will change in the MD game only and thus the equivalence breaks down. The direction of the wedge will depend on whether the agent is risk averse (in equilibrium, principal benefits relative to ID) or risk seeking (in equilibrium, principal is worse off relative to ID). Principal’s risk preferences do not affect the above results. In the experiment, we control for the risk preferences of both kinds of players (see the appendix section B.6 for further discussion).

2.3 Experiment

The experiment consisted of two treatments with three sections in each: Section ID (information design), Section MD (mechanism design), and Section 3. The two treatments differed in the order of sections ID and MD, while Section 3 was always implemented last. This allowed us to investigate possible order effects in the ID and MD sections.

At the beginning of the experiment participants were randomly assigned one of two possible roles: Principal (denoted as “Player A”) or Agent (denoted as “Player B”), roles that remained fixed throughout the experiment. In Section ID participants played 10 periods of the information design game, and in Section MD they played 10 periods of the mechanism design game. In each period, every principal was randomly matched with one agent to form a pair and to play the respective game. At the end of each period each participant received feedback about the outcome of the game and points earned. [15] Section 3 consisted of several tasks designed to help us measure the behavioral traits potentially responsible for behavioral differences in the ID and MD games. Participants were paid for one randomly chosen period from each section with an exchange rate of 100 points corresponding to 5 Euros (thus, they were paid for three choices in total).

[15] Feedback for principals and agents was not the same and was structured to reflect the information that each player would receive in an extensive form game. See Section 2.3.3 for details.

In each period and for every pair, participants were told that a ball would be randomly drawn from a virtual urn with 10 balls, three of which were red and seven blue. The goal of the agent was to correctly guess the color of the ball, and the goal of the principal was to attempt to persuade (section ID) or incentivize (section MD) the agent to guess red. The color of the ball was revealed to the participants at the end of the period. Experimental points earned by each participant depended on the revealed color and the agent’s guess in accordance with the games in our models with $p = 0.3$ and $\pi = 100$.
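As a numerical check of the equilibrium benchmarks of Section 2.2 under the experimental parameters (a red-ball probability of 0.3 and a payoff of 100 points), the following Python sketch recomputes them. It assumes risk-neutral players and the principal-preferred tie-breaking rule, with the red-state recommendation always correct in ID and a zero transfer for guessing blue in MD. The function names and the variable name `pi` for the common payoff are illustrative choices, not part of the original model description.

```python
def id_equilibrium(p, pi):
    """ID game: lowest blue-state accuracy x that the agent is willing to follow."""
    # Following yields pi * (p + (1 - p) * x); always guessing blue yields pi * (1 - p).
    x_star = (1 - 2 * p) / (1 - p)  # solves p + (1 - p) * x = 1 - p
    principal = pi * (p + (1 - p) * (1 - x_star))  # pi times Pr(agent guesses red)
    agent = pi * (p + (1 - p) * x_star)
    return x_star, principal, agent


def md_equilibrium(p, pi):
    """MD game: lowest transfer for guessing red that the agent is willing to accept."""
    # Accepting yields pi * p + t; rejecting (guessing blue) yields pi * (1 - p).
    t_star = pi * (1 - 2 * p)
    principal = pi - t_star
    agent = pi * p + t_star
    return t_star, principal, agent


if __name__ == "__main__":
    print(id_equilibrium(0.3, 100))
    print(md_equilibrium(0.3, 100))
```

With these parameters the sketch gives a cutoff accuracy of about 57.1% in ID and a transfer of about 40 points in MD, and in both games an expected payoff of about 60 points for the principal and 70 points for the agent.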
While the ID and MD games in our theoretical framework are described as two-stage sequential games, we implemented both as simultaneous-move games by eliciting agents’ choices with the strategy method. The principal constructed a signal structure (ID) or transfers (MD) and, at the same point in time, the agent chose which signal structures she wished to follow (ID) or which transfers to accept (MD). We describe this in more detail below.

2.3.1 Section ID (Information Design)

Principal’s choice. The principal’s role in ID was to act as an information designer in accordance with the ID game described in Section 2.2.2. Specifically, the principal had to construct a signal structure $(P_R, P_B)$ that would generate the recommendation “Guess red” or “Guess blue” conditional on the color of the ball drawn. [16] To simplify the decision problem and following the findings of Fréchette et al. (2019), we fixed $P_R$ at the equilibrium level of $P_R = 1$ (see Appendix B.1.1). [17] Consequently, the principal’s only choice was to set the percentage chance of generating the correct recommendation when the ball drawn was blue ($P_B$). We denote this choice by $X \in [0\%, 100\%]$. Thus, the principal’s signal structure is $(P_R, P_B) = (1, X)$. To understand what different choices of $X$ imply it is worth considering two extreme cases. When $X = 100\%$ the recommendation is always correct and fully reveals the color of the ball (Full Information). When $X = 0\%$ the recommendation is always “Guess red,” so no information about the color of the ball is provided, since when the ball is red the recommendation is also “Guess red” ($P_R = 1$; No Information).

[16] The signal structure $(P_R, P_B)$ sets the probabilities with which each recommendation is generated in each state (ball color) as follows: $P_R = \Pr(m = \text{“Guess red”} \mid s = \text{Red ball})$ and $P_B = \Pr(m = \text{“Guess blue”} \mid s = \text{Blue ball})$.

[17] Fréchette et al. (2019) allow subjects to manipulate the equivalent of $P_R$ and find that the vast majority of choices are at equilibrium ($P_R = 1$). We thus believe that our simplification does not significantly impact behavior.

Agent’s choice. The agent’s role in ID was to determine whether she would follow or ignore the principal’s recommendation for all possible signal structures. Following a recommendation means that the agent guesses the color that the recommendation suggests. Ignoring the recommendation means that the agent guesses blue regardless of the recommendation, which maximizes her payoff given the prior beliefs about the urn composition. Without observing the principal’s choice of $X$, the agent had to select which signal structures she wished to follow and which ones to ignore by choosing a cutoff minimum value of $P_B$ denoted by $Y \in [0\%, 100\%]$. The elicitation of a cutoff is appropriate here because the informativeness of the signal structure is monotonic in $X$. For example, the agent’s expected payoff from following a recommendation from a signal structure $(1, X)$ is greater than that from following the recommendation from any signal structure $(1, X')$ with $X' < X$. By choosing $Y$ in this manner the agent agreed ex ante to follow any recommendation coming from a signal structure that is at least as informative as $(1, Y)$ and to ignore any recommendation coming from less informative signal structures. Thus, the agent was not explicitly asked for a guess. Instead, if the agent followed the principal’s signal (when $X \geq Y$), her guess of the color of the ball was determined by the generated recommendation (red or blue), while in the opposite case ($X < Y$), her guess was always the color blue.
Agents who choose high values of $Y$ are hard to persuade since they only follow recommendations from very informative signal structures, while agents who choose low $Y$ are easy to persuade and follow recommendations from a large range of $X$.

The ID interaction. The principal faces the following tradeoff. Decreasing $X$ yields a higher chance of the “Guess red” recommendation, but a lower chance that it will be followed by the agent ($X \geq Y$). Thus the principal wants to choose $X$ as low as possible conditional on it being weakly greater than $Y$. The agent is less interested in the principal’s choice. As long as she doesn’t choose $Y$ too low, she is guaranteed a good expected payoff, either by being persuaded or through her outside option (guess blue).

2.3.2 Section MD (Mechanism Design)

Principal’s choice. The principal’s role in MD was to act as a mechanism designer in accordance with the MD game described in Section 2.2.3. The principal had to choose action-contingent transfers $(t_r, t_b)$ that would be transferred to the agent depending on her guess. In order to make the ID and MD games similar and since the principal earned zero points when the agent’s guess was blue, we fixed $t_b$ at the equilibrium level of $t_b = 0$. Consequently, the principal’s only choice was to determine the number of points that would be transferred to the agent if the agent’s guess was red ($t_r$). We also denote this choice by $X \in [0, 100]$. Thus, the principal was choosing $(t_r, t_b) = (X, 0)$.

Agent’s choice. The agent’s role in MD was to determine whether she would accept or reject the principal’s transfer. If the transfer was accepted, the agent committed to guessing red, while if it was rejected the agent committed to guessing blue. The agent had to choose which transfers she wished to accept and which ones to reject by choosing a cutoff minimum value of $t_r$, once again denoted by $Y \in [0, 100]$.
Eliciting a cutoff value is also appropriate here since the agent’s expected payoff from accepting the transfer is monotonic in $X$. By choosing $Y$ in this manner, the agent agreed ex ante to accept the transfer (and guess red) if it was at least $Y$ points and reject it (and guess blue) if the transfer was less than $Y$ points. Thus, like in ID, the agent was not explicitly asked for a guess. If the agent accepted the principal’s transfer (when $X \geq Y$) her guess was red, whereas if the agent rejected the transfer (when $X < Y$) her guess was blue.

The MD interaction. The principal faces the following tradeoff. Decreasing $X$ yields a higher payoff if the agent accepts, but also a lower chance of acceptance ($X \geq Y$). Thus, once again, the principal should aim to choose $X$ as low as possible conditional on it being weakly greater than $Y$. The agent is again less interested in the principal’s choice. As long as she doesn’t choose $Y$ too low, she is guaranteed a good expected payoff, either by being incentivized or through her outside option (guess blue).

2.3.3 Feedback

At the end of every period, participants received feedback about the outcome of the game. Feedback was designed to reflect the information that participants would have received after playing the two-stage ID or MD game. Both participants learned the color of the ball drawn, the recommendation generated (only in ID), the agent’s guess and the points earned. Agents also learned the principal’s choice of $X$ (the signal structure or transfer) while principals only learned whether agents followed/accepted (i.e., whether $X \geq Y$) or ignored/rejected their recommendation/transfer (i.e., whether $X < Y$). The reason for this asymmetry in feedback is that neither theory nor practice dictates that the principal should learn the precise amount of recommendation informativeness or transfer amount that would induce the agent to follow the recommendation or accept the transfer.
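The way a single period is resolved in the two games can be sketched in Python as follows. This is an illustrative reconstruction of the rules described above (match when the principal's choice is at least the agent's cutoff, blue as the outside option, 100 points per favourable outcome), not the software used in the experiment. The function name, argument names and the use of the `random` module to generate the recommendation are assumptions.

```python
import random

POINTS = 100  # points for a favourable outcome


def resolve_period(game, x, y, ball, rng=random):
    """Return (guess, principal_points, agent_points) for one pair.

    game: "ID" or "MD"; x: principal's choice (0-100); y: agent's cutoff (0-100);
    ball: "red" or "blue".
    """
    matched = x >= y
    transfer = 0
    if not matched:
        guess = "blue"  # outside option in both games
    elif game == "MD":
        guess = "red"  # accepting the transfer commits the agent to red
        transfer = x
    elif ball == "red":
        guess = "red"  # red-state recommendation is always "Guess red"
    else:
        # Blue ball: the recommendation is correct with probability x percent.
        guess = "blue" if rng.random() < x / 100 else "red"
    principal = (POINTS if guess == "red" else 0) - transfer
    agent = (POINTS if guess == ball else 0) + transfer
    return guess, principal, agent
```

For instance, `resolve_period("MD", 50, 40, "blue")` is a match in which the agent guesses red, so the principal keeps 50 points and the agent receives only the 50-point transfer.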
2.3.4 Summary of the Experimental Procedure

In each period (of sections ID and MD) a principal and an agent were randomly paired. Each participant simultaneously chose a number from 0 to 100 by sliding a pointer (see Appendix B.8 for the screenshots). Principals’ choices were referred to as $X$ and agents’ choices as $Y$. If $X \geq Y$ we say that the principal has successfully persuaded/incentivized the agent (or that the players have matched). In this case, the agent follows/accepts the principal’s recommendation/transfer. This in turn implies that the agent’s guess is determined as follows. In Section ID the agent guesses red if the recommendation is “Guess red” and guesses blue if the recommendation is “Guess blue.” In Section MD, if $X \geq Y$ the agent guesses red and $X$ points are transferred from the principal to the agent. If $X < Y$ we say that the principal has failed to persuade/incentivize the agent (or that the players have not matched). In this case, the agent ignores/rejects the principal’s recommendation/transfer. This implies that the agent guesses blue in both games and that no points are transferred in MD.

2.3.5 Section 3

While the purpose of sections ID and MD was to detect and compare observed behavioral differences across the two parallel (informational and monetary) games, various potential confounds, like risk preferences, probabilistic reasoning, fairness considerations and expected utility maximizing, could be inflating or diminishing the observed wedges. Section 3 was designed in order to control for and further investigate the individual and collective roles played by the aforementioned factors, thus allowing us to obtain a deeper insight into the driving forces behind behavioral differences. The section consisted of a series of tasks/choices and was always implemented last, i.e., after sections ID and MD. All tasks were incentivized and one was chosen randomly for payment. [18]

Dictator ID. Participants were randomly matched in pairs to play one round of ID.
Each player could fully control the outcome, which we call a Dictator choice. That is, principals chose $X$ under the condition that $Y = 0$, and agents chose $Y$ under the condition that $X = Y$ for any chosen $Y$. Thus, all pairs were forced to match with the most favorable conditions for the players who were choosing. This task served to control for differences in fairness considerations in the two games and was incentivized in the same way as all the choices in the ID games.

Dictator MD. Same as above only with MD.

Cutoff ID. Participants were incentivized to give their best estimate of the Bayesian rational cutoff in the ID game (the ID two-stage equilibrium outcome, 57.1 points). Specifically, they were paid proportionally to how close their answer was to the actual cutoff. This served to control for participants’ beliefs about the rational play in the ID game.

Cutoff MD. Same as above only with MD (the MD two-stage equilibrium outcome with risk neutral agent, 40 points). [19]

[18] For detailed instructions see Appendix B.8.5.

2.3.6 Design Implementation

The experiment was conducted in June 2018 at the Department of Economics, University of Trento, Italy. We collected the data from 8 sessions with a total of 108 subjects. In the 4 sessions of Treatment 1, participants played the ID section first, and in the 4 sessions of Treatment 2 they played the MD section first. Each session with 12 or more participants was divided into two groups in which random matching was done independently. Thus, the two groups inside such a session never interacted, creating two independent observations. We have 6 groups with a total of 50 participants in Treatment 1 and 7 groups with 68 participants in Treatment 2, which constitutes 13 independent observations. Participants were informed that they would take part in an experiment with three parts and that the instructions for each part would be given to them before each part began.
In order to familiarize the participants with the rules of the game and the interface, in Sections ID and MD they played one round of the game in both roles (with themselves). We did not find any significant order effects and thus we merge the data from the two treatments (irrespective of the order of sections ID and MD). Sessions lasted around 1 hour and 30 minutes and participants were paid on average 12 Euros (8 Euros for principals and 14 Euros for agents), a compensation which is in line with the average payment for similar experiments in Italy.

[19] To control for the possible effects of risk preferences we also elicited the MD cutoff unconstrained by risk neutrality. Since we did not find statistically significant differences between the two measures, in our analysis we use only the risk neutral one since our theory is based on risk neutral players.

2.4 Results

2.4.1 Bayesian Persuasion in the Lab

The ID section of our experiment can act as a standalone test of Bayesian persuasion using a novel design that has not been utilized before in similar experiments. Thus, before we begin our comparative analysis of incentives and persuasion, we will discuss how the theory of Bayesian persuasion fares in the ID section. In what follows we will present and contrast two of our results from the ID section with Fréchette et al. (2019), whose experimental design is the closest to ours, albeit with some important differences. We will defer the discussion of participants’ payoffs to Section 2.4.2.

Participants’ choices. Our data strongly support the predictions of the Principal-Preferred Subgame Perfect Equilibrium in Kamenica and Gentzkow (2011). In fact, while individual choices vary a lot both across the choice space and time, average choices of both principals and agents are not statistically different from the theoretical point predictions (see Figure 2.1).
On average, principals construct the theoretically optimal signal structure and agents follow any recommendations that are at least as informative as that signal structure. Fréchette et al. (2019) find that while qualitatively the predictions are supported in terms of the direction of average choices, quantitatively their senders choose probabilities significantly higher than predictions. One reason for this discrepancy might be the complexity of the experiments. While in Fréchette et al. (2019) senders choose nine probabilities in each round, our principals make only a single probabilistic choice. Furthermore, in Fréchette et al. (2019) the prior probability is $p = 1/3$, while in our setup $p = 0.3$.

Cutoff strategies. One important innovation of our experimental design is the introduction of the cutoff strategies on the agents’ (receivers’) side. [20] We use them for two reasons. First, this allows us to extract more information. While other designs attempt to retrieve each participant’s implicit and unique cutoff point through aggregating multiple observed choices (see Fréchette et al., 2019), our design allows us to observe this cutoff directly. This not only gives us more accurate information on each participant’s implicit cutoff, but also allows us to track potential changes in this cutoff over time and through experience. Second, our setup improves time-efficiency and thus allows us to implement the ID and MD games in one session.

[20] We ask our agents to make cutoff strategy choices by indicating the minimum level of informativeness of a recommendation in ID or minimum amount of transfer in MD that they are willing to follow/accept. As such, they are essentially indicating a cutoff which splits the action spaces in two parts: the one that will be followed/accepted and the one that will be ignored/rejected.
Although agents’ cutoff choices look relatively consistent on average, a closer look shows that they vary dramatically over time, seemingly affected by bad circumstantial feedback and learning. We take a closer look at agents’ reactions in Section 2.4.5 and Appendix B.7. Our findings call into question the widely utilized back-of-the-envelope calculation of participants’ perceived cutoff strategies under the assumption of stable cutoffs.

Result 1. The theoretical predictions of the Bayesian persuasion model of Kamenica and Gentzkow (2011) are strongly supported in the ID section of our experiment in terms of our participants’ average choices: agents select on average the Bayes-optimal level of informativeness when choosing which recommendations to follow, and principals construct on average the optimal recommendation signal structure. While the average play is on par with the theory, players’ choices, including agents’ cutoff strategies, are highly heterogeneous and volatile.

2.4.2 Incentives or Persuasion?

A natural way to start our comparative analysis of persuasion versus monetary incentives is looking at the average choices $X$ and $Y$ and comparing them to the theoretical predictions. In order to be able to use non-parametric tests we consider averages over choices in each of the 13 independent groups of participants described in Section 2.3.6. [21] Thus, we operate with 13 independent observations. Table 2.2 shows average choices in all interactions (pairs) and the corresponding theoretical counterparts partitioned by roles and type of games.

[21] We find no significant order effects between sections ID and MD and so we merge the data from all relevant sections irrespective of the order.
In the ID game, as discussed in Section 2.4.1, principals’ and agents’ average choices are very close to the theoretical predictions (signed-rank tests, p = 0.753 and p = 0.311 respectively), but are significantly higher than the predictions in the MD game (signed-rank tests, p < 0.002). This means that agents appear to be on average more demanding with incentives than with persuasion: in order to comply with the principal they ask for more points in MD than for the equivalent recommendation informativeness in ID. This can make it harder for principals to influence agents’ choices through monetary incentives, thus potentially making persuasion a more successful strategy.

Table 2.2: Average choices in the ID and MD games

Section  Measure  Sample      Principals           Agents
                              Data      Theory     Data      Theory
ID       Choices  All pairs   56.6      57.1       60.4      57.1
                              (3.01)               (3.41)
MD       Choices  All pairs   50.2      40         56.5      40
                              (1.52)               (3.43)
N of independent observations 13                   13

Average choices in the ID and MD games in 13 independent groups of participants. Numbers in brackets indicate standard errors. “Theory” columns show the theoretical point predictions based on the (PP)SPE. “Data” columns show the averages over the 13 groups.

To understand whether principals are more successful at persuasion or incentivization, notice that their earnings critically depend on two factors: 1) how often a principal is able to successfully persuade/incentivize the agents (an unsuccessful attempt gives her zero points); 2) her choices in the periods when she successfully persuades/incentivizes the agents. Thus, a principal is facing a trade-off: increase $X$ (give a more informative recommendation/transfer more money) to improve the chances of a successful persuasion/incentivization and decrease $X$ (give a higher probability weight to her preferred recommendation/transfer less money) to improve the expected payoff from a potentially successful persuasion/incentivization.
We find that in ID principals succeed in their persuasion attempts on average in 5.25 periods out of 10, while in MD the success rate is 4.36 periods out of 10. The difference is significant (signed-rank test, p = 0.043). Thus, overall, principals are more often successful when persuading than when incentivizing agents. Still, successful matches constitute only about half of the total attempts. [22] This represents a stark contrast to the theory, where a principal can infer the agent’s cutoff through backward induction. However, in reality, equilibrium play usually arises over time (see Section 2.4.2 for the analysis of the evolution of play).

Next, we look at the payoffs that principals and agents receive given a successful attempt. To do that we consider average Matched Expected Payoffs (MEP). These are not the actual payoffs observed by the participants, but rather what they should expect to receive given their choices X and Y and conditional on being matched. [23] We find that this is a better measure of players’ aggregate performance than the realized payoffs since MEP do not contain noise due to the random draws of the ball and thus constitute a more natural comparison to the theoretical predictions. [24]

Table 2.3: Average Matched Expected Payoffs (MEP) in ID and MD

Section  Measure  Sample         Principals          Agents
                                 Data      Theory    Data      Theory
ID       Payoffs  Matched pairs  47.8      60        82.2      70
                                 (2.73)              (2.73)
MD       Payoffs  Matched pairs  36.6      60        93.4      70
                                 (3.41)              (3.41)
N of independent observations    13                  13

Numbers in brackets indicate standard errors.

[22] As explained in Section 2.3.4, a match refers to the situation when X ≥ Y (a successful persuasion or incentivization). The case X < Y is referred to as a non-match (an unsuccessful persuasion or incentivization).
[23] In case of a match in ID, Matched Expected Payoffs are (100 − 0.7X, 30 + 0.7X) for principals and agents respectively. In case of a match in MD, Matched Expected Payoffs are (100 − X, 30 + X).
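The Matched Expected Payoffs in footnote 23 can be sketched in a few lines of code. This is an illustration rather than the analysis code used for the experiment: the function names are hypothetical, and the agent's outside option of 70 points is taken from the text's statement that unmatched agents earn 70 points on average. Under that assumption the agent's break-even X reproduces the “Theory” columns of Tables 2.2 and 2.3.

```python
# Sketch of the Matched Expected Payoffs (MEP) from footnote 23.
# Function names and the constants below are illustrative assumptions.

AGENT_OUTSIDE_OPTION = 70.0  # assumed: agent's expected payoff without a match

# Assumed: how much one unit of X adds to the agent's MEP in each game.
AGENT_SLOPE = {"ID": 0.7, "MD": 1.0}


def matched_expected_payoffs(x, game):
    """Return (principal, agent) expected payoffs conditional on a match.

    x is the principal's choice X on the 0-100 scale.
    """
    slope = AGENT_SLOPE[game]  # raises KeyError for an unknown game
    return 100 - slope * x, 30 + slope * x


def rational_cutoff(game):
    """Smallest X at which the agent's MEP equals the outside option."""
    return (AGENT_OUTSIDE_OPTION - 30) / AGENT_SLOPE[game]


for game in ("ID", "MD"):
    cutoff = rational_cutoff(game)
    principal, agent = matched_expected_payoffs(cutoff, game)
    print(f"{game}: cutoff X = {cutoff:.1f}, "
          f"principal = {principal:.1f}, agent = {agent:.1f}")
```

In both games the two payoffs sum to 130 points for any X, so X only determines how the matched surplus is split.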
[24] The results with MEP are almost identical to the results with realized payoffs. Realized payoffs both for the matched pairs and for the whole sample can be found in the appendix, Table B.4.

Table ?? shows average MEP in the 13 independent groups of participants. Successful persuasions yield higher payoffs to the principals than successful incentivizations (signed-rank test, p = 0.0058). MEP are higher in ID than in MD in 11 out of 13 groups. Nevertheless, in both games, principals still earn significantly less than the equilibrium prediction (signed-rank tests, ID: p = 0.0037, MD: p = 0.0015). This observation is also reflected in the earnings of the agents, who get significantly more than the theoretical predictions (signed-rank tests, ID: p = 0.0037, MD: p = 0.0015).

Result 2. Principals make more money by persuading (ID) than by incentivizing (MD): 1) they successfully persuade agents to follow recommendations more often than they manage to incentivize them to accept transfers and 2) successful persuasion attempts yield on average higher payoffs for the principals than successful incentivization attempts. This can be partially attributed to agents demanding higher monetary transfers in MD than the equivalent informativeness of the recommendation in ID.

Evolution of Choices in Time

In order to understand what drives the aggregate results in the previous section we examine the per-period evolution of average choices in the 13 independent groups shown in Figure 2.1 and ask the following questions. Do we observe equilibrium behavior in the two games? If so, how fast does aggregate play converge to the equilibrium? Does the qualitative nature of our results change if we only consider the equilibrium behavior? To answer the first two questions, we make use of two features of the dynamics in Figure 2.1 that are immediately noticeable. First, the average play is around equilibrium predictions in all 10 periods in ID, while generally higher than equilibrium in MD.
Second, while choices in ID remain relatively stable for both principals and agents, in MD we observe principals starting out at equilibrium in period 1, then gradually increasing their choices until they reach agents’ average choices, which remain relatively stable throughout the game.

Figure 2.1: Average choices of principals and agents per period. [Two panels: “ID section: Average choices per period” and “MD section: Average choices per period”; x-axis: Period (1 to 10); y-axis: Players’ average choice (X/Y); series: Principals’ average choices of X, Agents’ average choices of Y, (PP)SPE.] Average choices of principals (dotted lines) and agents (dashed lines) in each period of the ID and MD games. The solid red lines indicate the predictions of the two-stage (PP)SPE. Error bars are ±1 SE corresponding to 13 observations.

This behavior may indicate the existence of an inherent stable level of agents’ choices (different in each game) towards which principals converge. In ID principals seem to achieve that point in period 1 and thus do not move away from it. [25] An interpretation is that participants understand the asymmetry in each player’s outside option. When not matched, agents earn 70 points on average, whereas principals are guaranteed zero points. Thus, agents remain relatively stable in their choices, while principals are forced to adapt to agents’ choices in order to increase their chance of matching. [26] This idea is illustrated by the dynamics of the number of matches displayed in Figure 2.2. The number of matches in ID is stable, but in MD it grows together with principals’ average choices (Figure 2.1), reaching the levels of ID matches around period 8.
It thus seems that a match rate of about 50% is what principals find optimal when resolving the trade-off between successful persuasion/incentivization and recommendation informativeness/transfer amount. However, any unsuccessful persuasion/incentivization attempt results in an expected welfare loss of 60 points. [27] As a consequence, the optimal resolution of this trade-off by the principals, intensified by the uncertainty regarding agents’ choices (willingness to accept), brings about a sizeable loss of surplus for the two players.

[25] See Section 2.5 for further analysis of the dynamics that leads to equilibrium play at that point.
[26] See Section 2.4.3 for the more detailed analysis of this effect.
[27] Total expected surplus in a matched pair is 130 points whereas total expected surplus in a non-matched pair is 70 points (the expected payoff of the agent).

Figure 2.2: Percentage of matched pairs (X ≥ Y) per period in each game. [x-axis: Period (1 to 10); y-axis: Percentage of matched pairs; series: ID section, MD section.]

To further examine the extent of the lost surplus, we look at the evolution of choices for pairs that match, as shown in Figure 2.3. We observe large gaps between the average choices of principals and agents in both games that appear to be stable over time (about 25% in ID and 20 points in MD). Thus, principals are giving away a much larger share of the surplus than what their paired agents are willing to accept, surplus they could have captured by lowering X had they known their paired agent’s willingness to accept. [28] A natural question is then: why doesn’t this gap shrink over time, as principals receive more feedback on agents’ choices? We hypothesize that the main reason behind the size and the stability of the gaps is the potential heterogeneity in agents’ choices (both between subjects and over time) coupled with the principals’ sensitivity to unsuccessful persuasions/incentivizations (in which case they end up with zero points).
In support of this idea, remember that around half of the pairs do not match, which suggests that there is a tangible threat for principals who seek higher expected payoffs by choosing a low X. Moreover, the histograms of the agents’ choices in Figure B.3 in Appendix B.5 show significant levels of very high choices by agents (more in ID than in MD). We explore these issues in more detail in Sections 2.4.4 and 2.4.5 below, where we indeed find a large heterogeneity in behavior.

[28] The average gap of 25% in ID translates to about 18 points.

Figure 2.3: Average choices of matched pairs for each period of the ID and the MD games. [Two panels: “ID section: Average choices per period for matched pairs” and “MD section: Average choices per period for matched pairs”; x-axis: Period (1 to 10); y-axis: Players’ average choice (X/Y); series: Principals’ average choices of X, Agents’ average choices of Y, (PP)SPE.] The solid red lines indicate the predictions of the two-stage (PP)SPE. Error bars are ±1 SE corresponding to 13 observations.

Result 3. Principals’ inability to ex ante infer their agents’ choices (minimum willingness to accept), coupled with the potential heterogeneity in agents’ behavior, results in principals acting as if solving the following trade-off: giving more/less informative recommendations (ID)/generous transfers (MD) increases/decreases the probability of influencing the agent but results in lower/higher expected payoffs conditional on successfully influencing the agent. As a result of the principals’ (justifiable) unwillingness to give fully informative recommendations (ID)/maximal transfers (MD), both games exhibit large inefficiencies in terms of the social surplus lost due to non-matches.
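The size of the surplus lost to non-matches can be illustrated with a short calculation. The sketch below uses the surplus figures from footnote 27 (130 points in a matched pair, 70 in a non-matched pair) and treats the reported average success counts (5.25 and 4.36 periods out of 10) as per-period match probabilities, which is a simplifying assumption made here. The function names are hypothetical.

```python
# Back-of-the-envelope expected surplus per pair and period.
# Assumption: the reported success counts out of 10 periods are used
# as match probabilities; names below are illustrative only.

MATCHED_SURPLUS = 130.0      # total expected surplus in a matched pair
NON_MATCHED_SURPLUS = 70.0   # total expected surplus in a non-matched pair


def expected_surplus(match_rate):
    """Expected total surplus of a pair given the probability of a match."""
    return (match_rate * MATCHED_SURPLUS
            + (1 - match_rate) * NON_MATCHED_SURPLUS)


def expected_loss(match_rate):
    """Expected surplus lost relative to matching in every period."""
    return MATCHED_SURPLUS - expected_surplus(match_rate)


for game, successes in (("ID", 5.25), ("MD", 4.36)):
    rate = successes / 10
    print(f"{game}: expected surplus = {expected_surplus(rate):.2f}, "
          f"expected loss = {expected_loss(rate):.2f}")
```

Under these assumptions the expected loss is 28.5 points per pair and period in ID and about 33.8 points in MD, out of the 60 points that are at stake in each attempt.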
In what follows we will examine how the share of surplus created by successful persuasion or incentivization attempts (matches) is distributed between principals and agents in the two games and whether potential differences can help explain some of the aforementioned results.

2.4.3 Bargaining Power

In Section 2.4.2 we saw that principals earn significantly less than the theoretical predictions in both ID and MD, even conditional on successful matching, and that principals seem to adjust to the choices of the agents since they have much more to lose from a failure of persuasion or incentivization. Further, in Section 2.4.2 we saw that principals seem to be content with settling for a match rate (successful persuasion or incentivization) of about 50% in order to not give up more of the expected surplus in the successful persuasion or incentivization periods. The (PP)SPE of the two games are not consistent with such behavior. First, agents should accept any offer above the one that guarantees them their a priori expected payoff. Second, through backward induction, principals should correctly anticipate agents’ behavior and offer the least amount of informativeness or transfer such that agents follow the recommendation or accept the transfer. As such, in theory, both games admit a first-mover advantage for principals. However, in practice, the payoff asymmetry between agents and principals, and most notably the “threat” of the non-match for principals, may introduce some implicit “bargaining power” for the agents. Thus, in this section we analyze the Nash Bargaining Solution (NBS) of the two games, which, unlike the non-cooperative equilibrium concepts, takes into account the outside options of the players.

To proceed with this analysis, notice that, conditional on a pair matching (X ≥ Y), the share of the total expected surplus (130 points) that each player receives is uniquely determined by X, the principal’s choice (see footnote 23).
Therefore, it is in the principal’s best interest to choose X as low as possible, while it is in the agent’s best interest to try to force the principal to choose a high X. Even though it is the principal’s choice that determines the share of expected surplus for the two players, the agent can effectively threaten the principal with the possibility of a non-match by increasing Y, implicitly forcing the principal’s choice upwards. Thus, while theoretically the first-mover advantage gives the principal full bargaining power, in practice the asymmetry in each player’s outside option may change that by introducing behavioral asymmetries between the two players and across the two games.

Figure 2.4: Best response correspondences and Nash Equilibria in the ID and MD games. [Two panels: “ID section: Best Responses and Nash equilibria” (switch point at about 57) and “MD section: Best Responses and Equilibria” (switch point at 40); x-axis: Principal’s choice (X); y-axis: Agent’s choice (Y); series: Principal’s B.R., Agent’s B.R., Nash equilibria; labeled points: Principal-preferred NE = SPE, Agent-preferred NE.]

The questions now are: what theoretical outcome should this bargaining process yield given the parameters of the ID and MD games, and can the behavior of the participants be explained by some Nash Bargaining Solution? To find the answers we first look at the normal forms of the two games. The strategy sets of the two players are X ∈ [0, 100] and Y ∈ [0, 100]. Figure 2.4 shows the best response correspondences and the sets of Nash equilibria in the ID and MD games (see Appendix B.1.3 for details). One can easily see that the best responses in the games are the same except for the point of the switch in the agent’s best response correspondence. There is a continuum of NE that range from the agent-preferred to the principal-preferred, which is also the (PP)SPE of the extensive form game.
Thus, the two games admit a similar “non-cooperative” structure.

Figure 2.5 shows the possible outcomes of the games in the expected payoff space (any choice of X and Y maps into some pair of expected payoffs). The black lines represent the possible expected payoffs in case there is a match (X ≥ Y), and the “Non-match” points show the payoffs in case X < Y. Notice, however, that the disagreement outcomes are not necessarily the same as a non-match. We calculate them as the minimal expected payoffs that each player can guarantee regardless of the choices of the other. In the ID game the principal can guarantee herself 30 points by choosing X = 100, which the agent is forced to accept by the design of the game. In the MD game the principal can only guarantee herself 0 points by choosing X = 100. The agent can always get the minimum of 70 points by choosing Y = 100 in either game. [29] From the graphs it is clear that, given these disagreement outcomes, the Nash bargaining solution will predict the choice of one of the Nash equilibria described above, mapping 1-to-1 from the set of bargaining weights of the players (see Appendix B.1.3 for details).

Figure 2.5: Expected payoffs, Nash bargaining and disagreement outcomes. [Two panels: “ID section: Expected Payoffs” and “MD section: Expected Payoffs”; axes: Principal’s Expected payoffs and Agent’s Expected payoffs; series: Expected Payoffs (matched pairs), Equilibrium Payoffs (matched pairs); labeled points: Principal-preferred NE = SPE, Agent-preferred NE, Non-match outcome, Disagreement outcome (coinciding with the non-match outcome in MD), MEP = NBS(0.6, 0.4).] Possible expected payoffs in the ID and MD games, disagreement outcomes, average Matched Expected Payoffs (MEP), and Nash Bargaining Solution with bargaining weights 0.6 and 0.4.
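The mapping between bargaining weights and matched payoffs can be sketched as follows. This is a minimal illustration that assumes the standard asymmetric (generalized) Nash bargaining solution on the linear frontier where the two expected payoffs sum to 130 points; the calculation in Appendix B.1.4 may differ in its details. The disagreement points are the guaranteed payoffs stated above, and all function names are hypothetical.

```python
# Sketch: asymmetric Nash bargaining on a frontier u_P + u_A = 130.
# Assumption: each player receives the disagreement payoff plus a share
# of the remaining surplus equal to the player's bargaining weight.

TOTAL_SURPLUS = 130.0

# (principal, agent) guaranteed payoffs as described in the text.
DISAGREEMENT = {"ID": (30.0, 70.0), "MD": (0.0, 70.0)}


def nbs_payoffs(game, principal_weight):
    """Return (principal, agent) payoffs for a given principal's weight."""
    d_p, d_a = DISAGREEMENT[game]
    gain = TOTAL_SURPLUS - d_p - d_a  # surplus above the disagreement point
    return (d_p + principal_weight * gain,
            d_a + (1 - principal_weight) * gain)


def implied_principal_weight(game, principal_mep):
    """Invert the mapping: weight implied by a principal's matched payoff."""
    d_p, d_a = DISAGREEMENT[game]
    gain = TOTAL_SURPLUS - d_p - d_a
    return (principal_mep - d_p) / gain


# Principals' average MEP from Table 2.3.
for game, mep in (("ID", 47.8), ("MD", 36.6)):
    weight = implied_principal_weight(game, mep)
    print(f"{game}: implied principal weight = {weight:.3f}")
```

With the average principals’ MEP from Table 2.3, this sketch gives weights of about 0.593 in ID and 0.610 in MD, in line with the values reported in this section. Because the mapping is linear, inverting the overall average payoff and averaging group-level weights give the same number under these assumptions.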
To see if the behavior in both games can be explained by a Nash Bargaining Solution with some fixed bargaining power parameters, we take only matched interactions and calculate the Matched Expected Payoffs (MEP) as we did above, separately for each of the 13 independent groups of participants. [30] Blue crosses in Figure 2.5 show the overall average MEP. By inverting the NBS mapping from bargaining weights to a Nash equilibrium and taking each average MEP as the corresponding Nash equilibrium in each game, we can retrieve the implied bargaining weights of each player in each game. We perform this calculation in Appendix B.1.4 and find the principals’ average bargaining weights to be 0.593 and 0.610 in the ID and MD games respectively, a remarkably similar result (standard errors: 0.09 and 0.06). This suggests two independent results. First, as expected, we find that agents are able to capture a relatively large part of the additional surplus (40%) in contrast to the theoretical predictions (0%). As we conjectured above, we attribute this result to the payoff asymmetry and more specifically to the threat of a non-match outcome that principals face. Second, the implied distribution of bargaining power is almost identical in the two games. By accounting for the difference in the outside options (non-match outcomes), the NBS can explain the differences in expected payoffs in the two games.

[29] More specifically, the agent can guarantee 70 points by choosing Y ∈ [57.1, 100] in ID and Y ∈ [40, 100] in MD.
[30] We discard the non-matches because they happen due to noise and miscoordination, whereas NBS assumes that players can choose to match.

Result 4. A large part of the surplus created by successful persuasions or incentivizations is captured by the agents in sharp contrast to the theoretical predictions.
In fact, analyzing our results using the Nash Bargaining Solution and accounting for each player’s minimum guaranteed expected payoff (outside option), we find the implied bargaining power distributions to be almost identical in the two games (0.6 for principals, 0.4 for agents). As such, while bargaining power is a driver of a large wedge between theory and practice, it does not seem to create any wedges between the informational and monetary environments.

2.4.4 The Determinants of Principals’ Success

In order to understand what causes the behavioral phenomena described in Sections 2.4.2 and 2.4.2, we analyze the determinants of the principals’ success, as defined by their average payoffs in ID and MD, by looking at the individual choices and reactions to feedback. We start with the analysis that connects principals’ average choices with their individual characteristics, namely fairness attitudes and the perception of rational cutoff points, elicited in Section 3 of the experiment. The regressions in Table B.2 in Appendix B.3 show that principals’ average choices are not determined by their fairness considerations (variable Dictator choice). [31] Their perception of rational cutoff points (variable Cutoff) only matters slightly in ID, but not in MD (we discuss this in more detail in Section 2.5). This result supports our findings from Section 2.4.2 that principals adjust their choices to those of the agents due to the threat of not matching.

[31] The description of all variables used in the regressions can be found in Appendix B.2.
Table 2.4: Principals’ feedback states in the ID and MD games

Section  Feedback state  X ≥ Y  Ball  Agent’s guess  Principal’s payoff  Agent’s payoff
ID       A               Yes    Blue  Blue           0                   100
ID       B               Yes    Red   Red            100                 100
ID       C               Yes    Blue  Red            100                 0
ID       D               No     Red   Blue           0                   0
ID       E               No     Blue  Blue           0                   100
MD       B               Yes    Red   Red            100 − X             100 + X
MD       C               Yes    Blue  Red            100 − X             X
MD       D               No     Red   Blue           0                   0
MD       E               No     Blue  Blue           0                   100

Before we get to the analysis of individual behavior, notice that there is a very strong correlation between principals’ average Expected Payoffs in ID and MD (Spearman’s ρ = 0.73, p < 0.0001). Specifically, some principals win a lot in both games and some win very little. Moreover, the independent groups of participants consist of a mixture of successful and unsuccessful principals (see Figure B.4 in Appendix B.5). This suggests that success in both games is determined to a large extent by the individual choices that principals make and not by the groups that they are in, which allows us to concentrate on the individual choices of principals irrespective of the group they belong to.

To understand the behavioral differences between the successful and the unsuccessful principals in the two games we divide them into high-earning and low-earning by the median of the sum of their Expected Payoffs in ID and MD. We consider the reactions of principals to “feedback states” that distinguish different types of situations that they face after each period. Table 2.4 shows the feedback states, which depend on 1) whether the pair is matched (X ≥ Y); 2) the color of the ball drawn; and 3) the agent’s final guess. Notice that the states unambiguously determine the payoffs of the players; however, the opposite is not true: principals and agents can receive the same payoff in different states. Thus, our analysis includes not only the mechanistic reactions to the realized payoffs, which are only partially determined by the moves of Nature, but also the reactions to the outcomes of the principal-agent interaction.
Figure 2.6: Principals’ reactions to feedback states. [Two panels: “ID: Principals’ reactions” (states A to E) and “MD: Principals’ reactions” (states B to E); y-axis: Change in X; bars: High-earning principals, Low-earning principals, with ±1 SE error bars and significance stars.] The bars show the sums of coefficients from the random effects regressions reported in Table B.3 in Appendix B.3. Significance levels *, **, *** correspond to p < 0.1, 0.05, 0.01.

Figure 2.6 shows the reactions of principals as estimated by the random effects regressions reported in Table B.3 in Appendix B.3, with changes in X as dependent variables and the dummies for the feedback states and low-earning principals as independent variables. [32] A simple observation from the graphs is that principals react to a match and a non-match in the expected direction: in both games they weakly decrease X after a match (states A, B, C) and increase X after a non-match (states D, E). [33] However, the sizes of these reactions are very different for high- and low-earning principals. Specifically, the more successful principals react to feedback in a much more reserved way than the less successful ones. This is especially pronounced in ID for the states in which there was a match (A, B, C). These observations suggest that the less successful principals overreact to the news that they successfully persuaded their paired agent by dramatically lowering X, which in turn results in a significantly lower chance of matching in the next period. In addition, such reactions should lead to much more erratic behavior of the low-earning principals as compared to the high-earning ones. This is indeed the case. Table ?? shows that in ID low-earning principals have much higher standard deviations of choices X than high-earning principals (ranksum test, p = 0.0001). In MD this is also true, though the result is weaker (ranksum test, p = 0.0537).

[32] The regressions also control for the individual characteristics elicited in Section 3.
[33] We say “expected direction” in the sense that after a match (when X ≥ Y) a principal should update her beliefs of the agents’ actions downwards and thus lower X in an attempt to increase the payoff while still matching in the next period. Correspondingly, after a non-match (X < Y) a principal should update her beliefs of the agents’ actions upwards and thus increase X in order to increase the probability of matching in the next period.

Table 2.5: Principals’ earnings by group and game

Section  Measure        Sample     Principals
                                   High-earning  Low-earning
ID       SD of Choices  All pairs  16.7          31.2
                                   (2.10)        (2.17)
MD       SD of Choices  All pairs  13.2          19.3
                                   (1.86)        (2.41)
N of observations                  54            54

Mean standard deviations of principals’ choices X. Numbers in brackets indicate standard errors.

Result 5. Principals who exhibit higher restraint in terms of their reactions to irrelevant circumstantial feedback are significantly more successful in terms of their earnings than principals whose behavior drifts along with feedback from circumstantial interactions with their paired agents. This is true in both games, but is especially strong in ID.

2.4.5 Heterogeneity in Agents’ Behavior

Agents’ choices, namely their informational and monetary demands, clearly exert strong influence on principals’ earnings in both games. In this section we analyze the determinants of agents’ behavior as well as the extent of heterogeneity in our pool of agents. As with principals, we first look at the connection between agents’ average choices and their individual characteristics elicited in Section 3. The regressions in Table B.2 in Appendix B.3 show that fairness considerations are a significant determinant of agents’ average choices in both games (variable Dictator choice). Agents who choose relatively low Y in the Dictator tasks of Section 3 also choose relatively lower average Y in the corresponding game.
The effect is very strong in ID, in size and statistical significance, and somewhat smaller in size but still significant in MD. This is consistent with our previous findings that agents’ choices are stable across the two games and are not influenced much by principals’ actions. The rational cutoff perceptions influence agents’ decisions only weakly in MD (10% significance level), and not at all in ID (we return to the discussion of cutoff points in Section 2.5).

Table 2.6: Average choices of selfish and generous agents

Section  Measure  Sample     Agents
                             Selfish  Generous
ID       Choices  All pairs  70.6     50.0
                             (2.43)   (3.99)
MD       Choices  All pairs  66.1     48.5
                             (2.90)   (3.13)
N of observations            54       54

Standard errors in parentheses.

Since fairness considerations seem to matter for agents’ choices, we consider the individual reactions to feedback states separately for groups of agents who appear to be more “generous” versus “selfish” in their Dictator task choices. We define “selfish” and “generous” agents by the median split of their answers in the Dictator tasks. [34] Table ?? shows selfish and generous agents’ average choices in the two games. The differences between the two groups in each game are large and significant (ranksum tests, ID: p = 0.0001; MD: p = 0.0002). In addition to the above-mentioned regressions in Table B.2, this supports our conclusion that fairness considerations have a large effect on agents’ choices.

[34] We divide agents into selfish and generous separately for ID and MD since we find only weak correlation in ID and MD generosity within subjects. A Player B (agent) participant whose choice in the Dictator task is above [below] the median is classified as “selfish” [“generous”].
Figure 2.7: Agents’ reactions to feedback states. [Two panels: “ID: Agents’ reactions” (states A to E) and “MD: Agents’ reactions” (states B to E); y-axis: Change in Y; bars: Selfish agents, Generous agents, with ±1 SE error bars and significance stars.] The bars show the sums of coefficients from the random effects regressions reported in Table B.3 in Appendix B.3.

Figure 2.7 shows agents’ reactions depending on the feedback states as estimated by the random-effects regressions reported in Table B.3 in Appendix B.3. [35] A general observation here is that agents’ reactions are in the opposite direction to the principals’ reactions shown in Figure 2.6. After a match (states A, B, and C) agents increase Y and after a non-match (states D and E) they weakly decrease Y. This is consistent with our idea that participants treat the two games as a form of bargaining.

In terms of the differences between the reactions of the two agent groups, we observe something interesting. While the two groups react similarly in the match feedback states (A, B, and C) and the non-match feedback state E, the two groups behave significantly differently in the non-match state D, where the agents receive 0 points. More specifically, the “generous” group significantly decreases Y, while the “selfish” group remains relatively stable. What does this tell us? While the generous group decrease their choices more in both non-match states, the difference is only significant in state D. This suggests that while the generous group do appear to make more of an effort to match with the principal, that effort is significantly more pronounced when the current choices did not yield a good result for the agent. Even the generous agents’ behavior is not driven by purely altruistic motives, else we would observe similar behavior in both non-match states by the generous group. Overall, this difference demonstrates the mechanism responsible for the lower choices of generous agents as was reported in Table ?? above.

[35] The regressions also control for the individual characteristics elicited in Section 3 and the lagged choice of the principals that agents observe on their feedback screens.

Result 6. Agents’ behavior is partially influenced by fairness considerations, which in turn affects principals’ earnings in both games. However, this influence disappears when the behavior yields good results.

2.5 Discussion

In our experiment, we found that the two theoretically parallel incentive mechanisms can be perceived very differently in practice. As a consequence, an analysis that ignores such behavioral differences can fail to capture important aspects of the actual interactions and consequently fail to properly predict behavior and outcomes. Perhaps most importantly, such failure can mislead companies and policymakers about the appropriateness of monetary versus informational incentives. In this section we discuss several dimensions along which the behavior under the two mechanisms differs significantly, and possible reasons why.

2.5.1 Risk and Payoff Uncertainty.

There are two ways in which risk preferences and payoff uncertainty may affect the interactions in the ID and MD games. First, the agent’s risk preferences influence the outcomes of the MD game. A risk averse/seeking agent will require a lower/higher transfer (sure payment) from the principal to guess red. The agent’s risk preferences do not change the equilibrium outcome of the ID game. Further analysis (see Section B.8.5) shows that risk preferences are not driving wedges in our data. Second, while the principal’s risk preferences do not change the equilibrium outcome of either of the two games, they may affect behavior because of the large discrepancies in payoff uncertainty experienced by the participants in the two games.
In the ID game, payoffs are extreme: participants win either 100 or 0 points, while in the MD game, the principal’s deterministic transfer (if accepted) smoothes out these extreme payoffs. We find that this difference has major implications for the behavior of the participants in our experiment. Looking at Figures 2.6 and 2.7 it becomes clear that the reactions of principals and agents to the feedback are much larger in absolute values in ID than in MD, and, as we discussed above, this has serious ramifications for the earnings of the principals (see Section 2.4.4). One behavioral theory that can explain this type of behavior is reinforcement learning, which suggests that the higher degree of expectations-based reasoning that is required by players in ID can drastically affect behavior and consequently bring about very different outcomes than those achieved by monetary mechanisms.

2.5.2 Monetary Contracts versus Informational Contracts.

While equivalent theoretically, the “flavor” of the monetary or informational agreement between the principal and the agent may be perceived very differently in practice. The monetary contract in MD is more straightforward and may represent a more natural agreement: the principal pays money to the agent and in return the agent chooses an unfavorable option (guesses red). In this case the agent agrees to take the risk for a specified compensation. Moreover, this risk depends solely on the moves of Nature. As such, there are no surprises when the outcome is revealed. This is reflected in Figure 2.7, where agents’ reactions to different ball colors in case of a match (right graph, feedback states B and C) are very similar, even though the realized payoffs differ by 100 points. The nature of the informational contract in ID, however, appears to be very different.
We conjecture that when the agent agrees to follow the principal’s recommendation there is a sense in which the agent is “entrusting” the principal with making the “right” guess so that both the principal and the agent can profit. Thus, when it happens that the followed recommendation is bad (the recommended and implemented guess is red but the ball is blue), the agent, who would have made the right guess had he ignored the recommendation, is left with no monetary compensation, while the principal enjoys 100 points at the expense of the agent. Of course, this is all part of the resolution of uncertainty; however, one can easily see how in reality such situations can induce extreme feelings of discontent and resentment and consequently extreme reactions. Such behavior is clearly visible on the left graph of Figure 2.7 (feedback state C), which shows that when such situations arise, agents tend to react very strongly by increasing Y in the next period by an average of around 40 points. In fact, an additional analysis in Appendix B.7 demonstrates that experiencing feedback state C in the ID game has a long-lasting impact on agents’ behavior since their choices never return to the pre-state-C levels.

Overall, persuasion in information design, with its slightly subtler resolution of uncertainty and often conflicting interests, renders behavior under informational contracts more volatile and perhaps emotion-driven, something that we do not observe under the more conventional monetary contracts, which are arguably easier to understand and are characterized by a clearer resolution of uncertainty.

2.5.3 Computational Complexity and the Monetary Equivalent of Information.

Cognitive limitations and bounded rationality are two omnipresent behavioral biases driving wedges between theoretical predictions and real world behavior in a multitude of settings.
Given the differences in the computational complexity of the two games, it should thus come as no surprise that behavioral differences arise when comparing the two situations in practice. One crucial component of the two games that fundamentally depends on the level of computational capabilities is the agents’ perception of what is the right level of informational compensation (in ID) and monetary compensation (in MD). The agent must answer two questions: How informative should the recommendation be in order to be profitable to follow? And how large should the monetary transfer be in order to be profitable to accept?

While a fully rational economic agent infers the correct level of compensation and asks for exactly that, in reality people are not only prone to inference mistakes but, even conditional on unbiased inference, may additionally behave differently in a strategic setting. To decompose the two effects we elicited our participants’ best estimates of the rational (and risk neutral in MD) cutoff points in each game.[36] This allows us to measure directly the magnitude and the direction of the mistakes that people make in each game as well as to get a sense of the strategic effects by comparing these estimates with observed behavior in the two games.

Figure 2.8 shows the average choices of principals and agents in ID and MD (black bars) as well as their average estimates of the rational cutoffs (light gray bars) together with the rational cutoff points (red lines). Looking only at behavior in the games, our data reveal that agents appear more demanding when being incentivized than when being persuaded, since play relative to the rational cutoff point is higher in the MD game. In fact, while agents ask, on average, for the correct amount of recommendation informativeness in ID, they ask for a significantly higher monetary transfer in MD. Another way to interpret this is that agents appear to undervalue money relative to the equivalent amount of information.
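The two rational (risk-neutral) cutoffs referred to above can be illustrated with a short calculation. The sketch below is only a minimal illustration under stated assumptions: a correct guess is worth 100 points, the urn holds three red balls out of ten, and the recommendation is parameterized by a hypothetical quantity `q`, the probability that red is recommended when the ball is blue (red is always recommended when the ball is red). The function names and `q` are not taken from the experiment, whose own X and Y scales are not reproduced here.

```python
from fractions import Fraction

PRIOR_RED = Fraction(3, 10)   # urn: three red balls out of ten
PRIZE = 100                   # points for a correct guess (assumed)


def md_cutoff_transfer():
    """Smallest sure transfer that makes a risk-neutral agent willing to guess red."""
    payoff_blue = (1 - PRIOR_RED) * PRIZE   # expected points from guessing blue
    payoff_red = PRIOR_RED * PRIZE          # expected points from guessing red
    return payoff_blue - payoff_red


def posterior_red(q):
    """P(ball is red | recommendation is red), by Bayes' rule.

    q is a hypothetical informativeness parameter: the probability that the
    principal recommends red when the ball is actually blue.
    """
    return PRIOR_RED / (PRIOR_RED + (1 - PRIOR_RED) * q)


def id_cutoff_noise():
    """Largest q at which following a red recommendation is still optimal."""
    # posterior_red(q) >= 1/2  <=>  q <= prior / (1 - prior)
    return PRIOR_RED / (1 - PRIOR_RED)


print(md_cutoff_transfer())               # 40
print(id_cutoff_noise())                  # 3/7
print(posterior_red(id_cutoff_noise()))   # 1/2
```

Under these assumptions the monetary cutoff is a transfer of 40 points, while the informational cutoff is reached when a red recommendation leaves the agent exactly indifferent between the two colors. The second calculation requires a Bayesian update, which is one way to see why the ID cutoff is computationally more demanding.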
Once we take into account agents’ estimates of the rational cutoff points, the picture becomes somewhat different. Consider first the graph for MD. Here we see that both principals’ and agents’ average choices are close to what they perceive as the optimal cutoff. Thus, despite the fact that these estimates are slightly above the theoretical predictions, agents ask for what they believe to be optimal on average. This is also supported by the regression in Table B.2 in Appendix B.3, which shows that the agents’ average choices in MD depend significantly, albeit at the 10% level, on the expressed cutoff point.

Figure 2.8: Average choices, theoretical predictions, and participants’ estimates of rational (risk neutral) cutoffs. (Panels: ID: Choices and cutoffs; MD: Choices and cutoffs. Vertical axis: Choice, X and Y. Groups: Principals, Agents. Legend: Average choice, Cutoff ±1SE, Theory.)

[36] Section 3 tasks Cutoff ID and Cutoff MD. For detailed instructions see Sections 2.3.5 and B.8.5.

In ID, however, the situation is markedly different. First, there is a substantial gap between participants’ average play and what they perceive to be optimal. Specifically, agents ask for much higher informational compensation than what they perceive as optimal. Second, participants significantly underestimate the correct level of recommendation informativeness. This suggests significant computational mistakes attributed to errors in probabilistic reasoning and Bayesian updating. Of course, these errors are hardly surprising and very much in line with the vast literature related to this issue (e.g., Kahneman and Tversky, 1973; Charness et al., 2007; Holt and Smith, 2009; Abdellaoui et al., 2015). Thus, while agents’ errors from Bayesian updating should have benefited principals even more in ID, their behavior in actual play was much more demanding than their perception of the rational cutoff.
As a result, coincidentally or not, agents in the ID game were, on average, asking for the Bayes-optimal amount of information. Thus, while such errors may have had a significant impact on the outcome of the games, the direction of these errors places a lower bound on our results in Section 2.4.2.

We were able to investigate the magnitude and direction of mistakes due to Bayesian updating and consequently the magnitude of behavior driven by strategic considerations. While we do not have a clear explanation as to what drives agents to play so much higher than their perception of the rational play, we hypothesize that, given the high uncertainty related to two independent sources (the moves of Nature and the probabilistic recommendations from principals), agents try to be more cautious and increase their choices above the level that they think is optimal, which also drags principals’ choices along. Nevertheless, we see this as an interesting avenue for future research.

2.5.4 Theoretical and Policy Implications.

Our final note is on the theory and policy implications of our findings. While we see this paper as a first attempt at considering the parallelism between monetary and informational contracts, we trust that there are important takeaways both for theorists as well as applied scientists, policy makers, and companies. At a time when the availability and value of data and information are increasing exponentially, more and more companies and policy makers have the option to utilize information instead of money as an incentive tool to influence behavior. Thus, insights about potentially differential behavioral reactions to informational versus monetary incentives should be at the core of research in mechanism design and information economics.
As such, we believe that theory should attempt to incorporate such behavioral insights to build a more robust and general parallelism between the two, perhaps indicating the situations where monetary incentives may be the best course of action for a principal. At the other end, behavioral researchers and experimentalists should keep on identifying those behavioral biases and keep informing the theory.

2.6 Conclusion

This paper utilizes a lab experiment to explore the behavioral side of a recently proposed parallelism between information design (ID) and the more traditional theory of mechanism design (MD) (see Bergemann and Morris, 2019; Taneva, 2019). We modify the framework of Kamenica and Gentzkow (2011) to allow for the implementation of both the ID and MD principal-agent problems as games with identical action spaces, equivalent best response correspondences, and the same predicted expected payoffs for each player in the Principal-Preferred Subgame-Perfect Equilibrium [(PP)SPE]. The latter equivalence serves as our main object of comparison and the starting point for our behavioral analysis.

In order to minimize various contextual influences in our experiment, we have designed the ID and MD games in such a way that both principals and agents choose a single number in the same range in very similar and relatively simple environments. In both games there are two states of the world determined by the color of a ball drawn from an urn with common knowledge of the ball composition (seven blue balls and three red balls). The agent receives a reward if she guesses the color of the ball correctly, while the principal receives a reward if the agent guesses the color red. In MD the principal chooses the amount of money to transfer to the agent for guessing red. In ID the principal chooses the informativeness of a recommendation sent to the agent about the guess of the color of the ball.
In both games, the agents choose the minimum amount of monetary transfer (MD) or recommendation informativeness (ID) that they require in order to accept the transfer (MD) or follow the recommendation (ID). The similarity of the two choice environments allows us to make direct comparisons of the behavioral differences between the informational (persuasion) and the monetary (incentives) environments.

The point-predictions of Bayesian persuasion are strongly supported in our data in terms of participants’ average choices in the ID game. While individual behavior is oftentimes erratic and highly context-dependent, on average, both principals’ and agents’ choices are not statistically different from the theory in all 10 periods of the game. Even if the mechanisms that determine those average choices may be complicated to decompose, the theory deserves merit for its descriptive power.

Overall, our principals were able to use both informative recommendations and monetary transfers to either persuade or incentivize their respective paired agents to take an otherwise (ex-ante) undesirable action. Thus, we can establish that, at some level, both information design and mechanism design are viable strategies that our principals successfully use to some extent to improve upon their ex-ante expected payoffs.[37]

In our experiment, the strategy of persuasion (Information Design) proved to be more profitable for principals than the more traditional strategy of monetary incentives (Mechanism Design), despite the equivalence predicted by the theory. This result seems robust, since principals are more often able to convince their paired agents to follow their recommendations in ID than to accept their transfer in MD, and even conditional on a successful persuasion or incentivization, persuasion attempts were more profitable for the principals. These results are strongly influenced by agents’ higher demand for monetary compensation than for the equivalent recommendation informativeness.
As such, we can say that agents were “easier” to persuade than incentivize.

[37] In the absence of either of those strategies, the principal is guaranteed a payoff of zero.

In both games, the principals’ average earnings were significantly less than the prediction of Kamenica and Gentzkow (2011)’s Principal-Preferred Subgame-Perfect Equilibrium, (PP)SPE. This is partly because on many occasions (about half) the principals failed to persuade or incentivize their paired agents to follow the recommendation or accept the transfer. As a result, the agents ignored the recommendation or rejected the transfer. However, even when we ignore those “failed” trials, we still observe a gap between the principals’ matched expected payoffs (MEP) and the (PP)SPE equilibrium point-prediction. Motivated by the strong asymmetry in the baseline payoffs (if the principal fails to persuade/incentivize the agent, the principal is guaranteed zero points while the agent still expects 70 points on average), we conjecture that the principals’ fear of a zero payoff combined with the agents’ security of 70 points would induce some form of implicit bargaining between the players. We analyze the Nash bargaining solutions in the two games, which include the prediction of the (PP)SPE as a special case with full bargaining power attributed to the principal.

Our analysis showcases two results. First, contrary to the predictions of the (PP)SPE, agents are able to capture a large part of the surplus generated by successful persuasions and incentivizations. Second, we find that the implied distributions of bargaining power between principals and agents are identical in the two environments. Thus, while bargaining power creates a large gap between theory and practice, it does not drive any wedges between the informational and monetary environments.
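The bargaining benchmark mentioned above can be made concrete for the MD game. The sketch below is a stylized illustration under stated assumptions, not the estimation reported in this chapter: it assumes a risk-neutral agent, a principal who keeps 100 points minus the transfer whenever the agent guesses red, and disagreement payoffs of 0 points (principal) and 70 points (agent). The function names and the bargaining weight `beta` are hypothetical.

```python
def md_nash_transfer(beta, prize=100.0, prior_red=0.3):
    """Transfer that maximises the generalised Nash product in the MD game.

    beta is the principal's bargaining weight (1 = all power to the principal).
    Assumed disagreement payoffs: principal 0, agent prize * (1 - prior_red).
    """
    agent_outside = prize * (1.0 - prior_red)          # 70: guess blue on her own
    agent_floor = agent_outside - prize * prior_red    # 40: break-even transfer
    surplus = prize - agent_floor                      # 60 points to be split
    return agent_floor + (1.0 - beta) * surplus


def grid_check(beta, steps=6000):
    """Brute-force maximiser of (100 - T)^beta * (T - 40)^(1 - beta) over T in (40, 100)."""
    best_t, best_val = None, -1.0
    for k in range(1, steps):
        t = 40.0 + 60.0 * k / steps
        val = (100.0 - t) ** beta * (t - 40.0) ** (1.0 - beta)
        if val > best_val:
            best_t, best_val = t, val
    return best_t


for beta in (1.0, 0.75, 0.5):
    print(beta, md_nash_transfer(beta))
```

With all bargaining power on the principal's side (`beta = 1`) the transfer collapses to the break-even level of 40 points, which corresponds to the (PP)SPE special case in this stylized setting. Lower values of `beta` shift part of the 60-point surplus to the agent.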
Chapter 3

Rolling the Skewed Die: The Economic Foundations of the Demand for Skewness[1]

[1] Joint work with Aleksandar Giga, Suk Lee and Fernando Zapatero.

3.1 Introduction

Skewness is pervasive among financial securities (options, growth stocks, ...) and other types of investments (private equity, VC, ...). Furthermore, for many agents, skewness-seeking is an important element in their investment decisions, as documented in the literature. In fact, the quest for skewness has the potential to explain some of the most challenging empirical puzzles contemplated in the literature, for example the value puzzle (Zhang, 2013). However, standard utility functions, in particular CRRA utility functions, cannot explain the demand for skewness we observe in practice. While the current literature, especially during the last ten years, has explored the lottery characteristics (i.e., skewness) of many securities, it has not tackled the reasons that drive individual investors to demand skewness. In general, the analysis of the effects of skewness demand is based on utility functions that assign an ad-hoc large weight to the positive third moment of wealth (i.e., right skewness). This is the case in the influential paper by Kraus and Litzenberger (1976) and in the more recent one by Harvey and Siddique (2000). A step further towards an axiomatic utility is the work on aspirational utility (Diecidue and Van De Ven, 2008, in the spirit of Friedman and Savage, 1948). Their utility includes a jump that represents the discontinuity in utility derived from crossing a certain threshold of wealth.

In this paper we analyze the demand for skewness that results from a utility function similar to the model of Diecidue and Van De Ven (2008) but derived from microeconomic foundations. In particular, we consider an economic agent who cares not only about consumption but also about status as another source of utility different from the consumption good.
The consumption good is divisible and contributes to total utility in the same way as in the standard CRRA case. However, status is conveyed through acquisition of a non-divisible good; to simplify, we assume that the only utility provided by this good is through status recognition, although our conclusions do not depend on this assumption. Status-seeking is related to a number of preferences popular in the financial economics literature, such as relative wealth concerns (or external habit formation; Campbell and Cochrane, 1999) and habit formation (Sundaresan, 1989; Constantinides, 1990). Rayo and Becker (2007) show that this type of utility provides an evolutionary edge. More recently, Roussanov (2010) proposes a utility model that includes status-seeking and derives some investment implications that he shows are consistent with the data. When we proxy status by a non-divisible good we have in mind examples such as a luxury car, a house, or a country-club membership, all of which are often interpreted as signals of status (see, for example, Charles et al., 2009). They might also yield consumption utility. We do not explore this possibility, but it is not inconsistent with our utility specification.

The status-driven, aspirational utility that emerges from these microeconomic foundations is reminiscent of the framework first established by Friedman and Savage (1948). They are motivated by the ‘puzzling’ observation that some investors simultaneously buy insurance and lotteries which, they argue, cannot be explained by standard utility models. However, in their analysis they only consider the notion of “volatility,” the second moment of the distribution, and overlook skewness, which is the focus of our analysis. In fact, focusing on the net demand for right skewness provides a straightforward answer to the first part of the puzzle raised by Friedman and Savage (1948).
In particular, buying a lottery ticket amounts to taking a long position in right skewness and buying insurance implies a short position in left skewness. Each of these two decisions reveals a preference for right skewness.

Yet, the second problem raised by Friedman and Savage (1948), the shortcomings of standard utility models, is still relevant because standard utility models cannot, in general, explain the demand for right skewness that the previous two examples illustrate. For example, although CRRA utility in principle implies a preference for right skewness, under any reasonable parameter values the optimal policy of an investor with CRRA utility is to sell short a lottery because its negative expected return and high variance dominate the effect of positive skewness. Furthermore, negative skewness is always discarded by a CRRA investor, unless it is associated with a positive, large enough, expected payoff, and yet, as we will later argue, demand for left skewness is also present in some economic decisions. These observations justify our quest for individual motives whose resulting utility function explains a possible optimal demand for skewness, right or left.

Interest in the demand for skewness and its effect on equilibrium prices is not new. Kraus and Litzenberger (1976) already explore its implications, assuming a utility function that puts a larger weight on the third moment. More recently, Harvey and Siddique (2000) explore implications of the relation among higher moments (in particular, co-skewness) in the cross-section of stocks. Mitton and Vorkink (2007) show that many investors have a preference for skewness in their portfolios and strategically choose securities that are avoided by investors who prefer a diversified portfolio. Kumar (2009) studies stocks with lottery-like properties and shows they are chosen by people who also buy lotteries. Estimating ex-ante skewness in stocks is difficult. Bali et al.
(2011) suggest an alternative way to identify lottery stocks. Boyer and Vorkink (2014) study lottery properties among stock options.

Based on relative status concerns we derive a specific type of aspirational utility and we show how demand for skewness can arise endogenously. Our utility function is similar in shape to a CRRA function but with an aspiration point (\(R\)), and a positive jump in utility if the agent can spend more than \(R\). We then introduce a parsimonious set of securities (called ‘binomial martingales’) that contain different levels of skewness, which allow investors to choose the exact level of skewness, right or left, that is optimal for their position in the utility function. From this setting, through a concavification of the utility function, we can derive the four seasons of the demand for skewness. In particular, we show analytically that the relative position of the reference point \(R\) (i.e., how far away the aspiration is) with respect to the agent’s current consumption level (\(C_0\)) is a critical factor in determining the demand for skewness. If the agents’ aspiration is only marginally higher than the current endowment, they choose to sell skewness. This is because the proximity of the aspiration point encourages the agents to select a security that lands on the aspiration level with relatively high chance. Such a security, with a high chance of a small gain, is negatively skewed. On the other hand, by exactly symmetric arguments, as the aspiration level moves further away from the current wealth, agents choose to buy right-skewed securities. This is in sharp contrast to the standard mean-variance analysis where agents shun any form of gambling with zero or negative expected returns.

We also explore how endogenous demand for skewness changes with respect to the parameters of the utility function. Predictably, the size of the jump is a main factor. If attaining the aspiration leads to a big jump in utility, agents choose less (right-)skewed securities.
Intuitively, a big jump implies greater importance of aspiration: the agent is then forced to demand securities that can help get to the aspiration level with higher probability, albeit at the expense of a lower level of consumption if the gamble fails. Such securities contain low or even negative skewness. Analogous results are presented for the level of risk aversion and initial wealth of the agent.

In the ‘binomial martingale’ setting we introduce, the agent’s choice of skewness mechanically fixes the level of volatility. It is impossible to separate the choice of volatility from the choice of skewness. To investigate the role of volatility in the aspirational setting, we introduce trinomial securities (which embed binomial securities as a special case). This allows us to separate volatility from skewness. This yields a somewhat surprising result, but one in line with the Friedman and Savage (1948) analysis: the aspirational agents do not necessarily choose to minimize volatility as they would in the standard mean-variance setting. Instead, agents choose just the right amount of volatility to propel themselves to their aspirations. In the aspirational setting, volatility can be desirable insofar as it helps them attain their aspirations. In a similar vein, extensions are also made to consider a broader range of securities to investigate the consequence of variations in the first moment.

We also consider multiple (two) aspirations. In the two-aspirations case, we find that agents can either choose to mind both aspiration points, mind only one aspiration point, or mind neither. If agents choose to mind both aspiration points, the level of skewness they demand is determined entirely by the position of the aspirations relative to their current consumption level (\(C_0\)). In everyday parlance, the agent gets ‘trapped in’ between a status they want to attain, and a status that they have already attained and would never want to lose.
The agent’s demand for skewness is determined by the relative strength of these considerations.

The paper is organized as follows. In Section 3.2 we describe a basic setup, derive our utility function and first order conditions. In Section 3.3 we study the demand for skewness in the case of a single jump. In Section 3.4 we study the two-jump case. Section 3.5 considers more realistic financial markets. In Section 3.6 we explore the interaction between volatility and skewness. In Section 3.7 we present the results of a controlled laboratory experiment designed to mimic a scenario of financial decision making with an aspiration. We close the paper with some conclusions.

3.2 Setup

In this section, we introduce and motivate ‘aspirational utility’: a standard utility function augmented by elements of ‘goal’ and its ‘attainment’ (Diecidue and Van De Ven, 2008). The ‘goal’ in this setting is represented as a position in the agent’s wealth or consumption level; the satisfaction of achieving this aspiration is expressed by a discontinuity or ‘jump’ in utility level.

3.2.1 A Motivation: Local, Bulky Status Goods

To see how aspirational utility can arise from a natural setting, we consider the notion of ‘status’ in the spirit of Roussanov (2010). In Roussanov’s model, agents care not only about standard consumption (as represented by power utility over consumption) but also about their wealth level relative to the average wealth level of the economy, a feature which represents agents’ ‘status concerns’. While we believe incorporating status is a meaningful endeavor, we make two observations that further enhance the realism of this feature. First, status goods are often bulky, indivisible purchases. (Consider luxury cars, or mansions for example.)
Second, unlike the setting of Roussanov, where the benchmark of status is uniform across agents of all wealth types (the average wealth level of the economy), it is reasonable to assume that status goods are wealth dependent: for example, jewelry for the poor, a large house for the rich.

We will consider an example that embodies status concerns in the spirit of the Roussanov (2010) model, with the two enrichments described above. Let \(W_i\) denote the wealth level of an agent \(i\). Suppose that for agents of wealth level \(W_i \in (0, \$1\ \text{million})\) the relevant status good is ownership of a small house that costs $0.5 million, and for agents of wealth level \(W_i \in (\$1\ \text{million}, \$2\ \text{million})\) the relevant status good is a luxury house with private pool that costs $1.5 million to acquire. This reflects the locality and indivisibility of status goods: status goods are often bulky purchases that are endemic to the peer groups within wealth brackets.

Figure 3.1: Local, Bulky Status Goods Imply Aspirational Utility

Let \(S_i\) denote the status good consumed by agent \(i\), and let \(c_i\) denote the standard consumption good (bread and butter) consumed by agent \(i\). The total consumption of agent \(i\), \(C_i\), consists of \(c_i\) and \(S_i\) (that is, \(C_i = c_i + S_i \le W_i\)). Given standard (e.g., power) utility over \(c_i\), and the local, indivisible nature of \(S_i\), an agent optimally chooses the amount \((c_i, S_i)\). The left-hand panel of Figure 3.1 gives a graphical description of \(U(C_i)\) when the optimal choice \((c_i, S_i)\) is made. The derivation can be made rigorous, but the graph is intuitive. The jumps are inherited from the local, indivisible nature of status goods. They occur at around $0.7 million (A) and $1.7 million (B), slightly above the cost of owning the status goods ($0.5 and $1.5 million), representing the fact that agents would buy status goods only after they surpass their ‘subsistence level’.
The marginal utility of \(C\) is high following the jump, since the agents were forced to be thrifty in order to purchase the local status good. Once the marginal utility drops, agents ponder another jump in status (point B), etc.

The key element of this utility is the ‘jump’. The right-hand panel of Figure 3.1 depicts a different (single) jump, associated with the corresponding (single) jump at point A of the left-hand panel. This discontinuous jump described in the right-hand panel is essentially the ‘aspirational utility’ proposed by Diecidue and Van De Ven (2008). Although the formal shape is different, one can use a well-known concavification argument to show that they yield identical (expected) utility maximization, and hence, for the purposes of microeconomic analysis, they are equivalent. In short, local, indivisible status concerns lead to aspirational utility.

3.2.2 A Formal Representation of Aspirational Utility

Consider an agent who maximizes expected utility (EU), with initial endowment \(C_0\). However, this agent differs from the typical EU-maximizing agent in that (i) he has an aspiration \(R\) (\(R > C_0\)), and (ii) he discounts payoffs that fall below \(R\), so that his utility is \(\delta u(C)\) when \(C < R\), and \(u(C)\) when \(C \ge R\) (\(0 < \delta < 1\)). Otherwise, we maintain the conventional assumptions on the behavior of \(u(\cdot)\): \(u'(\cdot) > 0\) and \(u''(\cdot) < 0\). More formally, we write the agent’s utility function as:

\[ U(C) = \delta\, u(C)\, \mathbf{1}_{(0,R)}(C) + u(C)\, \mathbf{1}_{[R,\infty)}(C) \tag{3.1} \]

Given this definition of aspirational utility, our next task, which is the main focus of this paper, is to apply it to see how it generates an environment that endogenously creates demand for skewness.

3.2.3 The Aspirational Agent’s Choice Set: Binomial Martingales

Since our goal is to understand how preference for skewness can endogenously arise from aspirational utility, we introduce an expected utility (EU) maximization setting that allows agents (with aspirational utility) to make their optimal choice of skewness.
To this end, we introduce the following set of securities that defines the choice set of the EU-maximizing agents. The idea is first to focus parsimoniously on the level of skewness embedded in this set of securities, and to add in other features later.

Let \(L(p)\) be a binomial martingale (i.e., a fair game with two outcomes) with \(p \in (0, 1)\). That is, \(L(p)\) is a fair gamble which costs \(\pi\) to purchase, and pays \(M\) with probability \(p\):

Success (probability \(p\)): pays \(M\)
Fail (probability \(1-p\)): pays \(0\)

To ensure that \(L(p)\) is a martingale, we require \(0 = p(M - \pi) + (1-p)(-\pi)\). This pins down the amount \(M\) as a function of the price of the lottery \(\pi\) and \(p\): \(M(\pi, p)\).

Similarly, let \(L^*(p)\) be an extended binomial martingale with \(p \in (-\infty, 1)\). That is, it is a gamble which pays \(C_S\) with “probability” \(p\) and \(C_F\) with “probability” \(1-p\), such that:

\[ 0 = p(M - \pi) + (1-p)(-\pi), \quad p \in (-\infty, 1) \tag{3.2} \]

Note that the only difference is the domain of \(p\). That we allow \(p < 0\) means that at this stage, \(p\) and \(1-p\) should be interpreted as ‘weights’, rather than probabilities. The reason for extending the domain is to keep the structure consistent and tidy throughout the analysis; for those who prefer to focus on ‘real-world intuition’, it would not do much harm to ignore the term ‘extended’ for now. The real-world interpretation of \(p < 0\) will be provided in Theorem 2.

Using the set of securities \(L^*(p)\), the agent can construct a consumption scheme by exercising freedom over two dimensions: (i) which security she wishes to purchase (\(L^*(p)\); the choice is over \(p\)) and (ii) how much of that security she wishes to purchase. Let \(N\) be the number of securities she purchases. Let \(C_S\) be the consumption she enjoys if each unit of her security pays \(M\). Let \(C_F\) be the consumption she gets if each unit of her security pays \(0\). Without loss of generality, we fix \(\pi = 1\) going forward, since the nominal amount she invests in the bet can be adjusted by \(N\).
Starting from \(C_0\), the consumption scheme is:

Success (probability \(p\)): \(C_S = C_0 + N(M - 1)\)
Fail (probability \(1-p\)): \(C_F = C_0 - N\)

Note here that because \(L^*(p)\) is a martingale, the structure of her consumption scheme is also a martingale (this can easily be verified algebraically by eliminating \(N\) in equations (3.5) and (3.6) and using (3.7) below), namely:

\[ C_0 = p\,C_S + (1-p)\,C_F. \tag{3.3} \]

Indeed, \(p\,C_S + (1-p)\,C_F = C_0 + N[p(M-1) - (1-p)]\), and the bracketed term is zero by (3.7).

To summarize, we have created a set of securities that abstracts away from variations in the first moment (expected returns) by imposing the martingale property. We also simplify the distributional structure of these fair games by assuming a binomial payoff. All this is to ensure a relentless focus on the third moment, skewness, a focus which will be relaxed in the sections that follow. Using binomial martingales, the agent can set up her consumption scheme by choosing \((p, N)\). Under this setup, we now move on to the utility maximization problem of the agent.

3.2.4 Utility Maximization

The single jump EU-maximization problem (with the extended \(L^*(p)\)) is:

\[ \max_{p \in (-\infty, 1),\; N \in [0, \infty)} (1-p)\,U(C_F) + p\,U(C_S) \tag{3.4} \]

such that:

\[ C_F = C_0 - N \tag{3.5} \]
\[ C_S = C_0 + N(M - 1) \tag{3.6} \]
\[ 0 = p(M - 1) + (1-p)(-1) \tag{3.7} \]

The following lemma allows us to reduce a dimension.

Lemma 1. Consider the single jump EU-maximization problem described in equations (3.4)-(3.7), with \(C_0\): initial wealth and \(R\): reference point (\(C_0 < R\)). Consider an associated reduced form of the EU-maximization problem:

\[ \max_{p \in (-\infty, 1)} (1-p)\,U(C_F) + p\,U(R) \tag{3.8} \]

such that \(C_F\) satisfies \(C_0 = pR + (1-p)C_F\); and let \(p^*\) be the maximizer (solution) of this reduced problem. Assume that the following holds:

\[ U'(C_F) := U'\!\left(\frac{C_0 - p^* R}{1 - p^*}\right) := \delta\, u'\!\left(\frac{C_0 - p^* R}{1 - p^*}\right) > u'(R) =: U'(R) \tag{3.9} \]

Then any optimal consumption scheme in the original problem satisfies \(C_S = R\).

Assumption (3.9) of Lemma 1 is not automatically satisfied unless \(\delta = 1\) (in which case it holds automatically, since \(C_F < R\) and marginal utility decreases in \(C\)), yet it is still innocuous. It only requires that the marginal utility at \(C_F\) is higher than the marginal utility at \(R\).
This is intuitive; once the agent attains the reference (aspiration) point, the hankering dissipates and marginal utility goes down, almost by definition.

This Lemma is useful because for each and every $p$, we can forget about the argument $N$. Any $N$ which induces a $C_S$ higher than $R$ is irrelevant for the purpose of the optimization, and thus $N$ is determined automatically. Hence, using Lemma 1, we can reduce the setup described in equations (3.4)-(3.7) to the following setup:

$$\max_{p \in (-\infty, 1)} (1-p)\,U(C_F) + p\,U(R) \tag{3.10}$$

such that $C_F$ satisfies $C_0 = pR + (1-p)C_F$.

From now on, we stick to this setup, which means we will only look at the parameter space $(\lambda, R, C_0) \in \mathbb{R}^3$ that satisfies assumption (3.9).

Before laying down a set of results that characterizes the behavior of solutions to the problem, we introduce an ancillary lemma that relates $p$ to the level of skewness embodied in security $L^*(p)$. The definition of skewness we use is ‘Pearson’s moment coefficient of skewness’:

$$S(p) = E\left[\left(\frac{X - \mu}{\sigma}\right)^{3}\right]$$

Lemma 2. Let $S : p \in (0, 1) \mapsto \mathbb{R}$ denote ‘Pearson’s moment coefficient of skewness’ of the consumption scheme induced by $L(p)$. Then:
(i) $S(p)$ is monotonically decreasing in $p$
(ii) $S(p) \uparrow \infty$ as $p \downarrow 0$
(iii) $S(\tfrac{1}{2}) = 0$
(iv) $S(p) \downarrow -\infty$ as $p \uparrow 1$

This Lemma describes the relationship between $p$ and the skewness implied by $p$. In particular, it shows that skewness is monotone in $p$ and hence, the $p^*$ chosen as the solution to the EU-maximization problem (3.10) uniquely determines the level of skewness that the agent’s choice implies. If $p^* = \tfrac{1}{2}$, the agent is demanding a symmetric security. If $p^* < \tfrac{1}{2}$, the agent is demanding a positively skewed security, that is, a security that has a ‘lottery-like’ feature. If $p^* > \tfrac{1}{2}$, the agent is demanding a negatively skewed security, e.g., a security that delivers modestly positive returns most of the time, but very negative returns in rare, unfortunate states.
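Lemma 2 can be checked numerically. For any two-point distribution that takes its higher value with probability $p$, Pearson's moment coefficient of skewness reduces to $(1-2p)/\sqrt{p(1-p)}$, independently of the two payoff levels. The sketch below evaluates this closed form and also recomputes skewness directly from the definition; the function names are illustrative and do not come from the text.

```python
import math


def binomial_skewness(p):
    """Closed-form Pearson skewness of a two-point gamble whose higher payoff has probability p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return (1.0 - 2.0 * p) / math.sqrt(p * (1.0 - p))


def skewness_from_definition(p, c_success, c_fail):
    """E[((X - mu) / sigma)^3] for the scheme paying c_success w.p. p and c_fail w.p. 1 - p."""
    mu = p * c_success + (1.0 - p) * c_fail
    var = p * (c_success - mu) ** 2 + (1.0 - p) * (c_fail - mu) ** 2
    third = p * (c_success - mu) ** 3 + (1.0 - p) * (c_fail - mu) ** 3
    return third / var ** 1.5


if __name__ == "__main__":
    # Skewness falls monotonically in p and changes sign at p = 1/2.
    for p in (0.05, 0.25, 0.5, 0.75, 0.95):
        print(f"p = {p:.2f}  S(p) = {binomial_skewness(p):+.3f}")
```

Evaluated at $p = 0.1122$, the closed form gives a skewness of roughly 2.46, the value quoted alongside $p^* = 11.22\%$ in Example 1 of Section 3.6.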
Note that when $p^* > \tfrac{1}{2}$, the returns during the bad state have to be ‘very negative’ in order to honor the martingale assumption, since they have to compensate for the high ($> \tfrac{1}{2}$) likelihood of reaching a good state. This is increasingly so as $p^*$ approaches 1.

3.3 Single Jump Optimization

Now, we go back to the single jump EU-maximization problem. We specialize $u(\cdot)$ to be the power utility function with $\gamma > 1$. For technical comfort (and realism), we assume $1 \le C_0\ (< R)$. First, it is natural to assume that $C_0$ is lower than $R$. Second, requiring $1 \le C_0$ ensures that the utility value is positive on that domain, so that multiplying by $\lambda$ is indeed a discount. This does not harm generality, since we can always translate the utility function to be positive. Also, to make the problem non-trivial, we assume $0 < \lambda < 1$. Theorems 1 and 2 characterize the solution to the EU-maximization problem, $p^*$.

3.3.1 The ‘Four Seasons of Gambling’

Theorem 1. Let $u(C)$ be the power utility function and let $L^*(p)$ be an extended binomial martingale with $p$: "probability" of success ($p \in (-\infty, 1)$). Fix $C_0$: initial wealth, and consider the reduced single jump EU-maximization problem (3.10) with $R$: reference point ($C_0 < R$). Let $p^*$ be the "probability" that maximizes the expected utility. Then:
(i) $\dfrac{\partial p^*}{\partial (C_0/R)} > 0$
(ii) as $C_0/R \uparrow 1$ (i.e., as $C_0 \uparrow R$), $p^* \uparrow 1$
(iii) $\exists\, C_0/R \in (0, 1)$ such that $p^* \le 0$.

Theorem 1 describes how $p^*$, the optimal security demanded by the agent with aspirational preferences, changes with the position of her aspiration ($R$) relative to her current consumption ($C_0$).

(i) asserts that as $R$ moves further out, away from $C_0$, the agent demands a more positively skewed security. In everyday parlance, this means that as her aspiration becomes ‘unrealistically high’, the agent starts to demand more and more ‘lottery-like’ securities to meet the aspiration.
When $C_0$ is far away from $R$, attempting such a big jump in consumption ($R - C_0$) with high chance comes at the cost of disastrously low consumption if the attempt fails. (Recall that all securities in the choice set are fair gambles.) Thus, the agent rationally avoids the abysmally low utility levels in bad states through the purchase of a positively skewed security (with low $p^*$). Moreover, the positive sign on the derivative asserts that this relationship is monotonic. That is, as agents’ aspirations move far from (close to) $C_0$, the optimal choice of skewness will increase (decrease) monotonically.

(ii) describes the opposite situation. As $R$ becomes indistinguishably close to $C_0$, the agent prefers a negatively skewed security, one which takes her to the aspiration ($R$) with high chance at the expense of a low-chance event of a disaster. While it is true that the martingale assumption requires that $C_F$ be low (because $p^*$ is close to 1), the proximity of $C_0$ to $R$ ensures that the associated $C_F$ is not unacceptably low. In other words, the proximity of the aspiration ensures that the agent can avoid exposing herself to a big downside risk, even while she is taking negatively skewed bets.

(iii) describes what happens when the aspiration is too far, or equivalently, when the agent is ‘too poor’. (iii) asserts that at some point, the aspiration will be so remote that it is optimal to choose $p^* \le 0$. At this point, however, we are interpreting $p^*$ as ‘optimal weights’, because $p^*$ has no real-world analogue. The full meaning of (iii) will be revealed in Theorem 2.

Taken together, (i), (ii), and (iii) show that as $R$ gets pushed away from $C_0$, the agent’s demand runs through the entire gamut of securities, starting from the most negatively skewed to the most positively skewed. This happens for any fixed $\lambda \in (0, 1)$, which determines the ‘size of jump.’

Theorem 2. Assume (as is the case in the real world) that only $L(p)$ with $p \in (0, 1)$ is available.
When $p^* \le 0$, the agent chooses $C_0$ over any $L(p)$ with $p \in (0, 1)$.

Theorem 2 and Theorem 1 together tell us what happens as $R$ increases relative to $C_0$. At first, as $C_0$ stands close to $R$, the agents demand negatively skewed securities for the reasons explained above. As $R$ increases, the agents start demanding more symmetric securities (e.g., stocks), then positively skewed ‘lottery-like’ securities. As we move $R$ further away from $C_0$, agents eventually reach a certain threshold where they stay with $C_0$, and thereby stop demanding risky securities altogether. We may call this the ‘four seasons of gambling’, as depicted in Figure 3.2. The horizontal axis represents $\lambda \in [0, 1]$, and the vertical axis represents $R/C_0 \in [1, 3]$. Fix any $\lambda$, and start from $R/C_0 = 1$. Theorems 1 and 2 say that when $R - C_0 \approx 0$, agents start fresh by buying negative skewness (Spring, light blue). Then, as $R/C_0$ increases, they start to buy symmetric securities (Summer, yellow), then positively skewed securities (Autumn, brown), and ultimately, they stop (Winter, navy). Although the picture is truncated at $R/C_0 = 3$, if we were to elongate the picture, we would indeed see that winter hits agents for all $\lambda$.

Figure 3.2: The Four Seasons of Gambling

It is worth mentioning the role of volatility in the optimization problem, which provides an interesting contrast to the familiar mean-variance (MV) framework. From equation (C.1), we know that volatility is increasing in $p^*$, and inversely related to skewness. This means that when $R$ is very close to $C_0$, agents choose a negatively skewed security, even though it embodies a very high volatility. Intuitively, this can be interpreted as a situation where the aspiration point is tantalizingly close to the current income $C_0$, and the agent will choose to attain it (and hence enjoy the utility that hikes over the kink) with very high probability, even at the expense of the commensurately low $C_F$ and the large volatility this entails.
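The progression through the ‘seasons’ can be reproduced with a small numerical sketch of the reduced problem (3.10). Several things here are assumptions rather than details taken from the text: the shifted power utility $u(C) = (C^{1-\gamma} - 1)/(1-\gamma)$, the below-reference discount written as `lam`, the illustrative values $\gamma = 2$, `lam` $= 0.8$, $C_0 = 2$, the lower search bound `p_min`, and the plain grid search itself, which stands in for whatever solution method the author used.

```python
def u(c, gamma):
    """Shifted power utility with u(1) = 0 (assumed functional form; gamma != 1)."""
    return (c ** (1.0 - gamma) - 1.0) / (1.0 - gamma)


def reduced_eu(p, lam, R, C0, gamma):
    """Reduced objective (1 - p) * lam * u(C_F) + p * u(R); C_F < R whenever C0 < R."""
    c_fail = (C0 - p * R) / (1.0 - p)  # martingale constraint C0 = p R + (1 - p) C_F
    return (1.0 - p) * lam * u(c_fail, gamma) + p * u(R, gamma)


def solve_p_star(lam, R, C0, gamma, p_min=-1.0, steps=100_000):
    """Grid search for the maximizing weight on [p_min, C0 / R), where C_F stays positive."""
    p_max = C0 / R
    best_p, best_val = p_min, float("-inf")
    for i in range(steps):
        p = p_min + (p_max - p_min) * i / steps
        val = reduced_eu(p, lam, R, C0, gamma)
        if val > best_val:
            best_p, best_val = p, val
    return best_p


def describe(p_star):
    """Map the optimal weight to the qualitative regimes discussed in the text."""
    if p_star <= 0.0:
        return "no gamble (keep C0)"
    if p_star < 0.5:
        return "positively skewed security"
    if p_star > 0.5:
        return "negatively skewed security"
    return "symmetric security"


if __name__ == "__main__":
    for R in (2.2, 3.0, 4.0):
        p_star = solve_p_star(lam=0.8, R=R, C0=2.0, gamma=2.0)
        print(f"R/C0 = {R / 2.0:.2f}  p* = {p_star:+.4f}  {describe(p_star)}")
```

Under these assumed parameters the optimal weight moves from above one half, to below one half, to a non-positive value as $R$ rises, which is the ordering that Theorems 1 and 2 describe. A result equal to `p_min` would indicate that the search interval should be widened.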
On the other hand, when $R$ moves farther away from $C_0$, agents see it as ‘too remote’. They grow increasingly interested in limiting the downside risk, while hitting the aspiration point becomes a lottery event. As the downside risk is reduced, their hopes of hitting the aspiration point also increasingly evaporate. These forces diminish the volatility embedded in the optimal $L(p^*)$, until ultimately, agents are unwilling to accept any volatility whatsoever ($p^* \le 0$) and the four seasons of gambling are over. In short, the starkest deviation from the mean-variance framework (recall that in the MV framework with a martingale setting, agents do not tolerate any volatility in the security) occurs when the aspiration point is very close to the current consumption level $C_0$, in the sense that agents are willing to buy securities with huge volatilities embodied in them in order to obtain $R - C_0$. Ultimately, this contrast with MV dissipates as $R$ moves far away from $C_0$. These findings preview a later section, which explores the relative roles of volatility and skewness in greater detail.

3.3.2 Comparative Statics

We are now ready to harvest a few comparative statics results that illustrate the choice of skewness by aspirational agents.

Size of Jump

Theorem 3. $\dfrac{dp^*}{d\lambda} < 0$

Theorem 3 explains what happens as $\lambda$ increases. Recall that $\lambda$ is how much the agent keeps when consumption falls short of the reference point $R$, hence $(1 - \lambda)$ determines the ‘size of jump’. As $\lambda$ increases, agents take the aspiration less seriously, and can hence afford to attain it with lower probability (like a lottery). In other words, when $\lambda$ is high, making it less important to attain $R$, agents demand securities with higher skewness, which limits the downside risk (i.e., a low $C_F$) at the expense of a lower chance of attaining $R$. However, as $\lambda$ decreases, agents must take the kink more seriously, forcing them to give up more of their $C_F$ in the bad state to ensure that they reach $R$ with higher probability.
The agents thus demand increasingly negatively skewed securities. Figure 3.2 also illustrates this.

Initial Endowment

Theorem 4. Fix $\lambda$, $\gamma$, and $\kappa := \dfrac{C_0}{R}$. Then, $\left.\dfrac{\partial p^*}{\partial C_0}\right|_{\lambda, \gamma, \kappa} > 0$.

Theorem 4 is also very intuitive. As agents become wealthier, they need to worry less and less about the disastrous states where marginal utility is extremely high. Hence, they can afford to take more and more downside risk, thereby demanding less skewed securities. This coincides with the empirical findings (e.g., Kumar (2009)) that report more active engagement in lottery-like bets among consumers with lower income. Note that Theorem 4 is saying a little more than Theorem 1-(ii), which is effectively saying that $\left.\dfrac{\partial p^*}{\partial C_0}\right|_{\lambda, \gamma, R} > 0$. The difference here in Theorem 4 is that we are fixing $\kappa := \dfrac{C_0}{R}$ instead of $R$. Thus, here we are not decreasing the distance between $C_0$ and $R$ as we increase $C_0$. The point here is that $p^*$ increases even when $R$ increases along with $C_0$, and that the increase of $p^*$ is purely an endowment effect, rather than the effect of $C_0$ approaching $R$ as in Theorem 1-(ii).

Risk Aversion

Theorem 5. Fix $C_0$ and $\lambda$. Suppose $R$ is ‘big enough’ to satisfy: R C 0 " ( R C 0 ) log R C 0 ( R C 0 ) 1 # +O 1 R <; where $O(\tfrac{1}{R})$ is a positive term which vanishes at the rate of $\tfrac{1}{R}$. Then, $\left.\dfrac{\partial p^*}{\partial \gamma}\right|_{\lambda, C_0, R} < 0$.

We first give some numerical examples to get a feel for how stringent the assumption is. For reasonable parameter values such as $\lambda = 0.8$, $\gamma = 2$, $C_0 = 2$, $\dfrac{\partial p^*}{\partial \gamma} < 0$ will hold whenever $R > 3.491$. For $\lambda = 0.8$, $\gamma = 2$, $C_0 = 1.5$, $\dfrac{\partial p^*}{\partial \gamma} < 0$ will hold whenever $R > 2.227$. For $\lambda = 0.8$, $\gamma = 2$, $C_0 = 1$, $\dfrac{\partial p^*}{\partial \gamma} < 0$ will hold whenever $R > 1.159$.

The role of this $R$-threshold (above which full monotonicity holds) can be interpreted as follows. Suppose the agent has $C_0$ in her hands. Higher $\gamma$ implies that her utility function is more concave.
In the power utility setting, this extra concavity is achieved by pulling down the utility of the agent both on positive outcomes ($C > 1$) and negative outcomes ($C < 1$), with $C = 1$ as the anchoring case. (Refer to the figure below.) In our setting, the agent with higher $\gamma$ discounts both $u(C_S)$ and $u(C_F)$ more heavily than in the log-utility case. This has consequences for $p^*$.

The depressed $u(C_S)$ affects $p^*$ unequivocally; it acts to lower $p^*$. Intuitively, this is because the reduced upside discourages the agent from taking much downside risk (in the form of lower $C_F$) in return, and the agent consequently decreases $p^*$ to ensure that $C_F$ does not fall too low. (Recall that we are envisioning a martingale situation.) Hence, the agent with higher $\gamma$ chooses lower $p^*$.

Meanwhile, the effect of the depressed $u(C_F)$ on $p^*$ can go both ways. When $\gamma$ is higher and the downside is even lower, the agent faces a predicament: on the one hand, she wishes to avert the painful depth of the downside by choosing to increase $C_F$. However, this can only be done at the cost of a lowered $p^*$, again because of the martingale assumption. The lowered $p^*$ means that she is undermining her very chance of avoiding the downside, albeit perhaps a less painful one. Since the agent faces this inevitable trade-off, the effect of the depressed $u(C_F)$ is ambivalent.

Nevertheless, we can still say this with absolute certainty: as $C_S = R$ grows, so does the magnitude by which $u(C_S)$ gets depressed. (Refer to the figure below.) This means that the unequivocal effect of the depressed upside (i.e., lower $p^*$) will grow bigger and bigger, until at some point it dominates the (ambivalent and hence limited) effect of the depressed downside. This is precisely our assumption in Theorem 5: as $R$ exceeds a certain threshold, the effect of $\gamma$ on $p^*$ becomes monotonic.
Modulo the assumption, the broad-strokes conclusion of Theorem 5 is intuitive: agents with higher risk aversion will choose lower-$p^*$ securities to minimize their perceived downside.

Figure 3.3: Power utility functions with different $\gamma$

3.4 Double Jump Optimization

It is realistic to envision a situation where there are multiple aspirations. For example, while the aspirational agent may be eager to accomplish a goal that she has yet to attain, she could equally be concerned that a past accomplishment may be revoked. We model this situation as two jumps, where $C_0$ lies in between the two aspirations ($R_1 < C_0 < R_2$). We follow the same flow as in the single jump case: we first offer the (doubly aspirational) agents a choice set of securities, each with a specific level of skewness embedded, and then observe their behavior.

3.4.1 Setup

We modify the $U(C)$ in (3.1) to accommodate two aspirations; let

$$U(C) = \lambda_2\, u(C)\,\mathbf{1}_{[0, R_1)}(C) + \lambda_1\, u(C)\,\mathbf{1}_{[R_1, R_2)}(C) + u(C)\,\mathbf{1}_{[R_2, \infty)}(C) \tag{3.11}$$

with $0 < \lambda_2 < \lambda_1 < 1$. The double jump EU-maximization problem is identical to (3.4)-(3.7), except for the modified $U(C)$. Consequently, we can use Lemma 1 again to reduce the dimension just as before. Therefore, take any 5-tuple of parameters

$$(\lambda_1, R_1, \lambda_2, R_2; C_0) \tag{3.12}$$

with $0 < \lambda_2 < \lambda_1 < 1$, $R_1 < C_0 < R_2$ and assumption (3.9) of Lemma 1. We can then define the double jump EU-maximization problem to be:

$$\max_{p}\; (1-p)\,U(C_F) + p\,U(R_2) \tag{3.13}$$

such that $C_F$ satisfies (3.4). Since any $(\lambda_1, R_1, \lambda_2, R_2; C_0)$ (with parameters in a suitable range) defines such a problem, we will succinctly denote the EU-maximization problem as $\mathcal{M}(s)$ for $s = (\lambda_1, R_1, \lambda_2, R_2; C_0)$. Figure 3.4 illustrates an aspirational utility with two jumps.

It turns out that the behavior of agents given this doubly aspirational setup can easily be understood by drawing on our previous results with single-aspirational agents.
To establish this connection, we consider a corresponding single-aspirational setup. Consider $\tilde{s} = (R_2, \lambda_2; C_0)$ and define a single-jump problem using these three parameters. This is nothing more than the single jump problem generated by erasing the intermediate jump. We denote this EU-maximization problem as $\mathcal{M}(\tilde{s})$ for $\tilde{s} = (R_2, \lambda_2; C_0)$, and call this the single jump problem associated with $\mathcal{M}(s)$.

Figure 3.4: Aspirational Utility with Two Jumps

We now introduce some notation and definitions.

Definition 1. Let $S \subset \mathbb{R}^5$ be the set of all $(\lambda_1, R_1, \lambda_2, R_2; C_0)$ such that $0 < \lambda_2 < \lambda_1 < 1$, $R_1 < C_0 < R_2$ and assumption (3.9) holds. Consider $s = (\lambda_1, R_1, \lambda_2, R_2; C_0) \in S$, and $\tilde{s} = (R_2, \lambda_2; C_0)$. Then, $s$ is indistinguishable from $\tilde{s}$ iff $\mathcal{M}(s)$ induces the same solution (maximizer $p^*$ and maximized EU) as $\mathcal{M}(\tilde{s})$.

Intuitively, when the parameters are arranged in such a way that one of the two aspirations (in this case, the lower one, $R_1$) does not play any role in the agent’s maximization problem, we may say that such a parameter setting $s$ is indistinguishable from its ‘reduced form’ single-jump parameter setting $\tilde{s}$. In this case, it is redundant to think of it as a double-jump problem because it is possible to reduce it to a single-jump problem. This definition allows us to partition $S$ into two parts.

Definition 2. $H = \{s \in S : s \text{ is indistinguishable from } \tilde{s}\} \subset S$

Naturally, $S = H \,\dot{\cup}\, H^c$ and any $s \in S$ belongs to either $H$ or $H^c$. When $s \in H$, the parameters are configured in such a way that the problem can be reduced to a single-kink problem, and in this sense the double kink problem is ‘trivial’. When $s \in H^c$, the parameters do not allow for such a reduction.

3.4.2 Optimal Choice Under Two Jumps

A natural question to ask given this setup would be: when is reduction possible ($s \in H$) and when is it not ($s \in H^c$)? The following Theorem addresses this question.

Theorem 6. Fix all 4 elements of $s$ except $R_1$ (i.e., fix the ex-$R_1$ 4-tuple of $s$, and allow $R_1$ to float).
Let $R_1^* = \inf\{R_1 : (\lambda_1, R_1, \lambda_2, R_2; C_0) \in H\}$. Then, $R_1^*$ acts as a demarcation point; when $R_1 < R_1^*$, $(\lambda_1, R_1, \lambda_2, R_2; C_0) \in H^c$ and when $R_1 \ge R_1^*$, $(\lambda_1, R_1, \lambda_2, R_2; C_0) \in H$. Moreover, when $R_1^* > 1$, $R_1^* \uparrow R_2$ as $\lambda_1 \uparrow 1$.

This Theorem helps us discern whether the double-jump problem is an authentic double-jump problem, or whether it is reducible to a single-jump problem. The pivot variable is $R_1$, the intermediate location of the jump. The Theorem indicates that there is a threshold value ($R_1^*$), above which all problems are reducible, and vice versa. Roughly speaking, the higher the $\lambda_1$ and the lower the $R_1$, the more likely the problem is authentically double jumped, in which case the intermediate reference point actually matters. This is intuitive; higher $\lambda_1$ heightens the vertical stature of the intermediate reference point. Similarly, lower $R_1$ increases the horizontal stature of the intermediate reference point, and hence its importance. Overall, when these forces wrinkle the utility function sufficiently, the EU-optimization departs from a single jump problem.

Moreover, when $\lambda_1$ increases (and approaches 1 monotonically), the intermediate jump becomes significant for monotonically increasing sets of $R_1$, until $\lambda_1$ is so high that the jump is significant for any $R_1$ less than $R_2$. This monotonicity, however, holds only on $R_1^* > 1$, and this requirement ensures that we are not looking at a rather degenerate case where $R_2$ and $\lambda_2$ are simultaneously so low that the problem is dominated by the near-zero marginal utility induced by $\lambda_2 \approx 0$.

The natural order of business now is to describe the behavior of $p^*$ when $s \in H^c$, i.e., when the double jumped problem truly involves two aspirations. This is done in the following theorem.

Theorem 7. On any $s \in H^c$, $\mathcal{M}(s)$ admits a solution $p^*$ which is characterized by $\kappa = \dfrac{C_0 - R_1}{R_2 - R_1}$ (which represents the position of initial wealth ($C_0$) relative to the two reference points), namely:
(i) $p^*$ is monotonically increasing in $\kappa$; $\dfrac{\partial p^*}{\partial \kappa} > 0$.
(ii) $\lim_{\kappa \downarrow 0^+} p^* \le 0$
(iii) $\lim_{\kappa \uparrow 1} p^* = 1$

Theorem 6 and Theorem 7 together describe how agents react to a double jump problem. Imagine a situation where the agent is sitting on an initial wealth. On the upside, she sees a dream to which she aspires. On the downside, there is a point below which she does not want to drop (e.g., she does not want to fall below a level where she will be unable to pay the rent, and will be kicked out of the neighborhood). In some cases, the downside drop may be ignored ($s \in H$). Theorem 6 tells us that this is the case if the drop itself is not too big (e.g., when $\lambda_1$ is low and $\lambda_1 - \lambda_2$ is relatively small) or if the drop is too close to the aspiration point (e.g., when $R_1 \approx R_2$, so the agent already lumps them together when she makes decisions).

When these are not the case, the agent has to take both jumps seriously ($s \in H^c$). Theorem 7 tells us what happens in this case. The agent gets ‘trapped’ in between the aspirations, in the sense that the agent’s demand for securities is determined by $\kappa$, the position of her initial wealth relative to the two aspirations. When her initial wealth is close to $R_1$, the agent, in fear of dropping below $R_1$, demands more lottery-like securities that minimize the downside, until at some point she stops demanding risky securities altogether and just consumes $C_0$ ($p^* \le 0$). On the other hand, when $C_0$ is safely far from $R_1$ and close to $R_2$, the agent begins to demand negatively skewed securities that get her to $R_2$ with great certainty, albeit at the risk of (much less likely) slips below $C_0$.

3.5 Departure from Fair Gambles: Sub-martingale and Super-martingale

The analysis thus far has limited its scope to martingales (fair games with zero returns). This allowed undivided focus on the choice of skewness; however, it raises the question: what happens when we vary the first (mean returns) and second (volatility) moments?
We now add variations in the first moment to see how this affects the choice of skewness.

3.5.1 Setup

First, we consider sub-martingales. Sub-martingales are stochastic processes with positive drift. Recall that $C_S = C_0 + N(M-1)$ and $C_F = C_0 - N$. To specify the deviation from the martingale, let:

$$\delta := p(M-1) - (1-p) \ge 0. \tag{3.14}$$

Namely, $\delta$ is the expected gain from buying one unit of $L(p)$, which costs 1 to purchase. Thus, the positive sign on $\delta$ is what makes this gamble a sub-martingale. When the agent purchases $N$ units of $L(p)$, the expected value of consumption is:

$$E[C] = pC_S + (1-p)C_F = C_0 + N\delta.$$

The second term ($N\delta \ge 0$) represents the ‘better than fair’ component in the consumption scheme. Recall that in the martingale case, $E[C] = C_0$ and $N$ was automatically pinned down by Lemma 1 and $\delta = 0$.

We first state formally the EU-optimization problem in the sub-martingale case:

$$\max_{p \in (-\infty, 1),\; N \in [0, \infty)} (1-p)\,U(C_F) + p\,U(C_S) \tag{3.15}$$

such that:

$$C_F = C_0 - N$$
$$C_S = C_0 + N(M-1)$$
$$\delta := p(M-1) - (1-p) \ge 0.$$

Note that the analogue of (3.3) in the sub-martingale case (obtained by eliminating $N$ from (3.5), (3.6) and substituting (3.14) in) is:

$$\frac{p}{1+\delta}\,C_S + \frac{(1-p)+\delta}{1+\delta}\,C_F = C_0. \tag{3.16}$$

This is the ‘budget constraint’ on the consumption scheme, set by the sub-martingale condition (3.14). Note that for every $C_0$ and $C_S$, this sub-martingale budget constraint implies a $C_F$ higher than that implied by the martingale budget constraint (3.3). This reflects the fact that $L(p)$ here represents a ‘better than fair’ gamble, namely that $\delta \ge 0$.

In the sub-martingale case, we cannot automatically invoke Lemma 1 anymore because Assumption (3.9) takes a different form, as is outlined in Lemma 3 below. In principle, we therefore have to solve the full problem, which amounts to maximizing (3.15) under the budget constraint (3.16). Doing so yields an interesting departure from the martingale case. The following two subsections illustrate the two consequences.

3.5.2 $\delta > 0$: ‘Winter (keep $C_0$)’ Is Replaced by Near-Arbitrage

Theorem 8.
For any $\delta > 0$, $\exists\, p^* \in (0, 1),\ N^* \in (0, \infty)$ such that $EU(p^*, N^*) > U(C_0)$.

The significance of this theorem is that in the sub-martingale case, the default option for the EU-maximizing agent is no longer to ‘do nothing’ (i.e., just sit on $C_0$) as in the Mean-Variance case. By doing nothing, the agent gets $U(C_0)$, which is strictly dominated by $EU(p, N)$ for a suitable choice of $p$ and $N$, which is always available as long as $\delta > 0$. This raises the question: how can agents always enjoy an expected utility level higher than $U(C_0)$ even when they are risk-averse? The answer is that when agents are allowed to choose the level of skewness embedded in the securities, they effectively end up selling a lottery, whose payoff profile increasingly resembles an arbitrage as $p \uparrow 1$. (Note that here, the term ‘lottery’ is used in the colloquial sense; the purchase of a lottery entails a loss in the mean, typically with a high level of positive skewness, and vice versa for the sale of a lottery.) To see this, note that the payoff structure of $L(p)$ in the sub-martingale case (obtained by manipulating the constraints in problem (3.15)) is:

Payoff of $L(p)$, starting from 0:
Fail (probability $1-p$): $-1$
Success (probability $p$): $\dfrac{\delta + (1-p)}{p}$

In particular, when $p \approx 1$,

Fail (probability $\approx 0$): $-1$
Success (probability $\approx 1$): $\approx +\delta$

whose payoff almost resembles an arbitrage, increasingly so as $p \uparrow 1$. Such a near-arbitrage opportunity would push $N$ up indefinitely; however, $N$ is bounded by the constraint $C_F(p^*) > 0$, which effectively acts as an upper bound for $N$. The following is a pictorial illustration of Theorem 8.

Figure 3.5: Almost Achieving Arbitrage by Selling Lottery: $p_{SM} \approx 1$

3.5.3 $\delta > 0$: Lowers Demand for Positive Skewness

Positive $\delta > 0$ (strict sub-martingale) introduces an additional effect, encapsulated in the following result. The assumption required for Theorem 9 borrows from Lemma 3, which follows.

Theorem 9. Assume (3.18) holds. Then $\delta > 0$ implies $p^*_{SM} > (1+\delta)\,p^*$, where $p^*$ is the martingale solution (i.e., the solution when $\delta = 0$). In particular, (i) $p^*_{SM} > p^*$, and (ii) $p^*_{SM}$ has a lower bound that is strictly increasing in $\delta$.
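The mechanism behind Theorem 8 can be illustrated with a quick numerical check: hold the number of units $N$ fixed and push $p$ towards 1 while the expected gain per unit (the drift in (3.14), called `drift` below) stays positive. The sketch assumes a shifted power utility $u(C) = (C^{1-\gamma} - 1)/(1-\gamma)$ with a discount `lam` applied below $R$, and illustrative values $\gamma = 2$, `lam` $= 0.8$, $R = 5$, $C_0 = 2.8$, `drift` $= 0.05$, $N = 1$; none of these choices, nor the function names, are taken from the text.

```python
def u(c, gamma):
    """Shifted power utility with u(1) = 0 (assumed functional form; gamma != 1)."""
    return (c ** (1.0 - gamma) - 1.0) / (1.0 - gamma)


def U(c, lam, R, gamma):
    """Aspirational utility: u(C) at or above R, lam * u(C) below it."""
    return u(c, gamma) if c >= R else lam * u(c, gamma)


def submartingale_eu(p, N, drift, lam, R, C0, gamma):
    """Expected utility from N units of L(p) when each unit has expected gain `drift`."""
    gain = (drift + 1.0 - p) / p   # net payoff per unit on success, i.e. M - 1
    c_success = C0 + N * gain
    c_fail = C0 - N                # each unit costs 1 and pays nothing on failure
    return (1.0 - p) * U(c_fail, lam, R, gamma) + p * U(c_success, lam, R, gamma)


if __name__ == "__main__":
    lam, R, C0, gamma = 0.8, 5.0, 2.8, 2.0
    stay = U(C0, lam, R, gamma)
    for p in (0.5, 0.9, 0.99):
        eu = submartingale_eu(p, 1.0, 0.05, lam, R, C0, gamma)
        print(f"p = {p:.2f}  EU = {eu:.4f}  U(C0) = {stay:.4f}")
```

With these assumed numbers, the gamble is worse than staying at $C_0$ for $p = 0.5$ but better for $p = 0.99$, while the same position with zero drift never beats $U(C_0)$. This is only a spot check at one $(p, N)$ pair, not a search for the optimum.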
The general message of this Theorem is that $p^*_{SM}$ has a well-defined lower bound, $(1+\delta)\,p^*$, which increases linearly in $\delta$. In particular, Theorem 9-(i) means that in the sub-martingale ($\delta > 0$) case, where the expected return is no longer 0, agents tend to go ‘more aggressive’, and choose a security that achieves $R$ with a higher probability than if $\delta = 0$. This means that the optimal demand for skewness drops in the sub-martingale case. Theorem 9-(ii) states that this ‘aggressiveness’ will most likely grow with $\delta$; the stronger the positive drift, the less skewed are the securities demanded. Thus, in spirit, (ii) essentially means "$\dfrac{\partial p^*_{SM}}{\partial \delta} > 0$".

The result can be understood intuitively in the following way. Higher $p^*_{SM}$ is desirable because it increases the chance of attaining the good outcome $U(C_S(p^*_{SM}))$. The trade-off is, of course, that it pushes $C_F(p^*_{SM})$ lower. However, this downside can be mitigated if $\delta$ is positive, because $\delta$ allows agents to reduce the exposure to the gamble ($N_{SM}$) required to reach their aspired goals ($R$). Thus, the EU-maximizing agent can now afford to enjoy a higher $p^*_{SM}$, hence $\delta > 0$ implies $p^*_{SM} > p^*$. This mitigating effect is stronger if the positive drift, $\delta$, is stronger.

Lastly, we close the loop by introducing the following Lemma, which shows how Assumption (3.9) has to be modified in the sub-martingale case.

Lemma 3. Consider the single-kink EU-maximization problem described in (3.4), with constraints (3.5), (3.6) and (3.14). Consider an associated reduced form EU-maximization problem:

$$\max_{p \in (-\infty, 1)} (1-p)\,U(C_F) + p\,U(R)$$

such that $C_F$ satisfies

$$\frac{p}{1+\delta}\,R + \frac{(1-p)+\delta}{1+\delta}\,C_F = C_0 \tag{3.17}$$

and let $p^*_{SM}$ be the maximizer (solution) of this reduced problem. Let $C_F(p^*_{SM})$ denote the $C_F$ which satisfies (3.17) at $p^*_{SM}$. Assume that the following holds:

$$U'(C_F(p^*_{SM})) > U'(R)\left(1 + \frac{\delta}{1 - p^*_{SM}}\right) \tag{3.18}$$

Then any optimal consumption scheme in the original problem satisfies $C_S = R$.
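For completeness, the budget constraint (3.17), and likewise (3.16), can be recovered in a few lines from the definitions of $C_S$ and $C_F$ and the expected gain per unit defined in (3.14), written $\delta$ here. The steps below are a sketch of the algebra the text alludes to, not a quotation from it.

```latex
\begin{align*}
\delta &= p(M-1) - (1-p)
  \;\Longrightarrow\; M - 1 = \frac{\delta + (1-p)}{p},\\
C_S &= C_0 + N(M-1) = C_0 + N\,\frac{\delta + (1-p)}{p},
  \qquad C_F = C_0 - N,\\
pC_S + (1-p)C_F &= C_0 + N\delta = C_0 + \delta\,(C_0 - C_F),\\
pC_S + \bigl((1-p) + \delta\bigr)C_F &= (1+\delta)\,C_0,\\
\frac{p}{1+\delta}\,C_S + \frac{(1-p)+\delta}{1+\delta}\,C_F &= C_0 .
\end{align*}
```

Setting $C_S = R$ in the last line gives (3.17), and setting $\delta = 0$ recovers the martingale constraint (3.3).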
It may look as though Assumption (3.18) is harder to satisfy than Assumption (3.9) because $\dfrac{\delta}{1 - p^*_{SM}} \to \infty$ as $p^*_{SM} \to 1$. However, it turns out that Assumption (3.18) is almost always satisfied for reasonable values of $\delta$, due to the Inada conditions. Intuitively, when $p^*_{SM}$ is close to 1 and $C_S \approx R$, $C_F(p^*_{SM})$ is already very close to 0, if not already 0. By the Inada conditions, this implies a very high $U'(C_F(p^*_{SM}))$, satisfying Assumption (3.18). Therefore, fortunately, we do not lose much by always assuming $C_S = R$, as it turns out to be a good approximation for the solution to the full problem. In conclusion, the sub-martingale problem does not allow for Lemma 1, but for practical purposes, we can approach the sub-martingale problem as a reduced-form optimization problem, just as we did in the martingale case.

3.5.4 $\delta < 0$: Super-martingales

By switching the sign of (3.14) we can extend the result in the previous subsection to super-martingales. Namely, when $\delta \le 0$, decreasing $\delta$ reduces $p^*$ and increases the preference for positive skewness (i.e., in spirit, "$\dfrac{\partial p^*_{SM}}{\partial \delta} > 0$" holds, as before). However, somewhat differently from the $\delta > 0$ case, the replacement of ‘no gambling’ by lottery sales is no longer available in the super-martingale case, simply because negative $\delta$ depletes the opportunity for near-arbitrage.

One small caveat in the super-martingale case is that the choice of $L(p)$ must be limited to those such that $p < 1 + \delta$ ($\delta < 0$). This is because $L(p)$ offers an arbitrage opportunity otherwise. (Since $C_S = M - 1 = \dfrac{\delta + 1 - p}{p}$ and $C_F = -1$, we need to ensure that $\delta + 1 - p > 0$; otherwise an arbitrage can be made by shorting $L(p)$.)

An interesting exercise can be done using super-martingales, starting from the following observation. Taylor approximating a CRRA utility function reveals a mechanical sign on the moments, in particular the mean (positive), volatility (negative) and skewness (positive). This conclusion can, in part, justify the sale and purchase of lotteries.
Namely, if we characterize a standard lottery as a security that has very high positive skewness yet a minutely negative return, we can justify trading lotteries by ascribing it to the demand for skewness, at the expense of a loss in the mean. Meanwhile, in the aspirational setting we provided, we show that a preference for negative skewness can also be justified ($p^* \approx 1$). In the CRRA setting, buying negative skewness would have to be corroborated by a positive mean, since negative skewness reduces utility. However, in the setting we provide, agents can choose to sell skewness and yet accept negative returns, providing another sharp distinction from the CRRA-based MV framework.

This is described in the figure below. The colors represent the amount of return that can be taken away from a martingale bet in order to stop the agent from buying $L(p)$. The red zone represents the situation where the magnitude of mean return that can be taken away (the foregoable return) is the highest, and the blue zone, the lowest. The horizontal axis represents the jump size, which decreases from left to right. The vertical axis represents the distance between $R$ and $C_0$, which increases from high to low. The picture shows clearly that there is some mean that can be traded away against the benefits of trading skewed securities, even when the agent is pursuing a negatively skewed payoff. In fact, the foregoable return is highest when the agent is trading negatively skewed securities, and lowest when trading positively skewed securities.

Figure 3.6: Maximum Tolerable Negative Expected Return

Overall, the picture is intuitive. The foregoable return is 0 when $p^* = 0$ because under that configuration, the agent can gain nothing by taking on a risky bet. When the agent is close to the aspiration point, and when the size of the jump is higher, the foregoable return is higher because there is something to gain from taking on risk to attain the aspiration point.
When the agent is too far from the aspiration point, or if the size of the jump is too small, the agent becomes increasingly unwilling to give up much in return.

3.6 Volatility and Skewness: A Role Analysis

Another meaningful departure (from the binomial martingale) is to separate volatility from skewness. In the stringent binomial martingale setting, the choice of $p^*$ simultaneously fixes the volatility and skewness of $L(p^*)$, as illustrated by Lemma 2. Because of this entanglement, it was impossible to see clearly what the agent’s preferences for volatility look like in the aspirational setting. The aim of this section is to explore what the relative roles of volatility and skewness are in the EU-maximization process. To isolate the two effects, our first goal is to design a set of securities that controls one, yet varies the other.

3.6.1 The New Consumption Scheme: Tri-nomial Martingales

We generate a variant of the binomial martingale introduced before. For the sake of brevity, we suppress the underlying $L(p)$ and start directly with the consumption scheme, keeping in mind that there always exists a corresponding $L(p)$.

Consumption scheme, starting from $C_0$:
Fail (probability $p_3$): $C_F = C_0 - N$
Stay (probability $p_2$): $C_M = C_0$
Success (probability $p_1$): $C_S = C_0 + N(M-1)$

where

$$p_1 + p_2 + p_3 = 1, \quad \text{and} \quad E[C] = p_1 C_S + p_2 C_M + p_3 C_F = C_0. \tag{3.19}$$

As before, (3.19) dictates that the trinomial consumption scheme is also a martingale. The key difference is that we assign positive probability mass ($p_2$) to an intermediate branch, $C_M = C_0$. Because the consumption level stays at $C_0$ on this node, the expected outcome is to (1) reduce volatility while (2) keeping skewness relatively stable. Indeed, Figure 3.7 illustrates the movements of volatility and skewness as the probability mass on $C_M$ (i.e., $p_2$) varies.

Figure 3.7: Top: Skewness, Bottom: Volatility

As expected, volatility is muted as $p_2$ increases, whereas skewness is kept relatively stable.
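The moments of a trinomial scheme are easy to tabulate directly. The sketch below assumes that the three consumption levels are held fixed while mass $p_2$ is moved onto the middle branch; with fixed levels, the martingale condition (3.19) then forces the two outer probabilities to shrink proportionally. The consumption levels and function names are illustrative and not taken from the text.

```python
import math


def trinomial_scheme(c_success, c_stay, c_fail, p_stay):
    """Probabilities (p1, p2, p3) that keep E[C] = c_stay for fixed consumption levels."""
    q = (c_stay - c_fail) / (c_success - c_fail)  # success probability when p_stay = 0
    return (1.0 - p_stay) * q, p_stay, (1.0 - p_stay) * (1.0 - q)


def moments(probs, values):
    """Mean, standard deviation and Pearson skewness of a discrete distribution."""
    mean = sum(p * x for p, x in zip(probs, values))
    var = sum(p * (x - mean) ** 2 for p, x in zip(probs, values))
    third = sum(p * (x - mean) ** 3 for p, x in zip(probs, values))
    return mean, math.sqrt(var), third / var ** 1.5


if __name__ == "__main__":
    levels = (5.0, 2.8, 2.0)  # (C_S, C_0, C_F), illustrative values only
    for p_stay in (0.0, 0.2, 0.4):
        probs = trinomial_scheme(*levels, p_stay)
        mean, sd, skew = moments(probs, levels)
        print(f"p2 = {p_stay:.1f}  mean = {mean:.3f}  sd = {sd:.3f}  skew = {skew:+.3f}")
```

The mean stays at $C_0$ for every $p_2$, so each row is a martingale scheme, and the case $p_2 = 0$ is the binomial scheme with the same $C_S$ and $C_F$.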
This allows us to analyze separately how volatility and skewness affect the agent’s demand for L(p). We begin by asking: if single-kinked EU-maximizing agents were offered the opportunity to choose securities with the same (similar) skewness but lower volatility (by increasing $p_2$), which security would the agent choose? For this, define:

Definition 3. Let $T(P; C)$ denote the trinomial consumption scheme depicted above, with $P = (p_1, p_2, p_3)$ and $C = (C_S, C_0, C_F)$ such that $p_3 C_F + p_2 C_0 + p_1 C_S = C_0$. Consider $B := C(P'; C')$ with $P' = (p_1, 0, p_3)$ and $C' = (C_S, C_F)$. B is the binomial consumption scheme associated to T.

In this definition, B is simply the binomial analogue of the trinomial consumption scheme when $p_2 = 0$, while preserving the martingale property and the positions of $C_S$ and $C_F$. It can be interpreted as the most volatile security in the family of trinomial securities with identical consumption positions and similar (by construction) levels of skewness.

3.6.2 Principle of Maximal Volatility

Theorems 10 and 11 characterize the aspirational agent’s choice over tri-nomials. They reveal a rather surprising deviation from the familiar mean-variance results.

Theorem 10. Consider $M(;R;C_0)$. Any optimal tri-nomial consumption scheme T is dominated by its associated binomial consumption scheme B.

Theorem 10 tells us that any consumption scheme less volatile than the most volatile consumption scheme (B) is dominated, and is not chosen. In other words, somewhat against our intuition, the agent would choose the most volatile security in the family of securities with the same (similar) skewness profiles. The reason becomes more apparent with the next theorem.

Theorem 11. Any solution to a trinomial optimization problem is an associated B with $C_S = R$.

Theorem 11 tells us that, in spirit, Lemma 1 holds in the trinomial martingale setting as well.
Namely, the agent takes just enough risk to hit R, and does not engage in a consumption scheme that offers $C_S > R$. Meanwhile, Theorem 10 tells us that the agent does not aim to reduce volatility either, as long as $C_S = R$ can be attained. Reducing volatility only reduces the expected gains from attaining R. As long as the agent knows that (s)he has the right skewness and N to efficiently hit the aspiration point R, (s)he buys the entire portfolio rather than trying to reduce volatility by assigning weight to $C_0$. After all, volatility is the thrust that propels the agent from $C_0$ to R. Reducing volatility would detract from the spoils of attaining R. Thus, insofar as the agent is able to choose precisely the level of skewness needed to attain R, she operates under the principle of maximal volatility in order to maximize her gain from the gamble.

In the examples that follow, we illustrate the relative roles of skewness and volatility, the principle of maximal volatility, and what happens when the agent loses the ability to target precisely the level of skewness needed to attain R.

Example 1: Loss of Skewness is Prohibitively Costly

Consider the EU maximization problem $M(;R;C_0)$ with = 0.8, R = 5, $C_0$ = 2.8. When the agent is allowed to choose from all available L(p), the agent will choose p = 11.22%, with skewness of 2.46, to attain ‘o’. Now consider depriving the agent of the opportunity to control skewness. If the agent is only offered a skewness of 0 (i.e., p = 1/2), the agent would be offered ‘x’, which (s)he does not choose since it is dominated by $U(C_0)$. Namely, $EU(o) > U(C_0) > EU(x)$, so the gambling stops. This illustrates the case where the loss of skewness is prohibitively costly and the agent stays with $U(C_0)$ (no gambling).

Example 2: Loss of Skewness is Not Prohibitive

Now consider the EU maximization problem $M(;R;C_0)$ with = 0.8, R = 5, $C_0$ = 4.5. The only difference from Example 1 is $C_0$.
When the agent is allowed to choose from all available L(p), the agent will choose p = 81%, with skewness of -1.54, to attain ‘o’. In this case, when the agent loses control over skewness, the agent still chooses ‘x’ over $U(C_0)$. Namely, $EU(o) > EU(x) > U(C_0)$, so the agent still makes a risky choice.

Figure 3.8: Loss of Skewness is Prohibitively Costly

Examples 1 and 2 illustrate that the ‘principle of maximal volatility’ is not confined to optimal choices. Namely, in spirit, Theorems 10 and 11 hold even when we restrict the agent’s choice to limited levels of skewness (in the Examples, restricted to the zero-skewness case, p = 1/2). In Example 2, when control over skewness is lost, the agent does not choose to reduce volatility. It is easy to show that if the agent were offered a trinomial martingale with skewness = 0, the agent would decline and still choose the associated binomial martingale, very much like in Theorem 10. However, like in Theorem 11, this is only to attain R. Once $C_S = R$, the agent does not increase the size of the bet (N) to increase $C_S$.

Figure 3.9: Loss of Skewness is Acceptable

3.6.3 A Rule of Thumb Under Limited Choice of Skewness

The Examples also illustrate a ‘rule of thumb’ that agents use when they are offered tri-nomials with limits on the level of skewness they can choose from. When the agent is offered a tri-nomial security with skewness = 0 (or, more generally, with restrictions on the level of skewness she can choose), she processes it in the following steps.

Step 1: She asks whether the loss of skewness is prohibitive, and chooses $C_0$ if it is.
Step 2: If the loss of skewness is not prohibitive, she chooses the security with the highest volatility, the associated binomial (Principle of Maximal Volatility).

In summary, skewness is welfare improving because it allows the agent to tailor her aspiration more precisely. Restrictions on this freedom are welfare reducing (Example 2), and in some instances, prohibitively so (Example 1).
Volatility, on the other hand, can work both ways. Volatility is helpful insofar as it takes the agent to R. This is why agents choose the associated binomial (‘principle of maximal volatility’). However, this is only up to R, and excess volatility over and above R is avoided (Theorem 11).

3.7 Experimental evidence

We have thus far shown that, under certain conditions, the presence of a potential aspiration level (or, more generally, the possibility of reaching a non-monetary utility jump) can cause a rational decision-maker to seek additional skewness (both positive and negative, depending on the environment) in her portfolio of assets.² Through this framework we have laid the economic foundations that may explain part of the skewness seeking observed in various real-world situations. While we do not claim that potential aspiration levels are the sole cause of the observed skewness seeking, we have shown that they may well account for a significant part of it.

Showing that aspirations can cause an increased demand for skewness in the real world is complicated by the fact that the researcher rarely observes a financial decision maker’s intrinsic motivations. To provide further support for our hypothesis and theoretical model, we conducted a controlled laboratory experiment to test whether the presence of subtle, non-monetary, potential aspiration points could cause experimental subjects to seek additional skewness in financial decision making.

The experiment took place in May 2019 at the Marshall behavioral research laboratory at the University of Southern California. A total of 126 subjects participated in one of 13 sessions by completing a series of choices on paper. This was an individual decision-making experiment, so there was no strategic interaction between participants. Participants were given written instructions and were guided through a presentation prior to the start of the experiment.
To ensure proper understanding, they were encouraged to ask questions both in public and in private, and were asked to complete a series of comprehension questions, which were checked for errors prior to the start of the experiment. Sessions lasted between 50 and 60 minutes, and subjects earned on average $20, including a $5 show-up fee.

² Here “additional skewness” refers to skewness over and above the level that the decision-maker would choose in the absence of the aspiration level.

The experiment consisted of two independent studies, both of which were incorporated in the same experimental session (i.e., all subjects participated in both studies). Each session was split into three parts, sections 1, 2 and 3, which were completed in order. Sections 1 and 3 conducted the first experimental study, while section 2 conducted the second study. Each section consisted of 10 rounds, for a total of 30 rounds. Each round consisted of a menu of two, three or five binary lotteries, out of which the participant had to select one by circling the letter representing that lottery. At the end of the experiment, one of the 30 rounds was selected at random for payment. (The selection of the round and the payment were done using 3-sided and 10-sided dice; see Appendix for details.)

Each binary lottery clearly indicated two possible monetary rewards in $ amounts, each with a corresponding % chance of happening. Since payment was done using a 10-sided die, the die numbers corresponding to each monetary outcome were also stated for additional clarity. A potential aspiration point was induced through what we called a “Donation threshold” $ amount. This referred to the minimum monetary amount such that, if a participant reached it in the randomly chosen round, we (the experimenters) would donate $25 to a charity organization chosen by the participant (out of a list that we provided at the start of the experiment).
Note that the “Donation threshold” was round-specific, while the $25 donation amount was constant. Whether a participant achieved the donation or not had no impact on her monetary earnings from her choices. Finally, only the round chosen for payment was relevant for the potential donation. Thus the aspiration point varied between rounds while the aspiration level was kept constant (although its magnitude may have differed greatly across participants). An example round is shown in Figure 3.10.

Figure 3.10: Example: Round 3 of Section 1. By choosing lottery “A”, the subject has a 50% chance of earning $8 (if the 10-sided die rolls 1, 2, 3, 4 or 5) and a 50% chance of earning $14 (if the 10-sided die rolls 6, 7, 8, 9 or 10). Since only the $14 reward is at least as much as the “Donation threshold” of $9, the subject has a 50% chance of achieving the $25 donation to her chosen charity. Conversely, by choosing lottery “B”, the subject has a 10% chance of earning $2 and a 90% chance of earning $12, depending on the 10-sided die roll. The subject now has a 90% chance of achieving the donation, since $12 is above the “Donation threshold”. Notice that both lotteries have the same expected value ($11) and standard deviation ($3) but differ in skewness (0 versus -2.7).

3.7.1 Study 1: Within-subject test of preference reversals

Study 1 was conducted in sections 1 and 3 of our experimental sessions. It followed a within-subject design, meaning that all subjects faced the same questions in the first and last sections of the experiment. Since the study involved checking for preference reversals (i.e., a subject faced each lottery pair twice), we split the study into two sections, before and after study 2, to make the purpose of the study less obvious to the participants.

Experimental design

All 20 rounds involved choices between two binary lotteries, “A” and “B”. The two lotteries within each round always had the same mean and standard deviation.
Since binary lotteries are uniquely defined by their first three moments, the only difference between the two lotteries was the level of skewness, with lottery “A” always having zero skewness (50% chance of each outcome) and lottery “B” being either positively (1.5 or 2.7) or negatively (-1.5 or -2.7) skewed. Each pair of lotteries was presented twice, once in section 1 and once in section 3. While the lotteries in the pair were identical in the two sections, the donation threshold differed. In one case, the donation threshold was unattainable by either of the potential earnings in the two lotteries. This served as the control choice: the choice that is unaffected by the potential aspiration point of the donation. In the other case, the donation threshold was attainable by either lottery A (for pairs with zero and positively skewed lotteries) or both choices A and B (for pairs with zero and negatively skewed lotteries). This served as the treatment choice: the choice impacted by the potential aspiration point of the donation.

In total there were 10 unique combinations of binary lotteries, varying in expected value and standard deviation (see Appendix). By offering various combinations of means and variances, we were able to test whether the first two moments played a role in the decision to switch skewness preference in the presence of the aspiration.

The motivation here is to identify the circumstances, if they exist, under which a participant prefers the less skewed option (lottery “A”) in the absence of the donation aspiration point but opts to seek more skewness (lottery “B”) in its presence, as our theory suggests. In our design this would be reflected in the revealed preference choices “AB”, where the first letter indicates the choice under no induced aspiration and the second letter indicates the choice under the induced aspiration.
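The statement that a binary lottery is pinned down by its first three moments can be checked numerically. The sketch below is an illustration rather than part of the experimental materials: it computes the moments of a two-outcome lottery and, conversely, rebuilds the lottery from a given mean, standard deviation and skewness, using the standard two-point result that skewness equals (1 - 2p)/sqrt(p(1 - p)) when p is the probability of the high outcome. The function names are hypothetical; the numbers are those quoted for the example round in Figure 3.10.

```python
import math

def moments(lottery):
    """Mean, standard deviation and standardized skewness of [(reward, prob), ...]."""
    mean = sum(p * x for x, p in lottery)
    var = sum(p * (x - mean) ** 2 for x, p in lottery)
    sd = math.sqrt(var)
    skew = sum(p * (x - mean) ** 3 for x, p in lottery) / sd ** 3
    return mean, sd, skew

def binary_lottery(mean, sd, skew):
    """Return ((low, p_low), (high, p_high)) matching the three moments."""
    # The skewness alone determines the probability of the high outcome.
    p_high = 0.5 * (1.0 - skew / math.sqrt(skew ** 2 + 4.0))
    p_low = 1.0 - p_high
    # Mean and standard deviation then fix the two payoffs.
    high = mean + sd * math.sqrt(p_low / p_high)
    low = mean - sd * math.sqrt(p_high / p_low)
    return (low, p_low), (high, p_high)

# The negatively skewed lottery from the example round: $2 w.p. 10%, $12 w.p. 90%.
print(moments([(2, 0.1), (12, 0.9)]))
# Rebuilding it from its moments (skewness of -8/3 rounds to the -2.7 quoted in the text).
print(binary_lottery(11, 3, -8 / 3))
```

Under this parameterization, the skewness levels of ±1.5 and ±2.7 used in the design correspond to a 20%/80% and a 10%/90% split of probability between the two outcomes.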
Note that our design does not preclude the presence of other intrinsic aspiration points (for example, a participant may aspire to leave the experiment with $10 or more, which would induce an aspiration point), nor does it guarantee that our induced aspiration points are relevant to the participants (someone may not care for any of the charities that we presented or, in an extreme case, may receive disutility from a charitable donation). We did not design the experiment to answer these questions for individual subjects, and so we rely on averages to test our hypotheses.

Results

A total of 126 subjects generated 2520 choices (20 choices each across two sections). Option A (the zero-skewness option) was chosen 52% of the time, while option B (the skewed option) was chosen 48% of the time. Breaking down these numbers by rounds involving (zero-skewness, positively skewed) pairs vs. (zero-skewness, negatively skewed) pairs, option A was chosen 47% and 57% of the time, while option B was chosen 53% and 43% of the time, respectively.

Turning to our object of interest, preferences over pairs in the control and treatment, Table 3.1 shows the % of times that each pair {AA, AB, BA, BB} was chosen by all participants across all rounds. Choice pairs “AA” or “BB” indicate that the choices were unaffected by the induced aspiration point, i.e., the participant did not exhibit a preference reversal when the aspiration became achievable. The distinction between the two cases is that, while in “AA” we can say that the induced aspiration was not strong enough to cause an additional demand for skewness, in “BB” the participants intrinsically preferred the more skewed option even without the potential aspiration, so we cannot say whether the potential aspiration could have caused them to demand even more skewness. Participants choosing “AB” showcase a preference reversal in the direction that our hypothesis predicts, i.e.
a preference for the less skewed option in the absence of the potential aspiration, coupled with a preference reversal (demand for the more skewed option) when the possibility of achieving an aspiration point is present. Finally, “BA” choices indicate the opposite, i.e., a participant seeks less skewness when it is possible to achieve the donation. At first sight such results may seem strange, perhaps the product of mistakes. However, we can rationalize them in two ways. First, some participants may have received disutility from achieving the donation and thus sought the choices that minimize that chance (i.e., a downward utility jump). Second, subjects may have been confused about the way in which the donation was constructed, and more specifically may have thought that the donation amount would somehow affect their own earnings. Regardless, mistakes or inattention are certainly possible explanations too, and it is likely that all of these play a role in such observations.

Table 3.1: Distribution of choices across pairs of rounds

Revealed      All lottery    Pairs with            Pairs with
preference    pairs          {0, +ve} skewness     {0, -ve} skewness
AA            34.68%         27.62%                41.75%
AB            20.71%         23.02%                18.41%
BA            13.73%         15.04%                12.06%
BB            30.87%         33.97%                27.78%
Total         100%           100%                  100%

N = 1,260 (2,520 “A” or “B” paired choices). The first [second] letter indicates the lottery choice in the control [treatment] part of each pair of lotteries, i.e., the “AB” row indicates the % of times that participants chose the zero-skewness option in the control part (lottery “A”) and the skewed option in the treatment part (lottery “B”).

Most of the choices are unaffected by the presence of the potential donation (pairs “AA” and “BB”). This tells us that participants understand the experiment and are clearly not randomizing. Most of their choices (65.55%) are revealed to be stable over the experiment.
Of the choices that do exhibit preference reversals (34.44%), the majority (20.71%) are preference reversals in the direction predicted by our hypothesis. This finding remains qualitatively stable if we split the choices into zero-positive and zero-negative skewed pairs. We also find qualitatively similar results if we further split the sample into choices involving pairs with skewness of {(0, 1.5), (0, 2.7), (0, -1.5), (0, -2.7)}, and further still if we examine each lottery pair combination (Appendix Tables C.3, C.4, C.5). There does not seem to be any systematic relationship between the preference reversals and the means and variances of the lotteries at stake. Thus, while the effect is subtle, we conclude that our induced potential aspiration points on average seem to cause participants to exhibit additional demand for skewness.

The next step is to check whether these findings, which are averages over all participants, are the result of a few people being strongly affected by the induced aspiration or of many people being slightly affected in terms of demanding more skewness. Figure 3.11 (left) shows the histogram of the number of “AB” preference reversals per participant, and Figure 3.11 (right) shows the histogram of the ratio of “AB” preference reversals to choices of “A” in the control choice (i.e., “AA” + “AB”). This statistic is interesting because it shows the % of times a participant demanded more skewness when the potential aspiration was present, out of the times that he or she revealed a preference for the less skewed option. (Note that if the participant revealed a preference for the skewed option in the control part, then we have no way of checking whether he or she would demand more skewness in the presence of the aspiration, since there was no more skewed option to choose from.)
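The aggregate shares and the ratio statistic described above can be reproduced with a few lines of arithmetic. The sketch below uses only the “All lottery pairs” column of Table 3.1; the helper `ab_ratio` and the sample list of pairs passed to it are hypothetical, since the participant-level data are not reported in the text.

```python
# Shares of revealed-preference pairs (in %), "All lottery pairs" column of Table 3.1.
shares = {"AA": 34.68, "AB": 20.71, "BA": 13.73, "BB": 30.87}

stable = shares["AA"] + shares["BB"]       # choices unaffected by the threshold
reversals = shares["AB"] + shares["BA"]    # choices that switch between control and treatment
# Aggregate share of predicted reversals among pairs with "A" chosen in the control part.
ab_given_a_control = shares["AB"] / (shares["AA"] + shares["AB"])

def ab_ratio(pairs):
    """Per-participant ratio AB / (AA + AB).

    `pairs` is a list of strings such as "AB" (control choice first).
    Returns None when "A" was never chosen in the control part, because
    no additional skewness seeking can be observed in that case.
    """
    a_in_control = [pair for pair in pairs if pair[0] == "A"]
    if not a_in_control:
        return None
    return sum(pair == "AB" for pair in a_in_control) / len(a_in_control)

print(round(stable, 2), round(reversals, 2), round(ab_given_a_control, 3))
# Hypothetical participant with four of the ten pairs shown.
print(ab_ratio(["AA", "AB", "BB", "AB"]))
```

Returning `None` rather than zero for participants who always chose “B” in the control part mirrors the exclusion described in the text: those choices carry no information about additional skewness seeking.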
Figure 3.11: Histograms of preference reversals. Left: Histogram of the number of “AB” preference reversals per participant (horizontal axis: number of ABs, 0 to 10; vertical axis: number of participants). Right: Histogram of the ratio of “AB” to all such possible choices (“AB” + “AA”) per participant (horizontal axis: number of ABs / number of As in control, 0 to 1; vertical axis: number of participants). Choices of “B” in the control part are excluded, since for those we cannot examine a potential preference reversal towards additional skewness seeking.

Interestingly, while most participants exhibit a preference reversal between zero and four times, this does not take into account each participant’s intrinsic preference for skewness. If we exclude all the choices where option “B” (the skewed option) was chosen in the control part, since those choices by construction cannot generate additional skewness seeking, and divide the number of preference reversals towards additional skewness seeking by the number of times a participant preferred the less skewed option (option “A”) in the control part, we find massive heterogeneity across our participants. Some participants never demand more skewness in the presence of the aspiration point, while others always demand more skewness when the aspiration point is achievable. Thus the induced aspiration of a potential charitable donation has very different effects on the skewness seeking of our participants. This can be due either to the size of the utility jump from the donation () or to the strength of each participant’s skewness preferences.

A drawback of study 1 is that in many cases participants choose option “B” (the skewed option) in the control part, meaning that there is no more skewed option for them to choose in the treatment part, where the aspiration point is achievable. Thus we lose a lot of observations.
For that reason we conducted study 2, which follows a similar procedure but overcomes this problem.

3.7.2 Study 2: Between-subject test of skewness preferences with aspiration shifts

Experimental design

Study 2 was conducted in section 2 of our experimental sessions and, in contrast to study 1, varied between subjects across three treatments. Subjects faced 10 rounds, each of which involved a number of lotteries (either 3 or 5) out of which the subject was asked to select one. Thus, from the subjects’ perspective, the task was similar to study 1 (sections 1 and 3), apart from the number of lotteries available in each round. From the researcher’s perspective, however, the higher number of binary lottery choices (with the same first two moments) allowed us to test qualitatively for the movements along the skewness dimension that our model predicts. In addition, because the design is between-subject and each subject faces each set of lotteries only once, we are able to overcome any potential issues related to experimenter demand effects that study 1 may be subject to. Thus, study 2 acts as a robustness check for study 1 by providing an alternative framework in which to observe the potential applicability of our model, one that also overcomes some of the potential drawbacks of study 1.

Across participants, the 10 rounds differed only in the amount of the donation threshold (the location of the potential utility jump), i.e., all participants faced the same lottery choices in each round. Each lottery within a round was marked with a letter (“A”, “B”, “C”, “D”, “E”) in order of skewness, with “A” being the lottery with the most negative skewness and “C” (in the 3-choice rounds) or “E” (in the 5-choice rounds) the lottery with the highest positive skewness. The middle option (“B” or “C” respectively) was the zero-skewness lottery. All lottery choices are depicted in the appendix.
What we are testing in this study is whether manipulations in the location of the potential utility jump (donation threshold) cause qualitative changes in the types of lotteries chosen. More specifically, will participants choose more positively [negatively] skewed lotteries as the donation threshold increases [decreases]? While we are not able to make quantitative predictions, because we lack other measures such as the risk aversion parameter and the size and location of the utility jump due to the $25 donation, this is a useful proof-of-concept exercise showcasing the mechanics of our model: the potential presence of a non-monetary aspiration point (representing a discontinuous utility jump) can affect the skewness (both positive and negative) that decision makers seek in financial decision making. Thus what we expect to observe is that an increase [decrease] in the donation threshold will cause, on average in our sample, an increase [decrease] in skewness seeking, or equivalently a movement towards the lottery that maximizes the probability of donation.

The example in Figure 3.12 illustrates the three treatments and our hypotheses. The binary lotteries “A”, “B” and “C” have the same first two moments (mean = 11, standard deviation = 3). Thus, from the property of binary lotteries, the skewness (third moment) uniquely determines each lottery (skewness = {-2.7, 0, +2.7} respectively). Participants in all three treatments faced the same lotteries. What varied across treatments is the “Donation threshold”. Figure 3.12 shows a donation threshold of $12. In this case, a participant seeking to maximize the probability of reaching the donation threshold should choose lottery “A”. In another treatment, the donation threshold was somewhere in the range (12, 14]. Thus a participant in that treatment seeking to maximize the probability of reaching that threshold should choose lottery “B”. Finally, the highest threshold would be in the range (14, 20].
Participants in such treatments should choose “C” if they are to have any chance of reaching the threshold. Of course, other factors affect the lottery choice (intrinsic skewness preferences, the size and direction of the utility jump from the donation). However, random assignment to treatments means that we should expect those factors to average out across treatment groups. Thus, if at least some of our participants receive positive utility from making a donation to their chosen charity, our model predicts that, as the donation threshold increases [decreases], participants will be more likely to choose a lottery with more positive [negative] skewness.

Figure 3.12: Example: Round 1 of Section 2 in one of the treatments. Lottery “A” maximizes the probability of donation at the current donation threshold. As the donation threshold moves up into the range (12, 14], we should expect qualitative movements away from “A” and towards “B” and “C”. As the donation threshold moves further up into the range (14, 20], we should expect further qualitative movements towards “C”. The three treatments thus involve donation thresholds in the ranges {(10, 12], (12, 14], (14, 20]}.

Results

A total of 126 subjects generated 1260 choices (10 choices each). We find that, in accordance with our model’s predictions, on average participants choose more positively skewed lotteries (i.e., demand more positive skewness) when the donation threshold (aspiration point) is higher, and more negatively skewed lotteries (i.e., demand more negative skewness) when the donation threshold is lower.
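The mapping from donation threshold to the donation-maximizing lottery in the Figure 3.12 example can be sketched as follows. The payoffs are an assumption: the text reports only the moments of the three lotteries (mean 11, standard deviation 3, skewness of about -2.7, 0 and +2.7), and the two-outcome lotteries below are the ones implied by those moments. The names `LOTTERIES`, `donation_probability` and `best_lottery` are hypothetical.

```python
# Each lottery is a list of (reward in $, probability) pairs.
# Assumed payoffs, implied by mean 11, sd 3 and skewness of about -2.7, 0, +2.7.
LOTTERIES = {
    "A": [(2, 0.1), (12, 0.9)],    # negatively skewed
    "B": [(8, 0.5), (14, 0.5)],    # zero skewness
    "C": [(10, 0.9), (20, 0.1)],   # positively skewed
}

def donation_probability(lottery, threshold):
    """Probability that the realized reward is at least the donation threshold."""
    return sum(p for reward, p in lottery if reward >= threshold)

def best_lottery(threshold):
    """Label of the lottery that maximizes the probability of triggering the donation."""
    return max(LOTTERIES, key=lambda name: donation_probability(LOTTERIES[name], threshold))

# One threshold from each of the three ranges (10, 12], (12, 14], (14, 20].
for threshold in (12, 13, 18):
    print(threshold, best_lottery(threshold))
```

A participant who cared only about the donation would follow `best_lottery`; observed choices are expected to deviate from it because intrinsic skewness preferences also matter, which is why the hypothesis is stated as a shift in the distribution of choices rather than as a deterministic rule.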
This finding is in line with our model’s prediction, namely that decision makers will, other things being equal, seek skewness in order to increase the chance of achieving their aspiration. (Higher donation thresholds required more positively skewed lotteries to reach the thresholds, while lower donation thresholds, which could be reached with several lotteries, required more negatively skewed lotteries to maximize the probability of reaching the threshold.) Of course, as expected, intrinsic preferences for skewness played an important role, and thus several participants chose lotteries that did not maximize the probability of achieving the donation. On average, however, there is a clear directional movement towards the lottery with the skewness that maximizes the probability of donation.

For our main analysis we compare whether and how the distribution of choices changes across treatments (locations of the donation threshold). To begin with, we pool together all rounds with three lottery choices. We can do that because lottery “A” is always negatively skewed, lottery “B” always has zero skewness, and lottery “C” is always positively skewed. In addition, the donation thresholds always fall in one of three regions: (1) where lottery “A” maximizes the probability of donation (we will refer to those regions as the “Low” donation threshold regions), (2) where lottery “B” maximizes the probability of donation (“Mid” donation threshold regions), and (3) where lottery “C” maximizes the probability of donation (“High” donation threshold regions). Table 3.2 shows the distribution of choices across rounds conditional on the donation threshold region of each.³ If the possibility of achieving the donation had no impact on participants, we would expect the three distributions “Low”, “Mid” and “High” not to be statistically different from each other, since the lottery choices were identical.
However, two-sided Kolmogorov-Smirnov tests of equality of distributions reject the hypothesis that any two of the three distributions are equal.⁴ While it is clear that the negatively skewed lottery (A) is generally not preferred (chosen less than 20% of the time overall), it is relatively more popular in the “Low” treatment, where it maximizes the probability that participants reach the donation. Similarly, lotteries B and C are each relatively more popular in treatments “Mid” and “High”, the respective treatments in which those lotteries maximize the probability of donation.

³ All rounds were presented with donation thresholds falling in each of the three regions across treatments.
⁴ P-value for distributions “Low” and “Mid” = 0.002; P-value for distributions “Mid” and “High” < 0.001.

Table 3.2: Distribution of choices across treatments

Chosen                 Donation threshold
lottery                Low        Mid        High
A (-ve skewness)       24.11%     15.48%     18.45%
B (0 skewness)         32.44%     48.51%     28.87%
C (+ve skewness)       43.45%     36.01%     52.68%
Total                  100%       100%       100%
N = 1,008              336        336        336

The percentages in each column give the distribution of choices within that treatment. The “Low” [“Mid” / “High”] treatment includes all rounds in which the “Donation threshold” was set such that lottery “A” [“B” / “C”] maximized the probability of the donation.

While Table 3.2 shows the average results across all rounds with 3 lottery choices, the qualitative results hold across most rounds. Appendix Table C.6 shows the breakdown of Table 3.2 for each of the 8 rounds involving 3 lottery choices. Out of the 24 comparisons (8 × 3), 18 are qualitatively identical to the comparisons from Table 3.2, i.e., each lottery is chosen relatively more often when the donation threshold is where that lottery maximizes the probability of donation.
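The qualitative comparison used above can be written down explicitly. The following is one possible reading, not necessarily the scoring rule used in the thesis: a lottery counts as consistent with the prediction if its choice share is highest in the treatment where it maximizes the probability of donation. The sketch applies that rule to the pooled shares of Table 3.2; the same function could be applied to each of the 8 rounds in Appendix Table C.6, whose figures are not reproduced here. The names `TABLE_3_2`, `TARGET` and `consistent_comparisons` are hypothetical.

```python
# Choice shares (in %) from Table 3.2, by donation-threshold region.
TABLE_3_2 = {
    "A": {"Low": 24.11, "Mid": 15.48, "High": 18.45},
    "B": {"Low": 32.44, "Mid": 48.51, "High": 28.87},
    "C": {"Low": 43.45, "Mid": 36.01, "High": 52.68},
}

# Region in which each lottery maximizes the probability of donation.
TARGET = {"A": "Low", "B": "Mid", "C": "High"}

def consistent_comparisons(table):
    """Count lotteries whose choice share peaks in their target region."""
    return sum(
        max(shares, key=shares.get) == TARGET[lottery]
        for lottery, shares in table.items()
    )

print(consistent_comparisons(TABLE_3_2), "of", len(TABLE_3_2))
```

Scoring each lottery against its own share across treatments, rather than against the other lotteries within a treatment, matches the wording “relatively more often”: lottery “A” is never the most popular choice in any treatment, yet its share is highest in the “Low” region.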
We can thus confidently say that, as in study 1, the location of the donation threshold (aspiration / utility jump) appears to have a causal impact on the skewness that participants choose in their financial decision making.

3.7.3 Discussion of experimental results

We conducted two experimental studies with a similar context but different experimental designs in order to look for evidence of behavior conforming to the model and ideas presented in the theoretical part of this paper. More specifically, we wanted to test whether the possibility of achieving a non-divisible, non-monetary good on top of money would lead our participants to seek more skewed lotteries than they would if that good were not part of the experiment. This good was a $25 donation to a charity of their choice and, in our model, it would act as an aspiration-induced discontinuous jump in an agent’s utility function. In the experiment we manipulated the location of the utility jump by varying the monetary earnings threshold above which participants would achieve the $25 donation, on top of the money they would earn from their lottery choices. In both studies, participants were asked to choose among binary lotteries with the same mean and variance, differing only in skewness. Comparing their lottery choices with the location of the donation threshold, we were able to infer whether participants sought lotteries with more (positive or negative) skewness when that would improve their chances of achieving the donation.

Our first study utilized a within-subject design and looked at preference reversals, i.e., whether a subject would demand more (positive or negative) skewness in the presence of the utility jump than in the case of no jump (control). Across 10 rounds we found this pattern emerging consistently among participants who chose the less skewed options in the control rounds, i.e.
the placement of a discontinuous potential utility jump caused them to demand more skewness in their financial decision making. Our second study utilized a between-subject design and examined qualitative movements in the skewness dimension with movements in the position of the utility jump. Here we also found clear evidence that the distribution of choices along the skewness dimension was significantly different when the location of the utility jump changed. Specifically, the distributions shifted towards the lottery that maximized the probability of achieving the donation.

3.8 Conclusion

In this paper we study the demand for skewness, both right and left, using a utility function with microeconomic and evolutionary foundations. We assume that economic agents care both about consumption (a divisible good) and status (achieved through the purchase of non-divisible goods). Our resulting utility is in the spirit of Friedman and Savage (1948); however, their analysis focuses on the second moment of the distribution that explains uncertainty. We consider a parsimonious set of securities that allow the agent to select the exact optimal level of right or left skewness. Our analysis yields a rich set of results broadly consistent with empirical observations. What is more, we conducted an experiment in order to obtain clear evidence that the main concept of our model (the aspirational utility jump) is indeed responsible for driving additional demand for skewness (in both the positive and negative dimension).

Bibliography

Abdellaoui, M., Klibanoff, P. and Placido, L. (2015). Experiments on compound risk in relation to simple risk and to ambiguity. Management Science, 61 (6), 1306–1322.
Alonso, R. and Câmara, O. (2016a). Bayesian persuasion with heterogeneous priors. Journal of Economic Theory, 165, 672–706.
Alonso, R. and Câmara, O. (2016b). Persuading voters. American Economic Review, 106 (11), 3590–3605.
Alonso, R. and Câmara, O. (2016c). Political disagreement and information in elections.
Games and Economic Behavior, 100, 390–412.
Clark, A. E. and Senik, C. (2014). Income comparisons in Chinese villages. Happiness and economic growth: Lessons from developing countries, pp. 217–239.
Clark, A. E., Frijters, P. and Shields, M. A. (2008). Relative income, happiness, and utility: An explanation for the Easterlin paradox and other puzzles. Journal of Economic Literature, 46 (1), 95–144.
Au, P. H. and Li, K. K. (2018). Bayesian persuasion and reciprocity: Theory and experiment. Working paper.
Babichenko, Y. and Barman, S. (2016). Computational aspects of private Bayesian persuasion. arXiv preprint arXiv:1603.01444.
Bali, T. G., Cakici, N. and Whitelaw, R. F. (2011). Maxing out: Stocks as lotteries and the cross-section of expected returns. Journal of Financial Economics, 99 (2), 427–446.
Barrington-Leigh, C. P. and Helliwell, J. F. (2008). Empathy and emulation: Life satisfaction and the urban geography of comparison groups. National Bureau of Economic Research, Working Paper (14593).
Bergemann, D. and Morris, S. (2016). Information design, Bayesian persuasion, and Bayes correlated equilibrium. American Economic Review, 106 (5), 586–91.
Bergemann, D. and Morris, S. (2019). Information design: A unified perspective. Journal of Economic Literature, 57 (1), 44–95.
Bizzotto, J., Rudiger, J. and Vigier, A. (2016). Dynamic Bayesian persuasion with public news. Tech. rep., Mimeo.
Blanchflower, D. G. and Oswald, A. J. (2004). Well-being over time in Britain and the USA. Journal of Public Economics, 88 (7-8), 1359–1386.
Boleslavsky, R. and Kim, K. (2018). Bayesian persuasion and moral hazard. Working paper.
Boyer, B. H. and Vorkink, K. (2014). Stock options as lotteries. The Journal of Finance, 69 (4), 1485–1527.
Brown, G. D., Gardner, J., Oswald, A. J. and Qian, J. (2008). Does wage rank affect employees’ well-being? Industrial Relations: A Journal of Economy and Society, 47 (3), 335–389.
Campbell, J. Y. and Cochrane, J. H. (1999).
By force of habit: A consumption-based explanation of aggregate stock market behavior. Journal of Political Economy, 107 (2), 205–251.
Cantril, H. (1965). The pattern of human concerns. New Brunswick, New Jersey: Rutgers University Press.
Charles, K. K., Hurst, E. and Roussanov, N. (2009). Conspicuous consumption and race. The Quarterly Journal of Economics, 124 (2), 425–467.
Charness, G., Karni, E. and Levin, D. (2007). Individual and group decision making under risk: An experimental study of Bayesian updating and violations of first-order stochastic dominance. Journal of Risk and Uncertainty, 35 (2), 129–148.
Constantinides, G. M. (1990). Habit formation: A resolution of the equity premium puzzle. Journal of Political Economy, 98 (3), 519–543.
Crawford, V. P. and Sobel, J. (1982). Strategic information transmission. Econometrica, pp. 1431–1451.
Dahlin, M., Kapteyn, A. and Tassot, C. (2014). Who are the Joneses? CESR-Schaeffer working paper, 004.
Deaton, A. and Stone, A. A. (2013). Two happiness puzzles. American Economic Review, 103 (3), 591–97.
DellaVigna, S. and Gentzkow, M. (2010). Persuasion: empirical evidence. Annu. Rev. Econ., 2 (1), 643–669.
Diecidue, E. and Van De Ven, J. (2008). Aspiration level, probability of success and failure, and expected utility. International Economic Review, 49 (2), 683–700.
Duesenberry, J. S. (1949). Income, savings, and the theory of consumer behavior. Cambridge, MA: Harvard University Press.
Dughmi, S., Kempe, D. and Qiang, R. (2016). Persuasion with limited communication. In Proceedings of the 2016 ACM Conference on Economics and Computation, ACM, pp. 663–680.
Dughmi, S. and Xu, H. (2016). Algorithmic Bayesian persuasion. In Proceedings of the forty-eighth annual ACM symposium on Theory of Computing, ACM, pp. 412–425.
Dughmi, S. and Xu, H. (2017). Algorithmic persuasion with no externalities. In Proceedings of the 2017 ACM Conference on Economics and Computation, ACM, pp. 351–368.
Easterlin, R. A. (1974).
Does economic growth improve the human lot? Some empirical evidence. Nations and households in economic growth, pp. 89–125.
Easterlin, R. A. (2015). Happiness and economic growth–the evidence. Global handbook of quality of life, pp. 283–299.
Easterlin, R. A., McVey, L. A., Switek, M., Sawangfa, O. and Zweig, J. S. (2010). The happiness–income paradox revisited. Proceedings of the National Academy of Sciences, 107 (52), 22463–8.
Ferrer-i Carbonell, A. (2005). Income and well-being: an empirical analysis of the comparison income effect. Journal of Public Economics, 89 (5-6), 997–1019.
Fliessbach, K., Weber, B., Trautner, P., Dohmen, T., Sunde, U., Elger, C. E. and Falk, A. (2007). Social comparison affects reward-related brain activity in the human ventral striatum. Science, 318 (5854), 1305–1308.
Frank, R. H. (2012). The Easterlin paradox revisited. Emotion, 12 (6), 1188.
Fréchette, G., Lizzeri, A. and Perego, J. (2019). Rules and commitment in communication: An experimental analysis. Tech. rep.
Friedman, M. and Savage, L. J. (1948). The utility analysis of choices involving risk. Journal of Political Economy, 56 (4), 279–304.
Gardner, J. and Oswald, A. J. (2007). Money and mental wellbeing: A longitudinal study of medium-sized lottery wins. Journal of Health Economics, 26 (1), 49–60.
Gentzkow, M. and Kamenica, E. (2014). Costly persuasion. American Economic Review, 104 (5), 457–62.
Gentzkow, M. and Kamenica, E. (2016). Competition in persuasion. The Review of Economic Studies, 84 (1), 300–322.
Gentzkow, M. and Kamenica, E. (2017). Bayesian persuasion with multiple senders and rich signal spaces. Games and Economic Behavior, 104, 411–429.
Goerke, L. and Pannenberg, M. (2013). Direct evidence on income comparisons and subjective well-being. Institute of Labour Law and Industrial Relations in the European Union (IAAEU) Discussion Papers, 201303.
Gratton, G., Holden, R. and Kolotilin, A. (2017). When to drop a bombshell. The Review of Economic Studies, 85 (4), 2139–2172.
Harvey, C. R. and Siddique, A. (2000).
Conditional skewness in asset pricing tests. The Journal of Finance, 55 (3), 1263–1295.
Hernández, P. and Neeman, Z. (2018). How Bayesian persuasion can help reduce illegal parking and other socially undesirable behavior. Working paper.
Holt, C. A. and Smith, A. M. (2009). An update on Bayesian updating. Journal of Economic Behavior & Organization, 69 (2), 125–134.
Ifcher, J., Zarghamee, H. and Graham, C. (2019). Income inequality and well-being in the US: evidence of geographic-scale- and measure-dependence. The Journal of Economic Inequality, 17 (3), 415–434.
Kahneman, D. and Tversky, A. (1973). On the psychology of prediction. Psychological Review, 80 (4), 237.
Kamenica, E. and Gentzkow, M. (2011). Bayesian persuasion. American Economic Review, 101 (6), 2590–2615.
Kapteyn, A. (1977). A theory of preference formation. Pasmans - Unpublished PhD Thesis.
Kapteyn, A. and Teppa, F. (2003). Hypothetical intertemporal consumption choices. The Economic Journal, 113 (486), C140–C152.
Kapteyn, A., Smith, J. P. and Van Soest, A. (2010). Life satisfaction. International differences in well-being, pp. 70–104.
Kolotilin, A., Mylovanov, T., Zapechelnyuk, A. and Li, M. (2017). Persuasion of a privately informed receiver. Econometrica, 85 (6), 1949–1964.
Kraus, A. and Litzenberger, R. H. (1976). Skewness preference and the valuation of risk assets. The Journal of Finance, 31 (4), 1085–1100.
Kumar, A. (2009). Who gambles in the stock market? The Journal of Finance, 64 (4), 1889–1933.
Li, F. and Norman, P. (2018). On Bayesian persuasion with multiple senders. Economics Letters, 170, 66–70.
Lindqvist, E., Ostling, R. and Cesarini, D. (2018). Long-run effects of lottery wealth on psychological well-being. The Review of Economic Studies.
Luttmer, E. F. (2005). Neighbors as negatives: Relative earnings and well-being. The Quarterly Journal of Economics, 120 (3), 963–1002.
Mathevet, L., Perego, J. and Taneva, I. (2020). On information design in games.
Journal of Political Economy, 128 (4), 1370–1404.
McBride, M. (2001). Relative-income effects on subjective well-being in the cross-section. Journal of Economic Behavior and Organization, 45 (3), 251–278.
Mitton, T. and Vorkink, K. (2007). Equilibrium underdiversification and the preference for skewness. The Review of Financial Studies, 20 (4), 1255–1288.
Montgomery, M. (2016). Reversing the gender gap in happiness: Validating the use of life satisfaction self-reports worldwide. Job Market Paper.
Nguyen, Q. (2017). Bayesian persuasion: Evidence from the laboratory. Working Paper - Utah State University.
Okuno-Fujiwara, M., Postlewaite, A. and Suzumura, K. (1990). Strategic information revelation. The Review of Economic Studies, 57 (1), 25–47.
Rayo, L. and Becker, G. S. (2007). Evolutionary efficiency and happiness. Journal of Political Economy, 115 (2), 302–337.
Roussanov, N. (2010). Diversification and its discontents: Idiosyncratic and entrepreneurial risk in the quest for social status. The Journal of Finance, 65 (5), 1755–1788.
Sacks, D. W., Stevenson, B. and Wolfers, J. (2012). The new stylized facts about income and subjective well-being. Emotion, 12 (6), 1181.
Stevenson, B. and Wolfers, J. (2008a). Economic growth and subjective well-being: Reassessing the Easterlin paradox. National Bureau of Economic Research, Working Paper (14282).
Stevenson, B. and Wolfers, J. (2008b). Happiness inequality in the United States. The Journal of Legal Studies, 37 (S2), S33–S79.
Stevenson, B. and Wolfers, J. (2013). Subjective well-being and income: Is there any evidence of satiation? American Economic Review, 103 (3), 598–604.
Sundaresan, S. M. (1989). Intertemporally dependent preferences and the volatility of consumption and wealth. Review of Financial Studies, 2 (1), 73–89.
Taneva, I. A. (2019). Information design. American Economic Journal: Microeconomics, 11 (4), 151–85.
Van de Stadt, H., Kapteyn, A. and Van de Geer, S. (1985). The relativity of utility: Evidence from panel data.
The Review of Economics and Statistics, 1, 179–187.
Van Praag, B. M. (1971). The welfare function of income in Belgium: An empirical investigation. European Economic Review, 2 (3), 337–369.
Van Praag, B. M. and Kapteyn, A. (1973). Further evidence on the individual welfare function of income: An empirical investigation in the Netherlands. European Economic Review, 4 (1), 33–62.
Van Vugt, M. and Tybur, J. M. (2015). The evolutionary foundations of status hierarchy. The handbook of evolutionary psychology, 23, 1–22.
Veblen, T. (1973). The theory of the leisure class. Boston: Houghton Mifflin.
Wang, Y. (2013). Bayesian persuasion with multiple receivers. Available at SSRN 2625399.
Zhang, X.-J. (2013). Book-to-market ratio and skewness of stock returns. The Accounting Review, 88 (6), 2213–2240.

Appendix A

Appendix to Chapter 1

A.1 Mathematical derivations

A.1.1 Log moments of the perceived income distributions

The following shows the derivations of the first two log moments (log means and log variances) of the perceived income distributions and the inclusive perceived income distributions. Each is defined appropriately. For more information see section 1.3.4 in the main text.
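The per-period identities derived in this subsection can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: the incomes and weights are made-up numbers, and `log_moments`, `mu_closed` and `sigma2_closed` are hypothetical names. It computes the log-mean and log-variance of the inclusive distribution directly from their definitions and compares them with the closed forms expressed through own income and the non-inclusive moments.

```python
import math


def log_moments(y0, others, q0, q):
    """Per-period log moments.

    y0     : own income (individual i = 0)
    others : incomes y_1..y_n of the reference group
    q0     : weight placed on own income
    q      : weights q_1..q_n, which together with q0 sum to one
    """
    w = [qi / (1.0 - q0) for qi in q]  # weights of the non-inclusive distribution
    m = sum(wi * math.log(yi) for wi, yi in zip(w, others))
    s2 = sum(wi * (math.log(yi) - m) ** 2 for wi, yi in zip(w, others))
    # Inclusive moments, straight from the definition (own income included).
    mu = q0 * math.log(y0) + sum(qi * math.log(yi) for qi, yi in zip(q, others))
    sigma2 = q0 * (math.log(y0) - mu) ** 2 + sum(
        qi * (math.log(yi) - mu) ** 2 for qi, yi in zip(q, others)
    )
    return m, s2, mu, sigma2


# Illustrative (made-up) incomes and weights.
y0, others = 50_000.0, [20_000.0, 35_000.0, 80_000.0]
q0, q = 0.1, [0.3, 0.4, 0.2]

m, s2, mu, sigma2 = log_moments(y0, others, q0, q)

# Closed forms: inclusive moments written in terms of own income and (m, s2).
mu_closed = q0 * math.log(y0) + (1 - q0) * m
sigma2_closed = (1 - q0) * (q0 * (math.log(y0) - m) ** 2 + s2)

print(mu, mu_closed)
print(sigma2, sigma2_closed)
```

The two pairs of printed values should agree up to floating-point error, which is a quick way to catch a dropped factor of (1 - q0) when transcribing the formulas.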
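$m_\tau$ - Log-mean of the perceived income distribution at $\tau$, $F(x|\tau)$

\[ m_\tau = \int_{-\infty}^{\infty} \ln(x)\,dF(x|\tau) = \sum_{i=1}^{n} \frac{q_i}{1-q_0}\ln(y_{i\tau}) \]

$\mu_\tau$ - Log-mean of the inclusive perceived income distribution at $\tau$, $H(x|\tau)$

\[ \mu_\tau = \int_{-\infty}^{\infty} \ln(x)\,dH(x|\tau) = q_0\ln(y_{0\tau}) + (1-q_0)m_\tau = \sum_{i=0}^{n} q_i\ln(y_{i\tau}) \]

$m$ - Log-mean of the perceived income distribution, $F(x)$

\[ m = \int_{-\infty}^{\infty} \ln(x)\,dF(x) = \sum_{\tau=-\infty}^{0} a(\tau)\int_{-\infty}^{\infty} \ln(x)\,dF(x|\tau) = \sum_{\tau=-\infty}^{0} a(\tau)\,m_\tau \]

$\mu$ - Log-mean of the inclusive perceived income distribution, $H(x)$

\[ \mu = \int_{-\infty}^{\infty} \ln(x)\,dH(x) = \sum_{\tau=-\infty}^{0} a(\tau)\int_{-\infty}^{\infty} \ln(x)\,dH(x|\tau) = \sum_{\tau=-\infty}^{0} a(\tau)\,\mu_\tau \]

$s_\tau^2$ - Log-variance of the perceived income distribution at $\tau$, $F(x|\tau)$

\[ s_\tau^2 = \int_{-\infty}^{\infty} \big(\ln(x)-m_\tau\big)^2 dF(x|\tau) = \sum_{i=1}^{n} \frac{q_i}{1-q_0}\big(\ln(y_{i\tau})-m_\tau\big)^2 \]

$\sigma_\tau^2$ - Log-variance of the inclusive perceived income distribution at time $\tau$, $H(x|\tau)$

\[
\begin{aligned}
\sigma_\tau^2 &= \int_{-\infty}^{\infty} \big(\ln(x)-\mu_\tau\big)^2 dH(x|\tau) = \sum_{i=0}^{n} q_i\big(\ln(y_{i\tau})-\mu_\tau\big)^2 \\
&= q_0\big(\ln(y_{0\tau})-\mu_\tau\big)^2 + (1-q_0)\sum_{i=1}^{n}\frac{q_i}{1-q_0}\big(\ln(y_{i\tau})-\mu_\tau\big)^2 \\
&= q_0\big(\ln(y_{0\tau}) - q_0\ln(y_{0\tau}) - (1-q_0)m_\tau\big)^2 + (1-q_0)\sum_{i=1}^{n}\frac{q_i}{1-q_0}\big(\ln(y_{i\tau}) - q_0\ln(y_{0\tau}) - (1-q_0)m_\tau\big)^2 \\
&= q_0(1-q_0)^2\big(\ln y_{0\tau}-m_\tau\big)^2 + (1-q_0)\sum_{i=1}^{n}\frac{q_i}{1-q_0}\Big[\big(\ln(y_{i\tau})-m_\tau\big) - q_0\big(\ln(y_{0\tau})-m_\tau\big)\Big]^2 \\
&= (1-q_0)\Big\{ q_0(1-q_0)\big(\ln y_{0\tau}-m_\tau\big)^2 + \underbrace{\sum_{i=1}^{n}\frac{q_i}{1-q_0}\big(\ln(y_{i\tau})-m_\tau\big)^2}_{=\,s_\tau^2} \\
&\qquad - 2q_0\big(\ln(y_{0\tau})-m_\tau\big)\underbrace{\sum_{i=1}^{n}\frac{q_i}{1-q_0}\big(\ln(y_{i\tau})-m_\tau\big)}_{=\,0} + q_0^2\big(\ln(y_{0\tau})-m_\tau\big)^2\underbrace{\sum_{i=1}^{n}\frac{q_i}{1-q_0}}_{=\,1} \Big\} \\
&= (1-q_0)\Big\{ q_0\big(\ln(y_{0\tau})-m_\tau\big)^2 + s_\tau^2 \Big\}
\end{aligned}
\]

$s^2$ - Log-variance of the perceived income distribution, $F(x)$

\[ s^2 = \int_{-\infty}^{\infty} \big(\ln(x)-m\big)^2 dF(x) = \sum_{\tau=-\infty}^{0} a(\tau)\int_{-\infty}^{\infty} \big(\ln(x)-m\big)^2 dF(x|\tau) \]

$\sigma^2$ - Log-variance of the inclusive perceived income distribution, $H(x)$

\[
\begin{aligned}
\sigma^2 &= \int_{-\infty}^{\infty} \big(\ln(x)-\mu\big)^2 dH(x) = \sum_{\tau=-\infty}^{0} a(\tau)\int_{-\infty}^{\infty} \big(\ln(x)-\mu\big)^2 dH(x|\tau) \\
&= \sum_{\tau=-\infty}^{0} a(\tau)\,E_\tau\big(\ln(x)-\mu_\tau+\mu_\tau-\mu\big)^2 \\
&= \sum_{\tau=-\infty}^{0} a(\tau)\Big\{ E_\tau\big(\ln(x)-\mu_\tau\big)^2 + E_\tau\big(\mu_\tau-\mu\big)^2 + 2\underbrace{E_\tau\big((\ln(x)-\mu_\tau)(\mu_\tau-\mu)\big)}_{=\,0} \Big\} \\
&= \sum_{\tau=-\infty}^{0} a(\tau)\Big\{ \sigma_\tau^2 + (\mu_\tau-\mu)^2 \Big\}
\end{aligned}
\]

A.1.2 Examples

The following derivations correspond to each example in section 1.3.5 of the main text.

One-off income growth for everyone.

We assume that all incomes in the society have been constant for a sufficiently long time and that each individual’s reference group comprises everyone in the society. We have that $y_{i\tau} = y_{i,\tau-1}$, $\forall i$, $\tau \in (-\infty, 0]$.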
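It is straightforward to see that we have the following simplifications:

\[ \mu_\tau = \mu_{\tau-1};\quad m_\tau = m_{\tau-1};\quad \sigma_\tau^2 = \sigma_{\tau-1}^2;\quad s_\tau^2 = s_{\tau-1}^2,\quad \forall \tau \in (-\infty, 0] \]
\[ \Rightarrow\quad \mu_\tau = \mu;\quad m_\tau = m;\quad \sigma_\tau^2 = \sigma^2;\quad s_\tau^2 = s^2. \]

The above show that the log-means and the log-variances of both the inclusive and non-inclusive perceived income distributions at time $\tau$ have been constant over time and hence are also equal to the respective overall perceived distributions. Now suppose there is an identical increase in incomes for all individuals by a factor of $\exp(\gamma)$. We then have that the new incomes satisfy $y'_{i0} = y_{i0}\exp(\gamma)$, $\forall i \Rightarrow \ln(y'_{i0}) = \ln(y_{i0}) + \gamma$. Individual $i = 0$’s new level of income satisfaction is then given by:

\[ W(y'_{00}) = \frac{\ln(y'_{00}) - \mu'}{\sigma'} \]

The new log-mean of the inclusive perceived income distribution becomes:

\[ \mu' = a\mu + (1-a)\mu'_0 \]

where

\[ \mu'_0 = \sum_{i=0}^{n} q_i\ln(y'_{i0}) = \sum_{i=0}^{n} q_i\ln\big(y_{i0}\exp(\gamma)\big) = \gamma + \mu_0 = \gamma + \mu \]

so that

\[ \mu' = a\mu + (1-a)(\mu+\gamma) = \mu + (1-a)\gamma \]

The new log-variance of the inclusive perceived income distribution becomes:

\[
\begin{aligned}
\sigma'^2 &= \sum_{\tau=-\infty}^{0} (1-a)a^{-\tau}\Big\{ \sigma'^2_\tau + (\mu'_\tau-\mu')^2 \Big\} \\
&= \sum_{\tau=-\infty}^{-1} (1-a)a^{-\tau}\Big\{ \sigma^2 + \big(\mu-\mu-(1-a)\gamma\big)^2 \Big\} + (1-a)\Big\{ \sigma'^2_0 + \big(\mu+\gamma-\mu-(1-a)\gamma\big)^2 \Big\}
\end{aligned}
\]

where

\[ \sigma'^2_0 = (1-q_0)\Big\{ q_0\big(\ln(y'_{00})-m'_0\big)^2 + s'^2_0 \Big\} = (1-q_0)\Big\{ q_0\big(\ln(y_{00})+\gamma-m_0-\gamma\big)^2 + s_0^2 \Big\} = \sigma_0^2 = \sigma^2 \]

so that

\[
\begin{aligned}
\sigma'^2 &= \sum_{\tau=-\infty}^{-1} (1-a)a^{-\tau}\Big\{ \sigma^2 + (1-a)^2\gamma^2 \Big\} + (1-a)\Big\{ \sigma^2 + a^2\gamma^2 \Big\} \\
&= a\big(\sigma^2 + (1-a)^2\gamma^2\big) + (1-a)\big(\sigma^2 + a^2\gamma^2\big) \\
&= \sigma^2 + (1-a)\big((1-a)a\gamma^2 + a^2\gamma^2\big) \\
&= \sigma^2 + (1-a)a\gamma^2
\end{aligned}
\]

Thus the new income satisfaction in terms of pre-income increase levels is given by:

\[ W(y'_{00}) = W\big(y_{00}\exp(\gamma)\big) = \frac{\ln(y_{00}) + \gamma - \mu - (1-a)\gamma}{\sqrt{\sigma^2 + (1-a)a\gamma^2}} = \frac{\ln(y_{00}) - \mu + a\gamma}{\sqrt{\sigma^2 + (1-a)a\gamma^2}} \]

Sustained income growth for everyone

We assume that incomes have been growing constantly at the rate $\gamma$ and once again that each individual’s reference group consists of everyone in the society. Hence we have that $y_{i,\tau} = y_{i,\tau-1}\exp(\gamma)$, $\forall i$, $\tau \in (-\infty, 0] \Rightarrow \ln(y_{i,\tau}) = \ln(y_{i,\tau-1}) + \gamma$.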
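Now since all incomes increase identically in each period $\tau$, we have that:

\[ m_\tau = \sum_{i=1}^{n}\frac{q_i}{1-q_0}\ln(y_{i\tau}) = \sum_{i=1}^{n}\frac{q_i}{1-q_0}\big[\ln(y_{i,\tau-1}) + \gamma\big] = m_{\tau-1} + \gamma = m_0 + \tau\gamma \quad \text{(in terms of present values)} \]

Similarly we have that $\mu_\tau = \mu_{\tau-1} + \gamma = \mu_0 + \tau\gamma$, and that:

\[ \sigma_\tau^2 = (1-q_0)\Big\{ q_0\big(\ln(y_{0\tau})-m_\tau\big)^2 + s_\tau^2 \Big\} = (1-q_0)\Big\{ q_0\big(\ln(y_{0,\tau-1})+\gamma-m_{\tau-1}-\gamma\big)^2 + s_{\tau-1}^2 \Big\} \]

where

\[ s_\tau^2 = \sum_{i=1}^{n}\frac{q_i}{1-q_0}\big[\ln(y_{i\tau})-m_\tau\big]^2 = \sum_{i=1}^{n}\frac{q_i}{1-q_0}\big[\ln(y_{i,\tau-1})+\gamma-m_{\tau-1}-\gamma\big]^2 = s_{\tau-1}^2 \]

so that

\[ \sigma_\tau^2 = (1-q_0)\Big\{ q_0\big(\ln(y_{0,\tau-1})-m_{\tau-1}\big)^2 + s_{\tau-1}^2 \Big\} = \sigma_{\tau-1}^2 = \sigma_0^2 \]

To find individual $i = 0$’s income satisfaction we need to compute the log-mean and log-variance of the inclusive perceived income distribution:

\[
\begin{aligned}
\mu &= \sum_{\tau=-\infty}^{0} (1-a)a^{-\tau}\mu_\tau = \sum_{\tau=-\infty}^{0} (1-a)a^{-\tau}(\mu_0+\tau\gamma) \\
&= (1-a)\mu_0\sum_{\tau=-\infty}^{0} a^{-\tau} + (1-a)\gamma\sum_{\tau=-\infty}^{0} \tau a^{-\tau} \\
&= \mu_0 + (1-a)\gamma\sum_{\tau=0}^{\infty} a^{\tau}(-\tau) = \mu_0 - (1-a)\gamma\sum_{\tau=0}^{\infty}\frac{\partial(a^\tau)}{\partial a}\,a \\
&= \mu_0 - (1-a)\gamma\,a\,\frac{\partial\big(\sum_{\tau=0}^{\infty} a^\tau\big)}{\partial a} = \mu_0 - (1-a)\gamma\,a\,\frac{\partial\big(\frac{1}{1-a}\big)}{\partial a} \\
&= \mu_0 - (1-a)\gamma\,a\,(1-a)^{-2} = \mu_0 - \gamma\frac{a}{1-a} \\
&= q_0\ln(y_{00}) + (1-q_0)m_0 - \gamma\frac{a}{1-a}
\end{aligned}
\]

\[
\begin{aligned}
\sigma^2 &= \sum_{\tau=-\infty}^{0} (1-a)a^{-\tau}\big[\sigma_\tau^2 + (\mu_\tau-\mu)^2\big] \\
&= \sum_{\tau=-\infty}^{0} (1-a)a^{-\tau}\Big[\sigma_0^2 + \Big(\mu_0+\tau\gamma-\mu_0+\gamma\frac{a}{1-a}\Big)^2\Big] \\
&= \sigma_0^2 + \sum_{\tau=-\infty}^{0} (1-a)a^{-\tau}\Big[\gamma^2\Big(\tau+\frac{a}{1-a}\Big)^2\Big] \\
&= \sigma_0^2 + (1-a)\gamma^2\sum_{\tau=-\infty}^{0} a^{-\tau}\Big[\tau^2 + \frac{a^2}{(1-a)^2} + 2\tau\frac{a}{1-a}\Big]
\end{aligned}
\]

where

\[
\begin{aligned}
\sum_{\tau=-\infty}^{0} a^{-\tau}\tau^2 &= \sum_{\tau=0}^{\infty} a^{\tau}\tau^2 = \sum_{\tau=0}^{\infty}\frac{\partial(\tau a^\tau)}{\partial a}\,a = a\,\frac{\partial\sum_{\tau=0}^{\infty}(\tau a^\tau)}{\partial a} = a\,\frac{\partial\sum_{\tau=0}^{\infty}\big(\frac{\partial(a^\tau)}{\partial a}a\big)}{\partial a} \\
&= a\,\frac{\partial\Big(\frac{\partial(\sum_{\tau=0}^{\infty} a^\tau)}{\partial a}\,a\Big)}{\partial a} = a\,\frac{\partial\big((1-a)^{-2}a\big)}{\partial a} = \frac{a}{(1-a)^2} + \frac{2a^2}{(1-a)^3}
\end{aligned}
\]

where

\[ \sum_{\tau=-\infty}^{0} a^{-\tau}\frac{a^2}{(1-a)^2} = \frac{a^2}{(1-a)^3} \]

where

\[ \sum_{\tau=-\infty}^{0} a^{-\tau}\,2\tau\frac{a}{1-a} = \frac{2a}{1-a}\sum_{\tau=0}^{\infty} a^{\tau}(-\tau) = -\frac{2a}{1-a}\sum_{\tau=0}^{\infty}\frac{\partial a^\tau}{\partial a}\,a = -\frac{2a^2}{1-a}\,\frac{\partial\sum_{\tau=0}^{\infty} a^\tau}{\partial a} = -\frac{2a^2}{(1-a)^3} \]

so that

\[
\begin{aligned}
\sigma^2 &= \sigma_0^2 + (1-a)\gamma^2\Big[\frac{a}{(1-a)^2} + \frac{2a^2}{(1-a)^3} + \frac{a^2}{(1-a)^3} - \frac{2a^2}{(1-a)^3}\Big] \\
&= \sigma_0^2 + \gamma^2\frac{a}{1-a}\Big[\frac{a}{1-a} + 1\Big] = \sigma_0^2 + \gamma^2\frac{a}{(1-a)^2}
\end{aligned}
\]

Thus individual $i = 0$’s satisfaction with her own income is given by:

\[ W(y_{00}) = \frac{\ln(y_{00}) - \mu}{\sigma} = \frac{\ln(y_{00}) - \big[q_0\ln(y_{00}) + (1-q_0)m_0 - \gamma\frac{a}{1-a}\big]}{\sqrt{\sigma_0^2 + \gamma^2\frac{a}{(1-a)^2}}} = \frac{(1-q_0)\big[\ln(y_{00})-m_0\big] + \gamma\frac{a}{1-a}}{\sqrt{\sigma_0^2 + \gamma^2\frac{a}{(1-a)^2}}} \]

One-off income growth for self

We will solve this in a similar manner to the one-off income growth for everyone in Section A.1.2. Since we assume that all incomes in the society have been constant for a sufficiently long time and each individual’s reference group is everyone in the society, we have that $y_{i\tau} = y_{i,\tau-1}$, $\forall i$, $\tau \in (-\infty, 0]$.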
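The sustained-growth examples in this subsection rest on closed forms for sums weighted by a geometrically declining memory function, with weight (1 - a) * a**k on the period k steps in the past. A minimal numerical sketch of those sums follows; the values of `a` and `gamma`, and the names `geometric_moments`, `mu_shift` and `var_add`, are illustrative assumptions (`gamma` stands in for the constant growth rate).

```python
def geometric_moments(a, horizon=5_000):
    """Mean and variance of the lag k = 0, 1, 2, ... under weights (1 - a) * a**k.

    The infinite sum is truncated at `horizon`; for a well below one the
    neglected tail is numerically negligible.
    """
    weights = [(1.0 - a) * a**k for k in range(horizon)]
    mean_lag = sum(w * k for k, w in enumerate(weights))
    var_lag = sum(w * (k - mean_lag) ** 2 for k, w in enumerate(weights))
    return mean_lag, var_lag


a, gamma = 0.8, 0.03  # illustrative memory parameter and growth rate

mean_lag, var_lag = geometric_moments(a)

# With log incomes growing by gamma per period, the remembered log-mean lags the
# current one by gamma * E[k], and growth adds gamma**2 * Var[k] to the log-variance.
mu_shift = -gamma * mean_lag
var_add = gamma**2 * var_lag

print(mean_lag, a / (1 - a))
print(var_lag, a / (1 - a) ** 2)
```

Under these assumptions the brute-force sums reproduce a / (1 - a) for the mean lag and a / (1 - a)**2 for its variance, which are the two factors that appear in the sustained-growth expressions for the log-mean and the log-variance.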
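Once again it is straightforward to see that we have the following simplifications:

\[ \mu_\tau = \mu_{\tau-1};\quad m_\tau = m_{\tau-1};\quad \sigma_\tau^2 = \sigma_{\tau-1}^2;\quad s_\tau^2 = s_{\tau-1}^2,\quad \forall \tau \in (-\infty, 0] \]
\[ \Rightarrow\quad \mu_\tau = \mu;\quad m_\tau = m;\quad \sigma_\tau^2 = \sigma^2;\quad s_\tau^2 = s^2. \]

In this example, we are assuming a one-off increase in individual $i = 0$’s income by the factor $\exp(\gamma)$, while the income of all other individuals $i \neq 0$ remains unchanged, i.e. $y'_{00} = y_{00}\exp(\gamma) \Rightarrow \ln(y'_{00}) = \ln(y_{00}) + \gamma$, and $y'_{i0} = y_{i0}$, $\forall i \neq 0$. Now, since income for all $i \neq 0$ remains unchanged, we can make the following additional simplifications: $m'_0 = m_0 = m = m'$, $s'^2_0 = s_0^2 = s^2 = s'^2$.

Before deriving the log-moments of the inclusive perceived income distribution, it is useful to first derive the log-moments of the inclusive perceived income distribution at time $\tau = 0$ in terms of the pre-income increase levels (since the log-moments for any $\tau < 0$ remain unchanged):

\[ \mu'_0 = \sum_{i=0}^{n} q_i\ln(y'_{i0}) = q_0\ln(y'_{00}) + (1-q_0)m'_0 = q_0\big(\ln(y_{00})+\gamma\big) + (1-q_0)m_0 = \mu_0 + q_0\gamma \]

\[
\begin{aligned}
\sigma'^2_0 &= (1-q_0)\Big\{ q_0\big(\ln(y'_{00})-m'_0\big)^2 + s'^2_0 \Big\} = (1-q_0)\Big\{ q_0\big(\ln(y_{00})+\gamma-m_0\big)^2 + s_0^2 \Big\} \\
&= (1-q_0)\Big\{ q_0\Big[\big(\ln(y_{00})-m_0\big)^2 + \gamma^2 + 2\gamma\big(\ln(y_{00})-m_0\big)\Big] + s_0^2 \Big\} \\
&= (1-q_0)\Big\{ q_0\big(\ln(y_{00})-m_0\big)^2 + s_0^2 \Big\} + (1-q_0)q_0\Big\{ \gamma^2 + 2\gamma\big(\ln(y_{00})-m_0\big) \Big\} \\
&= \sigma_0^2 + \underbrace{(1-q_0)q_0\gamma\Big\{ \gamma + 2\big(\ln(y_{00})-m_0\big) \Big\}}_{X} = \sigma_0^2 + X
\end{aligned}
\]

Now we are in a position to derive individual $i = 0$’s income satisfaction with the new increased income level $y'_{00}$ in terms of the pre-income increase income level $y_{00}$, by computing the log-moments of the inclusive perceived income distribution:

\[ \mu' = \sum_{\tau=-\infty}^{0} (1-a)a^{-\tau}\mu'_\tau = \sum_{\tau=-\infty}^{-1} (1-a)a^{-\tau}\mu_\tau + (1-a)\mu'_0 = a\mu + (1-a)(\mu_0 + q_0\gamma) = \mu + (1-a)q_0\gamma \]

\[
\begin{aligned}
\sigma'^2 &= \sum_{\tau=-\infty}^{0} (1-a)a^{-\tau}\Big\{ \sigma'^2_\tau + (\mu'_\tau-\mu')^2 \Big\} \\
&= \sum_{\tau=-\infty}^{-1} (1-a)a^{-\tau}\Big\{ \sigma_\tau^2 + (\mu_\tau-\mu)^2 \Big\} + (1-a)\Big[\sigma'^2_0 + (\mu'_0-\mu')^2\Big] \\
&= a\sigma^2 + (1-a)\Big\{ \sigma_0^2 + X + \big(\mu_0 + q_0\gamma - \mu - (1-a)q_0\gamma\big)^2 \Big\} \\
&= a\sigma^2 + (1-a)\Big\{ \sigma^2 + X + (aq_0\gamma)^2 \Big\} \\
&= \sigma^2 + (1-a)\big(X + a^2\gamma^2q_0^2\big) \\
&= \sigma^2 + (1-a)\Big\{ (1-q_0)q_0\gamma^2 + 2\gamma(1-q_0)q_0\big[\ln(y_{00})-m_0\big] + \gamma^2a^2q_0^2 \Big\} \\
&= \sigma^2 + \gamma(1-a)q_0\Big\{ \underbrace{(1-q_0)\big[\gamma + 2\big(\ln(y_{00})-m_0\big)\big]}_{A} + \underbrace{a^2\gamma q_0}_{B} \Big\}
\end{aligned}
\]

Thus individual $i = 0$’s level of income satisfaction is given by the following:

\[ W(y'_{00}) = \frac{\ln(y'_{00}) - \mu'}{\sigma'} = \frac{\ln(y_{00}) + \gamma - \mu - (1-a)q_0\gamma}{\sqrt{\sigma^2 + \gamma(1-a)q_0\{A+B\}}} = \frac{\ln(y_{00}) - \mu + \gamma\big[1-(1-a)q_0\big]}{\sqrt{\sigma^2 + \gamma(1-a)q_0\{A+B\}}} \]

where $A = (1-q_0)\big[\gamma + 2(\ln(y_{00})-m_0)\big]$ and $B = a^2\gamma q_0$.

Sustained income growth for self

We will solve this example in a similar manner to the sustained income growth for everyone, in Section A.1.2. We will assume that individual $i = 0$’s income has been growing constantly at the rate $\gamma$ while for everyone else, incomes have been constant. Once again we assume that each individual’s reference group consists of everyone in the society. Hence we have the following: $y_{0\tau} = y_{0,\tau-1}\exp(\gamma) \Rightarrow \ln(y_{0\tau}) = \ln(y_{0,\tau-1}) + \gamma$, or $\ln(y_{0\tau}) = \ln(y_{00}) + \tau\gamma$ in present values. Similarly, $y_{i\tau} = y_{i,\tau-1}$, $\forall i \neq 0$. The first obvious implication of the fact that the incomes of all $i \neq 0$ are constant is that $m_\tau = m_0 = m$, $s_\tau^2 = s_0^2 = s^2$.

Once again we begin the derivations with the log-moments of the inclusive perceived income distribution at time $\tau$ in terms of present values:

\[ \mu_\tau = \sum_{i=0}^{n} q_i\ln(y_{i\tau}) = q_0\ln(y_{0\tau}) + (1-q_0)m_\tau = q_0\big(\ln(y_{00})+\tau\gamma\big) + (1-q_0)m_0 = \mu_0 + q_0\tau\gamma = \mu_{\tau-1} + q_0\gamma \]

\[
\begin{aligned}
\sigma_\tau^2 &= (1-q_0)\Big\{ q_0\big(\ln(y_{0\tau})-m_\tau\big)^2 + s_\tau^2 \Big\} = (1-q_0)\Big\{ q_0\big(\ln(y_{0,\tau-1})+\gamma-m_{\tau-1}\big)^2 + s_{\tau-1}^2 \Big\} \\
&= (1-q_0)\Big\{ q_0\big(\ln(y_{0,\tau-1})-m_{\tau-1}\big)^2 + s_{\tau-1}^2 \Big\} + (1-q_0)q_0\Big\{ \gamma^2 + 2\gamma\big(\ln(y_{0,\tau-1})-m_{\tau-1}\big) \Big\} \\
&= \sigma_{\tau-1}^2 + \gamma(1-q_0)q_0\Big[\gamma + 2\big(\ln(y_{0,\tau-1})-m_{\tau-1}\big)\Big] \\
&= \sigma_0^2 + \underbrace{\tau\gamma(1-q_0)q_0\Big[\tau\gamma + 2\big(\ln(y_{00})-m_0\big)\Big]}_{X_\tau}
\end{aligned}
\]

We can now derive the log-moments of the inclusive perceived income distribution as follows:

\[ \mu = \sum_{\tau=-\infty}^{0} (1-a)a^{-\tau}\mu_\tau = (1-a)\sum_{\tau=-\infty}^{0} a^{-\tau}(\mu_0 + q_0\tau\gamma) = \mu_0 + q_0\gamma(1-a)\sum_{\tau=-\infty}^{0} a^{-\tau}\tau = \mu_0 - q_0\gamma\,a(1-a)^{-1} \]

\[
\begin{aligned}
\sigma^2 &= \sum_{\tau=-\infty}^{0} (1-a)a^{-\tau}\big[\sigma_\tau^2 + (\mu_\tau-\mu)^2\big] \\
&= \sum_{\tau=-\infty}^{0} (1-a)a^{-\tau}\Big[\sigma_0^2 + X_\tau + \big(\mu_0 + q_0\tau\gamma - \mu_0 + q_0\gamma\,a(1-a)^{-1}\big)^2\Big] \\
&= \sum_{\tau=-\infty}^{0} (1-a)a^{-\tau}\Big[\sigma_0^2 + X_\tau + q_0^2\gamma^2\Big(\tau+\frac{a}{1-a}\Big)^2\Big] \\
&= \sigma_0^2 + (1-a)\sum_{\tau=-\infty}^{0} a^{-\tau}X_\tau + (1-a)q_0^2\gamma^2\sum_{\tau=-\infty}^{0} a^{-\tau}\Big(\tau+\frac{a}{1-a}\Big)^2
\end{aligned}
\]

where

\[
\begin{aligned}
\sum_{\tau=-\infty}^{0} a^{-\tau}X_\tau &= \gamma(1-q_0)q_0\Big\{ \gamma\sum_{\tau=-\infty}^{0} a^{-\tau}\tau^2 + 2\big(\ln(y_{00})-m_0\big)\sum_{\tau=-\infty}^{0} a^{-\tau}\tau \Big\} \\
&= \gamma(1-q_0)q_0\Big\{ \gamma\Big[\frac{a}{(1-a)^2} + \frac{2a^2}{(1-a)^3}\Big] - 2\big(\ln(y_{00})-m_0\big)\frac{a}{(1-a)^2} \Big\} \\
&= \gamma(1-q_0)q_0\,a(1-a)^{-2}\Big\{ \gamma\big[1 + 2a(1-a)^{-1}\big] - 2\big(\ln(y_{00})-m_0\big) \Big\}
\end{aligned}
\]

where

\[
\begin{aligned}
\sum_{\tau=-\infty}^{0} a^{-\tau}\Big(\tau+\frac{a}{1-a}\Big)^2 &= \sum_{\tau=-\infty}^{0} a^{-\tau}\Big\{ \frac{a^2}{(1-a)^2} + \tau^2 + \frac{2a\tau}{1-a} \Big\} \\
&= \frac{a^2}{(1-a)^3} + \frac{a}{(1-a)^2} + \frac{2a^2}{(1-a)^3} - \frac{2a^2}{(1-a)^3} = a(1-a)^{-2}\big[1 + a(1-a)^{-1}\big]
\end{aligned}
\]

so that

\[
\begin{aligned}
\sigma^2 &= \sigma_0^2 + \gamma q_0\frac{a}{1-a}\Big\{ (1-q_0)\Big[\gamma\Big(1 + \frac{2a}{1-a}\Big) - 2\big(\ln(y_{00})-m_0\big)\Big] + q_0\gamma\Big(1 + \frac{a}{1-a}\Big) \Big\} \\
&= \sigma_0^2 + \gamma q_0\frac{a}{1-a}\Big\{ (1-q_0)\Big[\gamma\frac{1+a}{1-a} - 2\big(\ln(y_{00})-m_0\big)\Big] + q_0\gamma\frac{1}{1-a} \Big\}
\end{aligned}
\]

Individual $i = 0$’s level of income satisfaction is then given by the following:

\[ W(y_{00}) = \frac{\ln(y_{00}) - \mu}{\sigma} = \frac{\ln(y_{00}) - \big[q_0\ln(y_{00}) + (1-q_0)m_0 - q_0\gamma\frac{a}{1-a}\big]}{\sqrt{\sigma_0^2 + \gamma q_0\frac{a}{1-a}\{(1-q_0)A + q_0B\}}} = \frac{(1-q_0)\big[\ln(y_{00})-m_0\big] + q_0\gamma\frac{a}{1-a}}{\sqrt{\sigma_0^2 + C\big[q_0B + (1-q_0)A\big]}} \]

where $A = \gamma\frac{1+a}{1-a} - 2\big(\ln(y_{00})-m_0\big)$, $B = \gamma\frac{1}{1-a}$, $C = \gamma q_0\frac{a}{1-a}$.

Derivation of individual $i = 0$’s income satisfaction as $y_{00} \to \infty$:

Notice that as $y_{00} \to \infty$, $(\ln(y_{00})-m_0) \to \infty$, and define $(\ln(y_{00})-m_0) \equiv X$.

Step 1: Divide both the numerator and the denominator by $X$.

\[ W(y_{00}) = \frac{(1-q_0) + q_0\gamma\frac{a}{1-a}/X}{\sqrt{\sigma_0^2/X^2 + Cq_0B/X^2 + C(1-q_0)\gamma\frac{1+a}{1-a}/X^2 - C(1-q_0)2X/X^2}} \]

Step 2: It is clear that the numerator $(1-q_0) + q_0\gamma\frac{a}{1-a}/X \to (1-q_0)$, as $X \to \infty$.

Step 3: It is also clear that the last three terms of the denominator, $Cq_0B/X^2$, $C(1-q_0)\gamma\frac{1+a}{1-a}/X^2$ and $C(1-q_0)2X/X^2$, $\to 0$ as $X \to \infty$.

Step 4: For the first term of the denominator, $\sigma_0^2 = (1-q_0)q_0X^2 + (1-q_0)s_0^2$, we have that:

\[ \sigma_0^2/X^2 = (1-q_0)q_0 + (1-q_0)s_0^2/X^2 \to (1-q_0)q_0, \quad \text{as } X \to \infty. \]

Step 5: Combining Steps 2 - 4, we get that:

\[ W(y_{00}) \to \sqrt{\frac{1-q_0}{q_0}}, \quad \text{as } X \to \infty \]

A.1.3 Model adaptation

Income of individual $i$ at time $\tau$ and the log-mean of the perceived income distribution at time $\tau$ can be re-written as:

\[ y_{i\tau} = \gamma_\tau\,y_{i,\tau-1},\quad \forall \tau \]

\[ m_\tau = \sum_{i=1}^{n}\frac{q_i}{1-q_0}\ln(y_{i\tau}) = \sum_{i=1}^{n}\frac{q_i}{1-q_0}\ln(\gamma_\tau) + \sum_{i=1}^{n}\frac{q_i}{1-q_0}\ln(y_{i,\tau-1}) = \ln(\gamma_\tau) + m_{\tau-1} \]

It is useful to define the difference between log income of self and log-mean of the perceived income distribution:

\[ \ln(y_{0\tau}) - m_\tau = \ln(\gamma_\tau) + \ln(y_{0,\tau-1}) - \ln(\gamma_\tau) - m_{\tau-1} = \ln(y_{0,\tau-1}) - m_{\tau-1} \quad \text{(constant)} \]

The log-mean of the inclusive perceived income distribution at time $\tau$ can then be expressed as:

\[
\begin{aligned}
\mu_\tau &= \ln(\gamma_\tau) + \mu_{\tau-1} = q_0\ln(y_{0\tau}) + (1-q_0)m_\tau \\
&= q_0\big[\ln(y_{0\tau})-m_\tau\big] + m_\tau = q_0\big[\ln(y_{0\tau})-m_\tau\big] + \ln(\gamma_\tau) + m_{\tau-1}
\end{aligned}
\]

Which then allows us to define the log-mean of the inclusive perceived income distribution as:

\[ \mu = \sum_{\tau=-\infty}^{0} a(\tau)\mu_\tau = \sum_{\tau=-\infty}^{0} a(\tau)m_\tau + q_0\big[\ln(y_{0\tau})-m_\tau\big] \tag{7} \]

Similarly we can write the log-variance of the perceived income distribution at time $\tau$ as:

\[ s_\tau^2 = \sum_{i=1}^{n}\frac{q_i}{1-q_0}\big(\ln(y_{i\tau})-m_\tau\big)^2 = s_{\tau-1}^2 \equiv s^2 \quad \text{(constant)} \]

and the log-variance of the inclusive perceived income distribution at time $\tau$ as:

\[ \sigma_\tau^2 = (1-q_0)\Big\{ q_0\big(\ln(y_{0\tau})-m_\tau\big)^2 + s_\tau^2 \Big\} = (1-q_0)\Big\{ q_0\big(\ln(y_{0,\tau-1})-m_{\tau-1}\big)^2 + s^2 \Big\} = \sigma_{\tau-1}^2 = \sigma_0^2 \quad \text{(constant)} \]

Finally, we can express the log-variance of the inclusive perceived income distribution as:

\[ \sigma^2 = \sum_{\tau=-\infty}^{0} a(\tau)\Big\{ \sigma_\tau^2 + (\mu_\tau-\mu)^2 \Big\} = \sigma_0^2 + \sum_{\tau=-\infty}^{0} a(\tau)\Big(\sum_{s=-\infty}^{0} a(s)m_s - m_\tau\Big)^2 \]

The availability of data on GDP growth rates varies by country. For most countries we have estimates of GDP growth going back ten years. We make the additional assumption that growth has been zero beyond the $n$th period (where $n = 10$).

\[
\begin{aligned}
\sum_{\tau=-\infty}^{0} a(\tau)m_\tau &= (1-a)\sum_{\tau=-\infty}^{0} a^{-\tau}m_\tau = (1-a)\sum_{\tau=-n}^{0} a^{-\tau}m_\tau + (1-a)\sum_{\tau=-\infty}^{-(n+1)} a^{-\tau}m_{-n} \\
&= (1-a)\sum_{\tau=-n}^{0} a^{-\tau}m_\tau + a^{n+1}m_{-n} \equiv M
\end{aligned}
\]

\[ \Rightarrow\quad \mu = M + q_0\big[\ln(y_{0\tau})-m_\tau\big] \]

\[ \Rightarrow\quad \sigma^2 = \sigma_0^2 + \sum_{\tau=-\infty}^{0} a(\tau)(M-m_\tau)^2 = \sigma_0^2 + (1-a)\sum_{\tau=-n}^{0} a^{-\tau}(M-m_\tau)^2 + a^{n+1}(M-m_{-n})^2 \tag{8} \]

A.2 Empirical analysis and robustness checks

A.2.1 Robustness of empirical specifications

As a first test of robustness of our estimation results we change the coding of vignette characteristics and add respondent-level characteristics.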
More specifically, we define the following variables capturing various vignette characteristics other than income:

“VigBackpain” = 1 if the vignette description mentions the existence of back pain, = 0 otherwise;
“VigSecurejob” = 1 if the vignette description mentions that the hypothetical individual has a secure job, = 0 otherwise;
“VigInterestingjob” = 1 if the vignette description mentions that the hypothetical individual has an interesting job, = 0 otherwise;
“VigMarried” = 1 if the vignette description mentions that the hypothetical individual is married, = 0 otherwise;
“VigChildren” = 1 if the vignette description mentions that the hypothetical individual has children, = 0 otherwise;
“VigWidow” = 1 if the vignette description mentions that the hypothetical individual is a widow, = 0 otherwise;
“VigWorking” = 1 if the vignette description mentions that the hypothetical individual is currently working, = 0 otherwise;
“VigHealthprobs” = 1 if the vignette description mentions that the hypothetical individual has any health problems, = 0 otherwise.

We also include several variables capturing respondent characteristics, the meanings of which are self-explanatory. (We exclude some variables from Table A.1 for the sake of space. None of the variables we excluded was significant at any level.) The results are shown in column “A”, which was based on the same starting values as our main specification in Table 1.5.

As a second test of robustness of our estimation results, we present results of specification “A” with different starting values of our parameters. These are shown in columns “B” and “C” of Table A.1.

Table A.1 shows that the main specification presented in Table 1.5 is robust to changing the variables that code the vignette characteristics and to adding additional respondent-specific characteristics. Our main specification is mostly robust to the choice of starting values.
It seems, however, that there exist two local optima for the objective function, each of which is reached depending on the starting values. However, the local minimum of the sum of squares corresponding to our main specification is lower than the one based on the alternative starting values. We therefore take the main specification as representing a global minimum.

Table A.1: Model estimation results.

                     A                      B                         C
0                    -0.105 (0.309)         -0.105 (0.309)            0.05 (0.332)
1                    0.303*** (0.032)       0.303*** (0.032)          0.341*** (0.03)
q_0                  0.053*** (0.014)       0.053*** (0.014)          0.054*** (0.011)
a                    0.971*** (0.041)       0.971*** (0.041)          0.215 (1.15)
Vigfemale            -0.059*** (0.015)      -0.059*** (0.015)         -0.038*** (0.012)
Vigage               0.002*** (0.001)       0.002*** (0.001)          0.0007** (0.0003)
VigBackpain          -0.027 (0.017)         -0.027 (0.017)            -0.0529*** (0.016)
VigSecureJob         0.109*** (0.013)       0.109*** (0.013)          0.119*** (0.013)
VigInterestingJob    0.096*** (0.026)       0.096*** (0.026)          0.046** (0.022)
VigMarried           -0.114*** (0.009)      -0.114*** (0.009)         -0.112*** (0.008)
VigChildren          -0.074*** (0.019)      -0.074*** (0.019)         -0.097*** (0.017)
VigWidow             -0.1*** (0.016)        -0.1*** (0.016)           -0.063*** (0.015)
VigWorking           -0.143*** (0.01)       -0.143*** (0.01)          -0.139*** (0.01)
VigHealthProbs       -0.011* (0.007)        -0.011* (0.007)           -0.017*** (0.006)
N                    391,088                391,088                   391,088
R^2                  0.2101                 0.2101                    0.2074
Rt(MSE)              0.21068                0.21068                   0.211
Res. Dev.            -108336                -108336                   -107000
Starting values      {0, 1, 0.5, 0.8}       {0.6, 0.6, 0.6, 0.6}      {0.5, 0.5, 0.5, 0.5}

Standard errors in parentheses. *p < 0.1; **p < 0.05; ***p < 0.01

Model estimation results (Cont.)
                            A                    B                       C
age                         -0.0001 (0.006)      -0.0001 (0.006)         -0.0001 (0.0001)
female                      0.006*** (0.002)     0.006*** (0.002)        0.005*** (0.002)
married                     -0.011*** (0.003)    -0.011*** (0.003)       -0.013*** (0.003)
divorced                    0.003 (0.006)        0.003 (0.006)           0.0005 (0.006)
widowed                     -0.006 (0.005)       -0.006 (0.005)          -0.01** (0.005)
domesticpartner             0.0006 (0.007)       0.0006 (0.007)          0.0023 (0.007)
unemployed                  0.0006 (0.004)       0.0006 (0.004)          0.004 (0.004)
selfemployed                0.003 (0.005)        0.003 (0.005)           0.003 (0.004)
urban                       -0.008* (0.005)      -0.008* (0.005)         -0.008* (0.004)
ln(GDPperCapita)            0.035* (0.019)       0.035* (0.019)          0.017 (0.018)
ln(EducationExpenditure)    0.002 (0.011)        0.002 (0.011)           0.019* (0.01)
ln(LifeExpectancy)          0.062 (0.87)         0.062 (0.87)            0.041 (0.09)
N                           391,088              391,088                 391,088
R^2                         0.2101               0.2101                  0.2074
Rt(MSE)                     0.21068              0.21068                 0.211
Res. Dev.                   -108336              -108336                 -107000
Starting values             {0, 1, 0.5, 0.8}     {0.6, 0.6, 0.6, 0.6}    {0.5, 0.5, 0.5, 0.5}

Standard errors in parentheses. *p < 0.1; **p < 0.05; ***p < 0.01. Note the following: estimated parameters in columns A and B are essentially the same, apart from some rounding differences; all estimation results where starting values are the same and above 0.6 produce exactly the same results as column B; all estimation results where starting values are the same and below 0.5 produce exactly the same results as column C.

A.2.2 Other robustness checks

Evidence of random assignment of vignette characteristics

We perform two tests to examine whether vignette characteristics are randomly assigned to vignettes. First, we report pairwise correlations between the vignettes’ income multipliers and the five other vignette characteristics. Secondly, we report a multiple regression of the income multiplier on the other characteristics (Table A.2). We find that vignette characteristics are mostly as-good-as randomly assigned. The only characteristics that show some evidence of correlation with the income multiplier are age and job, which is not surprising since it is expected that as one gets older, an individual’s job may get better and income may increase.
Table A.2: Pairwise correlations and multiple regression of the income multiplier on other vignette characteristics.

                         Female    Age        Health    Family    Job
Pairwise correlations    -0.27     0.73***    0.18      -0.16     0.83***
                         (0.4)     (0.007)    (0.57)    (0.62)    (<0.001)
Multiple regression      0.34      0.0002     0.22      0.13      0.86*
                         (0.34)    (0.99)     (0.27)    (0.46)    (0.06)

Numbers in brackets indicate p-values. *p<0.1, **p<0.05, ***p<0.01.

A.3 Data appendix

A.3.1 Coding of vignette dummy variables

Table A.3: Coding of vignette dummies used in Table 1.5.

Variable         Value    Vignette set
“healthgood”     1        {A2, A3, A6}
“healthbad”      1        {A1, A5, B1, B2, B4, B5, B6}
“familygood”     1        {A1, A3, A5, A6, B1, B2}
“familybad”      1        {B4, B5, B6}
“jobgood”        1        {A2, A6, B1, B4, B6}
“jobbad”         1        {A3, B2}

Table A.4: Coding of vignette dummies used in Table A.1.

Variable                 Value    Vignette set
“Vig-female”             1        {A1, A4, A5, B2, B4, B5}
“Vig-Backpain”           1        {A1, A5, B2}
“Vig-Securejob”          1        {A2, A4, B3}
“Vig-Interestingjob”     1        {A4}
“Vig-Married”            1        {A1, A4, A6, B3, B5, B6}
“Vig-Children”           1        {A2, B2, B4, B5}
“Vig-Widow”              1        {A5, B2}
“Vig-Working”            1        {A2, A3, A4, B1, B3, B4, B6}
“Vig-Healthprobs”        1        {A1, A5, B1, B2, B5, B6}

A.3.2 Vignette descriptions

Vignette questions were asked after the question of own life satisfaction on the 0–10 scale. Each individual was asked to answer either the A set or the B set of vignettes (about half in each country). Given the set that each individual answered, the order in which vignettes were presented was randomized.

Vignette set A:

A1 Think of a female who is 40 years old and happily married with a good family life. Her monthly family income is about (median income). She has severe back pain, which keeps her awake at night. On which step of the ladder do you think this person stands?

A2 Think of a male who is 50 years old and divorced. He has a daughter with whom he has a good relationship. He has a secure job that pays about (twice median income) per month. He has no serious health problems. On which step of the ladder do you think this person stands?

A3 Think of a male who is 25 years old and single without many friends. He makes about (half median income) per month. He feels he has little control over his job and worries about losing it. He has no health problems but feels stressed sometimes. On which step of the ladder do you think this person stands?

A4 Think of a female who is 35 years old and married, with no children. Her monthly family income is about (median income). Her work is a bit dull sometimes, but it is a very secure job. On which step of the ladder do you think this person stands?

A5 Think of a female who is a 70-year-old widow. She receives about (half median income) in income each month. She has many friends. Lately, she suffers from back pain, which makes housework painful. On which step of the ladder do you think this person stands?

A6 Think of a male who is 60 years old. He is single but has many friends his age. He no longer works but is comfortable with his decision to stop working. He receives about (twice median income) in income each month. He is very physically active. On which step of the ladder do you think this person stands?

Vignette set B:

B1 Think of a male who is 40 years old and happily married with a good family life. His monthly family income is about (twice median income). He likes to work but suffers from serious back pain, which keeps him awake at night. On which step of the ladder do you think this person stands?

B2 Think of a female who is a 65-year-old widow. She misses her husband a lot but has good relationships with her children and grandchildren. She receives about (half median income) in income each month. She has heart problems, which caused her to stop working. On which step of the ladder do you think this person stands?

B3 Think of a male who is 35 years old and married, with no children. His monthly family income is about (half median income). His work is a bit dull sometimes, but it is a very secure job.
On which step of the ladder do you think this person stands?

B4 Think of a female who is 60 years old and divorced. She has children from her marriage but has little contact with them. She has an interesting job. Her monthly income is about (twice median income). She often has trouble sleeping. On which step of the ladder do you think this person stands?

B5 Think of a female who is 70 years old and married. She and her husband lead their own lives and don't do many things together. They have two children but rarely see them. Her monthly family income is about (median income). She is overweight and gets tired when walking for more than a few minutes. On which step of the ladder do you think this person stands?

B6 Think of a male who is 50 years old. He does not exercise and is obese. He has pain in his knees almost all the time. He is very secure in his job. He has been married for a long time, but he and his wife spend very little time together. His monthly family income is about (median income). On which step of the ladder do you think this person stands?

A.3.3 Countries, sample sizes, interview years and non-representative samples

Table A.5: Countries, sample size, interview year, non-rep samples

Country  2011  2012  2013  2014  Portion  Non-rep
Afghanistan  1,000  0.83
Albania  999  0.83
Argentina  1,000  0.83
Armenia  1,000  0.83
Australia  1,002  0.83
Austria  1,000  0.83
Azerbaijan  1,000  0.83
Bahrain  1,002  0.83
Bangladesh  1,000  0.83
Belarus  1,052  0.87
Benin  1,000  0.83
Bolivia  1,000  0.83
Bosnia & Herzegovina  1,001  0.83
Botswana  1,000  0.83
Brazil  1,042  0.86
Bulgaria  1,000  0.83
Cambodia  1,000  0.83
Cameroon  1,000  0.83
Canada  1,002  0.83
Central African Rep.  1,000  0.83
Chad  1,000  0.83
Chile  1,009  0.84
China  4,256  3.52
Colombia  1,000  0.83
Congo Kinshasa  1,000  0.83
Costa Rica  1,000  0.83
Croatia  1,000  0.83
Czech Republic  1,001  0.83
Dominican Republic  1,000  0.83
Ecuador  1,003  0.83
Egypt  1,077  0.89
El Salvador  1,000  0.83
Ethiopia  1,000  0.83
France  1,003  0.83
Georgia  1,000  0.83
Germany  3,033  2.51
Ghana  1,000  0.83
Greece  1,003  0.83
Guatemala  1,000  0.83
Haiti  504  0.42
Honduras  1,000  0.83
Hungary  1,019  0.84
India  5,000  4.14
Indonesia  1,000  0.83
Iran  1,000  0.83
Iraq  1,003  0.83
Israel  1,000  0.83
Italy  1,004  0.83
Japan  1,000  0.83
Jordan  1,000  0.83
Kazakhstan  1,000  0.83
Kenya  1,000  0.83
Kuwait  1,008  0.83
Laos  1,000  0.83
Lebanon  1,012  0.84
Liberia  1,000  0.83
Macedonia  1,025  0.85
Madagascar  1,008  0.83
Malaysia  1,000  0.83
Mauritania  1,000  0.83
Mexico  1,000  0.83
Moldova  1,000  0.83
Mongolia  1,000  0.83
Morocco  1,007  0.83
Myanmar  1,020  0.84
Namibia  1,000  0.83
New Zealand  1,008  0.83
Nicaragua  1,000  0.83
Nigeria  1,000  0.83
Pakistan  1,008  0.83
Palestine  1,000  0.83
Panama  1,001  0.83
Paraguay  1,000  0.83
Peru  1,000  0.83
Philippines  1,000  0.83
Poland  1,000  0.83
Portugal  1,007  0.83
Russia  1,500  1.24
Rwanda  1,000  0.83
Saudi Arabia  1,017  0.84
Senegal  1,000  0.83
Singapore  1,000  0.83
Slovakia  1,000  0.83
Slovenia  1,017  0.84
South Africa  1,000  0.83
South Korea  1,000  0.83
Spain  1,001  0.83
Sri Lanka  1,030  0.85
Sudan  1,000  0.83
Syria  1,025  0.85
Taiwan  1,000  0.83
Tanzania  1,008  0.83
Thailand  1,000  0.83
Turkey  1,000  0.83
Uganda  1,000  0.83
United Arab Emirates  1,012  0.84
United Kingdom  3,075  2.55
United States  1,019  0.84
Uruguay  1,000  0.83
Venezuela  1,000  0.83
Vietnam  1,000  0.83
Zambia  1,000  0.83
Zimbabwe  1,000  0.83
Total (109 countries)  18,143  51,024  26,103  25,553  100  (23)

Appendix B

Appendix to Chapter 2

B.1 Theoretical Derivations

B.1.1 Equilibrium
in the Information Design Extension

For this game we use Kamenica and Gentzkow's Principal-Preferred Subgame Perfect Equilibrium and solve the game by backward induction. We provide the intuition behind the solution and omit the detailed proofs, which are provided in Kamenica and Gentzkow (2011) and other sources.

Stage 2. The agent's problem is to choose an action $c(m) \in \{r, b\}$ for every possible message $m \in \{\hat{r}, \hat{b}\}$ to maximize her expected payoff given the prior probability distribution $p$ over the states and the principal's signal structure $(P_R, P_B)$:

$$c^*(m) = \arg\max_{c \in \{r, b\}} \; P(m)\,\pi^A_{R,c} + (1 - P(m))\,\pi^A_{B,c}$$

where $P(m)$ denotes the posterior probability of state $R$ given the message $m$, which is generated from the principal's signal structure $(P_R, P_B)$, chosen by the principal in Stage 1, and is calculated according to Bayes' rule as follows (see footnote 1):

$$P(m) = \Pr(R \mid m) = \frac{\Pr(m \mid R)\Pr(R)}{\Pr(m)} = \frac{p\,\Pr(m \mid R)}{p\,\Pr(m \mid R) + (1 - p)\Pr(m \mid B)}$$

$$1 - P(m) = \Pr(B \mid m) = \frac{\Pr(m \mid B)\Pr(B)}{\Pr(m)} = \frac{(1 - p)\Pr(m \mid B)}{p\,\Pr(m \mid R) + (1 - p)\Pr(m \mid B)}$$

Essentially, the agent Bayes-updates her beliefs about the likelihood of each state and then takes the action which matches the state that is more likely to have been realized under each message:

$$c^*(m) = \begin{cases} r, & \text{if } P(m) \ge 1/2 \\ b, & \text{if } P(m) < 1/2 \end{cases}$$

Note that we follow Kamenica and Gentzkow (2011) in resolving the indifference case (where $P(m) = 1/2$) by having the agent choose the principal-preferred action $r$ (which is where the "principal-preferred" part of the equilibrium name comes from).

Stage 1. The principal's problem is to choose a signal structure $(P_R, P_B)$ to maximize her expected payoff, which is the probability with which the agent will choose action $r$ times the payoff derived from that action, $\Pr(c^*(m) = r)$. Given the agent's optimal behavior derived in Stage 2, this problem reduces to maximizing $\Pr[P(m) \ge 1/2]$. The intuition behind the solution goes as follows.
Since $p < \frac{1}{2}$, it is impossible to have $P(m) \ge \frac{1}{2}$ for both $m \in \{\hat{r}, \hat{b}\}$, so the best that the principal can do is to choose one message for which the induced posterior is weakly greater than one half and another for which it is strictly less than one half. Without loss of generality, and to facilitate the parallelism of messages as action recommendations, we assume that the principal chooses to induce posteriors such that $P(\hat{r}) \ge \frac{1}{2}$ and $P(\hat{b}) < \frac{1}{2}$ (i.e., such that the agent will want to choose $c^*(\hat{r}) = r$, $c^*(\hat{b}) = b$). The principal thus seeks to maximize $\Pr(m = \hat{r})$ (equivalently, to maximize $\Pr(m = \hat{r} \mid s = R)$ and minimize $\Pr(m = \hat{b} \mid s = B)$) while being constrained by $P(\hat{r}) \ge \frac{1}{2}$, $P(\hat{b}) < \frac{1}{2}$. To do so, the principal chooses to always transmit the correct message (recommendation) in the state where both players' preferred actions coincide (when $s = R$) and to mix the recommendations in the state where the players' preferred actions conflict (when $s = B$) such that the following is satisfied:

$$P(\hat{r}) \ge 1/2$$
$$\frac{\Pr(m = \hat{r} \mid s = R)\Pr(R)}{\Pr(m = \hat{r})} \ge 1/2$$
$$\frac{\Pr(m = \hat{r} \mid s = R)\,p}{\Pr(m = \hat{r} \mid s = R)\,p + \Pr(m = \hat{r} \mid s = B)(1 - p)} \ge 1/2$$
$$\frac{p}{p + \Pr(m = \hat{r} \mid s = B)(1 - p)} \ge 1/2$$
$$\Pr(m = \hat{r} \mid s = B) \le \frac{p}{1 - p}$$
$$\Rightarrow P_B = \Pr(m = \hat{b} \mid s = B) \ge \frac{1 - 2p}{1 - p}.$$

Since the principal seeks to minimize $P_B$, the solution to the principal's problem is given by:

$$P_R = \Pr(m = \hat{r} \mid s = R) = 1$$
$$P_B = \Pr(m = \hat{b} \mid s = B) = \frac{1 - 2p}{1 - p}$$

Footnote 1: $P_R = \Pr(m = \hat{r} \mid s = R)$, $P_B = \Pr(m = \hat{b} \mid s = B)$.

Principal-Preferred Subgame Perfect Equilibrium of Information Design Game.
The principal's optimal signal structure derived above induces the following posteriors:

$$P(m) = \begin{cases} \frac{1}{2}, & \text{if } m = \hat{r} \\ 0, & \text{if } m = \hat{b} \end{cases}$$

Correspondingly, the agent's optimal action-choice rule dictates the following choice rule in equilibrium:

$$c^*(m) = \begin{cases} r, & \text{if } m = \hat{r} \\ b, & \text{if } m = \hat{b} \end{cases}$$

Given the above, each player's expected payoffs are given by:

$$E_s\left[\pi^P_{s,c^*(m)}\right] = p\,\Pr(m = \hat{r} \mid s = R) + (1 - p)\Pr(m = \hat{r} \mid s = B) = 2p$$
$$E_s\left[\pi^A_{s,c^*(m)}\right] = p\,\Pr(m = \hat{r} \mid s = R) + (1 - p)\Pr(m = \hat{b} \mid s = B) = (1 - p)$$

B.1.2 Equilibrium in the Mechanism Design Extension

For this game, we again use the Principal-Preferred Subgame Perfect Equilibrium as the solution concept (where "Principal-Preferred" again indicates that we resolve the indifference case in favor of the principal).

Stage 2. The agent's problem is to choose an action $c \in \{r, b\}$ to maximize her expected payoff given the prior probability distribution $p$ over the states and the principal's transfers $(t_r, t_b)$:

$$c^* = \arg\max_{c \in \{r, b\}} \; p\,\pi^A_{R,c} + (1 - p)\,\pi^A_{B,c} + t_c,$$

where $t_c$ denotes the action-contingent transfer that the principal chooses in Stage 1. For a risk-neutral, expected-utility-maximizing agent, the optimal action is given by

$$c^* = \begin{cases} r, & \text{if } t_r - t_b \ge (1 - 2p) \\ b, & \text{if } t_r - t_b < (1 - 2p) \end{cases}$$

Note that once again we resolve the indifference case, $t_r = (1 - 2p) + t_b$, by having the agent choose action $r$. Also note that in this case, given the principal's action-contingent transfers, the agent's action is deterministic. This is in contrast to the information design case, where the agent's action-choice rule is a function of the probabilistically generated message.

Stage 1. The principal's problem is to choose (non-negative) action-contingent transfers $(t_r, t_b)$ in order to maximize her expected payoff, which is given by the payoff in the baseline game minus the transfer conditional on the agent's action: $\pi^P_{s,c} - t_c$. Trivially, it is optimal to set $t_b = 0$ since $\pi^P_{s,b} = 0, \; \forall s \in \{R, B\}$.
The problem then reduces to minimizing $t_r$ subject to the agent finding it optimal to choose $r$:

$$t_r^* = \arg\max_{t_r \in [0, 100]} \; -t_r \quad \text{s.t.} \quad t_r \ge (1 - 2p).$$

The solution to the principal's problem is given by

$$t_r^* = (1 - 2p), \qquad t_b^* = 0$$

Principal-Preferred Subgame Perfect Equilibrium of Mechanism Design Game. The principal's optimal transfer choice induces the agent to choose $r$ in equilibrium:

$$c^* = r$$

Given this, the players' expected payoffs are given by:

$$E_s\left[\pi^P_{s,r}\right] = 1 - t_r^* = 2p$$
$$E_s\left[\pi^A_{s,r}\right] = p + t_r^* = (1 - p)$$

B.1.3 ID and MD in Normal Form: Nash Equilibria and Bargaining

We find Nash equilibria by explicitly defining the best response correspondences.

Principals' best response correspondences:

$$BR^{Principal}_{ID}(Y) = Y, \quad \forall Y \in [0, 100]$$
$$BR^{Principal}_{MD}(Y) = Y, \quad \forall Y \in [0, 100]$$

Agents' best response correspondences:

$$BR^{Agent}_{ID}(X) = \begin{cases} (X, 100], & \text{if } X < \frac{400}{7} \\ [0, 100], & \text{if } X = \frac{400}{7} \\ [0, X], & \text{if } X > \frac{400}{7} \end{cases}$$

$$BR^{Agent}_{MD}(X) = \begin{cases} (X, 100], & \text{if } X < 40 \\ [0, 100], & \text{if } X = 40 \\ [0, X], & \text{if } X > 40 \end{cases}$$

The intersection of the above best response correspondences identifies a continuum of Nash equilibria for each game, as shown in Figure B.1.

Figure B.1: Best responses and Nash equilibria for the ID and MD games. [Two panels. ID section: the principal's and the agent's best responses and the Nash equilibria, plotted by principal's choice (X) and agent's choice (Y), with the principal-preferred NE (= SPE) at approximately 57 and the agent-preferred NE at 100. MD section: the same, with the principal-preferred NE (= SPE) at 40.] "Agent-preferred NE" refers to the Nash equilibrium that is best for the agent in terms of expected payoffs. Correspondingly, "Principal-preferred NE" refers to the Nash equilibrium that is best for the principal.

The set of Nash equilibria is very similar in the two games, except that the range is slightly larger in the MD game because the (PP)SPE is at a lower point.
The Principal-preferred Nash equilibria should not be confused with the corresponding Principal-Preferred Subgame Perfect Equilibria, even though they coincide in both games. This coincidence merely reflects the fact that the first-mover advantage of the principal in the two-stage version of the games assigns full bargaining power to the principal.

To highlight the equivalence of the NE of the two games, Figure B.2 graphs them in the expected-payoff space. We note the following. Conditional on agreement ($X \ge Y$), no equilibrium outcome Pareto dominates any other (i.e., the surplus generated by persuasion or incentives does not depend on who receives it). Moving along the red striped line from left to right, the expected payoffs of the Nash equilibria increase for the principal and decrease for the agent in a linear fashion. Thus, conditional on agreement, both the ID and MD games resemble constant-sum games. The equilibrium outcome predicted by the (PP)SPE corresponds to the best Nash equilibrium for the principal (Principal-preferred NE). We call "Disagreement outcome" the minimum expected payoff that each player can guarantee herself in each game. This happens when the principal chooses $X = 100$ (guaranteed to persuade/incentivize) and when the agent chooses $Y \ge \frac{400}{7}$ in ID and $Y \ge 40$ in MD. We call "Non-match outcome" the expected payoffs for each player when the principal fails to persuade/incentivize ($X < Y$). While the disagreement outcomes are the same as the non-match outcomes for agents across the two games, this is not the case for principals. By choosing $X = 100$ in ID, the principal constructs a fully informative signal structure, which guarantees her 30 points in expectation (since the agent will choose $c = r$ 30% of the time), while in MD the principal transfers 100 points (all of her points) to the agent, thus essentially guaranteeing herself zero points.
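The payoffs described above can be tabulated with a small sketch. It is an illustration rather than the experimental software, and it rests on assumptions read off this appendix: 3 red and 7 blue balls, 100 points at stake for each player, a pair matches whenever X >= Y, in ID the choice X is the percentage chance of a "guess blue" recommendation when the ball is blue, and an unmatched agent guesses blue. All function names are hypothetical.

```python
P_RED = 0.3    # prior probability of a red ball (3 red, 7 blue)
POINTS = 100   # points at stake for each player


def id_payoffs(x, y):
    """(principal, agent) expected payoffs in the ID normal form."""
    if x >= y:  # matched: the agent follows the recommendation
        blue_told_blue = x / 100
        # The principal earns POINTS whenever the agent guesses red.
        principal = POINTS * (P_RED + (1 - P_RED) * (1 - blue_told_blue))
        # The agent earns POINTS whenever the guess is correct.
        agent = POINTS * (P_RED + (1 - P_RED) * blue_told_blue)
        return principal, agent
    return 0.0, POINTS * (1 - P_RED)  # non-match: the agent guesses blue


def md_payoffs(x, y):
    """(principal, agent) expected payoffs in the MD normal form."""
    if x >= y:  # matched: the agent guesses red and receives the transfer x
        return POINTS - x, POINTS * P_RED + x
    return 0.0, POINTS * (1 - P_RED)  # non-match: the agent guesses blue


def posterior_red_after_guess_red(x):
    """Bayes posterior of a red ball after a 'guess red' recommendation in ID."""
    return P_RED / (P_RED + (1 - P_RED) * (1 - x / 100))


for label, payoffs, cutoff in (("ID", id_payoffs, 400 / 7), ("MD", md_payoffs, 40)):
    print(label, "principal-preferred NE:", payoffs(cutoff, cutoff))
    print(label, "X = 100 (disagreement choice):", payoffs(100, cutoff))
    print(label, "non-match:", payoffs(0, 100))
```

At the cutoffs 400/7 and 40 the matched agent earns the same 70 points she gets from a non-match, which is where her best response switches, and the posterior after "guess red" in ID is exactly one half. Choosing X = 100 leaves the principal about 30 points in ID but 0 in MD.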
Figure B.2: Expected payoffs in the ID and MD games. [Two panels, ID section and MD section, plotting the principal's and the agent's expected payoffs for matched pairs and marking the equilibrium payoffs, the principal-preferred NE (= SPE), the agent-preferred NE, the non-match outcome and the disagreement outcome; in the MD panel the non-match and disagreement outcomes coincide.] "Disagreement outcome" refers to the minimum guaranteed expected payoffs for each player. "Non-match outcome" refers to the expected payoffs when the principal fails to persuade/incentivize the agent ($X < Y$).

Next, we characterize the Nash Bargaining Solution (NBS). For the ID game we solve the following constrained optimization problem:

$$\max_{P, A} \; (P - 30)^{\alpha}(A - 70)^{1 - \alpha} \quad \text{s.t.} \quad P + A = 130,$$

where $P$ and $A$ represent the principal's and agent's NBS agreement payoffs, 30 and 70 are the respective players' disagreement outcomes, and 130 is the total surplus. Finally, $\alpha$ denotes the principal's relative bargaining power. Since we take an agnostic view, ex ante, on the bargaining power, we note that the above constrained optimization problem with equal bargaining power, NBS(0.5, 0.5), admits a global maximum at the point $P = 45$, $A = 85$.

Similarly, for the MD game, to find the NBS we solve the following constrained optimization problem:

$$\max_{P, A} \; (P - 0)^{\alpha}(A - 70)^{1 - \alpha} \quad \text{s.t.} \quad P + A = 130,$$

where the only difference from the NBS in the ID game is that the principal's minimum guaranteed expected payoff (disagreement outcome) decreases from 30 to 0. The NBS(0.5, 0.5) admits a global maximum at the point $P = 30$, $A = 100$.
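The two Nash Bargaining Solutions above follow from the standard closed form for a fixed total: each player receives her disagreement payoff plus her bargaining-weight share of the remaining surplus. The sketch below, with hypothetical helper names, reproduces the two points and cross-checks them with a coarse grid search over the Nash product; `alpha` stands for the principal's relative bargaining power.

```python
def nash_bargaining(d_principal, d_agent, total, alpha=0.5):
    """Closed-form asymmetric Nash Bargaining Solution for a fixed total."""
    surplus = total - d_principal - d_agent
    return d_principal + alpha * surplus, d_agent + (1 - alpha) * surplus


def grid_search(d_principal, d_agent, total, alpha=0.5, steps=13000):
    """Maximize the Nash product on a grid of feasible splits (sanity check)."""
    best_p, best_value = None, -1.0
    for i in range(steps + 1):
        p = total * i / steps
        a = total - p
        if p < d_principal or a < d_agent:
            continue  # a player would do better by disagreeing
        value = (p - d_principal) ** alpha * (a - d_agent) ** (1 - alpha)
        if value > best_value:
            best_p, best_value = p, value
    return best_p, total - best_p


nbs_id = nash_bargaining(30, 70, 130)  # ID: disagreement payoffs (30, 70)
nbs_md = nash_bargaining(0, 70, 130)   # MD: disagreement payoffs (0, 70)
print(nbs_id, nbs_md)  # prints (45.0, 85.0) (30.0, 100.0)
```

Lowering the principal's disagreement payoff from 30 to 0 doubles the surplus that is split (from 30 to 60 points) but moves the starting point against the principal, so her equal-weight payoff falls from 45 to 30.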
Thus, one can see that these two problems, which are otherwise identical in terms of the (PP)SPE, can predict a significant difference in expected payoffs (up to 50% less for the principal in MD than in ID) in terms of the Nash equilibria with uniform relative bargaining power for the two players. This happens because of the difference in the minimum expected payoff that principals can guarantee themselves in ID and MD.

B.1.4 Calculation of the Nash Bargaining Weights in ID and MD

In this section, we back out the relative Nash bargaining weights of each player from the observed matched expected payoffs in the ID game and the MD game.

ID game:
$$\max_{P_{ID}, A_{ID}} \; (P_{ID} - 30)^{\alpha_{ID}}(A_{ID} - 70)^{1 - \alpha_{ID}} \quad \text{s.t.} \quad P_{ID} + A_{ID} = 130$$

MD game:
$$\max_{P_{MD}, A_{MD}} \; (P_{MD} - 0)^{\alpha_{MD}}(A_{MD} - 70)^{1 - \alpha_{MD}} \quad \text{s.t.} \quad P_{MD} + A_{MD} = 130$$

The $P$ variables denote the principal's expected payoffs and the $A$ variables denote the agent's expected payoffs in the corresponding game. From these constrained maximization problems we obtain the following relationships between the principal's relative bargaining power and the agent's matched average expected payoff in the ID game and the MD game:

$$\alpha_{ID} = \frac{100 - A_{ID}}{30}, \qquad \alpha_{MD} = \frac{130 - A_{MD}}{60}$$

Plugging in the agents' average Matched Expected Payoffs (MEP) obtained from the data (see Section 2.4.3), we obtain the average principals' relative bargaining power in ID and MD:

$$\alpha_{ID} = 0.593 \approx 0.6, \qquad \alpha_{MD} = 0.610 \approx 0.6$$

Interestingly, we observe that, conditional on pairs matching, principals and agents exhibit similar relative bargaining powers across the two games. Taking this result at face value, it may appear that the difference in the absolute values of the Matched Expected Payoffs that we observe can be fully attributed to the difference in the minimum guaranteed outcome that principals can obtain in the two games.
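The calculation above can be reproduced as follows. This is a sketch with a hypothetical helper name; the agents' matched expected payoffs of 82.2 (ID) and 93.4 (MD) are the values reported for matched pairs in Table B.4.

```python
def principal_bargaining_weight(agent_payoff, d_principal, d_agent, total=130):
    """Principal's Nash bargaining weight implied by the agent's matched payoff.

    Inverts A = d_agent + (1 - weight) * surplus,
    where surplus = total - d_principal - d_agent.
    """
    surplus = total - d_principal - d_agent
    return 1 - (agent_payoff - d_agent) / surplus


# Disagreement payoffs: (30, 70) in ID and (0, 70) in MD.
alpha_id = principal_bargaining_weight(82.2, d_principal=30, d_agent=70)
alpha_md = principal_bargaining_weight(93.4, d_principal=0, d_agent=70)
print(round(alpha_id, 3), round(alpha_md, 3))
```

With these inputs the implied weights are about 0.593 in ID and 0.610 in MD, the same figures obtained from (100 - A)/30 and (130 - A)/60.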
B.2 Variables Used in the Regressions

Table B.1: Variables used in the regressions

Variable                 Range       Definition
Average choice           [0, 100]    for each participant and each game (ID or MD), the average choice (X or Y) made in 10 periods
Dictator choice          [0, 100]    for each participant and each game (ID or MD), the choice (X if principal, Y if agent) made in the Dictator tasks in Section 3
Cutoff                   [0, 100]    for each participant and each game (ID or MD), the choice (X if principal, Y if agent) made in the Cutoff tasks in Section 3
D.X                      [0, 100]    the difference in principal's choice X in the current and the previous period
D.Y                      [0, 100]    the difference in agent's choice Y in the current and the previous period
Low-earning principal    0/1         is 1 if the average expected payoff of a principal in a game (ID or MD) is below median of all 54 principals
Generous agent in ID     0/1         is 1 if agent's cutoff estimate in ID (the Cutoff ID task in Section 3) is below median of all 54 agents
Generous agent in MD     0/1         is 1 if agent's cutoff estimate in MD (the Cutoff MD task in Section 3) is below median of all 54 agents
L.B, L.C, L.D, L.E       0/1         is 1 if the feedback state B, C, D, E was observed by a participant in the previous period
L.X in ID                [0, 100]    the principal's choice X in ID observed by an agent in the previous period
L.X in MD                [0, 100]    the principal's choice X in MD observed by an agent in the previous period

B.3 Additional Regressions

Table B.2: Random effects linear regressions of average choices on the Dictator and Cutoff choices in Section 3

Average choice        Principals                  Agents
                      ID            MD            ID            MD
Dictator choice       -0.026        0.073         0.445***      0.215**
                      (0.060)       (0.117)       (0.078)       (0.106)
Cutoff                0.186**       -0.059        0.036         0.161*
                      (0.079)       (0.043)       (0.062)       (0.086)
Constant              49.483***     52.286***     27.776***     33.470***
                      (3.566)       (2.237)       (8.966)       (7.333)
N observations        54            54            54            54
N groups              13            13            13            13

Errors are robust and clustered by the 13 independent groups of participants. Standard errors in parentheses.
Significance levels *, **, *** correspond to p < 0.1, 0.05, 0.01.

Table B.3: Random effects linear regressions of change in choice (X or Y) between periods t and t - 1

Dependent variable: D.X, D.Y. Columns, in order: Principals ID, Principals MD, Agents ID, Agents MD. Each entry is a coefficient followed by its standard error in parentheses; rows with fewer than four entries list only the columns in which the variable enters.

Low-earning principal: -23.896*** (6.842); -8.585 (7.677)
Generous agent in ID: -5.476 (4.606)
Generous agent in MD: 5.697 (3.888)
L.B: 14.305*** (3.210); -0.168 (7.923)
L.C: 17.272*** (3.713); -0.155 (2.696); 19.449*** (6.901); 7.963*** (1.937)
L.D: 27.005*** (7.379); 8.624*** (2.581); -19.214*** (7.250); -7.346* (3.920)
L.E: 29.086*** (5.615); 14.414*** (2.467); -19.083*** (5.397); -9.773** (3.946)
L.B Group: 3.683 (8.380); 2.203 (8.955)
L.C Group: 10.682 (9.272); 1.094 (7.516); 7.393 (10.174); -6.972** (3.492)
L.D Group: 33.073** (13.560); 15.841* (8.753); -11.170 (10.405); -21.582*** (7.437)
L.E Group: 30.477*** (10.125); 9.606 (9.958); 3.309 (7.064); -10.073* (5.789)
Dictator choice: -0.036 (0.031); 0.013 (0.022)
Cutoff: 0.064 (0.050); -0.011 (0.018); -0.018 (0.029); 0.054** (0.023)
L.X in ID: -0.145*** (0.055)
L.X in MD: -0.179*** (0.044)
Constant: -16.873*** (3.864); -3.964** (1.899); 19.435*** (6.145); 11.293** (4.527)
N observations: 486; 486; 486; 486
N groups: 54; 54; 54; 54

Dummy Group: Low-earning principal, Generous agent in ID, or Generous agent in MD. Errors are robust and clustered by participant. *, **, *** correspond to significance levels p < 0.1, 0.05, 0.01.
B.4 Additional Tables

Table B.4: Summary of payoffs of Principals and Agents by game (ID/MD), sample (All/Matched) and payoffs (Actual/Expected)

Section   Measure             Sample           Principals           Agents
                                               Data      Theory     Data      Theory
ID        Expected Payoffs    Matched pairs    47.8      60         82.2      70
                                               (2.73)               (2.73)
MD        Expected Payoffs    Matched pairs    36.6      60         93.4      70
                                               (3.41)               (3.41)
ID        Expected Payoffs    All pairs        25.3      60         76.1      70
                                               (1.7)                (0.82)
MD        Expected Payoffs    All pairs        16.4      60         79.8      70
                                               (1.26)               (0.91)
ID        Actual Payoffs      Matched pairs    47.5      60         81.5      70
                                               (3.67)               (2.97)
MD        Actual Payoffs      Matched pairs    38.1      60         93.3      70
                                               (2.35)               (4.01)
ID        Actual Payoffs      All pairs        25.6      60         76.9      70
                                               (2.36)               (1.8)
MD        Actual Payoffs      All pairs        16.4      60         80.9      70
                                               (1.26)               (2.23)
N of independent observations                  13                   13

Numbers in brackets indicate standard errors.

B.5 Additional Figures

Figure B.3: Histograms of agents' choices in ID and MD. [Two panels, "ID: Histogram of agents' choices" and "MD: Histogram of agents' choices", each showing the percent of choices by Y from 0 to 100.]

B.6 The Role of Risk Preferences

As discussed in Section 2.3, risk preferences play an important role in the equivalence between the principal's expected payoffs in the principal-preferred subgame perfect equilibrium in ID and MD. While the principal's expected payoff, $E_s[\pi^P_{s,c}] = 2p = 60$, in the [PP]SPE of ID is unaffected by either player's risk preferences, the equivalent object in the [PP]SPE of MD will move upwards [downwards] as the agent becomes more risk averse [loving]. The logic is simple and requires no detailed derivation: the MD game involves a sure transfer from the principal to the agent, so the more risk-averse the agent is, the less the principal needs to transfer to induce the agent to accept the transfer. In the ID game, risk preferences play no role since it involves a case of strict Bayesian updating.
To examine our participants' risk preferences, we designed and implemented an innovative test in Section 3, which comprised two tasks:

Cutoff MD - Risk preferences: In this task, all of our participants played the MD game as a Player B (agent) with a minor adjustment. Instead of playing against another participant in the role of Player A (principal), they were told that the computer would choose X (the choice that Player A would have to make) randomly from the available set of options [0, 100]. In doing so we remove any strategic considerations from the game and are able to examine the participants' true and unbiased preferences to choose Y, i.e., whether they required more or less than the risk-neutral level of 40 points.

Figure B.4: Scatterplot of the principals' average Expected Payoffs. [Average Expected Payoffs in ID plotted against average Expected Payoffs in MD, both from 0 to 60, for principals from high-earning and from low-earning independent groups.] Scatterplot of the principals' average Expected Payoffs (54 principals). We divide the 13 independent groups of principals into high-earning and low-earning groups by the median split of average group payoffs. The red dots show principals from the high-earning groups and the blue dots principals from the low-earning groups. It is clear that some principals from high-earning groups earn little and vice versa. Therefore, earnings are not exclusively determined by the group to which a principal belongs, but also depend on her individual decisions.

For the above task to be informative, one has to assume that participants are able to accurately calculate (or have the right perception of) the risk-neutral cutoff point of 40 points. If not, then comparing choices in the above game to 40 to infer risk preferences will be misleading.
For example, if an agent perceives that 50 points is what makes guessing red or blue equally profitable on average and, in the above task, he or she chooses 50, then this participant should be classified as risk-neutral instead of risk-seeking, which would be the classification under the assumption of correct inference of the risk-neutral cutoff point. To overcome this problem, we implemented the following task:

Cutoff MD - Risk neutral: This task was designed to extract our participants' perception of the risk-neutral cutoff point in the MD game. In contrast to the Cutoff MD - Risk preferences task, in this task there is a correct answer (40). All our participants were incentivized to provide their best guess of that number (for detailed instructions see Section B.8.5).

Comparing responses in these two tasks can give us the type and magnitude of risk preferences in our sample. The choices in the two tasks are shown in Table B.5.

Table B.5: Average choices in the Cutoff MD tasks

Task - Game                      Sample   Average choice   N
Cutoff MD - Risk preferences     Full     55.5             108
                                          (2.13)
Cutoff MD - Risk neutral         Full     49.7             108
                                          (2.52)

Numbers in brackets indicate standard errors.

Responses in the two tasks are very close to each other in terms of average choices. A paired two-sided t-test does not reject the hypothesis that the averages are equal. Thus, participants in our sample appear to be, on average, risk neutral in this task. When examining the difference between choices in the two tasks (Cutoff MD - Risk preferences minus Cutoff MD - Risk neutral) at the individual level, we observe that the mode is zero (indicating risk neutrality), observed in 24 of our participants (22.22%). The second and third most commonly observed differences are +20 and -20 (indicating risk seeking and risk aversion respectively), observed in 11 and 8 participants (10.19% and 7.41%).
While overall we find that our participants exhibit all kinds of risk preferences of various magnitudes, on average there appears to be a tendency towards risk seeking that is not statistically significant. As such, we conclude that risk preferences play a negligible role, of unclear type (aversion or seeking), in our experiment, and thus our theoretical equivalence continues to hold even after accounting for risk preferences.

B.7 ID: The Lasting Impact of Feedback State C

In the analysis of the reactions of agents to feedback states in Section 2.4.5, we observed that the largest reaction in ID was to state C, which happens when agents follow the recommendation, guess red, but receive 0 points because the ball is blue. As we discuss in Section 2.5, this reaction is the main difference between ID and MD, since in MD agents are paid for guessing red, which does not create the "betrayal" problem evident in ID. We hypothesize that agents blame principals for "tricking" them into following the recommendation.

Figure B.5: Average choices of agents after experiencing state C in ID. Choices of agents after experiencing state C in ID. The X-axis shows the number of periods that have elapsed since experiencing state C. The Y-axis shows the difference between the average choices of all agents in the periods after experiencing state C and the corresponding average choices of agents in all periods that have not yet experienced state C.

The figure above shows the average increase in choices of agents after experiencing feedback state C for the first time. For each agent we take the difference between the consecutive choices and the choice that was made when state C happened. We observe that agents are influenced by state C so much that they do not decrease their choices even 7 periods after the occurrence of state C, indicating that experiencing this state causes a reaction that is not only extreme but also long-lasting.
B.8 Instructions and Screenshots

All instructions below were translated to Italian since the experiment was run in Trento, Italy. For the Italian version, please contact the authors.

B.8.1 Instructions for the ID Section

B.8.2 Screenshots for the ID section

Figure B.6: Choice screens in the ID section. Player As - principals (“Giocatore A”) and Player Bs - agents (“Giocatore B”); “Giocatore” - “Player”; “Se la pallina e ” - “If the ball is”; “Scegli” - “Choose”; “Suggeersisci Scommetti rosso / blu con 36% di probabilita” - “Recommend guess red / blue with 36% chance”.

Figure B.7: Feedback screens in the ID section. Player As - principals (Left), Player Bs - agents (Right); “Riassunto dei pagamenti” - “Summary of payments”; “Il colore della pallina estratta e stato BLU” - “The color of the extracted ball is blue”; “Il suggerimento Scommetti blu e stato fatto in base alla tue scelta” - “The recommendation guess blue was made according to your choice”; “Il Giocatore A ha scelto il seguente suggerimento” - “Player A’s recommendation is”; “Il Giocatore B ha deciso di SEGUIRE il tuo suggerimento” - “Player B has decided to FOLLOW your recommendation”; “Il suggerimento scommetti blu e stato fatto in baseal suo piano” - “The recommendation was guess blue”; “Hai deciso di SEGUIRE il suo suggerimento” - “You have FOLLOWED this recommendation”; “Il Giocatore B ha scommesso su blu” - “Player B has guessed blue”; “La tua scommessa e su blue” - “You guessed blue”; “In questo round hai guadagnato” - “In this round you have earned”.

B.8.3 Instructions for the MD Section

B.8.4 Screenshots for the MD section

Figure B.8: Choice screens in the MD section. Player As - principals (“Giocatore A”) and Player Bs - agents (“Giocatore B”); “Se Giocatore B scommette rosso” - “If Player B guesses red”. “Per scommettere su rosso” - “To guess red”. “accetto trasferimenti di almeno 43 punti” - “I accept a transfer of at least 43 points”.
“trasferisci 65 punti” - “transfer 65 points”; “Scegli” - “Choose”.

Figure B.9: Feedback screens in the MD section. “Riassunto dei pagamenti” - “Summary of payments”; “Il colore della pallina estratta e stato BLU” - “The color of the extracted ball is blue”; “Hai scelto di transferire 65 puntise la scommessa e sul rosso” - “You have chosen to transfer 65 points to guess red”; “Il Giocatore A ha scelto di trasferire 65 punti se la tue scommessa e su rosso” - “Player A has chosen to transfer 65 points if you guessed red”; “Il Giocatore B ha deciso di ACCETTARE il tuo trasferimento” - “Player B has decided to ACCEPT your transfer”; “Hai deciso di ACCETTARE questo trasferimento” - “You have decided to ACCEPT this transfer”; “Il Giocatore B ha scommesso su rosso” - “Player B has guessed red”; “La tua scommessa e su rosso” - “You guessed red”; “In questo round hai guadagnato” - “In this round you have earned”.

B.8.5 Instructions for Section 3

Dictator ID [Dictator MD]

“You will play two more rounds as Player A[B]. You will play one round of the game in section 1 and one round of the game in section 2. In each round you will be rematched with another participant playing as Participant B[A] and you will once again choose X[Y] as in sections 1 and 2. However, the Player B[A] that you will be matched with will not be allowed to choose Y[X]. Instead their choice of Y[X] will automatically be set to 0 [equal to your choice of Y (i.e. X = Y)]. I.e., Player B will always follow your recommendation or accept your transfer [Player A’s recommendation or transfer will be the smallest possible that you would follow or accept.]”

Cutoff ID

“Suppose a ball would be drawn at random from an urn with 3 RED balls and 7 BLUE balls.
Suppose that recommendations will be generated depending on the ball color as follows:”

Ball color       RED     BLUE
“Guess red”      100%    (100 - X)%
“Guess blue”     0%      X%

“If you receive a “Guess blue” recommendation, then the ball must be blue for sure.”
“If you receive a “Guess red” recommendation, then the ball can be either red or blue.”
“Your task is to indicate what is the number X such that if the recommendation is “Guess red”, the chances of the ball being red are equal to the chances of the ball being blue.”
“Your earnings from this task will be proportional to how close your answer is to the correct answer.”

Cutoff MD - Risk neutral

“Suppose a ball will be drawn at random from an urn with 3 RED balls and 7 BLUE balls. Suppose that you would have to produce a guess for the color of the ball and receive points as follows (This is hypothetical. Your actual earnings from this task will be explained below.):”
“If your guess is correct, you will earn 100 points.”
“If your guess is also “red” you will earn an additional X number of points (regardless of the correctness of your guess and on top of any points you make for the correctness of your guess).”
“Your task is to indicate what is the number X such that whether your guess ends up being “red” or “blue”, you should expect to earn the same amount of points, on average.”
“Your earnings from this task will be proportional to how close your answer is to the correct answer.”

Cutoff MD - Risk preferences

“In this task a ball will be drawn (not hypothetically) at random from an urn with 3 RED balls and 7 BLUE balls. There are two possible guesses for the color of the ball, red or blue.”
“If your guess is correct you will earn 100 points.”
“If your guess is also “red”, you will earn an additional X number of points (regardless of the correctness of your guess and on top of any points you make for the correctness of your guess).”
“X will be generated randomly by the computer to be a number between 0 and 100.
Each number will have the same probability of being selected to be X.”
“Indicate the minimum amount of X that you would accept in order to guess “red”.”
“Your earnings from this task will be calculated according to your guess, the color of the ball and possibly X (if your choice is less than or equal to the randomly chosen X, then you will receive X points for sure in addition to any points for the correctness of your guess).”

Appendix C

Appendix to Chapter 3

C.1 Mathematical Appendix

Proof of Lemma 1:

Step 1) Show that any consumption scheme whose $C_S$ exceeds $R$ is dominated by a consumption scheme whose $C_S$ equals $R$.

Let $N_0$ be the $N$ such that $C_S = R$ in equation (3.6). It can easily be shown that $EU$ is continuous and concave in $N$ on $N \ge N_0$. It then suffices to show that $\left.\frac{\partial EU}{\partial N}\right|_{N_0} < 0$ in the original problem. Namely, if $\left.\frac{\partial EU}{\partial N}\right|_{N_0} < 0$, then by continuity and concavity of $EU$, $\left.\frac{\partial EU}{\partial N}\right|_{N} < 0$ for all $N \ge N_0$, and (again by continuity and concavity of $EU$) any consumption scheme with $N > N_0$ is dominated by the consumption scheme with $N = N_0$.

To prove this sufficient condition, first write down $EU_{N \ge N_0}$ under equations (3.5) - (3.7):

$$EU_{N \ge N_0} = (1-p)\,U(C_F) + p\,U(C_S) = (1-p)\,U(C_0 - N) + p\,U\big(C_0 + N(M-1)\big) = (1-p)\,U(C_0 - N) + p\,U\Big(C_0 + N\,\frac{1-p}{p}\Big).$$

Then differentiate $EU_{N \ge N_0}$ with respect to $N$ at $N_0$, $p^*$:

$$\left.\frac{\partial EU}{\partial N}\right|_{N_0,\,p^*} = \left[-(1-p^*)\,U'(C_0 - N) + p^*\,\frac{1-p^*}{p^*}\,U'\Big(C_0 + N\,\frac{1-p^*}{p^*}\Big)\right]_{N_0} = (1-p^*)\big[U'(C_S) - U'(C_F)\big]_{N_0} = (1-p^*)\big(U'(R) - U'(C_F)\big) < 0,$$

where the last inequality follows from the domain of $p$ and assumption (3.9).

Step 2) Show that any consumption scheme whose $C_S$ falls below $R$ is dominated by a consumption scheme whose $C_S$ equals $R$.

When $C_S < R$, $U(c) = u(c)$. Then by our standard assumptions on $u(c)$ and Jensen’s inequality, we know that $p = 0$ (no trade) dominates all other such consumption schemes, and $p = 0$ is in turn (weakly) dominated by $p^*$ by definition, because $p = 0$ is in the choice set.

Proof of Lemma 2:

Since $p \in (0, 1)$, $C_F = \frac{C_0 - pR}{1-p}$, and $C_F - C_0 = \frac{p\,(C_0 - R)}{1-p}$. Also by Lemma 1, $C_S = R$.
Some calculations yield: 2 = p (1p) (RC 0 ) 2 ; (C.1) and similarly, E[(CC 0 ) 3 ] = p(1 2p) (1p) 2 (RC 0 ) 3 : (C.2) )S(p) = 12p p p(1p) and S 0 (p) = 1 2p(1p) 3=2 < 0. Proof of Theorem 1: The optimization problem characterized by (3.10) is specialized to power utility: 194 max p EU(p) (C.3) where EU =(1p) C 1 F 1 1 +p R 1 1 1 (C.4) and C 0 =pR + (1p)C F : (C.5) Let := C 0 R . The associated First Order Condition @EU @p gives: F (p;) : = 1 ( p 1p ) R 1 ( 1 +p 1p ) + R 1 + 1 1 (C.6) = 0 (C.7) Checking the Second Order Condition, @F (p;) @p = 1 R 1 ( p 1p ) 1 2 (1) 2 (1p) 3 < 0 (C.8) Hence, the EU-optimization problem amounts to finding the p which satisfies F (p ;) = 0 (C.9) 195 Proof of part (i): By Implicit Function Theorem, @p @ = @F @ @F @p (C.10) Partial differentiation yields: @F @ = (1 )(1p) R 1 ( p 1p ) 1 ( 1)(1) 1p and @F @p = (1 )(1p) R 1 ( p 1p ) 1 (1 )(1) 2 (1p) 2 Hence, @p @ = 1p 1 > 0 (C.11) Proof of part (ii) Suppose not. This means that on equation (C.6), p does not converge to 1 as " 1. This in turn allows us to conclude that: F (p ;)! 1 R 1 + R 1 + 1 1 ; (C.12) as " 1 196 Since F (p ;) = 0, so is the limit. (Equals zero.) This implies: 1 R 1 + R 1 + 1 1 = 0: (C.13) Rearranging, this becomes: R 1 = 1, a contradiction under our assumptionsR> 1 and > 1. Proof of part (iii) Let g() : =F (0;) = 1 R 1 ( 1 ) + R 1 + 1 1 : To show that9( C 0 R ) 2 (0; 1) such that p = 0, we need to show that g() has a root in (0,1). First, note that g(1) = 1 1 (R 1 1)> 0 Then, by Intermediate Value Theorem (IVT), it suffices to show that: 9C ;R 2 (0; 1), such that g(C)< 0. (*) Proof of (*) 197 Rearranging g(), we get: g() = R 1 (1 1 ) + R 1 + 1 1 but, (1 1 )> !1 as # 0 + Plugging this back into (22), this implies g()!1 as # 0 + Hence, for any N2R,9C2 (0; 1) such that g(C)<N, which proves (*). Proof of Theorem 2: Recall that (i) @F (p;) @p < 0 (from SOC), and (ii) C F C 0 = p 1p (C 0 R). Clearly, from (i), the Expected Utility is maximized at p (0). 
Note that from (ii), choosing p =0 is equivalent to choosing C 0 , namely not choosing any gamble. This is certainly an available option for the agent. Hence, we want to show that the agent chooses p=0 over p2 (0; 1). To do this, we need to show EU(p )EU(0)>EU(p) (C.14) where p 0<p (C.15) 198 By Mean Value Theorem (MVT),9c2 (0;p) such that EU(p)EU(0) =EU 0 (c)(p 0)< 0 (C.16) This follows from the fact thatEU 0 (c)< 0, which can easily be shown by applying MVT again. Proof of Theorem 3: Using Implicit Function Theorem, dp d = @F @ @F @p : Note that F (p ;) = 0 () @EU @p = 0 () @(1p) C 1 F 1 1 +p R 1 1 1 @p = 0 () @(1p) C 1 F 1 1 @p + R 1 1 1 = 0 Hence, @F @ = @(1p) C 1 F 1 1 @p = R 1 1 1 1 < 0: Also, from before, @F @p = (1 )(1p) R 1 ( p 1p ) 1 (1 )(1) 2 (1p) 2 < 0 It thus follows that dp d < 0. Proof of Theorem 4: Using Implicit Function Theorem, @p @C 0 j ; ; = @F @C 0 @F @p : 199 Note that @F @C 0 = R (1) (1p) 2 ( p 1p ) 1 > 0: Also, from before, @F @p = (1 )(1p) R 1 ( p 1p ) 1 (1 )(1) 2 (1p) 2 < 0 It thus follows that dp dC 0 j ; ; > 0. Proof of Theorem 5: A direct algebraic proof is not amenable. We first suggest a sufficient condition (actually, an equivalence condition) and then use this to prove the theorem. Claim 1. For a given (and of course, under the fixedC 0 and as assumed in the statement of the theorem), letR 0 ( ) denote the R which leads top = 0. (Recall from Theorem 1, we know !9R 0 ( ).) It suffices to show that @R 0 ( ) @ < 0 Proof. Let i < j , and denote the optimal solutions pertaining to i and j (as functions of ) as p i () and p j (). Note that p () is well-defined as a function of because we have fixed C 0 . In fact, in this setting we can treat p () as a continuously differentiable function, as a direct conse- quence of the Implicit Function Theorem. Note also that by (C.11), p i () and p j () can never intersect. To intersect at, say, point 0 , there must exist a neighborhood of 0 upon whichj @p i () @ j always exceedsj @p j () @ j. 
However, (C.11) prohibits this (i.e. plug in 0 into (C.11), and for any > 0 ,j @p i () @ j<j @p j () @ j whenever p i ()>p j () and vice versa for < 0 ), asserting our claim that p i () and p j () can never intersect. Next, suppose that @R 0 ( ) @ < 0 holds. Since i < j , this implies R 0 ( i )>R 0 ( j ). Using what we know about p () from Theorem 1, we can deduce that p i C 0 R 0 ( j ) > 0 =p j C 0 R 0 ( j ) ; 200 where the inequality follows from combining Theorem 1-(i) (p is monotonically increasing in and approaches 1 from the left) and the fact that 0 =p i C 0 R 0 ( i ) , by definition. Similarly, the equality follows from definition ofR 0 ( ). But since we established thatp i () andp j () can never intersect, this inequality at = C 0 R 0 ( j ) must in fact hold uniformly in all , namely, p i ()>p j () whenever i < j . Therefore, dp d < 0, as desired. Claim 2. @R 0 ( ) @ < 0. Proof. We first specialize (C.6) by insisting p = 0, as per the definition of R 0 : G() =F (0;) =C 0 R 1 C 1 0 + R 1 + 1 1 = 0 By Implicit Function Theorem, @R @ = @G @ @G @R = ( R C 0 ) 1 ( 1 C 0 ) | {z } A<0 ( R C 0 ) [ ( R C 0 ) log R C 0 ( R C 0 )1 ] +O( 1 R ) ( R C 0 ) | {z } B>0 (C.17) whereO( 1 R ) := (1)C 1 0 logC 0 R C 0 1 > 0, a positive quantity that converges to 0 at the rate of 1 R . Under the current assumptions, R 1>C 0 1> 0, and 1 < 0, thus A = ( R C 0 )1 1 C 0 < 0. To sign B, note that ( R C 0 ) log R C 0 ( R C 0 )1 > 1 for all R C 0 > 1, andO( 1 R )> 0, so if ( R C 0 ) [ ( R C 0 ) log R C 0 ( R C 0 )1 ] +O( 1 R )<, this implies ( R C 0 ) <, and B > 0. Therefore, under the given assumption, @R @ < 0. Combining Claim 1 and Claim 2, we arrive at the desired conclusion. Proof of Theorem 6: Note that by definition, R 1 < R 1 automatically implies ( 1 , R 1 , 2 , R 2 ; C 0 )2 H c . It remains to prove that when R 1 R 1 , ( 1 , R 1 , 2 , R 2 ; C 0 )2 H. The proof is constructed as follows. First, (Lemmas 4 - 5) we state some miscellaneous facts which we will use along the way. 
Second, we introduce a ‘discriminant’ that will help us tell H and H c apart, and 201 derive some of its properties (Lemmas 6-10). Third, we will use these to draw conclusions on how parameters should behave to be in either H or H c . Lemma 4. Let C F (p ) denote the C F when the agent’s optimal solution is implemented; i.e., that which satisfies (3) at p . Then, @C F (p ) @ > 0 Proof. C F (p ) = C 0 p R 1p . By chain rule, @C F @ = @C F @p @p @ = C 0 R (1p ) 2 @p @ > 0, as product of two negatives. Lemma 5. When C F >C F (p*), @EU @C F < 0 Proof. First, note that @C F @p = C 0 R (1p) 2 < 0, hence C F is a bijection. It follows that C F (p) > C F (p ) () p<p , and from proof of theorem 1, we know that @EU @p = @EU @C F @C F @p > 0 on p<p . ) @EU @C F < 0 as desired. Now considerM (s), and we define the following objects on the EU maximization problem. g(c) will be the discriminant. . l 2 (c) :=u(R 2 ) + u(R 2 ) 2 u(C F ) R 2 C F (cR 2 ); (C.18) g(c) = 1 u(x)l 2 (c)2C 2 (C.19) The following lemma tells us what l 2 (c) is. Lemma 6. Consider s2 S and its associated ~ s. l 2 (c) is the tangent line to 2 u(c) on M (~ s) which passes through the point (R 2 ;u(R 2 )). Moreover, the point of tangency is unique. 202 Figure C.1: Example where l 2 (c) is tangent to U(c) at C F 0.5 1 1.5 2 2.5 3 3.5 −2 −1.5 −1 −0.5 0 0.5 1 Proof. That it passes through (R 2 ;u(R 2 )) is clear. To prove tangency, consider the single kinked EU maximization problem and the FOC condition. F (p;) = @EU @p =u(R 2 ) 2 u(C F ) + 1 1p 2 u 0 (C F ) (C 0 R 2 ) = 0 (C.20) () 2 u 0 (C F ) = (1p )(u(R 2 ) 2 u(C F )) R 2 C 0 = (u(R 2 ) 2 u(C F )) R 2 C F : (C.21) (C.22) Since RHS = (u(R 2 ) 2 u(C F )) R 2 C F is the slope of l 2 (c) and u is concave, l 2 (c) is tangent to 2 u(c) and the point of tangency isc =C F . Finally, uniqueness of tangency follows from concavity of u(c). Lemma 7. Let u(c) be the power utility function. Then,9^ c2R + where g 0 (^ c) = 0. 
Moreover, g(^ c) is a local and global (hence unique) maximum () g 0 (^ c) = 0. 203 Proof. Clearly,u 0 (c)> 0 for all c, hence by lemma 5, (u(R 2 ) 2 u(C F )) R 2 C F > 0. If u(c) is the power utility function, then u 0 (c) =c which tends to +1 at 0 + and converges to 0 as c!1. Hence by IVT, 9^ c2R + where g 0 (^ c) = 0. The second part (local and global maximality) is a general property of strictly concave functions. Lemma 8. ConsiderM ( 1 ;R;C 0 ) andM ( 2 ;R;C 0 ) with 1 > 2 and define l i (c), i = 1,2 con- formably as before. Then l 1 (c)>l 2 (c);8c2 (0;R). Proof. ByLemma3, @C F () @ > 0,henceC F ( 1 )>C F ( 2 ). Sinceu’(c)isdecreasinginc,u 0 (C F ( 1 ))< u 0 (C F ( 2 )). By Lemma 5, i u 0 (C F ( i )) = u(R) i u(C F ( i )) RC F ( i ) , i = 1; 2, and therefore u(R) 1 u(C F ( 1 )) RC F ( 1 ) < u(R) 2 u(C F ( 2 )) RC F ( 2 ) . Note also that l 1 (R) =l 2 (R) )l 1 (c)>l 2 (c);8c2 (0;R). Lemma 9. Let u(c) be the power utility function. Then the following are equivalent. g 1 and g 2 are two distinct roots of g(c) in (0;1) () g(^ c) > 0, where g 0 (^ c) = 0 and ^ c in (g 1 ;g 2 ) () max c2(g 1 ;g 2 ) g(c)> 0 Proof. We prove the first equivalence. The second equivalence is a direct corollary from lemma 6. ())9 2 roots in (0;1) implies (by MVT) that9^ c in (g 1 ;g 2 ) such thatg 0 (^ c) = 0, whence by lemma 6g(^ c) is a global maximum in (0;1). If the global maximum,g(^ c); 0, then9 0 or 1 root in (0;1), a contradiction. (() Suppose g(^ c) > 0, where g 0 (^ c) = 0 and ^ c2 (0;1), then by concavity of g(c), we can pick 0 < c < ^ c < c such that g 0 (c) < 0 < g 0 (c). Taylor expanding around c, c, (and using the fact that u(c) is a power utility function which tends to1 as c# 0 + ), we can show that lim c!0 + g(c) = lim c!+1 g(c) =1. ) by IVT,9 at least two distinct roots in (0;1). Lemma 10. g(c) has at most two distinct roots in (0;1). Proof. Suppose not. 
Then by lemma 8, there are at least two distinct local maxima> 0, which contradicts lemma 6, in particular, the uniqueness of local maximum. With these, we now construct the main body of the proof. By Lemma 9, we know g(c) has 0, 1, or 2 distinct roots. Let S 0 denote the subset of S such that g(c) has 0 root. Let S 1 denote the 204 subset of S such that g(c) has 1 root. Let S 2 denote the subset of S such that g(c) has 2 roots. Clearly, S 0 ;S 1 ;S 2 partition S; S 0 _ [S 1 _ [S 2 =S. Figure C.2: Examples where g(c) has 2 roots and g(c) has no root. 0.5 1 1.5 2 2.5 3 3.5 −2 −1.5 −1 −0.5 0 0.5 1 0.5 1 1.5 2 2.5 −2 −1.5 −1 −0.5 0 0.5 1 g(c) has 2 roots [Left] and g(c) has no root [Right] - Many of these are ruled out by Lemma 1 Claim 3. S 0 H Proof. Given anys2S and its associated ~ s, letp and ~ p denote the optimal solution toM (s) and M (~ s). LetC F (p ) andC F (~ p ) denote theC F ’s whenp and ~ p are implemented. LetEU (M (s)) and EU (M (~ s)) denote the maximized EU when p and ~ p are implemented. For a given s2 S, C F (p ) can either be higher or lower than R 1 . We look at the two cases. Step 1) We first look atfs :C F (p )<R 1 g and showfs :C F (p )<R 1 g\S 0 H. Here, EU (M (s)) =p U(R 2 ) + (1p )U(C F (p )) =p u(R 2 ) + (1p ) 2 u(C F (p )); 205 because C F (p )<R 1 . On the other hand, EU (M (~ s)) = ~ p U(R 2 ) + (1 ~ p )U(C F (~ p )) = ~ p u(R 2 ) + (1 ~ p ) 2 u(C F (~ p )) Since ~ p istheuniquesolutionto max p2(0;1) EU(M (~ s)), wemusthavep = ~ p , andconsequently,C F (p ) = C F (~ p ) and EU (M (s) = EU (M (~ s)), i.e.,fs :C F (p )<R 1 g\S 0 H. Step 2)fs :C F (p )R 1 g\S 0 is empty; i.e., C F (p )R 1 never happens in S 0 . Suppose C F (p )R 1 . Since on S 0 , g(c) has 0 root, g(c)< 0 on the entire domain. 
Therefore, l(c)> 1 u(c) and in particular, l(C F (p ))> 1 u(C F (p )): (C.23) Also, l(R 2 ) =u(R 2 ) (C.24) by definition, so taking linear combinations of the two sides, and recalling that we are assuming to be infs :C F (p )R 1 g so that 1 u(C F (p )) =U(C F (p )), (1p )l(C F (p ) +p l(R 2 )> (1p ) 1 u(C F (p )) +p u(R 2 ) =:EU (M (s)): (C.25) Note thatl() is a linear operator in its argument, and plug inC 0 = ~ p R 2 + (1 ~ p )C F (~ p ) to get LHS =l(C 0 ) = ~ p u(R 2 ) + (1 ~ p ) 2 u(C F (~ p )) =:EU (M (~ s)): (C.26) 206 Thus, EU (M (s)<EU (M (~ s)), so p is never chosen, a contradiction to global optimality of p . Hence C F (p )R 1 never happens in S 0 . Claim 4. S 1 H Proof. Pick any s2 S 1 . First, note that l 1 (c) = l 2 (c) because g(c) is concave and unique root defines the tangency, hence the tangent line for on C F ( 1 ) and C F ( 2 ) is a common line by con- struction ofl(c). Then, by same logic as in Claim 1, we can show thatEU (M (s)) =EU (M (~ s)). Modulo the (innocuous) assumption that the agent chooses ~ p = p when indifferent, s and ~ s are indistinguishable, hence S 1 H as desired. Claim 5. S 2 =R + _ [R where R + H and R H c Proof. Let R<R be the two roots of g(c). Let R + :=fs2S 2 :R 1 Rg and let R :=fs2S 2 : R 1 <Rg. Clearly, S 2 =R + _ [R . Step 1) R + H It suffices to show that U(c) l 2 (c) on all c2 (0;R 2 ). Then, we can use the same argument as in Claim 1 to show that EU (M (s) = EU (M (~ s)). (Namely, we partition C F (p ) into (0;R 1 ) vs [R 1 ;R 2 ) and argue that on (0;R 1 ), s2H and the sub-case for [R 1 ;R 2 ) is empty.) Whenc2 (0;R 1 ),U(c)l 2 (c)certainlyholdssinceU(c) = 2 u(c) 1 (0;R 1 ] andl 2 (c)islinetan- gent to 2 u(c) 1 (0;R 1 ] from above. Whenc2 [R 1 ;R 2 ), by lemma 6,9^ c2 (R;R) such thatg 0 (^ c) = 0 and g 0 (c) < 0 everywhere on c2 (^ c;1). ) g(c) = 1 u(c)l 2 (c) < 0 for all c2 (R;1), in par- ticular for allc>R 1 R, since we are inR + . Hence, 1 u(c)<l 2 (c) for allc2 (R 1 ;R 2 ) as desired. 
Step 2) R H c This assertion will be proved in Theorem 7 (Lemma 10). 207 Summing up, we know that S 0 _ [S 1 _ [R + = H and R = H c . Recall that R 1 = inffR 1 : ( 1 ;R 1 ; 2 ;R 2 ;C 0 )2 Hg. For any s2 S 0 _ [S 1 _ [R + , R 1 = 1, because for all R 1 ;s2 H and R 1 is defined to satisfy R 1 > 1. On the other hand by construction of the proof of Claim 3, for any s2R =H c ,R 1 =R. Moreover, for anyR 1 R 1 , the corresponding s is inH. Therefore,R 1 =R is the demarcation point between H and H c . Finally, suppose we consider the ex-R 1 4-tuple of any s2H C . Then R 1 > 1. As 1 " 1, the R of the associated g(c) monotonically increases toR 2 , thus proving the claim thatR 1 "R 2 as 1 " 1. . Proof of Theorem 7: Let s2 R and let C F (p ) be the the C F associated toM (s). Let C F ( 1 ) be theC F associated to the single-kink utility maximization problem ( 1 ;R 2 ;C 0 ). Similarly, let C F ( 2 ) be the C F associated to associated single-kink EU problem M (~ s). Note that C F (p ) is in either one of the two intervals: (0;R 1 ) or [R 1 ;R 2 ). We assume C F (p )2 [R1;R2) and later verify this. Assuming C F (p )2 [R1;R2), consider two sub-cases. Case 1) C F ( 1 )R 1 Because we assumed C F (p )2 [R1;R2), EU (M (s)) =p U(R 2 ) + (1p )U(C F (p )) =p u(R 2 ) + (1p ) 1 u(C F (p )): Because we assumed C F ( 1 )R 1 , C F ( 1 )R 1 C F (p ) (C.27) Recall by Lemma 4, @EU @C F < 0 on all C F ( 1 )<C F , hence the 0 best 0 C F (p ) is the smallest C F that respects C F ( 1 )<R 1 C F (p ). 208 )C F (p ) =R 1 : (C.28) Case 2) C F ( 1 )>R 1 In this case, we argue exactly as in the proof of Theorem 6, Claim 1, Step 1 to arrive at: C F (p ) =C F ( 1 ): (C.29) Pulling these two cases together, C F (p ) =max(R 1 ;C F ( 1 )) (C.30) We now justify our assumption C F (p )2 [R 1 ;R 2 ). Claim 6. C F (p )2 [R 1 ;R 2 ) Proof. Suppose C F (p )2 (0;R 1 ). 
Then, by same logic as Theorem 6, Claim 1, Step 1 C F (p ) =C F ( 2 ) (C.31) Pulling these together, C F (p ) = 8 > > < > > : C F ( 2 ); ifC F (p )2 (0;R 1 ) max(R 1 ;C F ( 1 )); ifC F (p )2 [R 1 ;R 2 ) Therefore, it suffices to show that agents always choose max(R 1 ;C F ( 1 )) overC F ( 2 ). We proceed (as before) in two cases. Case 1) C F ( 1 )R 1 Here, max(R 1 ;C F ( 1 )) =R 1 , so the task is to compare C F ( 2 ) and R 1 . Consider g(c) and its two roots R and R. By Lemma 6,9~ c2 (R;R) such that g 0 (~ c) = 0. 209 1 u 0 (~ c) =l 0 2 (c) (C.32) On the other hand, by Lemma 5, 1 u 0 (C F ( 1 )) =l 0 1 (c): (C.33) Also, (as in the proof of Lemma 7) it is easy to see that l 0 1 (c)<l 0 2 (c) (C.34) Putting these together, 1 u 0 (C F ( 1 )) =l 0 1 (c)<l 0 2 (c) = 1 u 0 (~ c): (C.35) which means C F ( 1 )> ~ c. Combine this with the assumptions C F ( 1 )R 1 and R 1 <R (because we are in R ), we get: ~ c<C F ( 1 )R 1 <R (C.36) Since g(c) is concave in c, and g 0 (~ c) = 0, g 0 (c)< 0 on (~ c;R 2 ). This, (49), and Lemma 8 tell us g(c)> 0 on (~ c;R), in particular, g(R 1 )> 0. Namely, 1 u(R 1 )>l 2 (R 1 ) (C.37) Finally, since C F (p ) =R 1 , p must satisfy (1p )C F (p ) +p R 2 = (1p )R 1 +p R 2 =C 0 (C.38) We compare 210 EU(R 1 ) = (1p )U(R 1 ) +p U(R 2 ) = (1p ) 1 u(R 1 ) +p u(R 2 ) and EU(C F ( 2 )) =l 2 (C 0 ) =l 2 ((1p )R 1 +p R 2 ) = (1p )l 2 (R 1 ) +p u(R 2 ): Using (50), EU(C F (p ))>EU(C F ( 2 )) as desired. Case 2) C F ( 1 )>R 1 Task here is to compare EU under C F ( 1 ) vs C F ( 2 ). By Lemma 7, l 1 (c) > l 2 (c), and in particular for c =C 0 . Therefore, EU(C F ( 1 ))>EU(C F ( 2 )) as desired. Therefore, in each of the cases, indeed, C F (p )2 [R1;R2) as we assumed. We can now use (46) without any qualifications. This immediately ties up a loose end we left in Theorem 6 (R H c ), which we state as a Lemma. Lemma 11. R H c . Proof. It suffices to show that C F (p )6= C F ( 2 ). Recall that 2 < 1 , and hence by Lemma 3, C F ( 2 )<C F ( 1 ). 
)C F ( 2 )<C F ( 1 )max(R 1 ;C F ( 1 )) =C F (p ), as desired. We now use (46) to wrap up the proof. Consider the two cases: 211 Case 1) C F ( 1 )R 1 By (43), C F (p ) =R 1 . This implies p = C 0 R 1 R 2 R 1 =, whence (i), (ii), (iii) follow directly. Case 2) C F ( 1 )>R 1 By (43), C F (p ) = C F ( 1 ). By same logic as in Theorem 6, Claim 1, Step 1, we can think of M (s) asM ( 1 ;R 2 ;C 0 ), which is a setting where Theorem 1 - Theorem 5 apply. (i) follows from Theorem 5 and the chain rule. (ii) and (iii) follow from Theorem 1 and the chain rule. . ProofofTheorem8: Withoutlossofgenerality, supposethatC <R alwaysholds. (Assuming otherwise only strictly increases EU whence the same analysis can be used for the proof.) Consider a hypothetical sub-martingale L(p) with p = 1 and > 0. Namely, L(1) is a bet that gives with probability 1, and gives -1 with probability 0. Then, for any N > 0;EU(1;N ) =U(C S ) = U(C 0 +N)>U(C 0 ), with strict inequality. Since EU is continuous in p,9p 2 (0; 1) such that EU(p ;N )>U(C 0 ). Proof of Theorem 9: For the given problem, letp be the optimal solution when = 0; ceteris paribus. We want to show that any candidate p SM with p SM < p (1 +) is strictly dominated by p (1 +). Under the assumption, we know that C S = R, so we will assume this along the way. We first establish some notations and gather related facts. LetL(C;C F ;R;U()) be the line (as function of C) that passes through the points (C F ;U(C F )) and (R;U(R)), whereU() is the single-jump aspirational utility function we have defined and used thus far. 212 For the given , C 0 , R, and for any choice of p SM , let C 0 F (p SM ;;C 0 ;R) be the C F which satisfies the budget constraint-cum-; equation (3.17). 
Explicitly it is given as: C 0 F (p SM ;;C 0 ;R) := C 0 (1 +) 1 +p p 1p + R: Let C POE (p SM ;;C 0 ;R) be the ‘point of evaluation’, defined as: C POE (p SM ;;C 0 ;R) :=(1p SM )C F (p SM; ) +p SM R =C 0 (1 +) p SM 1 p SM (1 +) + 1 p SM 1 p SM (1 +) p SM R C 0 (1 +) : The reason for this nomenclature will become evident. Given these definitions and notations, some immediate facts can now be observed. Fact 1 First, under the given problem, the expected utility (EU) given the choice p SM is given byL(C POE (p SM )), with the parameter notations suitably suppressed. This is an immediate conse- quence of the definition ofL(C) andC POE and the definition of expected utility under our binomial setting. This observation is where the name ‘point of evaluation’ comes from; the expected utility can be regarded as L(C) evaluated at the ‘point of evaluation’, C POE . Needless to say, since U() in monotone in its argument and C F <R by definition,L(C) has positive slope. Fact2Second, wealreadyknowthatC F (p )isthetangentpointtoU(), soinsofarasp SM 6=p , L(C;C F (p SM ;> 0);R;U()) will be strictly lower than L(C;C F (p ; = 0);R;U()) Fact 3 Third, some straightforward calculations inform us that L(C;C F (p (1 +);> 0);R;U()) =L(C;C F (p ; = 0);R;U()): 213 That is, if changes from 0 to some strictly positive value, inducing a change in the associated L(C), we can revert back to the original ( = 0) line simply by increasing the choice of p SM by a factor of (1 +): Fact 4 Fourth, from the definition of C POE above and by differentiation, it is not hard to sign the quantity: @C POE @p SM > 0; again, with suitable abuse of notation. We now have enough mustered up to prove the theorem. The task is to show that given the setup with > 0, any choice of p SM such that p SM < p (1 +) is strictly dominated by an alternative, feasible choice of choosing p (1 +), which is, by definition, greater than p . 
From the second and third fact, we know that: L(C;C F (p SM ;> 0);R;U())<L(C;C F (p ; = 0);R;U()) =L(C;C F (p (1 +);> 0);R;U()) From the first fact, we know that EU can be evaluated from the lines L(C) at suitable C POE ’s. Recall that L(c) have positive slopes. Hence, the only case where EU(p SM )>EU(p (1 +)) can ever happen with p SM <p (1 +) is when: C POE (p (1 +);;C 0 ;R)<C POE (p SM ;;C 0 ;R); for some p SM <p (1 +). But this contradicts the fourth fact, proving our claim. Proof of Lemma 3: Step 1) Show that any consumption scheme whose C S exceeds R is dominated by a consump- tion scheme whose C S equals R. 214 Let N 0 be the N such that C S = R on equation (3.6). Then as in the proof of Lemma 1, EU NN 0 =(1p)U(C F ) +pU(C S ) =(1p)U(C 0 N) +pU(C 0 +N(M 1)) Then differentiate EU NN 0 with respect to N at N 0 , p : @EU @N N 0 ;p =(1p )U 0 (C 0 N) + (M 1)p U 0 (C 0 +N(M 1)) N 0 = (1p ) (1 + (1p ) )U 0 (C S )U 0 (C F ) N 0 = (1p ) (1 + (1p ) )U 0 (R)U 0 (C F ) < 0; where the last inequality follows from the domain of p, and assumption (3.18). Step 2) Showing that C S <R never happens is same as in Lemma 1. Theorem 10: We proceed in two steps. In Step 1, we show that the optimal consumption positions are identical to the those of the optimized binomial scheme. In Step 2, we show that the probability mass on C 0 must be 0 for optimality. Step 1: Consider the standard solution to the EU-maximization problem using binomial con- sumption scheme. Let C F and C S be optimal C F and C S in this binomial solution. LetT with 215 P = (p 1 ;p 2 ;p 3 ) andC = (C S ;C 0 ;C F ) be the optimal trinomial consumption scheme toM (;R;C 0 ). ThenC = (C S ;C 0 ;C F ): Proof. (Given any p 2 , 0 p 2 < 1.) The proof goes by comparing the objective functions of the binomial and trinomial optimization. 
The trinomial optimization problem is:

$$\max_{p_1, p_2, p_3} \; p_3\,U(C_F) + p_2\,U(C_0) + p_1\,U(C_S),$$

subject to $p_3 C_F + p_2 C_0 + p_1 C_S = C_0$ and $p_1 + p_2 + p_3 = 1$, or equivalently,

$$\max_{p_1, p_3} \; U(C_0) + p_1\big[U(C_S) - U(C_0)\big] + p_3\big[U(C_F) - U(C_0)\big], \tag{C.39}$$

subject to

$$p_1 (C_S - C_0) + p_3 (C_F - C_0) = 0 \quad \text{and} \quad p_1 + p_3 = 1 - p_2. \tag{C.40}$$

Recall that the objective function for the binomial scheme was:

$$\max_{p} \; U(C_0) + p\big[U(C_S) - U(C_0)\big] + (1-p)\big[U(C_F) - U(C_0)\big], \tag{C.41}$$

subject to

$$p\,(C_S - C_0) + (1-p)(C_F - C_0) = 0. \tag{C.42}$$

Scaling (C.39) - (C.40) by $\frac{1}{1-p_2}$ yields a monotone affine transformation (affine in the choice variables) of (C.41) and an identical constraint, hence an identical solution, up to the $C$ positions. ($P$ cannot be pinned down yet, because it is determined only up to scaling.)

Step 2: $p_2 = 0$.

Proof. By an argument similar to the proofs of Lemma 5 - Lemma 7, we know that:

$$(1-\lambda)\,U(C_F) + \lambda\,U(R) > U(C_0), \quad \text{for } 0 < \lambda < 1.$$

On the other hand, some manipulation of (C.39) yields

$$EU = (1-p_2)\left[\frac{1 - p_2 - p_1}{1-p_2}\,U(C_F) + \frac{p_1}{1-p_2}\,U(R)\right] + p_2\,U(C_0).$$

Hence, to maximize $EU$, we require $p_2 = 0$.

Proof of Theorem 11: This is really a corollary of Theorem 10. By Theorem 10, we can reduce the solution space $\mathcal{T}$ to the space of $\mathcal{B}$. Then by Lemma 1, the $C_S$ of the associated $\mathcal{B}$ must equal $R$.

C.2 Appendix to the experiment

C.2.1 Lotteries used in study 1

Table C.1: Description of lotteries used in study 1

Round     Option  Mean  SD  Skewness  Outcomes   Pr.
1 (5*)    A       7     2   0         {5, 9}     (0.5, 0.5)
1 (5*)    B       7     2   1.5       {6, 11}    (0.8, 0.2)
2* (7)    A       13    3   0         {10, 16}   (0.5, 0.5)
2* (7)    B       13    3   -2.7      {4, 14}    (0.1, 0.9)
3 (2*)    A       11    3   0         {8, 14}    (0.5, 0.5)
3 (2*)    B       11    3   -2.7      {2, 12}    (0.1, 0.9)
4 (9*)    A       11    3   0         {8, 14}    (0.5, 0.5)
4 (9*)    B       11    3   2.7       {10, 20}   (0.9, 0.1)
5* (1)    A       7     2   0         {5, 9}     (0.5, 0.5)
5* (1)    B       7     2   -1.5      {3, 8}     (0.2, 0.8)
6* (10)   A       19    8   0         {11, 27}   (0.5, 0.5)
6* (10)   B       19    8   1.5       {15, 35}   (0.8, 0.2)
7* (4)    A       13    3   0         {10, 16}   (0.5, 0.5)
7* (4)    B       13    3   2.7       {12, 22}   (0.9, 0.1)
8* (3)    A       19    8   0         {11, 27}   (0.5, 0.5)
8* (3)    B       19    8   -1.5      {3, 23}    (0.2, 0.8)
9 (6*)    A       17    3   0         {14, 20}   (0.5, 0.5)
9 (6*)    B       17    3   2.7       {16, 26}   (0.9, 0.1)
10 (8*)   A       17    3   0         {14, 20}   (0.5, 0.5)
10 (8*)   B       17    3   -2.7      {8, 18}    (0.1, 0.9)

The numbers in brackets indicate the corresponding round in section 3 of the experiment; the non-bracketed numbers indicate the round in section 1. The starred rounds indicate which section involved the control pair (i.e. when the donation threshold was impossible to reach). Option A was always the non-skewed option and Option B was always the skewed option. Participants did not see the moments of the lotteries.

C.2.2 Lotteries used in study 2

Table C.2: Description of lotteries used in study 2

Round     Option  Mean  SD  Skewness  Outcomes   Pr.
1         A       11    3   -2.7      {2, 12}    (0.1, 0.9)
1         B       11    3   0         {8, 14}    (0.5, 0.5)
1         C       11    3   2.7       {10, 20}   (0.9, 0.1)
2         A       13    3   -2.7      {4, 14}    (0.1, 0.9)
2         B       13    3   0         {10, 16}   (0.5, 0.5)
2         C       13    3   2.7       {12, 22}   (0.9, 0.1)
3         A       15    3   -2.7      {6, 16}    (0.1, 0.9)
3         B       15    3   0         {12, 18}   (0.5, 0.5)
3         C       15    3   2.7       {14, 24}   (0.9, 0.1)
4         A       17    3   -2.7      {8, 18}    (0.1, 0.9)
4         B       17    3   0         {14, 20}   (0.5, 0.5)
4         C       17    3   2.7       {16, 26}   (0.9, 0.1)
5         A       18    2   -1.5      {14, 19}   (0.2, 0.8)
5         B       18    2   0         {16, 20}   (0.5, 0.5)
5         C       18    2   1.5       {17, 22}   (0.8, 0.2)
6         A       18    4   -1.5      {10, 20}   (0.2, 0.8)
6         B       18    4   0         {14, 22}   (0.5, 0.5)
6         C       18    4   1.5       {16, 26}   (0.8, 0.2)
7         A       18    6   -1.5      {6, 21}    (0.2, 0.8)
7         B       18    6   0         {12, 24}   (0.5, 0.5)
7         C       18    6   1.5       {15, 30}   (0.8, 0.2)
8         A       18    8   -1.5      {2, 22}    (0.2, 0.8)
8         B       18    8   0         {10, 26}   (0.5, 0.5)
8         C       18    8   1.5       {14, 34}   (0.8, 0.2)
9         A       19    6   -2.7      {1, 21}    (0.1, 0.9)
9         B       19    6   -1.5      {7, 22}    (0.2, 0.8)
9         C       19    6   0         {13, 25}   (0.5, 0.5)
9         D       19    6   1.5       {16, 31}   (0.8, 0.2)
9         E       19    6   2.7       {17, 37}   (0.9, 0.1)
10        A       22    6   -2.7      {4, 24}    (0.1, 0.9)
10        B       22    6   -1.5      {10, 25}   (0.2, 0.8)
10        C       22    6   0         {16, 28}   (0.5, 0.5)
10        D       22    6   1.5       {16, 31}   (0.8, 0.2)
10        E       22    6   2.7       {17, 37}   (0.9, 0.1)

C.2.3 Charity organization options:

1. MAKE-A-WISH Foundation
MAKE-A-WISH is a non-profit organization that creates life-changing wishes for children with a critical illness who are between the ages of 2.5 and 18. Children who may be eligible to receive a wish can be referred by either a medical professional treating the child, a parent/guardian or the potential wish child.

2. AMERICAN CIVIL LIBERTIES UNION (ACLU)
The American Civil Liberties Union (ACLU) works in the courts, legislatures and communities to defend and preserve the individual rights and liberties guaranteed to all people in this country by the Constitution and laws of the United States.

3. AMNESTY INTERNATIONAL
Amnesty International is a global movement of more than 7 million people in over 150 countries and territories who campaign to end abuses of human rights.

4. World Wide Fund for Nature (WWF)
The World Wide Fund for Nature (WWF) is an international non-governmental organization founded in 1961, working in the field of wilderness preservation and the reduction of human impact on the environment.

5. GREENPEACE
Greenpeace is an independent, campaigning organization which uses non-violent, creative confrontation to expose global environmental problems, and to force solutions for a green and peaceful future.

6. The Salvation Army
The Salvation Army, an international movement, is an evangelical part of the universal Christian Church. Its message is based on the Bible. Its ministry is motivated by the love of God. Its mission is to preach the gospel of Jesus Christ and to meet human needs in His name without discrimination.

C.2.4 Additional tables for study 1

Table C.3: Distribution of choices by distinct skewness option combinations

Revealed      Pairs with skewness of
preference    {0, +1.5}  {0, +2.7}  {0, -1.5}  {0, -2.7}
AA            30.16%     25.93%     45.24%     39.42%
AB            21.83%     23.81%     17.06%     19.31%
BA            12.30%     17.46%     13.49%     11.11%
BB            35.71%     32.80%     24.21%     30.16%
              100%       100%       100%       100%

In all pairs the mean and variance are the same across the two options.

Table C.4: Distribution of choices in pairs with positively skewed options

              Pairs with
Skewness:     {0, +1.5}  {0, +1.5}  {0, +2.7}  {0, +2.7}  {0, +2.7}
(μ, σ):       (7, 2)     (19, 8)    (11, 3)    (13, 3)    (17, 3)
Revealed
preference
AA            28.57%     31.75%     26.19%     30.16%     21.43%
AB            20.63%     23.02%     28.57%     19.05%     23.81%
BA            14.29%     10.32%     19.05%     13.49%     19.84%
BB            36.51%     34.92%     26.19%     37.30%     34.92%
              100%       100%       100%       100%       100%

The third row shows the mean and standard deviation of the lottery pair.
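
The mean, SD and skewness figures that label the lottery pairs can be checked directly from the outcomes and probabilities listed in Table C.1. The following is a minimal sketch of that arithmetic, assuming population (not sample) moments and skewness defined as the standardized third central moment; the function name `lottery_moments` is illustrative and does not come from the study materials.

```python
from math import sqrt

def lottery_moments(outcomes, probs):
    """Return (mean, sd, skewness) of a discrete lottery.

    Assumes population moments; skewness is the third central
    moment divided by the cube of the standard deviation.
    """
    mean = sum(p * x for x, p in zip(outcomes, probs))
    var = sum(p * (x - mean) ** 2 for x, p in zip(outcomes, probs))
    sd = sqrt(var)
    third = sum(p * (x - mean) ** 3 for x, p in zip(outcomes, probs))
    return mean, sd, third / sd ** 3

# Option B of round 1 (5*) in Table C.1: outcomes {6, 11}, Pr. (0.8, 0.2)
mean, sd, skew = lottery_moments((6, 11), (0.8, 0.2))
print(round(mean, 2), round(sd, 2), round(skew, 2))

# Option B of round 2* (7) in Table C.1: outcomes {4, 14}, Pr. (0.1, 0.9)
mean, sd, skew = lottery_moments((4, 14), (0.1, 0.9))
print(round(mean, 2), round(sd, 2), round(skew, 2))
```

Under these assumptions the first lottery has mean 7, SD 2 and skewness 1.5, and the second has mean 13, SD 3 and skewness of about -2.67, which the tables report rounded as -2.7.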
Table C.5: Distribution of choices in pairs with negatively skewed options

Revealed      Pairs with skewness:
preference    {0, -1.5}   {0, -1.5}   {0, -2.7}   {0, -2.7}   {0, -2.7}
(μ, σ):       (7, 2)      (19, 8)     (11, 3)     (13, 3)     (17, 3)
AA            43.65%      46.83%      41.27%      34.13%      42.86%
AB            19.84%      14.29%      23.81%      13.49%      20.63%
BA            15.08%      11.90%      6.35%       21.43%      5.56%
BB            21.43%      26.98%      30.95%      30.95%      30.95%
              100%        100%        100%        100%        100%

The third row shows the mean and standard deviation of the lottery pair.

C.2.5 Additional tables for study 2

Table C.6: Distribution of choices across rounds of study 2

Round   Chosen lottery       Donation threshold             N
                             Low       Mid       High
1       A (-ve skewness)     26.19%    14.29%    16.67%     126
        B (0 skewness)       33.33%    57.14%    33.33%
        C (+ve skewness)     40.48%    28.57%    50.00%
2       A (-ve skewness)     7.14%     19.05%    26.19%     126
        B (0 skewness)       42.86%    54.76%    33.33%
        C (+ve skewness)     50.00%    26.19%    40.48%
3       A (-ve skewness)     9.52%     9.52%     14.29%     126
        B (0 skewness)       30.95%    57.14%    35.71%
        C (+ve skewness)     59.52%    33.33%    50.00%
4       A (-ve skewness)     38.10%    14.29%    16.67%     126
        B (0 skewness)       21.43%    40.48%    35.71%
        C (+ve skewness)     40.48%    45.27%    47.62%
5       A (-ve skewness)     28.57%    16.67%    14.29%     126
        B (0 skewness)       30.95%    38.10%    21.43%
        C (+ve skewness)     40.48%    45.24%    64.29%
6       A (-ve skewness)     35.71%    16.67%    16.67%     126
        B (0 skewness)       45.24%    45.24%    26.19%
        C (+ve skewness)     19.05%    38.10%    57.14%
7       A (-ve skewness)     11.90%    11.90%    28.57%     126
        B (0 skewness)       28.57%    52.38%    21.43%
        C (+ve skewness)     59.52%    35.71%    50.00%
8       A (-ve skewness)     35.71%    21.43%    14.29%     126
        B (0 skewness)       26.19%    42.86%    23.81%
        C (+ve skewness)     38.10%    35.71%    61.90%

The percentages across columns indicate the histogram of choices across treatments. The “Low” [“Mid” / “High”] treatment includes all rounds in which the “Donation threshold” was set such that lottery “A” [“B” / “C”] maximized the probability of the donation. Bolded percentages indicate the largest percentage of the row.
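The note to Table C.6 defines the “Low”, “Mid” and “High” treatments by which lottery maximizes the probability of the donation. The sketch below illustrates that classification under two assumptions that are not taken from the tables: that the donation is triggered when the realized payoff is at least the threshold, and that the thresholds 12, 14 and 18 are purely hypothetical values chosen for illustration. The lotteries are set 1 (options A, B, C) from the lottery table earlier in this appendix; the function names are likewise hypothetical.

```python
# Set 1 lotteries: (outcomes, probabilities), all with mean 11 and SD 3.
LOTTERIES = {
    "A": ((2, 12), (0.1, 0.9)),   # negatively skewed
    "B": ((8, 14), (0.5, 0.5)),   # symmetric
    "C": ((10, 20), (0.9, 0.1)),  # positively skewed
}

def prob_reach(outcomes, probs, threshold):
    """Probability that the payoff is at least the threshold (assumed trigger rule)."""
    return sum(p for x, p in zip(outcomes, probs) if x >= threshold)

def best_lottery(threshold):
    """Name of the lottery with the highest probability of reaching the threshold."""
    return max(LOTTERIES, key=lambda name: prob_reach(*LOTTERIES[name], threshold))

def treatment(threshold):
    """Map a threshold to the treatment label used in Table C.6."""
    return {"A": "Low", "B": "Mid", "C": "High"}[best_lottery(threshold)]

# Hypothetical thresholds, one per treatment.
for t in (12, 14, 18):
    print(t, best_lottery(t), treatment(t))
```

With these illustrative thresholds, the negatively skewed lottery is the most likely to reach a threshold just above the mean, the symmetric lottery an intermediate one, and the positively skewed lottery a threshold that only its rare high outcome can reach.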
Asset Metadata
Creator
Aristidou, Andreas (author)
Core Title
Essays on behavioral decision making and perceptions: the role of information, aspirations and reference groups on persuasion, risk seeking and life satisfaction
School
College of Letters, Arts and Sciences
Degree
Doctor of Philosophy
Degree Program
Economics
Publication Date
11/29/2020
Defense Date
10/27/2020
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
applied microeconomics,aspirations,behavioral economics,Economics,financial decision making,habit formation,incentives,life satisfaction,microeconomics,OAI-PMH Harvest,persuasion,reference group,skewness seeking,subjective well-being
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Coricelli, Giorgio (committee chair), Camara, Odilon (committee member), Frydman, Cary (committee member), Kapteyn, Arie (committee member)
Creator Email
aaristid@usc.edu,aristidouandreas13@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c89-400549
Unique identifier
UC11666708
Identifier
etd-AristidouA-9162.pdf (filename),usctheses-c89-400549 (legacy record id)
Legacy Identifier
etd-AristidouA-9162.pdf
Dmrecord
400549
Document Type
Dissertation
Rights
Aristidou, Andreas
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA