The evolution of decision-making quality over the life cycle: evidence from behavioral and neuroeconomic experiments with different age groups
THE EVOLUTION OF DECISION-MAKING QUALITY OVER THE LIFE CYCLE: EVIDENCE FROM BEHAVIORAL AND NEUROECONOMIC EXPERIMENTS WITH DIFFERENT AGE GROUPS

Niree Kodaverdian

A Dissertation Presented to the FACULTY OF THE USC GRADUATE SCHOOL, UNIVERSITY OF SOUTHERN CALIFORNIA, In Partial Fulfillment of the Requirements for the Degree DOCTOR OF PHILOSOPHY (ECONOMICS)

May 2017

Copyright 2017

Abstract

Through controlled laboratory experiments, this thesis examines age-related differences in decision-making quality, in particular in choice consistency, transitivity, and mutually-beneficial cooperation. The four studies here uncover a nonlinear path of decision quality over the life cycle: increasing from childhood to adulthood, until eventually decreasing with old age. Age-related changes, it is found, are not independent of the complexity or domain of the decision, and are sometimes masked by the use of simple choice rules. Taken together, these findings indicate a compensatory nature of choice rules: utilized to counteract cognitive faculties that are yet-underdeveloped, as in the case of younger children, or deteriorating, as in the case of older adults. The rational use of simple strategies, on the other hand, increases from childhood to adulthood, as a growing proportion of subjects learns to best respond to their environment. In the context of individual decision-making, these behavioral and neuroeconomic findings implicate working memory, attention, and self-knowledge of preferences as important factors in consistency and transitivity. In the context of dynamic interactions, adaptive and anticipatory strategic reasoning and, to a lesser extent, other-regarding preferences are found to be instrumental in achieving mutually-beneficial cooperation, with group membership magnifying their effects.
Understanding changes in decision-making quality and their causes can help expand standard economic theories to more accurately reflect human behavior, as well as inform the design of more favorable policies.

For my late grandfather, Tadevos Der Sarkissian, who encouraged me endlessly and instilled in me a passion for learning. Although you cannot be here today to witness the completion of this thesis, you are always with me in spirit.

Acknowledgments

The research included in this dissertation could not have been performed if not for the support of many individuals, and I would like to acknowledge them here.

First and foremost, I would like to express my gratitude to my advisor, Dr. Isabelle Brocas, for her support over the years. Although I had one advisor on paper, Dr. Juan Carrillo also acted in an advisor capacity and I would like to express my gratitude to him as well. I appreciate their vast knowledge in microeconomic theory and experimental economics, and their assistance in writing this thesis. Thank you both, for encouraging my research and for allowing me to grow as a scientist.

A special thanks goes out to Dr. Edward McDevitt, without whose motivation and encouragement I might not have considered a graduate career in economics. Although I did not continue research in the same vein as my undergraduate thesis, the direction, resources, and technical support he provided as an advisor were invaluable in introducing me to economic research.

Besides my advisors, I would like to thank the third member of my thesis committee, Dr. John Monterosso, for his guidance over the years. Thank you for always being there for me. I thank Dr. Giorgio Coricelli and Dr. Wendy Wood for being on my larger committee. Thank you for your insightful comments during my qualifying exam and for your constant encouragement. Wendy and John, a special thanks for funding my research for two summers and for your brilliant feedback at the Social Neuroscience retreats.
I thank my fellow Los Angeles Behavioral Economics Laboratory (LABEL) labmates for the stimulating discussions inside and outside of meetings, for the comments at various phases of my studies, and for helping me run experiments. I am grateful for the friendships that stemmed from LABEL. I am also grateful to my co-author, office-mate, and friend, Dr. Thomas Dalton Combs. Thank you for introducing me to computer programming and encouraging me to learn new languages, for working non-stop with me before deadlines, and for always remaining calm, even when we needed to do last minute trouble-shooting.

To the faculty in the Economics department who taught core and special topics, and to the faculty in Psychology whose teachings made it possible for me to undertake a neuroeconomic study (Dr. Jonas Kaplan for teaching me experimental programming, Dr. Antoine Bechara for teaching me functional neuroanatomy and decision neuroscience, the late Dr. Bosco Tjan for teaching me the fundamentals of fMRI, as well as Dr. Giorgio Coricelli in Economics for teaching me advanced neuroeconomics): thank you for sharing your immense knowledge with me. I am grateful to have learned from the best.

I would like to thank the Economics department for their financial and administrative support over the years. Thank you for providing summer funding opportunities, travel grants, and teaching awards. I thank the department staff for taking care of the logistics of this program with the utmost diligence, understanding, and friendliness. A special thanks to Morgan Ponder for his support during the job market, and in general. You were not only a student services advisor, but also a friend.
I thank the staff at our various study sites who helped organize experimental sessions and made sure I had what I needed to run sessions smoothly: the women at OASIS-Baldwin Hills and OASIS-West Los Angeles and especially then-director Rosa Aguirre who allowed access to the sites, the people at Lycée International de Los Angeles (LILA) and especially IT Director Tim Lough who worked with me to set up a closed network for our experiment, as well as the staff at the USC Dornsife Cognitive Neuroimaging Center, and especially Dr. Jiancheng Zhuang who helped immensely in running fMRI sessions. Additionally, I would like to thank the women at OASIS centers, the students at LILA campuses, and the students at USC who participated in our studies.

Last but not least, I would like to extend my most profound gratitude to my family, for their unwavering love and support throughout the writing of this thesis, and in my life generally. Words cannot express how deeply grateful I am to my parents, for all of the sacrifices they have made on my behalf over the years. Thank you, Dad, for inspiring curiosity in me from a young age; and thank you, Mom, for encouraging me to follow my dreams. A special thanks to my siblings: I knew I could always count on you for your invaluable perspective and humor. I also thank my grandparents, living and deceased, for always believing in me. Your prayers for me were what sustained me thus far.

Table of Contents

List of Tables
List of Figures
Introduction
1 Consistency in Simple vs. Complex Choices over the Life Cycle
  1.1 Introduction
  1.2 Theoretical background
    1.2.1 Bundles with identical goods
    1.2.2 Bundles with different goods
  1.3 Experimental design and procedures
  1.4 Analysis
    1.4.1 Frequency of violations
    1.4.2 Severity of violations
    1.4.3 Trivial trials
  1.5 Understanding violations
  1.6 Conclusion
  1.7 References
  1.8 Appendix
    1.8.1 Appendix A1. Example of direct violation in a triplet of trials of treatment S
    1.8.2 Appendix A2. List of all food items (with portions) used in the experiment
    1.8.3 Appendix A3. Individual analysis
    1.8.4 Instructions
2 Value-Based Decision-Making: A New Developmental Paradigm
  2.1 Introduction
  2.2 Methods
  2.3 Experimental Results
  2.4 Discussion
  2.5 References
  2.6 Supporting Information
    2.6.1 SI-1. Design and procedures
    2.6.2 SI-2. Analysis of transitivity violations
    2.6.3 SI-3. Additional statistical analysis
3 Bundling Options in Value-Based Decision-Making: Attention, Calculation, and Working Memory
  3.1 Introduction
  3.2 Materials and methods
    3.2.1 Subjects
    3.2.2 Procedures
    3.2.3 MRI data acquisition
    3.2.4 MRI data preprocessing
    3.2.5 Behavioral analysis
    3.2.6 Analysis of reaction times
    3.2.7 MRI data analysis
    3.2.8 Postprocessing
  3.3 Results
    3.3.1 Behavioral results
    3.3.2 Reaction times
    3.3.3 Regions correlating with subjective value
    3.3.4 Regions involved in complex conditions (SCALING and BUNDLING)
    3.3.5 Regions involved in complex calculations (BUNDLING vs. SCALING)
    3.3.6 Connectivity analysis
    3.3.7 Other analyses of interest
  3.4 Discussion
  3.5 References
4 Altruism and strategic giving in children and adolescents
  4.1 Introduction
  4.2 Experimental design
  4.3 Analysis of actions and strategies
    4.3.1 One-shot (OS) games: altruism
    4.3.2 Alternating dictator supergames: strategic adaptation
    4.3.3 The relationship between altruism and strategic adaptation
    4.3.4 One-shot game vs. alternating supergame: strategic anticipation
    4.3.5 The dual effect of altruism and first decisions
    4.3.6 Summary
  4.4 Empirical best response
    4.4.1 Best response to Markov behavior
    4.4.2 Best response to simple strategies
    4.4.3 Summary
  4.5 Payoffs
  4.6 Concluding remarks
  4.7 References
  4.8 Appendix
    4.8.1 Appendix A: analysis of one-shot games (b), (c) and (d)
    4.8.2 Appendix B: transcript of instructions (school-age subjects)

List of Tables

1 Summary of treatments
2 Pearson correlations of memory, intelligence and GARP violations
3 Ordinary least squares (OLS) regression of number of violations in treatment C (all subjects)
4 OLS Regression of number of violations in treatment C (subjects with 2 or less violations in treatment A)
5 Summary statistics by cluster
6 Types of preferences by subjects in clusters 1 and 2
7 Description of participants by age and gender
8 Average number of indifferences
9 Most and least favorite options in Social domain derived from implicit rankings
10 Favorite option in the Risk domain derived from implicit rankings
11 OLS regression of transitivity violations in the Goods domain
12 OLS regression of transitivity violations in the Social domain
13 OLS regression of transitivity violations in the Risk domain
14 Local minima in corrected p-value parametric value regressor
15 Local minima in corrected p-value parametric value regressor for CONTROL trials
16 Local minima in corrected p-value parametric value regressor for BUNDLING trials Part1
17 Local minima in corrected p-value parametric value regressor for BUNDLING trials Part2
18 Local minima in corrected p-value for BUNDLING > SCALING
19 Local minima in corrected p-value for BUNDLING Connectivity > CONTROL Connectivity
20 Subjects by grade
21 One-shot anonymous dictator games
22 Proportion of subjects with fixed actions by age group
23 OLS regression of reciprocity and group cooperation rates across supergames
24 Probit regression of first choice by First and Second mover
25 Best response to simple strategies by age group
26 OLS regression of Per-round payoffs across supergames

List of Figures

1 Trials $a_{12}$ vs. $a'_{12}$ and $b_{12}$ vs. $b'_{12}$
2 The 35 trials in treatment S
3 Screenshot of one trial in treatment C
4 Number of violations in treatments S (left) and C (right)
5 Direct (left) and Indirect (right) violations in treatment C
6 Severity of violations in treatments S (left) and C (right)
7 Choices to remove for consistency in treatments S (left) and C (right)
8 Number of violations in treatment A (trivial trials)
9 Choice violations by subjects with at most two treatment A violations
10 Trials ($a_{12}$ vs. $a'_{12}$), ($b_{12}$ vs. $b'_{12}$), ($c_{12}$ vs. $c'_{12}$)
11 Cluster representation. Misclassified trials (left) and number of violations (right) in treatments S and C
12 Decision-making tasks
13 Performance improves with age in the Goods-Choice and Social-Choice tasks but not in the Risk-Choice task
14 Transitivity violations decrease with age differently across choices
15 Social vs. Goods, Risk vs. Goods after removing simple policies
16 Transitivity violations across choices among participants who do not use simple policies
17 Toys used in the Goods-Choice and Goods-Ranking tasks
18 Sharing rules between self and other used in the Social-Choice and Social-Ranking tasks
19 Lotteries used in the Risk-Choice and Risk-Ranking tasks
20 Transitive Reasoning Task sample trial
21 Sensitivity analysis
22 Analysis of severity of violations
23 Ranking and choices
24 Discrepancies between explicit and implicit rankings (all subjects)
25 Evolution of other-regarding preferences
26 Three types of trials
27 Experimental Design
28 Value tracking regions
29 Attentiveness is required in complex conditions
30 The contrast between the single-item-trial regressor and the regressors for scaled-option-trials and bundled-option-trials correlates with the canonical default mode network
31 Relative to SCALING, BUNDLING recruits regions associated with calculation
32 The vmPFC is more connected to the dlPFC during BUNDLING trials than during CONTROL trials from gPPI
33 Screenshot of dictator (left) and recipient (right)
34 Aggregate choices in the one-shot (OS) dictator games by age group
35 Cooperation in first and second supergame by age group
36 Conditional cooperation by age group
37 Learning to cooperate
38 Conditional cooperation as a function of choice in OS
39 Choice in OS game (a) and first round of first supergame by age group
40 Best response to Markov strategy by age group
41 Strategies (1 deviation allowed)
42 Per-round payoffs by age group: unconditional (left) and as a function of choice in OS (right)
43 Per-round payoffs of first (left) and second (right) mover from round 3 on, and as a function of choice in round 1
44 Evolution of other-regarding preferences
45 Screenshots for the instructions

Introduction

This thesis examines age-related differences in decision-making quality, in particular in choice consistency, transitivity, and mutually-beneficial cooperation. Through controlled laboratory experiments, the evolution of decision quality over the life cycle was studied, and the determinants of this evolution investigated. In general, changes in cognitive factors, such as reasoning, working memory, and attention (and, where applicable, preferences), were considered as mechanisms of decision-making changes. Where relevant, the analysis accounted for the use of heuristics, or simple choice rules, which can "buy" consistency and, in turn, belie the underlying patterns of change. In the analysis of dynamic interactions, simple strategies replaced simple rules; considered alongside the empirical distribution of play, these strategies can be revealing of a subject's rationality.
While most of these investigations were purely behavioral, the fMRI procedure was utilized in one study to gain deeper insight into the neural processes associated with decision quality. Understanding changes in decision-making quality and their causes can help expand standard economic theory to more accurately reflect human behavior, as well as inform policy.

In the first chapter, titled "Consistency in Simple vs. Complex Choices over the Life Cycle," joint with I. Brocas, J. Carrillo, and T. D. Combs, we studied how choice consistency changes from young- to old-adulthood as a function of the complexity of the task. In our standard theories, agents are assumed to be rational, an implication being that their choices are consistent with one another. In this controlled laboratory study, we had younger and older adult subjects make binary choices between bundles of snack foods. Measuring choice consistency using a variant of the Generalized Axiom of Revealed Preference (GARP), we found that younger and older adults were similarly consistent when making simple (two-good) decisions, both in the number and severity of their violations. However, when making complex (three-good) decisions, older adults were significantly less consistent than younger adults, both in their violation frequency and severity. In support of our hypothesis that working memory is a critical mechanism driving such an interaction, we found that older adults' working memory scores were significantly lower than those of younger adults and significantly correlated with inconsistencies, but only with those from the complex task. Conducting individual analyses, we found that older adults likely relied on simple choice rules that (inadvertently) generate consistency; they employed these rules with relative ease in the simple task, but with difficulty in the complex task, where their implementation is less intuitive.
This type of research is increasingly important, as the baby boomer generation approaches old age and faces complex, multi-attribute decisions.

Shifting focus to the other end of the life cycle, I. Brocas, J. Carrillo, T. D. Combs, and I studied the development of a more fundamental form of consistency: transitivity. In Chapter 2, titled "Value-Based Decision-Making: A New Developmental Paradigm," we examined how transitivity develops in childhood and how it differs across different decision domains: goods, social, and risk. In this controlled laboratory study, subjects from kindergarten to 5th grade (with college students as a control group) were given choices between pairs of goods, of sharing options, and of lotteries. We found evidence that transitivity develops gradually, but differentially, across the different types of decisions in a way that cannot be explained by the development of attentional control or transitive reasoning alone. In the goods domain, transitivity increased with age, as has been found in previous studies using GARP. A similar age trend of transitivity was found in the social domain, though initially masked by a disproportionate number of the youngest children using simple choice rules based on only one attribute of the sharing option, a tendency referred to as centration. The trajectory of transitivity was different in the risk domain, where transitivity was similar across all school-aged children, and could not be explained by centration. While transitivity is independent of one's underlying preference ordering, we found stark preference differences across ages and, more interestingly, discovered a critical role of one's self-knowledge of preferences in one's decision quality. Our findings indicate that in the goods domain, children become more transitive with age as they learn what they like most, as well as what they like least.
In the social domain, children become more transitive with age as they learn what they like most; while in the risk domain, children become more transitive with age as they learn what they like least.

The working memory hypothesis from the first chapter was taken under the microscope, or more literally, the Magnetic Resonance Imaging (MRI) scanner, in a joint study with I. Brocas, J. Carrillo, T. D. Combs, and J. Monterosso. In Chapter 3, titled "Bundling Options in Value-Based Decision-Making: Attention, Calculation, and Working Memory," subjects made simple and complex binary choices between bundles of snack foods, while we measured their brain activity using functional Magnetic Resonance Imaging (fMRI). By having subjects make choices in the scanner, we were able to identify the neural regions that are positively correlated, and those that are negatively correlated, with choice complexity. In line with the working memory hypothesis from Chapter 1, we found that complex decisions more heavily recruit the working memory system, as working memory regions (dlPFC) were significantly more functionally-connected to value regions (vmPFC) during complex (two-unique-good) choices than they were during simple (one-good) choices. We also found that during complex (two-good) choices, the default mode network was less activated than during simple (one-good) choices, reflecting the higher attentional requirements of making complex choices. During complex choices involving two unique goods, (parietal) regions previously associated with calculation were more activated than during two-same-good choices. Biophysical techniques, such as fMRI, provide vital insights into otherwise unobservable correlates of behavior.

In Chapter 4, titled "Altruism and strategic giving in children and adolescents," joint with I. Brocas and J.
Carrillo, we studied the evolution of behavior, motivation, and payoff consequences of efficient but costly sharing in dynamic relationships from childhood to adulthood. Our primary goal was to disentangle altruism (defined as the willingness to sacrifice own payoff to benefit others) from strategic giving (defined as forgoing a current payoff as a means to encourage a mutually-profitable dynamic relationship). In a controlled laboratory experiment, children from kindergarten to 11th grade (with college students as a control) made binary choices about how much to share with an anonymous "other" in a one-shot setting, and with a fixed but anonymous "partner" in a repeated setting. Analyzing choices from the two settings, we found that though altruism evolved with age, the age-related rise in cooperation was more strongly driven by increased strategic adaptation (adapting choice to partner's play in preceding rounds) and anticipation (anticipating the gains of cooperation and thus cooperating in the first round). The observed rise in cooperation was likely amplified by group effects, as "outlier" subjects (e.g. strategic subjects in the youngest group) conformed to their group (in anticipation of, or in response to, exploitation). Optimally conforming to one's group, however, was also age-dependent; the fraction of subjects using a simple strategy that best responds to the distribution of play in their group increased with age. Understanding how children and adolescents reason about choices and make strategic decisions is crucial for designing favorable educational policies. The findings in this chapter suggest that such policies be tailored for different age groups, as motivations and logical abilities differ by age.

The four studies here uncover a nonlinear path of decision-making quality over the life cycle: increasing from childhood to adulthood, until eventually decreasing with old age.
Age-related changes, it is found, are not independent of the complexity or domain of the decision, and are sometimes masked by the use of simple choice rules. Taken together, these findings indicate a compensatory nature of choice rules: utilized to counteract cognitive faculties that are yet-underdeveloped, as in the case of younger children, or deteriorating, as in the case of older adults. The rational use of simple strategies, on the other hand, increases from childhood to adulthood, as a growing proportion of subjects learns to best respond to their environment. In the context of individual decision-making, these behavioral and neuroeconomic findings implicate working memory, attention, and self-knowledge of preferences as important factors in consistency and transitivity. In the context of dynamic interactions, strategic reasoning and, to a lesser extent, other-regarding preferences are found to be instrumental for mutually-beneficial cooperation, with group membership magnifying their effects. A better understanding of the decision-making process at different ages is important for more accurate theories, and more appropriate policies.

1 Consistency in Simple vs. Complex Choices over the Life Cycle ∗

Isabelle Brocas, University of Southern California and CEPR
Juan D. Carrillo, University of Southern California and CEPR
T. Dalton Combs, University of Southern California
Niree Kodaverdian, University of Southern California

Abstract

Employing a variant of GARP, we study consistency in aging by comparing the choices of younger adults (YA) and older adults (OA) in a 'simple' two-good condition and a 'complex' three-good condition. We find that OA perform worse than YA in the complex condition but similar to YA in the simple condition, both in terms of the number and severity of GARP violations. Working memory and IQ scores correlate significantly with consistency levels, but only in the complex treatment.
Our findings suggest that the age-related deterioration of neural faculties responsible for working memory and fluid intelligence is an obstacle to consistent decision-making.

∗ We are grateful to members of the Los Angeles Behavioral Economics Laboratory (LABEL) for their insights and comments in the various phases of the project. We thank Cary Deck, Mara Mather, John Monterosso, and participants at the 2014 LABEL Experimental Economics Conference (University of Southern California) and at the 2013 Social Neuroscience retreat (Catalina Island, USC) for useful comments. We also thank Rosa Aguirre for helping to organize sessions at OASIS. All remaining errors are ours. The study was conducted with the University of Southern California IRB approval UP-08-00052. We acknowledge the financial support of the National Science Foundation grant SES-1425062. Address for correspondence: Juan D. Carrillo, Department of Economics, University of Southern California, 3620 S. Vermont Ave., Los Angeles, CA 90089, USA, <juandc@usc.edu>.

1.1 Introduction

As most day-to-day decisions involve comparing options and making trade-offs between them, understanding how people attribute value to options is crucial in understanding how people make decisions. Economics builds theories under the assumption that individuals have unambiguous values for options and maintain stable preferences. These in turn imply consistency of choice, which can be tested empirically. Experimental studies have shown that choice consistency is prevalent, at least for younger adults (Andreoni and Miller, 2002; Andreoni and Harbaugh, 2009; Choi et al., 2014). By contrast, our knowledge about choice consistency in older adults is still incomplete. Understanding the effect of age on consistency can provide a foundation for refined economic models.

Recent field and experimental evidence has shown that older adults (OA) make different choices as compared to younger adults (YA) in a variety of domains.[1] Such differences can potentially be due to two very different mechanisms: either preferences change with age, or preferences remain stable but the ability to act consistently on them changes with age. There is indirect evidence for both possibilities. In line with the first, aging brings dramatic changes in our motivations, which in turn affect the decisions we make (Carstensen and Mikels, 2005; Mather and Carstensen, 2005). At the same time, the aging process affects many brain structures and brain mechanisms, hindering the ability to evaluate alternatives and select among options (Mohr et al., 2010; Nielsen and Mather, 2011), especially when they become complex (Brand and Markowitsch, 2010; Besedes et al., 2012a, 2012b). Disentangling preference changes from mistakes is essential for policy-making purposes (Bernheim and Rangel, 2009) as well as for purposes of cost avoidance on the part of the decision maker (Lichtenstein and Slovic, 1973). We aim to resolve this problem by controlling for differences in preferences and testing those preferences for consistency.

[1] See e.g. Fehr et al., 2003; Ameriks et al., 2007; Bellemare et al., 2008; Engel, 2011; Albert and Duffy, 2012; Castle et al., 2012. However, there is also evidence that OA and YA make similar choices in some of the same domains (Dror et al., 1998; Kovalchik et al., 2005; Sutter and Kocher, 2007; Charness and Villeval, 2009). Yet others find curvilinear age effects (Harrison et al., 2002; Read and Read, 2004). Some studies seek to reconcile these mixed findings by arguing that results are highly sensitive to differences in the learning requirements (Mata et al., 2011), the completeness of information (Zamarian et al., 2008), the number of options to choose between (Brand and Markowitsch, 2010) and the contents of choice sets (Mather et al., 2012).

In this paper, we propose to use the Generalized Axiom of Revealed Preference (GARP) to test the internal consistency of the preferences of YA and OA by offering repeated choices between bundles of goods. Additionally, we vary the complexity of the task by changing the number of unique goods that are present in a choice. Our goal is to understand consistency at different ages and as a function of the complexity of the situation.

More specifically, we use a controlled laboratory experiment with a 2×2 design, where YA and OA make choices in two different domains: simple and complex. In the simple domain, subjects decide between two bundles, each composed of different quantities of the same two goods (e.g., 5 pistachios plus 1 cheese vs. 2 pistachios plus 2 cheese). In the complex domain, subjects choose between two bundles, each also composed of different quantities of two goods, but now with exactly one common good (e.g., 5 pistachios plus 1 cheese vs. 2 pistachios plus 2 crackers).

Besides the contribution of comparing YA and OA in a simple and a complex domain, our design has three new elements relative to the existing literature (reviewed below). First, we ask subjects to choose between two bundles presented pictorially. This simplifies the choice problem relative to presenting a large number of bundles (as in Harbaugh et al., 2001) or relative to presenting a budget set on a coordinate plane (as in Choi et al., 2007; Fisman et al., 2007; Choi et al., 2014). Second, we include trivial trials in our task, where subjects choose between a smaller and a larger quantity of one desirable good. Subjects who fail these trials are likely to violate one or more assumptions of the model: they are inattentive, they do not monotonically value the good over the tested range, and/or they misunderstand the task. This allows us to conduct the consistency analysis both with the full sample and with the subsample of subjects for whom we are most confident the model is appropriate. Third, our subjects perform a working memory task and an IQ task.
This allows us to study the determinants of consistency.

Sample selection issues limit the extent to which causal relationships can be assessed. Notice that, despite our best efforts to match samples, differences found across age groups may be due to cohort-specific factors and may be unrelated to age.[2] Our analysis will take these limitations into account when drawing conclusions.

[2] For instance, these differences could be driven by differences in sociability, experience, opportunity cost of time or income (in particular, the mean household income of our YA sample is greater than that of our OA sample).

With this caveat in mind, we next summarize the two main findings of our study. First, both OA and YA are reasonably (and roughly equally) consistent in the simple treatment, whereas the OA in our sample are significantly more inconsistent than the YA in the complex treatment. This difference across populations applies generally: to the number of total violations, to the number of violations by type (direct and indirect) and to the severity of violations (using two different criteria). Surprisingly, a significant fraction of subjects (12% of YA and 33% of OA) fail the trivial trials. This calls into question the reliability and interpretability of the consistency results for those individuals. We then conduct the same analysis with the subsample of subjects who pass the trivial trials. Not surprisingly, the total number of violations is substantially smaller in this subsample. Importantly, however, the treatment effect is identical: marginal differences between OA and YA in the simple domain and significant differences in the complex domain.

Second, we find that differences in violations in the complex treatment are associated with differences in performance in the working memory test. Since YA score significantly higher in that test compared to OA, most of the difference in performance across ages is captured through the working memory effect.
Our findings thus indicate that the working memory system is more heavily recruited in the complex task than in the simple one. The result echoes the studies reviewed below, which show this precise relationship between complexity and working memory demands. Interestingly, the result also extends to IQ (although less strongly), but it should be noted that working memory and IQ are highly correlated.

Finally, we also conduct an individual and cluster analysis (see the Appendix). One group of subjects is very inconsistent in both the simple and complex treatments. A second group, mostly composed of OA, consists of individuals who commit almost no violations in the simple treatment. Interestingly, these subjects have a preference that can be implemented with a simple rule: maximize the quantity of the favorite good in the bundle. Their behavior becomes significantly more inconsistent in the complex domain, possibly because that simple rule is less intuitive to implement in that context. The last group, mostly composed of YA, consists of subjects who do not exhibit preferences that can be implemented with simple rules. They are slightly less consistent than the previous group in the simple treatment but significantly more consistent in the complex one.

The study builds on three strands of the literature. First, laboratory experiments have used GARP to assess the degree of consistency of subjects in different domains, such as goods (bundles with positive quantities of two or more desirable items), risk (bundles of quantities and probabilities) and social (bundles of money for oneself and money for another party). Studies find that YA make choices generally consistent with revealed preference theory.[3]

[3] See e.g. Sippel (1997), Mattei (2000) and Fevrier and Visser (2004) for studies in the goods domain, Choi et al. (2007), Andreoni and Harbaugh (2009) and Choi et al. (2014) for studies in the risk domain and Andreoni and Miller (2002) and Fisman et al. (2007) for studies in the social domain. Studies also report GARP-consistent behavior in the context of criminal behavior (Visser et al., 2006) and by inebriated (Burghart et al., 2013) or sleepy (Castillo et al., 2014) subjects. In a cross-cultural study, Tanzanian YA are found to commit more GARP violations as compared to YA from the United States (Cappelen et al., 2014). Finally, in a multi-domain study (bundles of consumption goods, labor hours, and token money) with female mental hospital patients, Battalio et al. (1973) find some inconsistencies, but when a subsequent work (Cox, 1997) studies the same data taking into account the severity of violations, all but one of the subjects are deemed consistent.

Second, experiments concur in finding that consistency increases between the ages of 8 and 12 (Bradbury and Nelson, 1974) and stabilizes thereafter (Harbaugh et al., 2001). By contrast, the full trajectory across the lifetime has not been established. Indeed, some laboratory (Tentori et al., 2001; Kim and Hasher, 2005) and field (Dean and Martin, 2014) experiments find that OA are more consistent than YA, while other laboratory (Finucane et al., 2002; Finucane et al., 2005) and field (Echenique et al., 2011) experiments find the opposite. These disparate findings may be partly due to two methodological choices. First, and contrary to standard practices in experimental economics, decisions in those YA vs. OA studies are not incentivized. Second, they use different domains (health, extra credit, grocery coupons, nutrition, finance). This introduces confounding factors since different age groups have varying degrees of domain-specific expertise.

Additional support for an inverse relationship between age and consistency can be found in the recent work by Choi et al. (2014). In this comprehensive online study, the authors show that GARP consistency in the risk domain decreases with age and increases with household wealth.
The paper combines the benefits of field (large sample size) and laboratory (incentivized) experiments. Moreover, subjects are drawn from a sample designed to be representative of the Dutch population. The study, however, does not address the two questions we are interested in, namely (i) how choice inconsistencies depend on the combination of age and task complexity and (ii) whether they can be traced to compromised working memory and fluid intelligence.

Third, studies have demonstrated that task complexity imposes demands on working memory. Working memory is the short-term mental maintenance (Cohen et al., 1997; Curtis and D'Esposito, 2003) and manipulation of information (Pochon et al., 2001), and this process is less efficient in OA. Varying levels of task complexity may account for differences in choice consistency between OA and YA. Indeed, neuroimaging studies have shown that the working memory regions of the brain are recruited during more difficult tasks, such as those requiring task-switching (MacDonald et al., 2000), integral-solving (Krueger et al., 2009), and attention-shifting (Kondo et al., 2004). Crucially, the circuitry is differentially recruited as tasks become more complex (Demb et al., 1995; Baker et al., 1996; Braver et al., 1997; Cohen et al., 1997; Carlson et al., 1998; Greene et al., 2004).[4] Interestingly, it has been shown that older adults perform worse on such tasks (Grady et al., 2006; Zamarian et al., 2008; Brand and Markowitsch, 2010; Henninger et al., 2010), and the age-related atrophy of regions involved in working memory (Raz et al., 2005) could be a main cause of that decline: these regions are activated less in OA as compared to YA in working memory tasks (Rypma and D'Esposito, 2000), especially when the number of items to be maintained (Cappell et al., 2010) or manipulated (Wright, 1981) in memory is high.

[4] This relationship extends to tasks requiring the explicit representation and manipulation of knowledge, when the ability to reason relationally is essential (Kroger et al., 2002), when the number of dimensions to be considered simultaneously is increased (Christoff et al., 2001), or when the number of objects to remember is increased (Gould et al., 2003).

The article is organized as follows. The theoretical framework is presented in section 1.2. The experimental setting is described in section 1.3. The analysis is reported in sections 1.4 and 1.5. Concluding remarks are gathered in section 1.6. The individual and cluster analysis can be found in the Appendix.

1.2 Theoretical background

Consider a subject making choices between pairs of bundles, each with two goods that are assumed to be desirable, in the sense that more of each good is strictly preferred to less. A choice between a pair of bundles is called a "trial." Denote by $a_{xy} := (q^a_x, q^a_y)$ the bundle that has positive quantities $q^a_x$ and $q^a_y$ of goods $x$ and $y$, respectively.

1.2.1 Bundles with identical goods

Suppose first that bundles are composed of the same two goods ($x, y \in \{1, 2\}$ with $x \neq y$) and consider trials with bundles $a_{12}$ and $a'_{12}$ such that each bundle has strictly more quantity of one good and strictly less of the other ($q^a_x > q^{a'}_x$, $q^a_y < q^{a'}_y$). In the experimental section, this is called treatment S (for simple).
When a trial is considered in isolation, the question of consistency does not arise, and any choice between pairs of bundles with the aforementioned properties is consistent with the maximization of monotonic and transitive preferences. However, when we jointly consider a pair of trials, some combinations of choices may constitute a violation of revealed preferences (which we call $D_S$, for direct violation in the simple treatment).[5] Here is why. Consider the example in Figure 1 and suppose that $a_{12}$ is chosen over $a'_{12}$ and $b_{12}$ is chosen over $b'_{12}$. Since $q^{a'}_x > q^b_x$ for all $x$, we have $a_{12} \succeq a'_{12} \succ b_{12}$. Since $q^{b'}_x > q^a_x$ for all $x$, we have $b_{12} \succeq b'_{12} \succ a_{12}$. This forms a contradiction to the maximization of monotonic and transitive preferences. Definition 1 sets conditions for a direct violation in a pair of trials of treatment S.

Definition 1 Direct violation in a pair of trials of the simple treatment ($D_S$).
(i) Trials $a_{12}$ vs. $a'_{12}$ and $b_{12}$ vs. $b'_{12}$ may involve a $D_S$-violation if and only if $q^{a'}_x \geq q^b_x$ for all $x$ (with at least one strict inequality) and $q^{b'}_x \geq q^a_x$ for all $x$ (with at least one strict inequality).
(ii) A $D_S$-violation occurs when $a_{12}$ is chosen over $a'_{12}$ and $b_{12}$ is chosen over $b'_{12}$.

[5] The seminal work on revealed preference theory is due to Samuelson (1938). It was subsequently extended by Houthakker (1950), Afriat (1967) and Varian (1982) among others.

[Figure 1: Trials $a_{12}$ vs. $a'_{12}$ and $b_{12}$ vs. $b'_{12}$, plotted in the plane with axes $q_1$ and $q_2$.]

The logic of the argument is very similar to the standard revealed preferences argument made in earlier GARP studies (Sippel, 1997; Harbaugh et al., 2001; Choi et al., 2007). The only difference is that, in our case, the set of options per choice is dramatically reduced. Therefore, a choice in one trial only reveals that the selected option, or "bundle," is preferred to the only other bundle proposed rather than to any bundle on the "budget line."
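As an illustration of Definition 1, the following is a minimal Python sketch of how direct violations could be counted across all pairs of trials. It is not the authors' analysis code: the function names, the tuple representation of bundles, and the example quantities are hypothetical and chosen only to mirror the configuration described for Figure 1.

```python
from itertools import combinations
from typing import Sequence, Tuple

Bundle = Tuple[int, ...]        # quantities, one entry per good (assumed encoding)
Trial = Tuple[Bundle, Bundle]   # the two bundles offered in one trial


def dominates(p: Bundle, q: Bundle) -> bool:
    """p has weakly more of every good than q and strictly more of at least one."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))


def is_direct_violation(chosen_1: Bundle, rejected_1: Bundle,
                        chosen_2: Bundle, rejected_2: Bundle) -> bool:
    """Definition 1: each rejected bundle dominates the bundle chosen in the other trial."""
    return dominates(rejected_1, chosen_2) and dominates(rejected_2, chosen_1)


def count_direct_violations(trials: Sequence[Trial], choices: Sequence[int]) -> int:
    """choices[i] is 0 or 1, the index of the bundle picked in trials[i]."""
    count = 0
    for i, j in combinations(range(len(trials)), 2):
        chosen_i, rejected_i = trials[i][choices[i]], trials[i][1 - choices[i]]
        chosen_j, rejected_j = trials[j][choices[j]], trials[j][1 - choices[j]]
        if is_direct_violation(chosen_i, rejected_i, chosen_j, rejected_j):
            count += 1
    return count


# Hypothetical quantities arranged like Figure 1: a' dominates b, and b' dominates a.
trials = [((1, 4), (4, 2)),   # a vs. a'
          ((3, 1), (2, 5))]   # b vs. b'
print(count_direct_violations(trials, [0, 0]))  # a and b chosen -> 1
```

With these example quantities, only the combination in which the subject picks both dominated bundles is flagged; the other three choice combinations yield no violation.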
Notice that Definition 1 is made of two parts. Part (i) provides conditions such that choices in a pair of trials may result in a violation. Intuitively, the requirement is that for each trial one bundle dominates (i.e., has weakly more quantity of both goods and strictly more of at least one than) a bundle in another trial, whereas the remaining bundle is dominated by (i.e., has weakly less quantity of both goods and strictly less of at least one than) the remaining bundle in the other trial. Naturally, some pairs of trials will fail to satisfy this condition, in which case a $D_S$-violation will not be possible. Given a pair of trials such that a $D_S$-violation is possible, part (ii) provides conditions such that the violation indeed occurs. Again intuitively, the requirement is that in each trial the subject selects the bundle that is dominated by a bundle in the other trial. In our example, the dominated bundles are $a_{12}$ and $b_{12}$. Hence, only one out of the four possible choice combinations will result in a $D_S$-violation.

Given $n$ trials, there are $n(n-1)/2$ pairs of trials. By considering all pairs of trials and checking whether the condition in Definition 1(i) is satisfied, we can identify all possible violations between pairs of trials. Then, actual violations are determined simply by checking whether a subject's selected bundles (in those pairs of trials in which a violation is possible) satisfy the condition in Definition 1(ii).

Two important remarks are in order. First, it is unfortunately not possible to determine the maximum number of violations that a subject can effectively incur. Indeed, when a subject makes a choice that induces a violation it may preclude violations between other pairs of trials.[6] Second, by the discrete nature of our choice problem, it is possible that a direct violation occurs between a triplet of trials (or more) even though the condition in Definition 1(i) is not satisfied by any pair of trials in that triplet (and therefore no direct violation occurs between pairs of trials in the triplet). In Appendix A1, we construct an example of such a case. Given our choice of bundles, conditions such that direct violations can occur between triplets of trials (but not between pairs of trials in that triplet) are very rare but still possible. We will not consider them in the analysis, which means that our experimental study may miss a small number of GARP violations.

[6] To see this, consider the example in Figure 1 and suppose there is a third trial between bundles $c_{12}$ and $c'_{12}$ such that $q^c_x < q^a_x$ and $q^{c'}_x > q^{a'}_x$ for all $x$. By choosing $a_{12}$ over $a'_{12}$ and $b_{12}$ over $b'_{12}$ the subject incurs a violation. However, by choosing $a_{12}$ over $a'_{12}$ the subject precludes any possible violation between the pair of trials $a_{12}$ vs. $a'_{12}$ and $c_{12}$ vs. $c'_{12}$ (even though a violation would have occurred had the subject chosen $a'_{12}$ over $a_{12}$ and $c_{12}$ over $c'_{12}$).

1.2.2 Bundles with different goods

Assume now that there are three possible goods ($x, y, z \in \{3, 4, 5\}$ with $x \neq y \neq z$) and, as before, bundles are composed of two goods. Consider trials between pairs of bundles that have exactly one good in common, that is, between bundle $a_{xy}$ and bundle $a'_{xz}$. In the experimental section, this is called treatment C (for complex). As the choice problem involves more goods, the decision is arguably more complicated. Since a trial still has two bundles and each bundle still has positive quantities of exactly two goods, the two treatments remain comparable. By definition, each bundle now has strictly more quantity of at least one good (only $a_{xy}$ has a positive quantity of good $y$ and only $a'_{xz}$ has a positive quantity of good $z$). Again, when a trial is considered in isolation, any choice between pairs of bundles is consistent with the maximization of monotonic and transitive preferences. Definition 2 identifies conditions for a direct violation in a pair of trials of treatment C to occur. These are very similar to the conditions described in Definition 1.

Definition 2 Direct violation in a pair of trials of the complex treatment ($D_C$).
(i) Trials $a_{xy}$ vs. $a'_{xz}$ and $b_{xz}$ vs. $b'_{xy}$ may involve a $D_C$-violation if and only if $q^{a'}_x \geq q^b_x$ and $q^{a'}_z \geq q^b_z$ (with at least one strict inequality) and $q^{b'}_x \geq q^a_x$ and $q^{b'}_y \geq q^a_y$ (with at least one strict inequality).
(ii) A $D_C$-violation occurs when $a_{xy}$ is chosen over $a'_{xz}$ and $b_{xz}$ is chosen over $b'_{xy}$.

Just like in the example of Figure 1, when the conditions of Definition 2(i) and (ii) are satisfied, we get $a_{xy} \succeq a'_{xz} \succ b_{xz}$ and $b_{xz} \succeq b'_{xy} \succ a_{xy}$, which is a contradiction to the maximization of monotonic and transitive preferences.

Interestingly, in treatment C there is also the possibility of incurring an indirect violation ($I_C$). An $I_C$-violation involves choices in three trials, each with a different common good. Definition 3 describes an indirect violation.

Definition 3 Indirect violation in a triplet of trials of the complex treatment ($I_C$).
(i) Trials $a_{xy}$ vs. $a'_{xz}$, $b_{xz}$ vs. $b'_{yz}$ and $c_{yz}$ vs. $c'_{xy}$ may involve an $I_C$-violation if and only if $q^{a'}_x \geq q^b_x$ and $q^{a'}_z \geq q^b_z$ (with at least one strict inequality), $q^{b'}_y \geq q^c_y$ and $q^{b'}_z \geq q^c_z$ (with at least one strict inequality), and $q^{c'}_x \geq q^a_x$ and $q^{c'}_y \geq q^a_y$ (with at least one strict inequality).
(ii) An $I_C$-violation occurs when $a_{xy}$ is chosen over $a'_{xz}$, $b_{xz}$ over $b'_{yz}$ and $c_{yz}$ over $c'_{xy}$.

Although the argument is slightly more sophisticated, the idea behind indirect violations is similar to that behind direct violations.
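To make Definition 3 concrete, here is a small Python sketch, under stated assumptions, that tests a triplet of complex-treatment trials for an indirect violation. Bundles are encoded as hypothetical triples $(q_x, q_y, q_z)$ with a zero for the good that is absent, so that a bundle can only dominate another bundle composed of the same two goods. The function names and example quantities are illustrative, not taken from the study.

```python
from itertools import permutations
from typing import Sequence, Tuple

Bundle = Tuple[int, int, int]   # (q_x, q_y, q_z); 0 marks the good not in the bundle
Pick = Tuple[Bundle, Bundle]    # (chosen bundle, rejected bundle) in one trial


def dominates(p: Bundle, q: Bundle) -> bool:
    """p has weakly more of every good than q and strictly more of at least one."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))


def is_indirect_violation(picks: Sequence[Pick]) -> bool:
    """Definition 3: the rejected bundles dominate the chosen bundles around a 3-cycle.

    All orderings of the three trials are tried, so the caller does not need to
    know in advance which trial plays the role of a, b, or c.
    """
    for (c1, r1), (c2, r2), (c3, r3) in permutations(picks):
        if dominates(r1, c2) and dominates(r2, c3) and dominates(r3, c1):
            return True
    return False


# Hypothetical triplet: a' dominates b, b' dominates c, c' dominates a.
a, a_alt = (2, 3, 0), (3, 0, 3)   # a_xy vs. a'_xz
b, b_alt = (2, 0, 2), (0, 3, 3)   # b_xz vs. b'_yz
c, c_alt = (0, 2, 2), (3, 4, 0)   # c_yz vs. c'_xy

print(is_indirect_violation([(a, a_alt), (b, b_alt), (c, c_alt)]))  # True
print(is_indirect_violation([(a_alt, a), (b, b_alt), (c, c_alt)]))  # False
```

The second call shows that switching a single choice (picking $a'$ instead of $a$) breaks the cycle in this example.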
An $I_C$-violation may occur if in each trial, one bundle dominates the bundle composed of the same goods in another trial and the remaining bundle is dominated by the bundle composed of the same goods in the other trial. In Definition 3(i) and given that more quantity is always desirable, we have $a' \succ b$, $b' \succ c$, and $c' \succ a$. When this condition is satisfied, an indirect violation occurs if the subject chooses bundles $a$, $b$, and $c$. Indeed, these choices imply $a \succeq a' \succ b \succeq b' \succ c$ on one hand, and $c \succeq c' \succ a$ on the other, which forms a contradiction.

For the same reasons as in the second remark of the simple treatment, in the complex treatment it may be the case that a direct violation involving three or more trials occurs but no violation occurs between any subset of two trials. Similarly, it may be the case that an indirect violation involving four or more trials occurs but no violation occurs between any subset of three trials. For simplicity, we will again ignore those violations.

1.3 Experimental design and procedures

To study choice consistency of younger adults (YA) and older adults (OA) with different levels of complexity, we conduct an experiment based on the setup described in the theory section using the MATLAB extension Psychtoolbox (Brainard, 1997; Pelli, 1997). We ran 10 sessions with OA and 7 sessions with YA. Each session had between 5 and 8 subjects and lasted between 1.5 and 2 hours. OA sessions were conducted at two OASIS senior centers in Los Angeles, OASIS Baldwin Hills and OASIS West Los Angeles. A total of 51 OA (age 59-89) were recruited through the OASIS activities catalogue.[7] Six subjects were omitted from analysis: four subjects experienced software malfunctioning; one spontaneously reported miscomprehension of the task halfway through the experiment; the only male subject in the pool was excluded to make the sample more demographically homogeneous. We therefore retained 45 female OA for the analysis.[8] OA in our sample are highly educated.[9] Given their education level, we deemed it appropriate to recruit college students for our YA sample.[10] YA subjects were recruited from the Los Angeles Behavioral Economics Laboratory (LABEL) pool, which consists of over 2,500 USC students, and sessions were conducted at LABEL, in the Department of Economics at the University of Southern California. In order to match gender, we recruited 50 YA female USC students, age 18-34.[11] All subjects were compensated with a fixed amount of $20 plus an incentive payment (described below).

[7] OASIS is a non-profit organization active in 25 states. Its mission is to promote successful aging by disseminating knowledge and offering classes and volunteering opportunities to its members. Recruitment is mostly word-of-mouth, with existing members referring new members. More information can be found at http://www.oasisnet.org.

[8] The overwhelming majority of OASIS members are female (88%), which explains the extreme gender selection in our sample but also raises some concerns about self-selection. Besedes et al. (2012b) also report a larger fraction of female participation (75%), although the difference is not as extreme as ours.

[9] The distribution of their highest educational attainment is: PhD (4%), MA (22%), Professional degree (2%), BA (29%), AA (11%), some college credit (26%), and trade/technical/vocational school (4%). This is representative of the OASIS members and substantially above national averages. It is not surprising that an organization dedicated to the sharing of knowledge and promotion of research-based programs attracts individuals with above-average levels of education and intellectual curiosity.

[10] All the YA in our sample are USC students. Based on national average statistics (reported by U.S. News in 2009), we expect that 26% of undergraduates will pursue a graduate degree. Therefore, the education of our OA is comparable to the final education that can be expected for our YA.

[11] For more information about the laboratory, see http://dornsife.usc.edu/label. We had 45 undergraduate and 5 master's students in our YA sample from all 4 disciplines: Arts and Humanities (10%), Natural Sciences (30%), Social Sciences (50%), and Technical Sciences (10%).

As discussed in the introduction, the potential selection problem limits our ability to make causal inferences. Experimental evidence regarding differences in behavior between our YA and OA does not prove a causal effect of age. In section 1.5 we review different channels through which differences in behavior between age groups could arise and address how plausible these alternative explanations are in light of the specific differences we obtain in our study. In particular, and as developed below, we find a clear differential effect across task complexity, which is consistent with age-related changes. This finding seems inconsistent with the alternative channels we consider, as we would not expect them to discriminate across task complexity.[12]

[12] For example, if the self-selected OA had a lower opportunity cost of time than the general OA population, and if opportunity cost of time affects decision consistency, then we would expect the effect to be present for both the simple and complex versions of the GARP task.

GARP task. Each subject participated in 140 core trials with five goods (1, 2, 3, 4, 5). In each core trial, subjects chose between two bundles, each composed of two goods, and were not allowed to express indifference. There were 35 trials of the simple treatment S, where the same two goods (1, 2) appeared in both bundles ($a_{12}$ vs. $a'_{12}$, $b_{12}$ vs. $b'_{12}$, etc.). There were also three sets of 35 trials of the complex treatment C, where each bundle had one common good and one unique good for a total of three goods (3, 4, 5) in each trial. These three sets of 35 trials were identical up to a permutation of the identity of the common good: good 3 ($a_{34}$ vs. $a'_{35}$), good 4 ($a_{34}$ vs. $a'_{45}$) and good 5 ($a_{35}$ vs. $a'_{45}$).
Importantly, quantities in each bundle were chosen to maximize the chances of satisfying condition (i) in Definitions 1, 2 and 3: for each trial, we chose one bundle that dominated a bundle in as many other trials as possible and a second bundle that was dominated by a bundle in as many other trials as possible. There are two reasons for this choice. First, to give as many chances as possible to observe $D_S$-, $D_C$-, and $I_C$-violations if subjects were inconsistent. Second, to minimize the chances of violations that are not identified in our analysis (e.g., direct violations between triplets of trials that are not captured with pairs of trials, as explained in the second remark of section 1.2.1 and Appendix A1).

Figure 2 depicts the 35 trials in treatment S. The x- and y-axes represent the quantities of the two goods. Each point represents a bundle of some quantity of good x and some quantity of good y. Each segment corresponds to one trial in which the two bundles it connects were offered against one another. For example, the bold red segment represents a trial in which a bundle of 1x and 5y was offered against a bundle of 2x and 2y. We used the same quantities for trials in treatment C, in order to facilitate the comparison of violations across treatments. The only difference is that bundles have only one good in common.[13]

[13] If we were to depict it, we would use a three-dimensional graph. Each bundle would then have positive quantities of exactly two goods.

Figure 2: The 35 trials in treatment S.

Finally, we added 10 trivial trials to check for the attentiveness of subjects (treatment A). In these trials, subjects chose between different quantities of the same good ($q_x$ vs. $q'_x$). Including trivial trials is typical in psychology experiments (under the misleading terminology of "catch trials") but less common in economics, which assumes that incentive payments ensure attentiveness.
For subjects who failed to choose the higher quantity option in treatment A, our design is not intended to (and therefore cannot) distinguish between inattention, satiation, disliking, or miscomprehension of the task, although our procedures were intended to minimize all four possibilities, as described below. Either way, such violations would call into question the reliability and interpretability of that subject's choices in treatments S and C. All subjects faced the 150 trials, which were presented in a randomized and counterbalanced order. Table 1 summarizes this information.

Treatment   Goods              # of trials
S           (1,2) vs. (1,2)    35
C           (3,4) vs. (3,5)    35
C           (3,4) vs. (4,5)    35
C           (3,5) vs. (4,5)    35
A           (1) vs. (1)        10
Total                          150

Table 1: Summary of treatments

A major concern in experiments on revealed preferences is the choice of goods. Following some of the recent literature on revealed preferences and value elicitation (Harbaugh et al., 2001; Hare et al., 2009; Rangel and Clithero, 2013), we opted for food items. We presented subjects with 21 popular salty and sweet snacks and we asked each subject to pick five of them for consumption: two were then randomly used in treatment S and the other three in treatment C. Therefore, each subject completed the task with a personalized set of snacks. Each portion was small (for example, one portion consisted of "two pistachios"), ensuring that the maximum quantity offered of each good was substantially below satiation level.[14] Subjects were instructed not to eat or drink anything except for water for a period of at least three hours prior to the experiment, and all sessions were conducted between 10am and 2:30pm to ensure that subjects were hungry.

Figure 3 presents a sample screenshot of a trial in treatment C. In this example, the subject had to choose between a bundle of 5 portions of chips plus 1 portion of peanuts and a bundle of 4 portions of chips plus 2 portions of pretzels.
At the end of the experiment, one trial was randomly selected for each subject, and the subject's choice in that trial was given to them to consume. Subjects were kept in the experimental room for 15 minutes following the end of the experiment. This was to ensure that all the foods would be fully consumed, that they would be consumed by the intended subject, and that they would not be consumed in combination with foods other than those in each subject's bundle. An advantage of using food items is that subjects cannot trade goods at the end of the experiment. Every subject complied with the procedure.

[14] We made sure that all five selected items were desirable. To address the issue of complementarity or substitutability of goods, we also made sure that subjects understood they might have to consume a combination of two items at the end of the experiment. Appendix A2 presents the list of food items and quantities per portion.

Figure 3: Screenshot of one trial in treatment C

Working memory and Raven's IQ tests. After the GARP task, subjects performed a spatial working memory test and an IQ test. To measure working memory, we used the computerized Spatial Working Memory test (WM) developed by Lewandowsky et al. (2010). This test measures the capacity of individuals to store and retrieve information in short-term memory. It runs as follows. The individual observes a 10×10 grid. A trial consists of a sequence of 2 to 6 dots that appear in different cells of the grid for 0.9 seconds each, with 0.1 seconds between dots. After the final dot in a sequence disappears, the subject attempts to tap the cells where the dots appeared, in any order. The score decreases with the distance between the correct and the selected cells. The entire test consists of 32 such trials, including 2 practice trials. No feedback is given between trials or upon completion of the test.
For IQ, we used the short version of Raven's IQ test, namely Set I of Raven's Advanced Progressive Matrices (APM) as developed by Raven et al. (1998). This set consists of 12 non-verbal multiple choice questions that become progressively more difficult. For each question, there is a pattern with a missing element. From the eight choices below the pattern, the subject is to identify the piece that will complete the pattern. As Set I is typically used as a screening tool for Set II of the APM, the test provides a rough measure of IQ. Instructions for the test were read directly from the script provided with the test. The test was administered in the intended format (paper) and was not timed. Subjects were made familiar with the format of the test and the method of thought required through two practice problems preceding the test. During this time, they were allowed to ask the experimenters questions. The test started only after all subjects had affirmed their understanding of the instructions.

Questionnaire. Following completion of the tests, subjects were asked to complete a questionnaire, adapted from one used by the Emotion and Cognition Lab at USC. It includes questions about their highest diploma, occupation, income, ethnicity, various stress rankings and health levels, as well as information on current medications.

Summary. From a design viewpoint, there are two new elements relative to the existing experimental tests of revealed preferences. First, we study choice across ages and choices across task complexity but, most importantly, we study the interaction between the two. Second, we correlate choices with measures of memory and fluid intelligence, to better understand the source of differences in consistency over the life cycle. From a methodological viewpoint, there are also two novelties. First, we add trivial tasks.
This allows us to differentiate subjects who violate consistency because they violate one of the premises of the model (for example, through inattention, satiation, disliking or miscomprehension) from those who violate consistency even though they are likely to satisfy all those premises. Second, each trial has only two possible choices. This is obviously less rich than the traditional setting, where a large number (or even a continuum) of options are presented. However, it allows us to focus on a simpler choice problem with an easy graphical depiction so that we can conduct a large number of trials in a relatively short period of time.¹⁵ A sample copy of the instructions can be found in the Appendix.

¹⁵ Our design contrasts with some recent experimental literature in other domains (risk, time) where it is shown that convexifying the budget set helps in obtaining accurate estimates (Andreoni and Sprenger (2012a,b)). Choi et al. (2007) also perform many trials thanks to their ingenious software presentation, although the decision problem in their setting is substantially more complex.

1.4 Analysis

1.4.1 Frequency of violations

Our first and central objective is to assess choice consistency across populations (YA vs. OA) and treatments (simple vs. complex). Comparisons across treatments are only possible for direct violations since the metric is radically different between direct and indirect violations. To give an idea, for each set of 35 trials there are $(35 \times 34)/2 = 595$ pairs of trials, of which 170 can potentially result in direct violations. Therefore, there is a total of 170 possible $D_S$-violations and 510 possible $D_C$-violations. By contrast, of the $35^3 = 42{,}875$ triplets of trials in treatment C, only 188 can result in an $I_C$-violation. This means that at most 28.6% of pairs of trials can result in direct violations but only 0.4% of triplets of trials can result in indirect violations.¹⁶
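The counts in this paragraph can be reproduced with a few lines of arithmetic. The sketch below is only a check on the numbers quoted above: the figures 170 and 188 are taken from the text rather than derived, the variable names are ours, and reading $35^3$ as one trial drawn from each of the three sets of treatment C is our interpretation.

```python
from math import comb

TRIALS_PER_SET = 35              # trials in each set of goods (Table 1)
DIRECT_CAPABLE_PAIRS = 170       # pairs per set that can yield a direct violation (quoted, not derived)
INDIRECT_CAPABLE_TRIPLETS = 188  # triplets in C that can yield an indirect violation (quoted, not derived)
SETS_IN_S, SETS_IN_C = 1, 3      # treatment S uses one set of 35 trials, treatment C uses three

# Unordered pairs of trials within one set: 35 * 34 / 2.
pairs_per_set = comb(TRIALS_PER_SET, 2)

# Triplets formed by taking one trial from each of the three sets in C (assumed reading of 35^3).
triplets_in_c = TRIALS_PER_SET ** 3

possible_ds = DIRECT_CAPABLE_PAIRS * SETS_IN_S
possible_dc = DIRECT_CAPABLE_PAIRS * SETS_IN_C

share_direct = DIRECT_CAPABLE_PAIRS / pairs_per_set
share_indirect = INDIRECT_CAPABLE_TRIPLETS / triplets_in_c

print(pairs_per_set, triplets_in_c, possible_ds, possible_dc)
print(f"{share_direct:.1%} of pairs, {share_indirect:.1%} of triplets")
```

The two shares differ by almost two orders of magnitude, which is why raw counts of direct and indirect violations are not compared directly.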
¹⁶ Recall the second remark in section 1.2.1, stating that choices which induce some violations may preclude some others. As such, 170 and 188 are upper limits on the number of effectively feasible direct violations of pairs ($D_S$, $D_C$) and indirect violations of triplets ($I_C$), respectively.

A more informative measure to assess the extent of violations across treatments is to compare them with the number of violations incurred by a simulated subject choosing randomly between bundles. Figure 4 presents the cumulative distribution function (c.d.f.) of the number of realized $D_S$-violations in each population (OA, YA) for treatment S (left) and the total number of realized $D_C$- and $I_C$-violations in each population (OA, YA) for treatment C (right). It also presents the c.d.f. of violations when the decisions in each set of 35 trials are simulated 100,000 times using a random choice rule.

Figure 4: Number of violations in treatments S (left) and C (right)

In treatment S, a significant fraction of subjects have no violations (66% of YA and 42% of OA). This fraction shrinks substantially in treatment C (34% of YA and 11% of OA). To quantify the extent of violations, we can use the random choice distribution. According to our simulation, there is a 10% chance that a subject choosing randomly will incur fewer than 23 violations in treatment S and fewer than 105 in treatment C. Using these numbers as a benchmark, we get instead that 88% of our YA and 84% of our OA incur fewer than 23 violations in treatment S, and 94% of our YA and 80% of our OA incur fewer than 105 violations in treatment C. Therefore, in line with previous studies (Battalio et al. (1973), Cox (1997), Sippel (1997), Harbaugh et al. (2001), and others), the majority of our subjects incur relatively few violations.

Perhaps more interestingly, we can compare violations across age groups. YA in our sample incur fewer violations than OA in our sample and differences are more pronounced in treatment C than in treatment S. More precisely, non-parametric Kolmogorov-Smirnov (KS) and Wilcoxon Rank Sum (WRS) tests of comparisons of c.d.f. establish marginal differences of distributions in treatment S (p-value = .110 and .043, respectively) and strong differences of distributions in treatment C (p-value = .001 and < .001, respectively).¹⁷ As can be seen from the graph, the difference in treatment S is mostly driven by the higher fraction of subjects with 0 violations in the YA population.

Figure 4 also highlights the usefulness of the random choice benchmark: even if in both treatments the empirical distributions of violations by YA and OA are significantly smaller than if they were generated by a random choice process, the difference between empirical (YA or OA) and random distributions is more pronounced in S than in C for both populations. This is consistent with the hypothesis that treatment C is more difficult to comprehend and therefore likely to generate relatively more mistakes than treatment S. In this respect, it is particularly interesting to notice the behavior in the tail of the distribution: the 16% of OA who commit the most mistakes in treatment C perform worse than the 16% of subjects who would commit the most mistakes if they all behaved randomly. As we will see later on, these are subjects who are likely to violate some assumption of the model.

Overall, we find that treatment C is more difficult than treatment S and generates relatively more mistakes in both populations. Also, the results in this section are consistent with a strong age effect on the number of violations in the complex task and a weak or no age effect in the simple task.

It is also interesting to distinguish between direct and indirect violations in treatment C, especially since $D_C$-violations are of similar (though not identical) nature to the $D_S$-violations presented in the left graph of Figure 4.
Figure 5 separates violations in treatment C into direct ($D_C$) and indirect ($I_C$) for each population. As before, it also represents the distribution of violations under a random choice rule. According to KS and WRS tests, differences in the distributions between YA and OA in treatment C are substantially more pronounced for direct violations (p-value < .001 for both) than for indirect violations (p-value = .187 and .022). The difference is mainly driven by the fact that a relatively high fraction of subjects in both populations (84% of YA and 64% of OA) do not incur any indirect violation. It also suggests that treatment C is cognitively more demanding even when we look only at direct violations. Hence, it is the difficulty of having to compute and keep track of the value of a third good which makes the comparison of two-good bundles more challenging and not so much the added possibility of a different type of intransitivity through indirect violations.

¹⁷ As is well known, KS is sensitive to any difference in distributions (shape, spread, median, etc.) whereas WRS is mostly sensitive to changes in the median. In an attempt to remain agnostic about which test is more appropriate for our sample, we will report results for both tests in all of our comparisons of distributions.

Figure 5: Direct (left) and Indirect (right) violations in treatment C

1.4.2 Severity of violations

So far, we have focused on the number of violations. However, not all violations are equally important. Indeed, as emphasized by Afriat's (1967) efficiency index and further developed by Varian (1990) and more recently by Echenique et al. (2011) and Dean & Martin (2014) among others, one should also take into account the severity of violations. Populations may differ in frequency of violations but not in severity and vice-versa. There are several ways to study severity.
One possibility is to consider a severity index that captures the intuition that, if the differences in quantities between bundles are small, the violation is less severe than if the differences are large.¹⁸ Recall from Definitions 1, 2, and 3 that the condition for a GARP violation to occur between a pair (direct) or a triplet (indirect) of trials is that for both trials, the chosen bundle must have less quantity of both goods than the bundle not chosen in the other trial. This is independent of how much smaller these quantities are. For example (and, again, assuming monotonic preferences) if my choices reveal a preference for (1,1) over (2,2), the violation is less acute than if they reveal a preference for (1,1) over (5,5). This is consistent with the theory behind stochastic choices, whereby rational individuals are more likely to incur smaller mistakes than bigger ones.

¹⁸ Contrary to the previously mentioned papers, our goal here is not to develop a new measure of severity in violations but, instead, to use a simple way to quantify their extent.

To formalize this idea of severity, we take each pair (triplet) of trials involved in a direct (indirect) violation, and measure the Euclidean distance between the amounts in the chosen bundles and the amounts in the bundles that have weakly more quantity of both goods and were not chosen. We then take the minimum of these distances, which we call d. This value captures the minimum quantity by which we should change one of the choices of the individual in order to remove the violation. It can also be interpreted as the magnitude of the "mistake" incurred by not choosing the bundles with more quantities of both goods. To illustrate the concept, consider the case of a $D_S$-violation described in Figure 1. If the individual commits a violation (that is, selects $a_{12}$ and $b_{12}$), the severity is given by $d = \min\{d(a_{12}, b'_{12}),\, d(b_{12}, a'_{12})\}$.
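As a concrete illustration of the formula for a $D_S$-violation, the following minimal sketch computes d for one pair of trials. The bundles are hypothetical and chosen only for illustration, the function names are ours, and we assume the weak-dominance reading of the violation condition (the bundle not chosen in the other trial has weakly more of both goods).

```python
from math import dist


def dominated_by(x, y):
    """Return True if bundle y has weakly more of every good than bundle x."""
    return all(xi <= yi for xi, yi in zip(x, y))


def direct_violation_severity(chosen_a, rejected_a, chosen_b, rejected_b):
    """Severity d for a pair of trials, or None if the pair is not a direct violation.

    Trial A offered (chosen_a, rejected_a); trial B offered (chosen_b, rejected_b).
    The pair is a violation when each chosen bundle is weakly dominated by the
    bundle rejected in the other trial.
    """
    if not (dominated_by(chosen_a, rejected_b) and dominated_by(chosen_b, rejected_a)):
        return None
    # Minimum Euclidean distance between a chosen bundle and the dominating rejected one.
    return min(dist(chosen_a, rejected_b), dist(chosen_b, rejected_a))


# Hypothetical bundles (quantities of goods 1 and 2), not taken from the experiment.
a, a_prime = (2, 3), (4, 2)   # trial A: subject picks a over a'
b, b_prime = (3, 1), (2, 4)   # trial B: subject picks b over b'

severity = direct_violation_severity(a, a_prime, b, b_prime)
print(severity)

# Reversing both choices removes the violation, so no severity is defined.
no_violation = direct_violation_severity(a_prime, a, b_prime, b)
print(no_violation)
```

In this example the smaller of the two distances comes from the pair $(a_{12}, b'_{12})$, which differ by a single unit of one good.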
Intuitively, if $a_{12}$ is very close to $b'_{12}$, it means that the error is small and reversing two very similar choices would remove the violation. For the case of $D_C$- and $I_C$-violations in treatment C, the severity is given by $d = \min\{d(a_{xy}, b'_{xy}),\, d(b_{xz}, a'_{xz})\}$ and by $d = \min\{d(a_{xy}, c'_{xy}),\, d(b_{xz}, a'_{xz}),\, d(c_{yz}, b'_{yz})\}$, respectively. Notice that the Euclidean distance is always taken between two bundles containing positive quantities of the same two goods.

Including all subjects in the analysis would exacerbate differences in severity between OA and YA since we know from section 1.4.1 that the fraction of perfectly consistent subjects (for whom d = 0) is larger in the younger population. To avoid this fictitious effect, we include in the analysis only subjects with a positive number of violations and compute the average severity of the choices that are inconsistent for that subject (not of all choices). Figure 6 presents the c.d.f. of this severity index by population and treatment.

Figure 6: Severity of violations in treatments S (left) and C (right)

Given the bundles proposed in the experiment, the range of d is relatively small: between 1.0 and 3.0 in treatment S and between 1.0 and 2.0 in treatment C. If anything, this will bias the results against finding differences across treatments. With this in mind, we can see from the graph that some subjects commit only the minimal possible violations (d = 1.0) whereas others incur more severe ones (d = 1.5 on average). In treatment S the distribution of severity of violations is not significantly different across populations (p-value = .881 and .858 for KS and WRS tests). By contrast, in treatment C violations are significantly more severe for OA than for YA (p-value = .049 and .012 for KS and WRS tests).¹⁹

An alternative measure of severity of violations consists of finding, for each individual, the minimum number of trials that need to be removed in order to suppress all violations for that individual.
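To make this alternative measure concrete, the sketch below finds the minimum number of trials to remove by brute force. It is an illustration under our own assumptions and not necessarily the procedure used for the results reported here: each violation is represented as a hypothetical set of trial indices (a pair for a direct violation, a triplet for an indirect one), the function name is ours, and the exhaustive search is practical only when few trials are involved.

```python
from itertools import combinations


def min_trials_to_remove(violations):
    """Smallest number of trials whose removal breaks every violation.

    `violations` is an iterable of collections of trial indices; a violation
    disappears as soon as any one of its trials is removed.
    """
    violations = [frozenset(v) for v in violations]
    if not violations:
        return 0
    involved = sorted(set().union(*violations))
    # Try removal sets of increasing size until one hits every violation.
    for k in range(1, len(involved) + 1):
        for removed in combinations(involved, k):
            removed = set(removed)
            if all(v & removed for v in violations):
                return k
    return len(involved)


# Hypothetical subject with one outlier choice (trial 0) involved in three direct violations.
outlier_case = min_trials_to_remove([{0, 1}, {0, 2}, {0, 3}])

# Hypothetical subject whose violations share no trial, plus one indirect violation.
spread_case = min_trials_to_remove([{0, 1}, {2, 3}, {4, 5, 6}])

print(outlier_case, spread_case)
```

The first case shows how a single outlier trial can account for several violations at once, while in the second case every violation needs its own removal.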
Subjects with more violations are likely to require the elimination of more trials to achieve consistency. At the same time, if a subject makes one outlier choice, he may exhibit many inconsistencies that are "cleaned up" when that single trial is removed.²⁰ As before, we exclude the individuals with no violations to avoid artificially exacerbating differences between OA and YA. This means that the minimum number of trials to be removed is 1. Figure 7 presents the c.d.f. of the number of choices to be removed for perfect consistency by population and treatment.

Figure 7: Choices to remove for consistency in treatments S (left) and C (right)

This severity measure yields results similar to the previous one. The distribution of the number of choices that need to be removed to achieve consistency is not statistically different between populations in treatment S (p-value = .807 and .440 for KS and WRS tests) but the difference is highly significant in treatment C (p-value = .016 and .002 for KS and WRS tests). For example, in order to achieve consistency for two-thirds of the YA in treatment C, we only need to remove 3 trials whereas to achieve consistency for the same fraction of OA we need to remove 14 trials.

¹⁹ We performed the exact same analysis with the average amount (instead of the minimum amount) by which the choices of an individual should be changed in order to remove the violation. So, for example, in Figure 1 that would be $d' = [d(a_{12}, b'_{12}) + d(b_{12}, a'_{12})]/2$. The results were very similar and the treatment effect sharper than before: still no significant difference between OA and YA in treatment S (p-value = .727 and .820 for KS and WRS tests) and significantly more severe violations for OA than YA in treatment C (p-value = .023 and .004 for KS and WRS tests). The graphs are omitted for brevity.

²⁰ This is similar to the Houtman-Maks index (Houtman and Maks, 1985), a measure of severity often cited in the literature (e.g., in Choi et al. (2007) and Burghart et al. (2013)), and which is defined as the largest subset of all observed choices that does not include any cycles.

Taken together, the results in this section lend further support to our previous finding: OA in our experiment are marginally more inconsistent than YA in the simple treatment but substantially more inconsistent than YA in the complex treatment, both in terms of the number and the severity of violations.

1.4.3 Trivial trials

We next analyze the behavior in treatment A to see if the premises of our analysis (that subjects are attentive, understand the task, like each good and always prefer more to less) are satisfied. Figure 8 presents the number of violations incurred by YA and OA in the 10 trivial trials.

Figure 8: Number of violations in treatment A (trivial trials)

The results are highly surprising: we expected some mistakes but not quite as many as we got. In both populations there is a significant fraction of subjects who violate at least one trivial trial (28% of YA and 62% of OA). There are even 3 subjects who violate all 10 trivial trials. Violations are much more frequent among OA than among YA: both KS and WRS tests reject the hypothesis that the samples are drawn from the same cumulative distribution function (p-value = .002 and .001, respectively).

This is a severe problem and suggests that at least some of our subjects do not satisfy the assumptions of the model. Subjects who fail 9 or 10 trivial trials are very likely expressing a preference for less rather than more food, even though our protocol placed the strongest possible emphasis on having hungry subjects, desirable goods, and small portions.²¹ For subjects who fail 4 or 5 trivial trials, it is more difficult to disentangle inattentiveness, miscomprehension, and an interior optimal quantity. Either way, it calls into question the reliability and interpretability of the results on choice consistency.
More generally, our results raise a red flag on choice consistency experiments and strongly suggest the importance of including trivial trials in studies of consistency to test whether the assumptions of the model are satisfied by the experimental subjects.

²¹ One subject with 101 violations in S and 536 violations in C explicitly stated during the debriefing that she tried to minimize the quantity to consume.

A natural next step is to conduct the same study as in section 1.4.1 keeping only those subjects that we think satisfy the assumptions of our model. This substantially reduces the sample size, and asymmetrically so for YA and OA. Furthermore, it creates its own selection problem since the subsample is selected based on a variable (choices where less is preferred to more) which is linked to the dependent variable of the study. However, we still think it is a useful exercise, and we find it more satisfactory than ignoring the problem altogether. Below, we present the results when we restrict our attention to subjects who fail at most two trivial trials. We choose that number in order to exclude the subjects who unquestionably violate the premises of the model but, at the same time, to permit some mistakes and keep a reasonable sample size (44 YA and 30 OA). The choice of allowing two errors is admittedly ad hoc. Figure 9 is the analogue of Figure 4 for those individuals.

Figure 9: Choice violations by subjects with at most two treatment A violations

As expected, violations are significantly reduced when we consider only the subjects with at most two errors in the trivial trials, most notably in treatment C. This suggests that a non-negligible fraction of violations may be attributed to factors outside the objective of the study. On the other hand, the basic results of the previous analysis remain unaltered. As before, there are more violations by OA than by YA and the difference is more significant in the complex treatment than in the simple treatment.
Formally, KS and WRS tests show marginal differences in distributions in treatment S (p-value = .072 and .016, respectively) and highly significant differences in treatment C (p-value = .004 and .001, respectively).²²

²² Due to the ad hoc nature of allowing two errors in treatment A, we also performed the same analysis with the most conservative possible measure, which is to include only subjects with no errors in trivial trials. Violations decrease substantially and the sample size is dramatically reduced to 17 OA and 36 YA so the statistical power is limited. However, the treatment effect is similar to that of the entire population: KS and WRS tests show no significant differences in distributions in treatment S (p-value = .534 and .305, respectively) and show significant differences in treatment C (p-value = .060 and .022, respectively). Again, the graphs are omitted for brevity.

1.5 Understanding violations

An obvious reason why an individual might commit violations is that her preferences do not satisfy the main GARP assumptions. Given the behavior of subjects in treatment A, there might be a non-negligible fraction of those individuals. Since the interpretation of the results is radically different for those subjects, we investigate below the determinants of consistency using the entire population and also using the subsample of subjects for whom we are most confident that the model is appropriate (those who fail at most two trivial trials).

A main hypothesis of our experiment is that OA will commit more violations than YA due in part to the cognitive difficulty of storing information regarding the attributes of the goods. Working memory is the ability to store information for immediate processing (Baddeley and Hitch, 1974; Baddeley, 1992). Subjects with low working memory and high working memory perform similarly in simple discrimination or detection tasks, but in complex tasks, working memory predicts task performance (Cerella et al., 1980; Gick et al., 1988). If non-consistent choices are a result of the subject's inability to simultaneously maintain a representation of many values, then GARP violations are expected to be more pronounced in treatment C (where more item values must be held in mind) than in treatment S. To investigate this hypothesis, we study scores in the spatial working memory test performed in the experiment. Performance in the working memory test is higher for YA (mean = 203, st. error = 3) than for OA (mean = 152, st. error = 1.73) and the difference is highly significant (p-value < .001).²³ This is consistent with many previous findings (see e.g., Salthouse and Babcock, 1991; Park et al., 2002). A regression between working memory scores and a group dummy shows that the two are highly correlated both when we consider the full sample (p-value < .001, Adj. $R^2$ = 0.71) and when we restrict attention to subjects with at most two violations in treatment A (p-value < .001, Adj. $R^2$ = 0.69).

²³ Due to software malfunction, 2 OA did not complete the working memory test and are excluded from all following analyses which include the measure.

Another candidate to explain differences in consistency across age groups is IQ. General intelligence has two main components: fluid intelligence, which is our reasoning and problem solving ability, and crystallized intelligence, our ability to use skills, knowledge and experience. Intuitively, when a subject is asked to choose between two bundles, her objective is to accurately represent her true preferences and act accordingly. This task requires a certain level of reasoning about true values, which may rely on fluid intelligence. To test this hypothesis, we can use the answers to the Raven's IQ test, which is designed to measure fluid intelligence. Performance in Raven's IQ test is again higher for YA than for OA both for the full sample (11.44 vs. 8.16) and for subjects with at most two violations in treatment A (11.39 vs. 8.77) and the differences are highly significant (p-value < .001 for both). This is not surprising. Indeed, the consensus is that fluid intelligence declines with age after early adulthood, while crystallized intelligence remains intact (Horn and Cattell, 1967; Kaufman and Horn, 1996). Given that Raven's test measures fluid intelligence, OA are expected to perform worse.

Having established that working memory (WM) and fluid intelligence (IQ) are lower for older subjects, we now study the correlation between these two measures and the number of violations in treatment S (Viol-S) as well as the total number of direct and indirect violations in treatment C (Viol-C).²⁴ The results are presented in Table 2 for the entire sample (left) and the subsample of subjects who fail at most two trivial trials (right).

All subjects
          Viol-S   Viol-C   WM
Viol-C     0.56
WM        -0.10    -0.23
IQ        -0.01    -0.22    0.68

Subjects with 2 or fewer violations in A
          Viol-S   Viol-C   WM
Viol-C     0.12
WM        -0.06    -0.26
IQ        -0.06    -0.29    0.65

*, **, ***: significant at 5%, 1%, 0.1% level

Table 2: Pearson correlations of memory, intelligence and GARP violations.

²⁴ We also conducted the analysis for direct and indirect violations separately and found qualitatively similar results. It is also worth noting that $D_C$ and $I_C$ are strongly correlated (Pearson correlation = 0.84).

Except for the correlation between violations in both treatments, the results are very similar whether we consider the entire sample or only the subjects who fail at most two trivial trials. We find no relationship between the number of violations in treatment S and performance in the working memory or IQ tests. By contrast, violations in treatment C are negatively correlated with both working memory and IQ scores.
The findings related to working memory are consistent with the hypothesis that subjects use a decision-making process that requires them to encode the value of items. They are not consistent with interpretations that subjects are attending to only the count of items or attending to only a single element of the options. The findings related to IQ suggest that fluid intelligence is heavily involved in choice processing only for the most complex tasks. It should be noted however that working memory and fluid intelligence are very strongly correlated. This is in line with previous studies (see e.g., Engle et al., 1999) and reflects the fact that both working memory and fluid intelligence can be traced to the same brain systems (Prabhakaran et al., 1997; Kane and Engle, 2002; Gray et al., 2003; Olesen et al., 2004; Geary, 2005; Jaeggi et al., 2008).²⁵

To further investigate the relationship between violations in the complex treatment and performance in working memory and IQ tests, we conduct a set of ordinary least squares (OLS) regressions where the dependent variable is the number of violations in C. Explanatory variables include the variables presented above (violations in S, working memory scores, IQ scores) as well as a Younger Adult dummy (YA-d) and household income (Income). The results are presented in Table 3. After controlling for violations in S, working memory, IQ, and age group each have significant explanatory power for consistency in C (regressions 1-3), but income does not (regression 4).²⁶ The similarities in significance of the regressions are not surprising since we know from the previous analysis that WM and IQ are highly correlated and age is a strong predictor of performance in those tests.²⁷ When violations in treatment C are regressed on all of the variables (regression 5), the coefficients on working memory scores, IQ scores, and the YA dummy lose significance as these are highly correlated.
Finally, the results are very similar when the variable Viol-S is excluded (regressions 6-10). However, the adjusted $R^2$ values are drastically lower, indicating a worse fit.

²⁵ Given the age heterogeneity in our OA population, we performed a within-sample analysis. We found that the correlations between violations in C and scores in working memory and IQ tests keep the same sign but lose significance, in part due to the lower number of observations (data omitted for brevity).

²⁶ We should notice however a further selection problem since not all individuals reported the income of their household. Comparisons are also difficult since for most YA income refers to that of their parents whereas for OA it is their own and their spouse's.

²⁷ A principal component analysis on WM and IQ suggests that working memory data contains the largest fraction of the relevant information: the first component is mostly driven by working memory score and explains 70% of the data.

              (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)     (9)    (10)
Const.        142     118    50.5    38.5    44.9     204     148    79.1    85.6     197
             (57)    (37)    (14)    (29)   (123)    (68)    (44)    (16)    (35)   (155)
Viol-S       2.45    2.48    2.46    2.60    2.74
           (0.37)  (0.37)  (0.37)  (0.43)  (0.42)
WM          -0.66                            0.43   -0.85                           -0.44
           (0.31)                          (0.82)  (0.38)                           (1.0)
IQ                  -9.33                   -9.72           -9.60                   -6.21
                    (3.5)                   (5.9)           (4.3)                   (7.6)
YA-d                        -46.2           -43.3                   -50.2           -2.15
                             (18)            (49)                    (22)            (62)
Income                              -3.05    5.91                           -7.39    0.05
                                    (7.5)   (8.2)                           (9.2)    (11)
Adj. R2      0.35    0.35    0.35    0.34    0.39    0.04    0.04    0.04   -0.01   -0.01
Obs.           93      95      95      72      70      93      95      95      72      70

(standard errors in parentheses); *, **, *** = significant at 5%, 1% and 0.1% level

Table 3: Ordinary least squares (OLS) regression of number of violations in treatment C (all subjects)

For robustness, we then perform the same regressions with the subset of subjects who failed two or fewer trials in treatment A. The results are presented in Table 4.
While violations in treatment S are no longer a significant predictor for violations in treatment C, working memory scores, IQ scores, and age group are still highly significant in their explanatory power (regressions 1-3) and income is still not (regression 4). Similar qualitative conclusions are obtained when we exclude violations in treatment S (regressions 6-10).

              (1)     (2)     (3)     (4)     (5)     (6)     (7)     (8)     (9)    (10)
Const.       56.6    48.3    23.7    13.1    25.7    58.5    49.9    24.7    15.5    33.2
             (19)    (14)   (4.5)   (9.5)    (41)    (19)    (14)   (4.3)   (9.0)    (40)
Viol-S       0.18    0.17    0.17    0.20    0.26
            (0.2)   (0.2)   (0.2)   (0.3)   (0.2)
WM          -0.23                            0.12   -0.24                            0.09
            (0.1)                           (0.3)   (0.1)                           (0.3)
IQ                  -3.37                   -3.85           -3.45                   -3.83
                    (1.4)                   (1.9)           (1.4)                   (1.9)
YA-d                        -17.1           -21.4                   -17.4           -19.4
                            (5.6)            (14)                   (5.6)            (14)
Income                               0.10    4.49                           -0.32    3.87
                                    (2.4)   (2.4)                           (2.3)   (2.3)
Adj. R2      0.05    0.07    0.10   -0.03    0.17    0.06    0.07    0.11   -0.02    0.17
Obs.           74      74      74      53      53      74      74      74      53      53

(standard errors in parentheses); *, **, *** = significant at 5%, 1% and 0.1% level

Table 4: OLS regression of number of violations in treatment C (subjects with 2 or fewer violations in treatment A)

Overall, the results are consistent with a cognitive decline theory of behavioral differences between YA and OA. Our findings are suggestive of the following process: GARP consistency is mediated by the brain structures involved in working memory and fluid intelligence, both of which are affected by aging. When the environment is simple, the cognitive demands are limited so subjects with a low working memory and fluid intelligence (typically, but not exclusively, OA) can still perform the necessary reasoning. By contrast, when the environment is more complex, the capacity of a subject to store and retrieve information as well as to perform logical reasoning is reflected in the level of consistency of her choices.

Next, we examine the responses obtained in our questionnaire. We find that OA self-report a lower stress level compared to YA (p-value < .001) and that reported stress correlates with working memory scores (Pearson correlation = .29, p-value = .006). Interestingly, self-reported health rankings are similar across groups and uncorrelated with any relevant element of our analysis.

Last, we check for differences across ethnic groups. We first note that our OA population is mostly composed of White and African American subjects while our YA population is composed of White and Asian subjects. Working memory scores, IQ scores, and violation counts across White OA and African American OA are not statistically different. The same applies for the comparison between White YA and Asian YA.

Finally, there are inevitably some unobservable factors that may have different effects on subjects of different ages. One such factor is fatigue. The case could be made that fatigue affects OA more severely than it affects YA, leading to disparate levels of consistency. There is a stream of psychological research that is relevant to the question of whether older adults are more susceptible to fatigue. This research is on the phenomenon of "ego-depletion" (the impairment of decision-making immediately following a task requiring thoughtfulness or self control). If a subject is highly susceptible to ego-depletion then the quality of their decisions worsens as an experiment progresses. Research shows that older adults are less susceptible to ego-depletion than are younger adults (Dahm et al., 2011), so the main effect of ego-depletion in our experiment should go in the opposite direction.²⁸ Other possible factors include differences in the opportunity cost of time, sensitivity to hunger, cognitive skills or experience in individual decision making. Unfortunately, it is not feasible to rule out all these factors, given our design.
While we agree that these factors call for caution in interpreting our findings, we do not feel that any of them has a clear and unambiguous differential effect on our populations.[29] Perhaps more importantly, if either age group were to be more susceptible to any of these factors, it would be reasonable to believe that their performance would be affected in treatments S and C alike.

1.6 Conclusion

In this paper we have studied choice consistency of younger and older adults in simple and complex domains. We have highlighted several differences in behavior across our two populations. Our older adults are less consistent than our younger adults, both in terms of the number and severity of violations, especially when the choice task is complex. Also, we can trace the differences in consistency in the complex task to deficiencies of working memory, that is, in the ability to store and retrieve information regarding the value of the different items in bundles. The individual analysis (see the appendix) suggests that consistency across ages is similar in the simple task partly because the older adults in our sample have preferences consistent with a simple rule (typically, to maximize the quantity of the preferred item in the bundle) that can be easily implemented without errors, or on-line reference to subjective value. An important question for future study is whether these preferences are intrinsic to subjects or reflect a second-best strategy employed by individuals who are aware of their compromised working memory and fluid intelligence.

Finally, the importance of working memory in ensuring choice consistency is a key result of the paper with fundamental medical and policy implications.
Our experimental design, characterized by two bundles presented on a screen, a left-right choice, and the possibility of multiple repetitions (see Figure 3), is suitable to be implemented in the scanner. In future research, we plan to use fMRI techniques to study the neural correlates of choice consistency. We already know that simple choices between items involve the ventromedial prefrontal cortex (Hare et al., 2008; Hare et al., 2009), which represents the value difference between options. Our objective is to study how the working memory system (which involves the dorsolateral prefrontal cortex) and the ventromedial prefrontal cortex interact to produce consistent choices, and why this interaction differs across ages.

[28] When violations per trial are regressed on their order of appearance, a Younger Adult dummy, and the interaction of these variables, order is a significant predictor of violations but the interaction of order and age dummy is not. This suggests that fatigue may affect consistency but not differently across age groups.

[29] For example, OA are likely to have a lower opportunity cost of time, but this may very well increase consistency by, other things equal, inducing them to take more time, effort and care in their decision making. Cognitive skills help consistency but this is affected by education and our OA are at least as educated as our YA. Finally, YA in our sample are possibly more experienced with experimental tasks but it is unlikely that it eclipses the additional decades of decision-making experience held by the OA.

1.7 References

Afriat, S. N. (1967). The Construction of Utility Functions from Expenditure Data. International Economic Review, 8(1), 67-77.
Albert, S. M., and Duffy, J. (2012). Differences in risk aversion between young and older adults. Neuroscience and Neuroeconomics, 1, 3-9.
Ameriks, J., Caplin, A., Leahy, J., and Tyler, T. (2007). Measuring Self-Control Problems. The American Economic Review, 97(3), 966-972.
Andreoni, J., and Harbaugh, W. (2009). Unexpected utility: Experimental tests of five key questions about preferences over risk. Mimeo, University of Oregon.
Andreoni, J., and Miller, J. (2002). Giving according to GARP: An experimental test of the consistency of preferences for altruism. Econometrica, 70(2), 737-753.
Andreoni, J., and Sprenger, C. (2012a). Estimating time preferences from convex budgets. The American Economic Review, 102(7), 3333-3356.
Andreoni, J., and Sprenger, C. (2012b). Risk preferences are not time preferences. The American Economic Review, 102(7), 3357-3376.
Baddeley, A. (1992). Working Memory. Science, 255(5044), 556-559.
Baddeley, A., and Hitch, G. (1974). Working Memory. In G. H. Bower (Ed.), The Psychology of Learning and Motivation (47-89). Academic Press.
Baker, S. C., Frith, C. D., Frackowiak, S. J., and Dolan, R. J. (1996). Active representation of shape and spatial location in man. Cerebral Cortex, 6(4), 612-619.
Battalio, R. C., Kagel, J. H., Winkler, R. C., Fisher, E. B., Basmann, R. L., and Krasner, L. (1973). A test of consumer demand theory using observations of individual consumer purchases. Economic Inquiry, 11(4), 411-428.
Bellemare, C., Kroger, S., and Van Soest, A. (2008). Measuring inequity aversion in a heterogeneous population using experimental decisions and subjective probabilities. Econometrica, 76(4), 815-839.
Bernheim, B. D., and Rangel, A. (2009). Beyond Revealed Preference: Choice-Theoretic Foundations for Behavioral Welfare Economics. The Quarterly Journal of Economics, 124(1), 51-104.
Besedes, T., Deck, C. A., Sarangi, S., and Shor, M. (2012a). Age effects and heuristics in decision making. Review of Economics and Statistics, 94(2), 580-595.
Besedes, T., Deck, C. A., Sarangi, S., and Shor, M. (2012b). Decision-making strategies and performance among seniors. Journal of Economic Behavior & Organization, 81(2), 524-533.
Bradbury, H., and Nelson, T. M. (1974). Transitivity and the patterns of children's preferences. Developmental Psychology, 10(1), 55-64.
Brainard, D. (1997). The psychophysics toolbox. Spatial Vision, 10(6), 433-436.
Brand, M., and Markowitsch, H. J. (2010). Aging and decision-making: a neurocognitive perspective. Gerontology, 56(3), 319-324.
Braver, T. S., Cohen, J. D., Nystrom, L. E., Jonides, J., Smith, E. E., and Noll, D. C. (1997). A parametric study of prefrontal cortex involvement in human working memory. Neuroimage, 5(1), 49-62.
Bronars, S. G. (1987). The power of nonparametric tests of preference maximization. Econometrica, 55(3), 693-698.
Burghart, D. R., Glimcher, P. W., and Lazzaro, S. C. (2013). An expected utility maximizer walks into a bar... Journal of Risk and Uncertainty, 46(3), 215-246.
Cappelen, A. W., Kariv, S., Sorensen, E., and Tungodden, B. (2014). Is There a Development Gap in Rationality? Mimeo, Norwegian School of Economics.
Cappell, K. A., Gmeindl, L., and Reuter-Lorenz, P. A. (2010). Age Differences in Prefrontal Recruitment During Verbal Working Memory Maintenance Depend on Memory Load. Cortex, 46(4), 462-473.
Carlson, S., Martinkauppi, S., Rämä, P., Salli, E., Korvenoja, A., and Aronen, H. J. (1998). Distribution of cortical activation during visuospatial n-back tasks as revealed by functional magnetic resonance imaging. Cerebral Cortex, 8(8), 743-752.
Carstensen, L. L., and Mikels, J. A. (2005). At the intersection of emotion and cognition: aging and the positivity effect. Current Directions in Psychological Science, 14(3), 117-121.
Castillo, M., Dickinson, D. L., and Petrie, R. (2014). Sleepiness, Choice Consistency, and Risk Preferences. Mimeo, Institute for the Study of Labor (IZA).
Castle, E., Eisenberger, N. I., Seeman, T. E., Moons, W. G., Boggero, I. A., Grinblatt, M. S., and Taylor, S. E. (2012). Neural and behavioral bases of age differences in perceptions of trust. Proceedings of the National Academy of Sciences, 109(51), 20848-20852.
Cerella, J., Poon, L. W., and Williams, D. M. (1980). Age and the complexity hypothesis. Aging in the 1980s: Psychological issues, 332-340.
Charness, G., and Villeval, M. C. (2009). Cooperation and Competition in Intergenerational Experiments in the Field and the Laboratory. The American Economic Review, 99(3), 956-978.
Choi, S., Fisman, R., Gale, D., and Kariv, S. (2007). Consistency and heterogeneity of individual behavior under uncertainty. American Economic Review, 97(5), 1921-1938.
Choi, S., Kariv, S., Muller, W., and Silverman, D. (2014). Who Is (More) Rational? American Economic Review, 104(6), 1518-1550.
Christoff, K., Prabhakaran, V., Dorfman, J., Zhao, Z., Kroger, J. K., Holyoak, K. J., and Gabrieli, J. D. (2001). Rostrolateral prefrontal cortex involvement in relational integration during reasoning. Neuroimage, 14(5), 1136-1149.
Cohen, J. D., Perlstein, W. M., Braver, T. S., Nystrom, L. E., Noll, D. C., Jonides, J., and Smith, E. E. (1997). Temporal dynamics of brain activation during a working memory task. Nature, 386, 604-607.
Cox, J. C. (1997). On Testing the Utility Hypothesis. The Economic Journal, 107(443), 1054-1078.
Curtis, C. E., and D'Esposito, M. (2003). Persistent activity in the prefrontal cortex during working memory. TRENDS in Cognitive Sciences, 7(9), 415-423.
Dahm, T., Neshat-Doost, H. T., Golden, A. M., Horn, E., Hagger, M., and Dalgleish, T. (2011). Age shall not weary us: Deleterious effects of self-regulation depletion are specific to younger adults. PLoS ONE, 6(10), e26351.
Dean, M., and Martin, D. (2014). Measuring Rationality with the Minimum Cost of Revealed Preference Violations. Mimeo, Brown University.
Demb, J. B., Desmond, J. E., Wagner, A. D., Vaidya, C. J., Glover, G. H., and Gabrieli, J. D. (1995). Semantic encoding and retrieval in the left inferior prefrontal cortex: a functional MRI study of task difficulty and process specificity. Journal of Neuroscience, 15(9), 5870-5878.
Dror, I. E., Katona, M., and Mungur, K. (1998). Age differences in decision making: To take a risk or not? Gerontology, 44(2), 67-71.
Echenique, F., Lee, S., and Shum, M. (2011). The money pump as a measure of revealed preference violations. Journal of Political Economy, 119(6), 1201-1223.
Engel, C. (2011). Dictator games: a meta study. Experimental Economics, 14(4), 583-610.
Engle, R. W., Tuholski, S. W., Laughlin, J. E., and Conway, A. R. (1999). Working memory, short-term memory, and general fluid intelligence: a latent-variable approach. Journal of Experimental Psychology: General, 128(3), 309-331.
Fehr, E., Fischbacher, U., Von Rosenbladt, B., Schupp, J., and Wagner, G. G. (2003). A nation-wide laboratory: Examining trust and trustworthiness by integrating behavioral experiments into representative surveys. IZA Discussion paper series.
Fevrier, P., and Visser, M. (2004). A study of consumer behavior using laboratory data. Experimental Economics, 7(1), 93-114.
Finucane, M. L., Mertz, C. K., Slovic, P., and Schmidt, E. S. (2005). Task complexity and older adults' decision-making competence. Psychology and Aging, 20(1), 71-84.
Finucane, M. L., Slovic, P., Hibbard, J. H., Peters, E., Mertz, C. K., and MacGregor, D. G. (2002). Aging and decision-making competence: an analysis of comprehension and consistency skills in older versus younger adults considering health-plan options. Journal of Behavioral Decision Making, 15(2), 141-164.
Fisman, R., Kariv, S., and Markovits, D. (2007). Individual preferences for giving. American Economic Review, 97(5), 1858-1876.
Fraley, C., and Raftery, A. E. (2002). Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association, 97(458), 611-631.
Fraley, C., and Raftery, A. E. (2006). MCLUST version 3: an R package for normal mixture modeling and model-based clustering. Mimeo, University of Washington.
Geary, D. C. (2005). The Origin of Mind: Evolution of Brain, Cognition, and General Intelligence. American Psychological Association.
Gick, M. L., Craik, F. I., and Morris, R. G. (1988). Task complexity and age differences in working memory. Memory and Cognition, 16(4), 353-361.
Gould, R. L., Brown, R. G., Owen, A. M., Ffytche, D. H., and Howard, R. J. (2003). fMRI BOLD response to increasing task difficulty during paired associates learning. NeuroImage, 20(2), 1006-1019.
Grady, C. L., Springer, M., Hongwanishkul, D., McIntosh, A., and Winocur, G. (2006). Age-related changes in brain activity across the adult lifespan. Journal of Cognitive Neuroscience, 18(2), 227-241.
Gray, J. R., Chabris, C. F., and Braver, T. S. (2003). Neural mechanisms of general fluid intelligence. Nature Neuroscience, 6(3), 316-322.
Greene, J. D., Nystrom, L. E., Engell, A. D., Darley, J. M., and Cohen, J. D. (2004). The Neural Bases of Cognitive Conflict and Control in Moral Judgment. Neuron, 44(2), 389-400.
Harbaugh, W. T., Krause, K., and Berry, T. R. (2001). GARP for Kids: On the Development of Rational Choice Behavior. American Economic Review, 91(5), 1539-1545.
Hare, T., O'Doherty, J., Camerer, C., Schultz, W., and Rangel, A. (2008). Dissociating the Role of the Orbitofrontal Cortex and the Striatum in the Computation of Goal Values and Prediction Errors. The Journal of Neuroscience, 28(22), 5623-5630.
Hare, T. A., Camerer, C. F., and Rangel, A. (2009). Self-control in decision-making involves modulation of the vmPFC valuation system. Science, 324(5927), 646-648.
Harrison, G. W., Lau, M. I., and Williams, M. B. (2002). Estimating individual discount rates in Denmark: A field experiment. American Economic Review, 92(5), 1606-1617.
Henninger, D. E., Madden, D. J., and Huettel, S. A. (2010). Processing speed and memory mediate age-related differences in decision making. Psychology and Aging, 25(2), 262-270.
Horn, J. L., and Cattell, R. B. (1967). Age differences in fluid and crystallized intelligence. Acta Psychologica, 26, 107-129.
Houthakker, H. S. (1950). Revealed Preference and the Utility Function. Economica, 17, 159-174.
Houtman, M., and Maks, J. (1985). Determining all maximal data subsets consistent with revealed preference. Kwantitatieve methoden, 19, 89-104.
Jaeggi, S. M., Buschkuehl, M., Jonides, J., and Perrig, W. J. (2008). Improving fluid intelligence with training on working memory. Proceedings of the National Academy of Sciences, 105(19), 6829-6833.
Kane, M. J., and Engle, R. W. (2002). The role of prefrontal cortex in working-memory capacity, executive attention, and general fluid intelligence: An individual-differences perspective. Psychonomic Bulletin and Review, 9(4), 637-671.
Kaufman, A. S., and Horn, J. L. (1996). Age changes on tests of fluid and crystallized ability for women and men on the Kaufman Adolescent and Adult Intelligence Test (KAIT) at ages 17-94 years. Archives of Clinical Neuropsychology, 11(2), 97-121.
Kim, S., and Hasher, L. (2005). The attraction effect in decision making: Superior performance by older adults. The Quarterly Journal of Experimental Psychology Section A, 58(1), 120-133.
Kondo, H., Osaka, N., and Osaka, M. (2004). Cooperation of the anterior cingulate cortex and dorsolateral prefrontal cortex for attention shifting. NeuroImage, 23(2), 670-679.
Kovalchik, S., Camerer, C. F., Grether, D. M., Plott, C. R., and Allman, J. M. (2005). Aging and decision making: A comparison between neurologically healthy elderly and young individuals. Journal of Economic Behavior and Organization, 58(1), 79-94.
Kroger, J. K., Sabb, F. W., Fales, C. L., Bookheimer, S. Y., Cohen, M. S., and Holyoak, K. J. (2002). Recruitment of anterior dorsolateral prefrontal cortex in human reasoning: a parametric study of relational complexity. Cerebral Cortex, 12(5), 477-485.
Krueger, F., Spampinato, M. V., Pardini, M., Pajevic, S., Wood, J. N., Weiss, G. H., Landgraf, S., and Grafman, J. (2009). Integral calculus problem solving: An fMRI investigation. Neuroreport, 19(11), 1095-1099.
Lewandowsky, S., Oberauer, K., Yang, L. X., and Ecker, U. K. (2010). A working memory test battery for MATLAB. Behavior Research Methods, 42(2), 571-585.
Lichtenstein, S., and Slovic, P. (1973). Response-induced reversals of preference in gambling: An extended replication in Las Vegas. Journal of Experimental Psychology, 101(1), 16-20.
MacDonald, A. W., Cohen, J. D., Stenger, V. A., and Carter, C. S. (2000). Dissociating the Role of the Dorsolateral Prefrontal and Anterior Cingulate Cortex in Cognitive Control. Science, 288(5472), 1835-1838.
Mata, R., Josef, A. K., Samanez-Larkin, G. R., and Hertwig, R. (2011). Age differences in risky choice: a meta-analysis. Annals of the New York Academy of Sciences, 1235(1), 18-29.
Mather, M., and Carstensen, L. L. (2005). Aging and motivated cognition: The positivity effect in attention and memory. Trends in Cognitive Sciences, 9(10), 496-502.
Mather, M., Mazar, N., Gorlick, M. A., Lighthall, N. R., Burgeno, J., Schoeke, A., and Ariely, D. (2012). Risk preferences and aging: The "certainty effect" in older adults' decision making. Psychology and Aging, 27(4), 801-816.
Mattei, A. (2000). Full-scale real tests of consumer behavior using experimental data. Journal of Economic Behavior and Organization, 43(4), 487-497.
Mohr, P. N., Li, S. C., and Heekeren, H. R. (2010). Neuroeconomics and aging: neuromodulation of economic decision making in old age. Neuroscience and Biobehavioral Reviews, 34(5), 678-688.
Nielsen, L., and Mather, M. (2011). Emerging perspectives in social neuroscience and neuroeconomics of aging. Social Cognitive and Affective Neuroscience, 6(2), 149-164.
Olesen, P. J., Westerberg, H., and Klingberg, T. (2004). Increased prefrontal and parietal activity after training of working memory. Nature Neuroscience, 7(1), 75-79.
Park, D. C., Lautenschlager, G., Hedden, T., Davidson, N. S., and Smith, A. D. (2002). Models of visuospatial and verbal memory across the adult life span. Psychology and Aging, 17(2), 299-320.
Pelli, D. G. (1997). The VideoToolbox software for visual psychophysics: transforming numbers into movies. Spatial Vision, 10(4), 437-442.
Pochon, J. B., Levy, R., Poline, J. B., Crozier, S., Lehéricy, S., Pillon, B., Deweer, B., Le Bihan, D., and Dubois, B. (2001). The role of dorsolateral prefrontal cortex in the preparation of forthcoming actions: an fMRI study. Cerebral Cortex, 11(3), 260-266.
Prabhakaran, V., Smith, J. A., Desmond, J. E., Glover, G. H., and Gabrieli, J. D. (1997). Neural substrates of fluid reasoning: an fMRI study of neocortical activation during performance of the Raven's Progressive Matrices Test. Cognitive Psychology, 33(1), 43-63.
Rangel, A., and Clithero, J. (2013). The computation of stimulus values in simple choice. In P. W. Glimcher and E. Fehr (Eds.), Neuroeconomics: decision making and the brain (125-147). Academic Press.
Raven, J., Raven, J. C., and Court, J. H. (1998). Manual for Raven's progressive matrices and vocabulary scales.
Raz, N., Lindenberger, U., Rodrigue, K. M., Kennedy, K. M., Head, D., Williamson, A., Dahle, C., Gerstorf, D., and Acker, J. D. (2005). Regional brain changes in aging healthy adults: general trends, individual differences and modifiers. Cerebral Cortex, 15(11), 1676-1689.
Read, D., and Read, N. L. (2004). Time discounting over the lifespan. Organizational Behavior and Human Decision Processes, 94(1), 22-32.
Rypma, B., and D'Esposito, M. (2000). Isolating the neural mechanisms of age-related changes in human working memory. Nature Neuroscience, 3(5), 509-515.
Salthouse, T. A., and Babcock, R. L. (1991). Decomposing Adult Age Differences in Working Memory. Developmental Psychology, 27(5), 763-776.
Samuelson, P. A. (1938). A Note on the Pure Theory of Consumer Behavior. Economica, 5(17), 61-71.
Sippel, R. (1997). An Experiment on the Pure Theory of Consumer's Behaviour. The Economic Journal, 107(444), 1431-1444.
Sutter, M., and Kocher, M. G. (2007). Trust and trustworthiness across different age groups. Games and Economic Behavior, 59(2), 364-382.
Tentori, K., Osherson, D., Hasher, L., and May, C. (2001). Wisdom and aging: Irrational preferences in college students but not older adults. Cognition, 81(3), B87-B96.
Varian, H. R. (1982). The Nonparametric Approach to Demand Analysis. Econometrica, 50(4), 945-74.
Varian, H. R. (1990). Goodness-of-fit in optimizing models. Journal of Econometrics, 46(1), 125-140.
Visser, M., Harbaugh, B., and Mocan, N. (2006). An experimental test of criminal behavior among juveniles and young adults. NBER Working Paper 12507.
Wright, R. E. (1981). Aging, Divided Attention, and Processing Capacity. Journal of Gerontology, 36(5), 605-614.
Zamarian, L., Sinz, H., Bonatti, E., Gamboz, N., and Delazer, M. (2008). Normal aging affects decisions under ambiguity, but not decisions under risk. Neuropsychology, 22(5), 645-657.

1.8 Appendix

1.8.1 Appendix A1. Example of direct violation in a triplet of trials of treatment S

[Figure: bundles $a_{12}$, $a'_{12}$, $b_{12}$, $b'_{12}$, $c_{12}$, $c'_{12}$ plotted in the $(q_1, q_2)$ plane]

Figure 10: Trials ($a_{12}$ vs. $a'_{12}$), ($b_{12}$ vs. $b'_{12}$), ($c_{12}$ vs. $c'_{12}$)

In the example of Figure 10, no pair of trials satisfies condition (i) of Definition 1, so there cannot be a direct violation between any pair of trials. Suppose now that $a_{12}$ is chosen over $a'_{12}$, $b_{12}$ is chosen over $b'_{12}$ and $c_{12}$ is chosen over $c'_{12}$. Since $q^{a'}_x > q^{b}_x$ for all $x$ and $q^{b'}_x > q^{c}_x$ for all $x$, we have $a_{12} \succ a'_{12} \succ b_{12} \succ b'_{12} \succ c_{12}$. Since $q^{c'}_x > q^{a}_x$ for all $x$, we have $c_{12} \succ c'_{12} \succ a_{12}$. This forms a contradiction to the maximization of monotonic and transitive preferences. Notice that the key reason we have a direct violation between triplets of trials but not between any pair of trials is that $q^{b'}_2 < q^{a}_2$ and $q^{a'}_2 < q^{c}_2$. This issue would not arise if, instead of a discrete number of alternatives, we were to offer subjects the entire budget set.

1.8.2 Appendix A2.
List of all food items (with portions) used in the experiment

Almond (2); Barbecue popped potato chip (1); Cashew (2); Cheddar cracker (2); Mini cheese sandwich cracker (1); Citrus gum drop (2); Roasted gorgonzola cracker (2); Gummy bears (2); Popcorn (2); M&M (2); Dark chocolate peanut butter cup (1); Mini chocolate-covered pretzel (1); Mini Oreo (1); Onion-flavored corn snack, "Funyuns" (1); Peanut (3); Pistachio (2); Potato chip (1); Mini pretzel (1); Pretzel nugget (2); Sweet potato chip (1); Yogurt-covered raisin (1).

1.8.3 Appendix A3. Individual analysis

The aggregate results suggest that complexity affects the ability to make consistent choices differentially across individuals. Effects are stronger among OA and may be attributable to declines in working memory. Yet, behavior is heterogeneous even in the OA group, indicating that aging is either not affecting all subjects similarly or that some subjects are capable of developing strategies to remain consistent.

A3.1. A random utility model (RUM). In order to better understand individual differences, we estimate a random utility model (RUM) for each subject in each treatment. Specifically, in treatment S, each subject $i$ in each trial $o$ chooses between a bundle on the left ($l$) of the screen, denoted by $BU^l$, and a bundle on the right ($r$) of the screen, denoted by $BU^r$. A decision is obtained by comparing the utility derived from each option. We assume that utility depends linearly on the observable quantities of the goods 1 and 2, as well as on a stochastic unobserved error component $\varepsilon^k_i$, $k = l, r$. Formally:

$$u_{io}(BU^l) = \beta_{i1} q^l_{1o} + \beta_{i2} q^l_{2o} + \varepsilon^l_i \quad \text{and} \quad u_{io}(BU^r) = \beta_{i1} q^r_{1o} + \beta_{i2} q^r_{2o} + \varepsilon^r_i$$

where $q^k_{jo}$ is the quantity of good $j \in \{1, 2\}$ in bundle $k \in \{l, r\}$ of trial $o$. The probability of individual $i$ choosing option $BU^l$ in trial $o$ is therefore:

$$P^l_{io} = \Pr\left[\beta_{i1} q^l_{1o} + \beta_{i2} q^l_{2o} + \varepsilon^l_i > \beta_{i1} q^r_{1o} + \beta_{i2} q^r_{2o} + \varepsilon^r_i\right] = \Pr\left[\varepsilon^r_i - \varepsilon^l_i < \beta_{i1}(q^l_{1o} - q^r_{1o}) + \beta_{i2}(q^l_{2o} - q^r_{2o})\right]$$

and $P^r_{io} = 1 - P^l_{io}$.
We assume that error terms are i.i.d. and follow an extreme value distribution: the cumulative distribution function of the error term is $F_i(\varepsilon^k_i) = \exp(-e^{-\varepsilon^k_i})$. Therefore, the probability that subject $i$ chooses option $BU^l$ is the logistic function:

$$P^l_{io}(q^l_{1o} - q^r_{1o},\, q^l_{2o} - q^r_{2o}) = \frac{1}{1 + e^{-\left[\beta_{i1}(q^l_{1o} - q^r_{1o}) + \beta_{i2}(q^l_{2o} - q^r_{2o})\right]}}$$

For each individual $i$ the parameters to estimate are $\beta_{i1}$ and $\beta_{i2}$, which we estimate by maximum likelihood.[30]

A similar model is estimated in treatment C. The bundle on the left is made of goods $s$ and $w$ while the bundle on the right is made of goods $p$ and $s$. The utilities are now:

$$u_{io}(BU^l) = \beta_{is} q^l_{so} + \beta_{iw} q^l_{wo} + \varepsilon^l_i \quad \text{and} \quad u_{io}(BU^r) = \beta_{ip} q^r_{po} + \beta_{is} q^r_{so} + \varepsilon^r_i$$

and the probability that subject $i$ chooses option $BU^l$ is the logistic function:[31]

$$P^l_{io}(q^l_{wo},\, q^l_{so} - q^r_{so},\, q^r_{po}) = \frac{1}{1 + e^{-\left[\beta_{iw} q^l_{wo} + \beta_{is}(q^l_{so} - q^r_{so}) - \beta_{ip} q^r_{po}\right]}}$$

We estimate the parameters for each individual in each treatment. We then predict the choice in each trial given the estimated parameters and we count the number of misclassified trials. Importantly, we find that misclassification rates in each treatment are strongly correlated with the number of violations (Pearson coefficient = .77 in treatment S and .77 in treatment C). This suggests that the classification level of RUM is a reliable proxy for GARP consistency: subjects who are not well predicted by the model are inconsistent. Finally, notice that RUM presupposes more errors when the difference in utility between the two bundles is small, which means that it is also related to severity of violations.[32] It is therefore natural that we also find a significant correlation between severity of violations and misclassification rates (Pearson coefficient = .52 in treatment S and .32 in treatment C).

[30] We obtain $O$ observations. The log-likelihood is therefore:
$$\log L_i = \sum_{o=1}^{O} \log\left[ P^l_{io}(q^l_{1o} - q^r_{1o},\, q^l_{2o} - q^r_{2o})\, \mathbb{1}_l + \left[1 - P^l_{io}(q^l_{1o} - q^r_{1o},\, q^l_{2o} - q^r_{2o})\right]\left[1 - \mathbb{1}_l\right] \right]$$
where $\mathbb{1}_l = 1$ if $BU^l$ is chosen and $\mathbb{1}_l = 0$ if $BU^r$ is chosen.

[31] The log-likelihood is now:
$$\log L_i = \sum_{o=1}^{O} \log\left[ P^l_{io}(q^l_{wo},\, q^l_{so} - q^r_{so},\, q^r_{po})\, \mathbb{1}_l + \left[1 - P^l_{io}(q^l_{wo},\, q^l_{so} - q^r_{so},\, q^r_{po})\right]\left[1 - \mathbb{1}_l\right] \right]$$

[32] To check the specification of the model, we ran a Probit regression of the probability of correct classification as a function of the absolute utility difference $|BU^r - BU^l|$. As predicted by RUM, most subjects have positive coefficients (better classification when utility differences are large). Also, subjects with negative coefficients are those with the highest number of violations, that is, those for which RUM is not well specified. Finally, as another robustness check, we correlated RUM misclassifications with GARP violations DC and IC separately, and obtained the same results.

A3.2. Clustering. We then use RUM misclassification data to group individuals with the objective of finding common patterns of behavior. For each individual, we compute the percentage of misclassified trials given the maximum likelihood estimation of the RUM model in treatments S and C, respectively. Contrary to violation counts, these two percentages are comparable between treatments. They provide two interpretable measures related to, but not based on, violations that we can use to cluster our subjects. We consider a model-based clustering method to identify the clusters present in our population. We retain two measures: the % of RUM misclassifications in S, and the difference between the % of RUM misclassifications in C and the % of RUM misclassifications in S. We opt for this second measure (rather than simply % of RUM misclassifications in C) because of the importance of understanding the treatment effect between simple and complex choices.
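The per-subject steps described above (maximum-likelihood fit of the logit RUM, count of misclassified trials, and the two measures retained for clustering) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the estimation code used in the study: the function names, the plain gradient-ascent optimizer, the small ridge penalty (added so that estimates stay finite for perfectly consistent subjects), and the example trials are all hypothetical. Each trial is a pair (regressors, chose_left); in treatment S the regressors are the quantity differences left minus right, and in treatment C they would be (q_w^l, q_s^l - q_s^r, -q_p^r).

```python
import math

def prob_left(beta, dq):
    """Logit probability of choosing the left bundle for value weights `beta`
    and regressors `dq` (quantity differences, left minus right)."""
    v = sum(b * d for b, d in zip(beta, dq))
    if v >= 0:  # numerically stable form of 1 / (1 + exp(-v))
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)

def fit_rum(trials, n_params, steps=20000, lr=0.05, ridge=1e-3):
    """Maximize the logit log-likelihood by gradient ascent.
    The ridge term is an assumption of this sketch, not part of the model."""
    beta = [0.0] * n_params
    n = len(trials)
    for _ in range(steps):
        grad = [-ridge * b for b in beta]
        for dq, chose_left in trials:
            resid = (1.0 if chose_left else 0.0) - prob_left(beta, dq)
            for j in range(n_params):
                grad[j] += resid * dq[j]  # d logL / d beta_j
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def misclassification_pct(beta, trials):
    """Percentage of trials where the fitted model predicts the other bundle."""
    wrong = sum((prob_left(beta, dq) > 0.5) != chose_left
                for dq, chose_left in trials)
    return 100.0 * wrong / len(trials)

def clustering_measures(pct_s, pct_c):
    """The two measures retained for clustering: % in S, and (% in C) - (% in S)."""
    return (pct_s, pct_c - pct_s)

# Hypothetical subject who values good 1 about twice as much as good 2.
EXAMPLE_TRIALS = [
    ((1, -1), True), ((-1, 1), False),
    ((-1, 3), True), ((1, -3), False),
    ((2, -3), True), ((-2, 3), False),
    ((0, 1), True), ((0, -1), False),
    ((1, 0), True), ((-1, 0), False),
]

if __name__ == "__main__":
    beta_hat = fit_rum(EXAMPLE_TRIALS, n_params=2)
    print(beta_hat, misclassification_pct(beta_hat, EXAMPLE_TRIALS))
```

In practice a standard logit routine from a statistics package would replace the hand-written optimizer; the sketch only makes explicit what is being maximized and how a misclassified trial is counted.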
A wide array of heuristic clustering methods is commonly used; however, these usually require the number of clusters and the clustering criterion to be set ex-ante rather than endogenously optimized. Mixture models, on the other hand, treat each cluster as a component probability distribution. Thus, the choice between numbers of clusters and models can be made using Bayesian statistical methods (Fraley and Raftery, 2002). We implement our model-based clustering analysis with the Mclust package in R (Fraley and Raftery, 2006). We consider ten different models with a maximum of nine clusters each, and determine the combination that yields the maximum Bayesian Information Criterion (BIC). For our data, the ellipsoidal, equal shape model that endogenously yields three clusters maximizes the BIC.

Table 5 provides summary statistics of the three clusters. The first two rows display the average percentage of RUM misclassifications in S and C by subjects in each cluster, the variables used for the clustering. The next two rows present the composition of YA and OA in each cluster. The last five rows summarize the average performance within each cluster in the consistency task (GARP violations) and the tests (WM and IQ). Clusters are ordered from smallest to largest in the percentage of misclassified observations.

| | Cluster 1 | Cluster 2 | Cluster 3 |
|---|---|---|---|
| % RUM misclassifications in S | 3.2 (0.6) | 16.2 (0.5) | 36.5 (5.0) |
| % RUM misclassifications in C | 12.7 (1.4) | 17.0 (1.2) | 40.4 (5.3) |
| Number of YA | 13 | 30 | 7 |
| Number of OA | 20 | 15 | 10 |
| Number of violations in S | 1.3 (0.6) | 3.6 (1.6) | 48.1 (9.6) |
| Number of violations in C | 29.6 (9.7) | 18.1 (6.3) | 189.2 (47.0) |
| Number of violations in A | 1.6 (0.4) | 0.9 (0.3) | 4.7 (1.0) |
| Working Memory test | 173.3 (5.1) | 187.0 (4.2) | 169.8 (8.3) |
| IQ test | 9.8 (0.4) | 10.2 (0.4) | 9.2 (0.8) |

(standard errors in parentheses)

Table 5: Summary statistics by cluster.

Cluster 1 is characterized by almost no misclassification in S and few in C.
Cluster 2 also exhibits limited misclassifications in both S and C (although more than cluster 1). Cluster 3 has substantial misclassifications in both treatments. The first surprising finding is the allocation of OA and YA across clusters. Given our previous results, one would expect more YA in cluster 1 and more OA in cluster 2. We find the reverse. Cluster 3 is a mix of subjects. When we consider performance in the choice tasks and tests, we notice that cluster 3 stands out as a group of inconsistent subjects exhibiting a large number of GARP violations and low performance in WM and IQ tests. These subjects also fail our trivial trials much more frequently than the rest of the subjects. Not surprisingly, the vast majority of minimizers (6 in treatment S and 5 in treatment C) belong to this cluster.

Clusters 1 and 2 are composed of relatively consistent subjects and differ mostly in the way their behavior compares between treatments. In treatment S, subjects in cluster 1 are very well-classified and have almost no violations while subjects in cluster 2 are slightly more inconsistent. In treatment C, subjects in cluster 1 decrease significantly in their performance while subjects in cluster 2 remain more consistent. Overall, cluster 2 is a group of "consistently consistent" subjects. By contrast, subjects in cluster 1 are remarkably consistent in S but significantly less so in C.

Figure 11 provides two different representations of the three clusters. In the left graph, clusters 1, 2, and 3 are displayed according to the % of RUM misclassifications in treatments S and C (rows 1 and 2 in Table 5).[33] In the right graph, these same subjects and clusters are represented based on a log transformation of the average number of violations in treatments S and C (rows 5 and 6 in Table 5).

Figure 11: Cluster representation. Misclassified trials (left) and number of violations (right) in treatments S and C.
[33] Recall that the exact variables used to group the individuals are the % of RUM misclassifications in S and the difference between the % of RUM misclassifications in C and S. Our display helps visual clarity (both measures are between 0 and 100) while keeping the essence of the clustering.

Clusters are clearly differentiated in the left graph. This is not surprising since the variables are, up to a transformation, the ones used for grouping the individuals. The figure highlights the differences across clusters emphasized above: small percentage of RUM misclassifications in both treatments for cluster 1, slightly larger in S for cluster 2, and a substantial fraction of misclassifications for cluster 3 in both treatments. More interestingly, the right graph also shows clear differences across clusters. Cluster 1 has (with a few exceptions) almost no violations in S and some in C, cluster 2 has a more even distribution of violations between S and C than cluster 1, and cluster 3 is, again, an outlier in both types of violations. This reasonable mapping is quite remarkable given that subjects are not clustered on the basis of that variable. It suggests a tight relationship between classification by RUM and GARP violations. It also suggests that the transition from the simple to the complex situation is more difficult for individuals in cluster 1 than for those in cluster 2. We investigate this issue in more detail in the next section.

Finally, we find that subjects in cluster 1 have significantly worse working memory scores than subjects in cluster 2 (p-value = .040); they also have lower IQ scores but the difference is not statistically significant. This suggests a relationship between working memory and the ability to remain consistent as the complexity of the task increases.

A3.3. Simple choice rules. The extreme degree of consistency and lack of misclassifications by cluster 1 subjects in treatment S (26 subjects out of 33 have zero violations), together with the fact that many of them are in the OA population and perform significantly worse in C, is somewhat puzzling. Examining the value estimates of the RUM model (the $\beta_{ij}$-coefficients) in more detail, we find that for some subjects one value estimate in S and two value estimates in C are close to 0. These are subjects whose behavior is consistent with maximizing the quantity of their most preferred item. For some other subjects, the value estimates of all goods are almost identical to each other. These are subjects for whom goods are perfect substitutes, so that their behavior is consistent with maximizing the total quantity in the bundle. These two choice strategies are clearly consistent with the maximization of monotonic and transitive preferences, resulting in high degrees of consistency. At the same time, subjects with these types of preferences do not need to perform sophisticated mental trade-offs between items and, instead, can use simple choice rules. We therefore hypothesize that having these specific preferences may explain why cluster 1 exhibits such an extremely high level of consistency in treatment S.

With this idea in mind, we construct two simple choice rules for subjects in clusters 1 and 2: H (for highest), where the subject maximizes the quantity of one of the items in the bundle (presumably, the one with highest value), and T (for total), where the subject maximizes the total quantity in the bundle (presumably because goods are perfect substitutes). We assign type H (T) to a subject if (i) the rule H (T) generates the same number of misclassifications as RUM or fewer and (ii) this number is smaller than 3 in treatment S and smaller than 10 in treatment C.
These arbitrary thresholds are simply meant to reflect the nature of a quick and simple choice rule that can be implemented with "few" errors. Otherwise, we assign type O to the subject (for other). In other words, we assume that the 33 subjects in cluster 1 and 45 subjects in cluster 2 maximize a well-defined utility function, linear in the goods present in the bundle, but that they make some errors. As we know from sections A3.1 and A3.2, this is a reasonable description of behavior by subjects in those clusters. We then divide the sample into three types, depending on whether the optimal choice given their preferences can be implemented with a simple rule (types H and T) or not (type O). Table 6 summarizes the number of subjects of each type by cluster. We also add in parentheses the average number of violations for type O subjects (since violations are typically very small for types H and T, the numbers are omitted).
              Cluster 1               Cluster 2
              H    T    O             H    T    O
Treatment S   27   5    1 (0)         1    10   34 (3.9)
Treatment C   10   4    19 (46.7)     7    3    35 (21.4)
Table 6: Types of preferences by subjects in clusters 1 and 2
All but one of the subjects in cluster 1 have preferences consistent with a simple rule in treatment S, mostly H. More than half of these subjects change their strategy in treatment C and are there best classified as type O. By contrast, only one-quarter of subjects in cluster 2 have a preference consistent with a simple rule and there is no treatment effect. It is remarkable to see such sharp differences across clusters of choices consistent with simple rules, even though subjects are not grouped based on that dimension. Our conjecture is that simple rules are more natural in treatment S, where the same goods are offered in both bundles: if one good is strongly preferred, the subject can lexicographically settle for it (type H); if both goods are of similar value, the subject can focus on total quantities (type T).
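The type-assignment rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the argument names, and the tie-break used when both H and T qualify (fewest misclassifications, then H) are hypothetical and are not specified in the text; only the two conditions and the thresholds (fewer than 3 in S, fewer than 10 in C) come from the description.

```python
# Thresholds quoted in the text: the rule's misclassification count must be
# smaller than 3 in treatment S and smaller than 10 in treatment C.
MAX_ERRORS = {"S": 3, "C": 10}


def assign_type(treatment, rum_errors, h_errors, t_errors):
    """Return 'H', 'T', or 'O' for one subject in one treatment.

    rum_errors, h_errors, t_errors: number of trials misclassified by the
    RUM, by rule H (maximize one item), and by rule T (maximize the total).
    """
    limit = MAX_ERRORS[treatment]
    candidates = []
    for label, errors in (("H", h_errors), ("T", t_errors)):
        # Condition (i): no more misclassifications than RUM.
        # Condition (ii): strictly below the treatment threshold.
        if errors <= rum_errors and errors < limit:
            candidates.append((errors, label))
    if not candidates:
        return "O"
    # Assumption of this sketch: if both rules qualify, keep the one with
    # fewer misclassifications (ties resolved in favor of H).
    return min(candidates)[1]


print(assign_type("S", rum_errors=2, h_errors=1, t_errors=5))
print(assign_type("C", rum_errors=12, h_errors=15, t_errors=9))
print(assign_type("C", rum_errors=12, h_errors=11, t_errors=11))
```

The last call illustrates why both conditions matter: a rule can fit at least as well as RUM and still be rejected because it does not meet the "few errors" threshold.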
In treatment C, subjects are forced to compare "apples to oranges" so simple rules are less intuitive to implement. [34] Subjects are more likely to explicitly trade off the different alternatives, which explains why more of them are better classified as type O. Finally, since trade-offs are difficult, type O subjects have significantly more violations than either type H or T subjects.
[34] Interestingly, the majority of subjects make significantly more violations when one specific item is common, which suggests that trade-offs are more or less difficult depending on the composition of bundles.
A3.4. Summary
Overall, the individual analysis reveals interesting insights regarding the preferences and strategies of our subjects. First, a structural model (RUM), in which utility depends linearly on the quantities of goods in each bundle and the subject chooses the bundle that yields the highest utility, provides a good fit for a majority of subjects, but by no means for all of them. Second, a cluster analysis based on RUM misclassifications suggests three distinct groups. The RUM provides a reasonably good fit for two groups of subjects (clusters 1 and 2) and a poor fit for the last one (cluster 3). Third and as expected, RUM misclassifications are correlated with GARP violations. Subjects in cluster 3 perform badly in both treatments of the consistency task, whereas subjects in clusters 1 and 2 perform reasonably well. Surprisingly, however, cluster 1 (the group with the fewest RUM misclassifications) has more violations in treatment C than does cluster 2. The composition is also different: two-thirds of subjects in cluster 1 are OA whereas two-thirds of subjects in cluster 2 are YA. Fourth, an analysis of simple rules of behavior consistent with utility maximization sheds light on the differences in age composition and consistency across tasks between clusters 1 and 2.
Cluster 1 is mostly composed of OA who use a simple rule in treatment S (maximize the amount of the preferred good), resulting in extremely consistent behavior. Their consistency decreases substantially in treatment C, possibly due to the difficulty of implementing a simple rule when the bundles contain different goods. By contrast, cluster 2 is mostly composed of YA who use simple rules significantly less often but perform better value-quantity trade-offs. These subjects make slightly more consistency mistakes in S but fewer in C. Finally, a conjecture consistent with the results presented here is that some subjects who are aware of their compromised working memory and fluid intelligence (mostly OA) resort to simple choice rules. Such strategies can be applied in the simple treatment but not in the complex one. This explanation is reasonable and appealing; however, it requires the supporting evidence of new experiments.
1.8.4 Instructions
PART 1 - Prep and Introduction (10-15 minutes):
EXP 1 and EXP 2: Prepare computers, label seats, and have ready a list of confirmed subjects. Have ready consent forms with Items Sheet attached to front. Place a pen at each table. Lay out one serving of each type of food on the counter in the waiting area. The food items should be labeled both by their name and by the image that will represent them during the experiment.
EXP 1: Call in subjects one at a time and check their IDs.
EXP 1: "Hello and welcome. Before we start we need to ask you when your last meal was. When did you last eat or drink something besides water?"
EXP 1: Wait for response. If last meal was less than three hours ago, thank them for coming and explain that they cannot participate due to their noncompliance with pre-experiment instructions. If last meal was at least three hours ago, proceed.
EXP 1: "Today, you will be making choices between bundles of different foods. We want to make sure you like the food items between which you are deciding.
Please take some time to look at the different items laid out here and think of which five you like the most. Keep in mind that you may be consuming some of these foods together in different amounts. The images you see above each food will be the ones you see during the experiment. This is NOT part of the experiment - please pick the food items that you are most interested in eating."
EXP 1: Give subject a few minutes to survey the foods.
EXP 1: "Have you chosen your five items?"
EXP 1: In the case that they have not chosen their items, wait another couple of minutes. Otherwise, continue.
EXP 1: "Please let me know which items you have chosen. Would you enjoy eating any combination of these items? Again, this is NOT part of the experiment but you may be consuming some of these foods together so we want to be sure that you like them."
EXP 1: After ensuring their choices are indeed desirable in combination with one another, write item names on subject's Items Sheet.
EXP 1: "Attached to this sheet is a consent form. As you wait for the experiment to begin, please read the form and sign the last page to consent."
EXP 1: Direct subject to their seat.
EXP 1: Repeat above steps until all subjects have been seated.
EXP 2: Modify subject's MATLAB code to ensure only chosen items will be displayed during the experiment.
EXP 2: "Please make sure your phone is off or on silent mode and do not touch anything as you wait for further instructions."
After all subjects have been seated and their MATLAB code modified...
EXP 1: "Dear participants: hello and thank you for coming to this experiment. Today, you will be making choices between bundles of different food items that you like. After you have made your choices, you will complete two short tests and a questionnaire. You will receive food at the end, based on your responses during the experiment.
More specifically, one of the choices you make during the experiment will be randomly selected, and, at the end you will receive the amount of food represented in that choice. So, make every choice today as if it were the ONLY choice you were making. For example, if your choice of 'three chips and two cookies' is randomly selected, at the end of the experiment this is exactly what you will be receiving - and, eating. You will be given fifteen minutes after the experiment to eat what you receive. You are asked to stay in the waiting area for the whole fifteen minutes. During that time you will have to consume your food items and nothing else. Water will be provided upon request. After that, you will be paid $20 in cash for your participation. You may leave at any time during the experiment, but if you leave before the end, you will not receive the full compensation. Before each part of the experiment, I will be giving you brief instructions. You can ask questions during these times."
PART 2 - GARP Task (15-20 minutes):
EXP 1: "Now, you will be choosing between different combinations of food items displayed on your computer."
EXP 1: Show sample screenshot.
EXP 1: "Here is a sample of what your screen may look like. This is a screenshot for someone that had chosen - say what the items are - in the beginning. The only foods you will see on your screen are the ones you chose in the beginning. Similar to here, you will always have a choice between two combinations: one shown on the right side of the screen, and one shown on the left. If you like the combination shown on the right side more, tap the right side of the computer. If you like the combination on the left side of the screen more, tap the left side. You cannot tap both sides at once. Remember to make every choice as if it were the ONLY one that counted because you will be receiving exactly one of your choices at the end.
For example, if I were to tap the left side, there is a chance that I will receive and eat - say what the foods and quantities of each food are - at the end. The experiment is broken down into four parts. You will be making about 35 such choices in each part. When you are done with each part, a screen that reads 'Break' will appear. Please do not touch your screen at that time, but wait for instructions from me to proceed. We will always wait for everyone to finish a part before moving on. Raise your hand if you have any questions now."
EXP 1: Look around for raised hands and answer any questions that may arise.
EXP 1: "Let us proceed with Part 1 of the experiment. Remember, when you are done with this part, a screen that reads 'Break' will appear. Do not press anything but wait for further instruction from me at that point. Remember to make every choice as if it were the ONLY one that counted because you will be receiving exactly one of your choices at the end. Tap the screen to begin the experiment."
EXP 1: Wait until everyone has completed Part 1. Wait 30 seconds after the last person has finished.
EXP 1: "Now we will move on to Part 2. As before, tap the side of the screen displaying the combination you like more. When you are done with this part, a screen that reads 'Break' will appear. Do not press anything but wait for further instruction from me at that point. Remember to make every choice as if it were the ONLY one that counted because you will be receiving exactly one of your choices at the end. Tap the screen to begin."
EXP 1: Wait until everyone has completed Part 2. Wait 30 seconds after the last person has finished.
EXP 1: "Now we will move on to Part 3. As before, tap the side of the screen displaying the combination you like more. When you are done with this part, a screen that reads 'Break' will appear. Do not press anything but wait for further instruction from me at that point.
Remember to make every choice as if it were the ONLY one that counted because you will be receiving exactly one of your choices at the end. Tap the screen to begin."
EXP 1: Wait until everyone has completed Part 3. Wait 30 seconds after the last person has finished.
EXP 1: "Now we will move on to Part 4. As before, tap the side of the screen displaying the combination you like more. When you are done with this part, a screen that reads 'Break' will appear. Do not press anything but wait for further instruction from me at that point. Remember to make every choice as if it were the ONLY one that counted because you will be receiving exactly one of your choices at the end. Tap the screen to begin."
EXP 1: Wait until everyone has completed Part 4.
PART 3 - Working Memory Test (10-15 minutes):
EXP 1: "You are done with the decision-making portion of the experiment. We will now begin the first test. This test is designed to measure your short-term memory abilities."
EXP 1: Show a sample image of the 10-by-10 matrix they will be seeing during the experiment.
EXP 1: "During the test you will see a 10-by-10 checkerboard as shown here. Solid black dots will appear, and quickly thereafter, disappear, in some of the spaces. You will see anywhere between two to six black dots appear and disappear in succession. After a short time, the entire checkerboard will disappear, and in its place, an empty checkerboard will appear. You are to tap the spaces of the empty checkerboard where you remember the dots to have been. In this test, it is not important that you accurately recall the positions of the dots; it is more important that you remember the relative positions of the dots. For example, if three dots appeared, one in the top center, one on the bottom right, and one on the bottom left, it would be more beneficial to recall the triangular pattern and recreate it to the best of your abilities, than to accurately remember the position of one of the dots of that triangle.
Also, you do not need to remember the order in which the dots appeared - you can tap the spaces of the empty checkerboard in whatever order you like. If you would like to undo a selection, you can tap the dot to erase it. You will first do two practice trials and then the test will begin. Are there any questions? Let us begin the practice trials. Please tap the screen to begin."
EXP 1: Wait for all subjects to complete practice trials.
EXP 1: "Are there any questions about this test?"
EXP 1: Look around for raised hands and answer any questions that may arise.
EXP 1: "You may begin now."
EXP 2: Once all subjects have completed the test, collect tablets from subjects and begin preparing their rewards.
PART 4 - IQ Test (15-20 minutes):
EXP 1: "This next part is a test of perception and clear thinking. We will first do two practice problems to familiarize you with the format of the test and method of thought required. The top part of the first sample problem is a pattern with a bit cut out of it. Look at the pattern, think what the piece needed to complete the pattern correctly both along and down must be like. Then find the right piece out of the eight bits shown below. Only one of these pieces is perfectly correct. No. 2 completes the pattern correctly going downwards, but is wrong going the other way. No. 1 is correct going along, but is wrong going downward. Think about which piece is correct both ways. No. 4 is the right bit, isn't it? So the answer is No. 4, and you select No. 4."
EXP 1: Check that everyone has selected "4" for the first sample problem.
EXP 1: "Now turn to the next page and do the second sample problem by yourselves."
EXP 1: Allow 20 seconds.
EXP 1: "The answer is No. 8. See that you have selected No. 8. Have you all done that?"
EXP 1: Check that everyone has selected No. 8.
EXP 1: "Is everyone clear about what it is you are to do on this test?"
EXP 1: Answer any questions that subjects may have.
EXP 1: "You can have as much time as you like for the rest of the test. You will find that the problems soon get difficult. Whether the problems are easy or difficult, you will notice that to solve them you have to use the same method all the time. Keep in mind, it is accurate work that counts. Attempt each problem in turn. Do your best to find the correct piece to complete it before going on to the next problem. If you get stuck, you can move on and come back to the problem later. But remember, in every case, the next problem is harder and it will take you longer to check your answers carefully. When you get to the end of the test, please wait for further instructions. Are there any questions?"
EXP 1: Pause briefly. Check that everyone is ready to start.
EXP 1: "You may begin now."
EXP 1: Wait for all subjects to complete test.
PART 5 - Demographic Questionnaire (10-15 minutes):
EXP 1: "You will now complete a brief questionnaire, which begins on the following page. After you have completed the questionnaire please remain in your seat. Are there any questions?"
EXP 1: Look around for raised hands and answer any questions that may arise.
After subjects have completed questionnaire...
EXP 1: "The computer has randomly selected one of the bundles you chose today. The other experimenter will now call you one-by-one by your subject ID number. They will hand you your randomly-selected food items. As stated earlier, you will be receiving portions that correspond exactly with one of your choices during the experiment. Once you have received your items, please remain in the waiting area. You may begin to consume your food once received; however, you are required to stay in the waiting area for fifteen minutes after the last subject arrives there. Raise your hand if you have any questions now."
EXP 1: Look around for raised hands and answer any questions that may arise.
PART 6 - Consumption (15 minutes):
EXP 2: Call the first subject to the waiting area using the subject's ID number.
Give the subject their food items and call the next subject. Repeat until all subjects have received their bundles.
EXP 2: "You now have fifteen minutes to eat the items you received. You are asked to stay in this room for the whole fifteen minutes. After that period we will pay you the $20 participation fee and you will be free to leave."
EXP 2: After fifteen minutes, call each subject one-by-one using subject ID numbers and pay subjects their participation fee. Have subjects sign receipt upon receiving their compensation. Thank them and let them know they are free to leave.
2 Value-Based Decision-Making: A New Developmental Paradigm ∗
Isabelle Brocas, University of Southern California and CEPR
Juan D. Carrillo, University of Southern California and CEPR
T. Dalton Combs, Dopamine Labs
Niree Kodaverdian, University of Southern California
Abstract
How does value-based reasoning develop, and how different is this development from one domain to another? Children from Kindergarten to 5th grade made pairwise choices in the Goods domain (toys), Social domain (sharing between self and other), and Risk domain (lotteries) and we evaluated how consistent their choices were. We report evidence that the development of consistency across domains cannot be accounted for by existing developmental paradigms: it is not related to the development of transitive reasoning and it is only partially linked to developmental aspects of attentional control and centration. Rather, choice consistency is related to self-knowledge of preferences, which develops gradually and differentially across domains. The Goods domain offers a developmental template: children become more consistent over time because they learn what they like most and least. Systematic differences exist, however, in both the Social and Risk domains.
In the Social domain, children gradually learn what they like most but not what they like least, while in the Risk domain, they gradually learn what they like least but not what they like most. These asymmetric developments give rise to asymmetric patterns of consistency.
∗ We are grateful to members of the Los Angeles Behavioral Economics Laboratory (LABEL) for their insights and comments in the various phases of the project. We also thank participants at the 2014 Social Neuroscience retreat (Catalina Island, USC) and at the 2015 Morality, Incentives and Unethical Behavior Conference (UCSD) for useful comments and the staff at the Lycée International de Los Angeles for their support. All remaining errors are ours. The study was conducted with the University of Southern California IRB approval UP-12-00528. We acknowledge the financial support of the National Science Foundation grant SES-1425062. Address for correspondence: Isabelle Brocas, Department of Economics, University of Southern California, 3620 S. Vermont Ave., <brocas@usc.edu>.
2.1 Introduction
Adults have many abilities that children lack. Multiple paradigms exist to explain why these differences exist and change during development. Each of these paradigms describes an important aspect of development, although none of them alone can explain all of the diverse abilities that change as we grow. Here, we present evidence that an important difference between children and adults cannot be explained by existing paradigms; namely, that adults consistently know what they want but children do not. We can measure the consistency of preferences by testing for transitivity of choices: if a subject chooses option A over option B and option B over option C, consistency of preferences requires that she must choose option A over option C.
A number of studies have shown that children, especially young ones, are less consistent than adults in the Goods domain, that is, when choosing between foods or toys (Smedslund, 1960; Harbaugh, Krause, and Berry, 2001; List and Millimet, 2008) while there is some partial evidence that age is less of a predictor of consistency in the Social (Harbaugh and Krause, 2000) and Risk (Harbaugh, Krause, and Vesterlund, 2002) domains. This suggests that children's self-knowledge of preferences is imperfect and varies across domains. It is intuitive, however, that differences in transitivity across ages and domains may reflect known aspects of development. Even though children know that they prefer A to B and B to C, their choices may not conform to this ranking for reasons related to established developmental paradigms. For example, it could occur because attentional control, deficits in which have previously been associated with intransitive behavior in older adults (Brocas, Combs, Carrillo, and Kodaverdian, 2016), is still underdeveloped in children (Davidson, Amso, Anderson, and Diamond, 2006; Astle and Scerif, 2008). Alternatively, it may be because the ability to reason logically (Sher, Koenig, and Rustichini, 2014) and transitively (Piaget, 1948; Bouwmeester and Sijtsma, 2006) is still developing. Finally, it may result from children's inability to focus on more than one attribute of an item at a time, a phenomenon referred to as centration (Piaget, Elkind, and Tenzer, 1967; Donaldson, 1982; Crain, 2015). The objective of this research is to assess the common and domain-specific developmental trajectories of transitive decision-making in the Goods, Social, and Risk domains and to determine if the dominant developmental paradigms (attentional control, logical reasoning, and centration) are enough to explain age- and domain-related differences in transitive decision-making. Our study is most closely related to the literature on choice consistency.
There are, however, two main differences between the existing studies and ours. A first difference is conceptual. We want to compare consistency across domains and investigate how existing developmental paradigms account for changes in preference consistency over childhood. Earlier studies offered instead analyses of consistency in one domain at a time. A second difference is methodological. Earlier studies have relied on the Generalized Axiom of Revealed Preferences (GARP), an indirect test of transitivity which focuses on choices between bundles of options given a budget constraint, a system of prices, and a non-satiation assumption (Samuelson, 1938; Varian, 1982). By contrast, our design focuses on transitivity at a more fundamental level. The extreme simplicity of the design makes it especially suitable for testing young children, and also delivers results that are directly comparable across domains.
2.2 Methods
We recruited 134 children from Kindergarten to 5th grade and 51 Undergraduate students to participate in three Choice tasks and three Ranking tasks (Fig.12). In each Ranking task, we asked participants to provide explicit rankings of seven items. In each Choice task, we asked them to choose between all 21 pairwise combinations of those seven items. The Goods-Choice and Goods-Ranking tasks involved age-appropriate toys, the Social-Choice and Social-Ranking tasks involved sharing rules between self and other, and the Risk-Choice and Risk-Ranking tasks involved lotteries. For each domain and age group, we determined whether the pairwise choices in the Choice tasks were transitive and how intransitive choices were distributed over rankings elicited in the Ranking tasks. We included attention trials to assess attentional control, and a reasoning task to measure transitive reasoning. To assess centration, we determined whether actual choices were consistent with attending to single attributes.
For analysis, we grouped children into the age groups K-1st (Kindergarten and 1st grade, 47 participants), 2nd-3rd (2nd and 3rd grades, 55 participants), 4th-5th (4th and 5th grades, 32 participants), and U (Undergraduates, 51 participants). The U age group was a control for adult level value-based decision-making. Appendix SI-1 contains a detailed description of the design and procedures.
Figure 12: Decision-making tasks. (A) In each of the 21 trials of the Choice tasks, participants were shown one combination of two options, left (Opt. 1) and right (Opt. 2). They touched one of three buttons displayed at the top of their screen to select an option or to express indifference (middle button). (B) In Ranking tasks, participants ranked all seven options from most preferred (green happy face) to least preferred (yellow neutral face). Both types of tasks were conducted in the (C) Goods domain involving toys, in the (D) Social domain involving sharing rules for self (hand pointing at self) and other (hand pointing right), and in the (E) Risk domain involving lotteries that consisted of quantities (number of toys) and probabilities (green share of the pie).
2.3 Experimental Results
We measured choice consistency by counting the number of transitivity violations (TV). Formally, for each triplet of items A, B and C, a TV occurs when the participant chooses A over B, B over C, and C over A. In the analysis, we considered all combinations of triplets of items and counted the number of times choices between those triplets were intransitive for each subject in each Choice task. Then, we computed the average number of TV by age group and Choice task. Appendix SI-2 explains the procedure and provides details of the statistical methods and results in this section.
Choice transitivity is domain-dependent (Fig.13). TV decreased with age in both the Goods and Social domains, while no two age groups differed significantly in the Risk domain. Also, young children were significantly more consistent in the Social than in the Goods domain and U were less consistent in the Risk than in the other two domains. However, participants in the U age group were not making significantly more transitive choices compared to participants in the 4th-5th age group in any domain, initially suggesting that the development of transitive decision-making stops around 4th grade in all domains. The result was robust to alternative ways of measuring choice inconsistencies (see Appendix SI-3.1).
Figure 13: Performance improves with age in the Goods-Choice and Social-Choice tasks but not in the Risk-Choice task. We report the average number of TV in the Choice tasks (y-axis) for each age group (x-axis) broken down by domain (Goods, Social, and Risk). Shadings correspond to the 95% confidence intervals.
Transitivity in the Goods domain improves gradually with age. Although consistency in the Goods-Choice task improved with age, that improvement was not uniform across all paired choices. In particular, trials featuring options ranked very differently in the Goods-Ranking task were unlikely to be involved in a TV. By contrast, trials featuring options ranked similarly were significantly more likely to be involved in TV (Fig.14). This general pattern was independent of age and suggested that it was more difficult to choose consistently when the options involved were close in value. We observed convergence to a state where participants almost never committed TV when choices involved their best (left column) or their worst (bottom row) options and most TV got concentrated in items with intermediate ranks (3rd, 4th and 5th). The result implies that children gradually "learn to know what they like most and least" with age.
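The transitivity-violation count defined in this section can be sketched as follows. This is a minimal illustration, not the procedure of Appendix SI-2: the function name and the data layout (a set of (chosen, rejected) pairs) are hypothetical, and the sketch assumes that a pair on which the participant expressed indifference is simply left out and therefore never completes a cycle.

```python
from itertools import combinations


def count_transitivity_violations(items, beats):
    """Count triplets {A, B, C} whose pairwise choices form a cycle.

    items: the options of one Choice task (seven in the study).
    beats: set of (chosen, rejected) pairs, one per strict choice.
           Pairs missing from the set are treated as indifference and
           never complete a cycle (an assumption of this sketch).
    """
    violations = 0
    for a, b, c in combinations(items, 3):
        # A over B, B over C, C over A ...
        forward = (a, b) in beats and (b, c) in beats and (c, a) in beats
        # ... or the same cycle in the opposite direction.
        backward = (b, a) in beats and (c, b) in beats and (a, c) in beats
        if forward or backward:
            violations += 1
    return violations


# Hypothetical subject: items 0..6, the lower index is always chosen,
# so all 21 pairwise choices are transitive.
items = list(range(7))
consistent = {(i, j) for i, j in combinations(items, 2)}

# Reverse a single choice, between the best and the third-best item.
one_flip = (consistent - {(0, 2)}) | {(2, 0)}

print(count_transitivity_violations(items, consistent))
print(count_transitivity_violations(items, one_flip))
```

Seven items yield 21 pairs and 35 triplets. In this toy example, reversing one choice between two items that are d ranks apart produces d - 1 cyclic triplets, one for each item ranked between them.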
"Looking consistent" in the Social domain and the role of simple policies. We did not expect to find that young children would be significantly more consistent in the Social-Choice task compared to the Goods-Choice task. One possible explanation is that goods are atomic and need to be evaluated as a whole. By contrast, social options can be decomposed into several simple attributes such as "objects for self," "objects for other," and "total number of objects," where each attribute is easy to evaluate consistently. As such, a participant might focus on a single attribute at a time (centration) and use simple rules to choose. To test that hypothesis, we listed all such simple rules (for example, "pick the option that gives self more objects, and if both give the same, pick the option that gives other more objects") and determined the fraction of participants who employed them. From now on, we call them 'simple policies,' although they may also correspond to simple heuristics or lexicographic preferences. Either way, these rules "buy" consistency, as they are easy to follow and unlikely to produce TV.
Figure 14: Transitivity violations decrease with age differently across choices. Each cell represents the color-coded average number of TV involving a higher ranked option (x-axis) and a lower ranked option (y-axis), as revealed by explicit rankings obtained in the Goods-Ranking task. Lighter colors reflect more violations. All age groups are more likely to make TV when options have similar ranks. There is convergence to a state where participants almost never commit TV when choices involve their best (left column) or their worst (bottom row) options. The vectors in the top right corner show the average gradient of the heatmap. Subjects become more consistent if the rank of the higher-ranked option increases (left-right gradient of the vector) and if the rank of the lower-ranked option decreases (up-down gradient of the vector).
The most popular simple policy was that of maximizing objects for self, then minimizing objects for other. The choices of 35% of K-1st, 20% of 2nd-3rd, 22% of 4th-5th and 4% of U were in line with this policy. When we removed from our sample all the subjects who used simple policies, the developmental signature matched that of the Goods domain (Fig.15, left panel). However, a systematic difference with the Goods domain persisted when we looked at how violations evolved between similarly- and differently-ranked options (Fig.16, top row (A)). We found that children learned to become consistent in choices involving their best options ("they learned to know what they liked most") but they still committed a substantial number of violations in choices involving their worst options ("they did not learn to know what they liked least"). This was true even in the U age group. Last, a noticeable trend was the gradual evolution of behavior towards more integrative decision rules, reflecting trade-offs between the two attributes. This result suggests that as children age, they become better able to think in terms of prosociality and social efficiency, confirming the results from studies on other-regarding preferences (Fehr, Bernhard, and Rockenbach, 2008).
Limited development of consistency in the Risk domain. Given that lotteries are also multi-attribute options, in this case probabilities and outcomes, we hypothesized that centration could also play a role in the Risk domain.
We listed all simple policies characterized by the evaluation of one attribute at a time (e.g., "pick the option with more goods, and if both have the same, pick the most likely option") and determined the fraction of participants with a behavior consistent with one of these policies. Again one of them was used dominantly: 44% of K-1st, 35% of 2nd-3rd, 19% of 4th-5th and 8% of U chose the lottery offering the larger reward. When we removed all subjects who used simple policies, we found that the developmental signature did not match that of the Goods and Social domains (Fig.15, right panel). In particular, the number of violations was the same across all school-age groups and this was significantly different from that of the U group. The result indicates that a potential milestone for Risk was outside our window of observation, somewhere in middle or high school. Also, children learned to become more consistent in choices involving their worst options ("they learned to know what they liked least"), but less so in choices involving their best options ("they did not learn to know what they liked most"), a trend opposite to the trend observed in the Social domain (Fig.16, bottom row (B)).

Figure 15: Left panel. Social vs. Goods after removing simple policies: the developmental signature of consistency is comparable across domains. Right panel. Risk vs. Goods after removing simple policies: the developmental signature of consistency is different across domains. All children are similarly inconsistent in Risk and improvements occur later in life. [Both panels: Transitivity Violations per Subject by Age Group (K & 1st, 2nd & 3rd, 4th & 5th, U), shown for all subjects and without heuristic subjects.]
Similar to the Social domain, however, older participants were better able to make integrative decisions, in this case, trade-offs between reward amounts and probabilities.

Who makes transitive choices? As noted earlier, centration took the form of using simple policies and it was strongly associated with consistency. In addition, among participants who did not use simple policies, intransitivity was strongly correlated across domains. Intransitivity was also strongly associated with mistakes in attention trials, indicating that attentional control also played a role. By contrast, performance in the transitive reasoning task was not a predictor, suggesting that logical reasoning per se was not a key determinant of choice transitivity. Finally and notably, participants whose decisions in the Choice tasks were more in line with the ordering displayed in the Ranking tasks also had fewer TV (see Appendix SI-3.2). This effect is not reminiscent of any known paradigm. It implies that the gradual and differential development of choice consistency across domains is not entirely explained by changes in attention, centration and logical reasoning. As children learn to know what they like best and least, their pairwise choices and explicit rankings become more consistent with each other. In the Appendix, we present some further investigation of the determinants of choice consistency (SI-3.2, SI-3.3, SI-3.4), the relationship between TV and attention control (SI-3.5), the relationship between TV and transitive reasoning (SI-3.6) as well as an integrative analysis of TV in each domain (SI-3.7).

Figure 16: Transitivity violations across choices among participants who do not use simple policies. (A) In the Social domain, participants converge to a state where TV rarely involve their highest-ranked options (vector with a horizontal gradient: "participants learn to know what they like"). (B) In the Risk domain, participants converge to a state where TV rarely involve their lowest-ranked options (vector with a vertical gradient: "participants learn to know what they do not like"). [Two rows of four heatmaps titled "Transitivity Violations in Social-Choice-Task and Risk-Choice-Task Plotted by Option Ranks in the Respective Ranking-Task for Each Age-Group": row A) Social, row B) Risk; columns K and 1st, 2nd and 3rd, 4th and 5th, Undergraduates; x-axis: Rank of Higher Ranked Option (1 to 6); y-axis: Rank of Lower Ranked Option (2 to 7).]

2.4 Discussion

Decision-making in the Goods domain: the developmental template. For each age group, the behavior observed in the Goods domain was consistent with the hypothesis that participants make choices by estimating and comparing noisy values. Under that hypothesis, decisions involving options close in value are confusing and prone to error, while decisions involving options valued differently are easy to make. The developmental trajectory we observed also suggests that the evaluation process becomes less and less noisy over time, reducing the number of confusing decisions, and hence the number of violations, especially among options ranked very differently (Fig.14). This pattern of improvement implies that the ability to make the simplest decisions is what solidifies in this developmental window. Over time, children learn to know with more accuracy what they like most and least.
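The noisy-value hypothesis described above can be illustrated with a short calculation. The sketch below is not the model estimated in this chapter: it assumes, purely for illustration, that each option's value is perceived with independent Gaussian noise of standard deviation `noise_sd`, and the function name `error_probability` is hypothetical. Under that assumption, the probability of picking the lower-valued option falls as the value gap grows and as the noise shrinks.

```python
from math import erf, sqrt

def error_probability(value_gap, noise_sd):
    """Probability of choosing the lower-valued option.

    Assumption (not from the text): each value is perceived with independent
    N(0, noise_sd^2) noise, so the difference of the two noise terms has
    standard deviation noise_sd * sqrt(2).
    """
    z = value_gap / (noise_sd * sqrt(2))
    # Standard normal CDF evaluated at -z.
    return 0.5 * (1.0 - erf(z / sqrt(2)))

# Options close in value are error-prone; options far apart are easy.
close_gap = error_probability(value_gap=1, noise_sd=1.0)
wide_gap = error_probability(value_gap=5, noise_sd=1.0)
# A less noisy evaluation process reduces errors at the same gap.
less_noise = error_probability(value_gap=1, noise_sd=0.5)
print(close_gap, wide_gap, less_noise)
```

Reading the value gap as a rank distance is itself a simplification, since ranks are only an ordinal proxy for values.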
In addition, the fact that inconsistency was not associated with the ability to reason transitively suggests that value-based reasoning requires the involvement of different brain regions compared to logical reasoning (ventromedial prefrontal cortex in the first case (Levy and Glimcher, 2012) and parietal regions in the second (Hinton, Dymond, von Hecker, and Evans, 2010)). Furthermore, the association of inconsistency with age and attentiveness suggests that improvements in decision-making in the Goods domain are partly driven by known age-related changes in attentional control.

Decision-making in the Social domain: the effect of centration. Reasoning over multiple attributes is known to be a difficult exercise for children 7 years of age and younger, and this ability develops during the concrete operational stage, somewhere between 2nd and 5th grade (Piaget, 1952; Crain, 2015). The high utilization of simple policies by children in the youngest age group is therefore consistent with centration. Participants who have not yet overcome centration are more likely to pick a simple policy that makes them look consistent to the outside observer. Among children who do not use simple policies, we observe the same trajectory as in the Goods domain, implying that centration conceals their underdeveloped decision-making system.

Decision-making in the Risk domain: too complex. As in the Social domain, making choices in the Risk domain requires the evaluation of multiple attributes. It is therefore natural to observe a similar tendency, especially among young children, to act according to simple policies that make them look consistent. However, integrative reasoning is substantially more complex in the Risk domain and participants who do not use simple policies are not nearly as consistent as in the other two domains.
All children were at the same level of performance, suggesting that the ability to trade off probabilities and rewards was not yet developed, perhaps as a consequence of a still underdeveloped working memory system (Gathercole, Pickering, Ambridge, and Wearing, 2004).

A different learning trajectory across domains. The results obtained for participants in the U age group are consistent with the typical findings in the choice consistency literature: that by adulthood, people have learned to know their preferences and are largely consistent in the Goods (Battalio, Kagel, Winkler, Fisher, Basmann, and Krasner, 1973; Cox, 1997) and Social domains (Andreoni and Miller, 2002; Fisman, Kariv, and Markovits, 2007). Our findings in the Risk domain are in accordance with the original transitivity studies (Loomes, Starmer, and Sugden, 1991), and less optimistic than recent GARP studies (Mattei, 2000; Choi, Fisman, Gale, and Kariv, 2007), perhaps due to differences in design. More importantly, our study identifies different learning trajectories across domains. In the Social domain, children learn to pick their most-preferred option. In the Risk domain, children learn to avoid their least-preferred option. In the Goods domain, children learn both. These asymmetries cannot be explained by the existence of an obvious best option in the Social domain and an obvious worst option in the Risk domain, since the most and least favorite items change significantly across age groups (see Appendix SI-3.4 for details). Nor can they be explained by both an obvious best and worst option in the Goods domain, since the sets of goods are different for different gender and age groups (see Appendix SI-1 for details). Moreover, the goods ranked highest and lowest within each age group vary substantially across individuals. Finally, the development of attentional control cannot be the main explanation either, since it should act similarly across domains.
Overall, the results suggest that self-knowledge of preferences follows its own developmental trajectory.

To conclude, we have investigated the developmental trajectories of transitive decision-making in the Goods, Social, and Risk domains in children from Kindergarten to 5th grade and compared consistency levels with adult-level performance. We have found that transitivity in choices develops gradually, but differentially across domains. It is not related to the development of transitive reasoning and it is only partially explained by the development of attentional control. Instead, choice transitivity follows a common trajectory across domains supported by the gradual improvement of self-knowledge of preferences.

2.5 References

Andreoni, J. and J. Miller (2002). Giving According to GARP: An Experimental Test of the Consistency of Preferences for Altruism. Econometrica 70 (2), 737-753.

Astle, D. E. and G. Scerif (2008). Using developmental cognitive neuroscience to study behavioral and attentional control. Developmental Psychobiology 51 (2), 107-118.

Battalio, R. C., J. H. Kagel, R. C. Winkler, E. B. Fisher, R. L. Basmann, and L. Krasner (1973). A test of consumer demand theory using observations of individual consumer purchases. Economic Inquiry 11 (4), 411.

Bouwmeester, S. and K. Sijtsma (2006). Constructing a transitive reasoning test for 6- to 13-year-old children. European Journal of Psychological Assessment 22 (4), 225-232.

Brocas, I., T. D. Combs, J. D. Carrillo, and N. Kodaverdian (2016). Consistency in Simple vs. Complex Choices over the Life Cycle. Pre-press.

Choi, S., R. Fisman, D. Gale, and S. Kariv (2007). Consistency and Heterogeneity of Individual Behavior under Uncertainty. The American Economic Review 97 (5), 1921-1938.

Cox, J. (1997). On Testing the Utility Hypothesis. The Economic Journal 107 (443), 1054-1078.

Crain, W. (2015). Theories of Development: Concepts and Applications (6 ed.).

Davidson, M. C., D. Amso, L. C. Anderson, and A. Diamond (2006). Development of cognitive control and executive functions from 4 to 13 years: Evidence from manipulations of memory, inhibition, and task switching. Neuropsychologia 44 (11), 2037-2078.

Donaldson, M. (1982). Conservation - What is the question. British Journal of Psychology 23 (2), 199.

Fehr, E., H. Bernhard, and B. Rockenbach (2008). Egalitarianism in young children. Nature 454 (28), 1079-1084.

Fisman, R., S. Kariv, and D. Markovits (2007). Individual Preferences for Giving. The American Economic Review 97 (5), 1858-1876.

Gathercole, S. E., S. J. Pickering, B. Ambridge, and H. Wearing (2004). The Structure of Working Memory From 4 to 15 Years of Age. Developmental Psychology 40 (2), 177-190.

Harbaugh, W. T. and K. Krause (2000). Children's Contributions in Public Good Experiments: The Development of Altruistic and Free-riding Behaviors. Economic Inquiry 38 (1), 95-109.

Harbaugh, W. T., K. Krause, and T. R. Berry (2001). GARP for Kids: On the Development of Rational Choice Behavior. The American Economic Review 91 (5), 1539-1545.

Harbaugh, W. T., K. Krause, and L. Vesterlund (2002). Risk Attitudes of Children and Adults: Choices Over Small and Large Probability Gains and Losses. Experimental Economics 5 (1), 53-84.

Hinton, E. C., S. Dymond, U. von Hecker, and C. J. Evans (2010). Neural correlates of relational reasoning and the symbolic distance effect: Involvement of parietal cortex. Neuroscience 168 (1), 138-148.

Levy, D. J. and P. W. Glimcher (2012). The root of all value: A neural common currency for choice. Current Opinion in Neurobiology 22 (6), 1027-1038.

List, J. A. and D. L. Millimet (2008). The Market: Catalyst for Rationality and Filter of Irrationality. The BE Journal of Economic Analysis & Policy 8 (1), Article 47.

Loomes, G., C. Starmer, and R. Sugden (1991). Observing violations of transitivity by experimental methods. Econometrica, 425-439.

Mattei, A. (2000). Full-scale real tests of consumer behavior using experimental data. Journal of Economic Behavior & Organization 43 (4), 487-497.

Piaget, J. (1948). La psychologie de l'intelligence. Revue Philosophique de Louvain 46 (10), 225-227.

Piaget, J. (1952). The child's conception of numbers. Trans. eds C. Gattegno and FM Hodgson, NY: Routledge.

Piaget, J., D. Elkind, and A. Tenzer (1967). Six psychological studies.

Reyna, V. F. and S. C. Ellis (1994). Fuzzy-trace theory and framing effects in children's risky decision making. Psychological Science 5 (5), 275-279.

Samuelson, P. A. (1938). A Note on the Pure Theory of Consumer's Behaviour. Economica 5 (17), 61-71.

Sher, I., M. Koenig, and A. Rustichini (2014). Children's strategic theory of mind. Proceedings of the National Academy of Sciences 111 (37), 13307-13312.

Smedslund, J. (1960). Transitivity of preference patterns as seen by preschool children. Scandinavian Journal of Psychology 1 (1), 49-54.

Varian, H. R. (1982). The Nonparametric Approach to Demand Analysis. Econometrica 50 (4), 945-973.

2.6 Supporting Information

2.6.1 SI-1. Design and procedures

The experiment was reviewed and approved by the IRB of the University of Southern California. It was conducted through tablet computers and the tasks were programmed with the Psychtoolbox software, an extension of Matlab.

Participants. We recruited 134 children from kindergarten (K) to 5th grade from Lycée International of Los Angeles, a bilingual private school. We ran 18 sessions, each with 5 to 10 subjects and lasting between 1 and 1.5 hours. Sessions were conducted in a classroom at the school. In all of our tasks, children had to make choices between options involving goods. To make sure that options were desirable to all participants, we organized sessions by gender and age group: K to 2nd grade boys, K to 2nd grade girls, 3rd to 5th grade boys, and 3rd to 5th grade girls. Goods included toys and stationery.
As a control, we ran 7 sessions with 51 undergraduate students (U). These were conducted in the Los Angeles Behavioral Economics Laboratory (LABEL) in the department of Economics at the University of Southern California. For the undergraduate population, participants were recruited from the LABEL subject pool. Instead of toys or stationery, we used snack foods. A description of the distribution of our participants is reported in Table 7.

          K    1st   2nd   3rd   4th   5th   U
Male      12   12    15    20    11    6     22
Female    7    16    11    9     8     7     29
Total     19   28    26    29    19    13    51

Table 7: Description of participants by age and gender.

Each participant completed several tasks. In Ranking tasks, participants received 7 cards, each with a picture of one of the options. Participants were instructed to rank these cards on a ranking board attached to their desk from most to least preferred. Once a participant had finished ranking the options, an experimenter transferred her rankings onto her tablet. There were three Ranking tasks. In the Goods-Ranking task, options were toys. In the Social-Ranking task, options were sharing rules for oneself and another child of the same age and gender in another school. The sharing rules contained different numbers of a single toy. To ensure the toy was highly desirable, we selected the one that was ranked as first favorite in the Goods-Ranking task. In the Risk-Ranking task, options were lotteries. In each lottery, the participant could earn a given number of toys with a given probability. Probability was represented by a circle with outcomes shown in white and green, with the green area corresponding to the probability of winning the good(s). Each circle corresponded to a spinner wheel (previous experiments on risk with children have successfully employed a similar design with a spinner wheel (Reyna and Ellis, 1994; Harbaugh et al., 2002)). Again, to ensure that options were desirable, we chose the toy that was ranked second favorite in the Goods-Ranking task.
In Choice tasks, participants were presented with all 21 pairwise combinations of the 7 options in the corresponding Ranking task. We will refer to these tasks as the Goods-Choice task, Social-Choice task and Risk-Choice task. In each trial, one option was displayed on the left side and another on the right of the tablet's screen. A participant could select their preferred option (by touching a button above the left or right options on their screen), or report being indifferent between them (by touching a button between the two options). The 21 pairs were presented randomly, both in terms of trial order and in terms of left-right presentation. These tasks are represented in Fig.12 and the specific options are displayed in Figs. 17 (Goods), 18 (Social), and 19 (Risk). In each Choice task, the 22nd trial was an attention trial, where subjects were presented with their most frequently and least frequently chosen options from the preceding 21 trials (see footnote 2).

Footnote 2: As the 21 pairwise choices were being made, the computer awarded 1 point to the option selected by the participant and 0.5 points to each option in case of indifference. At the end of the 21st trial, the tally for each option was summed and the options with the most and least points were determined. In case of a tie, one of the options was chosen randomly.

Figure 17: Toys used in the Goods-Choice and Goods-Ranking tasks. Children of different age and gender were presented different toys.

Figure 18: Sharing rules between self and other used in the Social-Choice and Social-Ranking tasks. For children, each circle represented one toy, personalized to ensure desirability. For undergraduate students, each token represented $2.

Figure 19: Lotteries used in the Risk-Choice and Risk-Ranking tasks. For children, each circle represented one toy, personalized to ensure desirability. For undergraduate students, each token represented $2.

For analysis, participants also completed a Transitive Reasoning task. This task was designed to measure levels of transitive reasoning. Several tasks have been proposed in the literature (Bouwmeester and Sijtsma, 2006). In order to test the relation between transitive choice and transitive reasoning devoid of memory or operational reasoning requirements, we opted for a new design; we constructed a test that does not require memory and is visually represented. The task consisted of seven questions of varying difficulty. Each of the seven questions consisted of two premises represented in two vignettes, and a third vignette with a response prompt. For each premise, participants were told that the animals shown in the vignette were at a party and the oldest wore a hat. They had to determine which animal in the third vignette should wear the hat. This is illustrated in Fig.20. Three of the seven questions did not require transitive reasoning and were included to test whether participants were paying attention. These are referred to as "pseudotransitivity" trials (Bouwmeester and Sijtsma, 2006). The remaining four did require transitive reasoning. Of the four transitivity questions, two were less difficult and two were more difficult.

Figure 20: Transitive Reasoning Task. The animal wearing the hat is the oldest in each vignette on the left. The participant had to answer in the vignette on the bottom right by choosing the animal he thought was the oldest given the information on the left, or by reporting it could not be known (?).

All subjects completed the tasks in the same order: (1) Goods-Choice task, (2) Goods-Ranking task, (3) Transitive Reasoning task, (4) Social-Choice task, (5) Social-Ranking task, (6) Risk-Choice task and (7) Risk-Ranking task. All tasks were untimed. Upon the completion of a given task by all subjects, instructions for the next task were given.

To incentivize participants, we told them that it was important to choose options they liked, because their choices would determine what they would receive at the end of the experiment. This was explained accessibly through a simple analogy. From each of the Choice tasks, one choice was randomly selected by the computer and subjects received their selection in that trial. For the Social-Choice task, we explained that each participant had been paired with another student, of their same gender and grade level, from another school in Los Angeles (undergraduates were paired with a subject from another session, matched for gender). We explained that the selected sharing rule would actually be implemented: we would go to the other school and deliver to the other student the goods that were represented in the selected option. The shared items were delivered to Foshay elementary school, a public school in LAUSD. For the Risk-Choice task, we explained that their choice in the selected trial would be implemented and that the corresponding spinner wheel would be spun by a blindfolded assistant at the end of the session that day. If the spinner arrow landed in the green part of the wheel, the subject would win the number of toys (or money, in the case of undergraduates) associated with that choice. Otherwise, they would not win anything from this task. This was implemented at the end of each session.

Before participants left, we collected demographic information consisting of "gender," "grade," "number of younger siblings," and "number of older siblings." All participants also received a fixed show-up fee. Children received their highest ranked item in the Goods-Ranking task while undergraduate students received $5.

2.6.2 SI-2. Analysis of transitivity violations

We collected data from all participants in the Goods-Choice and Goods-Ranking tasks.
The tablets did not record the choices of two subjects in the Risk-Choice and Risk-Ranking tasks and one subject in the Social-Choice and Social-Ranking tasks.

Transitivity violations. As briefly explained in the main text, we measured consistency by counting the number of transitivity violations between all triplets of options: A vs. B, B vs. C and A vs. C. With 7 options, there are 35 triplets to consider. Importantly, considering triplets is enough to account for all TV. Indeed, it is trivial to show that a violation involving four or more items (such as, for example, A chosen over B, B over C, C over D and D over A) necessarily involves at least one violation between three items of that set. Also, given that we allowed participants to express indifference between options, we allocated 0.5 violations to triplets involving weak violations, that is, when we observed A chosen over B, B chosen over C, and A indifferent to C. We first counted the number of times choices between triplets of items were intransitive for each subject in each Choice task. Then, we computed the average number of violations by age group and Choice task. We report the results in Fig.13.

In the Goods domain, we found a significant improvement between consecutive age groups, with 4th-5th being at the same level of performance as U subjects. More specifically, participants in K-1st had significantly more violations than participants in 2nd-3rd (p-value = 0.0050), participants in 2nd-3rd had significantly more violations than participants in 4th-5th (p-value = 0.0355), but violations of the 4th-5th and U groups were not significantly different (p-value = 0.0835). In the Social domain, a similar pattern was observed. TV were significantly different between the K-1st age group and the 4th-5th age group (p-value = 0.0162), between the K-1st age group and the U age group (p-value = 0.0002) and between the 2nd-3rd age group and the U age group (p-value = 0.0048), while all other differences in TV were not statistically significant.
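The triplet-counting rule described in the "Transitivity violations" paragraph can be sketched in a few lines. This is an illustrative reconstruction, not the analysis code used in the study: the function names and the `choices` dictionary layout are assumptions, and triplets with two or three reported indifferences are scored as zero because the text only specifies the strict cycle (1 violation) and the single-indifference case (0.5 violation).

```python
from itertools import combinations

def make_pref(choices):
    """Wrap a dict {(x, y): chosen option, or None for indifference}."""
    def pref(x, y):
        winner = choices[(x, y)] if (x, y) in choices else choices[(y, x)]
        if winner is None:              # participant reported indifference
            return 0
        return 1 if winner == x else -1
    return pref

def triplet_violation(a, b, c, pref):
    """Score one triplet: 1 for a strict cycle, 0.5 for a weak violation."""
    # Read the three choices around the cycle a -> b -> c -> a.
    signs = (pref(a, b), pref(b, c), pref(c, a))
    if 0 not in signs:
        return 1.0 if signs[0] == signs[1] == signs[2] else 0.0
    if signs.count(0) == 1:
        strict = [s for s in signs if s != 0]
        # Two strict choices pointing the same way around the cycle form a
        # chain (X over Y, Y over Z) whose end points were called indifferent.
        return 0.5 if strict[0] == strict[1] else 0.0
    return 0.0  # assumption: two or three indifferences are not scored

def count_violations(options, choices):
    pref = make_pref(choices)
    return sum(triplet_violation(a, b, c, pref)
               for a, b, c in combinations(options, 3))

# Hypothetical participant: 7 options labeled 1 (best) to 7 (worst),
# choosing the better-labeled option in all 21 pairs.
options = list(range(1, 8))
choices = {(a, b): a for a, b in combinations(options, 2)}
print(count_violations(options, choices))
```

Reversing a single pairwise choice in this example, say picking option 3 over option 1, creates exactly one cyclic triplet (1, 2, 3), which is how one inconsistent trial shows up as one TV.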
In the Risk domain, however, there was no apparent improvement. No age group of children had significantly more violations than U.

When we compared TV across domains, we found that participants in the K-1st age group had significantly more violations in the Goods than in the Social domain (p = 0.0009). A similar result held for the 2nd-3rd age group (p = 0.0117). Participants in the U group also had significantly more violations in the Risk than in the Goods (p < 0.0001) or Social domains (p < 0.0001).

Analysis of transitivity in the Goods domain. We used the explicit ranking elicited in the Goods-Ranking task to assess the frequency with which items were involved in transitive violations as a function of their ranking. For each transitivity violating triplet, we assigned a score of 1 to its 3 corresponding items. Each item was then allocated the percentage of times it was involved in a violation (the number of times it obtained a score of 1 over all possible times). By repeating this exercise for all participants and then averaging over all of them, we obtained the percentage of times violations occurred in choices between options ranked x and y in each age group. Naturally, ranks x and y corresponded to different specific items since each participant had her own rankings of options. Fig.14 represents the color-coded result of this exercise. Darker colors are used to represent fewer violations.

Next, we studied the marginal sensitivity to violations to learn if some choices were more consistent than others, and if so, why. Intuitively when choices are easy to make, inconsistencies are more unlikely. Easiness might just be a matter of picking what we like. Or it might be a matter of avoiding what we do not like. Assuming that ranks derived from the Ranking tasks give a proxy for ordinal values, the more (less) valuable options should be those with higher (lower) ranks.
If consistency is only driven by choosing what is liked, it should not matter what the lower-ranked option is, and if consistency is only driven by avoiding what is liked least, then it should not matter what the higher-ranked option is. We tested this conjecture by analyzing how consistency changes if we marginally change the rank of the higher- or lower-ranked option. Intuitively, in Fig.14, if participants are making choices in a given cell and exhibit a certain level of consistency, how much more (or less) consistent do they become by moving one cell to the right (changing the rank of the higher-ranked item but not the lower-ranked item) or by moving one cell up (changing the rank of the lower-ranked item but not the higher-ranked item)? More specifically, we considered every pair of adjacent boxes in the same row and we determined the difference in violations between a box and the box to the right of it. We then computed the average difference over all pairs of boxes and reported this number as the left-right gradient of the vector in the corner of Fig.14 for each age group. A vector with a larger x-coordinate means a greater increase in violations when the higher-ranked option is decreased by one rank. We did the same with every pair of adjacent boxes in the same column to determine the up-down gradient. In this case, a vector with a larger y-coordinate means a greater increase in violations when the lower-ranked option is increased by one rank. As expected, the vectors of all four age groups have upper-right gradients, indicating more violations when options are ranked more closely in either dimension (lower rank for the higher option or higher rank for the lower option). It also implies that the highest number of violations occurs between items of intermediate ranks (3rd, 4th, 5th). The left graph of Fig.21 presents a heatmap of all vectors in the Goods domain and confirms the increase in violations when the higher- and lower-ranked options are closer to each other.
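The adjacent-box averaging just described can be sketched as follows. This is one possible reading, not the original analysis code: the function name `average_gradient`, the dictionary layout `heat[(h, l)]` (violation rate for a higher-ranked option of rank `h` and a lower-ranked option of rank `l`), and the sign convention (value of the box one step to the right or one step up, minus the value of the current box) are assumptions, and the heatmap values below are synthetic.

```python
def average_gradient(heat):
    """Average left-right and up-down differences between adjacent boxes.

    heat[(h, l)] holds the violation rate for pairs whose higher-ranked
    option has rank h and whose lower-ranked option has rank l (h < l).
    """
    # Same row (same l): compare each box with the box to its right (h + 1).
    dx = [heat[(h + 1, l)] - heat[(h, l)] for (h, l) in heat if (h + 1, l) in heat]
    # Same column (same h): compare each box with the box above it (l - 1),
    # assuming rank 7 is drawn on the bottom row as in the figure captions.
    dy = [heat[(h, l - 1)] - heat[(h, l)] for (h, l) in heat if (h, l - 1) in heat]
    return sum(dx) / len(dx), sum(dy) / len(dy)

# Synthetic heatmap for 7 options: more violations when ranks are closer.
heat = {(h, l): 1.0 / (l - h) for h in range(1, 7) for l in range(h + 1, 8)}
x_coord, y_coord = average_gradient(heat)
print(x_coord, y_coord)
```

With this synthetic input both coordinates come out positive, which corresponds to the upper-right direction described for the Goods domain; a heatmap that varies only along one axis would produce a vector lying along that axis.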
Finally, a t-test confirms that both the x- and y-coordinates are positive and significantly different from 0 (p-value < 0.01) overall and for all age groups, with the exception of the x-coordinate in the K-1st age group. Overall, in the Goods domain, all groups were significantly driven by the rank of the lower-ranked option, and all groups except K-1st were driven by the rank of the higher-ranked option.

Figure 21: Sensitivity analysis. [Three panels: Goods, Social, Risk.]

Analysis of transitivity in the Social domain. An option in the Social domain can be decomposed into three attributes that can be visually assessed through counting: the reward to self, the reward to other, and the total reward. When comparing attributes, a participant can use three simple rules: pick the maximum, pick the minimum, or be indifferent. A participant could apply a rule to a single attribute (a policy such as "pick option with maximum reward for self"); a participant could alternatively apply a rule to one attribute then move to a second attribute and apply another rule (a policy such as "pick option with maximum reward for self and if they are the same, pick option with maximum total reward"). We looked at all such policies, and only 3 were used by subjects: "maximize amount for self, then minimize amount for other," "maximize amount for self, then maximize amount for other" and "maximize amount for self, irrespective of amount for other." We counted all participants who complied with any such policy and those who made exactly one mistake with respect to it. We will call these participants simple policy users. Among them, 61% chose the policy "maximize amount for self, then minimize amount for other." We removed all simple policy users from our sample, leaving us with 125 participants. We computed TV by age group after the exclusion and found that the results were now similar to those obtained in the Goods domain.
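The classification of simple policy users described above can be sketched for the most common policy, "maximize amount for self, then minimize amount for other." The sketch is illustrative only: the option set is hypothetical (the actual sharing rules appear in Fig.18 and are not reproduced here), the function names are assumptions, and any deviation from the policy's prediction, including an indifference report where the policy predicts a strict choice, is counted as a mistake.

```python
from itertools import combinations

def self_then_min_other(a, b):
    """Policy pick between two (self, other) allocations, None if indifferent."""
    if a[0] != b[0]:
        return a if a[0] > b[0] else b   # more objects for self wins
    if a[1] != b[1]:
        return a if a[1] < b[1] else b   # tie on self: fewer objects for other
    return None

def is_policy_user(options, choices, policy, max_mistakes=1):
    """True if the observed choices deviate from the policy at most max_mistakes times."""
    mistakes = sum(choices[(a, b)] != policy(a, b)
                   for a, b in combinations(options, 2))
    return mistakes <= max_mistakes

# Hypothetical sharing rules written as (objects for self, objects for other).
options = [(4, 0), (3, 1), (3, 3), (2, 2), (2, 4), (1, 1), (0, 4)]

# A participant who follows the policy in all 21 pairs.
perfect = {(a, b): self_then_min_other(a, b) for a, b in combinations(options, 2)}

# The same participant with one deviating trial.
one_slip = dict(perfect)
one_slip[((4, 0), (3, 1))] = (3, 1)

print(is_policy_user(options, perfect, self_then_min_other))
print(is_policy_user(options, one_slip, self_then_min_other))
```

The same `is_policy_user` helper would work for the other policies by passing a different policy function, which keeps the one-mistake tolerance in a single place.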
The development of consistency followed the same pattern: participants in the K-1st age group were significantly more inconsistent than all other participants (all p-values < 0.01). Participants in the 2nd-3rd age group had significantly more violations than participants in the U age group (p-value = 0.0144) and participants in the 4th-5th age group were not significantly different from participants in the U age group (p-value = 0.1205). This is represented in Fig.15 (left). We repeated the heatmap analysis we performed on the Goods-Choice task. The result is represented in Fig.16(A). We also performed the same marginal sensitivity analysis as in the Goods domain after removing simple policy users. The x-coordinates were positive and significantly different from 0 (p-value < 0.01), overall and for all age groups except for participants in the K-1st age group. However, and contrary to the Goods domain, the y-coordinates were not significantly different from 0 for any age group. This suggests that participants had a very clear idea of what they liked best but a much less clear idea of what they liked least in the Social domain. The result can also be seen in the heatmap representation of the vectors (Fig.21, center).

Analysis of transitivity in the Risk domain. The procedure to elicit simple policies was the same. In the Risk domain, there are two attributes per option, reward amount and probability, and three simple rules: pick the maximum, pick the minimum, or be indifferent. We defined 6 policies, but only 4 were ever used by subjects: "maximize the amount, irrespective of the probability," "maximize the probability, then maximize the amount," "minimize the amount, irrespective of the probability," and "minimize the probability, irrespective of the amount." We counted all participants who complied exactly with any such policy as well as those who made exactly one mistake with respect to a policy.
Among those, 89% chose the policy "maximize the amount, irrespective of the probability." We removed from our sample all participants whose behavior was consistent with a simple policy, leaving us with 128 subjects. We analyzed TV again and found that violations by the U group were different from those by any younger group (p-value = 0.0019 for the comparison with K-1st, p-value = 0.0050 for the comparison with 2nd-3rd, and p-value = 0.0153 for the comparison with 4th-5th), but that all school-age groups performed at similar levels to each other. This is represented in Fig.15 (right). We also repeated the heatmap analysis. The result is represented in Figure 16(B). Again, we performed a marginal sensitivity analysis after removing simple policy users and found the opposite of the result in the Social domain: y-coordinates were significantly different from 0 overall and for two age groups (4th-5th and U, t-test p-values < 0.05), whereas x-coordinates were not different from 0 in any age group (Fig.21, right). This means that in the Risk domain, participants had a clear idea of what they liked least but a much less clear idea of what they liked best.

Transitivity violations across domains. The aggregate analysis of TV indicated that consistency improves with age. We addressed the question of how individual scores vary across domains to assess whether participants who committed relatively more violations in one domain were also those who committed relatively more violations in a different domain. Said differently, we wanted to know whether consistency was driven (at least partially) by a common factor or whether it resulted from the development of domain-specific skills. When considering the full sample, we found that TV in the Goods and Risk domains were not correlated with one another, while TV in the Goods and Social or TV in the Risk and Social domains were (Pearson coefficient = 0.38 and 0.30, respectively; p-value 0.0001 for both).
When we removed simple policy users, we found that TV were significantly correlated across all domains (Pearson coefficient = 0.36, p-value = 0.0006 between the Goods and Risk domains, Pearson coefficient = 0.43, p-value < 0.0001 between the Goods and Social domains, and Pearson coefficient = 0.49, p-value < 0.0001 between the Risk and Social domains), suggesting that participants' consistency was partially driven by the development of a skill useful in all domains.

Comparison with random play. To assess whether participants committing many violations might have been acting randomly, we simulated random players. Depending on the probability we assigned to their expressing indifference, the number of TV was between 8 and 10, substantially above the actual numbers obtained even among the most inconsistent participants. This was in line with earlier literature on consistency in children (Harbaugh et al., 2001).

2.6.3 SI-3. Additional statistical analysis

SI-3.1. Other measures of choice inconsistencies

Choice reversals and choice removals. Number of violations is only one possible way to address the issue of consistency. An alternative way is to look at severity of violations. We computed two measures of violation severity for each subject. The first one counts the number of choices that need to be reversed to restore transitivity. To compute that number, an algorithm sequentially changes the choices made in each trial, and computes a TV score after each change. If the score is 0 at any point, the algorithm stops. After all single trials have been exhausted, the algorithm repeats the procedure over pairs of trials, then triplets and so on. We found that choice reversals followed a very similar pattern across age groups as TV (Fig.22, left), and were highly correlated to it in all three Choice tasks: Goods (Pearson = 0.95, Spearman = 0.98, p < 0.0001), Social (Pearson = 0.93, Spearman = 0.98, p < 0.0001), and Risk (Pearson = 0.92, Spearman = 0.95, p < 0.0001).
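The choice-reversal search described above can be sketched as a brute-force procedure. This is an illustrative sketch under simplifying assumptions, not the authors' algorithm: it handles strict choices only (indifference is ignored), it counts a violation as a cyclic triple, and the names `count_tv` and `min_reversals` are hypothetical.

```python
from itertools import combinations


def count_tv(options, beats):
    """Count intransitive (cyclic) triples among strict pairwise choices.

    beats[(a, b)] is True if a was chosen over b, False if b was chosen over a.
    Each pair of options appears once, in either order.
    """
    def wins(a, b):
        return beats[(a, b)] if (a, b) in beats else not beats[(b, a)]

    tv = 0
    for a, b, c in combinations(options, 3):
        # Three strict choices form a cycle exactly when a>b, b>c, c>a
        # all hold or all fail.
        if wins(a, b) == wins(b, c) == wins(c, a):
            tv += 1
    return tv


def min_reversals(options, beats):
    """Smallest number of choices to flip so that no violation remains."""
    pairs = list(beats)
    # Try single trials first, then pairs of trials, then triplets, and so on.
    for k in range(len(pairs) + 1):
        for subset in combinations(pairs, k):
            flipped = dict(beats)
            for pair in subset:
                flipped[pair] = not flipped[pair]
            if count_tv(options, flipped) == 0:
                return k
    return None  # not reached: some set of flips always yields a linear order


# Made-up choices with one cycle (A over B, B over C, C over A).
options = ["A", "B", "C", "D"]
beats = {("A", "B"): True, ("B", "C"): True, ("A", "C"): False,
         ("A", "D"): True, ("B", "D"): True, ("C", "D"): True}
print(count_tv(options, beats), min_reversals(options, beats))
```

The search is exponential in the worst case, which is manageable here because each Choice task has only 21 pairwise trials and few reversals are typically needed.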
The second of these severity measures counts the number of choices that need to be removed to restore transitivity. We used an algorithm similar to that for counting choice reversals. We found that choice removals and choice reversals were almost identical in capturing severity of violations (Fig.22 right and left), so this measure also closely reflected the age patterns observed with TV in the three Choice tasks.

Figure 22: Analysis of severity of violations. Left: choice reversals across domains and age. Right: Choice removals across domains and age. (Axes: age group (K-1st, 2nd-3rd, 4th-5th, U) against choice reversals or choice removals per subject; series: Goods, Social, and Risk Choice tasks.)

Implicit ranking and choice noisiness. From a subject's selections in a Choice task, it is possible to extract their implicit (or revealed) ranking for the options in that task. We computed this revealed ranking in a very simple way, by tallying their selections "for" and "against" each of the options as briefly explained in footnote 2: each time the participant selected an option, 1 point was added to the running tally of that option and when the participant expressed indifference, 0.5 points were added. The tallied points for each option were summed, and the options were ordered according to this sum, giving the subject's implicit ranking. The subject's choices were then checked for inconsistencies against their implicit ranking using the following simple rule. Suppose that, according to the tally of the implicit ranking, option B was ranked lower than A. The participant would receive a score of 0 if A was chosen over B (showing consistency with the implicit ranking), a score of 0.5 if he expressed indifference between A and B, and a score of 1 if B was chosen over A (showing inconsistency with the implicit ranking). We called this classification error score ICR (Inconsistency in Choices given the Revealed ranking).
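The tally and the ICR score just defined can be sketched in a few lines. The sketch makes two assumptions that the text does not settle: an indifferent choice adds 0.5 points to each of the two options involved, and pairs whose options tie in the tally contribute nothing to ICR. The function names are hypothetical.

```python
def tally(options, trials):
    """Points per option: 1 for each selection, 0.5 for each indifference.

    trials: list of (a, b, pick), where pick is a, b, or None for indifference.
    """
    points = {option: 0.0 for option in options}
    for a, b, pick in trials:
        if pick is None:
            points[a] += 0.5  # assumption: both options receive 0.5
            points[b] += 0.5
        else:
            points[pick] += 1.0
    return points


def icr(options, trials):
    """Inconsistency of choices with the ranking implied by the tally."""
    points = tally(options, trials)
    score = 0.0
    for a, b, pick in trials:
        if points[a] == points[b]:
            continue  # assumption: tied options imply no order, so no error
        worse = b if points[a] > points[b] else a
        if pick is None:
            score += 0.5
        elif pick == worse:
            score += 1.0
    return score


# Made-up choices over four options, for illustration only.
options = ["A", "B", "C", "D"]
trials = [("A", "B", "A"), ("A", "C", "A"), ("A", "D", "D"),
          ("B", "C", "B"), ("B", "D", "B"), ("C", "D", None)]
print(tally(options, trials), icr(options, trials))
```

Because the ranking is built from the same choices it is checked against, the score is best read as a measure of noise around the revealed ranking rather than as an independent test.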
Notwithstanding the endogeneity issues with this measure (we use choices to extract a revealed ranking then check that ranking against the very choices that were used to create it), it still provides a measure of choice "noisiness." As can be seen by comparing Fig.13 with Fig.23 (left), the age pattern of choice inconsistencies with implicit rankings (ICR) closely resembles that of TV: participants who make more TV are those who are more "noisy" (make more mistakes) around their implicit ranking.

Explicit rankings and choices. As an additional measure, we evaluated inconsistencies between choices in Choice tasks and the explicit rankings elicited in Ranking tasks. This measure was computed in a similar way as for inconsistencies with respect to implicit rankings, that is, we checked the subject's choices for inconsistencies against their explicit ranking. We called the new classification error score ICE (Inconsistency in Choices given the Explicit ranking). One advantage of this measure over the previous one is the absence of endogeneity (errors are computed from rankings in two different tasks). The results are illustrated in Fig.23 (right). In the Goods domain, we found that participants in the K-1st and 2nd-3rd age groups had relatively more difficulty making choices consistent with their explicit rankings compared to older participants (p-values < 0.0167 for all comparisons). A similar story held qualitatively in the Social domain, except that the participants in the K-1st and 2nd-3rd age groups were not significantly different from each other, nor were the older participants in groups 4th-5th and U. This suggests a less gradual development of the ability to choose from an explicit ranking. We found that inconsistencies in the Social domain followed the same trend as inconsistencies in the Goods domain after removing subjects who used simple policies.
In the Risk domain, we found that the level of inconsistencies was high and similar across all subjects: they were not able to make choices consistent with their explicit rankings. This result changed when we removed subjects who used simple policies: with age, the remaining subjects became more capable of expressing their explicit rankings in their choices. Nevertheless, the level of inconsistencies remained higher than in the Goods domain for all ages.

Figure 23: Ranking and choices. Left: Inconsistencies with respect to implicit ranking (ICR). Right: Inconsistencies with respect to explicit ranking (ICE). (Axes: age group (K-1st, 2nd-3rd, 4th-5th, U) against ICR or ICE per subject; series: Goods, Social, and Risk Choice tasks.)

Explicit vs. implicit ranking. The implicit ranking is revealed by choices and does not need to coincide with the explicit ranking elicited in a Ranking task. For each participant and task, we computed a measure of distance between those rankings. We used a very simple procedure where we assigned to each option a score representing the absolute difference between its rank in the implicit and explicit rankings (e.g., if A was ranked 3rd according to the explicit ranking and 2nd according to the implicit ranking, its score was |3 - 2| = 1). We then computed the average score of all options as each subject's discrepancy score. Fig.24 depicts those scores by domain and age group. As can be seen from the graph, the patterns are similar to TV. In the Goods domain, the ability to choose according to one's explicitly disclosed preferences was found to gradually develop. The same is true in the Social domain for age groups 2nd-3rd and above. By contrast, discrepancies remained constant with age in the Risk domain, suggesting a persistent inability to draw choices from explicit preferences.
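The discrepancy score is simply the mean absolute difference in rank positions. A minimal sketch follows, assuming each ranking is given as a mapping from option to rank (1 = best); how ties in the implicit ranking were broken is not stated in the text, so the example avoids them. The function name is hypothetical.

```python
def discrepancy(explicit, implicit):
    """Mean absolute rank difference between two rankings of the same options.

    explicit, implicit: dicts mapping each option to its rank (1 = best).
    """
    return sum(abs(explicit[o] - implicit[o]) for o in explicit) / len(explicit)


# Made-up rankings: A and B swap places, C and D agree.
explicit = {"A": 1, "B": 2, "C": 3, "D": 4}
implicit = {"A": 2, "B": 1, "C": 3, "D": 4}
print(discrepancy(explicit, implicit))
```

A score of 0 means the two rankings coincide; larger values mean options sit, on average, further apart in the two rankings.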
Overall, the result reinforces the idea that preferences are not well established in young children, with the corresponding consequences in terms of transitivity violations as well as discrepancies between pairwise choices and explicit rankings.

Figure 24: Discrepancies between explicit and implicit rankings (all subjects). (Axes: age group (K-1st, 2nd-3rd, 4th-5th, U) against discrepancy score per subject; series: Goods, Social, and Risk Choice tasks.)

Relationship between measures of consistency. Not surprisingly given our previous findings, our measures of consistency were all highly correlated. Indeed, a high number of transitivity violations (TV) was associated with a high number of inconsistencies between explicit rankings and choices (ICE) in the Goods domain (Pearson = 0.79), in the Social domain (Pearson = 0.69) and in the Risk domain (Pearson = 0.72) with all p-values < 0.0001. A high number of TV was also associated with a high discrepancy score between implicit and explicit ranking in the Goods domain (Pearson = 0.61), in the Social domain (Pearson = 0.54) and in the Risk domain (Pearson = 0.60), again with all p-values < 0.0001. Last, TV was also highly correlated with classification error score (ICR) in the Goods domain (Pearson = 0.95), in the Social domain (Pearson = 0.93) and in the Risk domain (Pearson = 0.93), all p-values < 0.0001. The results were very similar when we removed subjects who used simple policies.

Overall, although we feel that TV is the best measure of choice consistency, the main conclusion of SI-3.1 is that the results presented in the main text are robust both when we consider severity instead of number of violations and also when we use different inconsistency measures, such as ICE or ICR: intransitivity is invariably associated with the inability to make choices consistent with rankings, both implicit and explicit, and these have different developmental signatures across domains.

SI-3.2. Other analysis of choices

Indifference.
We checked whether age and choice domain had an influence on the tendency to be indifferent. The results are reported in Table 8. In the Goods domain, the number of indifferent choices decreased over time (unpaired t-test, p-values < 0.05 between K-1st and 4th-5th, between K-1st and U, and between 2nd-3rd and U). Participants were less often indifferent in the Social domain than in the Goods domain (paired t-test, p-values < 0.05 for age groups K-1st and U), and school-age children in the Social domain were more often indifferent than participants in the U age group (unpaired t-test, p-values < 0.05). Finally, in the Risk domain we observed no trend in reducing indifferences with age.

           Goods         Social        Risk
K-1st      3.81 (0.54)   2.04 (0.47)   1.53 (0.59)
2nd-3rd    3.09 (0.47)   2.47 (0.54)   1.31 (0.45)
4th-5th    2.19 (0.42)   2.50 (0.60)   1.94 (0.53)
U          1.86 (0.23)   1.12 (0.28)   1.49 (0.35)
(standard errors in parentheses)

Table 8: Average number of indifferences.

At the same time, we also found that triplets involving indifferent choices were less likely to result in TV in all domains for the whole sample (t-tests, p-values < 0.0001), as well as for each age group (p-values < 0.0001).

Reaction times and choices. Reaction times in trials involving violations were longer compared to reaction times in trials involving no violations. This was true in all age groups and in all domains (KS test, p-value = 0.015 for K-1st in the Risk domain, p-value = 0.010 for 2nd-3rd in the Risk domain and p-value < 0.0001 for all other age groups and domains), suggesting that participants who were more confused took longer to choose and ended up contradicting their choices. We also found that reaction times were longer when participants pressed the indifference button compared to when they made a left or right selection (KS test, p-value < 0.0001 for age groups above 2nd-3rd), consistent with the idea that choices were more difficult to make when options were close in value.
Last, simple policy users were faster than the other subjects (KS test, p-value < 0.0001), indicating that their choice process was simpler.

SI-3.3. Simple policy users

Of the 59 subjects who used simple policies in the Social domain and the 55 who used simple policies in the Risk domain, we found that only 19 used simple policies in both domains. We also compared TV in the Goods domain (where no simple policies were available) by subjects who did and did not use simple policies in the other domains and found no significant differences. Therefore, centration and the development of the value-based decision-making system appeared to be uncorrelated. However, we found that simple policy users were very distinguishable from the other participants in terms of discrepancies between implicit and explicit rankings (t-test, p-value < 0.0001 in both Risk and Social). This means that they were explicitly revealing preferences that were not supported by their choices, that is, by their implicit rankings.

SI-3.4. Evolution of preferences

Evolution of preferences in the Social domain. The most common of the simple policies employed by our participants was to maximize the reward for self, but its predominance changed over time. Indeed, the preference for giving seemed to be developing: small children tended to maximize their own reward systematically and, other things being equal, they also preferred smaller rewards for others. Older participants, however, selected larger rewards for others. We did not find any evidence that participants were maximizing the payoffs of both participants. However, we found that they came closer to that policy with age, with participants in the 4th-5th age group being similarly close to it as participants in the U group. The most and least favorite options, as revealed by implicit rankings, also changed over time.
Indeed, for all ages, (4,0) and (3,3) were the most popular, but the frequency of (4,0) decreased while the frequency of (3,3) increased with age. Similarly, (0,4) and (0,5) were the least popular but the frequency of (0,5) decreased while the frequency of (0,4) increased with age. The results are summarized in Table 9.

           Most favorite      Least favorite
           (4,0)    (3,3)     (0,4)    (0,5)
K-1st      0.65     0.33      0.43     0.77
2nd-3rd    0.40     0.58      0.31     0.78
4th-5th    0.50     0.50      0.50     0.72
U          0.33     0.73      0.80     0.30

Table 9: Most and least favorite options in Social domain derived from implicit rankings

Our Social-Choice task was also rich enough to permit a study of the evolution of other-regarding preferences. In particular, it is possible to study whether participants were prosocial (chose (3,3) over (3,1)), willing to share (chose (3,1) over (4,0)) or envious (chose (0,4) over (0,5)). In other words, we can conduct a similar analysis as in Fehr et al. (2008), and determine a type for each subject as a function of the decisions in these three pairs of choices. We defined the same five types as in Fehr et al. (2008): "strongly egalitarian" (choices (3,3); (3,1); (0,4)), "weakly egalitarian" (choices (3,3); (4,0); (0,4)), "strongly generous" (choices (3,3); (3,1); (0,5)), "weakly generous" (choices (3,3); (4,0); (0,5)), and "spiteful" (choices (3,1); (4,0); (0,4)). To make it comparable with the literature, we excluded subjects who expressed one or more indifferences. In Fig.25, we report the proportion of subjects in each of these five categories (as in Fehr et al. (2008), the remaining subjects are those who do not belong to any of them).

Figure 25: Evolution of other-regarding preferences. (Panels: Fehr et al. (2008), ages 3y-4y, 5y-6y, 7y-8y; this study, age groups K-1st, 2nd-3rd, 4th-5th, U. Vertical axis: 0 to 100. Categories: spiteful, weakly generous, strongly generous, weakly egalitarian, strongly egalitarian.)
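The type assignment above amounts to a lookup on the three diagnostic choices. The sketch below encodes the five types exactly as listed in the text; the function name, the use of `None` for indifference, and the label "unclassified" for the remaining subjects are choices made for this illustration.

```python
# Picks in the pairs (3,3) vs (3,1), (3,1) vs (4,0), and (0,4) vs (0,5).
FEHR_TYPES = {
    ((3, 3), (3, 1), (0, 4)): "strongly egalitarian",
    ((3, 3), (4, 0), (0, 4)): "weakly egalitarian",
    ((3, 3), (3, 1), (0, 5)): "strongly generous",
    ((3, 3), (4, 0), (0, 5)): "weakly generous",
    ((3, 1), (4, 0), (0, 4)): "spiteful",
}


def classify(prosocial_pick, sharing_pick, envy_pick):
    """Return the subject's type, "unclassified", or None if excluded.

    Each argument is the option picked in one pair, or None for indifference.
    """
    picks = (prosocial_pick, sharing_pick, envy_pick)
    if None in picks:
        return None  # subjects with any indifference are excluded
    return FEHR_TYPES.get(picks, "unclassified")


print(classify((3, 3), (4, 0), (0, 5)))
```

Three of the eight possible strict choice patterns match none of the five types, which is why a residual category is needed.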
Notice that the results are not directly comparable with those of the aforementioned paper: ages overlap, reward structures are not identical and options are also different (e.g., choosing between (1,1) and (1,2) captures a slightly different aspect of envy than choosing between (0,4) and (0,5)). However, if we compare 5y-6y with K-1st and 7y-8y with 2nd-3rd we notice that outcomes are similar across studies. Consistent with the centration hypothesis, many young children are spiteful. As they grow, they develop some integrative reasoning and become more egalitarian. Our oldest participants are predominantly generous.

Evolution of preferences in the Risk domain. The most common of the simple policies employed by our participants was to maximize reward but, as in the Social domain, its predominance changed over time. More than 20% of participants in the K-1st and 2nd-3rd age groups used it, compared with less than 10% in the 4th-5th and U age groups. We chose maximization of expected value, E(V), as a template of integrative reasoning, but strictly speaking no participant maximized E(V). Only two subjects were one step away from that integrative policy and they both had some TV. For each participant, we counted the number of choices that maximized E(V) and averaged this count across participants in each Choice task and each age group. We found that participants in the K-1st, 2nd-3rd and 4th-5th age groups were making significantly fewer choices consistent with the maximization of E(V) compared to participants in the U age group. In particular, after removing simple policy users, each group of children used policies that were farther away from E(V) compared to participants in the U age group (t-test, p-values < 0.005).

When looking at the favorite option revealed by implicit rankings, we found that children were transitioning gradually from the option involving the largest quantity (12; 12.5%) to the option exhibiting the largest expected value (5; 50%), as depicted in Table 10.
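The count of E(V)-maximizing choices can be illustrated with the two options named in the text, (12; 12.5%) with E(V) = 12 x 0.125 = 1.5 and (5; 50%) with E(V) = 5 x 0.5 = 2.5. The sketch below is an assumption-laden illustration: options are (amount, probability) tuples, indifferent choices and pairs with equal E(V) are not counted, and the function names are hypothetical.

```python
def expected_value(option):
    """E(V) of an option given as (amount, probability)."""
    amount, probability = option
    return amount * probability


def count_ev_maximizing(trials):
    """Number of trials in which the higher-E(V) option was picked.

    trials: list of (a, b, pick), where pick is a, b, or None for indifference.
    """
    count = 0
    for a, b, pick in trials:
        ev_a, ev_b = expected_value(a), expected_value(b)
        if pick is None or ev_a == ev_b:
            continue  # assumption: no credit without a strict, unique maximizer
        best = a if ev_a > ev_b else b
        if pick == best:
            count += 1
    return count


largest_amount = (12, 0.125)
largest_ev = (5, 0.5)
print(expected_value(largest_amount), expected_value(largest_ev))
```

A subject following "maximize the amount, irrespective of the probability" would pick (12; 12.5%) in this pair and so would not be credited with an E(V)-maximizing choice.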
Interestingly, option (12; 12.5%) was ranked first by many and, at the same time, last by others.

           (12,12.5%)   (5,50%)
K-1st      0.60         0.13
2nd-3rd    0.44         0.35
4th-5th    0.28         0.53
U          0.16         0.82

Table 10: Favorite option in the Risk domain derived from implicit rankings

The results taken together showed that behavior was changing from choices consistent with very simple policies to choices resulting from trade-offs and integrative thinking. The centration effect observed in young participants made them appear selfish and consistent in the Social domain and risk-loving and consistent in the Risk domain. These attitudes gradually changed with age.

Finally, one could argue that a main result of the paper, namely the idea that participants learn to know what they like most in the Social domain whereas they learn to know what they like least in the Risk domain, was simply due to the fact that there was a clear best option in Social and a clear worst option in Risk (despite our attempts to have reasonably balanced options). This simple explanation would be inconsistent with the fact that the most and least preferred options in Social and Risk change significantly with age. Instead, we claim the existence of different developmental signatures across domains regarding the subjects' learning about their own preferences.

SI-3.5. Attention trials

Remember that after all 21 paired comparisons in each Choice task we included an "attention trial", that is, a last choice between the subject's most- and least-favorite options, as revealed by their implicit ranking. In each attention trial, subjects received an inattention score of 1, 0.5, or 0 if they selected their least-favorite option, the indifference button, or their most-favorite option, respectively. We found that most children were attentive: 70% in K-1st, 76% in 2nd-3rd, 84% in 4th-5th and 100% in U answered all attention trials correctly. Most of those who failed got a total inattention score of 0.5, and no children failed all 3 trials.
We also found that performance on attention trials and TV was correlated in all three tasks: Goods-Choice task (Pearson = 0.54, Spearman = 0.43), Social-Choice task (Pearson = 0.40, Spearman = 0.20), and Risk-Choice task (Pearson = 0.36, Spearman = 0.32), all p-values < 0.01. This suggests a relationship between the ability to choose consistently and attention mechanisms. Along the same lines, discrepancies between explicit and implicit rankings were correlated in all domains with attention trials: Goods domain (Pearson = 0.38, Spearman = 0.34), Social domain (Pearson = 0.32, Spearman = 0.21) and Risk domain (Pearson = 0.27, Spearman = 0.30), all p-values < 0.01. These results imply that attentiveness as measured by attention trials was a predictor of intransitivity and it was also associated with the ability to choose according to explicit rankings.

SI-3.6. Transitive reasoning

We counted for each participant the number of mistakes accumulated in each level of difficulty of the transitive reasoning task. In all three levels, the K-1st group, 2nd-3rd group, and 4th-5th group accrued significantly more errors than the U group (p-value < 0.001, p-value < 0.05, and p-value < 0.05, respectively). Participants in the K-1st group made more mistakes on the most difficult reasoning trials than they did on the easy or medium trials (p-value = 0.02 and p-value = 0.0001, respectively). Within the other age groups the average error counts were similar across trial difficulty (with the exception of the 2nd-3rd group, which had more mistakes on easy than on difficult trials (p-value = 0.02)). Performance in the transitive reasoning task was correlated with attentiveness. This was true for the most difficult trials (Pearson = 0.20, p-value < 0.01) as well as for all levels of difficulty together (Pearson = 0.22, p-value < 0.01).
We also found that it was correlated with the level of discrepancies between implicit and explicit rankings in all domains: Goods domain (Pearson = 0.33, p-value < 0.0001), Social domain (Pearson = 0.20, p-value < 0.01) and Risk domain (Pearson = 0.17, p-value < 0.05). Overall, transitive reasoning was associated with the same explanatory variables as transitive decision-making.

SI-3.7. The determinants of transitive choices

Relationship between TV and demographic variables. Given that TV covaries across domains, we looked for possible common explanatory variables of intransitivity. To this end, we ran OLS regressions treating TV in each domain as the variable to be explained by a set of characteristics. Those included age group, gender, number of younger siblings, and number of older siblings. We found that the only significant explanatory variable for TV was age group. Moving from one age group to the next was associated with a decreased number of TV in the Goods-Choice task (p-value < 0.001) and in the Social-Choice task (p-value < 0.001) but not in the Risk-Choice task (p-value = 0.921). The results were unchanged when we removed subjects who used simple policies.

Relationship between TV and developmental variables. We ran OLS regressions on the full sample to assess the explanatory power of mistakes in transitive reasoning on transitivity violations in each domain. We found that mistakes in transitive reasoning ('TR mistakes') were not associated with TV in the Risk domain. They were correlated with TV in the Goods and Social domains, but significance levels dropped as we controlled for other explanatory variables such as age group ('Dummies K-1, 2-3, 4-5') and attentiveness ('Attention trials'). These, however, were highly significant, as was the tendency to use simple policies ('Simple policy'). The results are reported in Tables 11, 12 and 13.
                   Model 1   Model 2   Model 3   Model 4
TR mistakes        0.878     0.393     0.297     0.188
Dummy K-1                    3.049     2.497     1.376
Dummy 2-3                    1.383     0.824     0.221
Dummy 4-5                    0.146     -0.139    -0.275
Attention trials                       3.460     2.565
Discrepancies                                    2.397
Constant           1.359     0.693     0.708     -0.403
# obs              185       185       185       185
R^2                0.127     0.242     0.359     0.485
Significance levels: * = 0.05, ** = 0.01, *** = 0.001.

Table 11: OLS regression of transitivity violations in the Goods domain

Overall, TV was best predicted in all domains by the performance in attention trials and the ability to make choices consistent with explicit rankings (specifically, to have similar implicit and explicit rankings, as captured by 'Discrepancies'). It was further predicted by centration ('Simple policy') in the Social and Risk domains. Transitive reasoning, though correlated with TV, failed to predict any TV result after controlling for these other explanatory variables.

                   Model 1   Model 2   Model 3   Model 4   Model 5
TR mistakes        0.359     0.115     0.069     0.016     -0.018
Dummy K-1                    1.492     1.176     0.775     1.010
Dummy 2-3                    0.738     0.418     -0.034    0.023
Dummy 4-5                    0.215     0.048     0.031     0.122
Attention trials                       1.937     1.330     1.311
Discrepancies                                    1.654     1.250
Simple policy                                              -1.003
Constant           0.894     0.531     0.538     -0.213    0.331
# obs              184       184       184       184       184
R^2                0.045     0.103     0.187     0.364     0.379
Significance levels: * = 0.05, ** = 0.01, *** = 0.001.

Table 12: OLS regression of transitivity violations in the Social domain

                   Model 1   Model 2   Model 3   Model 4   Model 5
TR mistakes        0.216     0.265     0.179     0.027     0.006
Dummy K-1                    -0.358    -0.957    -0.947    0.234
Dummy 2-3                    0.069     -0.523    -0.149    0.657
Dummy 4-5                    0.765     0.457     0.598     1.007
Attention trials                       3.579     1.796     1.690
Discrepancies                                    2.237     1.559
Simple policy                                              -2.255
Constant           2.510     2.390     2.403     0.592     1.328
# obs              183       183       183       183       183
R^2                0.008     0.022     0.157     0.416     0.471
Significance levels: * = 0.05, ** = 0.01, *** = 0.001.
Table 13: OLS regression of transitivity violations in the Risk domain

3 Bundling Options in Value-Based Decision-Making: Attention, Calculation, and Working Memory ∗

Isabelle Brocas, University of Southern California and CEPR
Juan D. Carrillo, University of Southern California and CEPR
T. Dalton Combs, University of Southern California
Niree Kodaverdian, University of Southern California
John Monterosso, University of Southern California

Abstract

One of the core questions of Neuroeconomics is how humans value options. To date, most studies have focused on simple options and identified the ventromedial prefrontal cortex (vmPFC) as the primary value tracking region. In this study, we ask participants to make pairwise comparisons involving options of varying complexity: single items, bundles made of the same two single items, or bundles made of two different single items. We found that novel patterns of activation were uniquely associated with choices involving bundles made of different items. In those choices, we found that the Default Mode Network was deactivated, brain regions associated with calculation (intraparietal sulcus) were engaged, and there was increased connectivity between value tracking regions and the region responsible for working memory (dorsolateral prefrontal cortex). Taken together, these results indicated that complex option valuation was supported by networks involved in attention, computation, and working memory.

∗ We are grateful to members of the Los Angeles Behavioral Economics Laboratory (LABEL) for their insights and comments in the various phases of the project. All remaining errors are ours. The study was conducted with the University of Southern California IRB approval number UP-13-00235. Address for correspondence: Isabelle Brocas, Department of Economics, University of Southern California, 3620 S. Vermont Ave., Los Angeles, CA 90089, USA, <brocas@usc.edu>.
3.1 Introduction

Evidence from many lesion and fMRI studies converges in identifying the medial orbitofrontal cortex (mOFC), or sometimes more narrowly, the ventromedial prefrontal cortex (vmPFC) as a critical region in valuation when deciding between alternatives (Rangel, Camerer, and Montague (2008); Henri-Bhargava, Simioni, and Fellows (2012); Fellows (2011); Fellows and Farah (2007)) or how much to pay for a good or item (Chib, Rangel, Shimojo, and O'Doherty (2009); Hare, O'Doherty, Camerer, Schultz, and Rangel (2008); Plassmann, O'Doherty, and Rangel (2007)). This finding has been consistently reported in studies involving food items, trinkets and money (see Clithero and Rangel (2013) for a meta-analysis). Most studies however have focused on choices involving single items, as opposed to complex bundles of multiple items. Among the few studies involving bundles, the vmPFC has been associated with the ability to make consistent choices between bundles (Camille, Griffiths, Vo, Fellows, and Kable (2011)) and the mOFC has been shown to reflect the difference in subjective value between monetary options and bundled options (FitzGerald, Seymour, and Dolan (2009)). Other forms of complex options have been studied as well, in the form of multi-attribute options. Here again, activity in the vmPFC reflected the value of the combined items (Kahnt, Heinzle, Park, and Haynes (2011)).

It seems intuitive that complex options are difficult to evaluate. Hence, the ability to make more complex value comparisons is likely to also involve the working memory system, responsible for the short-term mental maintenance and manipulation of information. In the case of value-based decision-making, working memory has been reported to be associated with consistent choices involving complex bundles in older adults (Brocas et al. (2016)).
It is also known that during tasks that tax executive function, activation is evoked in the dorsolateral prefrontal cortex (dlPFC) and the posterior parietal cortex (PPC), as demonstrated by many studies (Goldberg, Berman, Fleming, Ostrem, Van Horn, Esposito, Mattay, Gold, and Weinberger (1998); Osherson, Perani, Cappa, Schnur, Grassi, and Fazio (1998); Goel, Gold, Kapur, and Houle (1997); Prabhakaran, Smith, Desmond, Glover, and Gabrieli (1997); Prabhakaran, Narayanan, Zhao, and Gabrieli (2000); Baker, Frith, Frackowiak, and Dolan (1996); Berman (1995); Nichelli, Grafman, Pietrini, Alway, Carton, and Miletich (1994); Petrides (1994)). Activation studies have shown that dorsal frontal regions are activated during tasks that are experienced as difficult (Braver, Cohen, Nystrom, Jonides, Smith, and Noll (1997); Cohen, Forman, Braver, Casey, Servan-Schreiber, and Noll (1994); Cohen, Perlstein, Braver, Nystrom, Noll, Jonides, and Smith (1997); Monterosso, Ainslie, Xu, Cordova, Domier, and London (2007); Luo, Ainslie, Pollini, Giragosian, and Monterosso (2011)) and during task switching (Dove, Pollmann, Schubert, Wiggins, and Yves Von Cramon (2000)), and that the dlPFC is differentially recruited as tasks become more complex (Carlson, Martinkauppi, Rämä, Salli, Korvenoja, and Aronen (1998); Braver et al. (1997); Cohen et al. (1997); Baker et al. (1996); Demb, Desmond, Wagner, Vaidya, Glover, and Gabrieli (1995); Christoff, Prabhakaran, Dorfman, Zhao, Kroger, Holyoak, and Gabrieli (2001)). This relationship extends to tasks requiring the explicit representation and manipulation of knowledge, where the ability to reason relationally is essential (Kroger, Saab, Fales, Bookheimer, Cohen, and Holyoak (2002)).

The role of dlPFC in value-based decision making has not been clearly established. It is sometimes reported to be activated and, if so, its involvement is interpreted in the context of the question of interest.
For instance, the dlPFC has been found to encode the variability of multi-attribute objects (Kahnt et al. (2011)) and to be more activated when trade-offs between attributes are required (McFadden, Lusk, Crespi, Cherry, Martin, Aupperle, and Bruce (2015)). In food choices, the dlPFC has been reported to modulate value (Camus, Halelamien, Plassmann, Shimojo, O'Doherty, Camerer, and Rangel (2009); Hare, Malmaud, and Rangel (2011); Sokol-Hessner, Hutcherson, Hare, and Rangel (2012)) and craving (Fregni, Orsati, Pedrosa, Fecteau, Tome, Nitsche, Mecca, Macedo, Pascual-Leone, and Boggio (2008)), and to be involved in self-regulation and self-control (Hutcherson, Plassmann, Gross, and Rangel (2012); Harris, Hare, and Rangel (2013)). The dlPFC has also been found to be functionally connected with the value coding regions in self-control paradigms (Hare, Rangel, and Camerer (2009)) and in multi-attribute paradigms (Rudorf and Hare (2014)). Last, the dlPFC was found to be significantly more activated in studies in which options involved a conflict to be resolved (Baumgartner, Knoch, Hotz, Eisenegger, and Fehr (2011); de Wit, Corlett, Aitken, Dickinson, and Fletcher (2009)). The dlPFC is, however, not usually reported to be active in studies involving single uni-attribute items. Taken together, these findings indicate that the potential role of dlPFC is to support value calculation (perhaps in various ways) when options are complex.

Here we report the results of an fMRI study in which participants were asked to choose between real food options involving single item options and bundled options. Bundles varied in complexity and were either composed of the same two single items or of two different single items. Our design allows us to address the following questions: (1) Is there a common value tracking region when options are simple and complex? (2) What are the neural mechanisms used to compute value as a function of complexity?
(3) Does complexity involve brain networks implicated in attention and working memory?

3.2 Materials and methods

3.2.1 Subjects

Twenty-six healthy adults (9 male, 17 female, average age of 21.9 years, all right-handed) were recruited from the Los Angeles Behavioral Economics Laboratory's subject pool at the University of Southern California. Subjects could participate if they satisfied the standard eligibility criteria for fMRI studies. We also excluded subjects who reported having food allergies or food restrictions, or being picky eaters. All participants received a $50 show-up fee for participating. They were also rewarded with one of their choices, selected randomly at the end of the session. One participant was excluded because of incomplete data collection. The Institutional Review Board of USC approved the study.

3.2.2 Procedures

Participants were instructed not to eat for at least 4 hours before the experimental session. They were also instructed that they would have to stay after the session to consume what they had obtained and that they could not take any of the food items with them when they left. This was implemented to make sure participants were hungry and thinking carefully about their choices during the session. The procedure was explained beforehand so that each participant knew that choices were real and that they should make their best decision in every trial.

There were three tasks. In the pre-fMRI task, each participant was asked to rank 30 single item (CONTROL) options by order of preference. This ranking was used to create 40 bundles: 20 combinations of 2 same single items (SCALING options) and 20 combinations of 2 different single items (BUNDLING options). The participant was then asked to include those options in their previous ranking. We then selected 11 CONTROL options, 10 SCALING options and 10 BUNDLING options to include in the following tasks.
These items were selected so that each set of options (CONTROL, SCALING, and BUNDLING) had the same distribution of option value; there were some low-value, medium-value, and high-value options in each set. This enabled us to make each task regressor orthogonal to the value regressor.

In the fMRI task, each participant made binary choices in the scanner. The median option of the 11 CONTROL options was a reference option, denoted hereafter by REF. Trials were divided into three conditions (Figure 26): CONTROL, SCALING and BUNDLING. In each of the CONTROL trials, the participant had to choose between REF and one of the 10 remaining single items. In each of the SCALING trials, the participant had to choose between REF and one of the 10 combinations of 2 same single items. In each of the BUNDLING trials, the participant had to choose between REF and one of the 10 combinations of 2 different single items. In all cases, REF was off-screen; it was the same for each trial and was shown to the participant at the beginning of the experiment. The other option was on-screen and was displayed at the beginning of each trial. Each individual trial was repeated 9 times for a total of 90 trials in each condition. The circles at the bottom of the screen told the participant which button selected which option, the solid circle always representing REF. The button mappings were randomly assigned for each trial (footnote 2). When the participant responded, the circle representing the chosen option was framed in a square to let the participant know that their answer was recorded. The screen then advanced to a fixation cross for the remainder of the trial. Trial order and inter-stimulus intervals were optimized by optseq2 for task regressor estimation efficiency (Dale (1999)) and organized into 5 runs. We also chose the options in order to ensure that each of the three conditions CONTROL, SCALING and BUNDLING had symmetrical sets of low, medium, and high on-screen value options centered around REF.
We also made sure that the distribution of value was similar across conditions. This was done to separate value-specific activity from task-specific activity.

Figure 26: Three types of trials. The first is a single item, the second is two portions of the same item, the third is one portion each of two items.

As will be explained below, we estimated the value of each option from the actual choices of each participant. The fMRI task, however, did not allow us to extract information regarding the relative value between any 2 on-screen options because these were never presented together for a choice. To improve the estimation of value, we designed a third task implemented after the scanning session. In this post-fMRI task, participants were shown two options, one on the left and one on the right. When the participant responded by tapping the preferred option, the screen advanced to a fixation cross for 0.25 seconds, and then to the next trial. The number of trials ranged from 300 to 500 and differed across subjects. The options were selected to improve the value estimates of each option in the fMRI task. The procedures of the last two tasks are represented in Figure 27.

Footnote 2: Subjects had button boxes in each hand when they were in the scanner. They were instructed to make choices by pressing a button in the hand corresponding to the option they wanted, as represented by the circle. For example, if they wanted the reference option and the solid circle was on the right side of the screen in that trial, they could select it by pressing a button in their right hand. If they wanted the on-screen option instead, they could select it by pressing a button corresponding to the hollow circle, which in that case would be a button in their left hand.

Figure 27: Experimental Design. In the fMRI task, each participant had to choose between an off-screen reference (REF) option and an on-screen option. In the post-fMRI task, each participant had to choose between two on-screen options.
All trials were self-paced.

3.2.3 MRI data acquisition

Neuroimaging data were collected using the 3T Siemens MAGNETOM Tim/Trio scanner at the Dana and David Dornsife Cognitive Neuroscience Imaging Center at USC with a 32-channel head coil. Participants lay supine on a scanner bed, viewing stimuli through a mirror mounted on the head coil. Blood oxygen level-dependent (BOLD) responses were measured by an echo planar imaging (EPI) sequence with PACE (prospective acquisition correction) (TR = 2 s; TE = 25 ms; flip angle = 90°; resolution = 3 mm isotropic; 64 × 64 matrix in FOV = 192 mm). A total of 41 axial slices, each 3 mm in thickness, were acquired in an ascending interleaved fashion with no interslice gap to cover the whole brain. The slices were tilted on a subject-by-subject basis (typically 30° from the AC-PC plane) to minimize signal dropout in the orbitofrontal cortex (Deichmann, Gottfried, Hutton, and Turner (2003)). Anatomical images were collected using a T1-weighted three-dimensional magnetization prepared rapid gradient echo sequence (MP-RAGE with TI = 900 ms; TR = 1.95 s; TE = 2.26 ms; flip angle = 9°; resolution = 1 mm isotropic; 256 × 256 matrix in FOV = 256 mm), primarily for localization and normalization of functional data. These scans were co-registered with the participant's mean EPI images. These images were averaged together to permit anatomical localization of the functional activations at the group level.

3.2.4 MRI data preprocessing

Image analysis was performed using FSL (Jenkinson, Beckmann, Behrens, Woolrich, and Smith (2012)) algorithms organized in a nipype pipeline (Gorgolewski, Burns, Madison, Clark, Halchenko, Waskom, and Ghosh (2011)). The structural images were skull-stripped and then aligned to the standard Montreal Neurological Institute (MNI) EPI template using non-linear warping with FSL-FNIRT (Andersson, Jenkinson, and Smith (2007)). The functional images were corrected for motion and slice timing.
They were spatially smoothed using a Gaussian kernel with a full width at half-maximum of 5 mm. We also applied high-pass temporal filtering using a filter width of 120 s.

3.2.5 Behavioral analysis

In both the fMRI and the post-fMRI tasks, participants were asked to make a series of choices between two snack options. In the fMRI task, the first option was always the same off-screen reference option, denoted by REF, while the second option was an on-screen variable option $VAR_j$ ($j \in \{1, \ldots, N\}$). We constructed a Random Utility Model (McFadden (1973); Train (2003); Clithero and Rangel (2013)) in which we assumed that the utility derived from option $VAR_j$ depends on the value $q_j$ of the food snack and a stochastic unobserved error component $\varepsilon_j$. Formally, $u(REF) = q_0 + \varepsilon_0$ and $u(VAR_j) = q_j + \varepsilon_j$. The probability of choosing option $VAR_j$ is therefore $P_j(q) = \Pr[\varepsilon_0 - \varepsilon_j < q_j - q_0]$. Assuming that the error terms are independent, identically distributed, and follow an extreme value distribution with cumulative distribution function $F(\varepsilon_k) = \exp(-e^{-\varepsilon_k})$ for all $k = 0, j$, the probability that the participant chooses option $VAR_j$ is the logistic function

$$P_j(q) = \frac{1}{1 + e^{-(q_j - q_0)}}$$

The same procedure was applied to trials in the post-fMRI task, where both options were now varying. In that task, the probability that the participant chooses option $VAR_j$ over $VAR_i$ is the logistic function

$$P_{ji}(q) = \frac{1}{1 + e^{-(q_j - q_i)}}$$

We then constructed a likelihood function and used Maximum Likelihood Estimation techniques to retrieve the parameters $q_j$ given the observed choices. This procedure was implemented in Matlab with standard algorithms. For each individual, we also assigned an implicit ranking of all options based on these retrieved values.

3.2.6 Analysis of reaction times

We recorded reaction times between the onset of the stimulus and the time at which a choice was made in each trial. We analyzed individual and group differences across conditions as well as trial-specific effects.
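To make the estimation step of Section 3.2.5 concrete, the following is a minimal sketch of how values could be recovered from pairwise choices under the logistic choice rule. It is not the Matlab code used in the study: the function names, the simulated data, the gradient-ascent optimizer, the step size, and the normalization that pins the value of option 0 (standing in for REF) at zero are all assumptions made for illustration.

```python
import numpy as np

def simulate_choices(q_true, n_trials, rng):
    """Draw random pairs of distinct options and logit choices (hypothetical data)."""
    n = len(q_true)
    first = rng.integers(0, n, size=n_trials)
    # Offset by 1..n-1 so the second option always differs from the first.
    second = (first + rng.integers(1, n, size=n_trials)) % n
    p_first = 1.0 / (1.0 + np.exp(-(q_true[first] - q_true[second])))
    chose_first = (rng.random(n_trials) < p_first).astype(float)
    return np.column_stack([first, second]), chose_first

def fit_values(pairs, chose_first, n_options, lr=1.0, n_iter=2000):
    """Maximize the logit log-likelihood by gradient ascent, with q[0] fixed at 0."""
    q = np.zeros(n_options)
    for _ in range(n_iter):
        diff = q[pairs[:, 0]] - q[pairs[:, 1]]
        p_first = 1.0 / (1.0 + np.exp(-diff))
        resid = chose_first - p_first  # derivative of the log-likelihood w.r.t. diff
        grad = (np.bincount(pairs[:, 0], weights=resid, minlength=n_options)
                - np.bincount(pairs[:, 1], weights=resid, minlength=n_options))
        grad[0] = 0.0  # the reference value is not estimated (assumed normalization)
        q += lr * grad / len(pairs)
    return q

rng = np.random.default_rng(0)
q_true = np.array([0.0, 1.0, -1.0, 0.5, 2.0])  # made-up values; option 0 plays the role of REF
pairs, chose_first = simulate_choices(q_true, 6000, rng)
q_hat = fit_values(pairs, chose_first, len(q_true))
implicit_ranking = np.argsort(-q_hat)  # options ordered from most to least valued
```

Only value differences are identified by the model, which is why one value has to be fixed. The implicit ranking is simply the ordering of the estimated values.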
In particular, we looked at whether trials deemed to be more difficult, as measured by a smaller distance between the value of the on-screen and off-screen options, also took longer.

3.2.7 MRI data analysis

We estimated several general linear models (GLMs) of BOLD responses. Each aspect of the task was encoded in a regressor for the GLM. To identify what signal was associated with a particular condition, we constructed indicator regressors that take value 1 whenever the participant is performing a trial within a condition and 0 otherwise. To identify the neural activity associated with the subjective value of the on-screen option, we created a parametric regressor that is equal to the value of the on-screen option and changes every time the on-screen option changes. When there is nothing on the screen, both regressors are 0. The models also include motion parameters (regressors for translation and rotation as well as artifact regressors controlling for quick jerking movements) and regressors for each run as nuisance regressors. All regressors of interest were convolved with the canonical form of the hemodynamic response. The values in the regressors were applied from the onset of the stimulus until a choice was made (average duration, 1.47 s). All of our GLMs took the general form:

$$Y_i = [H_1(R_a)]\,\beta^a_i + R_b\,\beta^b_i + e_i$$

where $Y_i$ is the time-series of BOLD signal at each voxel $i$, $H_1$ is the hemodynamic response function (HRF) used by FSL (Smith, Jenkinson, Woolrich, Beckmann, Behrens, Johansen-Berg, Bannister, De Luca, Drobnjak, Flitney, Niazy, Saunders, Vickers, Zhang, De Stefano, Brady, and Matthews (2004); Jenkinson et al. (2012)) applied to the primary regressor matrix $R_a$ (each column is a primary regressor), $R_b$ is the matrix of nuisance regressors, and $e_i$ is Gaussian noise. The GLM solves for $\beta^a_i$ and $\beta^b_i$ to minimize the error $e_i$. To analyze the influence of an indicator regressor, the coefficients $\beta^a_i$ are contrasted against each other (CONTROL vs. SCALING for instance).
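The structure of this GLM can be illustrated for a single voxel with a short numerical sketch. Everything in it is an assumption made for illustration rather than the FSL pipeline used in the study: the double-gamma response function stands in for FSL's HRF, events are reduced to one-scan impulses instead of lasting until the choice, and the onsets, values, and coefficients are simulated.

```python
import math
import numpy as np

def double_gamma_hrf(tr, duration=32.0):
    """Illustrative double-gamma HRF sampled every `tr` seconds (not FSL's exact kernel)."""
    t = np.arange(0.0, duration, tr)
    peak = t**5 * np.exp(-t) / math.gamma(6)
    undershoot = t**15 * np.exp(-t) / math.gamma(16)
    h = peak - undershoot / 6.0
    return h / h.sum()

def convolve_columns(R, hrf):
    """Convolve each regressor column with the HRF and trim to the run length."""
    n = R.shape[0]
    return np.column_stack([np.convolve(R[:, k], hrf)[:n] for k in range(R.shape[1])])

def fit_glm(y, R_a, R_b, hrf):
    """Least-squares fit of Y = [H(R_a)] beta_a + R_b beta_b + e for one voxel."""
    X = np.column_stack([convolve_columns(R_a, hrf), R_b])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    k = R_a.shape[1]
    return beta[:k], beta[k:]

rng = np.random.default_rng(1)
n_scans, tr = 600, 2.0
hrf = double_gamma_hrf(tr)

# Hypothetical event schedule: 150 trials, each in one of three conditions.
onsets = rng.choice(n_scans - 20, size=150, replace=False)
condition = rng.integers(0, 3, size=150)   # 0 = CONTROL, 1 = SCALING, 2 = BUNDLING
value = rng.uniform(-1.0, 1.0, size=150)   # made-up on-screen option values

R_a = np.zeros((n_scans, 4))
R_a[onsets, condition] = 1.0               # three indicator regressors
R_a[onsets, 3] = value                     # parametric value regressor

# Nuisance regressors: intercept and linear drift (stand-ins for motion and run terms).
R_b = np.column_stack([np.ones(n_scans), np.linspace(-1.0, 1.0, n_scans)])

beta_a_true = np.array([1.0, 1.2, 1.5, 0.8])
beta_b_true = np.array([10.0, 0.5])
y = (convolve_columns(R_a, hrf) @ beta_a_true + R_b @ beta_b_true
     + rng.normal(0.0, 0.05, n_scans))

beta_a_hat, beta_b_hat = fit_glm(y, R_a, R_b, hrf)
contrast = np.array([-1.0, 0.0, 1.0, 0.0])  # BUNDLING minus CONTROL
bundling_minus_control = contrast @ beta_a_hat
```

A contrast is a weighted sum of the estimated coefficients, so comparing two conditions amounts to differencing their indicator coefficients, as in the last line.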
These $\beta$-contrasts are used to generate interpretable statistics. We considered the following models:

Model (1) was a GLM used to identify the regions involved in value computation at the time of decision. To do this, we searched for areas in which the BOLD responses were correlated with value. This GLM consists of the following 4 regressors of interest: $R_1$ to $R_3$ were indicator regressors denoting CONTROL trials, SCALING trials and BUNDLING trials, while $R_4$ was a parametric regressor tracking the value of the on-screen option.

Model (2) was a GLM used to conduct a conjunction analysis (Price and Friston (1997); Friston, Penny, and Glaser (2005)) to assess differences in value-tracking across conditions. This model consists of 3 parametric regressors, one to track the value of the on-screen option in each condition, and 3 indicator regressors for CONTROL, SCALING and BUNDLING trials.

Model (3) was a generalized psychophysiological interaction (gPPI) model (McLaren, Ries, Xu, and Johnson (2012); De Martino, Fleming, Garrett, and Dolan (2013)) designed to test differences in condition-dependent functional connectivity. We defined a region of interest (ROI) which we used as the seed for our analysis. The gPPI model searched for how and when other regions connected to that seed region during a specific condition, but not in any other condition. It has the general form:

$$Y_s = H_2(x_s)$$

$Y_s$ is the mean BOLD activity in a region of interest (ROI). If we perform a deconvolution of $Y_s$ with $H_2$, AFNI's hemodynamic response function (Cox (1996); Cox and Hyde (1997)), we get $x_s$, the implied neural activity in the seed region. Then the BOLD signal $Y_i$ in each voxel $i$ is modeled as the linear combination of the seed BOLD activity, the convolved indicator regressors, the psychophysiological regressors, and the nuisance regressors.
$$Y_i = \underbrace{Y_s\,\beta^s_i}_{\text{Mean BOLD signal in seed region}} + \underbrace{[H_2(R_a)]\,\beta^a_i}_{\text{Psychological contribution}} + \underbrace{[H_2(R_a \circ x_s)]\,\beta^c_i}_{\text{Psychophysiological contribution}} + \underbrace{R_b\,\beta^b_i}_{\text{Nuisance contribution}} + \underbrace{e_i}_{\text{Error}}$$

The physiological regressor ($Y_s$) absorbs all of the signal associated with constant functional connection with the seed ROI. The "psychological" regressors are the same as the task indicator regressors from Model (1). They absorb all signal associated with the tasks. Each "psychophysiological" regressor is an element-wise product of a "psychological" regressor (a column of $R_a$) and the vector of neural activity $x_s$. These regressors detect which brain activity correlates with the seed region brain activity during a specific task, but not at other times. We call this the task-dependent functional connectivity. Our seed region was a 6 mm diameter sphere centered on MNI coordinates [-6, 51, -6]. This was the point of peak vmPFC activity for the value regressor in an fMRI task analogous to ours (Kahnt et al. (2011)). We preferred to use this independent seed rather than our peak activity value region to avoid inference problems due to circular analysis (Kriegeskorte, Simmons, Bellgowan, and Baker (2009)).

3.2.8 Postprocessing

After each GLM was fit to the image time-series, the $\beta$-contrasts were combined at the subject level using a Fixed Effects Model, then combined in a Mixed Effects Model to create group level voxel-wise t-statistics. These t-statistics were corrected for false discovery rate (FDR) using threshold-free cluster enhancement (Smith, Fox, Miller, Glahn, Fox, Mackay, Filippini, Watkins, Toro, Laird, and Beckmann (2009)). Unless otherwise stated, all images are thresholded at p < 0.02 FDR corrected.

3.3 Results

3.3.1 Behavioral results

We counted very few missed trials resulting in no choice (1.48% of the trials), indicating that participants were attentive and had enough time to select their preferred option.

Estimated values and implicit ranking.
We estimated the value of each option for each individual. The values were the best estimates given the observed choices. In principle, if a subject's choices are well represented by the Random Utility Model, we should observe that most choices are consistent with the estimates. For each individual, we generated the choices they should have made in all trials if they were choosing according to the value estimates and we compared them with their actual choices. More precisely, we counted all choices that were not consistent with the value estimates (and hence with the implicit ranking) and we computed the percentage of these inconsistent choices. We found that 87% of the choices of all subjects were consistent with implicit rankings in the fMRI task (88% in the CONTROL condition, 88% in the SCALING condition and 87% in the BUNDLING condition). Also, 87% of choices were consistent in the post-fMRI task. These results indicate that value estimates were good proxies of values during choices in the scanner.

Discrepancies between implicit and explicit ranking. Recall that the snacks that we included in the fMRI and post-fMRI sessions were selected in such a way that the distributions of expected choices between REF and VAR in the three conditions would be balanced and comparable. This selection, however, was made based on the explicit rankings obtained in the pre-fMRI period and there was no guarantee that these rankings would line up with implicit rankings. We found that discrepancies generally existed between the two rankings, suggesting that explicit rankings were not the best predictors of choices. However, overall, the two rankings were significantly correlated (Spearman = 0.56 (p-value < 0.0001) in CONTROL, 0.67 (p-value < 0.0001) in SCALING and 0.60 (p-value < 0.0001) in BUNDLING), suggesting that the explicit ranking was still a good predictor of choices in the scanner.

Values.
We found that 7 (respectively 6) participants out of 25 had linear preferences in the SCALING condition (respectively BUNDLING). All others valued the combination of items more than the sum of items, suggesting that value was super-additive. By regressing the value of each bundle on the values of the individual items, we found that the constant term was positive and significant (p < 0.05) for most participants with super-additive preferences, suggesting that bundled items were positively valued independently of their content. The value of a bundle in the SCALING condition was predicted by the value of the single item (p < 0.05). In the BUNDLING condition, it was often predicted by the value of only one of the single items, most of the time by the highest valued item.

3.3.2 Reaction times

We found that it took longer on average to make decisions in the BUNDLING condition (mean = 1.52 s) compared to CONTROL (mean = 1.43 s) and SCALING (mean = 1.37 s). A t-test and a KS-test confirmed that reaction times were significantly longer in BUNDLING compared to CONTROL and SCALING (in both cases KS-test, p < 0.001; t-test, p < 0.0001). SCALING trials were taking slightly more time than CONTROL trials only according to the KS-test (p < 0.007). We also found that choosing the on-screen option took less time than choosing the off-screen option in all conditions (in each condition KS-test, p < 0.001; t-test, p < 0.0001) (footnote 3). To assess any specific effect of value on reaction times, we computed the absolute value of the difference between the rank of the off-screen and on-screen options and we correlated it with reaction times. We found that reaction times were short in trials where the on-screen and the off-screen options were ranked very differently and long in trials where they were close in rank (Pearson = -0.15 in CONTROL, Pearson = -0.20 in SCALING and Pearson = -0.25 in BUNDLING, p-value < 0.0001).
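As an illustration of this difficulty measure, the sketch below computes the Pearson correlation between the rank distance of the two options and the reaction time. The data are simulated and every number in it (reference rank, slope, noise level, number of trials) is a made-up assumption, so only the mechanics of the calculation carry over to the study.

```python
import numpy as np

def rank_distance_rt_correlation(onscreen_rank, ref_rank, reaction_time):
    """Pearson correlation between |rank(on-screen) - rank(REF)| and reaction time."""
    rank_distance = np.abs(np.asarray(onscreen_rank) - ref_rank)
    return np.corrcoef(rank_distance, reaction_time)[0, 1]

rng = np.random.default_rng(2)
n_trials = 900
ref_rank = 6  # hypothetical: REF is the median of 11 ranked options
candidates = np.array([r for r in range(1, 12) if r != ref_rank])
onscreen_rank = rng.choice(candidates, size=n_trials)

# Simulated reaction times that shrink as the options get further apart in rank.
reaction_time = (1.6 - 0.04 * np.abs(onscreen_rank - ref_rank)
                 + rng.normal(0.0, 0.3, n_trials))

pearson_r = rank_distance_rt_correlation(onscreen_rank, ref_rank, reaction_time)
```

A negative coefficient corresponds to the pattern reported above: trials with options close in rank take longer.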
This effect was also found at the individual level, although it was sometimes less significant, in particular in the CONTROL and SCALING conditions. In the BUNDLING condition, subjects spent significantly more time when options were close.

Footnote 3: Choices made with the right hand were significantly faster only in CONTROL trials.

3.3.3 Regions correlating with subjective value

We had three potential value regressors: one based on the estimated value of each option, one based on the implicit rank of each option and one based on the explicit rank of each option. Because estimated value and implicit rank contain the same information, both are equally good candidates for the exercise. Because the explicit rank had an inferior fit with the actual choices, it was dominated by the other two. We chose to use the implicit ranking, and we will report below results obtained under the alternative specifications.

Region         Side   P-value   MNI-X   MNI-Y   MNI-Z
Precuneus             0.002        2     -32      40
                      0.002       -4     -36      40
                      0.002        0     -38      34
mOFC                  0.004       -8      10     -14
vmPFC                 0.004       -8      42     -10
                      0.005        8      42     -12
                      0.005       12      42       4
N. accumbens   L      0.004      -10      10      -2
               L      0.004      -10       6      -4
               R      0.005        6      10      -2
SFG            L      0.005      -20      10      48
dlPFC          L      0.005      -20      58       6
               L      0.005      -36      38      14
               L      0.007      -36       4      32
               L      0.007      -42      38       8
Amygdala       L      0.005      -34     -14     -14

Table 14: Local minima in corrected p-value for the parametric value regressor. Thresholded for p ≤ 0.005 corrected. Minima in occipital cortex or white matter excluded. Where a cluster spanned multiple functional regions, a regional mask was used to locate the peak voxel within a region. mOFC: medial OrbitoFrontal Cortex; vmPFC: ventroMedial PreFrontal Cortex; SFG: Superior Frontal Gyrus; dlPFC: DorsoLateral PreFrontal Cortex.

We identified value regions by estimating Model (1) based on the parametric regressor tracking the value of the on-screen option.
We found significant activity in the mOFC, the vmPFC, the Precuneus, the Nucleus Accumbens (NA) and the left dlPFC that correlated with the estimated value of the on-screen option, which confirmed earlier findings. We then estimated Model (2) and conducted a conjunction analysis to see whether the three conditions activated the same regions. We found that only the CONTROL value regressor showed significant activity in the vmPFC and only the BUNDLING value regressor showed activity in many regions implicated in value tracking, including dlPFC, SFG, fusiform, precuneus, mOFC, ACC, PCC (see Tables 16 and 17). The SCALING value regressor showed no significant activation (min p-value = 0.32). This result was unexpected. However, it strongly suggested that different calculations are performed under complex value-based decision-making. The activation patterns in relation with value are presented in Figure 28.

Region   Side   P-value   MNI-X   MNI-Y   MNI-Z
vmPFC           0.003        2      54     -10
                0.003       -4      44     -12
                0.003        8      40     -16
                0.004        2      46     -16
                0.004       -8      34     -16
                0.004        2      50     -18
MTG      L      0.005      -62     -22     -18

Table 15: Local minima in corrected p-value for the parametric value regressor in CONTROL trials. Thresholded for p ≤ 0.005 corrected. Minima in occipital cortex or white matter excluded. Where a cluster spanned multiple functional regions, a regional mask was used to locate the peak voxel within a region. vmPFC: ventroMedial PreFrontal Cortex; MTG: MidTemporal Gyrus.

3.3.4 Regions involved in complex conditions (SCALING and BUNDLING)

We analyzed the effect of complexity by contrasting CONTROL trials with SCALING trials and BUNDLING trials separately. We found that the Default Mode Network (precuneus, TPJ) was deactivated during complex trials (see Smith et al. (2009) for a canonical map). This suggested that more attentiveness was required during those two conditions (Figure 29).

3.3.5 Regions involved in complex calculations (BUNDLING vs.
SCALING)

We analyzed the condition-specific effect of BUNDLING by contrasting these trials with SCALING trials. We found that the left and right intraparietal sulcus (IPS) was significantly activated, as shown in Figure 31. Because this image is the result of a contrast between SCALING and BUNDLING, the activity cannot simply be caused by the tracking of two items on the screen, as both conditions have the same number of items on the screen. This region is usually associated with calculation, arithmetic and numbers, suggesting that BUNDLING required a computation ability. This effect was task specific and not sensitive to value.

Unexpectedly, we also found significant activity in the fusiform gyrus (peak p-value ≤ 0.005; see Table 18). Similar activity has been reported in calculation tasks, and is attributed to letter-form recognition. Our fusiform activity was clearly not the result of letter-form recognition, although it might be caused by the fact that more items require independent recognition in the BUNDLING task compared to the SCALING task.

Region      Side   P-value   MNI-X   MNI-Y   MNI-Z
dlPFC       L      0.001      -36       8      42
            L      0.001      -42      10      38
            L      0.002      -48       4      44
            L      0.002      -26      -6      44
            L      0.002      -32      -4      42
            L      0.002      -32       8      40
            L      0.004      -42      38       2
            L      0.006       44      34      12
            L      0.006       38      36       6
            L      0.009       40      26      12
            R      0.006       56       0      28
            R      0.008       64       2      26
            R      0.008       44      -2      26
SFG         L      0.003      -18      10      52
            L      0.003      -14      24      44
            L      0.003      -18      18      44
            L      0.003      -18      30      22
            L      0.003      -16      42      20
            L      0.003      -20      36      18
Fusiform    L      0.003      -30     -44     -24
            L      0.003      -34     -44     -26
            L      0.004      -26     -54     -14
            L      0.004      -30     -64     -20
            L      0.004      -26     -48     -22
Precuneus          0.003       -8     -56      62
                   0.004       -8     -50      56
IPS         L      0.003      -22     -86      40
            L      0.004      -14     -84      36
            L      0.005      -18     -56      60
            R      0.004       16     -60      52
            R      0.004       24     -66      50
            R      0.005       24     -62      62
            R      0.005       30     -58      60
            R      0.005       30     -54      50
            R      0.005       22     -56      50
            R      0.005       20     -78      44
            R      0.005       30     -72      36

Table 16: Local minima in corrected p-value for the parametric value regressor in BUNDLING trials, Part 1.
Region         Side   P-value   MNI-X   MNI-Y   MNI-Z
Post-CS        R      0.003       40     -28      42
               R      0.004       52     -38      34
               R      0.005       36     -38      54
               R      0.005       38     -36      50
               R      0.005       38     -42      48
ACC                   0.003       -4       4      28
                      0.003        4       4      26
                      0.005      -12      -4      46
                      0.005        0      -2      50
                      0.009       -4      24      22
                      0.010       -2      28      30
Pre-CS         R      0.005       58     -14      30
               R      0.010       48     -12      26
Sup. Insula    R      0.005       34      -8      12
               R      0.005       38       0       8
               R      0.005       42      -2       4
Frontal Pole   L      0.006      -20      62       8
               L      0.008      -18      56       4
Hippocampus    L      0.006      -28      -2     -22
               L      0.008      -30      -8     -22
V.Str.         L      0.007      -30     -14      -6
               L      0.010      -16     -24      16
mOFC                  0.008       -4      24     -22
                      0.008       -6      12     -24
                      0.008       -2       8     -26
Amygdala       L      0.008      -24      -8     -10
               L      0.008      -24      -4     -16
               L      0.008      -28     -18     -16

Table 17: Local minima in corrected p-value for the parametric value regressor in BUNDLING trials, Part 2. These tables contain many more minima because a lower threshold was needed to report the mOFC activation. dlPFC: DorsoLateral PreFrontal Cortex; SFG: Superior Frontal Gyrus; IPS: IntraParietal Sulcus; CS: Central Sulcus; ACC: Anterior Cingulate Cortex; V.Str.: Ventral Striatum; mOFC: medial OrbitoFrontal Cortex; vmPFC: ventroMedial PreFrontal Cortex.

Figure 28: Value tracking regions. A) Regions tracking value in the fMRI task (Model (1)). B) Regions tracking value in CONTROL (Model (2)). C) Regions tracking value in BUNDLING (Model (2)). See Table 14 for all active regions.

3.3.6 Connectivity analysis

With Model (3) we assess whether any brain regions are differentially recruited to calculate value during different tasks. The center of the vmPFC ROI (Kahnt et al. (2011)) is 10.05 mm from the vmPFC peak of our value regressor (principally in the anterior direction). Our value regressor is highly significant at the center of the ROI (p = 0.01). We found that the dlPFC is functionally connected to the vmPFC, but only in the more complex conditions. The dlPFC was significantly activated when we contrasted BUNDLING and CONTROL, showing that as activity in the seed increased, activity in dlPFC increased as well (Figure 32).
A similar finding was made when we contrasted SCALING and CONTROL, but at a lower level of significance (p ≤ 0.25 corrected).

3.3.7 Other analyses of interest

We considered several variants of Model (1). First, we replaced the "implicit ranking" value regressor by the estimated value regressor. We also repeated the analysis by considering the explicit ranking instead. The results were qualitatively similar to those presented here. We also considered a variant of Model (1) in which we replaced the on-screen value regressor by a chosen-value regressor. We did not find any significantly different pattern of activity. Last, we constructed a variant of Model (1) that also included a regressor for task difficulty, measured by the absolute value of the difference between the on-screen and the off-screen option. No effect was associated with difficulty.

Figure 29: Attentiveness is required in complex conditions. Compared to CONTROL, both SCALING and BUNDLING trials have significantly less activity in the posterior nodes of the Default Mode Network. A canonical map of the default mode network (A) from Smith et al. (2009) appears similar to the corrected contrasts of CONTROL>SCALING (B) and CONTROL>BUNDLING (C). No regions are significant for the contrasts SCALING>CONTROL or BUNDLING>CONTROL.

3.4 Discussion

Our study provides novel insights about the computation of value in value-based decision-making involving bundles of goods.

Value tracking of complex options. The most unexpected finding is that value tracking regions are not tracking value similarly across conditions.

Complexity requires attentiveness.
The default mode network (DMN) is active when a person is not focused on a task (Raichle, MacLeod, Snyder, Powers, Gusnard, and Shulman (2001); Raichle (2015); Davis, Hauf, Wu, and Everhart (2011)), and it has also been shown to be negatively correlated with attention networks (Greicius, Krasnow, Reiss, and Menon (2003)) and with the dlPFC in working memory tasks (Piccoli, Valente, Linden, Re, Esposito, Sack, and Salle (2001); Esposito, Bertolino, Scarabino, Latorre, Blasi, Popolizio, Tedeschi, Cirillo, Goebel, and Di Salle (2006)). The relative inactivation, in SCALING and BUNDLING with respect to CONTROL, of regions that correlate with the canonical DMN implies that subjects need to be more attentive when performing those tasks. The independent component analysis (ICA) studies that first identified networks like the DMN typically also find opponent networks (Greicius et al. (2003); Raichle (2015)) that are active when the DMN is inactive, commonly referred to as the task positive network. We did not find any regions that were significantly more active in SCALING or BUNDLING compared to CONTROL. An absence of significant activity is always difficult to interpret, but it is unsurprising in this case if we suppose that subjects approached different SCALING or BUNDLING trials with different strategies. This would create a lot of heterogeneity in the `task positive' response.

Figure 30: The contrast between the single-item-trial regressor and the regressors for scaled-option-trials and bundled-option-trials correlates with the canonical default mode network from Smith et al. (2009). A voxelwise correlation coefficient of 0.29 was calculated using fslcc. The p-value of the Pearson correlation is < 10^-308. However, a Pearson correlation test assumes independent observations, which is never true in neuroimaging, especially after smoothing. Taken with a grain of salt (addito salis grano), this is still a good correlation with a tiny p-value.

Complexity requires calculation.
The IPS is associated with many visuospatial tasks including calculation, saccades, visual attention, language, reach, and grasp (Simon, Mangin, Cohen, Le Bihan, and Dehaene (2002)). It is also known that the exact brain region associated with each of these abilities varies systematically between subjects based on expertise (Desco, Navas-Sanchez, Sánchez-González, Reig, Robles, Franco, Guzmán-De-Villoria, García-Barreno, and Arango (2011)). Ideally, we would have run localizer tasks for each of these abilities in each subject and seen which best fit the parietal region implicated in the BUNDLING>SCALING contrast. But our task was already pushing the limits of what could be expected from a subject. Instead, we compare that activity with a broad meta-analysis of these tasks. Figure 31 shows a comparison between a Neurosynth meta-analysis for `calculation' (A), a contrast of meta-analyses for `calculation'>`saccades' (C), and the BUNDLING>SCALING contrast (B) (Yarkoni, Poldrack, Nichols, Van Essen, and Wager (2011)). The contrast is much more similar to the meta-analysis for `calculation', implying that some psychological primitive of calculation is what is driving the activity in the BUNDLING>SCALING contrast (footnote 4).

Region    Side  P-value  MNI-X  MNI-Y  MNI-Z
fusiform  L     0.002     -32    -58    -18
          L     0.002     -32    -52    -20
          R     0.005      30    -48    -20
IPS       R     0.007      30    -62     44
          R     0.007      30    -66     30
          L     0.008     -24    -62     42

Table 18: Local minima in corrected p-value for BUNDLING>SCALING. Thresholded at p = 0.01.

Region  Side  P-value  MNI-X  MNI-Y  MNI-Z
dlPFC   R     0.029      50     10     34
        L     0.093     -44     24     30

Table 19: Local minima in corrected p-value for BUNDLING CONNECTIVITY>CONTROL CONNECTIVITY. Thresholded at p = 0.1, corrected.

Complexity requires working memory. A relationship between value tracking regions and dlPFC has been demonstrated in many studies.
These studies all share a common denominator: participants need to choose between objects involving some degree of complexity, and value cannot be computed simply by looking at the objects. In our study, complexity is present in the SCALING condition and more so in the BUNDLING condition. The observed longer reaction times suggest that value is computed gradually in those conditions. Furthermore, the fact that mOFC and dlPFC are functionally connected in the complex conditions (and even more so when complexity increases) indicates that dlPFC may be supporting the longer accumulation of information needed to compute complex value. This finding also sheds light on why dlPFC is not consistently found to activate in value-based decision-making paradigms: its involvement requires a minimum level of complexity.

Footnote 4: NeuroSynth meta-analyses for `arithmetic' and `number' also bear a strong resemblance to the BUNDLING>SCALING contrast.

Figure 31: Relative to SCALING, BUNDLING recruits regions associated with calculation. IPS activation in a Neurosynth meta-analysis for `calculation' (A) bears a striking resemblance to the BUNDLING>SCALING contrast (B). A contrast of meta-analyses for `calculation' and `saccades' (C) also supports the claim that the IPS activity in the BUNDLING>SCALING contrast is caused by calculation-like cogitation. In (C), red regions are more associated with calculation and blue regions are more associated with saccades.

Figure 32: The vmPFC is more connected to the dlPFC during BUNDLING trials than during CONTROL trials (gPPI). Both the right dlPFC (peak p-value = 0.029 corrected) and the left dlPFC (peak p-value = 0.093 corrected) show significantly greater connectivity for the BUNDLING connectivity regressor than for the CONTROL connectivity regressor.

3.5 References
Andersson, J. L. R., M. Jenkinson, and S. Smith (2007). Non-linear registration aka Spatial normalisation. Oxford University (June), 22.
Baker, S. C., C. D. Frith, R. S.
Frackowiak, and R. J. Dolan (1996). Active representation of shape and spatial location in man. Cerebral Cortex 6(4), 612–619.
Baumgartner, T., D. Knoch, P. Hotz, C. Eisenegger, and E. Fehr (2011). Dorsolateral and ventromedial prefrontal cortex orchestrate normative choice. Nature Neuroscience 14(11), 1468–74.
Berman, K. (1995). Physiological activation of a cortical network during performance of the Wisconsin Card Sorting Test: A positron emission tomography study. Neuropsychologia 33(8), 1027–1046.
Braver, T. S., J. D. Cohen, L. E. Nystrom, J. Jonides, E. E. Smith, and D. C. Noll (1997). A parametric study of prefrontal cortex involvement in human working memory. NeuroImage 5(1), 49–62.
Camille, N., C. A. Griffiths, K. Vo, L. K. Fellows, and J. W. Kable (2011). Ventromedial frontal lobe damage disrupts value maximization in humans. The Journal of Neuroscience 31(20), 7527–7532.
Camus, M., N. Halelamien, H. Plassmann, S. Shimojo, J. O'Doherty, C. Camerer, and A. Rangel (2009). Repetitive transcranial magnetic stimulation over the right dorsolateral prefrontal cortex decreases valuations during food choices. European Journal of Neuroscience 30(10), 1980–1988.
Carlson, S., S. Martinkauppi, P. Rämä, E. Salli, A. Korvenoja, and H. J. Aronen (1998). Distribution of cortical activation during visuospatial n-back tasks as revealed by functional magnetic resonance imaging. Cerebral Cortex 8, 743–752.
Chib, V. S., A. Rangel, S. Shimojo, and J. P. O'Doherty (2009). Evidence for a common representation of decision values for dissimilar goods in human ventromedial prefrontal cortex. The Journal of Neuroscience 29(39), 12315–12320.
Christoff, K., V. Prabhakaran, J. Dorfman, Z. Zhao, J. K. Kroger, K. J. Holyoak, and J. D. Gabrieli (2001). Rostrolateral prefrontal cortex involvement in relational integration during reasoning. NeuroImage 14(5), 1136–1149.
Clithero, J. A. and A. Rangel (2013).
Combining response times and choice data using a neuroeconomic model of the decision process improves out-of-sample predictions. Quarterly Journal of Economics 91125, 3.
Cohen, J. D., S. D. Forman, T. S. Braver, B. J. Casey, D. Servan-Schreiber, and D. C. Noll (1994). Activation of the prefrontal cortex in a nonspatial working memory task with functional MRI.
Cohen, J. D., W. M. Perlstein, T. S. Braver, L. E. Nystrom, D. C. Noll, J. Jonides, and E. E. Smith (1997). Temporal dynamics of brain activation during a working memory task.
Cox, R. W. (1996). AFNI: Software for Analysis and Visualization of Functional Magnetic Resonance Neuroimages. Computers and Biomedical Research 29(3), 162–173.
Cox, R. W. and J. S. Hyde (1997). Software tools for analysis and visualization of fMRI data. Nuclear Magnetic Resonance in Biomedicine 10, 171–178.
Dale, A. M. (1999). Optimal experimental design for event-related fMRI. Human Brain Mapping 8(2-3), 109–114.
Davis, C. E., J. D. Hauf, D. Q. Wu, and D. E. Everhart (2011). Brain function with complex decision making using electroencephalography. International Journal of Psychophysiology 79(2), 175–183.
De Martino, B., S. M. Fleming, N. Garrett, and R. J. Dolan (2013). Confidence in value-based choice. Nature Neuroscience 16(1), 105–110.
de Wit, S., P. R. Corlett, M. R. Aitken, A. Dickinson, and P. C. Fletcher (2009). Differential engagement of the ventromedial prefrontal cortex by goal-directed and habitual behavior toward food pictures in humans. The Journal of Neuroscience: the official journal of the Society for Neuroscience 29(36), 11330–8.
Deichmann, R., J. A. Gottfried, C. Hutton, and R. Turner (2003). Optimized EPI for fMRI studies of the orbitofrontal cortex. NeuroImage 19(2), 430–441.
Demb, J. B., J. E. Desmond, A. D. Wagner, C. J. Vaidya, G. H. Glover, and J. D. Gabrieli (1995).
Semantic encoding and retrieval in the left inferior prefrontal cortex: a functional MRI study of task difficulty and process specificity. Journal of Neuroscience 15(9), 5870–5878.
Desco, M., F. J. Navas-Sanchez, J. Sánchez-González, S. Reig, O. Robles, C. Franco, J. A. Guzmán-De-Villoria, P. García-Barreno, and C. Arango (2011). Mathematically gifted adolescents use more extensive and more bilateral areas of the fronto-parietal network than controls during executive functioning and fluid reasoning tasks. NeuroImage 57(1), 281–292.
Dove, A., S. Pollmann, T. Schubert, C. J. Wiggins, and D. Yves Von Cramon (2000). Prefrontal cortex activation in task switching: An event-related fMRI study. Cognitive Brain Research 9(1), 103–109.
Esposito, F., A. Bertolino, T. Scarabino, V. Latorre, G. Blasi, T. Popolizio, G. Tedeschi, S. Cirillo, R. Goebel, and F. Di Salle (2006). Independent component model of the default-mode brain function: Assessing the impact of active thinking. Brain Research Bulletin 70(4-6), 263–269.
Fehr, E., H. Bernhard, and B. Rockenbach (2008). Egalitarianism in young children. Nature 454(28), 1079–1084.
Fellows, L. K. (2011). Orbitofrontal contributions to value-based decision making: Evidence from humans with frontal lobe damage. Annals of the New York Academy of Sciences 1239(1), 51–58.
Fellows, L. K. and M. J. Farah (2007). The role of ventromedial prefrontal cortex in decision making: Judgment under uncertainty or judgment per se? Cerebral Cortex 17(11), 2669–2674.
FitzGerald, T. H. B., B. Seymour, and R. J. Dolan (2009). The Role of Human Orbitofrontal Cortex in Value Comparison for Incommensurable Objects. Journal of Neuroscience 29(26), 8388–8395.
Fregni, F., F. Orsati, W. Pedrosa, S. Fecteau, F. A. M. Tome, M. A. Nitsche, T. Mecca, E. C. Macedo, A. Pascual-Leone, and P. S. Boggio (2008). Transcranial direct current stimulation of the prefrontal cortex modulates the desire for specific foods. Appetite 51(1), 34–41.
Friston, K.
J., W. D. Penny, and D. E. Glaser (2005). Conjunction revisited. NeuroImage 25(3), 661–667.
Goel, V., B. Gold, S. Kapur, and S. Houle (1997). The seats of reason? An imaging study of deductive and inductive reasoning. Neuroreport 8(5), 1305–1310.
Goldberg, T. E., K. F. Berman, K. Fleming, J. Ostrem, J. D. Van Horn, G. Esposito, V. S. Mattay, J. M. Gold, and D. R. Weinberger (1998). Uncoupling cognitive workload and prefrontal cortical physiology: a PET rCBF study. NeuroImage 7(4 Pt 1), 296–303.
Gorgolewski, K., C. D. Burns, C. Madison, D. Clark, Y. O. Halchenko, M. L. Waskom, and S. S. Ghosh (2011). Nipype: a flexible, lightweight and extensible neuroimaging data processing framework in python. Front Neuroinform 5.
Greicius, M. D., B. Krasnow, A. L. Reiss, and V. Menon (2003). Functional connectivity in the resting brain: a network analysis of the default mode hypothesis. PNAS 100(1), 253–258.
Hare, T. A., J. Malmaud, and A. Rangel (2011). Focusing attention on the health aspects of foods changes value signals in vmPFC and improves dietary choice. The Journal of Neuroscience: the official journal of the Society for Neuroscience 31(30), 11077–87.
Hare, T. A., J. P. O'Doherty, C. F. Camerer, W. Schultz, and A. Rangel (2008). Dissociating the Role of the Orbitofrontal Cortex and the Striatum in the Computation of Goal Values and Prediction Errors. The Journal of Neuroscience 28(22), 5623–5630.
Hare, T. A., A. Rangel, and C. F. Camerer (2009). Self-control in decision-making involves modulation of the vmPFC valuation system. Science 324(5927), 646–8.
Harris, A., T. Hare, and A. Rangel (2013). Temporally Dissociable Mechanisms of Self-Control: Early Attentional Filtering Versus Late Value Modulation. Journal of Neuroscience 33(48), 18917–18931.
Henri-Bhargava, A., A. Simioni, and L. K. Fellows (2012). Ventromedial frontal lobe damage disrupts the accuracy, but not the speed, of value-based preference judgments. Neuropsychologia 50(7), 1536–1542.
Hutcherson, C. A., H. Plassmann, J. J. Gross, and A. Rangel (2012). Cognitive Regulation during Decision Making Shifts Behavioral Control between Ventromedial and Dorsolateral Prefrontal Value Systems. Journal of Neuroscience 32(39), 13543–13554.
Jenkinson, M., C. F. Beckmann, T. E. J. Behrens, M. W. Woolrich, and S. M. Smith (2012). FSL. NeuroImage 62(2), 782–790.
Kahnt, T., J. Heinzle, S. Q. Park, and J. D. Haynes (2011). Decoding different roles for vmPFC and dlPFC in multi-attribute decision making. NeuroImage 56(2), 709–715.
Kriegeskorte, N., W. K. Simmons, P. S. F. Bellgowan, and C. I. Baker (2009). Circular analysis in systems neuroscience: the dangers of double dipping.
Kroger, J. K., F. W. Saab, C. L. Fales, S. Y. Bookheimer, M. S. Cohen, and K. J. Holyoak (2002). Recruitment of Anterior Dorsolateral Prefrontal Cortex in Human Reasoning: A Parametric Study of Relational Complexity. Cerebral Cortex 12(5), 477–485.
Luo, S., G. Ainslie, D. Pollini, L. Giragosian, and J. R. Monterosso (2011). Moderators of the association between brain activation and farsighted choice. NeuroImage 59(2), 1469–1477.
McFadden, B. R., J. L. Lusk, J. M. Crespi, J. B. C. Cherry, L. E. Martin, R. L. Aupperle, and A. S. Bruce (2015). Can neural activation in dorsolateral prefrontal cortex predict responsiveness to information? An application to egg production systems and campaign advertising. PLoS ONE 10(5), e0125243.
McFadden, D. (1973). Conditional logit analysis of qualitative choice behavior. In Frontiers in Econometrics, Chapter 4, pp. 105–142.
McLaren, D. G., M. L. Ries, G. Xu, and S. C. Johnson (2012). A generalized form of context-dependent psychophysiological interactions (gPPI): A comparison to standard approaches. NeuroImage 61(4), 1277–1286.
Monterosso, J. R., G. Ainslie, J. Xu, X. Cordova, C. P. Domier, and E. D. London (2007). Frontoparietal cortical activity of methamphetamine-dependent and comparison subjects performing a delay discounting task.
Human Brain Mapping 28(5), 383–393.
Nichelli, P., J. Grafman, P. Pietrini, D. Alway, J. C. Carton, and R. Miletich (1994). Brain Activity in Chess Playing. Nature 369(6477), 191.
Osherson, D., D. Perani, S. Cappa, T. Schnur, F. Grassi, and F. Fazio (1998). Distinct brain loci in deductive versus probabilistic reasoning. Neuropsychologia 36(4), 369–376.
Petrides, M. (1994). Frontal lobes and behaviour. Current Opinion in Neurobiology 4(2), 207–211.
Piccoli, T., G. Valente, D. E. J. Linden, M. Re, F. Esposito, A. T. Sack, and F. D. Salle (2001). The default mode network and the working memory network are not anti-correlated during all phases of a working memory task. PLoS ONE 10(4), 1–16.
Plassmann, H., J. O'Doherty, and A. Rangel (2007). Orbitofrontal Cortex Encodes Willingness to Pay in Everyday Economic Transactions. Journal of Neuroscience 27(37), 9984–9988.
Prabhakaran, V., K. Narayanan, Z. Zhao, and J. D. Gabrieli (2000). Integration of diverse information in working memory within the frontal lobe. Nature Neuroscience 3(1), 85–90.
Prabhakaran, V., J. A. Smith, J. E. Desmond, G. H. Glover, and J. D. Gabrieli (1997). Neural substrates of fluid reasoning: an fMRI study of neocortical activation during performance of the Raven's Progressive Matrices Test. Cognitive Psychology 33(1), 43–63.
Price, C. J. and K. J. Friston (1997). Cognitive conjunction: a new approach to brain activation experiments. NeuroImage 5(4 Pt 1), 261–70.
Raichle, M. E. (2015). The Brain's Default Mode Network. Annual Review of Neuroscience (April), 413–427.
Raichle, M. E., A. M. MacLeod, A. Z. Snyder, W. J. Powers, D. A. Gusnard, and G. L. Shulman (2001). A default mode of brain function. Proceedings of the National Academy of Sciences of the United States of America 98(2), 676–682.
Rangel, A., C. Camerer, and P. R. Montague (2008). A framework for studying the neurobiology of value-based decision making. Nature Reviews Neuroscience 9(7), 545–556.
Rudorf, S. and T. A.
Hare (2014). Interactions between Dorsolateral and Ventromedial Prefrontal Cortex Underlie Context-Dependent Stimulus Valuation in Goal-Directed Choice. Journal of Neuroscience 34(48), 15988–15996.
Simon, O., J. F. Mangin, L. Cohen, D. Le Bihan, and S. Dehaene (2002). Topographical layout of hand, eye, calculation, and language-related areas in the human parietal lobe. Neuron 33(3), 475–487.
Smith, S. M., P. T. Fox, K. L. Miller, D. C. Glahn, P. M. Fox, C. E. Mackay, N. Filippini, K. E. Watkins, R. Toro, A. R. Laird, and C. F. Beckmann (2009). Correspondence of the brain's functional architecture during activation and rest. Proceedings of the National Academy of Sciences of the United States of America 106(31), 13040–5.
Smith, S. M., M. Jenkinson, M. W. Woolrich, C. F. Beckmann, T. E. J. Behrens, H. Johansen-Berg, P. R. Bannister, M. De Luca, I. Drobnjak, D. E. Flitney, R. Niazy, J. Saunders, J. Vickers, Y. Zhang, N. De Stefano, J. M. Brady, and P. M. Matthews (2004). Advances in functional and structural MR image analysis and implementation as FSL. NeuroImage 23(S1), 208–219.
Sokol-Hessner, P., C. Hutcherson, T. Hare, and A. Rangel (2012). Decision value computation in DLPFC and VMPFC adjusts to the available decision time. European Journal of Neuroscience 35(7), 1065–1074.
Train, K. E. (2003). Discrete Choice Methods with Simulation.
Yarkoni, T., R. A. Poldrack, T. E. Nichols, D. C. Van Essen, and T. D. Wager (2011). Large-scale automated synthesis of human functional neuroimaging data. Nature Methods 8(8), 665–670.

4 Altruism and strategic giving in children and adolescents ¶
Isabelle Brocas, University of Southern California and CEPR
Juan D. Carrillo, University of Southern California and CEPR
Niree Kodaverdian, University of Southern California

Abstract
We conduct a laboratory experiment to investigate the evolution of altruism and strategic giving from childhood to adulthood.
334 school-age children and adolescents (from K to 12th grade) and 48 college students participated in a one-shot dictator game and a repeated alternating version of the same dictator game. Each dictator game featured the choice between a fair split (4,4) and a selfish split (6,1) between oneself and an anonymous partner. We find that altruism (fair split in the one-shot game) increases with age in children and drops after adolescence, and cannot alone account for the development of cooperation in the repeated game. Older subjects reciprocate more and also better anticipate the potential gains of initiating a cooperative play. Overall, children younger than 7 years of age are neither altruistic nor strategic, while college students strategically cooperate despite a relatively low level of altruism. Participants in the intermediate age range gradually learn to anticipate the long term benefits of cooperation and to adapt their behavior to that of their partner. A turning point after which cooperation can be sustained occurs at about 11-12 years of age.

¶ We are grateful to members of the Los Angeles Behavioral Economics Laboratory (LABEL) for their insights and comments and to the Lycée International de Los Angeles (LILA), in particular Emmanuelle Acker, Nordine Bouriche and Anneli Harvey, for their help and support running the experiments in their school. The study was conducted with the University of Southern California IRB approval UP-12-00528. We acknowledge the financial support of the National Science Foundation grant SES-1425062. Address for correspondence: Juan D. Carrillo, Department of Economics, University of Southern California, 3620 S. Vermont Ave., Los Angeles, CA 90089, USA, <juandc@usc.edu>.

4.1 Introduction
Social interactions are a crucial component of our daily lives. In many instances, the best long term course of action requires a short term sacrifice.
Social interactions are known to be shaped by other-regarding preferences and strategic motives, but it is unclear how and when we acquire the ability to coordinate on mutually beneficial cooperative actions. It is also well known that social preferences (Fabes and Eisenberg, 1998; Engel, 2011) and strategic thinking (Sher et al., 2014; Czermak, Feri, Glätzle-Rützler, and Sutter, 2016) gradually evolve from childhood to adulthood. However, their relative importance for social decision-making at a given age is not well understood. If a child shares a toy with his sibling, does it display altruism or is it a strategic decision with reciprocal expectations? Conversely, if he keeps it for himself, is it a myopic choice or is it due to the belief that the sibling will not realize the implicit expectation? Does the behavior of this child change as he grows as a consequence of a socialization process that makes him more empathic? Or does it change because his reasoning becomes more sophisticated? Answering these questions (and more generally, studying developmental decision-making) is critical not only to understand how school-age subjects behave in groups, a topic of practical relevance for children's advocates, but also to understand how development shapes the motivations of adults and their ability to select mutually beneficial decisions.
The objective of this paper is to study the evolution from childhood to adulthood of the behavior, motivation and payoff consequences of efficient but costly sharing in dynamic relationships. Our primary goal is to disentangle altruism, which we define as the willingness to sacrifice own payoff to benefit others, from strategic giving, which we define as the willingness to forego a current payoff as a means to encourage a mutually profitable long term relationship.
(See footnote 1.) A major challenge for such a study is to design tasks that are short, simple and engaging, so that children as young as 5 years of age are able to understand and be engaged with them, but also challenging and subtle enough to maintain the attention of adults. In our experiment, we consider two tasks. First, a standard one-shot, anonymous dictator game with two options of "tokens for me" and "tokens for the other", where sharing is privately costly but socially efficient. The options we present are (6,1) and (4,4). Second, the same dictator game played multiple times with a fixed and anonymous partner and with alternating roles, which we call a supergame. We consider school-age subjects (5 to 17 years old) and a control group of USC students (on average, 21.3 years old).

Footnote 1: We realize that the literature has provided slightly different definitions of "generosity", "altruism" and "prosociality" (see e.g., Blake and Rand (2010); Fehr et al. (2008); Fehr, Glätzle-Rützler, and Sutter (2013); Dreber, Fudenberg, and Rand (2014)). We do not take a strong stance on semantics. Instead, we ask the reader to think of our definition of altruism in terms of a costly transfer or a payoff sacrifice for the benefit of others. Also, we do not explore the motivations for such behavior (pure altruism, warm glow, etc.).

We observe the following. Using the likelihood that the subject chooses (4,4) in the one-shot game as a proxy for altruism, we find that altruism is hump-shaped: it monotonically increases with age for school-age subjects (from 0.14 to 0.59) and decreases for our control group (0.40) (Result 1). To study strategic giving, we consider two sets of measures. First, we study how subjects react to the choices of their partners, which we call "strategic adaptation." We find a significant increase with age in the probability of reciprocating (choosing (4,4) as a response to (4,4) by the partner), from 0.20 to 0.95.
By contrast, the probability of choosing (4,4) as a response to (6,1) by the partner is low and relatively constant across age groups (between 0.12 and 0.28) (Result 2). Second, we look for evidence of the ability to anticipate future choices, which we call "strategic anticipation." A simple (and, arguably, pure) way to assess that ability is to look at whether a subject chooses (4,4) in the first round of the first supergame after choosing (6,1) in the one-shot game. Subjects who do so are unambiguously sacrificing some payoff in the first round to promote mutual goodwill in the hope of increasing their long run payoff, and not as a display of altruism. We find a sustained increase across age groups in the probability of choosing (4,4) in the first round among non-altruistic subjects, starting at 0.06 in the youngest population and ending at 0.80 in the control group (Result 3). Finally, despite the differences in behavior across age groups, we find that open-handed tit-for-tat maximizes payoffs in all age groups given the empirical behavior of the group. However, the same (optimal) strategy implies very different actions and payoffs for different age groups. Indeed, it results in subjects playing (4,4) around 35% of the time in our younger school-age subjects, around 75% in our older school-age subjects and around 85% in our control group, with the corresponding difference in expected payoffs (Result 4).
Overall, the increase in cooperation in the supergames results from the combination of three factors: the evolution of altruism, the evolution of strategic thinking, and the effect of the group subjects are in. Our younger subjects are typically selfish and myopic. As they grow, they steadily become more altruistic but, even more significantly, they learn to anticipate the strategic gains of cooperation. They also gradually realize the possibility of prompting their partner to a mutually advantageous implicit agreement.
Finally, our control group is the most effective at efficient coordination despite their lack of altruism. That group's behavior suggests that strategic thinking is more important than altruism in order to reach sustained cooperation. Importantly, the differences in choices across ages are magnified by peer effects. Indeed, even if a subject is strategic, his behavior depends on the age group he belongs to. A subject in the youngest age group who plays (4,4) is exploited by his partner, whereas the same behavior is rewarded in the older age groups with continued cooperation. Thus, the peer effect is a potent self-reinforcing factor that exacerbates differences in motivations across ages.
Our results are consistent with and expand on social and cognitive developmental paradigms. Acting strategically requires people to put themselves in the shoes of others, an ability referred to as Theory of Mind, and to think logically about their own as well as others' courses of action. Very young children are self-centered and unable to take the perspective of others. Around 5 years of age, their Theory of Mind ability starts to develop (Premack and Woodruff, 1978). Children become less self-centered and start adapting their behavior to norms and rules in their environment. They move from a situation in which they neither infer nor care about what others think to a situation in which they attribute beliefs to others and empathize with them. This is consistent with the observed increase in altruism among school-age children in our study. The development of logical thinking occurs in stages (Piaget, 1972). Children develop the ability to think logically about what they observe (inductive logic) between the ages of 8 and 12 (Feeney and Heit, 2007). This ability is required for the development of strategic adaptation that we observe in our population.
Children start developing the ability to reason abstractly (hypothetical and counterfactual thinking) around 12 years of age (Piaget, 1972; Rafetseder, Schwitalla, and Perner, 2013), an ability necessary for the strategic anticipation of the gains of cooperation. The age at which we notice an improvement of strategic anticipation corresponds closely to the time at which hypothetical thinking is known to start developing.
Before proceeding with the analysis, we briefly review the research most closely related to our paper, namely the experimental literatures on repeated prisoner's dilemma games and on decision making by children. There is a burgeoning experimental strand of research on repeated prisoner's dilemma games, revived by Dal Bó (2005) and surveyed in detail by Dal Bó and Fréchette (2014). The literature shows, among other things, that cooperation is enhanced when future interactions are more likely (discount factor closer to 1), subjects gain experience (number of supergames increases), cooperation is risk dominant and cooperation is robust to strategic uncertainty. Unlike this literature, our goal is not to study the determinants of cooperation. Instead, we are concerned with the evolution of altruism and strategic giving from childhood to adulthood (in other words, our main treatment variable is `age'; see footnote 2). Notice also that, instead of building on the standard prisoner's dilemma paradigm, we design a slightly different game that better captures the parameters of interest and that children can understand easily.

Footnote 2: Some of this literature partly shares our focus, and correlates altruism with strategic cooperation by looking at behavior in dictator games and repeated prisoner's dilemma (Fudenberg, Rand, and Dreber, 2012; Dreber et al., 2014). Authors typically find weak or no correlation between the amount of giving in the dictator game and the level of initial cooperation in the repeated prisoner's dilemma, and they conclude that altruism is not a main driving force of cooperation and forgiveness. We also find that strategic considerations are more critical than intrinsic altruism in explaining the increase in cooperation across ages.

The literature in psychology and economics has recently investigated changes in preferences and strategic thinking from childhood to adulthood. Studies have analyzed the evolution of trust (Sutter and Kocher, 2007), prosociality (Blake and Rand, 2010; Fehr et al., 2008, 2013), reciprocity (House, Henrich, Sarnecka, and Silk, 2013) and third-party punishment (Jordan, McAuliffe, and Warneken, 2014; Lergetporer, Angerer, Glätzle-Rützler, and Sutter, 2014), emphasizing different developmental stages in other-regarding concerns and norm following. Other works have explored the development of strategic thinking (Brosig-Koch, Heinrich, and Helbach, 2012; Sher et al., 2014; Czermak et al., 2016; Brocas and Carrillo, 2016, 2017), focusing on the gradual acquisition of different aspects of logical reasoning (inductive, deductive, hypothetical and recursive thinking). This paper integrates both strands by studying age-related changes in altruism and strategic giving in a unified setting, as well as the interplay between the two. To our knowledge, Blake, Rand, Tingley, and Warneken (2015) is the only work with a similar approach. The authors propose an innovative design to study children's cooperation in a one-shot and a repeated prisoner's dilemma game. The study looks at children in 5th and 6th grade, and children play either the one-shot or a finite version of the repeated prisoner's dilemma (five rounds), but not both. The authors show that gender and conduct problems affect children's tendency to cooperate in those games. Compared to the present article, the limited age range in Blake et al. (2015) impedes a study of the evolution of choices from childhood to adulthood.
Also, the between-subject design does not make it suitable to analyze the relative importance of altruism vs. strategic thinking in determining behavior, and the interaction between the two motives for giving.
4.2 Experimental design
Participants. We recruited 334 school-age subjects from grades K to 11th at the Lycée International of Los Angeles (LILA), a bilingual private school in Los Angeles, with campuses in Los Feliz (pre-K to 5th) and Burbank (6th to 12th). We ran 35 sessions that lasted between 60 and 90 minutes. Sessions were conducted in a classroom at the school using touchscreen PC tablets and the tasks were programmed in z-Tree. Sessions had between 8 and 10 subjects. For each session, we tried to have male and female subjects from the same grade, but for logistical reasons sometimes had to mix subjects of two consecutive grades. As a control, we ran 6 sessions with 48 USC students (U). These were conducted at the Los Angeles Behavioral Economics Laboratory (LABEL) in the department of Economics at the University of Southern California, using identical procedures. For the USC population, participants were recruited from the LABEL subject pool. The number of subjects by grade is reported in Table 20.

Location     LILA Los Feliz                  LILA Burbank                    USC
Grade        K    1st  2nd  3rd  4th  5th    6th  7th  8th  9th  10th 11th   U
# subjects   30   19   30   29   29   19     43   40   31   31   21   12     48
Table 20: Subjects by grade.

Tasks. The experiment consists of three tasks always performed in the same order.
The first task is a series of one-shot binary-choice dictator games. Subjects in the session are randomly and anonymously matched in pairs. One subject, the dictator, decides between a split $(x, y)$ and a split $(x', y')$, where the first element is the number of tokens for oneself and the second element is the number of tokens for the other subject, the recipient. After the decision, new pairs are randomly formed, with no information revealed between games.
Each subject plays the four dictator games described in Table 21 two times, once as a dictator and once as a recipient. Games and order of play are presented randomly. At the end of the eight games, subjects learn only their total accumulated payoff (tokens kept as a dictator and tokens received as a recipient).

Game      (a)                     (b)                (c)                (d)
Choice    (6,1) vs. (4,4)         (2,0) vs. (2,2)    (2,4) vs. (2,2)    (4,0) vs. (2,2)
Label     sharing & efficiency    prosociality       envy               sharing
Table 21: One-shot anonymous dictator games

Game (a) has sharing and efficiency components, which constitute the core elements of our analysis. Games (b), (c), (d) are identical (up to scaling) to those in Fehr et al. (2008, 2013) and allow for a comparison with the literature regarding the tendency of our subjects towards generous, spiteful and egalitarian behavior.
The second task consists of two binary-choice alternating dictator supergames, with an anonymous partner fixed within each supergame. More precisely, subjects in a session are anonymously paired and assigned a role as player 1 or player 2. In round 1, player 1 (the dictator) chooses a split between tokens for himself and tokens for player 2 (the recipient), where the options are (6, 1) and (4, 4), just like in game (a) of the first task. At the end of round 1, player 2 observes player 1's choice. In round 2, player 2 becomes the dictator and player 1 becomes the recipient. Player 2 decides between the same two options. Subjects keep alternating roles between dictator and recipient for the 16 rounds that comprise the first supergame. Subjects know that the supergame consists of "many alternating rounds" (literal words by the experimenter) with a fixed partner but are not told the exact number. At the end of the supergame, subjects are randomly and anonymously rematched with a different subject and play a second supergame, this time comprised of 12 rounds and again not knowing in advance the total length.
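As a rough illustration of the incentives in one pair of consecutive rounds of the supergame just described, the sketch below adds up what a subject earns once as dictator and once as recipient, for each combination of own and partner choice. It is not part of the experimental software: the function names are hypothetical, and the normalization used at the end (mutual cooperation rescaled to 1, mutual defection to 0) is an assumption about the convention commonly used in the repeated prisoner's dilemma literature.

```python
# Payoffs (own tokens, partner's tokens) of the two dictator options in the text.
DEFECT = (6, 1)      # keep 6 tokens, give 1
COOPERATE = (4, 4)   # keep 4 tokens, give 4


def two_round_payoff(own_choice, partner_choice):
    """Tokens earned over a pair of rounds: own share as dictator
    plus the amount received when the partner is the dictator."""
    return own_choice[0] + partner_choice[1]


def pd_payoffs():
    """Return (temptation, reward, punishment, sucker) for one pair of rounds."""
    temptation = two_round_payoff(DEFECT, COOPERATE)   # defect against a cooperator
    reward = two_round_payoff(COOPERATE, COOPERATE)    # mutual cooperation
    punishment = two_round_payoff(DEFECT, DEFECT)      # mutual defection
    sucker = two_round_payoff(COOPERATE, DEFECT)       # cooperate against a defector
    return temptation, reward, punishment, sucker


def normalized_gain_loss(temptation, reward, punishment, sucker):
    """Assumed normalization: reward -> 1, punishment -> 0.
    g is the gain from defecting on a cooperator, l the loss from
    cooperating with a defector, both in units of (reward - punishment)."""
    scale = reward - punishment
    g = (temptation - reward) / scale
    l = (punishment - sucker) / scale
    return g, l


if __name__ == "__main__":
    payoffs = pd_payoffs()
    print("T, R, P, S =", payoffs)
    print("g, l =", normalized_gain_loss(*payoffs))
```

Under these assumptions the ordering temptation > reward > punishment > sucker holds, which is the defining feature of a prisoner's dilemma.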
This alternating individual choice problem is considerably easier to explain to 5-year-old children than the simultaneous two-player, two-action prisoner's dilemma game. Yet, it captures a similar, though certainly not identical, trade-off between short term loss and long term gain of cooperation (one can easily notice that every pair of rounds is identical to a sequential symmetric prisoner's dilemma where the 'temptation', 'cooperation', 'defection' and 'sucker' payoffs are, respectively, 10, 8, 7 and 5).³,⁴ Furthermore, this design also allows for a clean comparison between altruism and strategic giving by looking at the one-shot game (a) and the first round of the dynamic alternating dictator game.⁵
³ There are two other important differences with the recent infinitely repeated prisoner's dilemma literature (Dal Bó and Fréchette, 2014). First, ending is unknown. This is less rigorous than random ending but more natural and significantly easier to explain to young children (see Fréchette and Yuksel (2013) for a comparison between different laboratory implementations of infinitely repeated games). It precludes comparative statics on the horizon length and presupposes a belief that the horizon is long enough that cooperation can be mutually profitable. Second, our subjects play only two supergames. This precludes studying the effect of experience but, again, we felt it was the right choice as the attention span of our population is limited. Finally, notice that we wanted to avoid a last period effect. For that reason, we reduced the length of the second supergame from 16 to 12 in the (unlikely but conceivable) event that some of the older subjects recalled the length of the first supergame and expected the same number of rounds in the second one.
⁴ Payoff-incentives are in the range of the prisoner's dilemma literature. Applying the payoff-normalization of Dal Bó and Fréchette (2014) to our modified prisoner's dilemma game, we get that, every two rounds, $g = l = 2$.
⁵ As is well known, subjects cooperate in the one-shot prisoner's dilemma for a variety of reasons, including an imperfect understanding of the other player's choice set, incentives and motivations. We therefore favor this design over Blake et al. (2015) when the goal is to disentangle altruism and strategic thinking as the two possible motives for giving.
Figure 33 provides screenshots of the supergame. The left screenshot presents the information observed by the dictator. The left panel randomly displays the two options (6, 1) and (4, 4), one above the other. The dictator is instructed to tap on the preferred alternative and press "ok". In the screenshot of Figure 33, the dictator has selected (6, 1) in the current (5th) round. The middle panel displays the history of the supergame, with the subject's own accumulated tokens in each round. The tokens obtained as a dictator (either 4 or 6) are displayed in blue (rounds 1, 3 and 5) whereas the tokens obtained as a recipient (either 4 or 1) are displayed in red (rounds 2 and 4). This panel fills up in real time as the game progresses. The total number of tokens accumulated is displayed at the bottom. Finally, the right panel of the dictator's screen is blank. The right screenshot presents the information observed by the recipient. The left panel is blank. The middle panel displays the same information as for the dictator (past history and accumulated tokens), except that this time it is presented from the recipient's own perspective. The right panel displays an hourglass picture while the recipient waits. When a choice has been made, it displays the split selected by the dictator from the recipient's own perspective, in this particular case (1, 6).
Figure 33: Screenshot of dictator (left) and recipient (right)
The third task is a learning exercise. Balls are sequentially drawn from an urn with green and yellow balls.
Subjects are asked to guess the color of each upcoming ball and are rewarded for correct guesses. Since this is a different task from the previous two, and designed to study a different paradigm, we relegate the analysis of this task to a different paper.
Payoffs. During the experiment, subjects accumulate tokens. We implemented two different conversions depending on the subjects' ages. USC students and subjects at LILA Burbank (grades 6th to 11th) had tokens converted into money, paid with an Amazon gift card at the end of the experiment.⁶ For subjects at LILA Los Feliz (grades K to 5th) we set up a shop with 20 to 30 pre-screened, age and gender appropriate toys.⁷ Different toys had different token prices. Before the experiment, children were taken to the shop and shown the toys they were playing for. They were also instructed about the token prices of each toy and, for the youngest subjects, we explicitly stated that more tokens would result in more toys. At the end of the experiment, subjects learned their token earnings and were accompanied to the shop to exchange tokens for toys. We made sure that every child earned enough tokens to obtain at least three toys. At the same time, no child had excess tokens after choosing all the toys they liked.⁸ At the end of the experiment, we also collected demographic information consisting of "gender", "age", "grade", "number of younger siblings" and "number of older siblings". A transcript of the read aloud instructions is included in Appendix B.
⁶ The conversion rate for USC subjects ($0.15/token) is higher than for LILA Burbank subjects ($0.07/token) to correct for differences in marginal value of money and opportunity cost of time. It implied large differences in average earnings ($22.3 vs. $10.0) despite similar average number of tokens obtained (149 vs. 143). In compliance with LABEL policies, USC subjects were also paid a $5 show-up fee.
⁷ These included gel pens, friendship bracelets and erasers for young girls, figurines, die-cast cars and trading cards for young boys, and apps, calculators and earbuds for older kids. However, children were free to choose any item they liked within their budget.
⁸ The procedure emphasized the value of earning tokens but, at the same time, ensured an enjoyable and exciting experience.
For analysis, we cluster our subjects into five age groups: K-1st-2nd (G1), 3rd-4th-5th (G2), 6th-7th-8th (G3), 9th-10th-11th (G4) and the control population (G5). Although the cut is somewhat arbitrary, it allows us to reduce the number of groups while maintaining some age homogeneity.⁹ Also and unless otherwise noted, when comparing aggregate choices we perform two-sided t-tests of mean differences. Standard errors are clustered at the individual level whenever appropriate. We use a p-value of 0.05 as the benchmark threshold for statistical significance.
⁹ We did a similar analysis grouping only two grades together and obtained similar results (but lower statistical power).
4.3 Analysis of actions and strategies
4.3.1 One-shot (OS) games: altruism
Our first step is an analysis of the behavior in the first task, that is, the four independent one-shot games. Figure 34 presents the choices in each game by age group.
Figure 34: Aggregate choices in the one-shot (OS) dictator games by age group. Panels show the proportion of subjects choosing: (a) sharing/efficiency, (4,4) when the alternative is (6,1); (b) prosociality, (2,2) when the alternative is (2,0); (c) envy, (2,2) when the alternative is (2,4); (d) sharing, (2,2) when the alternative is (4,0). Graphs show standard error bars and the number of subjects per age group from which proportions are determined (G1: 79, G2: 77, G3: 114, G4: 64, G5: 48).
Altruism, or the willingness to sacrifice own payoffs to benefit others as reflected in (a) and (d), increases with age for school-age subjects (differences significant except for G3 vs. G4 in (a) and G1 vs. G2 and G3 vs. G4 in (d)). It then drops significantly between G4 and our control population G5.¹⁰ Interestingly, sharing within G5 is marginally higher (p-value = 0.057) when the sacrifice is collectively efficient ((a) rather than (d)), that is, when the relative price of giving is smaller. A difference between choices in (a) and (d) is not present in any other age group. This is a first indication that our control population is more strategic in their decision to sacrifice payoffs for others than the older school-age subjects. Below, we summarize the results of the one-shot sharing games, as they form the basis of comparison for the dynamic game.
Result 1 Altruism monotonically increases from G1 to G4 and drops between G4 and G5.
As for the other games, we find that prosociality (b) in our sample increases with age (differences significant except for G3 vs. G4) whereas envy (c) decreases with age (differences significant except for G1 vs. G2, G1 vs. G3 and G3 vs. G4). Overall, we find sustained and systematic developmental changes. Choices between our two groups of younger school-age subjects (G1 and G2) are different but not widely so, and the same is true between our two groups of older school-age subjects (G3 and G4). The control population (G5) is an extreme version of older school-age subjects in terms of prosociality and envy but rather different in terms of their willingness to share.
The results are in line with developmental theories of prosocial behavior (Hoffman, 2001; Fabes and Eisenberg, 1998; Carpendale and Lewis, 2004). Young children are typically self-centered and lack other-regarding concerns. This corresponds to the behavior we observe in G1.
With age, children learn to adopt a prosocial behavior, initially in response to norms and rules placed around them (elementary school children), then in response to their own judgment and principles (high-school students and adults). It is therefore plausible that children in groups G2 to G4 aim at behaving in "stereotypically good" ways, while students in G5 solve a more complex trade-off between the moral costs and benefits of behaving nicely.
The findings are also consistent with recent studies on one-shot dictator games. In Appendix A we analyze the choices of our subjects in games (b), (c) and (d) and compare them to the existing literature. In short, we show that the behavior of our subjects regarding generosity, egalitarianism and spitefulness is broadly in line with Fehr et al. (2008). Furthermore, the evolution with age emphasized in Fehr et al. (2013) generally extends to our older school-age population.
¹⁰ While we do not find gender differences when we consider all ages together, we note that females in G1 are more altruistic than males (p-value = 0.041 for game (a) and 0.034 for game (d)) whereas males in G5 are more altruistic than females (p-value = 0.023 for game (a) and 0.013 for game (d)).
4.3.2 Alternating dictator supergames: strategic adaptation
By analogy to the prisoner's dilemma and with a slight abuse of language, from now on we call "cooperate" (C) the strategy (4,4), which involves a short term loss in the hope of a long term gain. We call "defect" (D) the strategy (6,1), which involves the myopic maximization of current payoff. In Figure 35 we report $\Pr(C_t)$, the average proportion of cooperative play over all rounds $t$ by age group in the first and second supergame.
Figure 35: Cooperation in first and second supergame by age group. The graph plots $\Pr(C_t)$ for supergame 1 and supergame 2, with standard error bars clustered at the individual level and the number of observations per age group from which the proportions are determined (supergame 1 / supergame 2: G1 632/474, G2 616/462, G3 912/684, G4 512/384, G5 384/288).
In both supergames, we observe a strong and sustained increase in cooperation with age (p-value $< 0.01$), with the exception of age groups G3 and G4 which are not different from each other. Within a given age group, the level of cooperation is virtually identical in the first and second supergame. Therefore, from now on and unless otherwise noted, we will pool together observations from both supergames. Also, the proportion of cooperation is significantly above 0 for the youngest group and significantly below 1 for the control group. Table 22 presents the proportion of subjects within each age group who select C in every round of both supergames (all 14 choices) or D in every round.

                  G1     G2     G3     G4     G5
C every round     0.00   0.03   0.14   0.17   0.65
D every round     0.49   0.27   0.07   0.05   0.08
Table 22: Proportion of subjects with fixed actions by age group

Table 22 suggests that a significant fraction of subjects do not change their behavior during the experiment. Most notably, one-half of our youngest subjects never cooperate whereas two-thirds of our control subjects always cooperate. As we will see below, this is in part a reaction to the behavior of their partners.
Naturally, $\Pr(C_t)$ is a crude measure. Actions crucially depend on the past behavior of the partners. In Figure 36 we present the probability of cooperation in a given round $t$ ($\geq 2$) of the supergame conditional on the action taken immediately before (round $t-1$) by the partner: $C_{t-1}$ (left) or $D_{t-1}$ (right).
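The conditional probabilities just introduced can be illustrated with a short sketch. It assumes that a supergame is recorded as a list of 'C'/'D' choices in round order; because players alternate, the choice in round $t-1$ is always the partner's. The function name and the example sequence are hypothetical, and the sketch computes the shares for a single sequence rather than reproducing the pooled, clustered estimates reported in the text.

```python
def conditional_cooperation(choices):
    """Estimate Pr(C_t | C_{t-1}) and Pr(C_t | D_{t-1}) from one supergame.

    choices: list of 'C'/'D' in round order. Players alternate, so the
    previous round's choice is always the current dictator's partner's.
    Returns a pair (share after partner C, share after partner D);
    a share is None when the conditioning event never occurs.
    """
    after_c = []  # choices made right after the partner cooperated
    after_d = []  # choices made right after the partner defected
    for previous, current in zip(choices, choices[1:]):
        if previous == "C":
            after_c.append(current)
        else:
            after_d.append(current)

    def share_of_c(observed):
        return observed.count("C") / len(observed) if observed else None

    return share_of_c(after_c), share_of_c(after_d)


if __name__ == "__main__":
    # Hypothetical 8-round sequence, for illustration only.
    example = list("CCDDCCCD")
    reciprocity, forgiveness = conditional_cooperation(example)
    print("Pr(C_t | C_{t-1}) =", reciprocity)
    print("Pr(C_t | D_{t-1}) =", forgiveness)
```

Round 1 has no preceding move and therefore contributes no observation, which is why the conditional measures are defined only for $t \geq 2$.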
Figure 36: Conditional cooperation by age group. Left panel: $\Pr(C_t \mid C_{t-1})$ (observations: G1 133, G2 266, G3 815, G4 490, G5 505). Right panel: $\Pr(C_t \mid D_{t-1})$ (observations: G1 895, G2 734, G3 667, G4 342, G5 119). Graphs show standard error bars clustered at the individual level and the number of observations per age group from which the proportions are determined.
All subjects respond to the behavior of their partners. In particular, differences between $\Pr(C_t \mid C_{t-1})$ and $\Pr(C_t \mid D_{t-1})$ are highly significant ($p < 0.001$) for all age groups except G1. This suggests that unconditional altruism is not a main driving force of behavior at any age. However, different groups respond differently. Reciprocity, defined as $\Pr(C_t \mid C_{t-1})$, follows the same sustained increase with age as the unconditional cooperation $\Pr(C_t)$ though, not surprisingly, levels are statistically higher ($\Pr(C_t \mid C_{t-1}) > \Pr(C_t)$ for all age groups except G1). Forgiveness, loosely defined as $\Pr(C_t \mid D_{t-1})$, is low and similar in all age groups (between 0.12 and 0.28), although differences across age groups are still statistically significant between the younger and older school-age subjects (G1 vs. G3, G1 vs. G4, G2 vs. G4).¹¹,¹²
¹¹ Figures 35 and 36 look very similar when we cluster age groups differently (two grades together) or when we compare the first 4 and last 4 rounds of each supergame (data available upon request).
¹² As in the one-shot games, gender differences are mostly concentrated on the youngest and oldest age groups and move in opposite directions: females in G1 reciprocate more (p-value = 0.001) and forgive more (p-value $< 0.001$) than males, while males in G5 forgive more (p-value = 0.002) than females. There are no statistically significant gender differences in the other age groups.
Overall, a Markov process seems to nicely capture some basic aspects of the similarities and differences in aggregate behavior across age groups.
This would suggest that the main driving effect of age is a change in the willingness to maintain the cooperative agreement, once it is reached. If a subject deviates, reversion to C by the partner is uncommon in all age groups.
However, it would be simplistic to assume that our subjects do not take into consideration behaviors beyond their partner's last move. Indeed, if we consider an extended memory-2 process (the subject's choice as a function of their partner's last two choices), we notice that: $\Pr(C_t \mid C_{t-1}, C_{t-3}) > \Pr(C_t \mid C_{t-1}, D_{t-3})$ for age groups G2, G3, G4 and G5; $\Pr(C_t \mid C_{t-1}, D_{t-3}) > \Pr(C_t \mid D_{t-1}, C_{t-3})$ for age groups G3 and G4; and $\Pr(C_t \mid D_{t-1}, C_{t-3}) > \Pr(C_t \mid D_{t-1}, D_{t-3})$ for age groups G2, G3 and G4.¹³ These differences suggest that while Markov strategies can help explain differences across ages, the history beyond the last move also matters. It will therefore be instructive to study the dynamic strategies of our participants. Such analysis is relegated to section 4.4.2.
¹³ Notice, however, that the number of observations in some categories is small and there is a selection effect in the way these variables are constructed.
Finally, one may wonder whether cooperation can be prompted, induced or taught. To address this question, we present $\Pr(C_t \mid C_{t-1}, D_{t-2})$ and $\Pr(C_t \mid D_{t-1}, D_{t-2})$ in Figure 37. These two probabilities capture the likelihood that a subject who did not cooperate in round $t-2$ reverts to cooperation in round $t$ as a function of the partner's choice in $t-1$.
Figure 37: Learning to cooperate. The graph plots $\Pr(C_t \mid C_{t-1}, D_{t-2})$ and $\Pr(C_t \mid D_{t-1}, D_{t-2})$ by age group, with standard error bars and the number of observations per age group from which proportions are determined (first / second measure: G1 99/728, G2 126/551, G3 151/458, G4 85/228, G5 24/87).
Subjects in G2, and even more significantly in G3 and G4, are willing to reverse their non-cooperative strategy if they realize that their partner wants to cooperate.
By contrast, our youngest and control populations cannot be convinced to become cooperative if they have decided not to. This is expected in G1 where cooperation levels are pervasively low, but it is surprising in G5 where reciprocal cooperation is extremely high (0.94). It suggests a bi-modal behavior of our control population: sustained cooperation or sustained defection. The conclusions of this section are summarized in the following result.
Result 2 Strategic adaptation to the partner's choice evolves with age: (i) reciprocity strongly increases with age; (ii) forgiveness weakly increases with age; and (iii) subjects in G2 to G4 can be prompted to cooperate whereas subjects in G1 and G5 cannot.
4.3.3 The relationship between altruism and strategic adaptation
Motivated by the results in sections 4.3.1 and 4.3.2, we now study the relationship between altruism and strategic adaptation. Intuitively, altruistic subjects are likely to be more prone to cooperate. It is unclear, however, whether they will be more or less reactive to the choice of their partner. In Figure 38, we present the fraction of conditional cooperation by age group as a function of the subjects' behavior in the one-shot game.
Figure 38: Conditional cooperation as a function of choice in OS. Left panel: $\Pr(C_t \mid C_{t-1})$ for subjects who played C in OS and for those who played D in OS (observations, C in OS / D in OS: G1 9/124, G2 122/144, G3 482/333, G4 347/143, G5 209/296). Right panel: $\Pr(C_t \mid D_{t-1})$ (observations, C in OS / D in OS: G1 132/763, G2 226/508, G3 263/404, G4 148/194, G5 37/82). Graphs show standard error bars clustered at the individual level and the number of observations per age group from which the proportions are determined.
Consistent with the fact that behavior in OS game (a) partly reflects altruism, in age groups G1, G2 and G3 we observe a higher tendency to cooperate, in terms of both reciprocity and forgiveness, by subjects who played C in OS than by those who played D in OS (the difference is not statistically significant for reciprocity in G1, a group with only 9 observations). However, even altruistic subjects react to their partner's choice: all age groups except G1 are significantly more likely to reciprocate than to forgive. Therefore, while subjects differ in their intrinsic preference for giving, they all respond to the behavior of others.
Differences between our oldest school-age subjects and the control population are more subtle. All G5 subjects seem to realize the strategic gains of reciprocity but those who play C in OS are significantly more forgiving than those who play D in OS. By contrast, all G4 subjects are equally forgiving but those who play D in OS are significantly less likely to reciprocate than those who play C in OS, indicating a lower degree of strategic reasoning.
4.3.4 One-shot game vs. alternating supergame: strategic anticipation
We have shown that participants adapt to observed play. Our next step is to investigate whether they anticipate future possible beneficial outcomes. A simple test for strategic anticipation consists of analyzing the differences in behavior in OS game (a) and in the first round of the first supergame. The categories are [CC], [CD], [DC] and [DD].
So, for example, [CD] is an individual who played C in OS and then started the first supergame playing D. Figure 39 presents the results of this exercise. The left graph depicts each of the four probabilities by age group. The right table reports $\Pr[DC] / (\Pr[DC] + \Pr[DD])$, the likelihood of playing C in the first round of the first supergame among the subjects who played D in OS.

Left graph: proportion of subjects in categories [DD], [DC], [CD] and [CC] by age group (G1 to G5).
Right table:
Age group    Pr[DC] / (Pr[DC] + Pr[DD])
G1           0.06
G2           0.08
G3           0.50
G4           0.69
G5           0.80
Figure 39: Choice in OS game (a) and first round of first supergame by age group

There is a clear evolution in the "strategic anticipation" of the gains of initiating cooperation, with a highly significant increase in [DC] between the younger school-age subjects (G1 and G2) and the older school-age subjects (G3 and G4). These are individuals who are not willing to sacrifice money to benefit their partner in OS game (a) but realize the potential of starting a cooperative agreement in the supergame. Evidence of this strategic giving behavior is still more noticeable in our control population, where [DC] is higher than in any other age group (the difference is statistically significant with all groups except G4).¹⁴ The right table confirms those findings. Indeed, among the non-altruistic subjects (those who play D in OS), the percentage of strategic givers dramatically increases across age groups (all differences statistically significant except for G1 vs. G2, G3 vs. G4 and G4 vs. G5).
Result 3 Strategic anticipation of the gains of initiating cooperation increases with age.
4.3.5 The dual effect of altruism and first decisions
The previous sections have shown that altruism, strategic adaptation to past play and strategic anticipation of future play affect behavior in the supergames. Here we present a regression analysis to better assess the individual effects of these factors on cooperation across groups.
We first investigate how reciprocity relates to altruism and to the first decision in the supergame.
For this, we run an Ordinary Least Squares (OLS) regression of a player's Reciprocity ($\Pr(C_t \mid C_{t-1})$) on his choice in the one-shot game (a) (Altruism, coded 1 if the subject cooperated and 0 otherwise) and a dummy variable indicating if the first decision in the repeated game is C (Choice C first trial). We also control for age by including age dummies (Dummy G2 to Dummy G5), and for any difference between the supergames, by adding a dummy taking value 1 for observations from the second supergame (Dummy Supergame 2). We later include demographic dummies (Gender Female and Number of siblings). The results are reported in the first 2 columns of Table 23.
Consistent with previous evidence, older and more altruistic subjects reciprocate more. Subjects also reciprocate more if the game starts with a cooperative move, suggesting a key role of the first choice in the pair. To check this prediction, we compute for each pair of players in each supergame the Cooperation rate within the pair ($\Pr(C_t)$), and we regress it on the first decision in that pair controlling for the altruism of each player in the pair. The results are reported in the third column of Table 23. They indicate that, while altruism is important, the first choice is critical to establish cooperation. Overall, altruism and strategic anticipation (together with age) are the main drivers of cooperative behavior.
¹⁴ Confirming the previous results regarding our school-age subjects, the graph also shows that [CC] increases with age (all differences significant except for G1 vs. G2 and G3 vs. G4) and [DD] decreases with age (all differences significant except for G3 vs. G4). G5 behaves statistically like G3 and G4 in both cases. For subjects in G2, we also observe a significant fraction of subjects playing [CD]. Such puzzling behavior may be due to initial confusion or learning about one's preferences but, unfortunately, we do not have enough data to test this hypothesis.

                        Reciprocity   Reciprocity   Cooperation
Altruism                0.126         0.126
                        (0.030)       (0.030)
Altruism First Mover                                0.110
                                                    (0.028)
Altruism Second Mover                               0.139
                                                    (0.028)
Choice C first trial    0.231         0.232         0.307
                        (0.036)       (0.036)       (0.033)
Dummy G2                0.119         0.116         0.036
                        (0.056)       (0.056)       (0.040)
Dummy G3                0.293         0.291         0.141
                        (0.055)       (0.055)       (0.042)
Dummy G4                0.271         0.269         0.131
                        (0.060)       (0.061)       (0.049)
Dummy G5                0.452         0.451         0.365
                        (0.064)       (0.065)       (0.053)
Dummy Supergame 2       0.056         0.056         0.014
                        (0.028)       (0.028)       (0.025)
Gender Female                         -0.009
                                      (0.029)
Number of siblings                    0.003
                                      (0.012)
Constant                0.143         0.145         0.073
                        (0.045)       (0.050)       (0.031)
Adj. R^2                0.340         0.340         0.551
# observations          553           553           382
(standard errors in parentheses; the markers for significance at the 5%, 1% and 0.1% levels are not legible in this transcript)
Table 23: OLS regression of reciprocity and group cooperation rates across supergames

To better disentangle the importance of altruism and strategic considerations, we consider the first supergame and restrict our attention to the first choice of each player in the pair. We then run Probit regressions, which we report in Table 24. The first set of regressions (columns 1 and 2) looks at the 1st choice of the 1st mover as a function of his altruism and age group. The second set of regressions (columns 3 to 6) looks at the 1st choice of the 2nd mover, also as a function of his altruism and age group, as well as the 1st choice of the 1st mover.
We can see from the table that while the initial decision of the first mover in a supergame depends significantly on his altruism (and age), the initial decision of the second mover is mostly driven by the choice of the first mover (and his age).
Indeed, altruism loses significance in explaining the second mover's choice after controlling for the choice of the first mover. Besides, altruism does not interact significantly with cooperative first play, suggesting that altruistic second movers are not responding more positively to cooperative first play than non altruistic second movers.

                        1st choice of 1st mover   1st choice of 2nd mover
                        (1)       (2)       (3)       (4)       (5)       (6)
Altruism                0.667     0.693     0.576     0.518     0.296     0.283
                        (0.223)   (0.227)   (0.211)   (0.217)   (0.318)   (0.319)
1st choice                                            0.837     0.661     0.642
                                                      (0.241)   (0.303)   (0.306)
Altruism × 1st choice                                           0.417     0.422
                                                                (0.434)   (0.437)
Dummy G2                0.252     0.322     0.398     0.317     0.361     0.395
                        (0.397)   (0.410)   (0.360)   (0.365)   (0.366)   (0.370)
Dummy G3                1.612     1.668     1.308     0.890     0.935     0.959
                        (0.350)   (0.360)   (0.327)   (0.349)   (0.354)   (0.358)
Dummy G4                1.987     2.127     1.525     1.000     1.014     1.054
                        (0.398)   (0.413)   (0.367)   (0.398)   (0.402)   (0.408)
Dummy G5                2.329     2.374     1.798     1.224     1.265     1.268
                        (0.432)   (0.436)   (0.390)   (0.429)   (0.433)   (0.434)
Female                            -0.045                                  0.194
                                  (0.226)                                 (0.212)
Siblings                          -0.181                                  -0.012
                                  (0.097)                                 (0.097)
Constant                -1.571    -1.388    -1.357    -1.417    -1.369    -1.465
                        (0.307)   (0.354)   (0.271)   (0.272)   (0.275)   (0.317)
Pseudo R^2              0.335     0.349     0.226     0.272     0.276     0.279
# observations          191       191       191       191       191       191
(standard errors in parentheses; the markers for significance at the 5%, 1% and 0.1% levels are not legible in this transcript)
Table 24: Probit regression of first choice by First and Second mover

The main conclusion of this analysis is that, even though altruism and strategic considerations are both driving mechanisms of cooperative behavior, the ability to anticipate the benefits of cooperation and to initiate cooperation is crucial in establishing and sustaining cooperation in a group. Second movers respond positively to cooperative first play independently of their level of altruism.
4.3.6 Summary
Behavioral differences across age groups are the result of (at least) three factors: altruism, strategic adaptation to the partner's choice (reciprocity, forgiveness and learning to cooperate), and strategic anticipation of the benefits of cooperation. Our youngest subjects (G1) are neither altruistic nor strategic. They rarely cooperate in the one-shot games or the supergames. G2 is a more positive version of G1: some of them are altruistic and some can be prompted to cooperate. G3 and G4 are similar to each other. Half of them are altruistic and many are strategic: they are reciprocal, learn to cooperate if prompted and anticipate the gains of cooperation in repeated interactions. Finally, G5 follows the homo economicus template: they are typically not altruistic but recognize and fully exploit the mutual benefits of cooperation.
The results in this section are consistent with theories of cognitive development. Children between 2 and 7 years of age tend to be egocentric (Piaget and Inhelder, 1956; Hughes, 1975; Perner, 1991) and are not yet able to perform logical reasoning. Gradually, they acquire Theory of Mind (Wellman, Cross, and Watson, 2001; Wellman and Liu, 2004) and the ability to use logical reasoning (Sher et al., 2014). It is only after 12 years of age that children become able to reason hypothetically and in an abstract manner (Piaget, 1960, 1972; Rafetseder et al., 2013). In our framework, strategic thinking in the supergames requires the ability (i) to take the perspective of partners, (ii) to reason logically about their past moves and (iii) to use this information to reason abstractly about their future moves. Young children in G1, who are not yet thinking logically and are still not able to take a different perspective, are not equipped to think strategically. At the other extreme, students in G5 have acquired Theory of Mind and can think abstractly about the expected gains of cooperation.
Children in groups G2 to G4 are gradually acquiring those skills and become progressively better at thinking logically about the past behavior of their partners (strategic adaptation) and at thinking hypothetically about their future behavior (strategic anticipation).

Our findings suggest that cognitive abilities and social preferences have different developmental courses. Young children do not have the social or cognitive abilities to behave cooperatively. As they grow, children are motivated by prosociality to engage in cooperation, until they reach an age at which strategic reasoning becomes predominant and can alone support cooperation.

4.4 Empirical best response

Since choices are different at different ages, individuals who are sophisticated and anticipate the behavior of their peers are likely to make different (optimal) choices depending on their age group. In this section, we explore how to optimally behave against subjects of different ages. In section 4.4.1 we assume a minimally strategic choice rule of other players (Markov strategy) and determine the best response. In section 4.4.2, we consider a more sophisticated contingent planning by subjects in the experiment and, again, determine the best response to such behavior.

4.4.1 Best response to Markov behavior

Our first approach consists of studying the best response strategy of an individual who plays against the empirical Markov strategy of his age group. Notwithstanding the limitations of Markov choices (we have shown that our subjects respond to decisions beyond the partner's last move) as well as the rationality and extensive knowledge requirements for such a best response behavior, the exercise is nevertheless instructive. Indeed, it explores the possibility that non-equilibrium behavior is a best response to others' choices rather than a non-strategic choice. Assume that subjects in age group $i \in \{G1, \ldots, G5\}$ play a Markov strategy given by $\Pr(C_t \mid C_{t-1}) \equiv p_i$ and $\Pr(C_t \mid D_{t-1}) \equiv q_i$.
Facing a Markov partner, it is optimal to play 'always C' or 'always D.' Indeed, it is not in the subject's best interest to condition the choice in round $t$ on the partner's choice in $t-1$ since, by the Markov assumption, the partner in $t+1$ will decide between C and D only as a function of one's play in $t$ (independently of how the decision was reached). By playing always C and always D the expected payoff every two rounds for a subject in age group $i$ is, respectively:

$$V^i_C = 4 + \big[\,4p_i + (1-p_i)\,\big] \quad \text{and} \quad V^i_D = 6 + \big[\,4q_i + (1-q_i)\,\big]$$

where the first term is the payoff when the subject chooses and the second and third terms are the expected payoff when the partner chooses given his Markov behavior. [15] It is then immediate that:

$$V^i_C \gtreqless V^i_D \iff p_i \gtreqless \frac{2}{3} + q_i \qquad (1)$$

In Figure 40 we depict the empirical Markov behavior of each age group, pooling data from both supergames. The x-axis is $\Pr(C_t \mid C_{t-1}) \equiv p_i$ and the y-axis is $\Pr(C_t \mid D_{t-1}) \equiv q_i$. Each magnified dot represents the position of an age group in the $(p_i, q_i)$ space, with the vertical and horizontal lines representing the error bars in the corresponding dimension (data taken from Figure 36). The diagonal line represents all the pairs where the subject is indifferent between C and D ($V^i_C = V^i_D$). The best response behavior is C in the lower right corner below the diagonal and D otherwise, see (1). Intuitively, C is optimal only when the partner is likely to reciprocate ($\Pr(C_t \mid C_{t-1})$ high) and unlikely to forgive ($\Pr(C_t \mid D_{t-1})$ low). Finally, the table in the upper left corner of the graph presents the per-round difference in payoffs between playing 'always C' and playing 'always D' against a subject in age group $i$.
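The comparison between 'always C' and 'always D' against a Markov partner can be sketched in a few lines. The payoffs used are those of the game (4 or 6 when the subject chooses C or D, 4 or 1 when the partner chooses C or D); the function names and the numerical tolerance are illustrative choices, and the values of p and q in the usage lines are hypothetical rather than the estimated ones.

```python
from typing import Tuple


def two_round_values(p: float, q: float) -> Tuple[float, float]:
    """Expected payoff over two rounds from 'always C' and 'always D'.

    p = Pr(partner plays C | subject played C)
    q = Pr(partner plays C | subject played D)
    """
    v_c = 4 + (4 * p + 1 * (1 - p))  # own choice C, then partner's choice
    v_d = 6 + (4 * q + 1 * (1 - q))  # own choice D, then partner's choice
    return v_c, v_d


def per_round_gain_from_c(p: float, q: float) -> float:
    """Per-round payoff of 'always C' minus 'always D'."""
    v_c, v_d = two_round_values(p, q)
    return (v_c - v_d) / 2


def best_response(p: float, q: float, tol: float = 1e-12) -> str:
    """Best response to a Markov partner; equivalent to comparing p with 2/3 + q."""
    gain = per_round_gain_from_c(p, q)
    if gain > tol:
        return "always C"
    if gain < -tol:
        return "always D"
    return "indifferent"


if __name__ == "__main__":
    # Hypothetical partners: a strongly reciprocal one and an unconditional cooperator.
    print(best_response(0.9, 0.1), per_round_gain_from_c(0.9, 0.1))
    print(best_response(1.0, 1.0), per_round_gain_from_c(1.0, 1.0))
```

Since the gain simplifies to (3(p - q) - 2) / 2, cooperation pays only when the partner's reciprocity p exceeds forgiveness q by more than two thirds, which is condition (1).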
This per-round payoff difference is positive when $(p_i, q_i)$ is in the lower right corner below the diagonal (as for G5) and negative otherwise (as for G1 to G4). Its absolute value increases as we move away from the diagonal.

[15] For completeness, the payoff of playing the strategy $\Pr(C_t \mid C_{t-1}) = 1$ and $\Pr(C_t \mid D_{t-1}) = 0$ is $V^i_{CD} = \frac{q_i}{q_i + (1-p_i)} V^i_C + \frac{1-p_i}{q_i + (1-p_i)} V^i_D$, and the payoff of $\Pr(C_t \mid C_{t-1}) = 0$ and $\Pr(C_t \mid D_{t-1}) = 1$ is $V^i_{DC} = \frac{1-q_i}{(1-q_i) + p_i} V^i_C + \frac{p_i}{(1-q_i) + p_i} V^i_D$. $V^i_{CD}$ and $V^i_{DC}$ are always dominated by $\max\{V^i_C, V^i_D\}$.

[Figure 40: Best response to Markov strategy by age group (RP1 & RP2). x-axis: Pr(C_t | C_{t-1}); y-axis: Pr(C_t | D_{t-1}); the diagonal separates the region V_C > V_D from the region V_C < V_D. Inset table, payoff C-D by age group: G1 -0.88, G2 -0.5, G3 -0.2, G4 -0.24, G5 0.09.]

For our younger school-age subjects (G1 and G2) there is a significant per-round payoff loss from cooperation, whereas for our older school-age subjects (G3 and G4), the loss is more moderate. For our control population (G5), the best response involves cooperation. [16] The results suggest that while we know that the younger subjects are intrinsically less strategic and forward looking than their older peers, the observed differences in cooperation might be exacerbated due to group membership of the partner: the same self-interested, strategic, forward looking individual should optimally choose D when interacting with young children (G1 or G2) and C when interacting with adults (G5).

4.4.2 Best response to simple strategies

Since the observed actions strongly depend on past behavior, a natural step in the analysis is to determine the dynamic strategies employed by our subjects, and study the best response to such strategies.
We consider up to nine possible simple strategies, many of them frequently discussed in the recent repeated prisoner's dilemma literature: (1) closed-handed tit-for-tat (defect if first mover, then tit-for-tat); (2) open-handed tit-for-tat (cooperate if first mover, then tit-for-tat); (3) always defect; (4) always cooperate; (5) grim trigger (cooperate until partner defects, then defect forever); (6) alternating (alternate between defect and cooperate); (7) tit-for-two-consecutive-tats (cooperate unless opponent defects in each of the last two rounds, then defect once and revert to cooperation); (8) reverse tit-for-tat (choose opposite of the partner's preceding choice); and (9) reverse grim trigger (defect until partner cooperates, then cooperate forever). [17] Only six subjects (1.6%) play one of the last five strategies (this does not include subjects who play strategies that are consistent with one of the last five and also with one of the first four). We will therefore restrict our attention to the first four strategies, which we label as cT, oT, aD and aC. These strategies (together with grim trigger) are also the most commonly observed in the repeated prisoner's dilemma literature (Dal Bó and Fréchette, 2014).

[16] Needless to say, this is not an equilibrium: if subjects in an age group $i$ always cooperate ($p_i = 1$ and $q_i = 1$, upper right corner) the best response is to always defect.

With only two observations per subject (one string of actions in each supergame), the ensuing analysis is bound to be incomplete. [18] However, it can be instructive in determining whether subjects in different age groups behave according to simple strategies and, if so, which ones. [19] The left graph of Figure 41 presents a Venn diagram that depicts the number of subjects who play according to each of the four simple strategies described above. To be classified as using a strategy, the subject must conform to the same strategy in both supergames.
To allow small "mistakes", we include subjects who perfectly conform to a strategy and those who deviate once. Finally, we note that some sequences of actions are compatible with more than one strategy (for example, two subjects who always cooperate with each other may be playing aC or oT). The Venn diagram accounts for this possibility by locating the agent at the intersection of all the strategies compatible with the choices. [20][21] The table to the right of the diagram summarizes the percentage of subjects within each age group who conform to these strategies, ordered from least to most cooperative, that is, aD to aC. [22]

[17] Tit-for-tat has a slightly different interpretation in this game of perfect information than in the traditional (game of imperfect information) prisoner's dilemma. Naturally, (1) and (2) are indistinguishable for second movers.
[18] Insufficient data due to a low number of supergames may partly account for the absence of grim trigger behavior in our sample.
[19] For an informative and interesting (but substantially more complex) strategy elicitation method in repeated prisoner's dilemma, see Romero and Rosokha (2016).
[20] Subjects are classified based on the closest strategy. So, if a behavior is compatible with one strategy given no deviation and another strategy given one deviation, then it classifies the agent only in the strategy given no deviation.
[21] We use this simple method because we do not have enough supergames to conduct a more sophisticated econometric estimation of strategies like the ones proposed by Dal Bó and Fréchette (2011) or Camera, Casari, and Bigoni (2012) for example.
[22] We performed the same analysis separately on each supergame and obtained similar results.

Approximately 65% and 85% of subjects in G1 and G5 respectively conform to a precise strategy (often compatible with aD and aC respectively) whereas the percentage is
significantly smaller (around 45%) for the other age groups. This is not surprising, since we already knew from Figure 37 that G2, G3 and G4 have the most malleable behavior, that is, they are most willing to change their action in response to a change in their partner's choice.

[Figure 41, left panel: Venn diagram of the number of subjects playing according to aD, cT, oT and aC.]

Compatible strategies    G1     G2     G3     G4     G5
aD                       .34    .25    .10    .06    .06
cT, aD                   .20    .09    .00    .00    .02
oT, cT, aD               .08    .03    .00    .02    .00
cT                       .04    .01    .04    .02    .00
oT, cT                   .00    .00    .06    .05    .02
oT                       .00    .03    .08    .11    .10
cT, oT, aC               .00    .00    .02    .03    .17
oT, aC                   .00    .03    .09    .13    .46
aC                       .01    .03    .06    .05    .02
other                    .33    .55    .56    .55    .15

Figure 41: Strategies (1 deviation allowed)

Interestingly, among the children who do not conform to any strategy, those in age group G3 forgive significantly more (p-value = 0.007) and also prompt significantly more (p-value = 0.036) than those in G2. At that age, a developmental turning point occurs that makes cooperation become sustainable.

More generally, there is a gradual transition during childhood from a selfish/myopic to a cooperative/forward looking strategy. Most subjects in G1 play a strategy compatible with aD, as mentioned above. G2 is a weaker version of G1, with a small fraction of subjects choosing a strategy compatible with aC. Neither of these age groups has a significant fraction of subjects whose behavior is consistent with oT and/or cT (but not aD or aC). We find the opposite tendency in G3, G4 and G5, where only a few subjects have strategies compatible with aD and the majority have strategies compatible with oT and/or cT.

Having determined the strategies of our subjects, we can now study the best response to these empirical choices. The analysis is trickier than it seems at first, and requires some judgment calls. For each age group $i$ we compute the proportion of subjects $\alpha^i_s$ who use each of the four main strategies $s \in S = \{aD, cT, oT, aC\}$.
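The classification that feeds these proportions can be sketched as follows. This is an illustrative reconstruction, not the procedure actually coded for the study: each supergame is represented as the subject's own moves paired with the partner's preceding moves (None for the opening move of a first mover), the single tolerated deviation is counted across both supergames taken together (an assumption, since it could also be read as one per supergame), and ties are resolved with the closest-strategy rule. All names are hypothetical.

```python
from typing import FrozenSet, Optional, Sequence, Tuple

STRATEGIES = ("aD", "cT", "oT", "aC")

# (own moves, partner's preceding moves); None marks a first mover's opening move.
Supergame = Tuple[Sequence[str], Sequence[Optional[str]]]


def prescribed_move(strategy: str, partner_prev: Optional[str]) -> str:
    """Move that a strategy prescribes given the partner's preceding move."""
    if strategy == "aD":
        return "D"
    if strategy == "aC":
        return "C"
    if partner_prev is None:
        # Opening move: open-handed starts with C, closed-handed with D.
        return "C" if strategy == "oT" else "D"
    return partner_prev  # tit-for-tat: copy the partner's preceding move


def count_deviations(strategy: str, supergame: Supergame) -> int:
    """Number of own moves that differ from what the strategy prescribes."""
    own, prev = supergame
    return sum(move != prescribed_move(strategy, p) for move, p in zip(own, prev))


def classify(supergames: Sequence[Supergame], max_deviations: int = 1) -> FrozenSet[str]:
    """Set of closest compatible strategies; empty set means 'other'."""
    totals = {s: sum(count_deviations(s, g) for g in supergames) for s in STRATEGIES}
    fewest = min(totals.values())
    if fewest > max_deviations:
        return frozenset()
    # Closest-strategy rule: keep only the strategies with the fewest deviations.
    return frozenset(s for s, d in totals.items() if d == fewest)


if __name__ == "__main__":
    as_first = (["C", "C", "C", "C"], [None, "C", "C", "C"])
    as_second = (["C", "C", "C", "C"], ["C", "C", "C", "C"])
    print(sorted(classify([as_first, as_second])))
    print(sorted(classify([as_second])))
```

A subject who always cooperates with a cooperating partner is placed in the oT and aC intersection when observed as a first mover, and in the cT, oT, aC intersection when observed only as a second mover, because the two tit-for-tat variants cannot be told apart without an opening move.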
Since we do not have a good theory on how to treat subjects who do not fall in any of these categories, we ignore other strategies and assume that partners in age group $i$ play strategy $s$ with probability $\hat{\alpha}^i_s \equiv \alpha^i_s / \sum_{s \in S} \alpha^i_s$. [23] Also, when a subject falls at the intersection of several strategies, we assign him to the one which is most responsive to the partner's choice. This means that oT and cT take precedence over aC and aD; in the case that both oT and cT are consistent with the subject's choice, we assume that each of them is played with equal probability. [24]

[23] This is especially problematic in age-groups G2 to G4 where the aforementioned strategies only account for about one-half of our subjects.

Table 25 presents for each age group the payoff of a subject who follows one of the strategies given that the partner plays each strategy with the empirically observed probabilities $\hat{\alpha}^i_{aD}, \hat{\alpha}^i_{cT}, \hat{\alpha}^i_{oT}, \hat{\alpha}^i_{aC}$. We report per-round payoffs assuming that the subject is the first or second mover with equal probability and that he uses the same strategy in both cases. We also report in brackets the proportion of times the subject plays C and the proportion of times the partner plays C. [25]

        G1                 G2                 G3                 G4                 G5
aD      3.53 [.00, .02]    3.60 [.00, .07]    3.75 [.00, .17]    3.71 [.00, .14]    3.61 [.00, .07]
cT      3.52 [.05, .05]    3.57 [.12, .13]    3.70 [.37, .38]    3.71 [.41, .41]    3.71 [.41, .41]
oT      3.61 [.31, .28]    3.63 [.36, .33]    3.83 [.71, .70]    3.89 [.80, .79]    3.93 [.86, .86]
aC      3.20 [1.0, .46]    3.16 [1.0, .44]    3.65 [1.0, .77]    3.78 [1.0, .85]    3.88 [1.0, .92]

Per-round payoff [prob. subject plays C, prob. partner plays C] (best response highlighted in bold in the original; it is the oT row in every column)
Table 25: Best response to simple strategies by age group.

There are several interesting conclusions from this table. The best response strategy is oT for all age groups.
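The assignment and normalization rules described above can be illustrated with the shares reported in the table of Figure 41. The sketch below is a reconstruction under the stated rules (responsive strategies oT and cT take precedence, equal split when both are compatible, 'other' subjects dropped); it does not reproduce the simulation behind Table 25, and the function and variable names are hypothetical.

```python
from typing import Dict, List

GROUPS = ("G1", "G2", "G3", "G4", "G5")

# Shares of subjects by set of compatible strategies, from the Figure 41 table.
# The 'other' row is left out because those subjects are ignored.
SHARES: Dict[frozenset, List[float]] = {
    frozenset({"aD"}): [.34, .25, .10, .06, .06],
    frozenset({"cT", "aD"}): [.20, .09, .00, .00, .02],
    frozenset({"oT", "cT", "aD"}): [.08, .03, .00, .02, .00],
    frozenset({"cT"}): [.04, .01, .04, .02, .00],
    frozenset({"oT", "cT"}): [.00, .00, .06, .05, .02],
    frozenset({"oT"}): [.00, .03, .08, .11, .10],
    frozenset({"cT", "oT", "aC"}): [.00, .00, .02, .03, .17],
    frozenset({"oT", "aC"}): [.00, .03, .09, .13, .46],
    frozenset({"aC"}): [.01, .03, .06, .05, .02],
}
RESPONSIVE = frozenset({"cT", "oT"})


def strategy_probabilities(group: str) -> Dict[str, float]:
    """Normalized probability that a partner in `group` plays each strategy."""
    column = GROUPS.index(group)
    weights = {"aD": 0.0, "cT": 0.0, "oT": 0.0, "aC": 0.0}
    for compatible, shares in SHARES.items():
        # Responsive strategies take precedence; split equally if both apply.
        targets = (compatible & RESPONSIVE) or compatible
        for strategy in targets:
            weights[strategy] += shares[column] / len(targets)
    total = sum(weights.values())
    return {strategy: weight / total for strategy, weight in weights.items()}


if __name__ == "__main__":
    for g in GROUPS:
        probs = strategy_probabilities(g)
        print(g, {s: round(v, 3) for s, v in probs.items()})
```

With these shares the weight on oT rises from roughly 6% in G1 to roughly 77% in G5, which is why the same strategy produces such different outcomes across age groups.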
The result is in line with the seminal findings of Axelrod (2006), who highlights the desirable properties of tit-for-tat when confronted by a heterogeneous population: although it is not subgame perfect, it is a strategy that promotes cooperation, punishes deviation and forgives easily. Despite the fact that the optimal strategy is the same in all age groups, outcomes vary widely. Playing oT against our younger school-age subjects (G1 and G2) would result in the cooperative outcome only 28% to 36% of the time, and therefore in payoffs above but not far from those under sustained defection. By contrast, playing oT against our older school-age and control groups (G3, G4 and G5) would result in the cooperative outcome 70% to 86% of the time, reaching payoffs below but reasonably close to those under full cooperation.

It is also worth noting that payoff differences between oT and cT are significant in all age groups, even though these strategies are only distinct one-half of the time (when the subject is a first mover). This payoff difference supports the finding in Table 24, which emphasized the key importance of the first decision in the supergame.

[24] There are many other possibilities: equal likelihood of all strategies consistent with behavior, assign subjects to the least responsive strategy, etc. We opted for the most responsive as it magnifies the effect of the subject's action on the partner's choice.
[25] Naturally per-round payoffs can be computed from those proportions alone, since each time a subject plays C (D) he obtains 4 (6) and each time the partner plays C (D) he obtains 4 (1).

Finally, we can see from the table in Figure 41 that the proportion of subjects who use oT, the best response strategy to the empirical behavior of others, increases with age: 8% of participants in G1, 9% in G2, 25% in G3, 35% in G4 and 75% in G5 adopt a strategy compatible with oT.
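The identity in footnote 25 can be checked directly against Table 25. The sketch below recomputes each per-round payoff from the two bracketed proportions and compares it with the reported value; because the proportions are rounded to two decimals, agreement is only expected up to about one hundredth. The dictionary layout and function names are illustrative.

```python
from typing import Dict, Tuple

# Table 25: (reported payoff, prob. subject plays C, prob. partner plays C).
TABLE_25: Dict[str, Dict[str, Tuple[float, float, float]]] = {
    "G1": {"aD": (3.53, .00, .02), "cT": (3.52, .05, .05),
           "oT": (3.61, .31, .28), "aC": (3.20, 1.0, .46)},
    "G2": {"aD": (3.60, .00, .07), "cT": (3.57, .12, .13),
           "oT": (3.63, .36, .33), "aC": (3.16, 1.0, .44)},
    "G3": {"aD": (3.75, .00, .17), "cT": (3.70, .37, .38),
           "oT": (3.83, .71, .70), "aC": (3.65, 1.0, .77)},
    "G4": {"aD": (3.71, .00, .14), "cT": (3.71, .41, .41),
           "oT": (3.89, .80, .79), "aC": (3.78, 1.0, .85)},
    "G5": {"aD": (3.61, .00, .07), "cT": (3.71, .41, .41),
           "oT": (3.93, .86, .86), "aC": (3.88, 1.0, .92)},
}


def per_round_payoff(own_c: float, partner_c: float) -> float:
    """Average payoff per round when the two players alternate decisions."""
    own_turn = 4 * own_c + 6 * (1 - own_c)              # subject decides
    partner_turn = 4 * partner_c + 1 * (1 - partner_c)  # partner decides
    return (own_turn + partner_turn) / 2


def largest_gap() -> float:
    """Largest absolute gap between recomputed and reported payoffs."""
    return max(abs(per_round_payoff(a, b) - reported)
               for row in TABLE_25.values()
               for reported, a, b in row.values())


def best_responses() -> Dict[str, str]:
    """Strategy with the highest reported payoff in each age group."""
    return {g: max(row, key=lambda s: row[s][0]) for g, row in TABLE_25.items()}


if __name__ == "__main__":
    print(round(largest_gap(), 3))
    print(best_responses())
```

The same function returns the two benchmarks used later for earnings: 4.0 per round under sustained mutual cooperation and 3.5 under sustained mutual defection.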
This age gradient suggests that participants develop correct beliefs about others over time and gradually learn to best respond to their behavior.

4.4.3 Summary

This section highlights the similarities and differences between age groups. While the best response strategy is identical in all cases (oT), the resulting behavior is widely different: the same selfish, rational forward-looking subject should (optimally) defect when paired with a younger partner and cooperate when paired with an older partner. [26] The observed differences in the levels of cooperation across age groups are therefore the consequence of two mutually reinforcing factors: (i) the differences in preferences and level of strategic reasoning and (ii) the anticipation of the differences in the partners' preferences and level of strategic reasoning. Stated differently, younger children are less strategic and forward looking than older children and adults, which explains their lower levels of cooperation. However, even the young children who are strategic and forward looking should optimally behave less cooperatively than their older peers. This conclusion is summarized below.

Result 4 Differences in cooperation across ages are magnified by group effects, that is, by the anticipation that partners in different age groups have different tendencies to cooperate.

4.5 Payoffs

We next study earnings. Figure 42 displays the average per-round payoff by age group both unconditional (left graph) and conditional on the subject's behavior in the one-shot game (right graph), pooling data from both supergames. We also report as benchmark horizontal lines the theoretical per-round payoffs that subjects would obtain under sustained mutual cooperation (4.0) and sustained mutual defection (3.5). Since cooperation increases with age (Figure 35), it is not surprising that payoffs also increase with age (differences significant except for G3 vs. G4). Payoffs within an age group depend on the subject's behavior in the one-shot game.
Within our youngest school-age subjects (G1), those who play C in the one-shot game earn less than those who play D: altruism is exploited by their peers. Within our oldest school-age group (G4), the pattern is reversed: subjects who play C in the one-shot game are more likely to start a mutually advantageous agreement, earning more than those who play D. Finally, for our control adult group, behavior in the one-shot game has no predictive power on overall gains in the supergames: subjects who play C in the one-shot game earn as much as those who play D. This is similar to the findings in Dreber et al. (2014). In our case, the reason is that independently of altruism, the vast majority of the adults succeed in coordinating on the mutually advantageous strategy.

[26] This result is based on indirect evidence. To obtain direct evidence, one should include a design that mixes subjects of different age groups. Although a fascinating possibility, implementing such a design has its challenges. Indeed, our experience is that kids behave differently against partners of different ages for reasons that extend beyond strict game theoretic considerations (for example, they can be shy or impressed when they face an older kid). For this reason, we decided against a mixed-age treatment.

[Figure 42: Per-round payoffs by age group: unconditional (left) and as a function of choice in OS (right). x-axis: age group (G1 to G5); y-axis: per-round payoff (3.0 to 4.5). Graphs show standard error bars and number of subjects per age group. Left: G1 79, G2 77, G3 114, G4 64, G5 48. Right, C in OS: 11, 27, 57, 38, 19; D in OS: 68, 50, 57, 26, 29.]

While the behavior in the one-shot game is an indicator of the subject's altruism, the behavior in the first round of the first supergame captures (to a certain extent) the willingness to initiate cooperation. To study how the choice in the first round affects the long run payoff of subjects, we perform the following analysis.
We divide the sample into supergames where the first mover chose C in the first round and those where he chose D. We then compute the per-round payoff of the subjects from round 3 on (3 to 16 for the first supergame and 3 to 12 for the second) pooling both supergames together. [27] Figure 43 presents these average per-round payoffs from the perspective of the first (left graph) and second (right graph) mover.

[27] We remove round 1 to avoid an artificial difference in average payoffs due to the difference in behavior on the variable we are conditioning on (choice in round 1). We also remove round 2 to make sure that we count the same number of rounds as dictator and recipient for all subjects, otherwise the payoffs of first and second movers are not comparable.

[Figure 43: Per-round payoffs of first (left) and second (right) mover from round 3 on, and as a function of choice in round 1. x-axis: age group (G1 to G5); y-axis: per-round payoff (3.0 to 4.5). Graphs show standard error bars and number of subjects per age group. Counts by 1st round choice (C / D), first mover: G1 4 / 74, G2 17 / 61, G3 73 / 41, G4 47 / 17, G5 42 / 6; second mover: G1 4 / 76, G2 17 / 59, G3 73 / 41, G4 47 / 17, G5 42 / 6.]

For school-age subjects who moved first, differences in per-round payoffs after round 3 are not statistically significant between those who played C and those who played D in round 1. However, it is interesting to see that subjects in G1 and G2, if anything, earn less when they start playing C, as they get exploited (though the number of observations is small and the difference is not significant at the conventional 5% level). By contrast, the first action is crucial in G5: starting with C results almost invariably in sustained cooperation whereas starting with D results also almost invariably in sustained defection. This reinforces the results of section 4.3.2 where we found that subjects in our control population could not be prompted to cooperate (Figure 37).
The result is also consistent with the developmental turning point around G3, as mentioned earlier: playing C in the first round pays off when children are capable of adapting their behavior to the behavior of others. Differences are starker for second movers, where subjects in all age groups significantly benefit from a partner who starts by playing C, either by taking advantage of them (younger school-age subjects) or efficiently coordinating in mutual cooperation (older school-age subjects and control group).

More generally, the analysis of payoffs supports the findings of section 4.4, where we showed that identical actions have different consequences depending on the age of the partner. More specifically, subjects who are altruistic or strategic givers tend to obtain high rents against older partners but low rents against younger ones.

To better disentangle the impact of altruism and strategic motives on rents, we conduct an OLS regression of the per-round payoff of all subjects from round 3 on. Altruism is again measured by the choice in the one-shot game (a) and strategic motives are captured by the choice in the first round of the supergame. The results are reported in Table 26. As we can see from the first column, altruism has a moderate positive impact on payoffs, through the likelihood of engaging in a long term cooperative agreement.

                          Per-round payoff
Altruism                   0.065    -0.011    -0.010
                          (0.025)   (0.025)   (0.025)
Choice C first trial                 0.174     0.174
                                    (0.030)   (0.030)
Dummy G2                             0.029     0.028
                                    (0.037)   (0.037)
Dummy G3                             0.097     0.098
                                    (0.038)   (0.038)
Dummy G4                             0.102     0.105
                                    (0.044)   (0.045)
Dummy G5                             0.191     0.194
                                    (0.048)   (0.048)
Dummy Supergame 2          0.010     0.007     0.007
                          (0.025)   (0.023)   (0.023)
Gender Female                                 -0.020
                                              (0.024)
Number of siblings                            -0.005
                                              (0.010)
Constant                   3.690     3.562     3.578
                          (0.020)   (0.028)   (0.032)
Adj. R2                    0.009     0.141     0.142
obs.                       764       764       764

(standard errors in parentheses) *, **, *** = significant at 5%, 1% and 0.1% level
Table 26: OLS regression of per-round payoffs across supergames
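As a reading aid for Table 26, the sketch below turns the second column into predicted per-round payoffs for a few hypothetical profiles. It assumes that Altruism and "Choice C first trial" enter as 0/1 indicators and that G1 and the first supergame are the omitted categories; these are assumptions about the coding, and the names are illustrative.

```python
# Point estimates copied from the second column of Table 26.
SECOND_COLUMN = {
    "constant": 3.562,
    "altruism": -0.011,
    "c_first_trial": 0.174,
    "G2": 0.029,
    "G3": 0.097,
    "G4": 0.102,
    "G5": 0.191,
    "supergame_2": 0.007,
}
GROUPS = ("G1", "G2", "G3", "G4", "G5")


def predicted_payoff(group: str, altruist: bool, c_first_trial: bool,
                     supergame_2: bool = False) -> float:
    """Fitted per-round payoff for one profile (assumed 0/1 coding)."""
    if group not in GROUPS:
        raise ValueError(f"unknown age group: {group}")
    value = SECOND_COLUMN["constant"]
    if group != "G1":  # G1 is taken to be the baseline group
        value += SECOND_COLUMN[group]
    if altruist:
        value += SECOND_COLUMN["altruism"]
    if c_first_trial:
        value += SECOND_COLUMN["c_first_trial"]
    if supergame_2:
        value += SECOND_COLUMN["supergame_2"]
    return value


if __name__ == "__main__":
    print(round(predicted_payoff("G1", altruist=False, c_first_trial=False), 3))
    print(round(predicted_payoff("G5", altruist=False, c_first_trial=True), 3))
    print(round(predicted_payoff("G5", altruist=True, c_first_trial=True), 3))
```

Under these assumptions, moving from the youngest group with an opening D to the adult group with an opening C shifts the fitted payoff from close to the mutual-defection benchmark toward the mutual-cooperation benchmark, while the altruism indicator moves it by about one hundredth.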
However, when we include age and strategic considerations, we observe that being in an older age group and being in a pair that cooperates from the outset become the main determinants of payoffs. Once we control for these factors, the effect of altruism on rents disappears.

4.6 Concluding remarks

In this study, we have investigated developmental aspects of efficient but costly sharing in dynamic relationships. We have identified three main drivers: altruism, strategic adaptation to partner's decisions and strategic anticipation of cooperative gains.

It is interesting to note that only in the older age groups a significant fraction of subjects best respond to the empirical distribution of play in their group. These findings are reminiscent of recent studies showing that observed behavior differs as a function of the players' expertise (Palacios-Huerta and Volij, 2009) or IQ (Proto, Rustichini, and Sofianos, 2016). As in this literature, subjects in different ability categories (in our case, age groups) play differently. Those in the highest ability category are the most strategic, both in adapting to the choices of others as well as in anticipating what the behavior of the partners will be. This in turn suggests that there might exist a cognitive link between the ability to think hypothetically about future consequences and the ability to form beliefs about others. Children who simply adapt to past play are not only unable to assess hypothetical future actions but also to assess the likelihood that others might choose those actions. We conjecture that, as hypothetical thinking develops, the abilities to foresee, best respond and form correct beliefs about others develop jointly.

While the connection between Theory of Mind and game theory is well established (Singer and Fehr, 2005), our understanding of the logic required to perform well in games is still imperfect.
First, logical thinking is multifaceted and social interactions require different types of reasoning (to make correct inferences and deductions, to anticipate future outcomes, and to logically best-respond to what is inferred and deduced). Second, logical thinking in strategic settings varies in complexity. As summarized for example in Camerer (2003), the body of experimental evidence indicates that decision-makers are able to play close to Nash easily in some games (e.g., coordination games), if they are given enough learning opportunities in some others (e.g., guessing games), and rarely in yet other cases (e.g., games of asymmetric information). This difference in the likelihood of reaching the equilibrium is likely due to differences in the logic required to solve those games and the complexity involved. It is, however, difficult to assess which type of logical ability is lacking in adults, because behavior reflects the interplay of all of one's abilities, some of them perhaps not fully acquired. By studying the development of strategic thinking, it becomes possible to assess the contributions of different logical abilities to strategic behavior. From our study, it seems that hypothetical thinking is key for sustaining cooperation and plays a bigger role than altruism. Further studies on strategic thinking in children should prove helpful to build behavioral models capable of both explaining and predicting the heterogeneity in behavior observed in experimental studies.

Last but not least, understanding how children and adolescents reason about choices and make strategic decisions is crucial for designing policies around school-age children and adolescents, and for enhancing a favorable educational environment. Perhaps the most important findings relate to the heterogeneity of behavior across ages due to the differences in perspectives, motivations and logical abilities to devise strategies.
This means for instance that interventions aimed at regulating interactions between children (e.g., bullying) should control for age, as we cannot assign the same intentions to a young child as to an adolescent. In our study, non-cooperative behavior can be due to a lack of altruism or an inability to anticipate the gains of cooperation. Adults tend to associate anti-social behavior with an impaired capacity to relate emotionally to others, and they usually frame that behavior negatively. Our study suggests that young children may behave in an anti-social manner simply because they are not able to draw logical conclusions about what may happen under different scenarios. Understanding motivations behind the actions of children may be helpful for determining whether intervention in the regulation of a conflict should take the form of a stick (a punishment for not relating emotionally to others) or a carrot (an explanation for why thinking thoroughly would yield a superior outcome).

4.7 References

Axelrod, R. M. (2006). The evolution of cooperation. Basic Books.
Blake, P. R. and D. G. Rand (2010). Currency value moderates equity preference among young children. Evolution and Human Behavior 31 (3), 210–218.
Blake, P. R., D. G. Rand, D. Tingley, and F. Warneken (2015). The shadow of the future promotes cooperation in a repeated prisoner's dilemma for children. Scientific Reports 5.
Brocas, I. and J. D. Carrillo (2016). Preschoolers can think strategically. Working Paper.
Brocas, I. and J. D. Carrillo (2017). The development of rational thinking from kindergarten to adulthood. Working Paper.
Brosig-Koch, J., T. Heinrich, and C. Helbach (2012). Exploring the capability to backward induct: an experimental study with children and young adults. Ruhr Economic Paper.
Camera, G., M. Casari, and M. Bigoni (2012). Cooperative strategies in anonymous economies: An experiment. Games and Economic Behavior 75 (2), 570–586.
Camerer, C. (2003).
Behavioral game theory: Experiments in strategic interaction. Princeton University Press.
Carpendale, J. I. and C. Lewis (2004). Constructing an understanding of mind: The development of children's social understanding within social interaction. Behavioral and Brain Sciences 27 (01), 79–96.
Czermak, S., F. Feri, D. Glätzle-Rützler, and M. Sutter (2016). How strategic are children and adolescents? Experimental evidence from normal-form games. Journal of Economic Behavior & Organization.
Dal Bó, P. (2005). Cooperation under the shadow of the future: experimental evidence from infinitely repeated games. The American Economic Review 95 (5), 1591–1604.
Dal Bó, P. and G. R. Fréchette (2011). The evolution of cooperation in infinitely repeated games: Experimental evidence. The American Economic Review 101 (1), 411–429.
Dal Bó, P. and G. R. Fréchette (2014). On the determinants of cooperation in infinitely repeated games: A survey. Available at SSRN 2535963.
Dreber, A., D. Fudenberg, and D. G. Rand (2014). Who cooperates in repeated games: The role of altruism, inequity aversion, and demographics. Journal of Economic Behavior & Organization 98, 41–55.
Engel, C. (2011). Dictator games: A meta study. Experimental Economics 14 (4), 583–610.
Fabes, R. A. and N. Eisenberg (1998). Meta-analyses of age and sex differences in children's and adolescents' prosocial behavior. Handbook of Child Psychology, 3.
Feeney, A. and E. Heit (2007). Inductive reasoning: Experimental, developmental, and computational approaches. Cambridge University Press.
Fehr, E., H. Bernhard, and B. Rockenbach (2008). Egalitarianism in young children. Nature 454 (28), 1079–1084.
Fehr, E., D. Glätzle-Rützler, and M. Sutter (2013). The development of egalitarianism, altruism, spite and parochialism in childhood and adolescence. European Economic Review 64, 369–383.
Fréchette, G. R. and S. Yuksel (2013). Infinitely repeated games in the laboratory: Four perspectives on discounting and random termination.
Available at SSRN 2225331.
Fudenberg, D., D. G. Rand, and A. Dreber (2012). Slow to anger and fast to forgive: Cooperation in an uncertain world. The American Economic Review 102 (2), 720–749.
Hoffman, M. L. (2001). Empathy and moral development: Implications for caring and justice. Cambridge University Press.
House, B., J. Henrich, B. Sarnecka, and J. B. Silk (2013). The development of contingent reciprocity in children. Evolution and Human Behavior 34 (2), 86–93.
Hughes, M. (1975). Egocentrism in preschool children. Ph.D. thesis, Edinburgh University.
Jordan, J. J., K. McAuliffe, and F. Warneken (2014). Development of in-group favoritism in children's third-party punishment of selfishness. Proceedings of the National Academy of Sciences 111 (35), 12710–12715.
Lergetporer, P., S. Angerer, D. Glätzle-Rützler, and M. Sutter (2014). Third-party punishment increases cooperation in children through (misaligned) expectations and conditional cooperation. Proceedings of the National Academy of Sciences 111 (19), 6916–6921.
Palacios-Huerta, I. and O. Volij (2009). Field centipedes. The American Economic Review 99 (4), 1619–1635.
Perner, J. (1991). Understanding the representational mind. The MIT Press.
Piaget, J. (1960). The Psychology of Intelligence. Totowa, NJ: Littlefield Adams & Co.
Piaget, J. (1972). Intellectual evolution from adolescence to adulthood. Human Development 15 (1), 1–12.
Piaget, J. and B. Inhelder (1956). The Child's Conception of Space. Routledge and Kegan Paul, Ltd., London, England.
Premack, D. and G. Woodruff (1978). Does the chimpanzee have a theory of mind? Behavioral and Brain Sciences 1 (04), 515–526.
Proto, E., A. Rustichini, and A. Sofianos (2016). Intelligence, personality and gains from cooperation in repeated interactions. Working Paper.
Rafetseder, E., M. Schwitalla, and J. Perner (2013). Counterfactual reasoning: From childhood to adulthood. Journal of Experimental Child Psychology 114 (3), 389–404.
Romero, J. and Y. Rosokha (2016).
Constructing strategies in indefinitely repeated prisoner's dilemma. Working Paper.
Sher, I., M. Koenig, and A. Rustichini (2014). Children's strategic theory of mind. Proceedings of the National Academy of Sciences 111(37), 13307-13312.
Singer, T. and E. Fehr (2005). The neuroeconomics of mind reading and empathy. American Economic Review 95(2), 340-345.
Sutter, M. and M. G. Kocher (2007). Trust and trustworthiness across different age groups. Games and Economic Behavior 59(2), 364-382.
Wellman, H. M., D. Cross, and J. Watson (2001). Meta-analysis of theory-of-mind development: the truth about false belief. Child development 72(3), 655-684.
Wellman, H. M. and D. Liu (2004). Scaling of theory-of-mind tasks. Child development 75(2), 523-541.

4.8 Appendix

4.8.1 Appendix A: analysis of one-shot games (b), (c) and (d)

We can use the one-shot games to study other-regarding preferences. We define the same five types as in Fehr et al. (2008, 2013) depending on the choices in games (b), (c) and (d): "strongly egalitarian" (choices (2,2); (2,2); (2,2)), "weakly egalitarian" (choices (2,2); (2,2); (4,0)), "strongly generous" (choices (2,2); (2,4); (2,2)), "weakly generous" (choices (2,2); (2,4); (4,0)), and "spiteful" (choices (2,0); (2,2); (4,0)). Fig. 44 reports the proportion of subjects who belong to each of these five categories in Fehr et al. (2008, 2013) (upper graphs) and in our study (lower graph).

[Figure 44: Evolution of other-regarding preferences. Three panels, each with a vertical axis from 0 to 100 and the categories spiteful, weakly generous, strongly generous, weakly egalitarian and strongly egalitarian. Fehr et al. (2008): ages 3y-4y, 5y-6y, 7y-8y. Fehr et al. (2013): ages 8y-9y, 10y-11y, 12y-13y, 14y-15y, 16y-17y. This study: G1 (5y-7y), G2 (8y-10y), G3 (11y-13y), G4 (14y-17y), G5 (18y-24y).]

The results between the papers are not directly comparable since the methods and age categories are not identical.
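The five-type classification defined in Appendix A can be sketched as a simple lookup. This is only an illustrative sketch, not the code used in the thesis: the names `TYPE_BY_CHOICES`, `classify` and `type_shares` are hypothetical, each choice is assumed to be written as (tokens for me, tokens for other), and the "unclassified" label for choice patterns outside the five listed types is an assumption, since the text does not say how such subjects are treated.

```python
from collections import Counter

# Each key lists the options chosen in games (b), (c) and (d), in that order.
# Assumption: a choice is written as (tokens for me, tokens for other).
TYPE_BY_CHOICES = {
    ((2, 2), (2, 2), (2, 2)): "strongly egalitarian",
    ((2, 2), (2, 2), (4, 0)): "weakly egalitarian",
    ((2, 2), (2, 4), (2, 2)): "strongly generous",
    ((2, 2), (2, 4), (4, 0)): "weakly generous",
    ((2, 0), (2, 2), (4, 0)): "spiteful",
}


def classify(choice_b, choice_c, choice_d):
    """Return the type for one subject's three one-shot choices.

    Patterns outside the five listed types get the hypothetical
    label "unclassified".
    """
    return TYPE_BY_CHOICES.get((choice_b, choice_c, choice_d), "unclassified")


def type_shares(subjects):
    """Return the share of subjects of each type within one age group."""
    counts = Counter(classify(*choices) for choices in subjects)
    total = len(subjects)
    return {label: count / total for label, count in counts.items()}


if __name__ == "__main__":
    # Made-up choices for four subjects, for illustration only.
    sample_group = [
        ((2, 2), (2, 2), (2, 2)),
        ((2, 2), (2, 4), (4, 0)),
        ((2, 0), (2, 2), (4, 0)),
        ((2, 2), (2, 4), (4, 0)),
    ]
    print(type_shares(sample_group))
```

Computing `type_shares` separately for each age group would give the kind of per-group proportions that Fig. 44 plots.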
Also, notice that behavior in the overlapping categories 7y-8y (Fehr et al., 2008) and 8y-9y (Fehr et al., 2013) is somewhat different. Despite the caveats, we find a remarkable consistency across studies in the evolution of behavior with age (if not in the levels). In all three studies, spite is found to decrease with age (though less noticeably in Fehr et al. (2008)). Children's spite can be interpreted as a reflection of their tendency to focus on one aspect of the decision (tokens for me) instead of incorporating both aspects (tokens for me and tokens for other), consistent with the centration hypothesis. As they grow, they develop some integrative reasoning and become more egalitarian and less spiteful. Starting at age 12-13, egalitarianism is progressively replaced by generosity. Weak generosity is dominant among young adults.[28]

4.8.2 Appendix B: transcript of instructions (school-age subjects)

Introduction

Hi everyone, my name is Niree and these are my helpers. We are scientists from USC and we are here today with games for you to play. These games will help us learn more about how people your age make choices. Your teachers and your parents have said that you can play these games if you want, but you don't have to. If you don't want to play our games, let us know and we'll take you back to class. So what do you think? Do you want to play our games? Please go ahead and read the consent form that is on your desk. When you are done reading it and would like to play our games, go ahead and complete the section in the back with your full name, the date, and your signature.

Okay. You'll be playing a few games on your computers. In all of the games, you have a chance to win tokens. When we are all done today, the computer will count how many tokens you earned all together.

• For kindergarten and elementary school participants: At the end, you will exchange your tokens for toys! The more tokens you have, the more toys you can get.
• For middle and high school participants: But, we won't give you tokens at the end - we'll give you real money to spend on Amazon! We will give you an Amazon gift card with the money you earned on it. More tokens always means more money. Just so you have an idea about the amount, you should be able to buy a book or some songs, or something else in that range.

Are there any questions before we begin?

One-shot game: "Split game"

In this first game, your job is to tell us how you want to split tokens among you and someone else. You will see a screen like this: [insert SCREEN 1] The hand pointing out of the screen means "you" and the hand pointing to the side means "someone else." One thing you could tell us is that you and the other student should split 3 tokens such that you get 1 token and the other student gets 2. Or you could tell us that you and the other student should split 2 tokens such that you get 2 tokens and the other student gets 0 tokens.

[28] We notice some behavioral differences across genders: younger school-age females are more pro-social than males (p-value = 0.074 in G1 and 0.004 in G2) while older school-age females are more envious than males (p-value = 0.042 in G3 and 0.047 in G4). No other gender differences are statistically significant.

If you want to split the tokens like this (point to top option), tap the screen anywhere in this box (point to top box) and if you want to split the tokens like this (point to bottom option), tap anywhere in this box (point to bottom box). You will not know who the "other" is that you are splitting with and the computer won't know either. After you make each choice, the computer will randomly pick someone else to receive the coins you gave away. That's going to happen after every choice you make. So, after every choice you make, the computer picks someone randomly and gives them the tokens you chose to give away. Are there any questions? Now, turn to your tablets and look at the first choice.
When you come to the end, you will see a stop sign.

Supergame 1: "partner game"

Now for the next game. In this game, you will be partnered up with someone else in this room but neither of you will know you're partners. You will be partners with the same person for the whole game. No one will know who they're partnered with and it's not the point of the game to find out. In this game, you can have two roles: you can be choosing or you can be waiting. The computer will decide which of you is choosing first and then you will switch roles after that. Your job is to tell us how you and your partner should split tokens. If it's your turn to choose, you'll see a screen like this: [insert SCREEN 2] You will always see these two boxes when it's your turn to choose and your job is to tell us which one you like better. One thing you could tell us is that you should get 6 tokens and your partner should get 1. Or you could tell us that you should get 4 tokens and your partner should get 4 as well. If you want to split the tokens like this (point to top option), tap the screen anywhere in this box (point to top box) and if you want to split the tokens like this (point to bottom option), tap anywhere in this box (point to bottom box). You can change your mind if you like, but when you tap "OK" your choice will be locked-in.

Now, when you are choosing, your partner will be waiting, and their screen will look like this: [insert SCREEN 3] Let's say that you choose 6 tokens for yourself and 1 for your partner. Your screen will then look like this: [insert SCREEN 4] This means that you chose to keep 6 tokens and give your partner 1. Your partner will see a screen like this: [insert SCREEN 5] The tokens you give yourself will show in blue on the middle of your screen (point to history box) and the amount your partner sends you will show up in red right here. Below that, you can find the total tokens you have so far from this game. For the next choice, you will wait and your partner will choose.
You will keep switching between choosing and waiting every round. After, let's say five rounds, your screen might look like this: [insert SCREEN 6] Can someone remind me who the blue tokens are from? What about the red ones? OK, let's go over what happened in each round. The bottom row corresponds to the first round. In the first round, you gave yourself 6 tokens. Then what happened in the second round? (your partner gave you 1 token) What about in the third round? (you gave yourself 4 tokens) What happened in the fourth round? (your partner gave you 4 tokens) What happened in the fifth round? (you gave yourself 6 tokens) This is just an example. During the experiment you can choose any option you want. Remember that you will keep the same partner for the whole game and that there will be many alternating rounds. Any questions? Now, turn to your tablets and look at the first choice. When you come to the end, you will see a stop sign.

Supergame 2: "partner game"

OK. We are going to play this game one more time but this time, you are partnered up with someone else in this room. We'll be doing the same thing, but this time with a new partner. You will be playing with your new partner for the whole game. No one will know who their partner is and it's not the point of the game to find out. Again, you can have two roles: you will either be choosing or waiting. The computer will decide which of you is choosing first and after that you will keep switching roles with your partner. Again, there will be many alternating rounds in this game. Are there any questions? Now, turn to your tablets and look at the first choice. When you come to the end, you will see a stop sign.

[Figure 45: Screenshots for the instructions. Panels: (a) Screen 1, (b) Screen 2, (c) Screen 3, (d) Screen 4, (e) Screen 5, (f) Screen 6.]
Asset Metadata
Creator
Kodaverdian, Niree (author)
Core Title
The evolution of decision-making quality over the life cycle: evidence from behavioral and neuroeconomic experiments with different age groups
School
College of Letters, Arts and Sciences
Degree
Doctor of Philosophy
Degree Program
Economics
Publication Date
05/08/2017
Defense Date
05/11/2017
Publisher
University of Southern California (original), University of Southern California. Libraries (digital)
Tag
aging,altruism,behavioral economics,brain imaging,complexity,consistency,decision-making,developmental decision-making,dictator game,experimental economics,fMRI,laboratory experiment,neuroeconomics,Neuroscience,OAI-PMH Harvest,repeated games,revealed preferences,strategic giving,transitivity
Language
English
Contributor
Electronically uploaded by the author (provenance)
Advisor
Brocas, Isabelle (committee chair), Carrillo, Juan (committee member), Monterosso, John (committee member)
Creator Email
kodaverd@usc.edu,nkodaverdian@gmail.com
Permanent Link (DOI)
https://doi.org/10.25549/usctheses-c40-372943
Unique identifier
UC11255993
Identifier
etd-Kodaverdia-5325.pdf (filename),usctheses-c40-372943 (legacy record id)
Legacy Identifier
etd-Kodaverdia-5325.pdf
Dmrecord
372943
Document Type
Dissertation
Rights
Kodaverdian, Niree
Type
texts
Source
University of Southern California (contributing entity), University of Southern California Dissertations and Theses (collection)
Access Conditions
The author retains rights to his/her dissertation, thesis or other graduate work according to U.S. copyright law. Electronic access is being provided by the USC Libraries in agreement with the a...
Repository Name
University of Southern California Digital Library
Repository Location
USC Digital Library, University of Southern California, University Park Campus MC 2810, 3434 South Grand Avenue, 2nd Floor, Los Angeles, California 90089-2810, USA